Revision history for Langertha

0.500 2026-04-26 18:50:51Z

  !!! Heads-up for callers upgrading from 0.404: items marked [BREAKING]
  below may need code changes; everything else is additive. !!!

  [BREAKING] Langertha::Response->tool_calls is now
  ArrayRef[Langertha::ToolCall] (was ArrayRef[HashRef]). Code that read
  $r->tool_calls->[0]->{name} / ->{arguments} / ->{id} / ->{synthetic}
  as hash keys must switch to the ->name / ->arguments / ->id /
  ->synthetic method calls. The Response constructor still accepts the
  old HashRef form and upgrades it transparently (BUILDARGS), so passing
  tool_calls in is unchanged; only consumption changed. tool_call_args()
  is unchanged.

  [BREAKING] Langertha::Engine::Whisper no longer extends
  Langertha::Engine::OpenAI. It now extends the new
  Langertha::Engine::TranscriptionBase, so a Whisper instance no longer
  has simple_chat / chat_f / chat_with_tools_f / embedding /
  simple_image / Tools / ImageGeneration / Embedding methods. Existing
  code that called only transcription methods is unaffected. To get a
  Whisper handle from an OpenAI engine without restating credentials,
  use the new $openai->whisper attribute.

  [BREAKING] Langertha::Role::ResponseFormat::decode_loose_json is now a
  method on the role, not a free function. Code that called
  Langertha::Role::ResponseFormat::decode_loose_json($text) directly
  must switch to $engine->decode_loose_json($text). This makes it
  overridable per engine for providers that need a custom strategy. The
  standalone Langertha::Util that briefly existed has been removed for
  the same reason.

  - New Langertha::Engine::TranscriptionBase: slim base class for
    OpenAI-shape transcription-only engines (composes OpenAICompatible,
    OpenAPI, Models, Transcription, Capabilities; no Chat / Tools /
    Embedding / ImageGeneration). Whisper now extends it.
  - Langertha::Engine::OpenAI gained a `whisper` lazy attribute that
    returns a Langertha::Engine::TranscriptionBase configured with the
    parent's api_key/url and `whisper-1` as transcription_model.
    `$openai->whisper->simple_transcription($file)` is the canonical way
    to use OpenAI's hosted Whisper from a chat-side engine.
  - New Langertha::Role::Capabilities, composed by Langertha::Role::Chat
    (and therefore present on every engine via composition). One central
    role-to-flag map drives engine_capabilities; engines override via
    `around engine_capabilities` for wire-reality corrections.
    Capabilities reported by each role:
      Chat            -> chat
      Streaming       -> streaming
      Tools           -> tools_native + tool_choice_{auto,any,none,named}
      HermesTools     -> tools_hermes
      ResponseFormat  -> response_format_json_object/json_schema
      Embedding       -> embedding
      Transcription   -> transcription
      ImageGeneration -> image_generation
      Temperature     -> temperature
      Seed            -> seed
      ContextSize     -> context_size
      ResponseSize    -> response_size
      SystemPrompt    -> system_prompt
      ParallelToolUse -> parallel_tool_use
    The earlier `does()`-based heuristic in Role::Chat is gone;
    `$engine->supports($cap)` is the canonical query.
  - Langertha::Tool gained from_mcp (camelCase inputSchema), from_gemini
    (flat `parameters`), to_gemini, to_mcp, and to_json_schema.
    from_hash now auto-detects MCP / Anthropic / Gemini shapes in
    addition to OpenAI. This kills the
    input_schema/inputSchema/parameters/function.parameters chaos that
    used to live in chat_f.
  - Langertha::ToolCall gained a `synthetic` boolean attribute (false by
    default) and a from_gemini constructor; ToolCall->extract now pulls
    Gemini functionCall parts out of candidates[0].content.parts.
  - Langertha::Response->tool_calls is now populated by every native
    tool-calling engine (OpenAICompatible, AnthropicBase, Gemini,
    Ollama) as well as the chat_f synthetic-tool fallback path. Single
    source of truth: the same shape regardless of provider.
    Langertha::Response gained tool_call($name), returning the matching
    Langertha::ToolCall object (vs. tool_call_args returning the
    arguments).
  - Langertha::Stream::Chunk gained an optional tool_calls attribute
    (ArrayRef[Langertha::ToolCall]).
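    A minimal sketch of the consumption change behind the [BREAKING]
    tool_calls switch, assuming $response came from any tool-calling
    engine (the HashRef shape of ->arguments is an assumption):

    ```perl
    # Before 0.500: entries were plain hashrefs
    my $name = $response->tool_calls->[0]->{name};

    # From 0.500: entries are Langertha::ToolCall objects
    my $call = $response->tool_calls->[0];
    $name = $call->name;
    my %args = %{ $call->arguments };   # decoded arguments, shape assumed
    warn 'synthetic fallback call' if $call->synthetic;

    # or look a call up by name (new in 0.500)
    my $weather_call = $response->tool_call('get_weather');
    ```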
    Langertha::Role::Chat got aggregate_tool_calls($chunks) for
    collecting them after a stream ends. Per-engine streaming tool-call
    delta accumulation will land incrementally; the structures are in
    place.
  - Langertha::Engine::AnthropicBase, Langertha::Engine::Gemini, and
    Langertha::Engine::Ollama now compose
    Langertha::Role::ResponseFormat. Anthropic emulates response_format
    via a synthesized tool plus forced tool_choice (the chat_response
    parser lifts the resulting tool_use input back into Response.content
    as JSON). Gemini translates response_format into generationConfig
    (responseMimeType + responseSchema). Ollama translates it into the
    `format` parameter (string 'json' for json_object, a schema HashRef
    for json_schema). The legacy Ollama json_format attribute still
    works as a fallback when response_format isn't set.
  - Langertha::Engine::OpenAIBase now composes
    Langertha::Role::ResponseFormat, so every OpenAI-compatible engine
    (Perplexity, DeepSeek, Groq, Mistral, MiniMax, Cerebras, OpenRouter,
    Replicate, HuggingFace, AKIOpenAI, TSystems, Scaleway, Ollama-OpenAI,
    vLLM, SGLang, LlamaCpp, NousResearch) accepts a response_format
    constructor argument. Removed the now-redundant individual
    ResponseFormat composition from those engines.
  - Langertha::ToolChoice gained to_perplexity (string-only API:
    auto/none/required; named coerces to required) and to_gemini
    (toolConfig.functionCallingConfig with mode AUTO/ANY/NONE plus
    allowed_function_names for named forcing) serializers.
  - Langertha::Engine::Gemini chat_request and chat_stream_request now
    translate tool_choice in any input shape (canonical / OpenAI /
    Anthropic) into Gemini's toolConfig payload.
  - Langertha::Role::Chat got chat_f, a named-arguments async entry
    point: $engine->chat_f(messages => [...], tools => [...],
    tool_choice => ..., response_format => ...). simple_chat_f delegates
    to it; existing @messages-style call sites are unchanged.
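    A hedged sketch combining the new chat_f entry point with a
    capability probe; the capability name comes from the table above,
    while the await-based calling convention and the tool list are
    assumptions:

    ```perl
    use Future::AsyncAwait;

    async sub ask {
        my ($engine) = @_;
        my @my_tools = ();   # your Langertha::Tool definitions here
        # only send native tools where the engine reports support
        my @tools = $engine->supports('tools_native') ? @my_tools : ();
        my $response = await $engine->chat_f(
            messages        => [{ role => 'user', content => 'Weather in Berlin?' }],
            tools           => \@tools,
            tool_choice     => 'auto',
            response_format => { type => 'json_object' },
        );
        return $response;
    }
    ```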
    Forced-named tool calls on engines that lack native named-tool
    forcing but support json_schema response_format (currently
    Perplexity) are auto-rewritten through the response_format path; the
    response text is loose-parsed (handles ```json fences and
    prose-wrapped JSON) and a synthetic tool_calls entry is attached, so
    callers see the same shape regardless of provider.
  - Langertha::Response gained a tool_calls attribute and a
    tool_call_args accessor; clone_with carries tool_calls through.
  - Langertha::Role::Chat exposes engine_capabilities (default derived
    from role composition) and a supports($cap) helper so software can
    query what the engine can honour before sending parameters.
  - Langertha::Role::ResponseFormat gained decode_loose_json($text), a
    tolerant decoder for structured-output responses that may be
    wrapped in code fences or prose.
  - New Langertha::Engine::TSystems for the T-Systems AI Foundation
    Services / LLM Hub OpenAI-compatible endpoint
    (https://llm-server.llmhub.t-systems.net/v2). Bearer auth via
    LANGERTHA_TSYSTEMS_API_KEY, default model gpt-oss-120b (T-Cloud,
    Germany; reliable tool calling); supports chat, streaming, tool
    calling, embeddings (default text-embedding-bge-m3) and structured
    output. GDPR-compliant; T-Cloud models are processed in Germany,
    hyperscaler models in the EU.
  - New Langertha::Engine::Scaleway for Scaleway Generative APIs
    (https://api.scaleway.ai/v1), an EU-hosted, drop-in OpenAI-compatible
    replacement. Bearer auth via LANGERTHA_SCALEWAY_API_KEY, default
    model llama-3.1-8b-instruct; supports chat, streaming, tool calling,
    embeddings and structured output.

0.404 2026-04-21 14:06:44Z

  - New Langertha::Content role and Langertha::Content::Image value
    object for provider-agnostic vision input. Mirrors the
    Langertha::ToolChoice pattern: one canonical block (from_url /
    from_file / from_data / from_base64) serializes to OpenAI image_url,
    Anthropic image source (URL or base64), and Gemini inline_data via
    to_openai / to_anthropic / to_gemini.
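    A hedged sketch of the provider-agnostic image block; the file name
    is made up and the simple_chat message shape is an assumption:

    ```perl
    use Langertha::Content::Image;

    # one canonical image block, serialized per engine on the way out
    my $img = Langertha::Content::Image->from_file('chart.png');

    # the same message works against OpenAI, Anthropic and Gemini
    # engines; bare strings in the arrayref become text blocks
    my $answer = $engine->simple_chat(
        { role => 'user', content => [ 'What does this chart show?', $img ] },
    );
    ```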
    Gemini auto-downloads URL-only images on first call because it has
    no URL source equivalent; media_type is sniffed from the file
    extension or the fetched Content-Type header.
  - Langertha::Role::Chat gained content_format ('openai' by default,
    'anthropic' on AnthropicBase, 'gemini' on Gemini) and a
    normalization pass in chat_messages: a user message whose content is
    an arrayref containing Langertha::Content objects is converted to
    the engine's native wire format (bare strings in the array are
    wrapped as text blocks, and Gemini messages are rebuilt into
    role/parts with assistant -> model). Messages without
    Langertha::Content objects are passed through untouched, so existing
    callers are unaffected.
  - Fixes the "messages.0.content.1: Input tag 'image_url' ... does not
    match 'image'" 400 from Anthropic when the same [text + image]
    prompt was reused across engines: the canonical block is what
    callers author; each engine produces its own wire format.

0.403 2026-04-21 12:04:54Z

  - Fixed a "Wide character in subroutine entry" crash on non-ASCII JSON
    responses. Role::JSON's shared instance is configured with
    utf8 => 1 (bytes in/out), but parse_response and
    execute_streaming_request were feeding it Perl-Unicode via
    $response->decoded_content, which blew up the first time a response
    body contained a non-ASCII character (umlaut, em-dash, CJK, emoji).
    Both entry points now use $response->content (raw bytes), keeping
    the pipeline consistent with the outgoing side. The two spots that
    re-decode JSON substrings out of an already-decoded tree
    (OpenAICompatible's extract_tool_call for
    tool_call.function.arguments, and HermesTools' response_tool_calls
    for XML bodies) now go through a new Role::JSON::decode_json_text
    helper that centralizes the encode_utf8 bridge.
  - format_tools in OpenAICompatible, AnthropicBase, Gemini, and Ollama
    now accepts input_schema, inputSchema, or parameters as the schema
    key (snake_case preferred, camelCase for MCP spec compatibility,
    parameters as an OpenAI-style fallback).
    This matches the defensive lookup already done by
    Langertha::Tool::from_hash and Raider::tools_as_mcp, which mix both
    styles internally.

0.402 2026-04-20 22:07:40Z

  - [BREAKING] Langertha::Engine::MiniMax now talks to MiniMax's native
    OpenAI-compatible endpoint (https://api.minimax.io/v1) instead of
    the Anthropic-compatible shim. The previous behavior is preserved in
    a new class, Langertha::Engine::MiniMaxAnthropic (URL corrected to
    /anthropic/v1, as MiniMax actually documents). Background: MiniMax's
    /anthropic endpoint does not reliably re-parse stringified tool-call
    arguments, causing intermittent tool-calling failures where the
    Anthropic SDK sees a wrapper object whose key rotates between
    'result', 'arguments', and the tool name. MiniMax's native OpenAI
    endpoint avoids the shim entirely. Users who need the Anthropic wire
    format should switch from Langertha::Engine::MiniMax to
    Langertha::Engine::MiniMaxAnthropic. The default model is now
    MiniMax-M2.7 (was MiniMax-M2.5) on both classes.
  - Automatic tool_choice normalization: chat_request in
    OpenAICompatible and AnthropicBase now runs any tool_choice passed
    via %extra through Langertha::ToolChoice and emits the target
    engine's native format. Callers can pass Anthropic-style
    (type + name), OpenAI-style (type:function + function.name), or the
    string shorthands ('auto', 'none', 'required', 'any') to any engine;
    no more engine-specific branching needed.
  - New Langertha::Role::ParallelToolUse with a canonical
    `parallel_tool_use` boolean attribute. The constructor also accepts
    the provider-native alias names: `parallel_tool_calls` (OpenAI) and
    `disable_parallel_tool_use` (Anthropic, inverted). The attribute is
    translated per engine to the native request parameter: OpenAI sends
    `parallel_tool_calls`, Anthropic folds `disable_parallel_tool_use`
    into the tool_choice block. Automatically composed by
    Langertha::Role::Tools, so every tool-capable engine gets it.
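    A hedged sketch of the canonical attribute and its provider aliases;
    the alias names come from the entry above, but the remaining
    constructor arguments are assumptions:

    ```perl
    use Langertha::Engine::OpenAI;
    use Langertha::Engine::Anthropic;

    # all three spellings land in the same canonical attribute
    my $openai = Langertha::Engine::OpenAI->new(
        api_key             => $ENV{OPENAI_API_KEY},
        parallel_tool_calls => 0,              # OpenAI alias
    );
    my $anthropic = Langertha::Engine::Anthropic->new(
        api_key                   => $ENV{ANTHROPIC_API_KEY},
        disable_parallel_tool_use => 1,        # Anthropic alias, inverted
    );
    # both engines now report parallel_tool_use as false
    ```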
  - Langertha::ToolChoice accepts 'any' as a string shorthand and
    { type => 'required' } as a hash form (both normalize to the
    canonical type 'any').
  - Added MiniMax-M2.7 to the static model list and made it the default.

0.401 2026-04-12 21:24:49Z

  - Guard list_models in OpenAICompatible against engines that do not
    support the listModels operation; use StaticModels for Perplexity
    and NousResearch instead of hitting a 404.
  - Fix Moose warning in ToolChoice by importing only enum from
    Moose::Util::TypeConstraints.

0.400 2026-04-07 23:01:11Z

  - New value object Langertha::Usage for token counting, with
    from_hash / from_response constructors and
    to_openai/anthropic/ollama_format serializers.
  - New value object Langertha::Cost for the monetary cost of a single
    LLM call (input_usd / output_usd / total_usd / currency).
  - New Langertha::Pricing: a model-to-rule catalog with
    cost_for(usage, model) returning a Cost.
  - New Langertha::UsageRecord: Usage + Cost + tagged metadata
    (provider, engine, model, route, api_key_id, duration_ms, tool
    counts) with to_hash for ledger storage.
  - New value object Langertha::Tool for canonical tool definitions,
    with from_openai / from_anthropic / from_list constructors and
    to_openai / to_anthropic / to_ollama / to_hash serializers.
  - New value object Langertha::ToolCall for canonical tool invocations,
    with from_openai / from_anthropic / from_ollama / extract /
    extract_hermes_from_text constructors and to_openai /
    to_anthropic_block / to_ollama serializers.
  - New value object Langertha::ToolChoice with an enum-typed canonical
    type ('auto' / 'any' / 'none' / 'tool'), auto/any/none/specific
    shortcut constructors, and to_openai / to_anthropic conversions.
  - Refactor Langertha::Metrics, Langertha::Input, Langertha::Output,
    Langertha::Input::Tools, and Langertha::Output::Tools into thin
    backwards-compatibility facades over the new value objects. External
    APIs unchanged; existing callers keep working.
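    A hedged sketch of the ToolChoice value object; the constructor
    names come from the entry above, the wire shapes in the comments
    follow the public OpenAI and Anthropic APIs, and the exact call form
    of ->specific is an assumption:

    ```perl
    use Langertha::ToolChoice;

    # shortcut constructor forcing a named tool
    my $choice = Langertha::ToolChoice->specific('get_weather');

    my $openai = $choice->to_openai;
    # e.g. { type => 'function', function => { name => 'get_weather' } }

    my $anthropic = $choice->to_anthropic;
    # e.g. { type => 'tool', name => 'get_weather' }
    ```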
  - The five facade modules now emit a one-time Carp::carp at load time
    pointing callers at the new value objects.
  - dist.ini sets irc = #langertha so PodWeaver injects an IRC support
    block into every module's POD.

0.309 2026-04-05 16:37:32Z

  - Fix Moose role composition: consolidate all separate `with` calls
    into the single `with map { 'Langertha::Role::'.$_ } qw(...)` form
    across all engines; this exposed a real role conflict between
    Role::OpenAICompatible and Role::OpenAPI on
    `_build_openapi_operations`.
  - Fix that role conflict: remove `_build_openapi_operations` from
    Role::OpenAICompatible (the wrong place) and define it in
    Engine::OpenAIBase (the consuming class) using `use_module` instead
    of the `require` hack.
  - Apply the same `use_module('Langertha::Spec::*')->data` pattern to
    Engine::Ollama, Engine::Mistral, and Engine::LMStudio.
  - Add the missing `make_immutable` to Engine::Whisper and
    Request::HTTP.
  - Remove unused `namespace::autoclean` from Stream and Stream::Chunk.

0.308 2026-04-04 15:03:20Z

0.307 2026-03-10 17:42:28Z

  - Add a new OpenAI-compatible self-hosted engine:
    Langertha::Engine::SGLang.
  - Add engine-scope module discovery via Module::Pluggable in
    Langertha: `available_engine_classes`, `available_engine_ids`, and a
    generic `discover_modules_in_scope`.
  - Update `resolve_engine_class` to use the discovered module scope
    (`Langertha::Engine::*` + `LangerthaX::Engine::*`) with
    deterministic core-first lookup.
  - Add a `Langertha->new_engine($name_or_class, %args)` helper for
    resolve + load + construct in one call.
  - Document third-party custom engines under `LangerthaX::Engine::*`
    and include the resolver behavior in the docs.
  - Add tests for discovered engine classes/ids and the LangerthaX
    fallback (`t/99-engine-resolution.t` + a `t/lib` fixture module).
  - Extend load/hierarchy/readme coverage for the new SGLang engine.
  - Add `Module::Pluggable` as a direct runtime dependency.
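    A hedged sketch of the one-call resolver helper; the url and model
    arguments are made-up values for a local SGLang server, and the
    class-method call form of available_engine_ids is an assumption:

    ```perl
    use Langertha;

    # resolve + load + construct: 'SGLang' resolves core-first to
    # Langertha::Engine::SGLang, then falls back to
    # LangerthaX::Engine::SGLang
    my $engine = Langertha->new_engine('SGLang',
        url   => 'http://localhost:30000/v1',   # assumed local endpoint
        model => 'my-local-model',              # made-up model name
    );

    print "$_\n" for Langertha->available_engine_ids;
    ```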
0.306 2026-03-10 13:37:01Z

  - Add new shared core modules for cross-format normalization:
    Langertha::Input (+::Tools), Langertha::Output (+::Tools), and
    Langertha::Metrics.
  - The core modules centralize tool schema conversion
    (OpenAI/Anthropic/Ollama), Hermes XML extraction/normalization, and
    usage/cost metric normalization.
  - Add core tests t/97_input_output.t and t/98_metrics.t and extend
    t/00_load.t.

0.305 2026-03-08 21:51:01Z

  - New engine base class: Langertha::Engine::AnthropicBase for
    Anthropic-compatible APIs (shared /v1/messages
    chat/streaming/tool/model handling and Anthropic rate-limit
    parsing). Anthropic now extends this base, and MiniMax +
    LMStudioAnthropic were migrated to extend it too.
  - New engine: Langertha::Engine::LMStudio, a native LM Studio local
    REST API adapter (POST /api/v1/chat, SSE streaming with
    message.delta/chat.end, GET /api/v1/models). Supports optional
    bearer auth via LANGERTHA_LMSTUDIO_API_KEY, plus basic auth via URL
    userinfo. Includes an openai() helper returning a
    Langertha::Engine::LMStudioOpenAI instance for LM Studio's /v1
    endpoint.
  - New engine: Langertha::Engine::LMStudioOpenAI for LM Studio's
    OpenAI-compatible /v1 endpoint (defaults api_key to C).
  - New engine: Langertha::Engine::LMStudioAnthropic for LM Studio's
    Anthropic-compatible /v1/messages endpoint. Includes an
    LMStudio->anthropic helper for easy conversion from native engine
    instances; defaults api_key to C.
  - New OpenAPI spec: share/lmstudio.yaml with operationIds for LM
    Studio native chat and model listing, plus Langertha::Spec::LMStudio
    for pre-computed operation lookup.
  - Tests: extend t/00_load.t, t/10_engine_hierarchy.t, and
    t/11_basic_auth.t to cover LMStudio loading, inheritance/roles,
    request mapping, and auth behavior. Extend t/83_live_chat.t with
    optional LM Studio live coverage via TEST_LANGERTHA_LMSTUDIO_URL,
    TEST_LANGERTHA_LMSTUDIO_MODEL, and TEST_LANGERTHA_LMSTUDIO_API_KEY.
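    A hedged sketch of the three LM Studio flavours; the localhost URL
    is an assumption (LM Studio's usual default port):

    ```perl
    use Langertha::Engine::LMStudio;

    my $lms = Langertha::Engine::LMStudio->new(
        url => 'http://localhost:1234',   # assumed LM Studio default
    );

    # documented helpers for the compatibility endpoints
    my $openai_style    = $lms->openai;      # LMStudioOpenAI, /v1
    my $anthropic_style = $lms->anthropic;   # LMStudioAnthropic, /v1/messages
    ```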
  - Documentation: add POD for the LMStudio, LMStudioOpenAI, and
    LMStudioAnthropic helpers/attributes and expand the README examples
    to include explicit LMStudioOpenAI/LMStudioAnthropic class usage.
  - Orchestration foundation on top of Raider: add
    Langertha::Role::Runnable (the run_f contract),
    Langertha::RunContext (input/state/artifacts/metadata/trace +
    branch/merge), the Langertha::Raid base class, and the concrete
    orchestrators Langertha::Raid::Sequential, Langertha::Raid::Parallel,
    and Langertha::Raid::Loop. Supports nested composition of Raider and
    Raid nodes.
  - Unified result model: add Langertha::Result as the common result
    abstraction (final/question/pause/abort), and make
    Langertha::Raider::Result a backward-compatible subclass so Raider
    and Raid share the same result semantics.
  - Raider compatibility + interface: Raider now composes
    Langertha::Role::Runnable and exposes run_f($ctx) as an
    orchestration-friendly wrapper around raid_f while keeping the
    existing public raid_f/respond_f behavior intact.
  - Raider fixes: _gather_tools_f now uses the active engine (not always
    the default engine), and Langfuse model parameters are recalculated
    after engine/tool dirtiness refresh during runtime engine switching.
  - Raider respond_f consistency: plugin_after_tool_call hooks are now
    applied to remaining tool calls during the continuation flow
    (self-tools and MCP tools), matching main-loop behavior.
  - Tests: add t/96_raid_orchestration.t covering Runnable
    compatibility, sequential/parallel/loop orchestration, nested Raid
    trees, context propagation and parallel isolation/merge semantics,
    result propagation (final/question/pause/abort), and error paths for
    all orchestrator types. Extend t/00_load.t to include the new
    modules.
  - Documentation: add inline POD for all new
    orchestration/result/context modules and refresh the Raider::Result
    POD to reflect the shared result inheritance.
    Extend the README with a new "Raid — Workflow Orchestration" section
    (RunContext, Sequential/Parallel/Loop, unified results, nesting),
    plus a top-level table of contents, an architecture overview, and a
    minimal sequential orchestration example.

0.304 2026-03-07 02:05:27Z

  - New role: Langertha::Role::HermesTools extracts Hermes-style XML
    tool calling into a dedicated role. Engines compose this role
    instead of setting a hermes_tools flag. Cleaner polymorphic
    dispatch: Role::Tools provides the tool loop and the default
    native-API path; HermesTools overrides build_tool_chat_request to
    inject tools into the system prompt.
  - Role::Tools cleaned up: removed all Hermes branching, the private
    _hermes_* methods, and the hermes_tools attribute. Five polymorphic
    methods (format_tools, response_tool_calls, extract_tool_call,
    format_tool_results, response_text_content) are now provided by
    either the engine (native) or HermesTools (XML).
  - AKI.pm (native API): added tool-calling support via the HermesTools
    role, with a hermes_extract_content override for AKI's response
    format.
  - AKIOpenAI.pm: composes the HermesTools role (replaces the
    hermes_tools flag).
  - NousResearch.pm: composes the HermesTools role (replaces the
    hermes_tools flag).
  - Raider and Chat: simplified tool loop; removed all Hermes if/else
    branching, uses the polymorphic build_tool_chat_request.

0.303 2026-03-01 03:24:11Z

0.302 2026-02-27 03:48:44Z

  - Fix list_models URL construction: add an overridable
    list_models_path method to Role::OpenAICompatible (default:
    /models). Mistral overrides it to /v1/models. Fixes broken URLs for
    engines whose base URL does not include /v1.
  - New Role::StaticModels: provides list_models from a hardcoded model
    list without HTTP requests. Used by MiniMax.
  - HuggingFace: list_models now queries the Hub API
    (huggingface.co/api/models) with search, pipeline_tag, and
    inference_provider filters. Only returns models with active
    inference providers.
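    A hedged sketch of overriding list_models_path in a custom engine
    subclass, in the way Mistral does; the package name is made up:

    ```perl
    package Langertha::Engine::MyProvider;
    use Moose;
    extends 'Langertha::Engine::OpenAIBase';

    # the base URL has no /v1, so point model listing at the
    # versioned path instead of the default /models
    sub list_models_path { '/v1/models' }

    __PACKAGE__->meta->make_immutable;
    1;
    ```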
0.301 2026-02-27 01:57:13Z

  - Rate-limit extraction from HTTP response headers: new
    Langertha::RateLimit data class with normalized requests_limit,
    requests_remaining, tokens_limit, tokens_remaining, and reset
    fields, plus the raw provider-specific headers. Supported providers:
    OpenAI/Groq/Cerebras/OpenRouter/Replicate/HuggingFace
    (x-ratelimit-*) and Anthropic (anthropic-ratelimit-*). The engine
    stores the latest rate_limit; Response carries a per-response
    rate_limit with requests_remaining/tokens_remaining convenience
    methods.
  - New engine: HuggingFace, for HuggingFace Inference Providers
    (OpenAI-compatible, org/model format, chat + streaming + tool
    calling).

0.300 2026-02-26 21:03:33Z

  - Plugin system: Langertha::Plugin base class with lifecycle hooks
    (plugin_before_raid, plugin_build_conversation,
    plugin_before_llm_call, plugin_after_llm_response,
    plugin_before_tool_call, plugin_after_tool_call, plugin_after_raid)
    and self_tools support. Plugins can be specified by short name
    (resolved to Langertha::Plugin::* or LangerthaX::Plugin::*).
  - Langertha::Plugin::Langfuse: Langfuse observability as a plugin (an
    alternative to the engine-level Role::Langfuse), with cascading
    traces, generations, and tool-call spans in the Raider loop.
  - Role::PluginHost: shared plugin hosting for engines and Raider, with
    plugin resolution, instantiation, and _plugin_instances caching.
  - Wrapper classes: Langertha::Chat, Langertha::Embedder, and
    Langertha::ImageGen for wrapping engines with optional overrides
    (model, system_prompt, temperature, etc.) and plugin lifecycle
    hooks.
  - Class sugar: `use Langertha qw( Raider )` and
    `use Langertha qw( Plugin )` for quick subclass setup with
    auto-import of Moose and Future::AsyncAwait.
  - Image generation: Role::ImageGeneration with an image_model
    attribute, OpenAICompatible image_request/image_response/simple_image
    methods; OpenAI now composes the ImageGeneration role (default:
    gpt-image-1).
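    A hedged sketch of a minimal plugin using the class sugar; the hook
    name comes from the lifecycle list above, but the hook's argument
    signature and the plugins constructor argument are assumptions:

    ```perl
    package My::Plugin::Logger;
    use Langertha qw( Plugin );   # sugar: Langertha::Plugin subclass,
                                  # auto-imports Moose + Future::AsyncAwait

    sub plugin_before_llm_call {
        my ( $self, @args ) = @_;   # signature assumed
        warn "about to call the LLM\n";
        return;
    }

    1;

    # elsewhere: plugins resolved by short name or full class
    # my $raider = Langertha::Raider->new(
    #     engine  => $engine,
    #     plugins => [ 'Langfuse', 'My::Plugin::Logger' ],
    # );
    ```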
  - Role::KeepAlive: extracted the keep_alive attribute from Ollama into
    a reusable role with a get_keep_alive accessor.
  - Ollama: update to the current API; use operationIds
    chat/embed/list/ps (was
    generateChat/generateEmbeddings/getModels/getRunningModels), and the
    embedding response uses embeddings[0] (was embedding).
  - NousResearch: reasoning_prompt is now a configurable attribute (was
    a hardcoded string).
  - Groq, Mistral, OpenAI: consolidate `with 'Langertha::Role::Tools'`
    into the main role composition block.
  - Log::Any debug/trace logging in Role::Chat, Role::Embedding,
    Role::HTTP, Role::Tools, and Role::OpenAPI for request lifecycle
    visibility.
  - Add Log::Any to the cpanfile runtime dependencies.
  - Update OpenAPI specs: openai.yaml, mistral.yaml, and ollama.yaml to
    the latest upstream versions.
  - Pre-computed OpenAPI lookup tables: ship Langertha::Spec::OpenAI
    (148 ops), Langertha::Spec::Mistral (67 ops), and
    Langertha::Spec::Ollama (12 ops) as static Perl data instead of
    parsing YAML and constructing OpenAPI::Modern at runtime. Startup
    cost drops from ~16s to <1ms.
  - New openapi_operations attribute in Role::OpenAPI with automatic
    fallback: engines that override _build_openapi_operations get the
    fast path; custom engines using openapi_file still work via the slow
    YAML/OpenAPI::Modern path.
  - Add maint/generate_spec_data.pl to regenerate the Spec modules from
    share/*.yaml when the specs are updated.
  - New tests: t/84_live_imagegen.t, t/87_raider_plugins.t,
    t/89_langertha_sugar.t, t/91_plugin_config.t, t/92_embedder.t,
    t/93_chat.t, t/94_plugin_langfuse.t, t/95_imagegen.t.

0.202 2026-02-25 03:50:44Z

  - Engine base class hierarchy: introduce Engine::Remote (JSON + HTTP +
    required url) and Engine::OpenAIBase (+ OpenAICompatible, OpenAPI,
    Models, Temperature, ResponseSize, SystemPrompt, Streaming, Chat).
    All 15 engines now extend these base classes instead of repeating
    10+ role composition statements. New engines need only 2-3 lines.
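    A hedged sketch of the "2-3 lines" claim for a new OpenAI-compatible
    engine; the package name and URL are made up, and defaulting the
    inherited required url via a '+url' attribute override is an
    assumption about how the shipped engines do it:

    ```perl
    package Langertha::Engine::MyProvider;
    use Moose;
    extends 'Langertha::Engine::OpenAIBase';

    # assumed mechanism for defaulting Engine::Remote's required url
    has '+url' => ( default => 'https://api.example.com/v1' );

    __PACKAGE__->meta->make_immutable;
    1;
    ```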
  - Migrate non-OpenAI engines to extend Engine::Remote: Anthropic,
    Gemini, Ollama, AKI.
  - Migrate OpenAI-compatible engines to extend Engine::OpenAIBase:
    OpenAI, DeepSeek, Groq, Perplexity, Mistral, MiniMax, NousResearch,
    AKIOpenAI, OllamaOpenAI, vLLM (Whisper inherits via OpenAI).
  - New engine: Cerebras (fastest inference platform, llama-3.3-70b).
  - New engine: OpenRouter (unified gateway for 300+ models).
  - New engine: Replicate (thousands of open-source models).
  - New engine: LlamaCpp (llama.cpp server, with embeddings).
  - OpenAICompatible: api_key is now optional (undef = no Authorization
    header), enabling local engines (vLLM, llama.cpp) without dummy
    keys.
  - OpenAICompatible: model is now optional in requests, enabling
    single-model servers (vLLM, llama.cpp) without explicit model names.
  - Add a comprehensive engine hierarchy test (t/10_engine_hierarchy.t)
    verifying inheritance, role composition, instantiation, and request
    generation for all 19 engines.
  - Raider self-tools: raider_mcp => 1 enables LLM-controlled tools:
    raider_ask_user, raider_pause, raider_abort, raider_wait,
    raider_wait_for, raider_session_history, raider_manage_mcps,
    raider_switch_engine.
  - Raider engine_catalog: runtime engine switching via self-tool or
    API.
  - Raider mcp_catalog: dynamic MCP server activation/deactivation.
  - Raider inline tools: quick tool definitions without MCP server
    setup.
  - Raider::Result: typed result objects (final, question, pause, abort)
    with backward-compatible stringification.
  - AKI: openai() no longer carries over the native model name (naming
    differs between the native and /v1 APIs); it uses the default model
    and warns.
  - Add a live embedding test (t/82_live_embedding.t) with semantic
    similarity verification via Math::Vector::Similarity for OpenAI,
    Mistral, Ollama, OllamaOpenAI, and LlamaCpp.
  - Add a live chat test (t/83_live_chat.t) for all 16 engines including
    Cerebras, OpenRouter, Perplexity, MiniMax, and LlamaCpp.

0.201 2026-02-23 03:50:17Z

  - Add a Response.thinking attribute for chain-of-thought
    reasoning:
      - Native extraction: DeepSeek/OpenAI-compatible reasoning_content,
        Anthropic thinking blocks, and Gemini thought parts are
        automatically populated on Response.thinking; no configuration
        needed.
      - Think-tag filter: <think> tag stripping is enabled by default on
        all engines. Handles both closed (<think>...</think>) and
        unclosed (<think>...) tags. The tag name is configurable via
        think_tag (default: 'think'). Disable with
        think_tag_filter => 0. Filtering is applied across all text
        paths: simple_chat, streaming, tool calling, and Raider.
  - Add a NousResearch reasoning attribute that enables chain-of-thought
    reasoning for Hermes 4 and DeepHermes 3 models by prepending the
    standard Nous reasoning system prompt.
  - Langfuse cascading traces: Raider now creates a proper hierarchical
    Trace → Span (iteration) → Generation (llm-call) / Span (tool)
    structure instead of a flat trace → generation. Iteration spans
    group the LLM call and its tool calls. Tool spans capture per-tool
    timing, input, and output. The trace is updated with the final
    output at raid end.
  - Langfuse: add langfuse_span() for creating span events.
  - Langfuse: add langfuse_update_trace(), langfuse_update_span(), and
    langfuse_update_generation() for updating observations after
    creation.
  - Langfuse: langfuse_trace() now supports tags, user_id, session_id,
    release, version, public, and environment fields.
  - Langfuse: langfuse_generation() now supports parent_observation_id,
    model_parameters, level, status_message, and version fields.
  - Langfuse: Raider generations now include token usage data and model
    parameters (temperature, max_tokens) when available.
  - Raider: add langfuse_trace_name, langfuse_user_id,
    langfuse_session_id, langfuse_tags, langfuse_release,
    langfuse_version, and langfuse_metadata attributes for customizing
    Langfuse trace creation.
  - Refactor all OpenAI-compatible engines to compose
    Langertha::Role::OpenAICompatible directly instead of extending
    Langertha::Engine::OpenAI. Each engine now includes only the roles
    it actually supports (e.g.
    DeepSeek gets Chat but not Embedding). Removes all the "doesn't
    support X" croak overrides. Affected engines: DeepSeek, Groq,
    Mistral, MiniMax, NousResearch, Perplexity, vLLM, AKIOpenAI,
    OllamaOpenAI.
  - Add Raider context compression: when prompt token usage exceeds a
    configurable threshold (max_context_tokens *
    context_compress_threshold), history is automatically summarized via
    the LLM before the next raid. Supports a separate compression_engine
    for using cheaper models. Manual compression via
    compress_history/compress_history_f.
  - Add Raider session_history: a full chronological archive of ALL
    messages, including tool calls and results, persisted across
    clear_history and reset. Queryable by the LLM via an MCP tool
    registered with register_session_history_tool().
  - Add MiniMax to the live tool-calling test (t/80_live_tool_calling.t)
    and the live Raider test (t/82_live_raider.t).
  - Add t/83_live_minimax.t: a dedicated MiniMax live test covering
    simple_chat, list_models, and Raider with Coding Plan web search.
  - Add a Raider inject() method for mid-raid context injection: queue
    messages from async callbacks, timers, or other tasks; they are
    picked up naturally at the next iteration.
  - Add a Raider on_iteration callback, called before each LLM call
    (iterations 2+) with ($raider, $iteration); it returns messages to
    inject. Injected messages are persisted in history.
  - Add Langertha::Engine::MiniMax for the MiniMax AI API (chat,
    streaming, and tool calling via the OpenAI-compatible API).
  - Rewrite all POD to inline style across all modules: =attr directly
    after has, =method directly after sub. Add POD to all previously
    undocumented modules.
  - Improve =seealso cross-links: remove redundant main-module links,
    add meaningful related-module references.

0.200 2026-02-22 21:53:36Z

  - Add Langertha::Response: a metadata container wrapping the LLM text
    content with id, model, finish_reason, usage (token counts), timing,
    and created fields.
    Uses overloaded stringification for backward compatibility: existing
    code treating responses as strings continues to work.
  - All chat_response methods now return Langertha::Response objects:
      - Role::OpenAICompatible: extracts id, model, created,
        finish_reason, usage
      - Engine::Anthropic: extracts id, model, stop_reason,
        input/output_tokens
      - Engine::Gemini: extracts modelVersion, finishReason,
        usageMetadata (normalized to
        prompt_tokens/completion_tokens/total_tokens)
      - Engine::Ollama: extracts model, done_reason, eval counts, timing
        fields
      - Engine::AKI: extracts model_name, total_duration
  - Add Langertha::Raider: an autonomous agent with conversation history
    and MCP tool calling. Features a mission (system prompt), persistent
    history across raids, cumulative metrics (raids, iterations,
    tool_calls, time_ms), and clear_history and reset methods. Supports
    Hermes tool calling. Auto-instruments raids with Langfuse traces and
    per-iteration generation events when Langfuse is enabled on the
    engine.
  - Add Langertha::Role::Langfuse: observability integration with the
    Langfuse REST API. Composed into Role::Chat, so every engine has
    Langfuse support built in. Auto-instruments simple_chat with trace
    and generation events. Batched ingestion via POST
    /api/public/ingestion with Basic Auth. Disabled by default; active
    when langfuse_public_key and langfuse_secret_key are set (via the
    constructor or the LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY /
    LANGFUSE_URL env vars).
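    A hedged sketch of the backward-compatible Response object; the
    accessors shown come from the field list above, while the shape of
    the usage value is an assumption:

    ```perl
    my $response = $engine->simple_chat('Say hello in one word.');

    # overloaded stringification: behaves like the plain text it wraps
    print "$response\n";

    # metadata accessors added in 0.200
    printf "model=%s finish=%s\n",
        $response->model, $response->finish_reason;
    my $usage = $response->usage;   # token counts; exact shape assumed
    ```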
  - Add ex/response.pl: Response metadata showcase (tokens, model, timing)
  - Add ex/raider.pl: autonomous file explorer agent example
  - Add ex/langfuse.pl: Langfuse observability example
  - Add ex/langfuse-k8s.yaml: Kubernetes manifest for self-hosted
    Langfuse with a pre-configured project and API keys (zero setup)
  - Add t/70_response.t: Response unit tests across all engine formats
  - Add t/72_langfuse.t: Langfuse integration tests with mock HTTP
  - Add t/82_live_raider.t: live Raider integration test
  - Add Langertha::Role::OpenAICompatible: extracted the OpenAI API
    format methods into a reusable role. Engines that use the
    OpenAI-compatible API format now compose this role instead of
    duplicating methods. Engine::OpenAI and all subclasses continue to
    work unchanged.
  - Add Langertha::Engine::OllamaOpenAI: first-class engine for Ollama's
    OpenAI-compatible /v1 endpoint. Ollama's openai() method now returns
    this engine instead of a raw Engine::OpenAI instance.
  - Add Langertha::Engine::AKI for the AKI.IO native API (chat
    completions with key-in-body auth, synchronous mode, dynamic endpoint
    listing via list_models and endpoint_details)
  - Add Langertha::Engine::AKIOpenAI for AKI.IO via the OpenAI-compatible
    API (chat, streaming, tool calling via Role::OpenAICompatible)
  - Add Langertha::Engine::NousResearch for the Nous Research Inference
    API with Hermes-native tool calling via XML tags
  - Add Langertha::Engine::Perplexity for the Perplexity Sonar API (chat
    and streaming only, no tool calling)
  - Add hermes_tools feature flag to Langertha::Role::Tools for
    Hermes-native tool calling via <tool_call> / <tool_response> XML
    tags; enables MCP tool calling on any model that supports the Hermes
    prompt format, even without API-level tool support
  - Add hermes_call_tag and hermes_response_tag attributes for custom XML
    tag names (default: tool_call, tool_response)
  - Add hermes_tool_instructions attribute for customizing the
    instruction text without changing the structural XML template
  - Add hermes_tool_prompt attribute for full system prompt
    override
  - Add hermes_extract_content() method for engines to override response
    content extraction in Hermes mode
  - MCP tool calling is now supported on ALL engines:
    - OpenAI (inherited by Groq, vLLM, Mistral, DeepSeek)
    - Anthropic (with Anthropic-native tool format)
    - Gemini (with Gemini-native functionDeclarations format)
    - Ollama (OpenAI-compatible tool format)
    - NousResearch (Hermes-native via XML tags)
  - Add extract_tool_call() to Role::Tools for engine-agnostic tool call
    parsing across all provider formats
  - Fix Gemini tool calling: pass through native message formats, convert
    MCP tool results to Gemini's functionResponse object
  - Fix Gemini chat_request to preserve native parts in messages from
    tool result round-trips
  - Remove hardcoded all_models() lists from all engines; model discovery
    is now exclusively dynamic via list_models()
  - Update default models:
    - Anthropic: claude-sonnet-4-6 (short alias)
    - Gemini: gemini-2.5-flash (2.0-flash deprecated for new users)
  - Add Hermes tool calling unit test with mock round-trip
    (t/66_tool_calling_hermes.t)
  - Add vLLM tool calling unit test (t/65_tool_calling_vllm.t)
  - Add live integration test for all engines including Ollama, vLLM, and
    NousResearch (t/80_live_tool_calling.t) with multi-model support
  - Add mock round-trip test for Ollama tool calling
    (t/64_tool_calling_ollama_mock.t) using fixture data
  - Add shared Test::MockAsyncHTTP test helper (t/lib/) for mocking async
    HTTP in engine tests
  - Normalize test API key env vars to the TEST_LANGERTHA_*_API_KEY
    prefix to prevent accidental use of production keys
  - Add TEST_LANGERTHA_OLLAMA_URL and TEST_LANGERTHA_OLLAMA_MODELS env
    vars for Ollama live testing
  - Add TEST_LANGERTHA_VLLM_URL, TEST_LANGERTHA_VLLM_MODEL, and
    TEST_LANGERTHA_VLLM_TOOL_CALL_PARSER env vars for vLLM live testing
  - Add AKI.IO native API unit test (t/25_aki_requests.t) with mock
    response parsing for chat, list_models, and endpoint_details
  - Add AKI.IO live integration test (t/81_live_aki.t) for
    list_models, endpoint_details, and simple_chat
  - Add AKI.IO to live tool calling test (t/80_live_tool_calling.t) via
    the OpenAI-compatible API
  - Add TEST_LANGERTHA_AKI_API_KEY and TEST_LANGERTHA_AKI_MODEL env vars
    for AKI.IO live testing
  - Use the RFC 2606 test.invalid domain for dummy URLs in unit tests
  - Add ex/hermes_tools.pl example for Hermes-native tool calling
  - Rewrite all POD to inline style across all 37 modules — =attr
    directly after has, =method directly after sub. Add POD to 18
    previously undocumented modules.

0.100 2026-02-20 05:33:44Z

  - Add MCP (Model Context Protocol) tool calling support
    - New Langertha::Role::Tools for engine-agnostic tool calling
    - Anthropic engine: full tool calling support (format_tools,
      response_tool_calls, format_tool_results, response_text_content)
    - Async chat_with_tools_f() method for an automatic multi-round
      tool-calling loop with configurable max iterations
    - Requires Net::Async::MCP for MCP server communication
  - Add Future::AsyncAwait support for async/await syntax
    - All _f methods (simple_chat_f, simple_chat_stream_f, etc.)
    - Streaming with real-time async callbacks
  - Add streaming support
    - Synchronous callback, iterator, and Future-based APIs
    - SSE parsing for OpenAI / Anthropic / Groq / Mistral / DeepSeek
    - NDJSON parsing for Ollama
  - Add Gemini engine (Google AI Studio)
  - Add dynamic model listing via provider APIs, with caching
  - Add Anthropic extended parameters (effort, inference_geo)
  - Improve POD documentation across all modules

0.008 2025-03-30 04:55:38Z

  - Add Mistral engine integration
  - Adapt the Mistral OpenAPI spec for our parser

0.007 2025-01-25 19:29:51Z

  - Add DeepSeek engine

0.006 2024-09-30 14:07:25Z

  - Add Structured Output support
  - Add Groq engine and Groq Whisper support
  - Add TEST_WITHOUT_STRUCTURED_OUTPUT env variable

0.005 2024-08-22 13:43:31Z

  - Fix data type on keep_alive and remove POSIX round() usage

0.004 2024-08-13 23:10:57Z

  - Fix interpretation of max_tokens on Anthropic (response size, not
    context size)

0.003 2024-08-11 00:21:01Z

  - Add context size and temperature controls

0.002 2024-08-10 02:22:12Z

  - Add Whisper Transcription API
  - Add more engines
  - Fix encoding issues

0.001 2024-08-03 22:47:33Z

  - Initial release
  - Unified Perl interface for LLM APIs
  - Engines: OpenAI, Anthropic, Ollama
  - Role-based architecture (Chat, HTTP, Models, JSON, Embedding)
  - OpenAPI spec-driven request generation
  - Embedding support
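The unified interface introduced in 0.001 can be sketched as below. This is a hypothetical usage snippet, not a verified example: simple_chat and the api_key / url constructor arguments are taken from entries elsewhere in this file, while the model name and localhost URL are assumptions.

```perl
use strict;
use warnings;
use Langertha::Engine::OpenAI;
use Langertha::Engine::Ollama;

# Two different providers behind the same engine interface
# (constructor arguments are assumptions, see note above)
my @engines = (
  Langertha::Engine::OpenAI->new( api_key => $ENV{OPENAI_API_KEY} ),
  Langertha::Engine::Ollama->new( url => 'http://localhost:11434' ),
);

# The same chat call works regardless of which provider backs the engine
print $_->simple_chat('Say hi in one word'), "\n" for @engines;
```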