Revision history for Langertha

0.500 2026-04-26 18:50:51Z

  !!! Heads-up for callers upgrading from 0.404: items marked [BREAKING]
  below may need code changes; everything else is additive. !!!

  [BREAKING] Langertha::Response->tool_calls is now
  ArrayRef[Langertha::ToolCall] (was ArrayRef[HashRef]). Code that read
  $r->tool_calls->[0]->{name} / ->{arguments} / ->{id} / ->{synthetic}
  as hash keys must switch to the ->name / ->arguments / ->id /
  ->synthetic method calls. The Response constructor still accepts the
  old HashRef form and upgrades it transparently (BUILDARGS), so passing
  tool_calls in is unchanged; only consumption changed. tool_call_args()
  is unchanged.

  [BREAKING] Langertha::Engine::Whisper no longer extends
  Langertha::Engine::OpenAI. It now extends the new
  Langertha::Engine::TranscriptionBase, so a Whisper instance no longer
  has simple_chat / chat_f / chat_with_tools_f / embedding /
  simple_image / Tools / ImageGeneration / Embedding methods. Existing
  code that called only transcription methods is unaffected. To get a
  Whisper handle from an OpenAI engine without restating credentials,
  use the new $openai->whisper attribute.

  [BREAKING] Langertha::Role::ResponseFormat::decode_loose_json is now a
  method on the role, not a free function. Code that called
  Langertha::Role::ResponseFormat::decode_loose_json($text) directly
  must switch to $engine->decode_loose_json($text). This makes it
  overridable per engine for providers that need a custom strategy. The
  standalone Langertha::Util that briefly existed has been removed for
  the same reason.

  - New Langertha::Engine::TranscriptionBase: slim base class for
    OpenAI-shape transcription-only engines (composes OpenAICompatible,
    OpenAPI, Models, Transcription, Capabilities; no Chat / Tools /
    Embedding / ImageGeneration). Whisper now extends it.
  - Langertha::Engine::OpenAI gained a `whisper` lazy attribute that
    returns a Langertha::Engine::TranscriptionBase configured with the
    parent's api_key/url and `whisper-1` as transcription_model.
    `$openai->whisper->simple_transcription($file)` is the canonical way
    to use OpenAI's hosted Whisper from a chat-side engine.
  - New Langertha::Role::Capabilities, composed by Langertha::Role::Chat
    (and therefore present on every engine via composition). One central
    role-to-flag map drives engine_capabilities; engines override via
    `around engine_capabilities` for wire-reality corrections.
    Capabilities reported by each role:
      Chat            -> chat
      Streaming       -> streaming
      Tools           -> tools_native + tool_choice_{auto,any,none,named}
      HermesTools     -> tools_hermes
      ResponseFormat  -> response_format_json_object/json_schema
      Embedding       -> embedding
      Transcription   -> transcription
      ImageGeneration -> image_generation
      Temperature     -> temperature
      Seed            -> seed
      ContextSize     -> context_size
      ResponseSize    -> response_size
      SystemPrompt    -> system_prompt
      ParallelToolUse -> parallel_tool_use
    The earlier `does()`-based heuristic in Role::Chat is gone;
    `$engine->supports($cap)` is the canonical query.
  - Langertha::Tool gained from_mcp (camelCase inputSchema), from_gemini
    (flat `parameters`), to_gemini, to_mcp, and to_json_schema.
    from_hash now auto-detects MCP / Anthropic / Gemini shapes in
    addition to OpenAI. This kills the
    input_schema/inputSchema/parameters/function.parameters chaos that
    used to live in chat_f.
  - Langertha::ToolCall gained a `synthetic` boolean attribute (false by
    default) and a from_gemini constructor; ToolCall->extract now pulls
    Gemini functionCall parts out of candidates[0].content.parts.
  - Langertha::Response->tool_calls is now populated by every native
    tool-calling engine (OpenAICompatible, AnthropicBase, Gemini,
    Ollama) as well as the chat_f synthetic-tool fallback path. Single
    source of truth: the same shape regardless of provider.
    Langertha::Response gained tool_call($name), returning the matching
    Langertha::ToolCall object (vs. tool_call_args returning the
    arguments).
  - Langertha::Stream::Chunk gained an optional tool_calls attribute
    (ArrayRef[Langertha::ToolCall]).
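    A minimal sketch of the consumption change behind the [BREAKING]
    tool_calls switch, assuming $response came from any tool-calling
    engine (the HashRef shape of ->arguments is an assumption):

    ```perl
    # Before 0.500: entries were plain hashrefs
    my $name = $response->tool_calls->[0]->{name};

    # From 0.500: entries are Langertha::ToolCall objects
    my $call = $response->tool_calls->[0];
    $name = $call->name;
    my %args = %{ $call->arguments };   # decoded arguments, shape assumed
    warn 'synthetic fallback call' if $call->synthetic;

    # or look a call up by name (new in 0.500)
    my $weather_call = $response->tool_call('get_weather');
    ```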
    Langertha::Role::Chat got aggregate_tool_calls($chunks) for
    collecting them after a stream ends. Per-engine streaming tool-call
    delta accumulation will land incrementally; the structures are in
    place.
  - Langertha::Engine::AnthropicBase, Langertha::Engine::Gemini, and
    Langertha::Engine::Ollama now compose
    Langertha::Role::ResponseFormat. Anthropic emulates response_format
    via a synthesized tool plus forced tool_choice (the chat_response
    parser lifts the resulting tool_use input back into Response.content
    as JSON). Gemini translates response_format into generationConfig
    (responseMimeType + responseSchema). Ollama translates it into the
    `format` parameter (string 'json' for json_object, a schema HashRef
    for json_schema). The legacy Ollama json_format attribute still
    works as a fallback when response_format isn't set.
  - Langertha::Engine::OpenAIBase now composes
    Langertha::Role::ResponseFormat, so every OpenAI-compatible engine
    (Perplexity, DeepSeek, Groq, Mistral, MiniMax, Cerebras, OpenRouter,
    Replicate, HuggingFace, AKIOpenAI, TSystems, Scaleway, Ollama-OpenAI,
    vLLM, SGLang, LlamaCpp, NousResearch) accepts a response_format
    constructor argument. Removed the now-redundant individual
    ResponseFormat composition from those engines.
  - Langertha::ToolChoice gained to_perplexity (string-only API:
    auto/none/required; named coerces to required) and to_gemini
    (toolConfig.functionCallingConfig with mode AUTO/ANY/NONE plus
    allowed_function_names for named forcing) serializers.
  - Langertha::Engine::Gemini chat_request and chat_stream_request now
    translate tool_choice in any input shape (canonical / OpenAI /
    Anthropic) into Gemini's toolConfig payload.
  - Langertha::Role::Chat got chat_f, a named-arguments async entry
    point: $engine->chat_f(messages => [...], tools => [...],
    tool_choice => ..., response_format => ...). simple_chat_f delegates
    to it; existing @messages-style call sites are unchanged.
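    A hedged sketch combining the new chat_f entry point with a
    capability probe; the capability name comes from the table above,
    while the await-based calling convention and the tool list are
    assumptions:

    ```perl
    use Future::AsyncAwait;

    async sub ask {
        my ($engine) = @_;
        my @my_tools = ();   # your Langertha::Tool definitions here
        # only send native tools where the engine reports support
        my @tools = $engine->supports('tools_native') ? @my_tools : ();
        my $response = await $engine->chat_f(
            messages        => [{ role => 'user', content => 'Weather in Berlin?' }],
            tools           => \@tools,
            tool_choice     => 'auto',
            response_format => { type => 'json_object' },
        );
        return $response;
    }
    ```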
    Forced-named tool calls on engines that lack native named-tool
    forcing but support json_schema response_format (currently
    Perplexity) are auto-rewritten through the response_format path; the
    response text is loose-parsed (handles ```json fences and
    prose-wrapped JSON) and a synthetic tool_calls entry is attached, so
    callers see the same shape regardless of provider.
  - Langertha::Response gained a tool_calls attribute and a
    tool_call_args accessor; clone_with carries tool_calls through.
  - Langertha::Role::Chat exposes engine_capabilities (default derived
    from role composition) and a supports($cap) helper so software can
    query what the engine can honour before sending parameters.
  - Langertha::Role::ResponseFormat gained decode_loose_json($text), a
    tolerant decoder for structured-output responses that may be
    wrapped in code fences or prose.
  - New Langertha::Engine::TSystems for the T-Systems AI Foundation
    Services / LLM Hub OpenAI-compatible endpoint
    (https://llm-server.llmhub.t-systems.net/v2). Bearer auth via
    LANGERTHA_TSYSTEMS_API_KEY, default model gpt-oss-120b (T-Cloud,
    Germany; reliable tool calling); supports chat, streaming, tool
    calling, embeddings (default text-embedding-bge-m3) and structured
    output. GDPR-compliant; T-Cloud models are processed in Germany,
    hyperscaler models in the EU.
  - New Langertha::Engine::Scaleway for Scaleway Generative APIs
    (https://api.scaleway.ai/v1), an EU-hosted, drop-in OpenAI-compatible
    replacement. Bearer auth via LANGERTHA_SCALEWAY_API_KEY, default
    model llama-3.1-8b-instruct; supports chat, streaming, tool calling,
    embeddings and structured output.

0.404 2026-04-21 14:06:44Z

  - New Langertha::Content role and Langertha::Content::Image value
    object for provider-agnostic vision input. Mirrors the
    Langertha::ToolChoice pattern: one canonical block (from_url /
    from_file / from_data / from_base64) serializes to OpenAI image_url,
    Anthropic image source (URL or base64), and Gemini inline_data via
    to_openai / to_anthropic / to_gemini.
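    A hedged sketch of the provider-agnostic image block; the file name
    is made up and the simple_chat message shape is an assumption:

    ```perl
    use Langertha::Content::Image;

    # one canonical image block, serialized per engine on the way out
    my $img = Langertha::Content::Image->from_file('chart.png');

    # the same message works against OpenAI, Anthropic and Gemini
    # engines; bare strings in the arrayref become text blocks
    my $answer = $engine->simple_chat(
        { role => 'user', content => [ 'What does this chart show?', $img ] },
    );
    ```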
    Gemini auto-downloads URL-only images on first call because it has
    no URL source equivalent; media_type is sniffed from the file
    extension or the fetched Content-Type header.
  - Langertha::Role::Chat gained content_format ('openai' by default,
    'anthropic' on AnthropicBase, 'gemini' on Gemini) and a
    normalization pass in chat_messages: a user message whose content is
    an arrayref containing Langertha::Content objects is converted to
    the engine's native wire format (bare strings in the array are
    wrapped as text blocks, and Gemini messages are rebuilt into
    role/parts with assistant -> model). Messages without
    Langertha::Content objects are passed through untouched, so existing
    callers are unaffected.
  - Fixes the "messages.0.content.1: Input tag 'image_url' ... does not
    match 'image'" 400 from Anthropic when the same [text + image]
    prompt was reused across engines: the canonical block is what
    callers author; each engine produces its own wire format.

0.403 2026-04-21 12:04:54Z

  - Fixed a "Wide character in subroutine entry" crash on non-ASCII JSON
    responses. Role::JSON's shared instance is configured with
    utf8 => 1 (bytes in/out), but parse_response and
    execute_streaming_request were feeding it Perl-Unicode via
    $response->decoded_content, which blew up the first time a response
    body contained a non-ASCII character (umlaut, em-dash, CJK, emoji).
    Both entry points now use $response->content (raw bytes), keeping
    the pipeline consistent with the outgoing side. The two spots that
    re-decode JSON substrings out of an already-decoded tree
    (OpenAICompatible's extract_tool_call for
    tool_call.function.arguments, and HermesTools' response_tool_calls
    for XML bodies) now go through a new Role::JSON::decode_json_text
    helper that centralizes the encode_utf8 bridge.
  - format_tools in OpenAICompatible, AnthropicBase, Gemini, and Ollama
    now accepts input_schema, inputSchema, or parameters as the schema
    key (snake_case preferred, camelCase for MCP spec compatibility,
    parameters as an OpenAI-style fallback).
    This matches the defensive lookup already done by
    Langertha::Tool::from_hash and Raider::tools_as_mcp, which mix both
    styles internally.

0.402 2026-04-20 22:07:40Z

  - [BREAKING] Langertha::Engine::MiniMax now talks to MiniMax's native
    OpenAI-compatible endpoint (https://api.minimax.io/v1) instead of
    the Anthropic-compatible shim. The previous behavior is preserved in
    a new class, Langertha::Engine::MiniMaxAnthropic (URL corrected to
    /anthropic/v1, as MiniMax actually documents). Background: MiniMax's
    /anthropic endpoint does not reliably re-parse stringified tool-call
    arguments, causing intermittent tool-calling failures where the
    Anthropic SDK sees a wrapper object whose key rotates between
    'result', 'arguments', and the tool name. MiniMax's native OpenAI
    endpoint avoids the shim entirely. Users who need the Anthropic wire
    format should switch from Langertha::Engine::MiniMax to
    Langertha::Engine::MiniMaxAnthropic. The default model is now
    MiniMax-M2.7 (was MiniMax-M2.5) on both classes.
  - Automatic tool_choice normalization: chat_request in
    OpenAICompatible and AnthropicBase now runs any tool_choice passed
    via %extra through Langertha::ToolChoice and emits the target
    engine's native format. Callers can pass Anthropic-style
    (type + name), OpenAI-style (type:function + function.name), or the
    string shorthands ('auto', 'none', 'required', 'any') to any engine;
    no more engine-specific branching needed.
  - New Langertha::Role::ParallelToolUse with a canonical
    `parallel_tool_use` boolean attribute. The constructor also accepts
    the provider-native alias names: `parallel_tool_calls` (OpenAI) and
    `disable_parallel_tool_use` (Anthropic, inverted). The attribute is
    translated per engine to the native request parameter: OpenAI sends
    `parallel_tool_calls`, Anthropic folds `disable_parallel_tool_use`
    into the tool_choice block. Automatically composed by
    Langertha::Role::Tools, so every tool-capable engine gets it.
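    A hedged sketch of the canonical attribute and its provider aliases;
    the alias names come from the entry above, but the remaining
    constructor arguments are assumptions:

    ```perl
    use Langertha::Engine::OpenAI;
    use Langertha::Engine::Anthropic;

    # all three spellings land in the same canonical attribute
    my $openai = Langertha::Engine::OpenAI->new(
        api_key             => $ENV{OPENAI_API_KEY},
        parallel_tool_calls => 0,              # OpenAI alias
    );
    my $anthropic = Langertha::Engine::Anthropic->new(
        api_key                   => $ENV{ANTHROPIC_API_KEY},
        disable_parallel_tool_use => 1,        # Anthropic alias, inverted
    );
    # both engines now report parallel_tool_use as false
    ```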
  - Langertha::ToolChoice accepts 'any' as a string shorthand and
    { type => 'required' } as a hash form (both normalize to the
    canonical type 'any').
  - Added MiniMax-M2.7 to the static model list and made it the default.

0.401 2026-04-12 21:24:49Z

  - Guard list_models in OpenAICompatible against engines that do not
    support the listModels operation; use StaticModels for Perplexity
    and NousResearch instead of hitting a 404.
  - Fix Moose warning in ToolChoice by importing only enum from
    Moose::Util::TypeConstraints.

0.400 2026-04-07 23:01:11Z

  - New value object Langertha::Usage for token counting, with
    from_hash / from_response constructors and
    to_openai/anthropic/ollama_format serializers.
  - New value object Langertha::Cost for the monetary cost of a single
    LLM call (input_usd / output_usd / total_usd / currency).
  - New Langertha::Pricing: a model-to-rule catalog with
    cost_for(usage, model) returning a Cost.
  - New Langertha::UsageRecord: Usage + Cost + tagged metadata
    (provider, engine, model, route, api_key_id, duration_ms, tool
    counts) with to_hash for ledger storage.
  - New value object Langertha::Tool for canonical tool definitions,
    with from_openai / from_anthropic / from_list constructors and
    to_openai / to_anthropic / to_ollama / to_hash serializers.
  - New value object Langertha::ToolCall for canonical tool invocations,
    with from_openai / from_anthropic / from_ollama / extract /
    extract_hermes_from_text constructors and to_openai /
    to_anthropic_block / to_ollama serializers.
  - New value object Langertha::ToolChoice with an enum-typed canonical
    type ('auto' / 'any' / 'none' / 'tool'), auto/any/none/specific
    shortcut constructors, and to_openai / to_anthropic conversions.
  - Refactor Langertha::Metrics, Langertha::Input, Langertha::Output,
    Langertha::Input::Tools, and Langertha::Output::Tools into thin
    backwards-compatibility facades over the new value objects. External
    APIs unchanged; existing callers keep working.
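    A hedged sketch of the ToolChoice value object; the constructor
    names come from the entry above, the wire shapes in the comments
    follow the public OpenAI and Anthropic APIs, and the exact call form
    of ->specific is an assumption:

    ```perl
    use Langertha::ToolChoice;

    # shortcut constructor forcing a named tool
    my $choice = Langertha::ToolChoice->specific('get_weather');

    my $openai = $choice->to_openai;
    # e.g. { type => 'function', function => { name => 'get_weather' } }

    my $anthropic = $choice->to_anthropic;
    # e.g. { type => 'tool', name => 'get_weather' }
    ```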
  - The five facade modules now emit a one-time Carp::carp at load time
    pointing callers at the new value objects.
  - dist.ini sets irc = #langertha so PodWeaver injects an IRC support
    block into every module's POD.

0.309 2026-04-05 16:37:32Z

  - Fix Moose role composition: consolidate all separate `with` calls
    into the single `with map { 'Langertha::Role::'.$_ } qw(...)` form
    across all engines; this exposed a real role conflict between
    Role::OpenAICompatible and Role::OpenAPI on
    `_build_openapi_operations`.
  - Fix that role conflict: remove `_build_openapi_operations` from
    Role::OpenAICompatible (the wrong place) and define it in
    Engine::OpenAIBase (the consuming class) using `use_module` instead
    of the `require` hack.
  - Apply the same `use_module('Langertha::Spec::*')->data` pattern to
    Engine::Ollama, Engine::Mistral, and Engine::LMStudio.
  - Add the missing `make_immutable` to Engine::Whisper and
    Request::HTTP.
  - Remove unused `namespace::autoclean` from Stream and Stream::Chunk.

0.308 2026-04-04 15:03:20Z

0.307 2026-03-10 17:42:28Z

  - Add a new OpenAI-compatible self-hosted engine:
    Langertha::Engine::SGLang.
  - Add engine-scope module discovery via Module::Pluggable in
    Langertha: `available_engine_classes`, `available_engine_ids`, and a
    generic `discover_modules_in_scope`.
  - Update `resolve_engine_class` to use the discovered module scope
    (`Langertha::Engine::*` + `LangerthaX::Engine::*`) with
    deterministic core-first lookup.
  - Add a `Langertha->new_engine($name_or_class, %args)` helper for
    resolve + load + construct in one call.
  - Document third-party custom engines under `LangerthaX::Engine::*`
    and include the resolver behavior in the docs.
  - Add tests for discovered engine classes/ids and the LangerthaX
    fallback (`t/99-engine-resolution.t` + a `t/lib` fixture module).
  - Extend load/hierarchy/readme coverage for the new SGLang engine.
  - Add `Module::Pluggable` as a direct runtime dependency.
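    A hedged sketch of the one-call resolver helper; the url and model
    arguments are made-up values for a local SGLang server, and the
    class-method call form of available_engine_ids is an assumption:

    ```perl
    use Langertha;

    # resolve + load + construct: 'SGLang' resolves core-first to
    # Langertha::Engine::SGLang, then falls back to
    # LangerthaX::Engine::SGLang
    my $engine = Langertha->new_engine('SGLang',
        url   => 'http://localhost:30000/v1',   # assumed local endpoint
        model => 'my-local-model',              # made-up model name
    );

    print "$_\n" for Langertha->available_engine_ids;
    ```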
0.306 2026-03-10 13:37:01Z

  - Add new shared core modules for cross-format normalization:
    Langertha::Input (+::Tools), Langertha::Output (+::Tools), and
    Langertha::Metrics.
  - The core modules centralize tool schema conversion
    (OpenAI/Anthropic/Ollama), Hermes XML extraction/normalization, and
    usage/cost metric normalization.
  - Add core tests t/97_input_output.t and t/98_metrics.t and extend
    t/00_load.t.

0.305 2026-03-08 21:51:01Z

  - New engine base class: Langertha::Engine::AnthropicBase for
    Anthropic-compatible APIs (shared /v1/messages
    chat/streaming/tool/model handling and Anthropic rate-limit
    parsing). Anthropic now extends this base, and MiniMax +
    LMStudioAnthropic were migrated to extend it too.
  - New engine: Langertha::Engine::LMStudio, a native LM Studio local
    REST API adapter (POST /api/v1/chat, SSE streaming with
    message.delta/chat.end, GET /api/v1/models). Supports optional
    bearer auth via LANGERTHA_LMSTUDIO_API_KEY, plus basic auth via URL
    userinfo. Includes an openai() helper returning a
    Langertha::Engine::LMStudioOpenAI instance for LM Studio's /v1
    endpoint.
  - New engine: Langertha::Engine::LMStudioOpenAI for LM Studio's
    OpenAI-compatible /v1 endpoint (defaults api_key to C).
  - New engine: Langertha::Engine::LMStudioAnthropic for LM Studio's
    Anthropic-compatible /v1/messages endpoint. Includes an
    LMStudio->anthropic helper for easy conversion from native engine
    instances; defaults api_key to C.
  - New OpenAPI spec: share/lmstudio.yaml with operationIds for LM
    Studio native chat and model listing, plus Langertha::Spec::LMStudio
    for pre-computed operation lookup.
  - Tests: extend t/00_load.t, t/10_engine_hierarchy.t, and
    t/11_basic_auth.t to cover LMStudio loading, inheritance/roles,
    request mapping, and auth behavior. Extend t/83_live_chat.t with
    optional LM Studio live coverage via TEST_LANGERTHA_LMSTUDIO_URL,
    TEST_LANGERTHA_LMSTUDIO_MODEL, and TEST_LANGERTHA_LMSTUDIO_API_KEY.
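    A hedged sketch of the three LM Studio flavours; the localhost URL
    is an assumption (LM Studio's usual default port):

    ```perl
    use Langertha::Engine::LMStudio;

    my $lms = Langertha::Engine::LMStudio->new(
        url => 'http://localhost:1234',   # assumed LM Studio default
    );

    # documented helpers for the compatibility endpoints
    my $openai_style    = $lms->openai;      # LMStudioOpenAI, /v1
    my $anthropic_style = $lms->anthropic;   # LMStudioAnthropic, /v1/messages
    ```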
  - Documentation: add POD for the LMStudio, LMStudioOpenAI, and
    LMStudioAnthropic helpers/attributes and expand the README examples
    to include explicit LMStudioOpenAI/LMStudioAnthropic class usage.
  - Orchestration foundation on top of Raider: add
    Langertha::Role::Runnable (the run_f contract),
    Langertha::RunContext (input/state/artifacts/metadata/trace +
    branch/merge), the Langertha::Raid base class, and the concrete
    orchestrators Langertha::Raid::Sequential, Langertha::Raid::Parallel,
    and Langertha::Raid::Loop. Supports nested composition of Raider and
    Raid nodes.
  - Unified result model: add Langertha::Result as the common result
    abstraction (final/question/pause/abort), and make
    Langertha::Raider::Result a backward-compatible subclass so Raider
    and Raid share the same result semantics.
  - Raider compatibility + interface: Raider now composes
    Langertha::Role::Runnable and exposes run_f($ctx) as an
    orchestration-friendly wrapper around raid_f while keeping the
    existing public raid_f/respond_f behavior intact.
  - Raider fixes: _gather_tools_f now uses the active engine (not always
    the default engine), and Langfuse model parameters are recalculated
    after engine/tool dirtiness refresh during runtime engine switching.
  - Raider respond_f consistency: plugin_after_tool_call hooks are now
    applied to remaining tool calls during the continuation flow
    (self-tools and MCP tools), matching main-loop behavior.
  - Tests: add t/96_raid_orchestration.t covering Runnable
    compatibility, sequential/parallel/loop orchestration, nested Raid
    trees, context propagation and parallel isolation/merge semantics,
    result propagation (final/question/pause/abort), and error paths for
    all orchestrator types. Extend t/00_load.t to include the new
    modules.
  - Documentation: add inline POD for all new
    orchestration/result/context modules and refresh the Raider::Result
    POD to reflect the shared result inheritance.
    Extend the README with a new "Raid — Workflow Orchestration" section
    (RunContext, Sequential/Parallel/Loop, unified results, nesting),
    plus a top-level table of contents, an architecture overview, and a
    minimal sequential orchestration example.

0.304 2026-03-07 02:05:27Z

  - New role: Langertha::Role::HermesTools extracts Hermes-style XML
    tool calling into a dedicated role. Engines compose this role
    instead of setting a hermes_tools flag. Cleaner polymorphic
    dispatch: Role::Tools provides the tool loop and the default
    native-API path; HermesTools overrides build_tool_chat_request to
    inject tools into the system prompt.
  - Role::Tools cleaned up: removed all Hermes branching, the private
    _hermes_* methods, and the hermes_tools attribute. Five polymorphic
    methods (format_tools, response_tool_calls, extract_tool_call,
    format_tool_results, response_text_content) are now provided by
    either the engine (native) or HermesTools (XML).
  - AKI.pm (native API): added tool-calling support via the HermesTools
    role, with a hermes_extract_content override for AKI's response
    format.
  - AKIOpenAI.pm: composes the HermesTools role (replaces the
    hermes_tools flag).
  - NousResearch.pm: composes the HermesTools role (replaces the
    hermes_tools flag).
  - Raider and Chat: simplified tool loop; removed all Hermes if/else
    branching, uses the polymorphic build_tool_chat_request.

0.303 2026-03-01 03:24:11Z

0.302 2026-02-27 03:48:44Z

  - Fix list_models URL construction: add an overridable
    list_models_path method to Role::OpenAICompatible (default:
    /models). Mistral overrides it to /v1/models. Fixes broken URLs for
    engines whose base URL does not include /v1.
  - New Role::StaticModels: provides list_models from a hardcoded model
    list without HTTP requests. Used by MiniMax.
  - HuggingFace: list_models now queries the Hub API
    (huggingface.co/api/models) with search, pipeline_tag, and
    inference_provider filters. Only returns models with active
    inference providers.
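    A hedged sketch of overriding list_models_path in a custom engine
    subclass, in the way Mistral does; the package name is made up:

    ```perl
    package Langertha::Engine::MyProvider;
    use Moose;
    extends 'Langertha::Engine::OpenAIBase';

    # the base URL has no /v1, so point model listing at the
    # versioned path instead of the default /models
    sub list_models_path { '/v1/models' }

    __PACKAGE__->meta->make_immutable;
    1;
    ```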
0.301 2026-02-27 01:57:13Z

  - Rate-limit extraction from HTTP response headers: new
    Langertha::RateLimit data class with normalized requests_limit,
    requests_remaining, tokens_limit, tokens_remaining, and reset
    fields, plus the raw provider-specific headers. Supported providers:
    OpenAI/Groq/Cerebras/OpenRouter/Replicate/HuggingFace
    (x-ratelimit-*) and Anthropic (anthropic-ratelimit-*). The engine
    stores the latest rate_limit; Response carries a per-response
    rate_limit with requests_remaining/tokens_remaining convenience
    methods.
  - New engine: HuggingFace, for HuggingFace Inference Providers
    (OpenAI-compatible, org/model format, chat + streaming + tool
    calling).

0.300 2026-02-26 21:03:33Z

  - Plugin system: Langertha::Plugin base class with lifecycle hooks
    (plugin_before_raid, plugin_build_conversation,
    plugin_before_llm_call, plugin_after_llm_response,
    plugin_before_tool_call, plugin_after_tool_call, plugin_after_raid)
    and self_tools support. Plugins can be specified by short name
    (resolved to Langertha::Plugin::* or LangerthaX::Plugin::*).
  - Langertha::Plugin::Langfuse: Langfuse observability as a plugin (an
    alternative to the engine-level Role::Langfuse), with cascading
    traces, generations, and tool-call spans in the Raider loop.
  - Role::PluginHost: shared plugin hosting for engines and Raider, with
    plugin resolution, instantiation, and _plugin_instances caching.
  - Wrapper classes: Langertha::Chat, Langertha::Embedder, and
    Langertha::ImageGen for wrapping engines with optional overrides
    (model, system_prompt, temperature, etc.) and plugin lifecycle
    hooks.
  - Class sugar: `use Langertha qw( Raider )` and
    `use Langertha qw( Plugin )` for quick subclass setup with
    auto-import of Moose and Future::AsyncAwait.
  - Image generation: Role::ImageGeneration with an image_model
    attribute, OpenAICompatible image_request/image_response/simple_image
    methods; OpenAI now composes the ImageGeneration role (default:
    gpt-image-1).
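    A hedged sketch of a minimal plugin using the class sugar; the hook
    name comes from the lifecycle list above, but the hook's argument
    signature and the plugins constructor argument are assumptions:

    ```perl
    package My::Plugin::Logger;
    use Langertha qw( Plugin );   # sugar: Langertha::Plugin subclass,
                                  # auto-imports Moose + Future::AsyncAwait

    sub plugin_before_llm_call {
        my ( $self, @args ) = @_;   # signature assumed
        warn "about to call the LLM\n";
        return;
    }

    1;

    # elsewhere: plugins resolved by short name or full class
    # my $raider = Langertha::Raider->new(
    #     engine  => $engine,
    #     plugins => [ 'Langfuse', 'My::Plugin::Logger' ],
    # );
    ```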
  - Role::KeepAlive: extracted the keep_alive attribute from Ollama into
    a reusable role with a get_keep_alive accessor.
  - Ollama: update to the current API; use operationIds
    chat/embed/list/ps (was
    generateChat/generateEmbeddings/getModels/getRunningModels), and the
    embedding response uses embeddings[0] (was embedding).
  - NousResearch: reasoning_prompt is now a configurable attribute (was
    a hardcoded string).
  - Groq, Mistral, OpenAI: consolidate `with 'Langertha::Role::Tools'`
    into the main role composition block.
  - Log::Any debug/trace logging in Role::Chat, Role::Embedding,
    Role::HTTP, Role::Tools, and Role::OpenAPI for request lifecycle
    visibility.
  - Add Log::Any to the cpanfile runtime dependencies.
  - Update OpenAPI specs: openai.yaml, mistral.yaml, and ollama.yaml to
    the latest upstream versions.
  - Pre-computed OpenAPI lookup tables: ship Langertha::Spec::OpenAI
    (148 ops), Langertha::Spec::Mistral (67 ops), and
    Langertha::Spec::Ollama (12 ops) as static Perl data instead of
    parsing YAML and constructing OpenAPI::Modern at runtime. Startup
    cost drops from ~16s to <1ms.
  - New openapi_operations attribute in Role::OpenAPI with automatic
    fallback: engines that override _build_openapi_operations get the
    fast path; custom engines using openapi_file still work via the slow
    YAML/OpenAPI::Modern path.
  - Add maint/generate_spec_data.pl to regenerate the Spec modules from
    share/*.yaml when the specs are updated.
  - New tests: t/84_live_imagegen.t, t/87_raider_plugins.t,
    t/89_langertha_sugar.t, t/91_plugin_config.t, t/92_embedder.t,
    t/93_chat.t, t/94_plugin_langfuse.t, t/95_imagegen.t.

0.202 2026-02-25 03:50:44Z

  - Engine base class hierarchy: introduce Engine::Remote (JSON + HTTP +
    required url) and Engine::OpenAIBase (+ OpenAICompatible, OpenAPI,
    Models, Temperature, ResponseSize, SystemPrompt, Streaming, Chat).
    All 15 engines now extend these base classes instead of repeating
    10+ role composition statements. New engines need only 2-3 lines.
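    A hedged sketch of the "2-3 lines" claim for a new OpenAI-compatible
    engine; the package name and URL are made up, and defaulting the
    inherited required url via a '+url' attribute override is an
    assumption about how the shipped engines do it:

    ```perl
    package Langertha::Engine::MyProvider;
    use Moose;
    extends 'Langertha::Engine::OpenAIBase';

    # assumed mechanism for defaulting Engine::Remote's required url
    has '+url' => ( default => 'https://api.example.com/v1' );

    __PACKAGE__->meta->make_immutable;
    1;
    ```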
  - Migrate non-OpenAI engines to extend Engine::Remote: Anthropic,
    Gemini, Ollama, AKI.
  - Migrate OpenAI-compatible engines to extend Engine::OpenAIBase:
    OpenAI, DeepSeek, Groq, Perplexity, Mistral, MiniMax, NousResearch,
    AKIOpenAI, OllamaOpenAI, vLLM (Whisper inherits via OpenAI).
  - New engine: Cerebras (fastest inference platform, llama-3.3-70b).
  - New engine: OpenRouter (unified gateway for 300+ models).
  - New engine: Replicate (thousands of open-source models).
  - New engine: LlamaCpp (llama.cpp server, with embeddings).
  - OpenAICompatible: api_key is now optional (undef = no Authorization
    header), enabling local engines (vLLM, llama.cpp) without dummy
    keys.
  - OpenAICompatible: model is now optional in requests, enabling
    single-model servers (vLLM, llama.cpp) without explicit model names.
  - Add a comprehensive engine hierarchy test (t/10_engine_hierarchy.t)
    verifying inheritance, role composition, instantiation, and request
    generation for all 19 engines.
  - Raider self-tools: raider_mcp => 1 enables LLM-controlled tools:
    raider_ask_user, raider_pause, raider_abort, raider_wait,
    raider_wait_for, raider_session_history, raider_manage_mcps,
    raider_switch_engine.
  - Raider engine_catalog: runtime engine switching via self-tool or
    API.
  - Raider mcp_catalog: dynamic MCP server activation/deactivation.
  - Raider inline tools: quick tool definitions without MCP server
    setup.
  - Raider::Result: typed result objects (final, question, pause, abort)
    with backward-compatible stringification.
  - AKI: openai() no longer carries over the native model name (naming
    differs between the native and /v1 APIs); it uses the default model
    and warns.
  - Add a live embedding test (t/82_live_embedding.t) with semantic
    similarity verification via Math::Vector::Similarity for OpenAI,
    Mistral, Ollama, OllamaOpenAI, and LlamaCpp.
  - Add a live chat test (t/83_live_chat.t) for all 16 engines including
    Cerebras, OpenRouter, Perplexity, MiniMax, and LlamaCpp.

0.201 2026-02-23 03:50:17Z

  - Add a Response.thinking attribute for chain-of-thought
    reasoning:
      - Native extraction: DeepSeek/OpenAI-compatible reasoning_content,
        Anthropic thinking blocks, and Gemini thought parts are
        automatically populated on Response.thinking; no configuration
        needed.
      - Think-tag filter: <think> tag stripping is enabled by default on
        all engines. Handles both closed (<think>...</think>) and
        unclosed (<think>...) tags. The tag name is configurable via
        think_tag (default: 'think'). Disable with
        think_tag_filter => 0. Filtering is applied across all text
        paths: simple_chat, streaming, tool calling, and Raider.
  - Add a NousResearch reasoning attribute that enables chain-of-thought
    reasoning for Hermes 4 and DeepHermes 3 models by prepending the
    standard Nous reasoning system prompt.
  - Langfuse cascading traces: Raider now creates a proper hierarchical
    Trace → Span (iteration) → Generation (llm-call) / Span (tool)
    structure instead of a flat trace → generation. Iteration spans
    group the LLM call and its tool calls. Tool spans capture per-tool
    timing, input, and output. The trace is updated with the final
    output at raid end.
  - Langfuse: add langfuse_span() for creating span events.
  - Langfuse: add langfuse_update_trace(), langfuse_update_span(), and
    langfuse_update_generation() for updating observations after
    creation.
  - Langfuse: langfuse_trace() now supports tags, user_id, session_id,
    release, version, public, and environment fields.
  - Langfuse: langfuse_generation() now supports parent_observation_id,
    model_parameters, level, status_message, and version fields.
  - Langfuse: Raider generations now include token usage data and model
    parameters (temperature, max_tokens) when available.
  - Raider: add langfuse_trace_name, langfuse_user_id,
    langfuse_session_id, langfuse_tags, langfuse_release,
    langfuse_version, and langfuse_metadata attributes for customizing
    Langfuse trace creation.
  - Refactor all OpenAI-compatible engines to compose
    Langertha::Role::OpenAICompatible directly instead of extending
    Langertha::Engine::OpenAI. Each engine now includes only the roles
    it actually supports (e.g.
    DeepSeek gets Chat but not Embedding). Removes all the "doesn't
    support X" croak overrides. Affected engines: DeepSeek, Groq,
    Mistral, MiniMax, NousResearch, Perplexity, vLLM, AKIOpenAI,
    OllamaOpenAI.
  - Add Raider context compression: when prompt token usage exceeds a
    configurable threshold (max_context_tokens *
    context_compress_threshold), history is automatically summarized via
    the LLM before the next raid. Supports a separate compression_engine
    for using cheaper models. Manual compression via
    compress_history/compress_history_f.
  - Add Raider session_history: a full chronological archive of ALL
    messages, including tool calls and results, persisted across
    clear_history and reset. Queryable by the LLM via an MCP tool
    registered with register_session_history_tool().
  - Add MiniMax to the live tool-calling test (t/80_live_tool_calling.t)
    and the live Raider test (t/82_live_raider.t).
  - Add t/83_live_minimax.t: a dedicated MiniMax live test covering
    simple_chat, list_models, and Raider with Coding Plan web search.
  - Add a Raider inject() method for mid-raid context injection: queue
    messages from async callbacks, timers, or other tasks; they are
    picked up naturally at the next iteration.
  - Add a Raider on_iteration callback, called before each LLM call
    (iterations 2+) with ($raider, $iteration); it returns messages to
    inject. Injected messages are persisted in history.
  - Add Langertha::Engine::MiniMax for the MiniMax AI API (chat,
    streaming, and tool calling via the OpenAI-compatible API).
  - Rewrite all POD to inline style across all modules: =attr directly
    after has, =method directly after sub. Add POD to all previously
    undocumented modules.
  - Improve =seealso cross-links: remove redundant main-module links,
    add meaningful related-module references.

0.200 2026-02-22 21:53:36Z

  - Add Langertha::Response: a metadata container wrapping the LLM text
    content with id, model, finish_reason, usage (token counts), timing,
    and created fields.
    Uses overloaded stringification for backward compatibility: existing
    code treating responses as strings continues to work.
  - All chat_response methods now return Langertha::Response objects:
      - Role::OpenAICompatible: extracts id, model, created,
        finish_reason, usage
      - Engine::Anthropic: extracts id, model, stop_reason,
        input/output_tokens
      - Engine::Gemini: extracts modelVersion, finishReason,
        usageMetadata (normalized to
        prompt_tokens/completion_tokens/total_tokens)
      - Engine::Ollama: extracts model, done_reason, eval counts, timing
        fields
      - Engine::AKI: extracts model_name, total_duration
  - Add Langertha::Raider: an autonomous agent with conversation history
    and MCP tool calling. Features a mission (system prompt), persistent
    history across raids, cumulative metrics (raids, iterations,
    tool_calls, time_ms), and clear_history and reset methods. Supports
    Hermes tool calling. Auto-instruments raids with Langfuse traces and
    per-iteration generation events when Langfuse is enabled on the
    engine.
  - Add Langertha::Role::Langfuse: observability integration with the
    Langfuse REST API. Composed into Role::Chat, so every engine has
    Langfuse support built in. Auto-instruments simple_chat with trace
    and generation events. Batched ingestion via POST
    /api/public/ingestion with Basic Auth. Disabled by default; active
    when langfuse_public_key and langfuse_secret_key are set (via the
    constructor or the LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY /
    LANGFUSE_URL env vars).
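    A hedged sketch of the backward-compatible Response object; the
    accessors shown come from the field list above, while the shape of
    the usage value is an assumption:

    ```perl
    my $response = $engine->simple_chat('Say hello in one word.');

    # overloaded stringification: behaves like the plain text it wraps
    print "$response\n";

    # metadata accessors added in 0.200
    printf "model=%s finish=%s\n",
        $response->model, $response->finish_reason;
    my $usage = $response->usage;   # token counts; exact shape assumed
    ```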
  - Add ex/response.pl: Response metadata showcase (tokens, model, timing)
  - Add ex/raider.pl: autonomous file explorer agent example
  - Add ex/langfuse.pl: Langfuse observability example
  - Add ex/langfuse-k8s.yaml: Kubernetes manifest for self-hosted
    Langfuse with a pre-configured project and API keys (zero setup)
  - Add t/70_response.t: Response unit tests across all engine formats
  - Add t/72_langfuse.t: Langfuse integration tests with mock HTTP
  - Add t/82_live_raider.t: live Raider integration test
  - Add Langertha::Role::OpenAICompatible: extracted the OpenAI API
    format methods into a reusable role. Engines that use the
    OpenAI-compatible API format now compose this role instead of
    duplicating methods. Engine::OpenAI and all subclasses continue to
    work unchanged.
  - Add Langertha::Engine::OllamaOpenAI: first-class engine for Ollama's
    OpenAI-compatible /v1 endpoint. Ollama's openai() method now returns
    this engine instead of a raw Engine::OpenAI instance.
  - Add Langertha::Engine::AKI for the AKI.IO native API (chat
    completions with key-in-body auth, synchronous mode, dynamic endpoint
    listing via list_models and endpoint_details)
  - Add Langertha::Engine::AKIOpenAI for AKI.IO via the OpenAI-compatible
    API (chat, streaming, tool calling via Role::OpenAICompatible)
  - Add Langertha::Engine::NousResearch for the Nous Research Inference
    API with Hermes-native tool calling via XML tags
  - Add Langertha::Engine::Perplexity for the Perplexity Sonar API (chat
    and streaming only, no tool calling)
  - Add hermes_tools feature flag to Langertha::Role::Tools for
    Hermes-native tool calling via <tool_call> / <tool_response> XML
    tags; enables MCP tool calling on any model that supports the Hermes
    prompt format, even without API-level tool support
  - Add hermes_call_tag and hermes_response_tag attributes for custom XML
    tag names (default: tool_call, tool_response)
  - Add hermes_tool_instructions attribute for customizing the
    instruction text without changing the structural XML template
  - Add hermes_tool_prompt attribute for full system prompt
    override
  - Add hermes_extract_content() method for engines to override response
    content extraction in Hermes mode
  - MCP tool calling is now supported on ALL engines:
    - OpenAI (inherited by Groq, vLLM, Mistral, DeepSeek)
    - Anthropic (with Anthropic-native tool format)
    - Gemini (with Gemini-native functionDeclarations format)
    - Ollama (OpenAI-compatible tool format)
    - NousResearch (Hermes-native via XML tags)
  - Add extract_tool_call() to Role::Tools for engine-agnostic tool call
    parsing across all provider formats
  - Fix Gemini tool calling: pass through native message formats, convert
    MCP tool results to Gemini's functionResponse object
  - Fix Gemini chat_request to preserve native parts in messages from
    tool result round-trips
  - Remove hardcoded all_models() lists from all engines; model discovery
    is now exclusively dynamic via list_models()
  - Update default models:
    - Anthropic: claude-sonnet-4-6 (short alias)
    - Gemini: gemini-2.5-flash (2.0-flash deprecated for new users)
  - Add Hermes tool calling unit test with mock round-trip
    (t/66_tool_calling_hermes.t)
  - Add vLLM tool calling unit test (t/65_tool_calling_vllm.t)
  - Add live integration test for all engines including Ollama, vLLM, and
    NousResearch (t/80_live_tool_calling.t) with multi-model support
  - Add mock round-trip test for Ollama tool calling
    (t/64_tool_calling_ollama_mock.t) using fixture data
  - Add shared Test::MockAsyncHTTP test helper (t/lib/) for mocking async
    HTTP in engine tests
  - Normalize test API key env vars to the TEST_LANGERTHA_*_API_KEY
    prefix to prevent accidental use of production keys
  - Add TEST_LANGERTHA_OLLAMA_URL and TEST_LANGERTHA_OLLAMA_MODELS env
    vars for Ollama live testing
  - Add TEST_LANGERTHA_VLLM_URL, TEST_LANGERTHA_VLLM_MODEL, and
    TEST_LANGERTHA_VLLM_TOOL_CALL_PARSER env vars for vLLM live testing
  - Add AKI.IO native API unit test (t/25_aki_requests.t) with mock
    response parsing for chat, list_models, and endpoint_details
  - Add AKI.IO live integration test (t/81_live_aki.t) for
    list_models, endpoint_details, and simple_chat
  - Add AKI.IO to live tool calling test (t/80_live_tool_calling.t) via
    the OpenAI-compatible API
  - Add TEST_LANGERTHA_AKI_API_KEY and TEST_LANGERTHA_AKI_MODEL env vars
    for AKI.IO live testing
  - Use the RFC 2606 test.invalid domain for dummy URLs in unit tests
  - Add ex/hermes_tools.pl example for Hermes-native tool calling
  - Rewrite all POD to inline style across all 37 modules — =attr
    directly after has, =method directly after sub. Add POD to 18
    previously undocumented modules.

0.100 2026-02-20 05:33:44Z

  - Add MCP (Model Context Protocol) tool calling support
    - New Langertha::Role::Tools for engine-agnostic tool calling
    - Anthropic engine: full tool calling support (format_tools,
      response_tool_calls, format_tool_results, response_text_content)
    - Async chat_with_tools_f() method for an automatic multi-round
      tool-calling loop with configurable max iterations
    - Requires Net::Async::MCP for MCP server communication
  - Add Future::AsyncAwait support for async/await syntax
    - All _f methods (simple_chat_f, simple_chat_stream_f, etc.)
    - Streaming with real-time async callbacks
  - Add streaming support
    - Synchronous callback, iterator, and Future-based APIs
    - SSE parsing for OpenAI / Anthropic / Groq / Mistral / DeepSeek
    - NDJSON parsing for Ollama
  - Add Gemini engine (Google AI Studio)
  - Add dynamic model listing via provider APIs, with caching
  - Add Anthropic extended parameters (effort, inference_geo)
  - Improve POD documentation across all modules

0.008 2025-03-30 04:55:38Z

  - Add Mistral engine integration
  - Adapt the Mistral OpenAPI spec for our parser

0.007 2025-01-25 19:29:51Z

  - Add DeepSeek engine

0.006 2024-09-30 14:07:25Z

  - Add Structured Output support
  - Add Groq engine and Groq Whisper support
  - Add TEST_WITHOUT_STRUCTURED_OUTPUT env variable

0.005 2024-08-22 13:43:31Z

  - Fix data type on keep_alive and remove POSIX round() usage

0.004 2024-08-13 23:10:57Z

  - Fix interpretation of max_tokens on Anthropic (response size, not
    context size)

0.003 2024-08-11 00:21:01Z

  - Add context size and temperature controls

0.002 2024-08-10 02:22:12Z

  - Add Whisper Transcription API
  - Add more engines
  - Fix encoding issues

0.001 2024-08-03 22:47:33Z

  - Initial release
  - Unified Perl interface for LLM APIs
  - Engines: OpenAI, Anthropic, Ollama
  - Role-based architecture (Chat, HTTP, Models, JSON, Embedding)
  - OpenAPI spec-driven request generation
  - Embedding support
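The unified interface introduced in 0.001 can be sketched as below. This is a hypothetical usage snippet, not a verified example: simple_chat and the api_key / url constructor arguments are taken from entries elsewhere in this file, while the model name and localhost URL are assumptions.

```perl
use strict;
use warnings;
use Langertha::Engine::OpenAI;
use Langertha::Engine::Ollama;

# Two different providers behind the same engine interface
# (constructor arguments are assumptions, see note above)
my @engines = (
  Langertha::Engine::OpenAI->new( api_key => $ENV{OPENAI_API_KEY} ),
  Langertha::Engine::Ollama->new( url => 'http://localhost:11434' ),
);

# The same chat call works regardless of which provider backs the engine
print $_->simple_chat('Say hi in one word'), "\n" for @engines;
```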