<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>AgentEnsemble | Blog</title><description/><link>https://agentensemble.net/</link><language>en</language><item><title>A Control Plane for Long-Running Agent Services</title><link>https://agentensemble.net/blog/ensemble-control-api/</link><guid isPermaLink="true">https://agentensemble.net/blog/ensemble-control-api/</guid><description>The Ensemble Control API adds HTTP run submission, state queries, capabilities discovery, run control, SSE streaming, and REST review decisions to the live dashboard -- giving external systems a complete interface to trigger and observe agent runs without compiling Java.

</description><pubDate>Thu, 04 Jun 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;An earlier post in this series covered running agent ensembles as long-running services — always-on processes that accept work over WebSocket, HTTP, queues, or topics instead of running once and exiting. Once an ensemble is a service, a new category of problem appears: how do external systems interact with it?&lt;/p&gt;
&lt;p&gt;The existing WebSocket dashboard streams execution events and handles review decisions. That covers observability and human review. What it doesn’t cover is run submission. There’s no way for a CI pipeline, orchestrator, or custom UI to kick off a run, pass runtime parameters, query what’s currently executing, or cancel something that’s gone wrong — without a WebSocket connection and custom client code.&lt;/p&gt;
&lt;p&gt;The Ensemble Control API fills that gap.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-control-plane-vs-the-data-plane&quot;&gt;The Control Plane vs. the Data Plane&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Before getting into the API itself, one design distinction is worth stating explicitly.&lt;/p&gt;
&lt;p&gt;The v3 network module handles ensemble-to-ensemble communication: tasks delegating work to remote peers, capability registries, federation across namespaces. That’s the &lt;strong&gt;data plane&lt;/strong&gt; — ensemble-internal traffic, designed for ensemble peers.&lt;/p&gt;
&lt;p&gt;The Control API is the &lt;strong&gt;control plane&lt;/strong&gt;: CI pipelines, orchestrators, and custom UIs talking to an ensemble service. Different audience, different semantics. External systems shouldn’t need a WebSocket client, shouldn’t need to understand the ensemble networking protocol, and shouldn’t be treated as ensemble peers. The REST-first design reflects that distinction.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;phase-1-core-rest-endpoints&quot;&gt;Phase 1: Core REST Endpoints&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Phase 1 adds four endpoints to the same Javalin server that hosts the WebSocket dashboard, so there is no new port and no new process:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;POST /api/runs          Submit a run with input variables&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;GET  /api/runs          List recent runs (filterable by status, tag)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;GET  /api/runs/{runId}  Get full run detail (status, task outputs, metrics)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;GET  /api/capabilities  List registered tools, models, and preconfigured tasks&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h3 id=&quot;setup&quot;&gt;Setup&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;The API is activated by adding catalogs to &lt;code dir=&quot;auto&quot;&gt;WebDashboard.builder()&lt;/code&gt;:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;ToolCatalog&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ToolCatalog&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;tool&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;web_search&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, webSearchTool&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;tool&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;calculator&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, calculatorTool&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;ModelCatalog&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;models&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ModelCatalog&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;model&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;sonnet&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, claudeSonnetModel&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;model&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;haiku&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, claudeHaikuModel&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;WebDashboard&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;dashboard&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;WebDashboard&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;port&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;7329&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;toolCatalog&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;modelCatalog&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;models&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;maxConcurrentRuns&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;5&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;maxRetainedCompletedRuns&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;100&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The ensemble wires in the dashboard:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;chatLanguageModel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;claudeSonnetModel&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;webDashboard&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;dashboard&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research {topic} focusing on recent developments in {year}&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;webSearchTool&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    
&lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Write a concise executive summary of the research&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;start&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;7329&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;ToolCatalog&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;ModelCatalog&lt;/code&gt; serve two purposes. First, they make the API transport-agnostic: JSON refers to tools and models by name, not by class. Second, they act as allowlists: only registered tools and models can be used, so dynamic task creation in later phases cannot instantiate arbitrary code.&lt;/p&gt;
&lt;div&gt;&lt;h3 id=&quot;submitting-a-run&quot;&gt;Submitting a run&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;POST /api/runs&lt;/code&gt; submits the pre-configured ensemble tasks with variable substitution:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;inputs&quot;&lt;/span&gt;&lt;span&gt;: {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;&quot;topic&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;AI safety&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;&quot;year&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;2025&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;},&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;tags&quot;&lt;/span&gt;&lt;span&gt;: {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;&quot;triggeredBy&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;ci-pipeline&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;&quot;environment&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;staging&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Response (&lt;code dir=&quot;auto&quot;&gt;202 Accepted&lt;/code&gt;):&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;runId&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;run-7f3a2b&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;status&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;ACCEPTED&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;tasks&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;2&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;workflow&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;SEQUENTIAL&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The run executes asynchronously — the response is immediate. Poll &lt;code dir=&quot;auto&quot;&gt;GET /api/runs/{runId}&lt;/code&gt; for completion. Tags are arbitrary metadata for filtering and auditing. An empty body submits the template ensemble with no substitution. If &lt;code dir=&quot;auto&quot;&gt;maxConcurrentRuns&lt;/code&gt; is reached, the response is &lt;code dir=&quot;auto&quot;&gt;429&lt;/code&gt; with a &lt;code dir=&quot;auto&quot;&gt;retryAfterMs&lt;/code&gt; hint.&lt;/p&gt;
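&lt;p&gt;The substitution step can be pictured with a small sketch. This is not AgentEnsemble code: &lt;code dir=&quot;auto&quot;&gt;TemplateInputs&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;Input&lt;/code&gt;, and &lt;code dir=&quot;auto&quot;&gt;substitute&lt;/code&gt; are hypothetical names, and leaving unknown placeholders untouched is an assumption, not documented behavior.&lt;/p&gt;

```java
/** Illustrative only: a minimal placeholder substitution, not the library's implementation. */
public final class TemplateInputs {

    /** Hypothetical name/value pair mirroring one entry of the "inputs" object. */
    public record Input(String name, String value) {}

    /**
     * Replaces each {name} placeholder with its value.
     * Placeholders without a matching input are left as-is (an assumption).
     */
    public static String substitute(String template, Input... inputs) {
        String result = template;
        for (Input input : inputs) {
            result = result.replace("{" + input.name() + "}", input.value());
        }
        return result;
    }

    public static void main(String[] args) {
        String description = "Research {topic} focusing on recent developments in {year}";
        System.out.println(substitute(description,
                new Input("topic", "AI safety"),
                new Input("year", "2025")));
        // prints: Research AI safety focusing on recent developments in 2025
    }
}
```

&lt;p&gt;Because the replacements are applied one after another, a value that itself contains a placeholder would be expanded by a later input. A production implementation would likely scan the template once instead.&lt;/p&gt;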
&lt;div&gt;&lt;h3 id=&quot;querying-capabilities&quot;&gt;Querying capabilities&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;GET /api/capabilities&lt;/code&gt; exposes what’s registered:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;tools&quot;&lt;/span&gt;&lt;span&gt;: [&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;{ &lt;/span&gt;&lt;span&gt;&quot;name&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;web_search&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;description&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Search the web using Google&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; },&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;{ &lt;/span&gt;&lt;span&gt;&quot;name&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;calculator&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;description&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Evaluate mathematical expressions&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; }&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;],&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;models&quot;&lt;/span&gt;&lt;span&gt;: [&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;{ &lt;/span&gt;&lt;span&gt;&quot;alias&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;sonnet&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;provider&quot;&lt;/span&gt;&lt;span&gt;: 
&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;anthropic&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; },&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;{ &lt;/span&gt;&lt;span&gt;&quot;alias&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;haiku&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;provider&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;anthropic&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; }&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;],&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;preconfiguredTasks&quot;&lt;/span&gt;&lt;span&gt;: [&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;{ &lt;/span&gt;&lt;span&gt;&quot;description&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research {topic} focusing on recent developments in {year}&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; },&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;{ &lt;/span&gt;&lt;span&gt;&quot;description&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Write a concise executive summary of the research&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; }&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;]&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;GET /api/runs/{runId}&lt;/code&gt; returns full run detail including task outputs and metrics. &lt;code dir=&quot;auto&quot;&gt;GET /api/runs&lt;/code&gt; lists recent runs filterable by &lt;code dir=&quot;auto&quot;&gt;?status=RUNNING&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;?status=COMPLETED&lt;/code&gt;, or &lt;code dir=&quot;auto&quot;&gt;?tag=triggeredBy:ci-pipeline&lt;/code&gt;.&lt;/p&gt;
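&lt;p&gt;From a caller’s side, the Phase 1 endpoints need nothing beyond the JDK’s built-in HTTP client. The sketch below only builds the requests and sends nothing. The class name &lt;code dir=&quot;auto&quot;&gt;ControlApiRequests&lt;/code&gt;, the base URL &lt;code dir=&quot;auto&quot;&gt;http://localhost:7329&lt;/code&gt; (taken from the port in the setup example), and the &lt;code dir=&quot;auto&quot;&gt;Content-Type&lt;/code&gt; header are assumptions.&lt;/p&gt;

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper that only builds requests; nothing is sent here. */
public final class ControlApiRequests {

    /** POST /api/runs with a JSON body holding "inputs" and, optionally, "tags". */
    public static HttpRequest submitRun(String baseUrl, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/runs"))
                .header("Content-Type", "application/json") // assumed; not stated in the post
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }

    /** GET /api/runs/{runId}, used to poll a submitted run. */
    public static HttpRequest getRun(String baseUrl, String runId) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/runs/" + runId))
                .GET()
                .build();
    }

    /** GET /api/runs?status=..., used to list runs in a given state. */
    public static HttpRequest listRunsByStatus(String baseUrl, String status) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/runs?status=" + status))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = submitRun("http://localhost:7329",
                "{\"inputs\":{\"topic\":\"AI safety\"}}");
        System.out.println(request.method() + " " + request.uri());
        // prints: POST http://localhost:7329/api/runs
    }
}
```

&lt;p&gt;Each request can then be passed to &lt;code dir=&quot;auto&quot;&gt;HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())&lt;/code&gt;. A &lt;code dir=&quot;auto&quot;&gt;202&lt;/code&gt; response carries the &lt;code dir=&quot;auto&quot;&gt;runId&lt;/code&gt; to poll, and a &lt;code dir=&quot;auto&quot;&gt;429&lt;/code&gt; carries the &lt;code dir=&quot;auto&quot;&gt;retryAfterMs&lt;/code&gt; hint described above.&lt;/p&gt;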
&lt;div&gt;&lt;h2 id=&quot;phase-2-the-three-level-run-submission-model&quot;&gt;Phase 2: The Three-Level Run Submission Model&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The most interesting design decision in the Control API is the graduated run submission model. There are three levels, each more dynamic than the last.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Level 1&lt;/strong&gt; (covered above): substitute template variables into the pre-configured ensemble. The simplest and most constrained option — the Java code defines what runs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Level 2&lt;/strong&gt;: override specific fields of individual tasks at runtime.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Level 3&lt;/strong&gt;: define a new task list entirely in the POST body, without changing any Java code.&lt;/p&gt;
&lt;p&gt;This graduated approach keeps the simple case simple while making the more dynamic cases possible without abandoning the safety properties of the catalog model.&lt;/p&gt;
&lt;div&gt;&lt;h3 id=&quot;task-naming&quot;&gt;Task naming&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;To use Levels 2 and 3 effectively, tasks can be given logical names:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research {topic} focusing on recent developments in {year}&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;webSearchTool&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;GET /api/capabilities&lt;/code&gt; returns task names alongside descriptions. Level 2 override keys match by exact name first, then by description prefix (first 50 characters, case-insensitive) as a fallback.&lt;/p&gt;
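&lt;p&gt;One plausible reading of that matching rule is sketched below. &lt;code dir=&quot;auto&quot;&gt;OverrideKeyMatcher&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;TaskRef&lt;/code&gt; are hypothetical names. Truncating both the key and the description to 50 characters before the case-insensitive comparison is an assumption about how the fallback works.&lt;/p&gt;

```java
/** Illustrative only: name-first, description-prefix-second task lookup. */
public final class OverrideKeyMatcher {

    /** Hypothetical stand-in for a template task; the name may be null for unnamed tasks. */
    public record TaskRef(String name, String description) {}

    /** Returns the index of the task selected by the key, or -1 if nothing matches. */
    public static int match(String key, TaskRef... tasks) {
        // Pass 1: an exact name match always wins.
        for (int i = 0; i != tasks.length; i++) {
            if (key.equals(tasks[i].name())) {
                return i;
            }
        }
        // Pass 2 (fallback): compare the first 50 characters, ignoring case.
        for (int i = 0; i != tasks.length; i++) {
            if (first50(tasks[i].description()).equalsIgnoreCase(first50(key))) {
                return i;
            }
        }
        return -1;
    }

    private static String first50(String text) {
        return text.substring(0, Math.min(50, text.length()));
    }

    public static void main(String[] args) {
        TaskRef[] tasks = {
            new TaskRef("researcher", "Research {topic} focusing on recent developments in {year}"),
            new TaskRef(null, "Write a concise executive summary of the research")
        };
        System.out.println(match("researcher", tasks)); // prints: 0
        System.out.println(match("write a concise executive summary of the research", tasks)); // prints: 1
    }
}
```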
&lt;div&gt;&lt;h3 id=&quot;level-2-per-task-overrides&quot;&gt;Level 2: Per-task overrides&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;taskOverrides&lt;/code&gt; lets a caller change a specific task’s description, expected output, model, iteration limit, tools, or additional context without recompilation:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;inputs&quot;&lt;/span&gt;&lt;span&gt;: { &lt;/span&gt;&lt;span&gt;&quot;topic&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;AI safety&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; },&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;taskOverrides&quot;&lt;/span&gt;&lt;span&gt;: {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;&quot;researcher&quot;&lt;/span&gt;&lt;span&gt;: {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;description&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research {topic} focusing on EU AI Act compliance&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;expectedOutput&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;A regulatory analysis report with citations&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;model&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;sonnet&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;maxIterations&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;15&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      
&lt;/span&gt;&lt;span&gt;&quot;additionalContext&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;The EU AI Act was formally adopted in March 2024.&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;tools&quot;&lt;/span&gt;&lt;span&gt;: {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;&quot;add&quot;&lt;/span&gt;&lt;span&gt;: [&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;web_search&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;],&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;&quot;remove&quot;&lt;/span&gt;&lt;span&gt;: [&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;calculator&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;]&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;      &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The override key (&lt;code dir=&quot;auto&quot;&gt;&quot;researcher&quot;&lt;/code&gt;) is resolved against the template ensemble’s tasks using the matching rule above: exact name first, then description prefix. If no task matches, the request is rejected with &lt;code dir=&quot;auto&quot;&gt;400&lt;/code&gt;. The original task objects are never mutated: &lt;code dir=&quot;auto&quot;&gt;Task.toBuilder()&lt;/code&gt; creates modified copies.&lt;/p&gt;
&lt;p&gt;All tool references are resolved against the &lt;code dir=&quot;auto&quot;&gt;ToolCatalog&lt;/code&gt; and all model references against the &lt;code dir=&quot;auto&quot;&gt;ModelCatalog&lt;/code&gt;. A caller cannot inject a tool or model that was not pre-registered.&lt;/p&gt;
&lt;div&gt;&lt;h3 id=&quot;level-3-dynamic-task-creation&quot;&gt;Level 3: Dynamic task creation&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;When &lt;code dir=&quot;auto&quot;&gt;tasks&lt;/code&gt; is provided in the request body, the template ensemble’s task list is replaced entirely. The template’s model, catalogs, and configuration are preserved — only the task list changes:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;tasks&quot;&lt;/span&gt;&lt;span&gt;: [&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;{&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;name&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;description&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research the competitive landscape for {product}&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;expectedOutput&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;A competitive analysis identifying 5 key competitors&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;tools&quot;&lt;/span&gt;&lt;span&gt;: [&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;web_search&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;],&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;model&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;sonnet&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;maxIterations&quot;&lt;/span&gt;&lt;span&gt;: 
&lt;/span&gt;&lt;span&gt;20&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;},&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;{&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;name&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;writer&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;description&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Write an executive brief based on the research&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;expectedOutput&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;A 1-page executive summary suitable for C-suite&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;context&quot;&lt;/span&gt;&lt;span&gt;: [&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;$researcher&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;],&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;      &lt;/span&gt;&lt;span&gt;&quot;model&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;sonnet&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  
&lt;/span&gt;&lt;/span&gt;&lt;span&gt;],&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;inputs&quot;&lt;/span&gt;&lt;span&gt;: { &lt;/span&gt;&lt;span&gt;&quot;product&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;AgentEnsemble&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; }&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code dir=&quot;auto&quot;&gt;context&lt;/code&gt; field declares dependencies between tasks. &lt;code dir=&quot;auto&quot;&gt;$researcher&lt;/code&gt; references the task named &lt;code dir=&quot;auto&quot;&gt;&quot;researcher&quot;&lt;/code&gt;; &lt;code dir=&quot;auto&quot;&gt;$0&lt;/code&gt; references the task at index 0. The scheduler infers the workflow type from these dependencies — if context references exist and no workflow is explicitly set, &lt;code dir=&quot;auto&quot;&gt;PARALLEL&lt;/code&gt; (DAG-based) is used. Circular dependencies and unknown references are rejected at submission time.&lt;/p&gt;
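&lt;p&gt;The submission-time checks described above can be sketched roughly as follows. This is a minimal illustration, not the project’s actual validator: the class &lt;code dir=&quot;auto&quot;&gt;ContextGraph&lt;/code&gt; and its methods are hypothetical names, and tasks are modeled as plain arrays to keep the example self-contained. It resolves name and index references, rejects unknown ones, and uses a depth-first walk to reject cycles.&lt;/p&gt;

```java
import java.util.Arrays;

/** Hypothetical sketch: resolves "$name" and "$index" context references and rejects cycles. */
final class ContextGraph {

    /** Returns the index of the task a reference points at, or throws if it is unknown. */
    static int resolve(String ref, String[] names) {
        if (!ref.startsWith("$")) {
            throw new IllegalArgumentException("Not a context reference: " + ref);
        }
        String key = ref.substring(1);
        if (key.matches("[0-9]+")) {
            // Index form, e.g. "$0" points at the first task in the submission.
            int index = Integer.parseInt(key);
            if (index >= names.length) {
                throw new IllegalArgumentException("Unknown task index: " + ref);
            }
            return index;
        }
        // Name form, e.g. "$researcher" points at the task named "researcher".
        int index = Arrays.asList(names).indexOf(key);
        if (index == -1) {
            throw new IllegalArgumentException("Unknown task name: " + ref);
        }
        return index;
    }

    /** Throws if any reference is unknown or the dependency graph contains a cycle. */
    static void validate(String[] names, String[][] context) {
        int[] state = new int[names.length]; // 0 = unvisited, 1 = in progress, 2 = done
        for (int task = 0; task != names.length; task++) {
            visit(task, names, context, state);
        }
    }

    private static void visit(int task, String[] names, String[][] context, int[] state) {
        if (state[task] == 2) {
            return;
        }
        if (state[task] == 1) {
            // Reaching a task that is still being explored means the path loops back on itself.
            throw new IllegalArgumentException("Circular dependency at task: " + names[task]);
        }
        state[task] = 1;
        for (String ref : context[task]) {
            visit(resolve(ref, names), names, context, state);
        }
        state[task] = 2;
    }
}
```

&lt;p&gt;With the two tasks from the example, &lt;code dir=&quot;auto&quot;&gt;validate&lt;/code&gt; accepts a writer that depends on &lt;code dir=&quot;auto&quot;&gt;$researcher&lt;/code&gt; and throws if the researcher also depends on the writer.&lt;/p&gt;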
&lt;div&gt;&lt;h3 id=&quot;websocket-run-submission&quot;&gt;WebSocket run submission&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;REST isn’t the only submission channel. WebSocket clients can submit runs using the &lt;code dir=&quot;auto&quot;&gt;run_request&lt;/code&gt; message — useful for browser-based UIs that already have a dashboard connection:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;type&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;run_request&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;requestId&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;req-1&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;inputs&quot;&lt;/span&gt;&lt;span&gt;: { &lt;/span&gt;&lt;span&gt;&quot;topic&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;AI safety&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; },&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;tags&quot;&lt;/span&gt;&lt;span&gt;: { &lt;/span&gt;&lt;span&gt;&quot;env&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;staging&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; }&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The server acknowledges immediately with &lt;code dir=&quot;auto&quot;&gt;run_ack&lt;/code&gt;. On completion it sends &lt;code dir=&quot;auto&quot;&gt;run_result&lt;/code&gt; to the originating session only — the existing &lt;code dir=&quot;auto&quot;&gt;ensemble_completed&lt;/code&gt; broadcast continues to go to all connected clients unchanged.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;phase-3-run-control&quot;&gt;Phase 3: Run Control&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Phase 3 adds two operations that apply to in-flight runs.&lt;/p&gt;
&lt;div&gt;&lt;h3 id=&quot;cancellation&quot;&gt;Cancellation&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;POST /api/runs/{runId}/cancel&lt;/code&gt; cancels a running or accepted run. This is cooperative cancellation — the current in-flight task completes normally; cancellation takes effect before the next task starts.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{ &lt;/span&gt;&lt;span&gt;&quot;runId&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;run-abc&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;status&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;CANCELLING&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; }&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The same operation is available over WebSocket: &lt;code dir=&quot;auto&quot;&gt;{ &quot;type&quot;: &quot;run_control&quot;, &quot;runId&quot;: &quot;run-abc&quot;, &quot;action&quot;: &quot;cancel&quot; }&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The cooperative model is intentional. A task mid-execution is mid-LLM-call. Interrupting that immediately would leave the ensemble in an undefined state. Completing the current task and stopping cleanly at the boundary gives deterministic behavior without losing progress already made.&lt;/p&gt;
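&lt;p&gt;A minimal sketch of the cooperative model, assuming a sequential run: the cancel flag is read only at task boundaries, so a task that has already started always finishes. &lt;code dir=&quot;auto&quot;&gt;CooperativeRun&lt;/code&gt; and the terminal status strings are hypothetical and chosen for illustration; they are not the library’s internals.&lt;/p&gt;

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of cooperative cancellation checked at task boundaries. */
final class CooperativeRun {
    private final AtomicBoolean cancelRequested = new AtomicBoolean(false);
    private int completedTasks = 0;

    /** Called by the control plane, for example from a cancel endpoint handler. */
    void cancel() {
        cancelRequested.set(true);
    }

    int completedTasks() {
        return completedTasks;
    }

    /** Runs tasks in order and returns "CANCELLED" or "COMPLETED" (illustrative names). */
    String execute(Runnable... tasks) {
        for (Runnable task : tasks) {
            // The flag is read only between tasks, never during one.
            if (cancelRequested.get()) {
                return "CANCELLED";
            }
            task.run();
            completedTasks++;
        }
        return "COMPLETED";
    }
}
```

&lt;p&gt;If the second of three tasks triggers &lt;code dir=&quot;auto&quot;&gt;cancel()&lt;/code&gt; while it runs, that task still completes and is counted, and the third never starts.&lt;/p&gt;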
&lt;div&gt;&lt;h3 id=&quot;mid-run-model-switching&quot;&gt;Mid-run model switching&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;POST /api/runs/{runId}/model&lt;/code&gt; switches which LLM subsequent tasks will use:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{ &lt;/span&gt;&lt;span&gt;&quot;model&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;haiku&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; }&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The switch takes effect on the next LLM call; the in-flight call completes with the previous model. The model alias must be registered in the &lt;code dir=&quot;auto&quot;&gt;ModelCatalog&lt;/code&gt;. This is useful when a long-running ensemble is partway through and you want subsequent tasks to use a cheaper or faster model.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;phase-4-event-streaming&quot;&gt;Phase 4: Event Streaming&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The existing WebSocket dashboard broadcasts all execution events to all connected sessions. Phase 4 adds filtering and an HTTP-native alternative.&lt;/p&gt;
&lt;div&gt;&lt;h3 id=&quot;subscription-filtering&quot;&gt;Subscription filtering&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;WebSocket clients can subscribe to a specific subset of events:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{ &lt;/span&gt;&lt;span&gt;&quot;type&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;subscribe&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;events&quot;&lt;/span&gt;&lt;span&gt;: [&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;task_started&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;task_completed&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;run_result&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;] }&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Or filter to a specific run:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{ &lt;/span&gt;&lt;span&gt;&quot;type&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;subscribe&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;events&quot;&lt;/span&gt;&lt;span&gt;: [&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;run_result&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;], &lt;/span&gt;&lt;span&gt;&quot;runId&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;run-abc&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; }&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Reset to all events with &lt;code dir=&quot;auto&quot;&gt;&quot;events&quot;: [&quot;*&quot;]&lt;/code&gt;. The server responds with a &lt;code dir=&quot;auto&quot;&gt;subscribe_ack&lt;/code&gt; confirming the effective subscription.&lt;/p&gt;
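&lt;p&gt;One way to picture the filter a subscription implies is as a per-session predicate over event type and run ID. The sketch below is hypothetical: &lt;code dir=&quot;auto&quot;&gt;Subscription&lt;/code&gt; is not a class from the library, and it assumes the event list and the optional &lt;code dir=&quot;auto&quot;&gt;runId&lt;/code&gt; must both match.&lt;/p&gt;

```java
import java.util.Arrays;

/** Hypothetical sketch of the delivery check a subscription implies. */
final class Subscription {
    private final String[] events;
    private final String runId; // null means "any run"

    Subscription(String runId, String... events) {
        this.runId = runId;
        this.events = events.clone();
    }

    /** True when an event of this type, for this run, should be delivered to the session. */
    boolean matches(String eventType, String eventRunId) {
        var wanted = Arrays.asList(events);
        // "*" is the wildcard that resets the subscription to all event types.
        boolean typeMatches = wanted.contains("*") || wanted.contains(eventType);
        if (!typeMatches) {
            return false;
        }
        return runId == null || runId.equals(eventRunId);
    }
}
```

&lt;p&gt;A subscription built as &lt;code dir=&quot;auto&quot;&gt;new Subscription(&quot;run-abc&quot;, &quot;run_result&quot;)&lt;/code&gt; matches only the result of that run, while &lt;code dir=&quot;auto&quot;&gt;new Subscription(null, &quot;*&quot;)&lt;/code&gt; matches everything.&lt;/p&gt;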
&lt;div&gt;&lt;h3 id=&quot;sse-streaming&quot;&gt;SSE streaming&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;For HTTP-only clients — curl scripts, serverless functions, server-side integrations — a WebSocket connection is awkward. The SSE endpoint offers the same event stream over a regular HTTP connection:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;GET /api/runs/{runId}/events&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Accept: text/event-stream&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;For completed runs, stored events replay immediately and the connection closes. For in-progress runs, events stream until the run completes. A &lt;code dir=&quot;auto&quot;&gt;from&lt;/code&gt; parameter supports reconnection by resuming from a specific position in the stored events.&lt;/p&gt;
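&lt;p&gt;From a JVM client, the request can be built with the standard &lt;code dir=&quot;auto&quot;&gt;java.net.http&lt;/code&gt; API. This is a sketch under assumptions: the helper class &lt;code dir=&quot;auto&quot;&gt;SseRequests&lt;/code&gt; is hypothetical, and passing &lt;code dir=&quot;auto&quot;&gt;from&lt;/code&gt; as a numeric query parameter is a guess at the wire format, so check the control API guide for the exact form.&lt;/p&gt;

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper that builds the SSE request for a run's event stream. */
final class SseRequests {

    /**
     * Builds a GET request for the event stream of one run.
     * Assumption: "from" is a query parameter holding a numeric position.
     */
    static HttpRequest events(String baseUrl, String runId, long fromPosition) {
        URI uri = URI.create(baseUrl + "/api/runs/" + runId + "/events?from=" + fromPosition);
        return HttpRequest.newBuilder(uri)
                .header("Accept", "text/event-stream")
                .GET()
                .build();
    }
}
```

&lt;p&gt;Sending the request with &lt;code dir=&quot;auto&quot;&gt;HttpClient.send(request, HttpResponse.BodyHandlers.ofLines())&lt;/code&gt; yields the stream line by line, which is usually enough for a script that only needs to watch a run finish.&lt;/p&gt;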
&lt;div&gt;&lt;h2 id=&quot;phase-5-completing-the-control-loop&quot;&gt;Phase 5: Completing the Control Loop&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Phase 5 rounds out the API with three operations that were previously only available through the WebSocket dashboard or by interacting with a running Java process directly.&lt;/p&gt;
&lt;div&gt;&lt;h3 id=&quot;rest-review-decisions&quot;&gt;REST review decisions&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;The human-in-the-loop system generates review gates where a reviewer approves, edits, or rejects task output before the ensemble proceeds. Phase 5 exposes this over REST, so server-side systems (Slack bots, CI pipelines) can automate or route review decisions:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;POST /api/reviews/{reviewId}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{ &quot;decision&quot;: &quot;CONTINUE&quot; }&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;For edits:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{ &lt;/span&gt;&lt;span&gt;&quot;decision&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;EDIT&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;revisedOutput&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Updated output...&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; }&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Discover pending reviews:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;GET /api/reviews&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;GET /api/reviews?runId=run-abc&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
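&lt;p&gt;A server-side integration such as a bot could post a decision with the JDK’s built-in HTTP client. The sketch below only builds the request. &lt;code dir=&quot;auto&quot;&gt;ReviewRequests&lt;/code&gt; is a hypothetical helper, the &lt;code dir=&quot;auto&quot;&gt;Content-Type&lt;/code&gt; header is an assumption, and the hand-rolled escaping covers only backslashes, quotes, and newlines, so a real client would use a JSON library.&lt;/p&gt;

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper for submitting review decisions over REST. */
final class ReviewRequests {

    /** Minimal JSON string quoting: backslashes, double quotes, and newlines only. */
    static String quote(String text) {
        String escaped = text.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
        return "\"" + escaped + "\"";
    }

    /** Body for an EDIT decision, using the field names from the example above. */
    static String editBody(String revisedOutput) {
        return "{ \"decision\": \"EDIT\", \"revisedOutput\": " + quote(revisedOutput) + " }";
    }

    /** Builds the POST request for one review; the JSON content type is assumed. */
    static HttpRequest decide(String baseUrl, String reviewId, String jsonBody) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/reviews/" + reviewId))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

&lt;p&gt;Keeping request construction separate from sending makes the decision logic easy to unit test without a running ensemble.&lt;/p&gt;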
&lt;div&gt;&lt;h3 id=&quot;context-injection&quot;&gt;Context injection&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;Inject a directive into a running ensemble’s &lt;code dir=&quot;auto&quot;&gt;DirectiveStore&lt;/code&gt;. The directive is picked up on the next LLM iteration of any agent in the ensemble:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;POST /api/runs/{runId}/inject&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{ &quot;content&quot;: &quot;Focus on EU AI Act compliance&quot;, &quot;target&quot;: &quot;researcher&quot; }&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;This is the REST equivalent of what the dashboard allows through the live run view — useful for server-side automation that needs to steer a run mid-execution.&lt;/p&gt;
&lt;div&gt;&lt;h3 id=&quot;direct-tool-invocation&quot;&gt;Direct tool invocation&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;Execute a registered tool from the &lt;code dir=&quot;auto&quot;&gt;ToolCatalog&lt;/code&gt; without running a full ensemble:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;POST /api/tools/calculator/invoke&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{ &quot;input&quot;: &quot;What is 42 * 17?&quot; }&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Response:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{ &lt;/span&gt;&lt;span&gt;&quot;tool&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;calculator&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;status&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;SUCCESS&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;output&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;714&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;durationMs&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;2&lt;/span&gt;&lt;span&gt; }&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;This is useful for integration testing, for validating tool configuration, and for pipeline steps that need a single tool call without the overhead of an ensemble run.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-design-tension&quot;&gt;The Design Tension&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The interesting question in a feature like this is where the boundary sits between the control plane and the data plane.&lt;/p&gt;
&lt;p&gt;The v3 network module already has capability queries (&lt;code dir=&quot;auto&quot;&gt;CapabilityQueryMessage&lt;/code&gt;), task delegation (&lt;code dir=&quot;auto&quot;&gt;NetworkTask&lt;/code&gt;/&lt;code dir=&quot;auto&quot;&gt;NetworkTool&lt;/code&gt;), and directives (&lt;code dir=&quot;auto&quot;&gt;DirectiveMessage&lt;/code&gt;). The Control API exposes similar operations — but over HTTP, for a different audience, with different security and access semantics.&lt;/p&gt;
&lt;p&gt;The key distinction is the audience. External systems are operators, not ensemble peers: they should not need a WebSocket client or an understanding of the ensemble networking protocol. The REST-first design, catalog-enforced allowlists, and graduated Level 1/2/3 submission model all reflect that distinction.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The Ensemble Control API is documented in the &lt;a href=&quot;https://agentensemble.net/guides/ensemble-control-api/&quot;&gt;control API guide&lt;/a&gt;. The underlying design doc is &lt;a href=&quot;https://agentensemble.net/design/28-ensemble-control-api/&quot;&gt;design/28&lt;/a&gt;. Source is on &lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I’d be interested in where the three-level submission model feels right or falls short. The boundary between Level 2 (override existing tasks) and Level 3 (define new tasks) is where the most design tension sits — curious whether that separation is useful or whether most real use cases collapse to one or the other.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Running Agent Tasks as Temporal Activities</title><link>https://agentensemble.net/blog/executor-integration/</link><guid isPermaLink="true">https://agentensemble.net/blog/executor-integration/</guid><description>The agentensemble-executor module lets you call AgentEnsemble directly in-process from Temporal, Step Functions, or any external workflow engine -- no HTTP server, no second orchestration layer.

</description><pubDate>Tue, 02 Jun 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;If you’re running Temporal in production, you’ve already solved the hard parts of long-running workflow orchestration: durable execution, activity retries, heartbeating, workflow history, and cross-service coordination. The question is how agent tasks fit into that model.&lt;/p&gt;
&lt;p&gt;The obvious answer — run AgentEnsemble as a separate service and call it over HTTP from Temporal activities — introduces latency, network failure modes, and another process to operate. A less obvious answer is that the two systems don’t need to be separated at all.&lt;/p&gt;
&lt;p&gt;The &lt;code dir=&quot;auto&quot;&gt;agentensemble-executor&lt;/code&gt; module lets you call AgentEnsemble tasks &lt;strong&gt;directly in-process&lt;/strong&gt; from any Temporal activity. No HTTP server. No Temporal SDK dependency inside the library. Just a Java call.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;two-execution-modes&quot;&gt;Two Execution Modes&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The module provides two executors with different granularity:&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Class&lt;/th&gt;&lt;th&gt;Granularity&lt;/th&gt;&lt;th&gt;When to use&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;TaskExecutor&lt;/code&gt;&lt;/td&gt;&lt;td&gt;One task = one external activity&lt;/td&gt;&lt;td&gt;Per-task Temporal retry, timeout, and heartbeat&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;EnsembleExecutor&lt;/code&gt;&lt;/td&gt;&lt;td&gt;One ensemble = one external activity&lt;/td&gt;&lt;td&gt;Simpler pipelines; AgentEnsemble handles internal orchestration inside a single activity&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;TaskExecutor&lt;/code&gt; is the recommended pattern when you want Temporal to own the retry and timeout semantics for individual AI steps. &lt;code dir=&quot;auto&quot;&gt;EnsembleExecutor&lt;/code&gt; is simpler when the pipeline is short and internal retry is not a concern.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;heartbeats-work&quot;&gt;Heartbeats Work&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;A common concern when embedding long-running work inside a Temporal activity is heartbeating. If the activity doesn’t heartbeat within its configured heartbeat timeout, Temporal marks the attempt as timed out.&lt;/p&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;HeartbeatEnsembleListener&lt;/code&gt; bridges &lt;code dir=&quot;auto&quot;&gt;EnsembleListener&lt;/code&gt; lifecycle events to any &lt;code dir=&quot;auto&quot;&gt;Consumer&amp;#x3C;Object&gt;&lt;/code&gt;. Passing Temporal’s heartbeat method as the consumer is one line:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;executor&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;execute&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;request, &lt;/span&gt;&lt;span&gt;Activity&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getExecutionContext&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;::&lt;/span&gt;&lt;span&gt;heartbeat&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The consumer fires on &lt;code dir=&quot;auto&quot;&gt;task_started&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;task_completed&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;tool_call&lt;/code&gt;, and &lt;code dir=&quot;auto&quot;&gt;llm_iteration_started&lt;/code&gt; — frequently enough that a 2-minute heartbeat window is generous for typical agent workloads. The heartbeat payload is a &lt;code dir=&quot;auto&quot;&gt;HeartbeatDetail&lt;/code&gt; record serializable by Temporal’s default Jackson &lt;code dir=&quot;auto&quot;&gt;DataConverter&lt;/code&gt;, so it’s visible in the Temporal UI and accessible via &lt;code dir=&quot;auto&quot;&gt;Activity.getLastHeartbeatDetails()&lt;/code&gt;.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;a-full-temporal-integration&quot;&gt;A Full Temporal Integration&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The recommended pattern wraps each AgentEnsemble task as a separate &lt;code dir=&quot;auto&quot;&gt;@ActivityMethod&lt;/code&gt;:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;ActivityInterface&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;interface&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ResearchPipelineActivity&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;ActivityMethod&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;TaskResult&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;research&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;TaskRequest&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;request&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;ActivityMethod&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;TaskResult&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;write&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;TaskRequest&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;request&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;class&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ResearchPipelineActivityImpl&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;implements&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ResearchPipelineActivity&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;private&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;final&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;TaskExecutor&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;executor&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;/** Production constructor. */&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ResearchPipelineActivityImpl&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;this&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;TaskExecutor&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;            &lt;/span&gt;&lt;/span&gt;&lt;span&gt;SimpleModelProvider&lt;/span&gt;&lt;span&gt;.of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;OpenAiChatModel&lt;/span&gt;&lt;span&gt;.builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.apiKey&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&lt;span&gt;System&lt;/span&gt;&lt;span&gt;.getenv&lt;/span&gt;&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.modelName&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;gpt-4o-mini&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                    
&lt;/span&gt;&lt;/span&gt;&lt;span&gt;.build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;            &lt;/span&gt;&lt;/span&gt;&lt;span&gt;SimpleToolProvider&lt;/span&gt;&lt;span&gt;.builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.tool&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;web-search&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; WebSearchTool&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&lt;span&gt;System&lt;/span&gt;&lt;span&gt;.getenv&lt;/span&gt;&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;SEARCH_API_KEY&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;);&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;/** Package-private constructor for testing -- accepts FakeTaskExecutor. */&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;ResearchPipelineActivityImpl&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;TaskExecutor&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;executor&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;executor&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; executor;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Override&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;TaskResult&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;research&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;TaskRequest&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;request&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;executor&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;execute&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;request, &lt;/span&gt;&lt;span&gt;Activity&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getExecutionContext&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;::&lt;/span&gt;&lt;span&gt;heartbeat&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Override&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;TaskResult&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;write&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;TaskRequest&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;request&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;executor&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;execute&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;request, &lt;/span&gt;&lt;span&gt;Activity&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getExecutionContext&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;::&lt;/span&gt;&lt;span&gt;heartbeat&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The workflow sequences activities and passes upstream outputs as context entries:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;class&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ResearchWorkflowImpl&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;implements&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ResearchWorkflow&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;private&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;final&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ResearchPipelineActivity&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;activity&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;Workflow&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;newActivityStub&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;ResearchPipelineActivity&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;class&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;ActivityOptions&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;newBuilder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;setScheduleToCloseTimeout&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Duration&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ofMinutes&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;30&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;setHeartbeatTimeout&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Duration&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ofMinutes&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;2&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                
&lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;setRetryOptions&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;RetryOptions&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;newBuilder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;setMaximumAttempts&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;3&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Override&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;run&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;topic&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;TaskResult&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;research&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;activity&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;research&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;TaskRequest&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research the latest developments in {topic}&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;expectedOutput&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;A comprehensive, accurate research summary&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;agent&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;AgentSpec&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;role&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research Analyst&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    
                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;goal&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Find accurate, up-to-date information on any topic&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;toolNames&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;web-search&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;inputs&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Map&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;topic&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, topic&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;TaskResult&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;article&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;activity&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;write&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;TaskRequest&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Write a blog post about {topic} using this research: {research}&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;expectedOutput&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;A well-structured, engaging 500-word blog post&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;agent&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;AgentSpec&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Technical Writer&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Write clear, compelling content&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                
&lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;context&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Map&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;research&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;research&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;output&lt;/span&gt;&lt;span&gt;()))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;inputs&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Map&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;topic&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, topic&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;article&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;output&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Temporal handles sequencing, retry, and timeout. AgentEnsemble handles LLM calls, tool execution, and the ReAct loop. Each concern stays in the system designed for it.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;testing-without-llm-calls&quot;&gt;Testing Without LLM Calls&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Both executors ship with test doubles, &lt;code dir=&quot;auto&quot;&gt;FakeTaskExecutor&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;FakeEnsembleExecutor&lt;/code&gt;, which you can inject so that tests make no LLM calls:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;FakeTaskExecutor&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;fake&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;FakeTaskExecutor&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;whenDescriptionContains&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;AI is advancing rapidly in 2026.&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;whenDescriptionContains&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Write&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Article: AI reshapes every industry.&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;ResearchPipelineActivityImpl&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;activity&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ResearchPipelineActivityImpl&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;fake&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Combined with Temporal’s &lt;code dir=&quot;auto&quot;&gt;TestWorkflowEnvironment&lt;/code&gt;, this lets you run the full workflow in fast deterministic tests:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;BeforeEach&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;void&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;setUp&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;testEnv &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;TestWorkflowEnvironment&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;newInstance&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;FakeTaskExecutor&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;fake&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;FakeTaskExecutor&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;whenDescriptionContains&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research done: AI grows 40% YoY.&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;whenDescriptionContains&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Write&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Article: AI is reshaping every industry.&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;Worker&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;worker&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;testEnv&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;newWorker&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;TASK_QUEUE&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;worker&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;registerWorkflowImplementationTypes&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;ResearchWorkflowImpl&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;&lt;span&gt;class&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;worker&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;registerActivitiesImplementations&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ResearchPipelineActivityImpl&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;fake&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;testEnv&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;start&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Test&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;void&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;run_sequencesResearchThenWrite_returnsArticleOutput&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;ResearchWorkflow&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;workflow&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;testEnv&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;newWorkflowStub&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;ResearchWorkflow&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;class&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;WorkflowOptions&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;newBuilder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;setTaskQueue&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;TASK_QUEUE&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;result&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;workflow&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;run&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Artificial Intelligence&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;assertThat&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;result&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;isEqualTo&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Article: AI is reshaping every industry.&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h2 id=&quot;model-selection-at-request-time&quot;&gt;Model Selection at Request Time&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Models and tools are configured on the worker side and never serialized into workflow history. A &lt;code dir=&quot;auto&quot;&gt;modelName&lt;/code&gt; in a &lt;code dir=&quot;auto&quot;&gt;TaskRequest&lt;/code&gt; selects a specific model at request time:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;ModelProvider&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;models&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;SimpleModelProvider&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;model&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;gpt-4o-mini&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, cheapModel&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;model&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;gpt-4o&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, premiumModel&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;defaultModel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;cheapModel&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;// In the workflow:&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;TaskRequest&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Synthesize the final executive summary&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;modelName&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;gpt-4o&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;// resolved by the worker&apos;s ModelProvider at run time&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;agent&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;AgentSpec&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Executive Synthesizer&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Produce board-level summaries&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h2 id=&quot;not-temporal-specific&quot;&gt;Not Temporal-Specific&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The heartbeat consumer is a plain &lt;code dir=&quot;auto&quot;&gt;Consumer&amp;#x3C;Object&gt;&lt;/code&gt;. The &lt;code dir=&quot;auto&quot;&gt;agentensemble-executor&lt;/code&gt; module has no Temporal SDK dependency. The same executors work with any external orchestrator:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;AWS Step Functions&lt;/strong&gt; — pass a heartbeat callback to a state machine activity poller&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Kafka Streams&lt;/strong&gt; — call &lt;code dir=&quot;auto&quot;&gt;execute()&lt;/code&gt; inside a &lt;code dir=&quot;auto&quot;&gt;Processor&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Spring Batch&lt;/strong&gt; — wrap in a &lt;code dir=&quot;auto&quot;&gt;Tasklet&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Plain threads&lt;/strong&gt; — pass &lt;code dir=&quot;auto&quot;&gt;null&lt;/code&gt; for no heartbeating&lt;/li&gt;
&lt;/ul&gt;
&lt;div&gt;&lt;h2 id=&quot;the-design-tradeoff&quot;&gt;The Design Tradeoff&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Task-per-activity gives you more operational visibility — each task is a separate entry in the Temporal UI, with its own retry history and timeout. Ensemble-per-activity is simpler to write but treats the entire pipeline as a black box from Temporal’s perspective.&lt;/p&gt;
&lt;p&gt;The deeper tradeoff is about where you want the orchestration intelligence to live. If your Temporal workflows are already sophisticated — routing between task types, branching on outcomes, passing context between many steps — then task-per-activity is the natural fit. If AgentEnsemble’s phase grouping, DAG parallelism, or phase review gates are doing the interesting coordination work, then ensemble-per-activity keeps that logic inside the framework and Temporal handles only the outer lifecycle.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The executor module is documented in the &lt;a href=&quot;https://agentensemble.net/guides/executor-integration/&quot;&gt;integration guide&lt;/a&gt;. Source is on &lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I’d be interested in whether the two-mode design maps cleanly to your Temporal workflows, or whether there are integration patterns that don’t fit either executor.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Error Handling in Agent Systems: Exception Hierarchies, Partial Results, and Exit Reasons</title><link>https://agentensemble.net/blog/error-handling-and-resilience/</link><guid isPermaLink="true">https://agentensemble.net/blog/error-handling-and-resilience/</guid><description>A structured exception hierarchy, partial result preservation, and explicit exit reasons give you the operational handles needed to run agent pipelines reliably.

</description><pubDate>Sun, 31 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Agent systems fail in ways that traditional software does not. An LLM might return an unparseable response. A tool call might time out. An agent might enter an infinite ReAct loop. A human reviewer might walk away from an approval gate. A task might succeed but produce output that a downstream task cannot use.&lt;/p&gt;
&lt;p&gt;The interesting problem is not preventing these failures — some are inherent to non-deterministic systems. The interesting problem is giving operators enough information to handle them gracefully: what failed, what succeeded before the failure, and what the system’s terminal state actually is.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-exception-hierarchy&quot;&gt;The Exception Hierarchy&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt; uses a hierarchy of unchecked exceptions rooted at &lt;code dir=&quot;auto&quot;&gt;AgentEnsembleException&lt;/code&gt;. Every exception the framework throws extends this base, so you can catch everything with a single catch block or handle specific cases individually.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;AgentEnsembleException (base)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;ValidationException             -- invalid configuration at build/run time&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;TaskExecutionException          -- a task failed during execution&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;AgentExecutionException         -- an LLM call failed&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;MaxIterationsExceededException  -- agent exceeded its tool-call limit&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;PromptTemplateException         -- unresolved template variables&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;ToolExecutionException          -- a tool call failed&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;ConstraintViolationException    -- required workers were not called&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;GuardrailViolationException     -- a guardrail blocked execution&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The hierarchy matters because different failure types require different responses. A &lt;code dir=&quot;auto&quot;&gt;ValidationException&lt;/code&gt; means your configuration is wrong — no LLM was ever called, and the fix is in the code. A &lt;code dir=&quot;auto&quot;&gt;TaskExecutionException&lt;/code&gt; means the pipeline started but a task failed — partial results may be available. A &lt;code dir=&quot;auto&quot;&gt;MaxIterationsExceededException&lt;/code&gt; means an agent got stuck in a tool-calling loop — the fix might be fewer tools or a higher iteration limit.&lt;/p&gt;
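&lt;p&gt;As a rough sketch of that triage, the snippet below maps each failure type to a different operational response. The exception classes are local stand-ins that only mirror the names in the hierarchy above, and the &lt;code dir=&quot;auto&quot;&gt;Response&lt;/code&gt; enum and &lt;code dir=&quot;auto&quot;&gt;triage&lt;/code&gt; method are hypothetical. Real code would catch the framework’s own types instead.&lt;/p&gt;

```java
public class FailureTriage {

    // Local stand-ins mirroring the framework's exception names (assumption:
    // the real classes ship with AgentEnsemble and carry more state).
    static class AgentEnsembleException extends RuntimeException {
        AgentEnsembleException(String message) { super(message); }
    }
    static class ValidationException extends AgentEnsembleException {
        ValidationException(String message) { super(message); }
    }
    static class TaskExecutionException extends AgentEnsembleException {
        TaskExecutionException(String message) { super(message); }
    }
    static class MaxIterationsExceededException extends AgentEnsembleException {
        MaxIterationsExceededException(String message) { super(message); }
    }

    // Hypothetical operational responses, one per failure category.
    enum Response { NONE, FIX_CONFIGURATION, SALVAGE_PARTIAL_RESULTS, TUNE_AGENT_LIMITS, ESCALATE }

    // Runs the pipeline and maps whatever it throws to a response.
    // Specific subtypes are caught first; the base type is the catch-all.
    static Response triage(Runnable pipeline) {
        try {
            pipeline.run();
            return Response.NONE;
        } catch (ValidationException e) {
            return Response.FIX_CONFIGURATION;       // no LLM was called; fix the code
        } catch (TaskExecutionException e) {
            return Response.SALVAGE_PARTIAL_RESULTS; // earlier tasks may have output
        } catch (MaxIterationsExceededException e) {
            return Response.TUNE_AGENT_LIMITS;       // fewer tools or a higher limit
        } catch (AgentEnsembleException e) {
            return Response.ESCALATE;                // any other framework failure
        }
    }

    public static void main(String[] args) {
        System.out.println(triage(() -> { throw new ValidationException("missing agent role"); }));
        System.out.println(triage(() -> { throw new TaskExecutionException("task 4 failed"); }));
        System.out.println(triage(() -> { }));
    }
}
```

&lt;p&gt;Catching the base type last keeps the handler exhaustive for framework failures while still letting unrelated runtime exceptions propagate.&lt;/p&gt;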
&lt;div&gt;&lt;h2 id=&quot;partial-results-on-failure&quot;&gt;Partial Results on Failure&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;When a multi-task pipeline fails partway through, the work completed before the failure is not discarded. &lt;code dir=&quot;auto&quot;&gt;TaskExecutionException&lt;/code&gt; carries a list of &lt;code dir=&quot;auto&quot;&gt;TaskOutput&lt;/code&gt; objects for tasks that completed before the failure:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;try&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;EnsembleOutput&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;output&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;run&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;inputs&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;saveResults&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;output&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;} &lt;/span&gt;&lt;span&gt;catch&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;TaskExecutionException&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;e&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;// Save whatever was completed before the failure&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;for&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;TaskOutput&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;partial&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;:&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;e&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getCompletedTaskOutputs&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        
&lt;/span&gt;&lt;span&gt;savePartialResult&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;partial&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;alertOnFailure&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&lt;span&gt;e&lt;/span&gt;&lt;span&gt;.getTaskDescription&lt;/span&gt;&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;e&lt;/span&gt;&lt;span&gt;.getAgentRole&lt;/span&gt;&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;This is operationally significant. In a five-task pipeline where task four fails, you still have the outputs of tasks one through three. You can save them, display them to a user, or use them to resume the pipeline from where it left off.&lt;/p&gt;
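&lt;p&gt;One way to act on those partial results is to work out, on the caller’s side, which tasks still need to run. The sketch below assumes a sequential pipeline whose completed outputs arrive in execution order. &lt;code dir=&quot;auto&quot;&gt;CompletedTask&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;remainingTasks&lt;/code&gt; are hypothetical names used for illustration, not part of the framework.&lt;/p&gt;

```java
import java.util.Arrays;

public class ResumePlanner {

    // Hypothetical stand-in for the framework's TaskOutput (assumption: the
    // real type exposes more than a description and raw text).
    record CompletedTask(String description, String raw) {}

    // Returns the task descriptions that still need to run, assuming tasks
    // execute sequentially and completed outputs are in execution order.
    static String[] remainingTasks(String[] pipeline, CompletedTask[] completed) {
        int done = Math.min(completed.length, pipeline.length);
        // Check that the completed outputs really are a prefix of the pipeline.
        for (int i = 0; i != done; i++) {
            if (!pipeline[i].equals(completed[i].description())) {
                throw new IllegalStateException(
                    "Completed output " + i + " does not match the pipeline order");
            }
        }
        return Arrays.copyOfRange(pipeline, done, pipeline.length);
    }

    public static void main(String[] args) {
        String[] pipeline = {"research", "outline", "draft", "review", "publish"};
        // Task four ("review") failed, so only the first three outputs exist.
        CompletedTask[] completed = {
            new CompletedTask("research", "notes"),
            new CompletedTask("outline", "headings"),
            new CompletedTask("draft", "text")
        };
        System.out.println(Arrays.toString(remainingTasks(pipeline, completed)));
        // prints [review, publish]
    }
}
```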
&lt;div&gt;&lt;h2 id=&quot;exit-reasons&quot;&gt;Exit Reasons&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Not every non-completion is an error. &lt;code dir=&quot;auto&quot;&gt;EnsembleOutput.getExitReason()&lt;/code&gt; distinguishes between four terminal states:&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Exit Reason&lt;/th&gt;&lt;th&gt;Meaning&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;COMPLETED&lt;/code&gt;&lt;/td&gt;&lt;td&gt;All tasks ran to completion normally&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;USER_EXIT_EARLY&lt;/code&gt;&lt;/td&gt;&lt;td&gt;A human reviewer chose to stop the pipeline&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;TIMEOUT&lt;/code&gt;&lt;/td&gt;&lt;td&gt;A review gate timeout expired&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;ERROR&lt;/code&gt;&lt;/td&gt;&lt;td&gt;An unrecoverable exception terminated the pipeline&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;EnsembleOutput&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;output&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;run&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;switch&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;output&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getExitReason&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;case&lt;/span&gt;&lt;span&gt; COMPLETED&lt;/span&gt;&lt;span&gt;:&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;System&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;out&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;println&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;All done: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;+&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;output&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getRaw&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;break&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;case&lt;/span&gt;&lt;span&gt; USER_EXIT_EARLY&lt;/span&gt;&lt;span&gt;:&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;System&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;out&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;println&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;User stopped after &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;+&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;output&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;completedTasks&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;size&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt; 
&lt;/span&gt;&lt;span&gt;+&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; task(s)&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;break&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;case&lt;/span&gt;&lt;span&gt; TIMEOUT&lt;/span&gt;&lt;span&gt;:&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;System&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;out&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;println&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Review gate timed out&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;break&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;case&lt;/span&gt;&lt;span&gt; ERROR&lt;/span&gt;&lt;span&gt;:&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;// Typically handled via exception&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;break&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The distinction between &lt;code dir=&quot;auto&quot;&gt;USER_EXIT_EARLY&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;TIMEOUT&lt;/code&gt; matters for operational dashboards. A user exit is intentional: the pipeline did its job and the human made a decision. A timeout may indicate a process problem (no reviewer was available) and may need escalation.&lt;/p&gt;
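&lt;p&gt;One way to encode that distinction is a small mapping from exit reason to dashboard severity. The sketch below is illustrative only: &lt;code dir=&quot;auto&quot;&gt;ExitReason&lt;/code&gt; is a local stand-in that mirrors the four values in the table above, and &lt;code dir=&quot;auto&quot;&gt;Severity&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;ExitReasonRouting&lt;/code&gt; are hypothetical names, not part of AgentEnsemble.&lt;/p&gt;

```java
// Local stand-in for the framework's exit-reason enum; values mirror the table above.
enum ExitReason { COMPLETED, USER_EXIT_EARLY, TIMEOUT, ERROR }

// Hypothetical dashboard severities, invented for this example.
enum Severity { OK, INFO, WARN, PAGE }

final class ExitReasonRouting {
    // Maps each terminal state to how loudly a dashboard should report it.
    static Severity severityFor(ExitReason reason) {
        return switch (reason) {
            case COMPLETED -> Severity.OK;         // normal completion
            case USER_EXIT_EARLY -> Severity.INFO; // intentional human decision
            case TIMEOUT -> Severity.WARN;         // possible process problem
            case ERROR -> Severity.PAGE;           // unrecoverable failure
        };
    }
}
```

&lt;p&gt;The switch expression has no default branch, so it stops compiling if a value is added to the enum without a matching case.&lt;/p&gt;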
&lt;div&gt;&lt;h2 id=&quot;specific-exception-types&quot;&gt;Specific Exception Types&lt;/h2&gt;&lt;/div&gt;
&lt;div&gt;&lt;h3 id=&quot;validationexception&quot;&gt;ValidationException&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;Thrown before any LLM calls when the ensemble or its components are configured incorrectly. Common causes include missing required fields, tasks referencing unregistered agents, circular context dependencies, or invalid iteration limits.&lt;/p&gt;
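&lt;p&gt;This exception is your fail-fast safety net: it surfaces misconfiguration before the run starts making LLM calls rather than partway through a pipeline. If you see it, the fix is always in the configuration code.&lt;/p&gt;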
&lt;div&gt;&lt;h3 id=&quot;agentexecutionexception&quot;&gt;AgentExecutionException&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;Thrown when the LLM call itself fails — network errors, API errors, rate limiting, timeouts. Contains the agent role and task description so you can route the failure to the right team.&lt;/p&gt;
&lt;div&gt;&lt;h3 id=&quot;maxiterationsexceededexception&quot;&gt;MaxIterationsExceededException&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;Thrown when an agent exceeds its &lt;code dir=&quot;auto&quot;&gt;maxIterations&lt;/code&gt; limit during the ReAct loop. Contains both the configured limit and the actual iteration count.&lt;/p&gt;
&lt;p&gt;This is often a sign that the agent has too many tools and is cycling between them without making progress. The fix is usually to reduce the tool set, make tool descriptions more specific, or increase the iteration limit if the task genuinely requires many tool calls.&lt;/p&gt;
&lt;div&gt;&lt;h3 id=&quot;prompttemplateexception&quot;&gt;PromptTemplateException&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;Thrown when a task description contains &lt;code dir=&quot;auto&quot;&gt;{variable}&lt;/code&gt; placeholders that were not resolved. The exception lists the missing variable names, making it straightforward to fix.&lt;/p&gt;
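&lt;p&gt;The same kind of check can be run in your own tests before a template ever reaches the framework. This is a minimal sketch, assuming placeholders are runs of word characters inside braces. &lt;code dir=&quot;auto&quot;&gt;PlaceholderCheck&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;missingVariables&lt;/code&gt; are hypothetical names, and the framework's own resolution logic may differ.&lt;/p&gt;

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class PlaceholderCheck {
    // Assumption: a placeholder is one or more word characters wrapped in braces.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{(\\w+)\\}");

    // Returns the placeholder names in the template that are not in `provided`,
    // in order of first appearance and without duplicates.
    static String[] missingVariables(String template, String... provided) {
        var known = Arrays.asList(provided);
        return PLACEHOLDER.matcher(template)
                .results()
                .map(match -> match.group(1))
                .filter(name -> !known.contains(name))
                .distinct()
                .toArray(String[]::new);
    }
}
```

&lt;p&gt;For a template such as &lt;code dir=&quot;auto&quot;&gt;Research {topic} for {audience}&lt;/code&gt; with only &lt;code dir=&quot;auto&quot;&gt;topic&lt;/code&gt; provided, the method returns a single-element array containing &lt;code dir=&quot;auto&quot;&gt;audience&lt;/code&gt;.&lt;/p&gt;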
&lt;div&gt;&lt;h3 id=&quot;guardrailviolationexception&quot;&gt;GuardrailViolationException&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;Thrown when an input or output guardrail blocks execution. Contains the guardrail type (INPUT or OUTPUT), the violation message, the task description, and the agent role. This integrates with the &lt;a href=&quot;https://agentensemble.net/blog/guardrails-for-agent-output/&quot;&gt;guardrail system&lt;/a&gt; covered in the previous post.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-retry-question&quot;&gt;The Retry Question&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;AgentEnsemble does not include built-in retry logic. This is a deliberate design choice.&lt;/p&gt;
&lt;p&gt;The reasoning is that retry policies are highly context-dependent. A rate-limited API call might benefit from exponential backoff. A malformed LLM response might benefit from a retry with the same prompt. A task that failed because the model cannot perform the requested work should not be retried at all.&lt;/p&gt;
&lt;p&gt;For transient failures, implement retry at the call site:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;int&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;attempts&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;0&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;EnsembleOutput&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;output&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;null&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;while&lt;/span&gt;&lt;span&gt; (attempts &lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;3&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;try&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;output &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;run&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;inputs&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;break&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;} &lt;/span&gt;&lt;span&gt;catch&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;AgentExecutionException&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;e&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;attempts&lt;/span&gt;&lt;span&gt;++&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;if&lt;/span&gt;&lt;span&gt; (attempts &lt;/span&gt;&lt;span&gt;==&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;3&lt;/span&gt;&lt;span&gt;) &lt;/span&gt;&lt;span&gt;throw&lt;/span&gt;&lt;span&gt; e;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;Thread&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;sleep&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;1000L&lt;/span&gt;&lt;span&gt; 
&lt;/span&gt;&lt;span&gt;*&lt;/span&gt;&lt;span&gt; attempts&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
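&lt;p&gt;The loop above waits linearly (one second, then two). If exponential backoff suits your provider better, the delay calculation can be factored out. The helper below is a sketch with hypothetical names (&lt;code dir=&quot;auto&quot;&gt;Backoff&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;delayMillis&lt;/code&gt;); it is not something AgentEnsemble provides.&lt;/p&gt;

```java
final class Backoff {
    // Delay before retry number `attempt` (1-based): base, 2*base, 4*base, ...
    // The result never exceeds capMillis.
    static long delayMillis(int attempt, long baseMillis, long capMillis) {
        long delay = Math.min(baseMillis, capMillis);
        for (int i = 1; attempt > i; i++) {
            delay = Math.min(capMillis, delay * 2);
        }
        return delay;
    }
}
```

&lt;p&gt;Replacing &lt;code dir=&quot;auto&quot;&gt;1000L * attempts&lt;/code&gt; with &lt;code dir=&quot;auto&quot;&gt;Backoff.delayMillis(attempts, 1000L, 10000L)&lt;/code&gt; gives waits of one and two seconds for the two retries in the loop above, with a ten-second ceiling if you raise the attempt limit. Note that &lt;code dir=&quot;auto&quot;&gt;Thread.sleep&lt;/code&gt; throws the checked &lt;code dir=&quot;auto&quot;&gt;InterruptedException&lt;/code&gt;, so the enclosing method has to declare or handle it.&lt;/p&gt;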
&lt;p&gt;For production use, consider integrating a resilience library such as Resilience4j, which provides circuit breakers, rate limiters, and retry policies that compose well with the exception hierarchy.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-operational-model&quot;&gt;The Operational Model&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The error handling design reflects a particular view of how agent systems should be operated: failures are expected, partial results are valuable, and the framework should give you structured information rather than opaque error strings.&lt;/p&gt;
&lt;p&gt;The exception hierarchy makes it possible to build monitoring and alerting that distinguishes between configuration errors (fix the code), transient failures (retry or escalate), agent loops (tune the workflow), and intentional stops (human decision). The partial result preservation makes it possible to build resumable pipelines. The exit reasons make it possible to build dashboards that accurately represent pipeline outcomes.&lt;/p&gt;
&lt;p&gt;None of this prevents failures. It gives you the handles to respond to them systematically.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The full error handling guide is in the &lt;a href=&quot;https://agentensemble.net/guides/error-handling/&quot;&gt;documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I’d be interested in whether you have found the exception hierarchy granularity to be sufficient, or whether there are failure modes in your agent systems that do not map cleanly to these categories.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Scoped Memory for Agent Systems: Cross-Run Persistence Without Global State</title><link>https://agentensemble.net/blog/memory-across-runs/</link><guid isPermaLink="true">https://agentensemble.net/blog/memory-across-runs/</guid><description>Task-scoped memory lets agents accumulate knowledge across runs through named scopes, pluggable stores, and explicit isolation boundaries.

</description><pubDate>Fri, 29 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Most agent frameworks treat each run as stateless. The agent starts fresh, does its work, and the output is consumed by whatever called it. If you run the same workflow again next week, the agent has no memory of what it produced last time.&lt;/p&gt;
&lt;p&gt;For some use cases that is fine. For others — recurring research tasks, iterative drafting, accumulated domain knowledge — you want the agent to remember what it learned in previous runs and build on it.&lt;/p&gt;
&lt;p&gt;The question is how to add cross-run memory without introducing global shared state that makes the system hard to reason about.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;named-scopes-as-the-isolation-mechanism&quot;&gt;Named Scopes as the Isolation Mechanism&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt; uses named memory scopes. Each task declares which scopes it reads from and writes to. A task can only see memory from scopes it explicitly declares.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;MemoryStore&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;store&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;MemoryStore&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;inMemory&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;researchTask&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research current AI trends&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;expectedOutput&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;A research report&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;agent&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;memory&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;ai-research&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;agent&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;researchTask&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;memoryStore&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;store&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;run&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;After the run, the task’s output is stored in the &lt;code dir=&quot;auto&quot;&gt;&quot;ai-research&quot;&lt;/code&gt; scope. On a second run with the same store, the agent’s prompt automatically includes entries from the first run under a &lt;code dir=&quot;auto&quot;&gt;## Memory: ai-research&lt;/code&gt; section.&lt;/p&gt;
&lt;p&gt;The scope name is the isolation boundary. Task A storing into &lt;code dir=&quot;auto&quot;&gt;&quot;research&quot;&lt;/code&gt; and task B declaring only &lt;code dir=&quot;auto&quot;&gt;&quot;drafts&quot;&lt;/code&gt; means task B never sees task A’s output. This is not a security mechanism — it is an attention mechanism. It controls what context an agent receives, keeping prompts focused on relevant history rather than everything that ever happened.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;how-it-works-at-the-prompt-level&quot;&gt;How It Works at the Prompt Level&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The mechanics are straightforward:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;At task startup, the framework retrieves entries from every declared scope and injects them into the agent’s prompt.&lt;/li&gt;
&lt;li&gt;At task completion, the framework stores the task output into every declared scope.&lt;/li&gt;
&lt;li&gt;Because entries persist in the &lt;code dir=&quot;auto&quot;&gt;MemoryStore&lt;/code&gt; across runs, agents in later runs automatically see outputs from earlier runs.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The prompt injection looks like this:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;## Memory: ai-project&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;The following information from scope &quot;ai-project&quot; may be relevant:&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;---&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Research findings from previous run: AI is accelerating in healthcare...&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;---&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;## Task&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Analyse the research findings&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;There is no magic retrieval. The framework puts the memory content into the prompt, and the LLM uses it (or ignores it) during reasoning.&lt;/p&gt;
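&lt;p&gt;To make that concrete, the section shown above can be produced with plain string assembly. This is a sketch, not the framework's renderer: &lt;code dir=&quot;auto&quot;&gt;MemoryPrompt&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;render&lt;/code&gt; are hypothetical names, and the layout for more than one entry (each entry followed by its own separator line) is an assumption.&lt;/p&gt;

```java
final class MemoryPrompt {
    // Builds the memory section for one scope, followed by the task section.
    // Assumption: every entry is followed by its own "---" separator line.
    static String render(String scope, String[] entries, String taskDescription) {
        StringBuilder prompt = new StringBuilder();
        prompt.append("## Memory: ").append(scope).append('\n');
        prompt.append("The following information from scope \"").append(scope)
              .append("\" may be relevant:\n\n");
        prompt.append("---\n");
        for (String entry : entries) {
            prompt.append(entry).append("\n---\n");
        }
        prompt.append("\n## Task\n").append(taskDescription);
        return prompt.toString();
    }
}
```

&lt;p&gt;Every entry passed in becomes prompt text, which is why the number of stored entries translates directly into prompt size.&lt;/p&gt;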
&lt;div&gt;&lt;h2 id=&quot;pluggable-storage&quot;&gt;Pluggable Storage&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;MemoryStore&lt;/code&gt; has two built-in implementations:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;In-memory&lt;/strong&gt; stores entries in insertion order per scope. Retrieval returns the most recent entries without semantic search. Suitable for development, testing, and single-JVM runs. Entries do not survive JVM restarts.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;MemoryStore&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;store&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;MemoryStore&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;inMemory&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Embedding-based&lt;/strong&gt; stores entries via an embedding model and retrieves them via semantic similarity search. The backing &lt;code dir=&quot;auto&quot;&gt;EmbeddingStore&lt;/code&gt; controls durability — Chroma, Qdrant, Pinecone, pgvector, or any LangChain4j-compatible store.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;EmbeddingModel&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;embeddingModel&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;OpenAiEmbeddingModel&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;apiKey&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;System&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getenv&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;modelName&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;text-embedding-3-small&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;EmbeddingStore&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;TextSegment&lt;/span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;embeddingStore&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ChromaEmbeddingStore&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;baseUrl&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;http://localhost:8000&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;collectionName&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;agentensemble-memory&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;MemoryStore&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;store&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;MemoryStore&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;embeddings&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;embeddingModel, embeddingStore&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The design tradeoff is explicit. In-memory is fast and simple but loses data on restart and does not do semantic retrieval. Embedding-based is semantically aware and as durable as the backing vector store, but it requires an embedding model and a vector store to operate. You choose based on your operational requirements.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;eviction-policies&quot;&gt;Eviction Policies&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Unbounded memory is a prompt-size problem. Every stored entry adds tokens to the next run’s prompt. Scopes support optional eviction to keep sizes bounded:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;// Retain only the 5 most recent entries&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;MemoryScope&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;research&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;keepLastEntries&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;5&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;// Retain only entries from the past 7 days&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;MemoryScope&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;research&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;keepEntriesWithin&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Duration&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ofDays&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;7&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Eviction is applied after each task stores its output. For embedding-based stores, eviction is a no-op since most embedding stores do not support deletion of individual entries.&lt;/p&gt;
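&lt;p&gt;The two policies can be pictured as simple filters over a scope's entries. The sketch below illustrates the semantics only, assuming entries are held in insertion order with a stored-at timestamp. &lt;code dir=&quot;auto&quot;&gt;MemoryEntry&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;EvictionSketch&lt;/code&gt; are hypothetical types, not the framework's classes.&lt;/p&gt;

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;

// Hypothetical entry type for illustration; the framework's own entry class may differ.
record MemoryEntry(String content, Instant storedAt) {}

final class EvictionSketch {
    // Count-based: keep only the newest maxEntries items.
    // Assumes entries are in insertion order and maxEntries is not negative.
    static MemoryEntry[] keepLast(MemoryEntry[] entries, int maxEntries) {
        int from = Math.max(0, entries.length - maxEntries);
        return Arrays.copyOfRange(entries, from, entries.length);
    }

    // Time-based: keep entries stored no earlier than (now - window).
    static MemoryEntry[] keepWithin(MemoryEntry[] entries, Duration window, Instant now) {
        Instant cutoff = now.minus(window);
        return Arrays.stream(entries)
                .filter(entry -> !entry.storedAt().isBefore(cutoff))
                .toArray(MemoryEntry[]::new);
    }
}
```

&lt;p&gt;Passing the current time in as a parameter, rather than reading the clock inside the method, keeps the time-based policy deterministic in tests.&lt;/p&gt;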
&lt;div&gt;&lt;h2 id=&quot;memorytool-agent-driven-memory-access&quot;&gt;MemoryTool: Agent-Driven Memory Access&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;In addition to the automatic scope-based mechanism, agents can interact with memory directly during their ReAct loop using &lt;code dir=&quot;auto&quot;&gt;MemoryTool&lt;/code&gt;:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;role&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Researcher&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;goal&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research and remember important facts&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;MemoryTool&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;research&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, store&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;MemoryTool&lt;/code&gt; provides two tool methods the LLM can call: &lt;code dir=&quot;auto&quot;&gt;storeMemory(key, value)&lt;/code&gt; to store an arbitrary fact, and &lt;code dir=&quot;auto&quot;&gt;retrieveMemory(query)&lt;/code&gt; to retrieve relevant memories by query.&lt;/p&gt;
&lt;p&gt;When the same &lt;code dir=&quot;auto&quot;&gt;MemoryStore&lt;/code&gt; instance is used for both &lt;code dir=&quot;auto&quot;&gt;MemoryTool&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;Ensemble.builder().memoryStore(...)&lt;/code&gt;, explicit tool access and automatic scope-based access share the same backing store. This means an agent can both receive automatic context from previous runs and actively query or store additional facts during execution.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;multiple-tasks-sharing-a-scope&quot;&gt;Multiple Tasks Sharing a Scope&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Multiple tasks can declare the same scope name. Each task writes its output to the scope after it completes, so later tasks in a sequential workflow see earlier tasks’ outputs:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;research&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research AI trends&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;memory&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;ai-project&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;analysis&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Analyse the research findings&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;memory&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;ai-project&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;research&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;analysis&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;memoryStore&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;store&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;run&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Within a single run, the analysis task sees the research task’s output because both tasks declare the &lt;code dir=&quot;auto&quot;&gt;&quot;ai-project&quot;&lt;/code&gt; scope. On the next run, both tasks also see the outputs stored by the previous run’s research and analysis.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-design-principle&quot;&gt;The Design Principle&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The key design decision is that memory is opt-in and scoped, not global and automatic. An agent does not remember everything by default. Each task explicitly declares what it wants to remember and what it wants to recall.&lt;/p&gt;
&lt;p&gt;This makes the system easier to reason about. You can look at a task definition and know exactly what memory context it will receive. You can test a task with a pre-populated store and verify that it uses the memory correctly. You can clear a scope without affecting other scopes.&lt;/p&gt;
&lt;p&gt;The tradeoff is that you have to think about memory design upfront. Which tasks share scopes? How many entries should be retained? Should you use semantic search or recency-based retrieval? These are design decisions that the framework surfaces explicitly rather than hiding behind defaults.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The full memory guide is in the &lt;a href=&quot;https://agentensemble.net/guides/memory/&quot;&gt;documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I’d be interested in how you handle the prompt-size tension — whether bounded eviction is sufficient, or whether you have needed more sophisticated retrieval strategies for production memory systems.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Tool Pipelines: Eliminating LLM Round-Trips for Deterministic Tool Chains</title><link>https://agentensemble.net/blog/tool-pipelines/</link><guid isPermaLink="true">https://agentensemble.net/blog/tool-pipelines/</guid><description>ToolPipeline chains multiple tools into a single compound tool that executes without LLM mediation between steps.

</description><pubDate>Wed, 27 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;In a standard ReAct loop, every tool call requires an LLM round-trip. The agent calls a search tool, receives results, reasons about them, calls a filter tool, receives filtered output, reasons again, calls a format tool, and so on. Each step costs tokens, adds latency, and requires the LLM to make a decision that is often trivial — the next step in the chain is predetermined.&lt;/p&gt;
&lt;p&gt;For deterministic data transformation chains, the LLM adds no reasoning value between steps. It just passes the output of one tool as input to the next. The interesting question is whether you can collapse that chain into a single tool call.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-toolpipeline-abstraction&quot;&gt;The ToolPipeline Abstraction&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt; provides &lt;code dir=&quot;auto&quot;&gt;ToolPipeline&lt;/code&gt;, which chains multiple tools into a single compound tool. The LLM calls it once; all steps execute sequentially without LLM round-trips between them.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;// Standard ReAct loop (3 LLM round-trips for tool mediation):&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;LLM -&gt; search_tool -&gt; LLM -&gt; filter_tool -&gt; LLM -&gt; format_tool -&gt; LLM&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;// With ToolPipeline (0 extra round-trips):&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;LLM -&gt; search_then_filter_then_format -&gt; LLM&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The simplest way to create one:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;ToolPipeline&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;pipeline&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ToolPipeline&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;WebSearchTool&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;provider&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;JsonParserTool&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;FileWriteTool&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;outputPath&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;// name: &quot;web_search_then_json_parser_then_file_write&quot;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Register it on a task like any other tool:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;var&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research AI trends and save the top result to disk&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;expectedOutput&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Confirmation that the result was saved&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;pipeline&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h2 id=&quot;data-flow-and-adapters&quot;&gt;Data Flow and Adapters&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;By default, &lt;code dir=&quot;auto&quot;&gt;ToolResult.getOutput()&lt;/code&gt; from step N is passed as the input to step N+1. This works when tool outputs are directly consumable by the next tool.&lt;/p&gt;
&lt;p&gt;When you need to reshape data between steps, attach an adapter:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;ToolPipeline&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;pipeline&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ToolPipeline&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;extract_and_calculate&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Extract a numeric field from JSON and apply a formula&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;step&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;JsonParserTool&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;adapter&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;result &lt;/span&gt;&lt;span&gt;-&gt;&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;result&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getOutput&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;+&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; * 1.1&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;step&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;CalculatorTool&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The adapter transforms the &lt;code dir=&quot;auto&quot;&gt;JsonParserTool&lt;/code&gt; output (e.g., &lt;code dir=&quot;auto&quot;&gt;&quot;149.99&quot;&lt;/code&gt;) into a calculator expression (&lt;code dir=&quot;auto&quot;&gt;&quot;149.99 * 1.1&quot;&lt;/code&gt;) before passing it to &lt;code dir=&quot;auto&quot;&gt;CalculatorTool&lt;/code&gt;. Adapters have full access to &lt;code dir=&quot;auto&quot;&gt;ToolResult&lt;/code&gt;, including &lt;code dir=&quot;auto&quot;&gt;getStructuredOutput()&lt;/code&gt; for typed payloads.&lt;/p&gt;
&lt;p&gt;This is the key design decision: adapters are plain Java functions, not LLM calls. They handle the deterministic reshaping that the LLM would otherwise do at full inference cost.&lt;/p&gt;
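&lt;p&gt;To make the data flow concrete, here is a minimal, framework-free sketch of the same idea. Everything in it is a stand-in invented for illustration: &lt;code dir=&quot;auto&quot;&gt;PipelineSketch&lt;/code&gt;, the &lt;code dir=&quot;auto&quot;&gt;Step&lt;/code&gt; interface, and the three example steps are hypothetical and are not AgentEnsemble types. It only shows that an adapter is one more plain function in the chain.&lt;/p&gt;

```java
class PipelineSketch {
    /** Stand-in for a tool or adapter: text in, text out. Hypothetical, not the AgentEnsemble interface. */
    interface Step {
        String apply(String input);
    }

    /** Crude numeric extraction standing in for a JSON parser step. */
    static final Step EXTRACT_PRICE = json -> json.replaceAll("[^0-9.]", "");

    /** The adapter: reshapes "149.99" into a calculator expression with no LLM involved. */
    static final Step TO_EXPRESSION = price -> price + " * 1.1";

    /** Toy calculator that only understands "a * b". */
    static final Step CALCULATE = expr -> {
        String[] parts = expr.split(" \\* ");
        return String.valueOf(Double.parseDouble(parts[0]) * Double.parseDouble(parts[1]));
    };

    /** Runs the steps in order, feeding each output into the next step. */
    static String run(String input, Step... steps) {
        String current = input;
        for (Step step : steps) {
            current = step.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(run("{\"price\": 149.99}", EXTRACT_PRICE, TO_EXPRESSION, CALCULATE));
    }
}
```

&lt;p&gt;The loop in &lt;code dir=&quot;auto&quot;&gt;run&lt;/code&gt; is the whole mechanism in this sketch: the model is never consulted between steps, so the reshaping costs a function call rather than an inference.&lt;/p&gt;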
&lt;div&gt;&lt;h2 id=&quot;error-strategies&quot;&gt;Error Strategies&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Pipelines support two error strategies:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;FAIL_FAST&lt;/strong&gt; (default) stops the pipeline on the first failed step and returns that failure to the LLM immediately. Subsequent steps are never executed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;CONTINUE_ON_FAILURE&lt;/strong&gt; continues executing subsequent steps even when an intermediate step fails. The failed step’s error message is forwarded as input to the next step.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;ToolPipeline&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;pipeline&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ToolPipeline&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;resilient_pipeline&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Continues even when a step fails&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;step&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;stepA&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;step&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;stepB&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;step&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;stepC&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;errorStrategy&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;PipelineErrorStrategy&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;&lt;span&gt;CONTINUE_ON_FAILURE&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The choice between them depends on whether downstream steps can recover from upstream failures. For a search-then-save pipeline, FAIL_FAST makes sense — there is nothing to save if the search failed. For a multi-source aggregation, CONTINUE_ON_FAILURE lets the pipeline produce partial results.&lt;/p&gt;
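&lt;p&gt;The difference between the two strategies is easiest to see in a small simulation. The sketch below is not the AgentEnsemble implementation: &lt;code dir=&quot;auto&quot;&gt;ErrorStrategySketch&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;Result&lt;/code&gt;, and &lt;code dir=&quot;auto&quot;&gt;Step&lt;/code&gt; are hypothetical stand-ins, and returning the last step’s result under CONTINUE_ON_FAILURE is an assumption made for the example.&lt;/p&gt;

```java
class ErrorStrategySketch {
    enum Strategy { FAIL_FAST, CONTINUE_ON_FAILURE }

    /** Stand-in for a tool result: a success flag plus the output or error text. Hypothetical type. */
    record Result(boolean success, String output) {}

    /** Stand-in for a pipeline step. Hypothetical, not the AgentEnsemble interface. */
    interface Step {
        Result apply(String input);
    }

    /** A step that always fails, simulating an unreachable data source. */
    static final Step FAILING_FETCH = in -> new Result(false, "fetch failed: timeout");

    /** A step that wraps whatever it receives, including an upstream error message. */
    static final Step SUMMARIZE = in -> new Result(true, "summary of [" + in + "]");

    static Result run(Strategy strategy, String input, Step... steps) {
        Result last = new Result(true, input);
        for (Step step : steps) {
            // The previous output (or error message) becomes the next step's input.
            last = step.apply(last.output());
            if (strategy == Strategy.FAIL_FAST) {
                if (!last.success()) {
                    return last; // stop here; later steps never execute
                }
            }
        }
        return last; // assumption: the pipeline reports the final step's result
    }

    public static void main(String[] args) {
        System.out.println(run(Strategy.FAIL_FAST, "query", FAILING_FETCH, SUMMARIZE));
        System.out.println(run(Strategy.CONTINUE_ON_FAILURE, "query", FAILING_FETCH, SUMMARIZE));
    }
}
```

&lt;p&gt;Under FAIL_FAST the caller gets the fetch failure back untouched. Under CONTINUE_ON_FAILURE the summarize step runs with the error text as its input, so downstream steps need to tolerate that kind of input.&lt;/p&gt;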
&lt;div&gt;&lt;h2 id=&quot;approval-gates-within-pipelines&quot;&gt;Approval Gates Within Pipelines&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Steps inside a pipeline that require human approval will pause mid-pipeline, exactly as if they were standalone tools. The pipeline propagates the ensemble’s &lt;code dir=&quot;auto&quot;&gt;ReviewHandler&lt;/code&gt; to all nested steps automatically.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;ToolPipeline&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;pipeline&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ToolPipeline&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;JsonParserTool&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;FileWriteTool&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;outputPath&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;requireApproval&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;true&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;This means you can build pipelines that include a human checkpoint before a destructive operation (like writing to disk or calling an external API) without losing the token savings for the deterministic steps before the checkpoint.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;nesting-and-composition&quot;&gt;Nesting and Composition&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;A &lt;code dir=&quot;auto&quot;&gt;ToolPipeline&lt;/code&gt; implements &lt;code dir=&quot;auto&quot;&gt;AgentTool&lt;/code&gt;, so it can be used as a step inside another pipeline:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;ToolPipeline&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;inner&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ToolPipeline&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;step_a&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;desc&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, toolA, toolB&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;ToolPipeline&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;outer&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ToolPipeline&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;outer&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;desc&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, inner, toolC&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;This lets you build reusable pipeline fragments and compose them into larger chains. Each pipeline records its own aggregate metrics (timing, success/failure counts) in addition to the per-step metrics from individual tools.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;when-to-use-pipelines-vs-separate-tools&quot;&gt;When to Use Pipelines vs. Separate Tools&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The decision boundary is whether the LLM needs to reason between steps.&lt;/p&gt;
&lt;p&gt;Use &lt;code dir=&quot;auto&quot;&gt;ToolPipeline&lt;/code&gt; when steps are deterministic and order-locked — the LLM should not skip or reorder them, and the data transformations between steps are mechanical. The full chain appears as one operation to the LLM.&lt;/p&gt;
&lt;p&gt;Use separate tools when the LLM needs to decide which tool to call next based on intermediate results, or when intermediate results are useful for the LLM to see and reason about.&lt;/p&gt;
&lt;p&gt;In practice, this means pipelines work well for data retrieval and transformation chains (search, parse, filter, write), while separate tools work better for exploratory workflows where the agent needs to adapt its approach based on what it finds.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-broader-pattern&quot;&gt;The Broader Pattern&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;ToolPipeline is one instance of a broader design principle in AgentEnsemble: when something is deterministic, do not pay LLM inference costs for it. This same principle appears in &lt;a href=&quot;https://agentensemble.net/blog/deterministic-only-orchestration/&quot;&gt;deterministic-only orchestration&lt;/a&gt; (tasks that never call an LLM), &lt;a href=&quot;https://agentensemble.net/blog/typed-tool-inputs/&quot;&gt;typed tool inputs&lt;/a&gt; (schema validation without LLM intervention), and &lt;a href=&quot;https://agentensemble.net/blog/phase-level-workflow-grouping/&quot;&gt;phase-level workflow grouping&lt;/a&gt; (execution order declared in code, not negotiated by the LLM).&lt;/p&gt;
&lt;p&gt;The common thread is that the framework should handle mechanical work mechanically, and reserve LLM inference for decisions that actually require reasoning.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The full tool pipeline guide is in the &lt;a href=&quot;https://agentensemble.net/guides/tool-pipeline/&quot;&gt;documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Curious whether you have seen tool chains where the boundary between “deterministic” and “needs reasoning” is ambiguous, and how you would draw that line.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Guardrails for Agent Output: Pluggable Validation Before and After LLM Calls</title><link>https://agentensemble.net/blog/guardrails-for-agent-output/</link><guid isPermaLink="true">https://agentensemble.net/blog/guardrails-for-agent-output/</guid><description>Input and output guardrails give you programmatic control over what enters and exits each agent task, without modifying prompts.

</description><pubDate>Mon, 25 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;One of the harder problems in agent systems is constraining output quality without turning every prompt into a wall of instructions. You can ask the LLM to stay under 3000 characters, or to always include a conclusion section, or to never mention competitor products. But prompt-based constraints are probabilistic. The LLM might follow them. It might not.&lt;/p&gt;
&lt;p&gt;Guardrails are the deterministic layer. They run as Java code before and after the LLM call, and they enforce rules that prompts cannot guarantee.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-model&quot;&gt;The Model&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt; implements guardrails as two functional interfaces: &lt;code dir=&quot;auto&quot;&gt;InputGuardrail&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;OutputGuardrail&lt;/code&gt;. Both return a &lt;code dir=&quot;auto&quot;&gt;GuardrailResult&lt;/code&gt; — either success or failure with a reason.&lt;/p&gt;
&lt;p&gt;Input guardrails run before the LLM is contacted. If any fails, execution stops immediately and the agent’s LLM is never called. Output guardrails run after the agent produces a response (and after structured output parsing, if configured).&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;InputGuardrail&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;piiGuardrail&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; input &lt;/span&gt;&lt;span&gt;-&gt;&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;desc&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;input&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;taskDescription&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;toLowerCase&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;if&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;desc&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;contains&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;ssn&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;||&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;desc&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;contains&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;credit card&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;GuardrailResult&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;failure&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Task description may contain personally identifiable information&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;GuardrailResult&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;success&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;};&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;OutputGuardrail&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;lengthGuardrail&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; output &lt;/span&gt;&lt;span&gt;-&gt;&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;if&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;output&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;rawResponse&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;length&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;&gt;&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;3000&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;GuardrailResult&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;failure&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Response is &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;+&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;output&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;rawResponse&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;length&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;+&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; chars, exceeds limit of 3000&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;GuardrailResult&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;success&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;};&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Both are configured per-task:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;var&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Write an executive summary&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;expectedOutput&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;A concise summary&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;agent&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;writer&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;inputGuardrails&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;piiGuardrail&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;outputGuardrails&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;lengthGuardrail&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h2 id=&quot;why-functional-interfaces&quot;&gt;Why Functional Interfaces&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The choice to make guardrails functional interfaces rather than annotation-based or configuration-driven has a few practical consequences.&lt;/p&gt;
&lt;p&gt;First, guardrails are composable. You can build them from lambdas, combine them, or wrap them in utility methods. A guardrail that checks for PII can be reused across every task in the ensemble without any framework-specific wiring.&lt;/p&gt;
&lt;p&gt;Second, they are testable in isolation. A guardrail is a pure function from input to result. You can unit test it without standing up an ensemble or mocking an LLM.&lt;/p&gt;
&lt;p&gt;Third, they are stateless by default. Guardrails may run concurrently in parallel workflows, and a lambda that holds no mutable state is thread-safe by construction. If you need stateful validation, thread safety is your responsibility.&lt;/p&gt;
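&lt;p&gt;As a rough illustration of the composability and testability points, the sketch below builds guardrails from lambdas and combines them with a small helper. The names are hypothetical stand-ins (&lt;code dir=&quot;auto&quot;&gt;GuardrailSketch&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;Guardrail&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;Result&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;allOf&lt;/code&gt;), not the AgentEnsemble types, and the guardrail here takes the raw response text directly to keep the example self-contained.&lt;/p&gt;

```java
class GuardrailSketch {
    /** Stand-in for a guardrail result: pass, or fail with a reason. Hypothetical type. */
    record Result(boolean success, String reason) {
        static Result ok() { return new Result(true, ""); }
        static Result fail(String reason) { return new Result(false, reason); }
    }

    /** Stand-in functional interface: a pure function from text to result. */
    interface Guardrail {
        Result check(String text);
    }

    /** Rejects text longer than the given limit. */
    static Guardrail maxLength(int limit) {
        return text -> text.length() > limit
                ? Result.fail("Response is " + text.length() + " chars, exceeds limit of " + limit)
                : Result.ok();
    }

    /** Rejects text that does not contain the required phrase. */
    static Guardrail mustContain(String phrase) {
        return text -> text.contains(phrase)
                ? Result.ok()
                : Result.fail("Missing required text: " + phrase);
    }

    /** Combines guardrails; the first failure wins and later guardrails are not evaluated. */
    static Guardrail allOf(Guardrail... guardrails) {
        return text -> {
            for (Guardrail guardrail : guardrails) {
                Result result = guardrail.check(text);
                if (!result.success()) {
                    return result;
                }
            }
            return Result.ok();
        };
    }

    public static void main(String[] args) {
        Guardrail summaryRules = allOf(maxLength(20), mustContain("Conclusion"));
        System.out.println(summaryRules.check("Conclusion: ship it"));
        System.out.println(summaryRules.check("No closing section"));
    }
}
```

&lt;p&gt;Because each guardrail holds no state and touches nothing outside its argument, it can be exercised with plain assertions and shared across concurrently running tasks.&lt;/p&gt;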
&lt;div&gt;&lt;h2 id=&quot;what-input-guardrails-see&quot;&gt;What Input Guardrails See&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code dir=&quot;auto&quot;&gt;GuardrailInput&lt;/code&gt; record carries everything you need to make a pre-execution decision:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code dir=&quot;auto&quot;&gt;taskDescription()&lt;/code&gt; — the task description text&lt;/li&gt;
&lt;li&gt;&lt;code dir=&quot;auto&quot;&gt;expectedOutput()&lt;/code&gt; — the expected output specification&lt;/li&gt;
&lt;li&gt;&lt;code dir=&quot;auto&quot;&gt;contextOutputs()&lt;/code&gt; — outputs from prior context tasks (immutable)&lt;/li&gt;
&lt;li&gt;&lt;code dir=&quot;auto&quot;&gt;agentRole()&lt;/code&gt; — the role of the agent about to execute&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This means you can write guardrails that check not just the current task, but also the outputs of upstream tasks. For example, the guardrail below rejects a writing task when no upstream output is longer than 100 characters, a rough proxy for the research task having produced no findings:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;InputGuardrail&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;requireResearch&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; input &lt;/span&gt;&lt;span&gt;-&gt;&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;boolean&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;hasResearch&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;input&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;contextOutputs&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;stream&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;anyMatch&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;o &lt;/span&gt;&lt;span&gt;-&gt;&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;o&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getRaw&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;length&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;&gt;&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;100&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;if&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;!&lt;/span&gt;&lt;span&gt;hasResearch) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;GuardrailResult&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;failure&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;No substantive research output found&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;GuardrailResult&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;success&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;};&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h2 id=&quot;output-guardrails-and-typed-output&quot;&gt;Output Guardrails and Typed Output&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;When a task uses &lt;code dir=&quot;auto&quot;&gt;outputType&lt;/code&gt; for structured output, the execution order is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Input guardrails run (before LLM)&lt;/li&gt;
&lt;li&gt;LLM executes and produces raw text&lt;/li&gt;
&lt;li&gt;Structured output parsing (JSON extraction + deserialization)&lt;/li&gt;
&lt;li&gt;Output guardrails run (with both &lt;code dir=&quot;auto&quot;&gt;rawResponse()&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;parsedOutput()&lt;/code&gt; available)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This means output guardrails can inspect the typed Java object directly:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;record&lt;/span&gt;&lt;span&gt; ResearchReport&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt; title, &lt;/span&gt;&lt;span&gt;List&amp;#x3C;&lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt;&gt;&lt;/span&gt;&lt;span&gt; findings, &lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt; conclusion&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;OutputGuardrail&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;findingsGuardrail&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; output &lt;/span&gt;&lt;span&gt;-&gt;&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;if&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;output&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;parsedOutput&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;instanceof&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ResearchReport&lt;/span&gt;&lt;span&gt; report) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;if&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;report&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;findings&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;==&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;null&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;||&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;report&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;findings&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;isEmpty&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;GuardrailResult&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;failure&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;                &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Report must include at least one finding&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        
&lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;GuardrailResult&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;success&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;};&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;This is where guardrails and typed outputs reinforce each other. The type system gives you a parsed object; the guardrail gives you a place to enforce business rules on that object.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;multiple-guardrails-and-evaluation-order&quot;&gt;Multiple Guardrails and Evaluation Order&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;When a task has multiple guardrails, they are evaluated in the order they appear in the list. The first failure stops evaluation — subsequent guardrails are not called.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;var&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Write an article&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;expectedOutput&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;An article&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;agent&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;writer&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;inputGuardrails&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;piiGuardrail, roleGuardrail, domainGuardrail&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;outputGuardrails&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;lengthGuardrail, 
conclusionGuardrail&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
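&lt;p&gt;The short-circuit behavior can be pictured as a small loop. This is an assumed model of the evaluation order described above, not the framework’s actual implementation; &lt;code dir=&quot;auto&quot;&gt;evaluateInOrder&lt;/code&gt; and the nested types are stand-ins invented for illustration.&lt;/p&gt;

```java
// Sketch only: an assumed model of in-order, short-circuit evaluation.
// The nested types are simplified stand-ins, not the library's classes.
class ShortCircuitSketch {

    record GuardrailInput(String taskDescription) {}

    record GuardrailResult(boolean ok, String message) {
        static GuardrailResult success() { return new GuardrailResult(true, ""); }
        static GuardrailResult failure(String message) { return new GuardrailResult(false, message); }
        boolean isSuccess() { return ok; }
        String getMessage() { return message; }
    }

    @FunctionalInterface
    interface InputGuardrail {
        GuardrailResult validate(GuardrailInput input);
    }

    // Hypothetical helper: runs guardrails in order and returns the first failure.
    static GuardrailResult evaluateInOrder(GuardrailInput input, InputGuardrail... guardrails) {
        for (InputGuardrail g : guardrails) {
            GuardrailResult r = g.validate(input);
            if (!r.isSuccess()) {
                return r; // later guardrails are never called
            }
        }
        return GuardrailResult.success();
    }

    public static void main(String[] args) {
        int[] calls = new int[1]; // counts how many guardrails actually ran

        InputGuardrail passes = in -> { calls[0]++; return GuardrailResult.success(); };
        InputGuardrail fails = in -> { calls[0]++; return GuardrailResult.failure("blocked"); };

        GuardrailResult r = evaluateInOrder(
                new GuardrailInput("Write an article"), passes, fails, passes);

        System.out.println(r.getMessage() + " after " + calls[0] + " calls");
        // prints: blocked after 2 calls
    }
}
```

&lt;p&gt;One practical consequence is that list order matters: placing cheap checks before expensive ones means a failing task costs as little as possible.&lt;/p&gt;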
&lt;p&gt;If you want to collect all failures rather than short-circuit, compose them into a single guardrail:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;InputGuardrail&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;compositeGuardrail&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; input &lt;/span&gt;&lt;span&gt;-&gt;&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;failures&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ArrayList&lt;/span&gt;&lt;span&gt;&amp;#x3C;&gt;();&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;for&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;InputGuardrail&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;g&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;:&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;piiGuardrail, roleGuardrail&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;GuardrailResult&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;r&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;g&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;validate&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;input&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;if&lt;/span&gt;&lt;span&gt; 
(&lt;/span&gt;&lt;span&gt;!&lt;/span&gt;&lt;span&gt;r&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;isSuccess&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;) &lt;/span&gt;&lt;span&gt;failures&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;add&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;r&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getMessage&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;failures&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;isEmpty&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;?&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;GuardrailResult&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;success&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;:&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;GuardrailResult&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;failure&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;join&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;; &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, failures&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;};&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h2 id=&quot;exception-propagation&quot;&gt;Exception Propagation&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;When a guardrail fails, &lt;code dir=&quot;auto&quot;&gt;GuardrailViolationException&lt;/code&gt; is thrown. It propagates through the workflow executor and is wrapped in &lt;code dir=&quot;auto&quot;&gt;TaskExecutionException&lt;/code&gt;, following the same pattern as other task failures.&lt;/p&gt;
&lt;p&gt;The exception carries structured information — guardrail type (INPUT or OUTPUT), violation message, task description, and agent role — so you can route failures to metrics or alerting without parsing error strings.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;try&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;run&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;} &lt;/span&gt;&lt;span&gt;catch&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;TaskExecutionException&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ex&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;if&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;ex&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getCause&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;instanceof&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;GuardrailViolationException&lt;/span&gt;&lt;span&gt; gve) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;metrics&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;increment&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;guardrail.violation.&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;+&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;gve&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getGuardrailType&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;log&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;warn&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Guardrail blocked task &apos;{}&apos;: {}&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;gve&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getTaskDescription&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;gve&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;getViolationMessage&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h2 id=&quot;the-tradeoff&quot;&gt;The Tradeoff&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Guardrails are deterministic checks, not semantic analysis. A length limit is easy to enforce. A toxicity check is harder — you would need to call an external classifier inside the guardrail, which adds latency and its own failure modes.&lt;/p&gt;
&lt;p&gt;The design intentionally keeps guardrails as simple synchronous functions. If you need async validation, external API calls, or retry logic, you implement that inside the guardrail function. The framework does not impose an opinion on how complex your validation should be.&lt;/p&gt;
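&lt;p&gt;One way to keep an external call from stalling a synchronous guardrail is to bound it with a timeout inside the function. The sketch below assumes a hypothetical &lt;code dir=&quot;auto&quot;&gt;ToxicityClassifier&lt;/code&gt; interface and uses simplified stand-ins for &lt;code dir=&quot;auto&quot;&gt;OutputGuardrail&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;GuardrailOutput&lt;/code&gt;, and &lt;code dir=&quot;auto&quot;&gt;GuardrailResult&lt;/code&gt;; it also assumes the output guardrail exposes a &lt;code dir=&quot;auto&quot;&gt;validate&lt;/code&gt; method like its input counterpart. Only the &lt;code dir=&quot;auto&quot;&gt;java.util.concurrent&lt;/code&gt; calls are real APIs.&lt;/p&gt;

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch only: stand-in types plus a hypothetical classifier interface.
class BoundedClassifierSketch {

    record GuardrailOutput(String rawResponse) {}

    record GuardrailResult(boolean ok, String message) {
        static GuardrailResult success() { return new GuardrailResult(true, ""); }
        static GuardrailResult failure(String message) { return new GuardrailResult(false, message); }
        boolean isSuccess() { return ok; }
        String getMessage() { return message; }
    }

    @FunctionalInterface
    interface OutputGuardrail {
        GuardrailResult validate(GuardrailOutput output);
    }

    // Hypothetical: any external scorer returning a value in [0, 1].
    @FunctionalInterface
    interface ToxicityClassifier {
        double score(String text);
    }

    static OutputGuardrail toxicityGuardrail(
            ToxicityClassifier classifier, double threshold, long timeoutMillis) {
        return output -> {
            // Run the external call off-thread so the wait can be bounded.
            var pending = CompletableFuture.supplyAsync(
                    () -> classifier.score(output.rawResponse()));
            try {
                double score = pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
                return score > threshold
                        ? GuardrailResult.failure("Toxicity score " + score + " exceeds " + threshold)
                        : GuardrailResult.success();
            } catch (TimeoutException e) {
                pending.cancel(true); // marks the future cancelled; the worker is not interrupted
                return GuardrailResult.failure("Toxicity check timed out");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return GuardrailResult.failure("Toxicity check interrupted");
            } catch (ExecutionException e) {
                return GuardrailResult.failure("Toxicity check failed: " + e.getCause());
            }
        };
    }

    public static void main(String[] args) {
        OutputGuardrail guardrail = toxicityGuardrail(text -> 0.1, 0.8, 1000);
        System.out.println(guardrail.validate(new GuardrailOutput("A calm summary")).isSuccess());
    }
}
```

&lt;p&gt;This version fails closed: a timeout or classifier error blocks the output. Whether to fail open instead is a policy decision that depends on how costly a missed violation is compared with a blocked task.&lt;/p&gt;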
&lt;p&gt;This means guardrails are most useful for structural and policy checks — length limits, required sections, PII filters, role-based access, schema validation on typed outputs. For semantic quality checks, the phase review and task reflection mechanisms (covered in earlier posts) are a better fit.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The full guardrails guide is in the &lt;a href=&quot;https://agentensemble.net/guides/guardrails/&quot;&gt;documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I’d be interested in whether the input/output split feels like the right abstraction, or whether you have seen validation needs that do not fit cleanly into either category.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Wiring Agent Ensembles into Spring Boot, Micronaut, and Quarkus</title><link>https://agentensemble.net/blog/framework-integration/</link><guid isPermaLink="true">https://agentensemble.net/blog/framework-integration/</guid><description>AgentEnsemble is a plain Java library. Framework integration is just dependency injection around the same builder API.

</description><pubDate>Sat, 23 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;One question that comes up early when evaluating an agent orchestration library is how it fits into an existing backend stack. If your services run on Spring Boot, Micronaut, or Quarkus, you want agents to live inside the same dependency injection container, use the same configuration system, and expose metrics through the same actuator endpoints.&lt;/p&gt;
&lt;p&gt;The interesting design decision in &lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt; is that it has no framework dependencies at all. It is a plain Java 21+ library with a builder API. Framework integration is just a matter of wrapping those builder calls in whatever DI mechanism your framework uses. Nothing in the library changes.&lt;/p&gt;
&lt;p&gt;This keeps the core small and testable, but it also means the integration patterns are worth spelling out explicitly.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-builder-api-as-the-integration-surface&quot;&gt;The Builder API as the Integration Surface&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Every AgentEnsemble component — agents, tasks, ensembles, memory stores, listeners — is created through builders. The framework never scans for annotations, never registers beans automatically, and never assumes a particular lifecycle model.&lt;/p&gt;
&lt;p&gt;This is deliberate. The builder API is the integration surface. In a DI container, you turn builder calls into bean definitions. In a plain &lt;code dir=&quot;auto&quot;&gt;main()&lt;/code&gt; method, you call the same builders directly.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;role&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research Analyst&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;goal&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Find accurate, up-to-date information&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;backstory&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;You are a meticulous researcher.&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;That same code works identically inside a Spring &lt;code dir=&quot;auto&quot;&gt;@Configuration&lt;/code&gt;, a Micronaut &lt;code dir=&quot;auto&quot;&gt;@Factory&lt;/code&gt;, a Quarkus CDI producer, or a static main method.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;spring-boot&quot;&gt;Spring Boot&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Spring Boot is the most common case. The LangChain4j Spring Boot starters handle &lt;code dir=&quot;auto&quot;&gt;ChatLanguageModel&lt;/code&gt; bean creation from &lt;code dir=&quot;auto&quot;&gt;application.properties&lt;/code&gt; automatically — AgentEnsemble does not duplicate that responsibility.&lt;/p&gt;
&lt;div&gt;&lt;h3 id=&quot;dependencies&quot;&gt;Dependencies&lt;/h3&gt;&lt;/div&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;dependencies&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;implementation&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;net.agentensemble:agentensemble-core:2.10.0&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;implementation&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;dev.langchain4j:langchain4j-spring-boot-starter:1.11.0&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;implementation&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;dev.langchain4j:langchain4j-open-ai-spring-boot-starter:1.11.0&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;// Optional: metrics via Spring Boot Actuator&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;implementation&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;net.agentensemble:agentensemble-metrics-micrometer:2.10.0&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h3 id=&quot;configuration-class&quot;&gt;Configuration Class&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;Spring injects the &lt;code dir=&quot;auto&quot;&gt;ChatLanguageModel&lt;/code&gt; bean (created by the LangChain4j starter) and any &lt;code dir=&quot;auto&quot;&gt;EnsembleListener&lt;/code&gt; beans you have declared elsewhere.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Configuration&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;class&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;AgentEnsembleConfig&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Bean&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;role&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research Analyst&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;goal&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Find accurate, up-to-date information on the given topic&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;backstory&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;You are a meticulous researcher with a talent for &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;                        &lt;/span&gt;&lt;span&gt;+&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;finding relevant information quickly.&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Bean&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;ChatLanguageModel&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;chatModel&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;EnsembleListener&lt;/span&gt;&lt;span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;listeners&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;Optional&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;ToolMetrics&lt;/span&gt;&lt;span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;toolMetrics&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;Builder&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;chatModel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;chatModel&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;agents&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;listeners&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;forEach&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;::&lt;/span&gt;&lt;span&gt;listener&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;toolMetrics&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ifPresent&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;::&lt;/span&gt;&lt;span&gt;toolMetrics&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The pattern here is standard Spring: declare beans, let Spring wire them. Any &lt;code dir=&quot;auto&quot;&gt;@Component&lt;/code&gt; implementing &lt;code dir=&quot;auto&quot;&gt;EnsembleListener&lt;/code&gt; is automatically collected via the &lt;code dir=&quot;auto&quot;&gt;List&amp;#x3C;EnsembleListener&gt;&lt;/code&gt; injection.&lt;/p&gt;
&lt;div&gt;&lt;h3 id=&quot;metrics-via-actuator&quot;&gt;Metrics via Actuator&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;If you use Micrometer with Spring Boot Actuator, declare a &lt;code dir=&quot;auto&quot;&gt;ToolMetrics&lt;/code&gt; bean and agent metrics appear at &lt;code dir=&quot;auto&quot;&gt;/actuator/metrics&lt;/code&gt; automatically:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Bean&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ToolMetrics&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;toolMetrics&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;MeterRegistry&lt;/span&gt;&lt;span&gt; registry&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;MicrometerToolMetrics&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;registry&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h3 id=&quot;using-the-ensemble&quot;&gt;Using the Ensemble&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;Inject the &lt;code dir=&quot;auto&quot;&gt;Ensemble&lt;/code&gt; bean wherever you need it. Build tasks at the call site where you have the runtime inputs:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Service&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;class&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ResearchService&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;private&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;final&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;private&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;final&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ResearchService&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;/span&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; ensemble;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;this&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; researcher;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;research&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;topic&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research and summarise: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;+&lt;/span&gt;&lt;span&gt; topic&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;expectedOutput&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;A concise summary with key findings&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;agent&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                
&lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;run&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;finalOutput&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h2 id=&quot;micronaut&quot;&gt;Micronaut&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;This example does not rely on a Micronaut-specific LangChain4j integration module, so you create the &lt;code dir=&quot;auto&quot;&gt;ChatLanguageModel&lt;/code&gt; bean directly. The rest of the pattern is the same: a &lt;code dir=&quot;auto&quot;&gt;@Factory&lt;/code&gt; class with &lt;code dir=&quot;auto&quot;&gt;@Singleton&lt;/code&gt; methods.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Factory&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;class&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;AgentEnsembleFactory&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Singleton&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ChatLanguageModel&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;chatModel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;            &lt;/span&gt;&lt;/span&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Value&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;${agentensemble.openai.api-key}&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;) &lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;apiKey&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;            &lt;/span&gt;&lt;/span&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Value&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;${agentensemble.openai.model-name}&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;) &lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;modelName&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;OpenAiChatModel&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;apiKey&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;apiKey&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                
&lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;modelName&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;modelName&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Singleton&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;ChatLanguageModel&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;chatModel&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;EnsembleListener&lt;/span&gt;&lt;span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;listeners&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;Builder&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;chatModel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;chatModel&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                
&lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;agents&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;listeners&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;forEach&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;::&lt;/span&gt;&lt;span&gt;listener&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Micronaut injects all &lt;code dir=&quot;auto&quot;&gt;EnsembleListener&lt;/code&gt; beans automatically via the &lt;code dir=&quot;auto&quot;&gt;List&amp;#x3C;EnsembleListener&gt;&lt;/code&gt; parameter. Micrometer metrics work out of the box since Micronaut ships with native Micrometer support.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;quarkus&quot;&gt;Quarkus&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Quarkus has its own &lt;code dir=&quot;auto&quot;&gt;quarkus-langchain4j&lt;/code&gt; extension with a different programming model. The example below uses the standard LangChain4j library directly with Quarkus CDI:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;ApplicationScoped&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;class&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;AgentEnsembleProducer&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;ConfigProperty&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;agentensemble.openai.api-key&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;apiKey&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Produces&lt;/span&gt;&lt;span&gt; @&lt;/span&gt;&lt;span&gt;ApplicationScoped&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ChatLanguageModel&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;chatModel&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;OpenAiChatModel&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;apiKey&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;apiKey&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;modelName&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;gpt-4o&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;@&lt;/span&gt;&lt;span&gt;Produces&lt;/span&gt;&lt;span&gt; @&lt;/span&gt;&lt;span&gt;ApplicationScoped&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;ChatLanguageModel&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;chatModel&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;Instance&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;EnsembleListener&lt;/span&gt;&lt;span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;listeners&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;Builder&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                
&lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;chatModel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;chatModel&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;                &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;agents&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;researcher&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;listeners&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;forEach&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;::&lt;/span&gt;&lt;span&gt;listener&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;return&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
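&lt;p&gt;Apart from the CDI annotations, the main Quarkus-specific detail is &lt;code dir=&quot;auto&quot;&gt;Instance&amp;#x3C;EnsembleListener&gt;&lt;/code&gt; instead of &lt;code dir=&quot;auto&quot;&gt;List&amp;#x3C;EnsembleListener&gt;&lt;/code&gt;. &lt;code dir=&quot;auto&quot;&gt;Instance&lt;/code&gt; is CDI’s lazy lookup handle, and because it is iterable, the &lt;code dir=&quot;auto&quot;&gt;listeners.forEach(builder::listener)&lt;/code&gt; call stays the same.&lt;/p&gt;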
&lt;div&gt;&lt;h2 id=&quot;the-design-tradeoff&quot;&gt;The Design Tradeoff&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The choice to keep AgentEnsemble framework-agnostic means there is no auto-configuration, no classpath scanning, and no starter module that wires everything with a single dependency. You write the configuration class yourself.&lt;/p&gt;
&lt;p&gt;The upside is that the integration is completely transparent. There is no hidden magic, no classpath-sensitive behavior, and no risk of version conflicts between the library’s framework assumptions and your application’s framework version. The builder API is the same everywhere, so moving between frameworks (or running without one) requires changing only the DI wiring.&lt;/p&gt;
&lt;p&gt;For teams that already have a preferred framework and know how to write configuration classes, this is usually the right tradeoff. The wiring code is small, readable, and lives in one place.&lt;/p&gt;
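&lt;p&gt;As a rough illustration of what “changing only the DI wiring” can look like without any framework, the sketch below wires everything by hand in one method. The types &lt;code dir=&quot;auto&quot;&gt;Listener&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;MiniEnsemble&lt;/code&gt; are hypothetical stand-ins invented for this example so that it compiles on its own; they are not AgentEnsemble classes, and the real builder API may differ.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;
```java
// Hypothetical stand-ins for illustration; these are NOT the AgentEnsemble types.
public class ManualWiringSketch {

    // Stand-in for a listener contract: receives a text event.
    public interface Listener {
        void onEvent(String event);
    }

    // Stand-in for an ensemble: notifies its listeners, then returns a result.
    public static final class MiniEnsemble {
        private final String modelName;
        private final Listener[] listeners;

        public MiniEnsemble(String modelName, Listener... listeners) {
            this.modelName = modelName;
            this.listeners = listeners.clone();
        }

        public String run(String task) {
            for (Listener listener : listeners) {
                listener.onEvent("started: " + task);
            }
            return modelName + " handled " + task;
        }
    }

    // The composition root: the one place that plays the role of a
    // @Configuration class or a @Factory when no framework is present.
    public static MiniEnsemble wire(StringBuilder log) {
        Listener logging = event -> log.append(event).append('\n');
        return new MiniEnsemble("stub-model", logging);
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        MiniEnsemble ensemble = wire(log);
        System.out.println(ensemble.run("research"));
        System.out.print(log);
    }
}
```
&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Here the &lt;code dir=&quot;auto&quot;&gt;wire&lt;/code&gt; method is the only code that knows how the pieces fit together, so moving to a framework would mean replacing that method with a configuration class and leaving the rest alone.&lt;/p&gt;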
&lt;div&gt;&lt;h2 id=&quot;what-crosses-the-di-boundary&quot;&gt;What Crosses the DI Boundary&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;A few integration points are worth calling out:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Listeners&lt;/strong&gt; integrate naturally as DI beans. Declare any &lt;code dir=&quot;auto&quot;&gt;EnsembleListener&lt;/code&gt; implementation as a bean, and the ensemble configuration collects them.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Memory&lt;/strong&gt; components (&lt;code dir=&quot;auto&quot;&gt;MemoryStore&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;EnsembleMemory&lt;/code&gt;) are created via builders and passed to the ensemble. In a DI framework, declare them as beans.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tools&lt;/strong&gt; are configured per-agent. Declare tool instances as beans and inject them into agent factory methods.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Metrics&lt;/strong&gt; via &lt;code dir=&quot;auto&quot;&gt;MicrometerToolMetrics&lt;/code&gt; plug into whatever &lt;code dir=&quot;auto&quot;&gt;MeterRegistry&lt;/code&gt; your framework provides.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The general rule: if a component is created via a builder, it can be a bean. If it is passed to the ensemble builder, it can be injected.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The framework integration guide and full code examples are in the &lt;a href=&quot;https://agentensemble.net/guides/framework-integration/&quot;&gt;documentation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I’d be interested in whether this level of framework-agnosticism feels right, or whether starter modules that auto-configure common setups would be more useful for your team.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>spring</category></item><item><title>Operating Agent Networks: Visual Topology, Drill-Down, and Runtime Visibility</title><link>https://agentensemble.net/blog/network-dashboard-and-operations/</link><guid isPermaLink="true">https://agentensemble.net/blog/network-dashboard-and-operations/</guid><description>The operational layer for agent networks -- interactive topology graphs, per-ensemble drill-down, audit trails, and the dashboard that ties discovery, capacity, and health together.

</description><pubDate>Thu, 21 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Building an agent network is one problem. Operating it is a different one. When you have ten ensembles communicating over WebSockets, sharing capabilities via discovery, routing requests across federation boundaries, and managing capacity with priority queues — you need to see what is happening.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-visibility-gap&quot;&gt;The visibility gap&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Individual ensemble dashboards show what one ensemble is doing. They do not show the network — how ensembles relate to each other, where requests flow, and where bottlenecks form.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-network-dashboard&quot;&gt;The network dashboard&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;AgentEnsemble’s network dashboard provides a topology view of the entire ensemble network:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ensembles as interactive nodes with lifecycle state, queue depth, and progress&lt;/li&gt;
&lt;li&gt;Shared capabilities as animated edges between nodes&lt;/li&gt;
&lt;li&gt;Click to open ensemble detail sidebar (capabilities, metrics, connection status)&lt;/li&gt;
&lt;li&gt;Drill-down to live execution dashboard for any ensemble&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Architecturally, the dashboard opens an independent WebSocket connection to each ensemble; there is no central aggregator. The dashboard keeps no persistent state, so a refresh simply reconnects and rebuilds the view.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;audit-trail&quot;&gt;Audit trail&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The audit trail is the historical record of network events: work requests, capacity changes, discovery events, and federation routing decisions. It is append-only and backed by the same transport infrastructure (in-memory for dev, Kafka for production).&lt;/p&gt;
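&lt;p&gt;To show what “append-only” means in practice, here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: the class &lt;code dir=&quot;auto&quot;&gt;AuditTrailSketch&lt;/code&gt;, the &lt;code dir=&quot;auto&quot;&gt;AuditEvent&lt;/code&gt; record, and the event type strings are hypothetical and do not reflect how AgentEnsemble stores its audit trail.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;
```java
import java.util.Arrays;

// Hypothetical in-memory audit trail for illustration; not the AgentEnsemble implementation.
public class AuditTrailSketch {

    // One recorded network event, e.g. a work request or a capacity change.
    public record AuditEvent(String type, String detail) {}

    private AuditEvent[] events = new AuditEvent[0];

    // Append-only: the only mutation offered is adding to the end.
    // Copying the array on every append is fine for a sketch, not for high volume.
    public synchronized void append(AuditEvent event) {
        events = Arrays.copyOf(events, events.length + 1);
        events[events.length - 1] = event;
    }

    // Readers receive a copy, so they cannot rewrite recorded history.
    public synchronized AuditEvent[] snapshot() {
        return events.clone();
    }

    public static void main(String[] args) {
        AuditTrailSketch trail = new AuditTrailSketch();
        trail.append(new AuditEvent("work-request", "kitchen/prepare-meal"));
        trail.append(new AuditEvent("capacity-change", "kitchen queue depth 3"));
        for (AuditEvent event : trail.snapshot()) {
            System.out.println(event.type() + ": " + event.detail());
        }
    }
}
```
&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The design point is that there is no update or delete operation at all, so the order of events is the order in which they were appended.&lt;/p&gt;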
&lt;div&gt;&lt;h2 id=&quot;three-levels-of-visibility&quot;&gt;Three levels of visibility&lt;/h2&gt;&lt;/div&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Network level&lt;/strong&gt; — topology, connections, capacity distribution, routing patterns&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ensemble level&lt;/strong&gt; — queue depth, active tasks, shared capabilities, health&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Execution level&lt;/strong&gt; — individual task traces, agent iterations, tool calls&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Each level answers different questions. The network dashboard provides levels 1 and 2, with drill-down to level 3. The audit trail adds the historical dimension.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The network dashboard is part of &lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href=&quot;https://agentensemble.net/guides/network-dashboard/&quot;&gt;network dashboard guide&lt;/a&gt; covers setup, and the &lt;a href=&quot;https://agentensemble.net/guides/audit-trail/&quot;&gt;audit trail guide&lt;/a&gt; covers the historical event log.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Testing Distributed Agent Systems: Stubs, Recordings, and Isolation</title><link>https://agentensemble.net/blog/testing-distributed-agent-systems/</link><guid isPermaLink="true">https://agentensemble.net/blog/testing-distributed-agent-systems/</guid><description>How to test agent ensembles that depend on remote capabilities -- network stubs for predictable behavior, recordings for assertion, and isolation testing without real connections.

</description><pubDate>Tue, 19 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Testing a single agent ensemble is already harder than testing most software: the output is non-deterministic, the execution path depends on LLM responses, and the number of iterations is unpredictable.&lt;/p&gt;
&lt;p&gt;Testing a &lt;em&gt;network&lt;/em&gt; of agent ensembles adds distributed system concerns on top of that: WebSocket connections between services, shared state across ensembles, capability discovery, and cross-ensemble delegation.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-testing-problem&quot;&gt;The testing problem&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;An ensemble that delegates work via &lt;code dir=&quot;auto&quot;&gt;NetworkTask&lt;/code&gt; or &lt;code dir=&quot;auto&quot;&gt;NetworkTool&lt;/code&gt; has external dependencies. In tests, you need control over what those dependencies return without running real ensembles.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;stubs-for-predictable-behavior&quot;&gt;Stubs for predictable behavior&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;NetworkTask.stub()&lt;/code&gt; returns canned responses without connecting to any real ensemble:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;StubNetworkTask&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;mealStub&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;NetworkTask&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;stub&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;prepare-meal&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Meal prepared: wagyu steak, medium-rare. Estimated 25 minutes.&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;roomService&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;chatLanguageModel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;model&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Handle room service request&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;mealStub&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The stub makes network behavior deterministic, while the ensemble’s own LLM interactions remain non-deterministic.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;recordings-for-assertion&quot;&gt;Recordings for assertion&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;NetworkTask.recording()&lt;/code&gt; captures every request for later assertion:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;RecordingNetworkTask&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;recorder&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;NetworkTask&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;recording&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;prepare-meal&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;roomService&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;run&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;assertThat&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&lt;span&gt;recorder&lt;/span&gt;&lt;span&gt;.callCount&lt;/span&gt;&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;isEqualTo&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;1&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;assertThat&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&lt;span&gt;recorder&lt;/span&gt;&lt;span&gt;.lastRequest&lt;/span&gt;&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;contains&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;wagyu&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h2 id=&quot;testing-patterns-summary&quot;&gt;Testing patterns summary&lt;/h2&gt;&lt;/div&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;What to test&lt;/th&gt;&lt;th&gt;Tool&lt;/th&gt;&lt;th&gt;Approach&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Ensemble uses network response correctly&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;NetworkTask.stub()&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Canned response, deterministic&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Ensemble sends correct request&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;NetworkTask.recording()&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Capture and assert&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Two ensembles work together&lt;/td&gt;&lt;td&gt;In-process transport&lt;/td&gt;&lt;td&gt;Real interaction, no network&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;End-to-end&lt;/td&gt;&lt;td&gt;WebSocket transport&lt;/td&gt;&lt;td&gt;Full integration test&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;div&gt;&lt;h2 id=&quot;the-design-principle&quot;&gt;The design principle&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Network behavior and business logic are separable concerns. Test doubles let you test business logic without infrastructure. In-process transport lets you test interaction without the network. Full integration tests verify everything works together.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Network testing tools are part of &lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href=&quot;https://agentensemble.net/guides/network-testing/&quot;&gt;network testing guide&lt;/a&gt; covers the full API.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Capacity Management in Agent Networks: Rate Limiting, Priority Queues, and Backpressure</title><link>https://agentensemble.net/blog/capacity-management-and-backpressure/</link><guid isPermaLink="true">https://agentensemble.net/blog/capacity-management-and-backpressure/</guid><description>Protecting agent networks from themselves -- rate limiting, priority queues with aging, operational profiles, and scheduled tasks for capacity-aware infrastructure.

</description><pubDate>Sun, 17 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Agent ensembles that run as long-lived services on a network will, at some point, receive more work than they can handle. The question is what happens next.&lt;/p&gt;
&lt;p&gt;Without capacity management, the answer is usually one of three failure modes: unbounded queue growth that ends in an out-of-memory error, random request dropping, or cascading failures in which an overloaded ensemble backs up its callers.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-capacity-problem-in-agent-networks&quot;&gt;The capacity problem in agent networks&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Agent workloads have properties that make capacity management harder than in traditional request/response systems:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Variable execution time.&lt;/strong&gt; A simple analysis task might take 5 seconds. A complex coding task might take 5 minutes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Variable cost.&lt;/strong&gt; Each agent iteration consumes LLM tokens. An overloaded system burns money faster.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Fan-out amplification.&lt;/strong&gt; One incoming request to a coordinator might fan out to 5 different ensembles.&lt;/li&gt;
&lt;/ul&gt;
&lt;div&gt;&lt;h2 id=&quot;three-layers-of-capacity-management&quot;&gt;Three layers of capacity management&lt;/h2&gt;&lt;/div&gt;
&lt;div&gt;&lt;h3 id=&quot;1-reactive-rate-limits-and-backpressure&quot;&gt;1. Reactive: Rate limits and backpressure&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;Concurrency limits protect ensembles from overload. When the limit is reached, requests queue. When the queue is full, backpressure signals propagate upstream.&lt;/p&gt;
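&lt;p&gt;As a rough illustration of this reactive layer, the sketch below models admission with two semaphores: one for requests that are executing and one for queue slots. The class &lt;code dir=&quot;auto&quot;&gt;AdmissionSketch&lt;/code&gt; and its methods are hypothetical and are not part of the AgentEnsemble API; the &lt;code dir=&quot;auto&quot;&gt;REJECTED&lt;/code&gt; decision stands in for the backpressure signal sent upstream.&lt;/p&gt;

```java
import java.util.concurrent.Semaphore;

/** Hypothetical admission control sketch; not the AgentEnsemble implementation. */
final class AdmissionSketch {

    enum Decision { RUN, QUEUED, REJECTED }

    private final Semaphore runSlots;   // permits for requests executing right now
    private final Semaphore queueSlots; // permits for requests waiting their turn

    AdmissionSketch(int maxConcurrent, int queueCapacity) {
        this.runSlots = new Semaphore(maxConcurrent);
        this.queueSlots = new Semaphore(queueCapacity);
    }

    /** Decides what happens to a newly arrived request without blocking the caller. */
    Decision admit() {
        if (runSlots.tryAcquire()) {
            return Decision.RUN;      // below the concurrency limit: execute immediately
        }
        if (queueSlots.tryAcquire()) {
            return Decision.QUEUED;   // limit reached, but the queue still has room
        }
        return Decision.REJECTED;     // queue full: signal backpressure to the caller
    }

    /** Called when a running request finishes, freeing its execution slot. */
    void complete() {
        runSlots.release();
    }

    public static void main(String[] args) {
        AdmissionSketch limiter = new AdmissionSketch(1, 1);
        System.out.println(limiter.admit()); // RUN
        System.out.println(limiter.admit()); // QUEUED
        System.out.println(limiter.admit()); // REJECTED
    }
}
```

&lt;p&gt;The sketch leaves out the step that moves a queued request into a freed execution slot; it only shows the three outcomes a caller can observe.&lt;/p&gt;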
&lt;div&gt;&lt;h3 id=&quot;2-priority-queues-with-aging&quot;&gt;2. Priority: Queues with aging&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;PriorityRequestQueue&lt;/code&gt; adds priority levels with aging to prevent starvation. Requests waiting beyond the aging interval get promoted, guaranteeing every request is eventually processed.&lt;/p&gt;
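&lt;p&gt;One way to picture aging is as a promotion of one priority level per interval waited. The sketch below assumes numeric levels where 0 is the most urgent and a strictly positive aging interval; &lt;code dir=&quot;auto&quot;&gt;AgingSketch&lt;/code&gt; is a hypothetical helper, and the actual promotion rule in &lt;code dir=&quot;auto&quot;&gt;PriorityRequestQueue&lt;/code&gt; may differ.&lt;/p&gt;

```java
import java.time.Duration;

/** Hypothetical illustration of priority aging; not the PriorityRequestQueue implementation. */
final class AgingSketch {

    /** Priority levels are plain integers, where 0 is the most urgent. */
    static final int HIGHEST = 0;

    /**
     * Promotes a request by one level for every full aging interval it has waited,
     * never going past the highest level. The interval must be positive.
     */
    static int effectivePriority(int basePriority, Duration waited, Duration agingInterval) {
        long promotions = waited.toMillis() / agingInterval.toMillis();
        return (int) Math.max(HIGHEST, basePriority - promotions);
    }

    public static void main(String[] args) {
        Duration interval = Duration.ofSeconds(30);
        // A level-3 request that has waited 95 seconds has aged three times.
        System.out.println(effectivePriority(3, Duration.ofSeconds(95), interval)); // 0
        // A request that has waited less than one interval keeps its base priority.
        System.out.println(effectivePriority(3, Duration.ofSeconds(10), interval)); // 3
    }
}
```

&lt;p&gt;Because the effective priority only ever improves with waiting time, any request eventually reaches the top level, which is what rules out starvation.&lt;/p&gt;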
&lt;div&gt;&lt;h3 id=&quot;3-proactive-operational-profiles&quot;&gt;3. Proactive: Operational profiles&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;NetworkProfile&lt;/code&gt; bundles per-ensemble capacity targets and shared memory pre-load directives into deployable units. A profile can be applied on a schedule, through the directive system, or by manual trigger.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;NetworkProfile&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;weekendProfile&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;NetworkProfile&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;sporting-event-weekend&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;front-desk&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;Capacity&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;replicas&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;4&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;maxConcurrent&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;50&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;Capacity&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;replicas&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;3&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;maxConcurrent&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;100&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;preload&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;inventory&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Extra beer and ice stocked&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h2 id=&quot;the-design-principle&quot;&gt;The design principle&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Each layer addresses a different time horizon: seconds (rate limits), minutes (priority queues), and hours/days (operational profiles). Together, they give operators the tools to keep an agent network running under variable load.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Capacity management is part of &lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href=&quot;https://agentensemble.net/guides/rate-limiting/&quot;&gt;rate limiting guide&lt;/a&gt;, &lt;a href=&quot;https://agentensemble.net/guides/operational-profiles/&quot;&gt;operational profiles guide&lt;/a&gt;, and &lt;a href=&quot;https://agentensemble.net/guides/scheduled-tasks/&quot;&gt;scheduled tasks guide&lt;/a&gt; cover the full APIs.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Shared Memory Across Agent Ensembles: Consistency Models for Distributed State</title><link>https://agentensemble.net/blog/shared-memory-and-consistency/</link><guid isPermaLink="true">https://agentensemble.net/blog/shared-memory-and-consistency/</guid><description>When multiple agent ensembles need shared state -- eventual consistency, optimistic locking, distributed locks, and choosing the right consistency model for the data.

</description><pubDate>Fri, 15 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;When agent ensembles operate as independent services on a network, they occasionally need to share state. The question is not whether to share state — it is how to share it without creating the coordination problems that shared mutable state always creates in distributed systems.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-consistency-spectrum&quot;&gt;The consistency spectrum&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Not all shared state needs the same consistency guarantees:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Inventory notes&lt;/strong&gt; are advisory. Eventual consistency is fine.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Room assignments&lt;/strong&gt; are exclusive: two ensembles must never claim the same room, so this state needs distributed locks or optimistic locking.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Configuration preferences&lt;/strong&gt; are rarely updated. Eventual consistency with version tracking works well.&lt;/li&gt;
&lt;/ul&gt;
&lt;div&gt;&lt;h2 id=&quot;sharedmemory-with-configurable-consistency&quot;&gt;SharedMemory with configurable consistency&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;AgentEnsemble v3.0.0 introduces &lt;code dir=&quot;auto&quot;&gt;SharedMemory&lt;/code&gt; with per-scope consistency selection:&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Model&lt;/th&gt;&lt;th&gt;Behavior&lt;/th&gt;&lt;th&gt;Use case&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;EVENTUAL&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Last-write-wins, no coordination&lt;/td&gt;&lt;td&gt;Context, preferences, notes&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;OPTIMISTIC&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Version-checked writes, retry on conflict&lt;/td&gt;&lt;td&gt;Counters, shared documents&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;LOCKED&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Distributed lock before each read/write&lt;/td&gt;&lt;td&gt;Room assignments, exclusive resources&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Different scopes can use different models:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;SharedMemory&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;inventory&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;SharedMemory&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;store&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;MemoryStore&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;inMemory&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;consistency&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Consistency&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;&lt;span&gt;EVENTUAL&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;SharedMemory&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;rooms&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;SharedMemory&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;store&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;MemoryStore&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;inMemory&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;consistency&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Consistency&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;&lt;span&gt;LOCKED&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
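&lt;p&gt;To make the &lt;code dir=&quot;auto&quot;&gt;OPTIMISTIC&lt;/code&gt; row of the table more concrete, here is a minimal in-process sketch of a checked write with retry on conflict. It uses &lt;code dir=&quot;auto&quot;&gt;AtomicLong.compareAndSet&lt;/code&gt; from the JDK, with the stored value itself standing in for the version; &lt;code dir=&quot;auto&quot;&gt;OptimisticCounterSketch&lt;/code&gt; is a hypothetical class, and how &lt;code dir=&quot;auto&quot;&gt;SharedMemory&lt;/code&gt; detects and retries conflicts across a network is not shown here.&lt;/p&gt;

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical optimistic-write sketch; SharedMemory's real conflict handling may differ. */
final class OptimisticCounterSketch {

    private final AtomicLong value = new AtomicLong();

    /**
     * Adds delta using read, compute, compare-and-set. If another writer changed the
     * value in between, the compare-and-set fails and the loop retries with fresh state.
     * Returns the number of attempts it took.
     */
    int add(long delta) {
        int attempts = 0;
        while (true) {
            attempts++;
            long seen = value.get();                  // read the current state
            long updated = seen + delta;              // compute the new state
            if (value.compareAndSet(seen, updated)) { // write only if nothing changed
                return attempts;
            }
        }
    }

    long current() {
        return value.get();
    }

    public static void main(String[] args) throws InterruptedException {
        OptimisticCounterSketch counter = new OptimisticCounterSketch();
        Runnable writer = () -> {
            for (int i = 0; i != 1000; i++) {
                counter.add(1);
            }
        };
        Thread a = new Thread(writer);
        Thread b = new Thread(writer);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(counter.current()); // 2000: no update is lost
    }
}
```

&lt;p&gt;No writer ever blocks another, which is why this model suits counters and shared documents where conflicts are occasional; under heavy contention the retries add up, and a lock becomes the better fit.&lt;/p&gt;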
&lt;div&gt;&lt;h2 id=&quot;the-design-principle&quot;&gt;The design principle&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The useful insight is that shared state in agent networks is not monolithic. Different categories of state need different consistency guarantees, and forcing a single model is either too expensive or too weak.&lt;/p&gt;
&lt;p&gt;The consistency model is a property of the data, not a property of the system. Choose it based on what happens when two ensembles access the same state concurrently.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Shared memory is part of &lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href=&quot;https://agentensemble.net/guides/shared-memory/&quot;&gt;shared memory guide&lt;/a&gt; covers the full API including consistency models and network configuration.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Federation for Agent Networks: Cross-Namespace Capability Sharing via Realms</title><link>https://agentensemble.net/blog/federation-across-namespaces/</link><guid isPermaLink="true">https://agentensemble.net/blog/federation-across-namespaces/</guid><description>How agent ensembles share capabilities across Kubernetes namespaces and clusters -- realms, capacity advertisement, federated routing, and trust boundaries.

</description><pubDate>Wed, 13 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Discovery lets ensembles find capabilities within a network. But in a real deployment, not every ensemble lives in the same namespace or even the same cluster. A hotel chain might run separate ensemble networks at each property, each in its own Kubernetes namespace, but want them to share spare capacity when one property is overloaded.&lt;/p&gt;
&lt;p&gt;This is the federation problem: how do you extend capability discovery across trust and network boundaries without collapsing everything into one flat namespace?&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;realms-as-trust-boundaries&quot;&gt;Realms as trust boundaries&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;AgentEnsemble v3.0.0 introduces &lt;strong&gt;realms&lt;/strong&gt; as the organizational unit for federation. A realm is a namespace-level discovery and trust boundary — typically mapping to a Kubernetes namespace in production deployments.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;FederationConfig&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;federation&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;FederationConfig&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;localRealm&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;hotel-downtown&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;federationName&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Hotel Chain&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;realm&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;hotel-airport&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;hotel-airport-ns&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;realm&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;hotel-beach&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;hotel-beach-ns&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Within a realm, ensembles discover each other freely. Cross-realm discovery requires explicit opt-in: an ensemble must advertise its capacity as &lt;strong&gt;shareable&lt;/strong&gt; for other realms to use it.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;capacity-advertisement&quot;&gt;Capacity advertisement&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Ensembles periodically broadcast their current load and availability. The &lt;code dir=&quot;auto&quot;&gt;shareable&lt;/code&gt; flag is the federation gate — when &lt;code dir=&quot;auto&quot;&gt;true&lt;/code&gt;, spare capacity is available to other realms.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-routing-hierarchy&quot;&gt;The routing hierarchy&lt;/h2&gt;&lt;/div&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Priority&lt;/th&gt;&lt;th&gt;Scope&lt;/th&gt;&lt;th&gt;Condition&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;1 (highest)&lt;/td&gt;&lt;td&gt;Local realm&lt;/td&gt;&lt;td&gt;Provider is in the same realm&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;Same realm (unregistered)&lt;/td&gt;&lt;td&gt;Provider has no realm info (assumed local)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;3 (lowest)&lt;/td&gt;&lt;td&gt;Cross-realm&lt;/td&gt;&lt;td&gt;Provider is in a different realm and &lt;code dir=&quot;auto&quot;&gt;shareable = true&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Within each level, the least-loaded provider is preferred. The hierarchy encodes a simple principle: prefer local providers, fall back to cross-realm when local capacity is insufficient.&lt;/p&gt;
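&lt;p&gt;The selection rule can be sketched as a two-key comparison: routing level first, load second. The code below is a simplified illustration under stated assumptions. The &lt;code dir=&quot;auto&quot;&gt;Provider&lt;/code&gt; record, the &lt;code dir=&quot;auto&quot;&gt;RoutingSketch&lt;/code&gt; class, and the load value between 0 and 1 are all hypothetical, and the sketch falls back to a lower level only when no provider exists at a higher one, which is narrower than the capacity-based fallback described above.&lt;/p&gt;

```java
/** Hypothetical routing sketch; names and types are illustrative, not the AgentEnsemble API. */
final class RoutingSketch {

    /** A provider advertisement; realm may be null when the provider has not registered one. */
    record Provider(String name, String realm, boolean shareable, double load) {}

    /** Returns 1 to 3 following the routing table, or 0 if the provider is not eligible. */
    static int tier(Provider p, String localRealm) {
        if (p.realm() == null) {
            return 2;                       // no realm info: assumed local
        }
        if (p.realm().equals(localRealm)) {
            return 1;                       // same realm
        }
        return p.shareable() ? 3 : 0;       // other realm: only if it opted in
    }

    /** Picks the eligible provider with the best tier, breaking ties by lowest load. */
    static Provider select(Provider[] providers, String localRealm) {
        Provider best = null;
        int bestTier = 0;
        for (Provider p : providers) {
            int t = tier(p, localRealm);
            if (t == 0) {
                continue;                   // cross-realm provider that is not shareable
            }
            boolean better;
            if (best == null) {
                better = true;
            } else if (t != bestTier) {
                better = bestTier > t;      // a lower tier number wins
            } else {
                better = best.load() > p.load();
            }
            if (better) {
                best = p;
                bestTier = t;
            }
        }
        return best; // null when no provider is eligible
    }

    public static void main(String[] args) {
        Provider[] providers = {
            new Provider("airport-kitchen", "hotel-airport", true, 0.10),
            new Provider("downtown-kitchen-a", "hotel-downtown", false, 0.80),
            new Provider("downtown-kitchen-b", "hotel-downtown", false, 0.40),
        };
        System.out.println(select(providers, "hotel-downtown").name()); // downtown-kitchen-b
    }
}
```

&lt;p&gt;Note that the lightly loaded airport provider loses to both downtown providers: load only breaks ties within a level and never overrides locality.&lt;/p&gt;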
&lt;div&gt;&lt;h2 id=&quot;the-design-principle&quot;&gt;The design principle&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Federation is a capacity-sharing problem, not a networking problem. The networking already works across boundaries. What federation adds is a policy layer: who can use whose spare capacity, and in what order.&lt;/p&gt;
&lt;p&gt;Realms provide the organizational unit. Capacity advertisement provides the data. The routing hierarchy provides the policy. Together, they turn independent agent networks into a cooperative federation that shares spare capacity while maintaining operational independence.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Federation is part of &lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href=&quot;https://agentensemble.net/guides/federation/&quot;&gt;federation guide&lt;/a&gt; covers the full API including capacity advertisement and realm configuration.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Dynamic Discovery in Agent Networks: From Hardcoded Routes to Capability Catalogs</title><link>https://agentensemble.net/blog/discovery-and-capability-catalogs/</link><guid isPermaLink="true">https://agentensemble.net/blog/discovery-and-capability-catalogs/</guid><description>How agent ensembles discover each other&apos;s capabilities at runtime -- tag-based capability catalogs, dynamic wiring, and the shift from static to discovered routing.

</description><pubDate>Mon, 11 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;The simplest way to connect two agent ensembles is a direct reference: ensemble A knows ensemble B’s address and calls it. This works when you have two or three ensembles with stable relationships.&lt;/p&gt;
&lt;p&gt;It stops working when you have ten ensembles, or when ensembles come and go, or when the same capability is provided by multiple ensembles and you want the caller to use whichever one is available. At that point, you need discovery — a way for ensembles to find capabilities without knowing in advance who provides them.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-static-wiring-problem&quot;&gt;The static wiring problem&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;In a statically wired agent network, every cross-ensemble call requires knowing the provider’s identity and address. This creates coupling. If the provider moves, every caller needs updating. If you add a second provider for capacity, callers need load-balancing logic.&lt;/p&gt;
&lt;p&gt;The fundamental issue is that callers should care about &lt;em&gt;what&lt;/em&gt; they need, not &lt;em&gt;who&lt;/em&gt; provides it or &lt;em&gt;where&lt;/em&gt; it runs.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;capability-advertisement-with-tags&quot;&gt;Capability advertisement with tags&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;AgentEnsemble v3.0.0 introduces capability discovery. Ensembles advertise their shared tasks and tools with optional tags, and other ensembles discover providers at runtime:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;chatLanguageModel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;model&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Manage kitchen operations&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;shareTool&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;check-inventory&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, inventoryTool, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;food&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;stock&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;shareTask&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;prepare-meal&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, mealTask, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;food&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;cooking&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;start&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;7329&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;// Another ensemble discovers capabilities dynamically&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;NetworkTool&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;inventoryCheck&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;NetworkTool&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;discover&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;check-inventory&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, registry&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Tags classify capabilities for filtered discovery. Query for categories rather than specific names:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;CapabilityInfo&lt;/span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;food&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;registry&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;findByTag&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;food&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;CapabilityInfo&lt;/span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;stockChecks&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;registry&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;findByTags&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;food&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;stock&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h2 id=&quot;dynamic-vs-static-wiring&quot;&gt;Dynamic vs. static wiring&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The two approaches coexist. Use static wiring for well-known, stable relationships. Use dynamic discovery for capabilities that may be provided by different ensembles depending on deployment, capacity, or availability.&lt;/p&gt;
&lt;p&gt;The agent using the task or tool does not know which approach was used to create it.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;tradeoffs&quot;&gt;Tradeoffs&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Discovery adds a lookup step.&lt;/strong&gt; Initial resolution queries the registry; subsequent uses are cached.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tag semantics are convention-based.&lt;/strong&gt; No schema enforcement — tag conventions need to be agreed upon across teams.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Multiple providers create ambiguity.&lt;/strong&gt; The registry needs a selection strategy (least-loaded, round-robin, affinity-based).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Registry availability is a dependency.&lt;/strong&gt; For critical paths, consider falling back to static wiring when discovery is unavailable.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-design-principle&quot;&gt;The design principle&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The useful abstraction is separating &lt;em&gt;what&lt;/em&gt; from &lt;em&gt;who&lt;/em&gt;. An ensemble that needs a capability should express that need without specifying the provider. This separation enables the network to evolve — new providers can come online, existing providers can be replaced, capacity can be redistributed — without callers needing to change.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Capability discovery is part of &lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href=&quot;https://agentensemble.net/guides/discovery/&quot;&gt;discovery guide&lt;/a&gt; covers the full API including tag-based filtering.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Durable Transport for Agent Networks: Moving from In-Process Queues to Kafka</title><link>https://agentensemble.net/blog/durable-transport-with-kafka/</link><guid isPermaLink="true">https://agentensemble.net/blog/durable-transport-with-kafka/</guid><description>What changes when agent networks need delivery guarantees -- Kafka-backed request queues, delivery registries, and the operational realities of durable messaging.

</description><pubDate>Sat, 09 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;In-process queues are fine for development. They are fast, deterministic, and require zero infrastructure. But they have a property that becomes a liability in production: when the process dies, the queue contents disappear.&lt;/p&gt;
&lt;p&gt;For agent networks that run as long-lived services — handling work requests over hours or days — losing queued requests on restart is not acceptable. The transport layer needs durability, and that means moving from in-process data structures to something that survives process failures.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;what-durability-means-for-agent-networks&quot;&gt;What durability means for agent networks&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;An agent ensemble network has three communication patterns that need durable backing:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Work request delivery&lt;/strong&gt; — a request from one ensemble to another should not be lost if the receiving ensemble is temporarily unavailable&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Response routing&lt;/strong&gt; — when an ensemble completes a request, the response needs to reach the original caller even if the caller restarted&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Capability advertisement&lt;/strong&gt; — shared tasks and tools should remain discoverable across process restarts&lt;/li&gt;
&lt;/ol&gt;
&lt;div&gt;&lt;h2 id=&quot;kafka-as-the-transport-backing&quot;&gt;Kafka as the transport backing&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code dir=&quot;auto&quot;&gt;agentensemble-transport-kafka&lt;/code&gt; module implements the transport SPIs against Apache Kafka:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;KafkaTransportConfig&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;config&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;KafkaTransportConfig&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;bootstrapServers&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kafka:9092&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;consumerGroupId&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen-ensemble&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;topicPrefix&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;agentensemble.&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h3 id=&quot;request-queues&quot;&gt;Request queues&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;KafkaRequestQueue&lt;/code&gt; produces work requests to a Kafka topic and consumes them with manual offset commits. Because the offset is committed only after processing completes, a request that was in flight when the ensemble crashed is redelivered on restart.&lt;/p&gt;
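&lt;p&gt;The redelivery behavior can be modeled without a broker. The sketch below is a toy, in-memory stand-in and not the Kafka client or the &lt;code dir=&quot;auto&quot;&gt;KafkaRequestQueue&lt;/code&gt; implementation: &lt;code dir=&quot;auto&quot;&gt;CommitLog&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;peekNext&lt;/code&gt;, and &lt;code dir=&quot;auto&quot;&gt;commit&lt;/code&gt; are hypothetical names chosen for illustration. It only shows why a read without a commit leads to the same request being delivered again.&lt;/p&gt;

```java
/** Toy model of one partition plus a committed offset; not the Kafka client API. */
final class CommitLog {
    private final String[] records;
    private int committedOffset = 0; // index of the next record to deliver

    CommitLog(String... records) {
        this.records = records.clone();
    }

    /** Returns the record at the committed offset, or null when everything is consumed. */
    String peekNext() {
        return committedOffset == records.length ? null : records[committedOffset];
    }

    /** Marks the current record as done, playing the role of a manual offset commit. */
    void commit() {
        if (committedOffset == records.length) {
            throw new IllegalStateException("nothing left to commit");
        }
        committedOffset++;
    }
}

class RedeliveryDemo {
    public static void main(String[] args) {
        CommitLog log = new CommitLog("req-1", "req-2");

        // First run: the worker reads req-1 and "crashes" before committing.
        String firstAttempt = log.peekNext();

        // After restart: the offset never advanced, so req-1 is delivered again.
        String afterRestart = log.peekNext();
        System.out.println(firstAttempt.equals(afterRestart)); // prints true

        // This time processing finishes, so the offset is committed.
        log.commit();
        System.out.println(log.peekNext()); // prints req-2
    }
}
```

&lt;p&gt;Committing after the work is done trades possible duplicates for no lost requests; committing before the work would do the opposite.&lt;/p&gt;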
&lt;div&gt;&lt;h3 id=&quot;priority-queues-with-aging&quot;&gt;Priority queues with aging&lt;/h3&gt;&lt;/div&gt;
&lt;p&gt;For workloads where some requests are more urgent than others, &lt;code dir=&quot;auto&quot;&gt;PriorityRequestQueue&lt;/code&gt; adds priority levels with aging to prevent starvation. Requests that wait longer than the aging interval are promoted to the next higher priority level.&lt;/p&gt;
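&lt;p&gt;One way to picture aging is as a function from a request&apos;s base level and its waiting time to an effective level. The sketch below is an assumption-laden illustration, not the &lt;code dir=&quot;auto&quot;&gt;PriorityRequestQueue&lt;/code&gt; source: &lt;code dir=&quot;auto&quot;&gt;AgingPolicy&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;effectiveLevel&lt;/code&gt; are hypothetical names, level 0 is assumed to be the most urgent, and one promotion per full aging interval is assumed.&lt;/p&gt;

```java
import java.time.Duration;

/** Hypothetical aging rule: one promotion per full aging interval spent waiting. */
final class AgingPolicy {
    private final Duration agingInterval;

    AgingPolicy(Duration agingInterval) {
        if (agingInterval.isZero() || agingInterval.isNegative()) {
            throw new IllegalArgumentException("agingInterval must be positive");
        }
        this.agingInterval = agingInterval;
    }

    /** Level 0 is the most urgent; a request is never promoted past it. */
    int effectiveLevel(int baseLevel, Duration waited) {
        long promotions = waited.dividedBy(agingInterval); // whole intervals waited
        return (int) Math.max(0L, baseLevel - promotions);
    }
}

class AgingDemo {
    public static void main(String[] args) {
        AgingPolicy policy = new AgingPolicy(Duration.ofSeconds(30));

        System.out.println(policy.effectiveLevel(2, Duration.ofSeconds(10))); // prints 2
        System.out.println(policy.effectiveLevel(2, Duration.ofSeconds(45))); // prints 1
        System.out.println(policy.effectiveLevel(2, Duration.ofSeconds(65))); // prints 0
    }
}
```

&lt;p&gt;Because the effective level only ever moves toward 0, a low-priority request eventually competes with the most urgent ones, which is what prevents starvation.&lt;/p&gt;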
&lt;div&gt;&lt;h2 id=&quot;what-changes-operationally&quot;&gt;What changes operationally&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Moving from in-process to Kafka transport changes the operational profile:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Startup behavior.&lt;/strong&gt; With Kafka, an ensemble may start with a backlog of unprocessed requests already waiting for it.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Failure modes.&lt;/strong&gt; Transport failures show up as infrastructure-level errors (such as an unavailable broker) rather than as process-fatal crashes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Monitoring.&lt;/strong&gt; Consumer lag, partition health, and broker connectivity all need to be watched.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ordering.&lt;/strong&gt; Kafka preserves order only within a partition, so a topic with several partitions does not give strict FIFO.&lt;/li&gt;
&lt;/ul&gt;
&lt;div&gt;&lt;h2 id=&quot;the-configuration-boundary&quot;&gt;The configuration boundary&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The ensemble does not know it is using Kafka — it interacts with transport SPIs. The Kafka-specific configuration lives in the infrastructure layer:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;// Infrastructure layer&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;KafkaRequestQueue&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;queue&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;KafkaRequestQueue&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;config&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;kafkaConfig&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ensembleName&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;// Application layer (transport-agnostic)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;chatLanguageModel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;model&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Manage kitchen operations&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;requestQueue&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;queue&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The same ensemble code works in development (in-process queues) and production (Kafka) without changes.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;tradeoffs&quot;&gt;Tradeoffs&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;At-least-once delivery.&lt;/strong&gt; A request may be processed twice if the ensemble crashes after completing work but before committing the offset. Most agent workloads are non-deterministic anyway, so an occasional duplicate run is acceptable; handlers with external side effects should be idempotent.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Operational complexity.&lt;/strong&gt; Kafka needs to be provisioned, monitored, and maintained. For small deployments, the overhead may not be justified.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Latency.&lt;/strong&gt; Kafka adds millisecond-scale latency. For agent workloads where execution takes seconds or minutes, this is negligible.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The Kafka transport module is part of &lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href=&quot;https://agentensemble.net/guides/durable-transport/&quot;&gt;durable transport guide&lt;/a&gt; covers the full configuration and operational details.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Transport SPI: Making Agent Network Infrastructure Pluggable</title><link>https://agentensemble.net/blog/transport-spi-pluggable-agent-infrastructure/</link><guid isPermaLink="true">https://agentensemble.net/blog/transport-spi-pluggable-agent-infrastructure/</guid><description>The transport abstraction that makes ensemble networks infrastructure-agnostic -- from in-process queues to Kafka without changing application code.

</description><pubDate>Thu, 07 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;When agent ensembles become long-running services that communicate over a network, the communication layer becomes infrastructure. And infrastructure has a property that application code should not: it varies by deployment environment.&lt;/p&gt;
&lt;p&gt;Development uses in-process queues. Staging might use Redis. Production runs Kafka. The application code — the agents, tasks, workflows — should not change between these environments. The question is where to draw the abstraction line.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-transport-problem&quot;&gt;The transport problem&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;An ensemble network needs several communication primitives:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Request queues&lt;/strong&gt; — how work requests arrive at an ensemble&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Delivery registries&lt;/strong&gt; — how responses get routed back to the requester&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Capability registries&lt;/strong&gt; — how ensembles advertise and discover shared tasks and tools&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Capacity tracking&lt;/strong&gt; — how ensembles report their current load&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each of these has a natural in-process implementation (maps, queues, lists) and at least one distributed implementation (Kafka topics, Redis streams, service registries). If these are hardcoded to a specific backing store, every deployment environment change requires code changes.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-spi-design&quot;&gt;The SPI design&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;AgentEnsemble defines transport as a set of Java interfaces — a Service Provider Interface — with pluggable implementations:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Transport&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;transport&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Transport&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;websocket&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;// Or for production with delivery guarantees&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Transport&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;transport&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Transport&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;simple&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, deliveryRegistry&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code dir=&quot;auto&quot;&gt;Transport&lt;/code&gt; interface provides access to the individual primitives:&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Primitive&lt;/th&gt;&lt;th&gt;Interface&lt;/th&gt;&lt;th&gt;Purpose&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Request queue&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;RequestQueue&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Inbound work request buffering&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Delivery registry&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;DeliveryRegistry&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Response routing back to callers&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Capability registry&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;CapabilityRegistry&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Shared task/tool advertisement&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Each interface has a simple contract. &lt;code dir=&quot;auto&quot;&gt;RequestQueue&lt;/code&gt;, for example:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;interface&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;RequestQueue&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;void&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;enqueue&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;WorkRequest&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;request&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;Optional&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;WorkRequest&lt;/span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;poll&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Duration&lt;/span&gt;&lt;span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;timeout&lt;/span&gt;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;int&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;size&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The in-process implementation uses a &lt;code dir=&quot;auto&quot;&gt;LinkedBlockingQueue&lt;/code&gt;. The Kafka implementation produces to a topic and consumes with manual offset commits. Same interface, different backing.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;why-this-matters-for-agent-systems&quot;&gt;Why this matters for agent systems&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The transport SPI is not unusual as an architectural pattern — it is a standard dependency inversion. What makes it interesting in the agent context is what it enables.&lt;/p&gt;
&lt;p&gt;Agent networks are inherently non-deterministic. Agents take variable time, produce variable output, and may fail in unpredictable ways. Adding infrastructure variability on top of that makes the system harder to reason about.&lt;/p&gt;
&lt;p&gt;By isolating transport from application logic, you can:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Test with in-process transport&lt;/strong&gt; — no containers, no network, deterministic ordering&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Develop locally with WebSocket transport&lt;/strong&gt; — real network behavior, zero infrastructure setup&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Deploy to production with Kafka&lt;/strong&gt; — durability, horizontal scaling, replay capability&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Switch between environments&lt;/strong&gt; — without touching agent code, task definitions, or workflow configuration&lt;/li&gt;
&lt;/ul&gt;
&lt;div&gt;&lt;h2 id=&quot;the-capability-registry&quot;&gt;The capability registry&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;One of the more interesting transport primitives is the capability registry. When an ensemble shares a task or tool on the network, that capability needs to be discoverable by other ensembles.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;CapabilityRegistry&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;registry&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;transport&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;capabilityRegistry&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;registry&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;register&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;prepare-meal&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;CapabilityType&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;TASK&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;registry&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;register&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;check-inventory&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;CapabilityType&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;TOOL&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Optional&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;provider&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;registry&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;findProvider&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;prepare-meal&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;In simple mode, this is an in-memory map. In production, it could be backed by a service registry, a shared database, or Kafka’s consumer group protocol. The application code that registers and discovers capabilities does not change.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;tradeoffs&quot;&gt;Tradeoffs&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Abstraction leaks.&lt;/strong&gt; In-process queues have different ordering and delivery guarantees than Kafka topics. The SPI abstracts the interface but cannot fully abstract the semantics.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Configuration complexity.&lt;/strong&gt; Each transport implementation has its own configuration. The SPI does not unify configuration — you still need environment-specific setup for each backing store.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Performance characteristics vary.&lt;/strong&gt; In-process queues are nanosecond-scale. Kafka adds millisecond-scale latency. If your agent workflow is latency-sensitive, the transport choice matters.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-design-principle&quot;&gt;The design principle&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The useful insight is that agent network communication has a small number of well-defined primitives, and these primitives have natural implementations at every scale. Defining the primitives as interfaces lets the infrastructure decision be made at deployment time rather than at development time.&lt;/p&gt;
&lt;p&gt;This is standard dependency inversion. It is not novel. But it is the foundation that makes everything else in the ensemble network possible — durable transport, discovery, federation, and capacity management all build on these same interfaces.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The transport SPI is part of &lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href=&quot;https://agentensemble.net/guides/durable-transport/&quot;&gt;durable transport guide&lt;/a&gt; covers the Kafka implementation in detail.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Bridging MCP into Java Agent Systems: Reusing the Tool Ecosystem Without Leaving the JVM</title><link>https://agentensemble.net/blog/mcp-bridge-for-java-agents/</link><guid isPermaLink="true">https://agentensemble.net/blog/mcp-bridge-for-java-agents/</guid><description>How to connect MCP servers to Java agents -- McpToolFactory, lifecycle management, and mixing MCP tools with native Java tools.

</description><pubDate>Tue, 05 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;The Model Context Protocol has created a growing ecosystem of tool servers — filesystem operations, git integration, database access, API connectors. Most of these servers are written in TypeScript and communicate over stdio or SSE.&lt;/p&gt;
&lt;p&gt;If you are building agent systems on the JVM, you face a choice: rewrite every tool in Java, or find a way to use what already exists. The useful answer is usually both — and the bridge between them needs to be clean enough that the rest of your system does not care which approach a particular tool uses.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-integration-problem&quot;&gt;The integration problem&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;MCP servers expose tools through a well-defined protocol. LangChain4j (which AgentEnsemble builds on) already has MCP client support via &lt;code dir=&quot;auto&quot;&gt;McpClient&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;McpToolProvider&lt;/code&gt;. But there is a gap: LangChain4j’s MCP integration produces tools for its &lt;code dir=&quot;auto&quot;&gt;AiServices&lt;/code&gt; abstraction, not for AgentEnsemble’s &lt;code dir=&quot;auto&quot;&gt;AgentTool&lt;/code&gt; interface.&lt;/p&gt;
&lt;p&gt;The bridge needs to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Connect to any MCP server (stdio or SSE transport)&lt;/li&gt;
&lt;li&gt;Discover available tools from the server&lt;/li&gt;
&lt;li&gt;Adapt each MCP tool to the &lt;code dir=&quot;auto&quot;&gt;AgentTool&lt;/code&gt; interface&lt;/li&gt;
&lt;li&gt;Manage the server subprocess lifecycle&lt;/li&gt;
&lt;li&gt;Allow MCP tools and Java-native tools to coexist in the same agent’s tool list&lt;/li&gt;
&lt;/ol&gt;
&lt;div&gt;&lt;h2 id=&quot;mcptoolfactory&quot;&gt;McpToolFactory&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The &lt;code dir=&quot;auto&quot;&gt;agentensemble-mcp&lt;/code&gt; module provides &lt;code dir=&quot;auto&quot;&gt;McpToolFactory&lt;/code&gt; as the primary entry point. Connect to any MCP-compatible server and get back standard &lt;code dir=&quot;auto&quot;&gt;AgentTool&lt;/code&gt; instances:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;try&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;StdioMcpTransport&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;transport&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; StdioMcpTransport.&lt;/span&gt;&lt;span&gt;Builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;command&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;npx&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;--yes&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;@modelcontextprotocol/server-filesystem&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;/workspace&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;AgentTool&lt;/span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;McpToolFactory&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;fromServer&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;transport&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;agent&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;role&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;File analyst&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;goal&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Analyze project structure&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;llm&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;model&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The factory connects to the server, enumerates its tools, and wraps each one as an &lt;code dir=&quot;auto&quot;&gt;McpAgentTool&lt;/code&gt;. Because MCP tools already have typed parameter schemas, the wrapper passes those schemas through to LangChain4j’s &lt;code dir=&quot;auto&quot;&gt;ToolSpecification&lt;/code&gt; directly — no intermediate Java record needed.&lt;/p&gt;
&lt;p&gt;You can also filter to specific tools:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;AgentTool&lt;/span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;McpToolFactory&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;fromServer&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;transport,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;read_file&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;search_files&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;directory_tree&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;This is useful when a server exposes tools you do not want the agent to have access to — write operations, for instance, when the agent should only read.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;convenience-factories-for-common-servers&quot;&gt;Convenience factories for common servers&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The two most common MCP servers for coding workflows are the filesystem and git reference servers. &lt;code dir=&quot;auto&quot;&gt;McpToolFactory&lt;/code&gt; provides convenience methods that handle the subprocess setup:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;try&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;McpServerLifecycle&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;fs&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;McpToolFactory&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;filesystem&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;projectDir&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;McpServerLifecycle&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;git&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;McpToolFactory&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;git&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;projectDir&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;fs&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;start&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;git&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;start&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;AgentTool&lt;/span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;allTools&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;ArrayList&lt;/span&gt;&lt;span&gt;&amp;#x3C;&gt;();&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;allTools&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;addAll&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;fs&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;allTools&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;addAll&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;git&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;// Use allTools in any agent&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The filesystem server provides: &lt;code dir=&quot;auto&quot;&gt;read_file&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;write_file&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;edit_file&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;search_files&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;list_directory&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;directory_tree&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;get_file_info&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The git server provides: &lt;code dir=&quot;auto&quot;&gt;git_status&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;git_diff_unstaged&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;git_diff_staged&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;git_diff&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;git_commit&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;git_add&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;git_log&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;git_branch&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;git_create_branch&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;git_checkout&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;git_show&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;git_reset&lt;/code&gt;.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;lifecycle-management&quot;&gt;Lifecycle management&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;MCP servers run as subprocesses. If you do not shut them down, you leak processes. &lt;code dir=&quot;auto&quot;&gt;McpServerLifecycle&lt;/code&gt; implements &lt;code dir=&quot;auto&quot;&gt;AutoCloseable&lt;/code&gt; so try-with-resources handles cleanup:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;try&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;McpServerLifecycle&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;server&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;McpToolFactory&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;filesystem&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;dir&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;server&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;start&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;// Use server.tools() ...&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;} &lt;/span&gt;&lt;span&gt;// server is shut down here, subprocess is killed&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;For long-running ensembles, &lt;code dir=&quot;auto&quot;&gt;McpServerLifecycle&lt;/code&gt; also integrates with the ensemble’s lifecycle listener. When the ensemble stops, any attached MCP servers are shut down automatically.&lt;/p&gt;
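&lt;p&gt;When several servers share one &lt;code dir=&quot;auto&quot;&gt;try&lt;/code&gt; header, Java closes them in reverse declaration order, and it does so before any &lt;code dir=&quot;auto&quot;&gt;catch&lt;/code&gt; block runs. The sketch below shows that ordering with a stand-in resource. &lt;code dir=&quot;auto&quot;&gt;FakeServer&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;CloseOrderDemo&lt;/code&gt; are hypothetical names used for illustration only; they are not AgentEnsemble classes.&lt;/p&gt;

```java
// Hypothetical stand-in for a subprocess-backed server; not an AgentEnsemble class.
final class FakeServer implements AutoCloseable {
    private final String name;
    private final StringBuilder log;

    FakeServer(String name, StringBuilder log) {
        this.name = name;
        this.log = log;
        log.append("open:").append(name).append(' ');
    }

    @Override
    public void close() {
        // A real implementation would terminate the subprocess here.
        log.append("close:").append(name).append(' ');
    }
}

final class CloseOrderDemo {
    // Opens two resources in one try header, then fails inside the body.
    static String run() {
        StringBuilder log = new StringBuilder();
        try (FakeServer fs = new FakeServer("fs", log);
                FakeServer git = new FakeServer("git", log)) {
            throw new IllegalStateException("agent failed");
        } catch (IllegalStateException e) {
            // Both resources are already closed by the time this block runs.
            log.append("caught");
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

&lt;p&gt;The practical consequence is that a failure in the agent body still releases every server declared in the header, last-opened first.&lt;/p&gt;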
&lt;div&gt;&lt;h2 id=&quot;mixing-mcp-and-java-native-tools&quot;&gt;Mixing MCP and Java-native tools&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The most practical pattern is combining MCP tools with Java-native tools in the same agent. MCP provides the filesystem and git operations; Java-native tools handle domain-specific logic, calculations, or API calls:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;try&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;McpServerLifecycle&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;fs&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;McpToolFactory&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;filesystem&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;projectDir&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;fs&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;start&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;agent&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;role&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Code reviewer&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;goal&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Review code changes and check style compliance&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;fs&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;span&gt;                    &lt;/span&gt;&lt;span&gt;// MCP filesystem tools&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;                       &lt;/span&gt;&lt;span&gt;// Java-native tools&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;StyleCheckerTool&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;MetricsCalculatorTool&lt;/span&gt;&lt;span&gt;()))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;llm&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;model&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Both tool types implement the same &lt;code dir=&quot;auto&quot;&gt;AgentTool&lt;/code&gt; interface. The agent sees a flat list of tools with names and descriptions. It does not know or care which ones are backed by an MCP subprocess and which are pure Java.&lt;/p&gt;
&lt;p&gt;This composability is the point. You can start with MCP servers for rapid capability acquisition, then replace individual tools with Java implementations when you need more control, better performance, or fewer runtime dependencies — without changing the agent configuration.&lt;/p&gt;
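&lt;p&gt;One way to picture that replacement step is a name-keyed swap over the flat tool list. This is a sketch under stated assumptions: &lt;code dir=&quot;auto&quot;&gt;SketchTool&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;NamedTool&lt;/code&gt;, and &lt;code dir=&quot;auto&quot;&gt;ToolSwap&lt;/code&gt; are hypothetical, and the real &lt;code dir=&quot;auto&quot;&gt;AgentTool&lt;/code&gt; interface may look different.&lt;/p&gt;

```java
import java.util.Arrays;

// Hypothetical minimal tool shape; the real AgentTool interface may differ.
interface SketchTool {
    String name();

    String execute(String input);
}

// Hypothetical tool whose origin field only labels where it would run.
record NamedTool(String name, String origin) implements SketchTool {
    @Override
    public String execute(String input) {
        return origin + ":" + name + "(" + input + ")";
    }
}

final class ToolSwap {
    // Returns a copy of tools in which any tool sharing the replacement's name is swapped out.
    static SketchTool[] replaceByName(SketchTool[] tools, SketchTool replacement) {
        return Arrays.stream(tools)
                .map(t -> t.name().equals(replacement.name()) ? replacement : t)
                .toArray(SketchTool[]::new);
    }

    public static void main(String[] args) {
        SketchTool[] tools = {
            new NamedTool("read_file", "mcp"),
            new NamedTool("search_files", "mcp")
        };
        SketchTool[] swapped = replaceByName(tools, new NamedTool("search_files", "java"));
        for (SketchTool t : swapped) {
            System.out.println(t.execute("src"));
        }
    }
}
```

&lt;p&gt;Because the agent only sees names and descriptions, keeping the tool name stable is what lets the swap go unnoticed.&lt;/p&gt;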
&lt;div&gt;&lt;h2 id=&quot;connecting-to-custom-mcp-servers&quot;&gt;Connecting to custom MCP servers&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Any MCP-compatible server works — not just the reference implementations. If you have a custom server that exposes domain-specific tools, connect it the same way:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;try&lt;/span&gt;&lt;span&gt; (&lt;/span&gt;&lt;span&gt;StdioMcpTransport&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;transport&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; StdioMcpTransport.&lt;/span&gt;&lt;span&gt;Builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;command&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;python&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;-m&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;my_custom_mcp_server&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;) {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;AgentTool&lt;/span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;McpToolFactory&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;fromServer&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;transport&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;SSE transport works for remote servers:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;SseMcpTransport&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;transport&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;new&lt;/span&gt;&lt;span&gt; SseMcpTransport.&lt;/span&gt;&lt;span&gt;Builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;sseUrl&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;http://mcp-server:8080/sse&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;List&lt;/span&gt;&lt;span&gt;&amp;#x3C;&lt;/span&gt;&lt;span&gt;AgentTool&lt;/span&gt;&lt;span&gt;&gt; &lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;McpToolFactory&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;fromServer&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;transport&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;div&gt;&lt;h2 id=&quot;tradeoffs&quot;&gt;Tradeoffs&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Subprocess overhead.&lt;/strong&gt; Each MCP server is a separate process. For the reference servers, this means Node.js must be installed. The startup cost is measurable (typically 1-2 seconds). For long-running agents, this is negligible; for one-shot scripts, it adds latency.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Debugging across process boundaries.&lt;/strong&gt; When an MCP tool fails, the error comes back as a string from the subprocess. You lose Java stack traces and structured exception types.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;No automatic restart.&lt;/strong&gt; If an MCP server crashes, its tools become unavailable, and the bridge does not restart it. For production deployments, you would want health-check and restart logic around the lifecycle objects.&lt;/p&gt;
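&lt;p&gt;A minimal sketch of what such health-check and restart logic could look like, assuming a server handle that can report liveness. &lt;code dir=&quot;auto&quot;&gt;ManagedServer&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;ServerFactory&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;RestartingSupervisor&lt;/code&gt;, and &lt;code dir=&quot;auto&quot;&gt;FlakyServer&lt;/code&gt; are hypothetical; &lt;code dir=&quot;auto&quot;&gt;McpServerLifecycle&lt;/code&gt; may not expose an equivalent liveness check.&lt;/p&gt;

```java
// Hypothetical server handle; McpServerLifecycle may expose different methods.
interface ManagedServer extends AutoCloseable {
    boolean isAlive();

    @Override
    void close();
}

// Hypothetical factory that launches a fresh server instance.
interface ServerFactory {
    ManagedServer start();
}

final class RestartingSupervisor implements AutoCloseable {
    private final ServerFactory factory;
    private final int maxRestarts;
    private ManagedServer current;
    private int restarts;

    RestartingSupervisor(ServerFactory factory, int maxRestarts) {
        this.factory = factory;
        this.maxRestarts = maxRestarts;
        this.current = factory.start();
    }

    // Returns a live server, replacing a dead one until the restart budget is spent.
    synchronized ManagedServer healthy() {
        while (!current.isAlive()) {
            if (restarts >= maxRestarts) {
                throw new IllegalStateException("restart budget exhausted");
            }
            current.close(); // release whatever is left of the dead process
            current = factory.start();
            restarts++;
        }
        return current;
    }

    synchronized int restarts() {
        return restarts;
    }

    @Override
    public synchronized void close() {
        current.close();
    }
}

// Test double that can be crashed on demand.
final class FlakyServer implements ManagedServer {
    private boolean alive = true;

    void crash() {
        alive = false;
    }

    @Override
    public boolean isAlive() {
        return alive;
    }

    @Override
    public void close() {
        alive = false;
    }
}
```

&lt;p&gt;The restart budget is there so that a server that dies immediately on every launch surfaces as an error instead of looping forever.&lt;/p&gt;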
&lt;div&gt;&lt;h2 id=&quot;when-to-use-mcp-vs-java-native-tools&quot;&gt;When to use MCP vs. Java-native tools&lt;/h2&gt;&lt;/div&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Consideration&lt;/th&gt;&lt;th&gt;MCP&lt;/th&gt;&lt;th&gt;Java-native&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;Ecosystem breadth&lt;/td&gt;&lt;td&gt;Large and growing&lt;/td&gt;&lt;td&gt;You build what you need&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Runtime dependency&lt;/td&gt;&lt;td&gt;Node.js (for reference servers)&lt;/td&gt;&lt;td&gt;Pure JVM&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Startup latency&lt;/td&gt;&lt;td&gt;1-2s per server&lt;/td&gt;&lt;td&gt;Instant&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Debugging&lt;/td&gt;&lt;td&gt;Cross-process&lt;/td&gt;&lt;td&gt;Same-process stack traces&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Customization&lt;/td&gt;&lt;td&gt;Limited to server’s API&lt;/td&gt;&lt;td&gt;Full control&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Integration with Java types&lt;/td&gt;&lt;td&gt;String-based&lt;/td&gt;&lt;td&gt;Native records, type safety&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The practical pattern: start with MCP for rapid capability bootstrapping, move to Java-native tools for anything performance-sensitive or deeply integrated with your domain.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The MCP bridge is part of &lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href=&quot;https://agentensemble.net/guides/mcp/&quot;&gt;MCP bridge guide&lt;/a&gt; covers the full API and transport options.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Coding Agents on the JVM: Project Detection, Workspace Isolation, and Tool Composition</title><link>https://agentensemble.net/blog/coding-agents-on-the-jvm/</link><guid isPermaLink="true">https://agentensemble.net/blog/coding-agents-on-the-jvm/</guid><description>What it takes to build reliable coding agents in Java -- project detection, git worktree isolation, and composable tool backends.

</description><pubDate>Sun, 03 May 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Most agent frameworks treat coding tasks the same as any other task: give the agent a file-read tool and a file-write tool and hope for the best.&lt;/p&gt;
&lt;p&gt;In practice, an agent that can read and write files is not the same as an agent that can reliably work on a codebase. The gap between “can modify files” and “can fix a bug in a Gradle project” is significant, and it is mostly about context that the agent needs but does not have.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-missing-context&quot;&gt;The missing context&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;A coding agent needs to know things that a general-purpose agent does not:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;What kind of project is this?&lt;/strong&gt; Is it Java with Gradle, Python with pip, TypeScript with npm? The build command, test command, and source layout all follow from this.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Where is the code?&lt;/strong&gt; Source roots like &lt;code dir=&quot;auto&quot;&gt;src/main/java&lt;/code&gt; are conventions, not universal truths. The agent needs to know where to look.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;How do I verify my changes?&lt;/strong&gt; Running &lt;code dir=&quot;auto&quot;&gt;./gradlew test&lt;/code&gt; is fundamentally different from running &lt;code dir=&quot;auto&quot;&gt;npm test&lt;/code&gt;. The agent needs the right command for the project.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;How do I avoid breaking things?&lt;/strong&gt; If the agent edits files directly in the user’s working tree, a failed experiment leaves half-finished code behind.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Without this context, agents make predictable mistakes: they guess at build commands, search in wrong directories, and leave the codebase in a worse state than they found it.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;project-detection-as-a-first-class-concern&quot;&gt;Project detection as a first-class concern&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The approach I’ve been working on in &lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt; treats project detection as an explicit step before tool assembly.&lt;/p&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;ProjectDetector.analyze(Path)&lt;/code&gt; scans the project root for build-file markers and returns a &lt;code dir=&quot;auto&quot;&gt;ProjectContext&lt;/code&gt; that captures language, build system, source roots, and the commands needed to build and test:&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Marker file&lt;/th&gt;&lt;th&gt;Language&lt;/th&gt;&lt;th&gt;Build command&lt;/th&gt;&lt;th&gt;Test command&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;build.gradle.kts&lt;/code&gt; / &lt;code dir=&quot;auto&quot;&gt;build.gradle&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Java&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;./gradlew build&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;./gradlew test&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;pom.xml&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Java&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;mvn compile&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;mvn test&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;package.json&lt;/code&gt; + &lt;code dir=&quot;auto&quot;&gt;tsconfig.json&lt;/code&gt;&lt;/td&gt;&lt;td&gt;TypeScript&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;npm run build&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;npm test&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;pyproject.toml&lt;/code&gt; / &lt;code dir=&quot;auto&quot;&gt;requirements.txt&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Python&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;python -m build&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;python -m pytest&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;go.mod&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Go&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;go build ./...&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;go test ./...&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;Cargo.toml&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Rust&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;cargo 
build&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;cargo test&lt;/code&gt;&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;This is not magic — it is a lookup table backed by file-existence checks. But it means the agent’s system prompt includes the correct build and test commands for the specific project, rather than generic instructions that may or may not apply.&lt;/p&gt;
&lt;p&gt;The detected context is injected into the agent’s instructions automatically. The agent knows it is working on a Java/Gradle project with source at &lt;code dir=&quot;auto&quot;&gt;src/main/java&lt;/code&gt; and tests at &lt;code dir=&quot;auto&quot;&gt;src/test/java&lt;/code&gt;, and it knows that &lt;code dir=&quot;auto&quot;&gt;./gradlew test&lt;/code&gt; is the verification command.&lt;/p&gt;
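&lt;p&gt;The lookup-table idea can be sketched in a few lines. This is an illustration, not the actual &lt;code dir=&quot;auto&quot;&gt;ProjectDetector&lt;/code&gt;: &lt;code dir=&quot;auto&quot;&gt;Detected&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;MarkerCheck&lt;/code&gt;, and &lt;code dir=&quot;auto&quot;&gt;MarkerLookup&lt;/code&gt; are hypothetical names, the top-to-bottom priority order is an assumption, and source-root detection is left out.&lt;/p&gt;

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical result type; the real ProjectContext also carries source roots.
record Detected(String language, String buildCommand, String testCommand) {}

// Answers whether a marker file exists in the project root.
interface MarkerCheck {
    boolean exists(String fileName);
}

final class MarkerLookup {
    // One row per marker, checked top to bottom (assumed order). "+" means all files must exist.
    private static final String[][] ROWS = {
        {"build.gradle.kts", "Java", "./gradlew build", "./gradlew test"},
        {"build.gradle", "Java", "./gradlew build", "./gradlew test"},
        {"pom.xml", "Java", "mvn compile", "mvn test"},
        {"package.json+tsconfig.json", "TypeScript", "npm run build", "npm test"},
        {"pyproject.toml", "Python", "python -m build", "python -m pytest"},
        {"requirements.txt", "Python", "python -m build", "python -m pytest"},
        {"go.mod", "Go", "go build ./...", "go test ./..."},
        {"Cargo.toml", "Rust", "cargo build", "cargo test"},
    };

    // Returns the first matching row, or null when no marker is present.
    static Detected detectWith(MarkerCheck check) {
        for (String[] row : ROWS) {
            boolean allPresent = Arrays.stream(row[0].split("\\+")).allMatch(check::exists);
            if (allPresent) {
                return new Detected(row[1], row[2], row[3]);
            }
        }
        return null;
    }

    // Convenience overload backed by real file-existence checks.
    static Detected detect(Path root) {
        return detectWith(name -> Files.exists(root.resolve(name)));
    }

    public static void main(String[] args) {
        System.out.println(detect(Path.of(".")));
    }
}
```

&lt;p&gt;Separating the table from the existence check keeps the lookup testable without touching the filesystem.&lt;/p&gt;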
&lt;div&gt;&lt;h2 id=&quot;workspace-isolation-via-git-worktrees&quot;&gt;Workspace isolation via git worktrees&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The harder problem is safety. A coding agent that writes directly to the user’s working tree is an agent that can break your build, conflict with your uncommitted work, or leave half-finished refactoring behind if it fails partway through.&lt;/p&gt;
&lt;p&gt;Git worktrees solve this cleanly. A worktree is an additional working directory attached to the same repository: it has its own checked-out branch but shares the object store with the original. Creation is fast and disk-efficient because it does not duplicate the git history.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;EnsembleOutput&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;result&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;CodingEnsemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;runIsolated&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;model, repoRoot,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;CodingTask&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;implement&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Add user profile endpoint&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;That &lt;code dir=&quot;auto&quot;&gt;runIsolated&lt;/code&gt; call:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Creates a git worktree from the current branch&lt;/li&gt;
&lt;li&gt;Runs the coding agent inside the worktree&lt;/li&gt;
&lt;li&gt;On success, preserves the worktree for review (you can inspect the changes, run tests again, then merge)&lt;/li&gt;
&lt;li&gt;On failure, cleans up the worktree automatically&lt;/li&gt;
&lt;/ol&gt;
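&lt;p&gt;The steps above map onto two plain git commands, &lt;code dir=&quot;auto&quot;&gt;git worktree add&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;git worktree remove&lt;/code&gt;. The sketch below is an assumption about how such a flow could be wired up with &lt;code dir=&quot;auto&quot;&gt;ProcessBuilder&lt;/code&gt;, not a description of what &lt;code dir=&quot;auto&quot;&gt;runIsolated&lt;/code&gt; does internally; &lt;code dir=&quot;auto&quot;&gt;WorktreeCommands&lt;/code&gt; and its methods are hypothetical.&lt;/p&gt;

```java
import java.io.IOException;
import java.nio.file.Path;

final class WorktreeCommands {
    // Command that creates a new branch from HEAD and checks it out at worktreeDir.
    static String[] add(Path worktreeDir, String branch) {
        return new String[] {"git", "worktree", "add", "-b", branch, worktreeDir.toString()};
    }

    // Command that deletes the worktree directory even if it contains uncommitted changes.
    static String[] remove(Path worktreeDir) {
        return new String[] {"git", "worktree", "remove", "--force", worktreeDir.toString()};
    }

    // Runs a command inside repoRoot and returns its exit code.
    static int run(Path repoRoot, String... command) throws IOException, InterruptedException {
        return new ProcessBuilder(command)
                .directory(repoRoot.toFile())
                .inheritIO()
                .start()
                .waitFor();
    }

    // Create the worktree, run the work, keep it on success, remove it on failure.
    static void isolated(Path repoRoot, Path worktreeDir, String branch, Runnable agentWork)
            throws IOException, InterruptedException {
        if (run(repoRoot, add(worktreeDir, branch)) != 0) {
            throw new IOException("git worktree add failed");
        }
        try {
            agentWork.run(); // on success the worktree stays in place for review
        } catch (RuntimeException e) {
            run(repoRoot, remove(worktreeDir)); // on failure discard the worktree
            throw e;
        }
    }
}
```

&lt;p&gt;Removing a worktree this way leaves the branch itself in the repository, so a failed run can still be inspected from its commits if any were made.&lt;/p&gt;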
&lt;p&gt;The key interface is &lt;code dir=&quot;auto&quot;&gt;Workspace&lt;/code&gt;:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;interface&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Workspace&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;extends&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;AutoCloseable&lt;/span&gt;&lt;span&gt; {&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;Path&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;path&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;          &lt;/span&gt;&lt;span&gt;// Absolute path to the isolated directory&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;void&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;close&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;         &lt;/span&gt;&lt;span&gt;// Clean up (remove worktree)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;For non-git projects, a &lt;code dir=&quot;auto&quot;&gt;DirectoryWorkspace&lt;/code&gt; creates a temporary directory and optionally copies source files. But for the common case — a git repository — worktrees provide isolation without the cost of a full clone.&lt;/p&gt;
&lt;p&gt;The tradeoff is that worktrees require a git repository. If you are working on a non-git project or a freshly initialized directory, the fallback to temporary directories is less elegant. But for the vast majority of real codebases, worktrees are the right abstraction.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;composable-tool-backends&quot;&gt;Composable tool backends&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Different environments have different constraints. Some teams run pure-JVM deployments where Node.js is not available. Others already use MCP servers and want to reuse them. A coding agent framework should not force one approach.&lt;/p&gt;
&lt;p&gt;AgentEnsemble provides three tool backends, plus an &lt;code dir=&quot;auto&quot;&gt;AUTO&lt;/code&gt; mode that picks among them, selected via &lt;code dir=&quot;auto&quot;&gt;ToolBackend&lt;/code&gt;:&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Backend&lt;/th&gt;&lt;th&gt;Description&lt;/th&gt;&lt;th&gt;Requires&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;AUTO&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Detect best available backend&lt;/td&gt;&lt;td&gt;Nothing&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;JAVA&lt;/code&gt;&lt;/td&gt;&lt;td&gt;Java-native coding tools (glob, search, edit, shell, git, build, test)&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;agentensemble-tools-coding&lt;/code&gt; on classpath&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;MCP&lt;/code&gt;&lt;/td&gt;&lt;td&gt;MCP reference servers for filesystem + git&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;agentensemble-mcp&lt;/code&gt; + Node.js&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;MINIMAL&lt;/code&gt;&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;FileReadTool&lt;/code&gt; only&lt;/td&gt;&lt;td&gt;Always available&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;&lt;code dir=&quot;auto&quot;&gt;AUTO&lt;/code&gt; resolves in order: MCP &gt; JAVA &gt; MINIMAL. If neither optional module is on the classpath, the agent works with file-read only — limited, but functional for read-only analysis tasks.&lt;/p&gt;
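&lt;p&gt;A resolution order like this is usually implemented as a chain of classpath probes. The following is a minimal sketch, assuming each optional module can be recognized by one marker class. &lt;code dir=&quot;auto&quot;&gt;SketchBackend&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;ClassProbe&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;BackendResolver&lt;/code&gt;, and both marker class names are hypothetical, and the Node.js check that the MCP backend would also need is omitted.&lt;/p&gt;

```java
// Hypothetical mirror of the concrete backends; the real enum is ToolBackend.
enum SketchBackend { MCP, JAVA, MINIMAL }

// Answers whether a class can be found on the classpath.
interface ClassProbe {
    boolean isPresent(String className);
}

final class BackendResolver {
    // Hypothetical marker class names; the real modules may be detected differently.
    static final String MCP_MARKER = "example.mcp.Marker";
    static final String JAVA_MARKER = "example.coding.Marker";

    // Applies the documented order: MCP first, then JAVA, then MINIMAL.
    static SketchBackend resolve(ClassProbe probe) {
        if (probe.isPresent(MCP_MARKER)) {
            return SketchBackend.MCP;
        }
        if (probe.isPresent(JAVA_MARKER)) {
            return SketchBackend.JAVA;
        }
        return SketchBackend.MINIMAL;
    }

    // Probe backed by the real classpath; the class is located but not initialized.
    static boolean onClasspath(String className) {
        try {
            Class.forName(className, false, BackendResolver.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(BackendResolver::onClasspath));
    }
}
```

&lt;p&gt;Passing the probe in as a parameter makes the ordering easy to verify without putting real modules on the classpath.&lt;/p&gt;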
&lt;p&gt;The Java backend provides purpose-built coding tools:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;GlobTool&lt;/strong&gt; — find files by pattern across the project&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GrepTool&lt;/strong&gt; — search file contents with regex&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;CodeEditTool&lt;/strong&gt; — surgical line-range replacement (not full-file overwrite)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;ShellTool&lt;/strong&gt; — execute build/test commands with output capture&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;GitTool&lt;/strong&gt; — status, diff, stage, commit&lt;/li&gt;
&lt;/ul&gt;
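&lt;p&gt;To make the line-range idea concrete, here is a small sketch of replacing a 1-based, inclusive range of lines while leaving everything else untouched. It illustrates the technique only; &lt;code dir=&quot;auto&quot;&gt;LineRangeEdit&lt;/code&gt; is a hypothetical name, and the actual &lt;code dir=&quot;auto&quot;&gt;CodeEditTool&lt;/code&gt; may take different parameters.&lt;/p&gt;

```java
import java.util.Arrays;
import java.util.Objects;
import java.util.stream.Stream;

final class LineRangeEdit {
    // Replaces lines startLine..endLine (1-based, inclusive) with the replacement lines.
    static String[] replace(String[] lines, int startLine, int endLine, String[] replacement) {
        // Throws IndexOutOfBoundsException when the range does not fit inside the file.
        Objects.checkFromIndexSize(startLine - 1, endLine - startLine + 1, lines.length);
        String[] head = Arrays.copyOfRange(lines, 0, startLine - 1);
        String[] tail = Arrays.copyOfRange(lines, endLine, lines.length);
        return Stream.of(head, replacement, tail)
                .flatMap(Arrays::stream)
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String[] file = {"a", "b", "c", "d"};
        String[] edited = replace(file, 2, 3, new String[] {"X"});
        System.out.println(String.join("\n", edited));
    }
}
```

&lt;p&gt;Compared with a full-file overwrite, the agent only has to produce the changed lines, which keeps unrelated parts of the file byte-for-byte identical.&lt;/p&gt;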
&lt;p&gt;The MCP backend starts the official MCP filesystem and git reference servers as subprocesses and adapts their tools to the &lt;code dir=&quot;auto&quot;&gt;AgentTool&lt;/code&gt; interface. Both backends produce the same tool interface, so the rest of the framework does not care which one is active.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-one-liner-and-the-builder&quot;&gt;The one-liner and the builder&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;For the common case, a single call handles everything:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;EnsembleOutput&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;result&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;CodingEnsemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;run&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;model, &lt;/span&gt;&lt;span&gt;Path&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;/my/project&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;CodingTask&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;fix&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;NullPointerException in UserService.getById()&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;That call detects the project, assembles tools, generates a coding-specific system prompt, and runs the agent with a higher iteration limit (75 vs the default 25 — coding tasks typically need more rounds).&lt;/p&gt;
&lt;p&gt;For more control, the builder exposes every knob:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Agent&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;agent&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;CodingAgent&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;llm&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;model&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;workingDirectory&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Path&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;/my/project&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;toolBackend&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;ToolBackend&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;&lt;span&gt;JAVA&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;requireApproval&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;true&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;maxIterations&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;75&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;additionalTools&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;myCustomTool&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;The builder returns a standard &lt;code dir=&quot;auto&quot;&gt;Agent&lt;/code&gt; — no subclassing, no special execution path. You can use it with &lt;code dir=&quot;auto&quot;&gt;Task&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;Ensemble&lt;/code&gt;, phases, or any other framework feature. The coding agent is composed from the same primitives as every other agent.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;pre-configured-task-types&quot;&gt;Pre-configured task types&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Common coding workflows have predictable shapes. A bug-fix task needs different instructions than a feature implementation or a refactoring:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;fix&lt;/span&gt;&lt;span&gt;       &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;CodingTask&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;fix&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;NullPointerException in handler&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;implement&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;CodingTask&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;implement&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Add pagination to /api/users&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;refactor&lt;/span&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;CodingTask&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;refactor&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Extract UserRepository interface&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Each returns a standard &lt;code dir=&quot;auto&quot;&gt;Task&lt;/code&gt; with appropriate description and expected-output templates. They can be further customized:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;CodingTask&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;fix&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Some bug&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;toBuilder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;expectedOutput&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Custom expected output&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;This is a convenience, not a requirement. You can always construct a &lt;code dir=&quot;auto&quot;&gt;Task&lt;/code&gt; manually and pass it to a coding agent.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;tradeoffs-and-limitations&quot;&gt;Tradeoffs and limitations&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Project detection is heuristic.&lt;/strong&gt; It works for standard project layouts but will not detect custom build systems or unconventional directory structures. The fallback is explicit configuration via the builder.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Iteration limits are a blunt instrument.&lt;/strong&gt; A higher limit gives the agent more chances to iterate, but it also means higher token costs if the agent goes in circles. There is no substitute for good prompting and appropriate task scoping.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Workspace isolation adds a step.&lt;/strong&gt; The agent works in a worktree, but the user still needs to review and merge the changes. This is deliberate — automated merge would undermine the safety guarantee — but it does add friction to the workflow.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tool backend selection is build-time.&lt;/strong&gt; You choose which backends are available by including the right dependencies. With &lt;code dir=&quot;auto&quot;&gt;AUTO&lt;/code&gt;, the choice between the Java and MCP backends is made at runtime, but you cannot hot-swap backends mid-execution.&lt;/p&gt;
&lt;div&gt;&lt;h2 id=&quot;the-design-principle&quot;&gt;The design principle&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The useful abstraction is not “an agent that can code” but “a standard agent with the right tools and context for coding tasks.” The coding agent is not a special type — it is a regular agent, assembled with project-aware tools, operating in an isolated workspace, and configured with appropriate iteration limits.&lt;/p&gt;
&lt;p&gt;This matters because it means coding agents compose with everything else in the framework: phases, delegation, human review, metrics, traces. There is no separate execution path to maintain.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;The coding agent modules are part of &lt;a href=&quot;https://github.com/AgentEnsemble/agentensemble&quot;&gt;AgentEnsemble&lt;/a&gt;. The &lt;a href=&quot;https://agentensemble.net/guides/coding-agents/&quot;&gt;coding agents guide&lt;/a&gt; and &lt;a href=&quot;https://agentensemble.net/guides/workspace-isolation/&quot;&gt;workspace isolation guide&lt;/a&gt; cover the full API.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Humans as Participants, Not Controllers: Designing Agent Systems That Run Without You</title><link>https://agentensemble.net/blog/human-participation/</link><guid isPermaLink="true">https://agentensemble.net/blog/human-participation/</guid><description>Most human-in-the-loop designs treat humans as gatekeepers -- the system pauses, waits for approval, times out. The harder design problem is building agent systems where humans connect and disconnect without becoming bottlenecks.

</description><pubDate>Thu, 30 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Most human-in-the-loop designs treat humans as gatekeepers. The agent pipeline pauses, a notification fires, a human reviews and approves, the pipeline continues. If the human is not there, the system waits. If the human takes too long, the system times out.&lt;/p&gt;
&lt;p&gt;This works for simple approval workflows. It does not work for systems that need to run autonomously for hours or days while humans come and go.&lt;/p&gt;
&lt;p&gt;The harder design problem is: how do you build agent systems where humans are participants in the system rather than controllers of it? Where the system runs without them, benefits from their presence, and does not break when they leave?&lt;/p&gt;
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;the-controller-model-vs-the-participant-model&quot;&gt;The Controller Model vs the Participant Model&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;In the controller model, the human is a required step in the pipeline. The system cannot proceed without them. If the human is unavailable, the system blocks. Every approval gate is a potential bottleneck.&lt;/p&gt;
&lt;p&gt;In the participant model, the human connects to a running system, observes its current state, provides input where useful, makes decisions that require their authority, and disconnects. The system keeps running.&lt;/p&gt;
&lt;p&gt;The distinction is not about removing humans from the loop. It is about changing the default from “blocked, waiting for human” to “running autonomously, human welcome.”&lt;/p&gt;
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;the-interaction-spectrum&quot;&gt;The Interaction Spectrum&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Not all human interactions have the same urgency or the same blocking requirement. The design uses a five-level spectrum:&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Level&lt;/th&gt;&lt;th&gt;Example&lt;/th&gt;&lt;th&gt;Behavior&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Autonomous&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Housekeeping cleans rooms after checkout&lt;/td&gt;&lt;td&gt;No human needed&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Advisory&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Manager says “prioritize VIP guest”&lt;/td&gt;&lt;td&gt;Human input welcomed but not required&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Notifiable&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;”Water leak detected in room 305”&lt;/td&gt;&lt;td&gt;Alert a human, proceed with best-effort response&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Approvable&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Guest requests late checkout&lt;/td&gt;&lt;td&gt;Ask human if available, auto-approve on timeout&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Gated&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Opening the hotel safe&lt;/td&gt;&lt;td&gt;Cannot proceed without human authorization&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;Most interactions in a well-designed system should fall in the first three levels. The system handles them autonomously. Humans are notified of important events but do not need to take action for the system to continue.&lt;/p&gt;
&lt;p&gt;The gated level is reserved for decisions that genuinely require human authority — security decisions, compliance gates, large financial commitments. These are intentionally rare and intentionally blocking.&lt;/p&gt;
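&lt;p&gt;One way to make the spectrum concrete is an enum that records, for each level, whether the system contacts a human and whether it blocks until one answers. This is a minimal sketch under stated assumptions: &lt;code dir=&quot;auto&quot;&gt;InteractionLevel&lt;/code&gt; and its two flags are hypothetical names chosen for illustration, not types from the framework.&lt;/p&gt;

```java
// Hypothetical sketch; these names are illustrative and not the AgentEnsemble API.
enum InteractionLevel {
    AUTONOMOUS(false, false), // no human needed
    ADVISORY(false, false),   // human input welcomed but never awaited
    NOTIFIABLE(true, false),  // alert a human, proceed with best effort
    APPROVABLE(true, false),  // ask a human, fall back when the timeout expires
    GATED(true, true);        // cannot proceed without human authorization

    private final boolean contactsHuman;
    private final boolean blocksIndefinitely;

    InteractionLevel(boolean contactsHuman, boolean blocksIndefinitely) {
        this.contactsHuman = contactsHuman;
        this.blocksIndefinitely = blocksIndefinitely;
    }

    boolean contactsHuman() { return contactsHuman; }

    boolean blocksIndefinitely() { return blocksIndefinitely; }
}

final class InteractionLevelDemo {
    public static void main(String[] args) {
        // Print the behavior of each level, mirroring the table in the post.
        for (InteractionLevel level : InteractionLevel.values()) {
            System.out.println(level
                + " contactsHuman=" + level.contactsHuman()
                + " blocksIndefinitely=" + level.blocksIndefinitely());
        }
    }
}
```

&lt;p&gt;Only &lt;code dir=&quot;auto&quot;&gt;GATED&lt;/code&gt; sets the blocking flag in this sketch, which is the property the design is trying to preserve: a single, deliberately rare level that can stall the system.&lt;/p&gt;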
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;gated-reviews-with-role-requirements&quot;&gt;Gated Reviews with Role Requirements&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;When a task requires human authorization, the review specifies who can approve:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;openSafe&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Open the hotel safe for cash reconciliation&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;review&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Review&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;prompt&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Manager authorization required to open the safe&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;requiredRole&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;manager&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;timeout&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Duration&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;&lt;span&gt;ZERO&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/span&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;// no timeout -- wait until a human decides&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;When this review fires and no qualified human is connected:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The review is queued&lt;/li&gt;
&lt;li&gt;An out-of-band notification is sent (Slack, email, webhook)&lt;/li&gt;
&lt;li&gt;The task waits&lt;/li&gt;
&lt;li&gt;When a qualified human connects to the dashboard, they see the pending review immediately&lt;/li&gt;
&lt;li&gt;They approve or reject, and the task resumes&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The key design choice: a zero timeout, &lt;code dir=&quot;auto&quot;&gt;timeout(Duration.ZERO)&lt;/code&gt;, means “no timeout”, so the task waits indefinitely rather than expiring immediately. This is appropriate for decisions that genuinely cannot be made without human authority. For less critical approvals, a timeout with auto-approve provides the fallback:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Review&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;prompt&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Guest requests late checkout -- approve?&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;requiredRole&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;front-desk&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;timeout&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Duration&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ofMinutes&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;10&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;timeoutDecision&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;ReviewDecision&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;&lt;span&gt;APPROVE&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;If no human responds within 10 minutes, the system auto-approves. The human can still intervene within the window, but the system does not block indefinitely for a non-critical decision.&lt;/p&gt;
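&lt;p&gt;The two timeout behaviors can be captured in one small decision function. The sketch below is illustrative only: &lt;code dir=&quot;auto&quot;&gt;TimeoutPolicy&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;Outcome&lt;/code&gt; are hypothetical stand-ins rather than framework types, and the only convention taken from the post is that &lt;code dir=&quot;auto&quot;&gt;Duration.ZERO&lt;/code&gt; is read as “no timeout”.&lt;/p&gt;

```java
import java.time.Duration;

// Hypothetical stand-ins for illustration; not the AgentEnsemble types.
enum Outcome { APPROVE, REJECT, STILL_WAITING }

record TimeoutPolicy(Duration timeout, Outcome onTimeout) {

    // Duration.ZERO is read as "no timeout", matching the convention in the post.
    boolean waitsIndefinitely() {
        return timeout.isZero();
    }

    // humanDecision is null until a qualified human has responded.
    Outcome resolve(Outcome humanDecision, Duration waited) {
        if (humanDecision != null) {
            return humanDecision; // a human answer always wins
        }
        if (waitsIndefinitely()) {
            return Outcome.STILL_WAITING; // gated: never falls back
        }
        // Fall back only once the configured timeout has fully elapsed.
        return waited.compareTo(timeout) >= 0 ? onTimeout : Outcome.STILL_WAITING;
    }
}

final class TimeoutPolicyDemo {
    public static void main(String[] args) {
        TimeoutPolicy gated = new TimeoutPolicy(Duration.ZERO, Outcome.REJECT);
        TimeoutPolicy lateCheckout = new TimeoutPolicy(Duration.ofMinutes(10), Outcome.APPROVE);

        System.out.println(gated.resolve(null, Duration.ofHours(6)));
        System.out.println(lateCheckout.resolve(null, Duration.ofMinutes(11)));
    }
}
```

&lt;p&gt;Keeping the rule in a pure function like this makes the gated case easy to test: no amount of elapsed time turns a zero-timeout review into a decision.&lt;/p&gt;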
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;human-directives&quot;&gt;Human Directives&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Humans can inject guidance into any ensemble they have access to:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;type&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;directive&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;to&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;room-service&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;from&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;manager:human&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;content&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Guest in 801 is VIP, prioritize all their requests&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Directives are non-blocking. They do not pause the system or wait for acknowledgment. They are injected as additional context for future task executions. The next time room service processes a request related to room 801, the directive is included in the prompt context.&lt;/p&gt;
&lt;p&gt;This models how human managers actually work. A hotel manager does not approve every room service order. They walk through the hotel, observe what is happening, and give occasional direction: “That table needs attention.” “The VIP in the penthouse gets priority.” Then they move on.&lt;/p&gt;
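&lt;p&gt;A rough sketch of the injection step, assuming directives are simply prepended to the prompt of the ensemble they address. The &lt;code dir=&quot;auto&quot;&gt;Directive&lt;/code&gt; record mirrors the JSON fields above; &lt;code dir=&quot;auto&quot;&gt;DirectiveContext&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;promptFor&lt;/code&gt; are hypothetical names, and the sketch skips the relevance matching a real system would need (it includes every directive addressed to the ensemble).&lt;/p&gt;

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Mirrors the JSON directive fields; a hypothetical type for illustration.
record Directive(String to, String from, String content) {}

final class DirectiveContext {

    // Builds the prompt for one ensemble by prepending every directive addressed to it.
    // Nothing here blocks or waits for acknowledgment: directives are just extra context.
    static String promptFor(String ensemble, String taskDescription, Directive... directives) {
        String guidance = Arrays.stream(directives)
            .filter(d -> d.to().equals(ensemble))
            .map(d -> "- [" + d.from() + "] " + d.content())
            .collect(Collectors.joining("\n"));
        if (guidance.isEmpty()) {
            return taskDescription;
        }
        return "Standing directives:\n" + guidance + "\n\nTask: " + taskDescription;
    }

    public static void main(String[] args) {
        Directive vip = new Directive(
            "room-service", "manager:human", "Guest in 801 is VIP, prioritize all their requests");
        System.out.println(promptFor("room-service", "Deliver breakfast to room 801", vip));
    }
}
```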
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;control-plane-directives&quot;&gt;Control Plane Directives&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Beyond natural language guidance, humans (or automated policies) can send structured control plane directives:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;type&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;directive&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;to&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;from&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;cost-policy:automated&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;action&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;SET_MODEL_TIER&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;value&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;FALLBACK&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;This switches the kitchen ensemble to a cheaper model without restarting. The ensemble has configurable model tiers:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;chatLanguageModel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;gpt4&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;// primary&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;fallbackModel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;gpt4Mini&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;        &lt;/span&gt;&lt;span&gt;// cheaper fallback&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Other control plane actions include pausing an ensemble, adjusting priority weights, enabling or disabling specific shared tasks, and changing queue depth limits. These are operational controls that affect ensemble behavior at runtime without redeployment.&lt;/p&gt;
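&lt;p&gt;To illustrate how a &lt;code dir=&quot;auto&quot;&gt;SET_MODEL_TIER&lt;/code&gt; directive might take effect without a restart, here is a minimal sketch. Everything in it is an assumption made for illustration: the &lt;code dir=&quot;auto&quot;&gt;ModelTierSwitch&lt;/code&gt; class, the &lt;code dir=&quot;auto&quot;&gt;PRIMARY&lt;/code&gt; tier name, the use of plain strings in place of model objects, and the choice that a tier change applies to the next task rather than to tasks already running.&lt;/p&gt;

```java
// Hypothetical sketch; only the action name and the FALLBACK value come from the post.
enum ModelTier { PRIMARY, FALLBACK }

final class ModelTierSwitch {
    // volatile so a change made by the control thread is visible to worker threads.
    private volatile ModelTier active = ModelTier.PRIMARY;
    private final String primaryModel;
    private final String fallbackModel;

    ModelTierSwitch(String primaryModel, String fallbackModel) {
        this.primaryModel = primaryModel;
        this.fallbackModel = fallbackModel;
    }

    // Applies a structured directive; returns false for actions this sketch does not handle.
    boolean apply(String action, String value) {
        if (!"SET_MODEL_TIER".equals(action)) {
            return false;
        }
        active = ModelTier.valueOf(value); // throws IllegalArgumentException for an unknown tier
        return true;
    }

    // Assumption: each new task reads the tier once, so running tasks keep their model.
    String modelForNextTask() {
        return active == ModelTier.FALLBACK ? fallbackModel : primaryModel;
    }

    public static void main(String[] args) {
        ModelTierSwitch kitchen = new ModelTierSwitch("primary-model", "fallback-model");
        kitchen.apply("SET_MODEL_TIER", "FALLBACK");
        System.out.println(kitchen.modelForNextTask());
    }
}
```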
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;late-join-state-synchronization&quot;&gt;Late-Join State Synchronization&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;When a human connects to the dashboard — whether it is their first time today or they are reconnecting after a network interruption — they need to see the current state of the system immediately. They should not have to wait for events to stream in before understanding what is happening.&lt;/p&gt;
&lt;p&gt;The existing late-join mechanism (from v2.1.0’s &lt;code dir=&quot;auto&quot;&gt;agentensemble-web&lt;/code&gt; module) extends to the network level. When a human connects:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The dashboard sends a &lt;code dir=&quot;auto&quot;&gt;hello&lt;/code&gt; message with the human’s identity and roles&lt;/li&gt;
&lt;li&gt;Each ensemble the human has access to sends a &lt;code dir=&quot;auto&quot;&gt;snapshotTrace&lt;/code&gt; — the current state of all active tasks, pending reviews, queue depths, and recent events&lt;/li&gt;
&lt;li&gt;Live events start streaming immediately&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The human is caught up within seconds of connecting. Pending reviews that match their role are highlighted. They can start making decisions immediately without waiting for context to accumulate.&lt;/p&gt;
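&lt;p&gt;The role-matching part of that handshake can be sketched as a simple filter over the pending reviews in a snapshot. This is not the framework’s snapshot format: &lt;code dir=&quot;auto&quot;&gt;PendingReview&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;LateJoin.reviewsFor&lt;/code&gt; are hypothetical names, and the sketch assumes roles are compared as exact strings.&lt;/p&gt;

```java
import java.util.Arrays;

// Hypothetical shape of one pending review inside a snapshot.
record PendingReview(String taskName, String requiredRole) {}

final class LateJoin {

    // Returns the pending reviews that the newly connected human is qualified to decide,
    // given the roles announced in their hello message.
    static PendingReview[] reviewsFor(String[] roles, PendingReview[] pending) {
        return Arrays.stream(pending)
            .filter(review -> Arrays.asList(roles).contains(review.requiredRole()))
            .toArray(PendingReview[]::new);
    }

    public static void main(String[] args) {
        PendingReview[] pending = {
            new PendingReview("open-safe", "manager"),
            new PendingReview("late-checkout", "front-desk"),
        };
        for (PendingReview review : reviewsFor(new String[] {"manager"}, pending)) {
            System.out.println("highlight: " + review.taskName());
        }
    }
}
```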
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;operational-resilience&quot;&gt;Operational Resilience&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The participant model enables several operational patterns that the controller model cannot support:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Elastic scaling with human oversight.&lt;/strong&gt; A conference weekend means higher load. The system scales automatically (K8s HPA watching queue depth). The human manager connects, observes the scaled-up state, adjusts priorities if needed, and disconnects. The system handles the load autonomously.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Operational profiles.&lt;/strong&gt; Predefined configurations for known scenarios:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;NetworkProfile&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;sportingEvent&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;NetworkProfile&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;sporting-event-weekend&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;front-desk&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;Capacity&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;replicas&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;4&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;maxConcurrent&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;50&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;Capacity&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;replicas&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;3&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;maxConcurrent&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;100&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ensemble&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;room-service&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;Capacity&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;replicas&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;3&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;maxConcurrent&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;80&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;preload&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;inventory&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Extra beer and ice stocked&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;network&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;applyProfile&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;sportingEvent&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;A human can apply a profile, or profiles can activate on a schedule or via rules.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Simulation and chaos engineering.&lt;/strong&gt; Before the conference, simulate the expected load: “What happens if kitchen goes down during peak dinner service?” Run a simulation with mock LLMs, time-compressed. Get a capacity report. Then inject a kitchen failure as a chaos test. Assert that room service’s circuit breaker opens within 30 seconds and the fallback activates within 1 minute. These are built into the framework, not bolted on.&lt;/p&gt;
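&lt;p&gt;To show what “assert that the circuit breaker opens within 30 seconds” can look like under time compression, here is a small sketch that drives a toy breaker with simulated timestamps instead of a real clock. It does not use the framework’s simulation or chaos API: &lt;code dir=&quot;auto&quot;&gt;SketchBreaker&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;ChaosCheck&lt;/code&gt;, the failure threshold of 5, and the 5-second call interval are all assumptions chosen for illustration.&lt;/p&gt;

```java
import java.time.Duration;

// Hypothetical sketch; not the framework's circuit breaker or chaos API.
// Not thread-safe: the simulation drives it from a single thread.
final class SketchBreaker {
    private final int failureThreshold;
    private int consecutiveFailures;
    private Duration openedAt; // simulated time the breaker opened; null while closed

    SketchBreaker(int failureThreshold) {
        this.failureThreshold = failureThreshold;
    }

    // Records one call outcome at the given simulated time.
    void record(boolean success, Duration now) {
        if (success) {
            consecutiveFailures = 0;
            return;
        }
        consecutiveFailures++;
        if (isOpen()) {
            return; // already open: keep the original opening time
        }
        if (consecutiveFailures >= failureThreshold) {
            openedAt = now;
        }
    }

    boolean isOpen() { return openedAt != null; }

    Duration openedAt() { return openedAt; }
}

final class ChaosCheck {

    // Injects a total outage: one failing call per interval until the breaker opens.
    // Simulated time advances instantly, which is what makes the run time-compressed.
    static Duration timeToOpen(SketchBreaker breaker, Duration interval, int maxCalls) {
        Duration now = Duration.ZERO;
        int calls = 0;
        while (!breaker.isOpen()) {
            if (calls == maxCalls) {
                throw new IllegalStateException("breaker never opened");
            }
            now = now.plus(interval);
            breaker.record(false, now);
            calls++;
        }
        return breaker.openedAt();
    }

    public static void main(String[] args) {
        Duration opened = timeToOpen(new SketchBreaker(5), Duration.ofSeconds(5), 100);
        boolean withinBudget = Duration.ofSeconds(30).compareTo(opened) >= 0;
        System.out.println("opened after " + opened.toSeconds() + "s, within budget: " + withinBudget);
    }
}
```

&lt;p&gt;Because time is passed in as a value, the same check runs in microseconds in a test suite, and the budget can be tightened or relaxed without touching the breaker itself.&lt;/p&gt;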
&lt;p&gt;&lt;strong&gt;Federation.&lt;/strong&gt; Hotel A is at capacity. Hotel B across town has idle kitchen capacity. Overflow requests route to Hotel B automatically. The human manager sees both hotels on the same dashboard. This is the network-of-networks level — multiple independent agent systems sharing capacity when needed.&lt;/p&gt;
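&lt;p&gt;The overflow rule in the federation example reduces to a short routing decision. The sketch below assumes a deliberately simple policy (prefer the local site, otherwise take the first peer with spare capacity, otherwise queue locally); &lt;code dir=&quot;auto&quot;&gt;Site&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;OverflowRouter&lt;/code&gt; are hypothetical names, not part of the network module.&lt;/p&gt;

```java
// Hypothetical sketch of overflow routing between federated sites.
record Site(String name, int active, int maxConcurrent) {
    boolean hasCapacity() {
        return maxConcurrent > active;
    }
}

final class OverflowRouter {

    // Prefers the local site; otherwise returns the first peer with spare capacity.
    static String route(Site local, Site... peers) {
        if (local.hasCapacity()) {
            return local.name();
        }
        for (Site peer : peers) {
            if (peer.hasCapacity()) {
                return peer.name();
            }
        }
        // Nobody has room: queue locally rather than drop the request.
        return local.name();
    }

    public static void main(String[] args) {
        Site hotelA = new Site("hotel-a/kitchen", 100, 100); // at capacity
        Site hotelB = new Site("hotel-b/kitchen", 20, 100);  // idle capacity
        System.out.println(route(hotelA, hotelB));
    }
}
```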
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;tradeoffs&quot;&gt;Tradeoffs&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Autonomy vs oversight.&lt;/strong&gt; The more autonomous the system, the less opportunity for human correction before a mistake propagates. The mitigation is observability: the system runs autonomously but every decision is traced, logged, and visible. Humans review after the fact and inject directives to adjust future behavior.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Gating cost.&lt;/strong&gt; Every gated review is a potential bottleneck and a source of latency. The design pressure is to minimize gated interactions — reserve them for decisions that genuinely require human authority. If you find yourself gating routine operations, the system design needs revision, not more human approvals.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Notification fatigue.&lt;/strong&gt; A system that notifies humans about everything trains them to ignore notifications. The five interaction levels (autonomous, advisory, notifiable, approvable, gated) exist to keep the signal-to-noise ratio high. Most things should be autonomous. Notifications should be reserved for things that actually need attention.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Simulation fidelity.&lt;/strong&gt; Simulations use mock LLMs and time compression. The behavior will not perfectly match production. The value is in finding structural problems — capacity bottlenecks, missing fallbacks, broken circuit breakers — not in predicting exact outcomes.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;This is the third and final post in the Ensemble Network architecture arc. The architecture is planned for AgentEnsemble v3.0.0. The previous posts cover &lt;a href=&quot;https://agentensemble.net/blog/ensembles-as-services/&quot;&gt;ensembles as services&lt;/a&gt; and &lt;a href=&quot;https://agentensemble.net/blog/cross-ensemble-delegation/&quot;&gt;cross-ensemble delegation&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://agentensemble.net/design/ensemble-network/&quot;&gt;design document&lt;/a&gt; covers the full architecture including discovery, error handling, versioning, security, testing, and the phased delivery plan.&lt;/p&gt;
&lt;p&gt;AgentEnsemble is open-source under the MIT license.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>Task Sharing vs Tool Sharing: Cross-Ensemble Delegation in Distributed Agent Systems</title><link>https://agentensemble.net/blog/cross-ensemble-delegation/</link><guid isPermaLink="true">https://agentensemble.net/blog/cross-ensemble-delegation/</guid><description>MCP gives agents tool-level interoperability -- call a function, get a result. The harder problem is delegating a complex, multi-step process to another autonomous service. The contract between them turns out to be natural language.

</description><pubDate>Mon, 27 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;MCP (Model Context Protocol) gives agents the ability to call tools hosted by other services. This is useful — it is function-level interoperability. An agent calls a function, gets a result, continues.&lt;/p&gt;
&lt;p&gt;But there is a level above function calls that most frameworks have not addressed: what happens when one autonomous agent system needs to delegate a complex, multi-step process to another autonomous agent system?&lt;/p&gt;
&lt;p&gt;The distinction matters. Calling a tool is like borrowing a calculator. Delegating a task is like hiring a department.&lt;/p&gt;
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;two-kinds-of-sharing&quot;&gt;Two Kinds of Sharing&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;When agent ensembles run as long-lived services on a network (as described in the &lt;a href=&quot;https://agentensemble.net/blog/ensembles-as-services/&quot;&gt;previous post&lt;/a&gt;), they need to share capabilities with each other. There are two fundamentally different kinds of sharing:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Tool sharing&lt;/strong&gt; exposes a single function. The calling agent invokes it in its ReAct loop, gets a result, and continues reasoning. The tool executes atomically — there is no multi-step process, no internal agents, no review gates. This is what MCP provides.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Task sharing&lt;/strong&gt; exposes a complete process. The calling ensemble delegates work to another ensemble, which runs its own agents, tools, memory, and review gates to produce the result. The caller does not know or control the internal process. It hands off work and gets back a result.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;// Room service uses both kinds of sharing from kitchen&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;roomService&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;room-service&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;chatLanguageModel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;model&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Handle guest room service request&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;tools&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;// Task sharing: delegates the full meal preparation process&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;NetworkTask&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;prepare-meal&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;// Tool sharing: calls a single function for inventory check&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;NetworkTool&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;check-inventory&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;NetworkTool&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;dietary-check&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;// Task sharing: delegates repair work to maintenance&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;            &lt;/span&gt;&lt;span&gt;NetworkTask&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;from&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;maintenance&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;repair-request&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Both &lt;code dir=&quot;auto&quot;&gt;NetworkTask&lt;/code&gt; and &lt;code dir=&quot;auto&quot;&gt;NetworkTool&lt;/code&gt; implement the same &lt;code dir=&quot;auto&quot;&gt;AgentTool&lt;/code&gt; interface. The agent calling them does not know whether a tool is local or remote, or whether it triggers a single function or an entire pipeline. The existing ReAct loop, tool executor, metrics, and tracing all work unchanged.&lt;/p&gt;
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;how-delegation-works&quot;&gt;How Delegation Works&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;When an agent calls a shared tool, the flow is straightforward:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Agent calls &lt;code dir=&quot;auto&quot;&gt;check-inventory(&quot;wagyu beef&quot;)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code dir=&quot;auto&quot;&gt;NetworkTool&lt;/code&gt; serializes the call into a WorkRequest&lt;/li&gt;
&lt;li&gt;Request is sent to the kitchen ensemble (WebSocket or queue)&lt;/li&gt;
&lt;li&gt;Kitchen executes &lt;code dir=&quot;auto&quot;&gt;inventoryTool.execute(&quot;wagyu beef&quot;)&lt;/code&gt; locally&lt;/li&gt;
&lt;li&gt;Result flows back: &lt;code dir=&quot;auto&quot;&gt;&quot;Yes, 3 portions available&quot;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Agent continues its ReAct loop&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;When an agent calls a shared task, the flow involves a full pipeline on the other side:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Agent calls &lt;code dir=&quot;auto&quot;&gt;prepare-meal(&quot;Wagyu steak, medium-rare, room 403&quot;)&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code dir=&quot;auto&quot;&gt;NetworkTask&lt;/code&gt; serializes a WorkRequest with the full task context&lt;/li&gt;
&lt;li&gt;Request is sent to kitchen&lt;/li&gt;
&lt;li&gt;Kitchen runs its complete task pipeline — agent synthesis, tool calls, execution, review gates&lt;/li&gt;
&lt;li&gt;Result flows back: &lt;code dir=&quot;auto&quot;&gt;&quot;Preparing now, estimated 25 minutes, ticket #4071&quot;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Agent continues&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The critical difference: in step 4 of the task delegation, the kitchen ensemble is running its own agents with its own tools and its own review gates. The room service agent is not involved in any of that. It delegated the work and is waiting for a result — or continuing with other work if the request was async.&lt;/p&gt;
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;the-workrequest-envelope&quot;&gt;The WorkRequest Envelope&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Every cross-ensemble message uses a standardized envelope:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;public&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;record&lt;/span&gt;&lt;span&gt; WorkRequest&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt; requestId,           &lt;/span&gt;&lt;span&gt;// Correlation + idempotency key&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt; from,                &lt;/span&gt;&lt;span&gt;// Requesting ensemble name&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt; task,                &lt;/span&gt;&lt;span&gt;// Shared task or tool name to execute&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt; context,             &lt;/span&gt;&lt;span&gt;// Natural language input/context&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;Priority&lt;/span&gt;&lt;span&gt; priority,          &lt;/span&gt;&lt;span&gt;// CRITICAL / HIGH / NORMAL / LOW&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;Duration&lt;/span&gt;&lt;span&gt; deadline,          &lt;/span&gt;&lt;span&gt;// Caller&apos;s SLA (&quot;I need this within...&quot;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;DeliverySpec&lt;/span&gt;&lt;span&gt; delivery,      &lt;/span&gt;&lt;span&gt;// How and where to return the result&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt; traceContext,        &lt;/span&gt;&lt;span&gt;// W3C traceparent for distributed tracing&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;CachePolicy&lt;/span&gt;&lt;span&gt; cachePolicy,    &lt;/span&gt;&lt;span&gt;// USE_CACHED / FORCE_FRESH&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;String&lt;/span&gt;&lt;span&gt; cacheKey             &lt;/span&gt;&lt;span&gt;// Optional, for result caching&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt; {}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;A few design choices in this envelope are worth noting:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The context field is natural language.&lt;/strong&gt; When maintenance asks procurement to order parts, the context is: “Order replacement valve for building 2 boiler.” Not a typed JSON schema. Not a protobuf message. Natural language that the receiving ensemble’s LLM interprets.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The deadline belongs to the caller, not the provider.&lt;/strong&gt; The requester sets the SLA: “I need this within 30 minutes.” The provider responds with an estimated completion time. If the estimate exceeds the deadline, the caller decides what to do: accept the longer wait, try another provider (federation), or continue without the result.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Delivery is caller-specified.&lt;/strong&gt; The requester tells the provider how to return the result — WebSocket for real-time, a durable queue for reliability, a webhook for external integration, or a shared store for polling.&lt;/p&gt;
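&lt;p&gt;The caller-owned deadline can be sketched as a small decision function. This is an illustration under assumptions, not AgentEnsemble code: &lt;code dir=&quot;auto&quot;&gt;DeadlineSketch&lt;/code&gt;, the &lt;code dir=&quot;auto&quot;&gt;Decision&lt;/code&gt; enum, and the two boolean inputs are hypothetical names, and the provider is assumed to report its estimate as an ISO-8601 duration.&lt;/p&gt;

```java
import java.time.Duration;

public class DeadlineSketch {
    // Hypothetical outcomes; the post lists these options but does not name them.
    enum Decision { PROCEED, ACCEPT_LONGER_WAIT, TRY_ANOTHER_PROVIDER, CONTINUE_WITHOUT }

    // The caller owns the deadline; the provider only supplies an estimate.
    static Decision decide(Duration deadline, Duration estimate,
                           boolean resultIsRequired, boolean otherProviderKnown) {
        // The estimate fits when deadline minus estimate is zero or positive.
        boolean estimateFits = !deadline.minus(estimate).isNegative();
        if (estimateFits) return Decision.PROCEED;
        if (otherProviderKnown) return Decision.TRY_ANOTHER_PROVIDER;
        return resultIsRequired ? Decision.ACCEPT_LONGER_WAIT : Decision.CONTINUE_WITHOUT;
    }

    public static void main(String[] args) {
        Duration deadline = Duration.ofMinutes(30);
        Duration estimate = Duration.parse("PT45M");   // assumed provider estimate
        System.out.println(decide(deadline, estimate, false, true));
    }
}
```

&lt;p&gt;With a 30-minute deadline, a 45-minute estimate, and another provider available, the sketch picks &lt;code dir=&quot;auto&quot;&gt;TRY_ANOTHER_PROVIDER&lt;/code&gt;. The order of the checks is one possible policy; the point is that the policy lives with the caller, so each ensemble can weigh the three options differently.&lt;/p&gt;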
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;natural-language-as-contract&quot;&gt;Natural Language as Contract&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;This is the design choice I find most interesting and most debatable.&lt;/p&gt;
&lt;p&gt;In traditional microservice architectures, services communicate via typed schemas — protobuf, OpenAPI, GraphQL. Schema versioning is a constant source of friction. A field name change breaks callers. A new required field breaks backwards compatibility. Teams spend significant effort on schema evolution, versioning policies, and migration tooling.&lt;/p&gt;
&lt;p&gt;In the Ensemble Network, the contract between services is natural language. When maintenance tells procurement “order replacement parts for the boiler valve,” it does not matter whether procurement’s internal schema changed. The LLM on the receiving side interprets the request. Minor changes in wording do not break callers.&lt;/p&gt;
&lt;p&gt;This works because the participants are LLMs, not deterministic parsers. An LLM that receives “order parts for the boiler” and an LLM that receives “purchase replacement components for the heating system” will usually produce equivalent behavior, although neither is guaranteed to. The semantic intent is preserved even when the exact phrasing varies.&lt;/p&gt;
&lt;p&gt;The tradeoff is real: you lose type safety. A typed schema guarantees that the data conforms to a specific shape. Natural language does not. If the receiving ensemble misinterprets the request, you get a wrong result, not a compile error. The mitigation is the same as elsewhere in agent systems: review gates, guardrails, and observability.&lt;/p&gt;
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;three-request-modes&quot;&gt;Three Request Modes&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The caller decides how to wait for the result:&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Mode&lt;/th&gt;&lt;th&gt;Behavior&lt;/th&gt;&lt;th&gt;Use case&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Await&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Block until result&lt;/td&gt;&lt;td&gt;Critical path: “Can’t continue without this”&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Async&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Submit and continue; result delivered later&lt;/td&gt;&lt;td&gt;Non-critical: “Order towels when you get to it”&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;strong&gt;Await with deadline&lt;/strong&gt;&lt;/td&gt;&lt;td&gt;Wait up to N; then continue with partial/no result&lt;/td&gt;&lt;td&gt;Balanced: “Wait 30 min, then proceed with what I know”&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The await-with-deadline mode is the most operationally useful. It lets the caller set a budget for how long to wait before continuing. If the provider delivers within the deadline, the caller uses the result. If not, the caller makes a decision: retry, use a fallback, or proceed without the result.&lt;/p&gt;
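&lt;p&gt;A minimal sketch of await-with-deadline, assuming the reply arrives on some other thread. &lt;code dir=&quot;auto&quot;&gt;PendingReply&lt;/code&gt; and its methods are hypothetical names for illustration, not AgentEnsemble classes; the only library call involved is the standard &lt;code dir=&quot;auto&quot;&gt;CountDownLatch.await(timeout, unit)&lt;/code&gt;.&lt;/p&gt;

```java
import java.time.Duration;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class AwaitWithDeadlineSketch {
    // Hypothetical holder for one pending reply.
    static final class PendingReply {
        private final CountDownLatch delivered = new CountDownLatch(1);
        private volatile String result;

        // Called by whatever transport receives the provider's answer.
        void deliver(String value) {
            result = value;
            delivered.countDown();
        }

        // Waits up to the budget; returns the fallback if nothing arrived in time.
        String awaitOr(Duration budget, String fallback) {
            try {
                boolean arrived = delivered.await(budget.toMillis(), TimeUnit.MILLISECONDS);
                return arrived ? result : fallback;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // preserve the interrupt for the caller
                return fallback;
            }
        }
    }

    public static void main(String[] args) {
        PendingReply fast = new PendingReply();
        fast.deliver("Preparing now, estimated 25 minutes");
        System.out.println(fast.awaitOr(Duration.ofMillis(50), "no answer yet"));

        PendingReply slow = new PendingReply();   // the provider never answers here
        System.out.println(slow.awaitOr(Duration.ofMillis(50), "no answer yet"));
    }
}
```

&lt;p&gt;The other two modes fall out of the same shape: plain await is the same wait with no budget, and async is never calling the wait at all.&lt;/p&gt;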
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;capacity-management&quot;&gt;Capacity Management&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The provider’s default response to load is accept and queue, not reject. LLM tasks are not real-time request/response — they take seconds to hours. Everyone expects latency. The provider accepts the work into a priority queue and returns an estimated completion time:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;{&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;type&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;task_accepted&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;requestId&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;maint-7721&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;queuePosition&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;7&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;  &lt;/span&gt;&lt;span&gt;&quot;estimatedCompletion&quot;&lt;/span&gt;&lt;span&gt;: &lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;PT45M&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;}&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Rejection only happens at hard limits — the queue itself is full. This “bend, don’t break” approach matches the reality of LLM workloads: capacity is elastic, latency is expected, and it is almost always better to queue work than to reject it.&lt;/p&gt;
&lt;p&gt;Priority queuing ensures critical requests are processed first (CRITICAL &gt; HIGH &gt; NORMAL &gt; LOW). Within the same priority, FIFO. Low-priority items age over time to prevent starvation.&lt;/p&gt;
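&lt;p&gt;The ordering rules can be sketched with a comparator. The aging rule here (one priority level of promotion per full 10 minutes of waiting) is an assumption made for the example, since the post does not specify how aging is computed; &lt;code dir=&quot;auto&quot;&gt;Entry&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;effectiveRank&lt;/code&gt;, and &lt;code dir=&quot;auto&quot;&gt;serviceOrder&lt;/code&gt; are likewise hypothetical names.&lt;/p&gt;

```java
import java.time.Duration;
import java.util.Arrays;
import java.util.Comparator;

public class AgingQueueSketch {
    // Declaration order doubles as service order: ordinal 0 is served first.
    enum Priority { CRITICAL, HIGH, NORMAL, LOW }

    // sequence is an arrival counter, used for FIFO within one priority level.
    record Entry(String requestId, Priority priority, long sequence, Duration waited) {}

    // Assumed aging rule: promote one level per full 10 minutes spent waiting.
    static int effectiveRank(Entry e) {
        int promotions = (int) (e.waited().toMinutes() / 10);
        return Math.max(0, e.priority().ordinal() - promotions);
    }

    // Lower effective rank first; ties fall back to arrival order.
    static Entry[] serviceOrder(Entry[] waiting) {
        Entry[] ordered = waiting.clone();
        Arrays.sort(ordered, Comparator.comparingInt(AgingQueueSketch::effectiveRank)
                                       .thenComparingLong(Entry::sequence));
        return ordered;
    }

    public static void main(String[] args) {
        Entry[] waiting = {
            new Entry("towels-1", Priority.LOW, 1, Duration.ofMinutes(25)),
            new Entry("repair-2", Priority.HIGH, 2, Duration.ofMinutes(2)),
            new Entry("meal-3", Priority.HIGH, 3, Duration.ofMinutes(1)),
            new Entry("alarm-4", Priority.CRITICAL, 4, Duration.ZERO)
        };
        for (Entry e : serviceOrder(waiting)) {
            System.out.println(e.requestId() + " rank=" + effectiveRank(e));
        }
    }
}
```

&lt;p&gt;In this run the critical request goes first, and the low-priority towel order, having waited 25 minutes, is treated as HIGH and served ahead of the two newer HIGH requests because it arrived earlier.&lt;/p&gt;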
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;distributed-tracing&quot;&gt;Distributed Tracing&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Every WorkRequest carries a W3C &lt;code dir=&quot;auto&quot;&gt;traceparent&lt;/code&gt; value in its &lt;code dir=&quot;auto&quot;&gt;traceContext&lt;/code&gt; field. When maintenance delegates to procurement, which delegates to logistics, the trace context propagates across all three. Open Jaeger (or any W3C-compatible tracing backend) and you see the full chain: which ensemble originated the request, how long each step took, where the bottleneck was.&lt;/p&gt;
&lt;p&gt;This is standard distributed tracing, not a custom solution. The same infrastructure teams use for HTTP microservices works here. The difference is that each span may represent an LLM call that takes 30 seconds instead of a database query that takes 3 milliseconds.&lt;/p&gt;
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;tradeoffs&quot;&gt;Tradeoffs&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Loose coupling vs type safety.&lt;/strong&gt; Natural language contracts are resilient to change but do not guarantee that a request is interpreted correctly. Typed schemas guarantee that data has the expected shape but are brittle to change. The right choice depends on how stable the interface is. For evolving, exploratory agent systems, natural language is pragmatic. For stable, high-volume interfaces, a typed schema wrapper may be worth the friction.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Latency tolerance.&lt;/strong&gt; Cross-ensemble delegation adds network hops and queuing delays. A task that takes 10 seconds locally may take 2 minutes when delegated across a network. The architecture assumes latency tolerance — if your use case requires sub-second responses, delegation is the wrong pattern.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Failure modes.&lt;/strong&gt; When the kitchen ensemble is down, room service’s &lt;code dir=&quot;auto&quot;&gt;prepare-meal&lt;/code&gt; call fails. The circuit breaker opens. The agent needs a fallback — suggest alternatives, queue the request for later, or inform the guest. Distributed systems fail in distributed ways. The framework provides the circuit breaker and fallback mechanisms, but the failure strategy is application-specific.&lt;/p&gt;
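&lt;p&gt;To make the failure path concrete, here is a deliberately small circuit breaker with a fallback. It is a sketch of the general pattern, not the framework’s implementation: &lt;code dir=&quot;auto&quot;&gt;CircuitBreakerSketch&lt;/code&gt;, &lt;code dir=&quot;auto&quot;&gt;RemoteCall&lt;/code&gt;, and &lt;code dir=&quot;auto&quot;&gt;kitchenDown&lt;/code&gt; are hypothetical names, the threshold is assumed to be at least 1, and the half-open recovery state and thread safety are left out.&lt;/p&gt;

```java
import java.io.IOException;

public class CircuitBreakerSketch {
    // Hypothetical stand-in for a remote delegation such as prepare-meal.
    interface RemoteCall { String invoke() throws Exception; }

    private final int failureThreshold;
    private int consecutiveFailures = 0;
    private boolean open = false;

    CircuitBreakerSketch(int failureThreshold) { this.failureThreshold = failureThreshold; }

    boolean isOpen() { return open; }

    // Returns the remote result, or the fallback when the call fails or the breaker is open.
    String call(RemoteCall remote, String fallback) {
        if (open) return fallback;            // fail fast, skip the network entirely
        try {
            String result = remote.invoke();
            consecutiveFailures = 0;          // any success resets the count
            return result;
        } catch (Exception e) {
            consecutiveFailures++;
            if (consecutiveFailures == failureThreshold) open = true;
            return fallback;
        }
    }

    // Simulates the kitchen ensemble being unreachable.
    static String kitchenDown() throws Exception {
        throw new IOException("kitchen unreachable");
    }

    public static void main(String[] args) {
        CircuitBreakerSketch breaker = new CircuitBreakerSketch(2);
        String fallback = "Kitchen unavailable, offering the cold menu";
        System.out.println(breaker.call(CircuitBreakerSketch::kitchenDown, fallback));
        System.out.println(breaker.call(CircuitBreakerSketch::kitchenDown, fallback));
        System.out.println("open=" + breaker.isOpen());
    }
}
```

&lt;p&gt;What the fallback string says, and whether the request is also queued for later, is exactly the application-specific part.&lt;/p&gt;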
&lt;p&gt;&lt;strong&gt;Observability cost.&lt;/strong&gt; Every cross-ensemble request generates trace data, metrics, and log entries. In a busy network with many delegations, the observability overhead is non-trivial. The tracing infrastructure needs to handle the volume, and teams need dashboards that make sense of the flow.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;This is the second post in a three-part arc on the Ensemble Network architecture. The next post covers human participation — how humans connect to and interact with a network of autonomous ensembles without becoming bottlenecks.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://agentensemble.net/design/ensemble-network/&quot;&gt;design document&lt;/a&gt; covers the full architecture.&lt;/p&gt;
&lt;p&gt;AgentEnsemble is open-source under the MIT license.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item><item><title>From Run-and-Exit to Always-On: When Agent Ensembles Become Services</title><link>https://agentensemble.net/blog/ensembles-as-services/</link><guid isPermaLink="true">https://agentensemble.net/blog/ensembles-as-services/</guid><description>Every multi-agent framework today models scripts -- define agents, run tasks, get output, exit. The harder problem is building agent systems that run continuously, handle work as it arrives, and survive restarts.

</description><pubDate>Fri, 24 Apr 2026 00:00:00 GMT</pubDate><content:encoded>&lt;p&gt;Every multi-agent framework works the same way at its core. You define some agents, give them tasks, press go, get output. The agents exist for the duration of the run and then disappear.&lt;/p&gt;
&lt;p&gt;This is fine for bounded problems: “research this topic and write a report.” But it does not model how real work gets done in production systems that need to be always-on, multi-domain, and human-augmented.&lt;/p&gt;
&lt;p&gt;The question I kept coming back to was: what changes when an ensemble stops being a script and starts being a service?&lt;/p&gt;
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;scripts-vs-services&quot;&gt;Scripts vs Services&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;A script runs and exits. You invoke it, it does work, it returns a result, the process terminates. Most multi-agent frameworks today, including CrewAI, AutoGen, LangGraph, and AgentEnsemble v2.x, operate in this mode by default.&lt;/p&gt;
&lt;p&gt;A service runs continuously. It handles work as it arrives, communicates with peers, maintains state between requests, and survives restarts. The difference is not just about uptime — it changes the entire interaction model.&lt;/p&gt;
&lt;p&gt;When an ensemble is a script, it is invoked by something external. When an ensemble is a service, it participates in a network of other services. It can accept work from multiple sources, share capabilities with peers, and run proactive tasks on a schedule — all without an external orchestrator telling it what to do.&lt;/p&gt;
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;the-hotel-model&quot;&gt;The Hotel Model&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Consider a hotel. It is composed of departments: front desk, housekeeping, kitchen, room service, maintenance, procurement. Each department is autonomous — it has its own staff, processes, and expertise. These departments communicate with each other directly. Room service calls the kitchen to prepare a meal. Maintenance calls procurement to order spare parts.&lt;/p&gt;
&lt;p&gt;The hotel runs continuously. The manager comes in at 8am, walks around, checks on things, gives some direction, handles decisions that require authority, and goes home at 6pm. The hotel does not stop when the manager leaves.&lt;/p&gt;
&lt;p&gt;This maps directly to a distributed agent architecture:&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Hotel concept&lt;/th&gt;&lt;th&gt;Agent system equivalent&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;A department&lt;/td&gt;&lt;td&gt;An ensemble — long-running, autonomous&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Staff within a department&lt;/td&gt;&lt;td&gt;Agents and tasks within the ensemble&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;The intercom / phone system&lt;/td&gt;&lt;td&gt;WebSocket mesh — the message transport&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;A work order&lt;/td&gt;&lt;td&gt;A WorkRequest — the standard message envelope&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;The hotel directory&lt;/td&gt;&lt;td&gt;Service registry — ensembles discover each other&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;The duty manager&lt;/td&gt;&lt;td&gt;A human who connects via the dashboard to observe and intervene&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;The key observation: the hotel is not centrally orchestrated. There is no “manager agent” that routes every message. Departments handle their domain and communicate laterally.&lt;/p&gt;
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;two-execution-modes&quot;&gt;Two Execution Modes&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The existing one-shot mode remains unchanged:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;EnsembleOutput&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;output&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;run&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;model,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Research AI trends&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;,&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Write a report&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Tasks execute, output is returned, the ensemble is done. This is a “gig” — a bounded unit of work.&lt;/p&gt;
&lt;p&gt;The new long-running mode turns the ensemble into a service:&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;=&lt;/span&gt;&lt;span&gt; &lt;/span&gt;&lt;span&gt;Ensemble&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;chatLanguageModel&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;model&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Manage kitchen operations&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;// Share capabilities to the network&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;shareTask&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;prepare-meal&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, &lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;description&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Prepare a meal as specified&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;expectedOutput&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Confirmation with preparation details and timing&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;shareTool&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;check-inventory&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;, inventoryTool&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;    &lt;/span&gt;&lt;span&gt;// Scheduled proactive task&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;scheduledTask&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;ScheduledTask&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;builder&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;name&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;inventory-report&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;task&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Task&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;of&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;Check current inventory levels and report shortages&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;schedule&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Schedule&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;every&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;Duration&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;ofHours&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;1&lt;/span&gt;&lt;span&gt;)))&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;broadcastTo&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;hotel.inventory&lt;/span&gt;&lt;span&gt;&quot;&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;        &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;())&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;    &lt;/span&gt;&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;build&lt;/span&gt;&lt;span&gt;()&lt;/span&gt;&lt;span&gt;;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;
&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;kitchen&lt;/span&gt;&lt;span&gt;.&lt;/span&gt;&lt;span&gt;start&lt;/span&gt;&lt;span&gt;(&lt;/span&gt;&lt;span&gt;7329&lt;/span&gt;&lt;span&gt;)&lt;/span&gt;&lt;span&gt;;  &lt;/span&gt;&lt;span&gt;// WebSocket server, K8s Service fronts this&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;In long-running mode, the ensemble:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Registers shared tasks and tools on the network&lt;/li&gt;
&lt;li&gt;Accepts incoming work requests via WebSocket, queue, HTTP, or topic subscription&lt;/li&gt;
&lt;li&gt;Processes work through a priority queue&lt;/li&gt;
&lt;li&gt;Delivers results via the caller-specified delivery method&lt;/li&gt;
&lt;li&gt;Runs scheduled proactive tasks on configured intervals&lt;/li&gt;
&lt;li&gt;Continues until explicitly stopped or drained&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The &lt;code dir=&quot;auto&quot;&gt;start(port)&lt;/code&gt; call is the boundary between script and service. Before it, the ensemble is a configuration. After it, the ensemble is an active participant in a network.&lt;/p&gt;
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;work-ingress&quot;&gt;Work Ingress&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;When an ensemble becomes a service, work can arrive from multiple sources simultaneously:&lt;/p&gt;
&lt;table&gt;&lt;thead&gt;&lt;tr&gt;&lt;th&gt;Source&lt;/th&gt;&lt;th&gt;Description&lt;/th&gt;&lt;/tr&gt;&lt;/thead&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;WebSocket&lt;/td&gt;&lt;td&gt;Direct from another ensemble (real-time)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Queue&lt;/td&gt;&lt;td&gt;Pull from durable queue (Kafka, SQS, Redis Streams)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;HTTP API&lt;/td&gt;&lt;td&gt;&lt;code dir=&quot;auto&quot;&gt;POST /api/work&lt;/code&gt; (external systems, scripts, CI pipelines)&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Topic subscription&lt;/td&gt;&lt;td&gt;React to events from other ensembles&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;Schedule&lt;/td&gt;&lt;td&gt;Internal cron/interval (proactive tasks)&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;
&lt;p&gt;All sources normalize into the same internal format before entering the ensemble’s priority queue. The ensemble processes work by priority (CRITICAL &gt; HIGH &gt; NORMAL &gt; LOW), with FIFO ordering within the same priority level.&lt;/p&gt;
&lt;p&gt;This means an ensemble can simultaneously handle direct requests from peer ensembles, pull batch work from a queue, respond to events, and run scheduled health checks — without any of these mechanisms knowing about each other.&lt;/p&gt;
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;deployment-model&quot;&gt;Deployment Model&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;Each ensemble deploys as a Kubernetes service — one or more pods behind a K8s Service resource. Ensembles discover each other via DNS name. This is standard infrastructure that operations teams already know how to manage.&lt;/p&gt;
&lt;div&gt;&lt;figure&gt;&lt;figcaption&gt;&lt;/figcaption&gt;&lt;pre&gt;&lt;code&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;Namespace: hotel-downtown&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;+-- Service: kitchen&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;+-- Service: room-service&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;+-- Service: maintenance&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;+-- Service: front-desk&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;div&gt;&lt;span&gt;&lt;span&gt;  &lt;/span&gt;&lt;/span&gt;&lt;span&gt;+-- Service: dashboard&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;&lt;/div&gt;
&lt;p&gt;Scaling is handled by the Kubernetes Horizontal Pod Autoscaler (HPA), which watches queue depth or request latency. Conference weekend with heavy kitchen load? Scale kitchen to 3 replicas. Off-peak Tuesday? Scale back to 1. The ensemble coordinates replicas through broadcast-claim delivery: a work request is offered to all replicas, and the first to claim it processes it.&lt;/p&gt;
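&lt;p&gt;The claim step can be illustrated with a compare-and-set flag. This is a hedged, in-process sketch only: &lt;code dir=&quot;auto&quot;&gt;WorkOffer&lt;/code&gt;, the request ID, and the replica names are hypothetical. Across real pods the flag would have to live in a store that all replicas share, and how AgentEnsemble implements the claim is not described here.&lt;/p&gt;

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch: a work request that at most one replica can claim. */
final class WorkOffer {

    private final String requestId;
    private final AtomicBoolean claimed = new AtomicBoolean(false);

    WorkOffer(String requestId) {
        this.requestId = requestId;
    }

    String requestId() {
        return requestId;
    }

    /** Atomically flips the flag; returns true only for the first caller. */
    boolean tryClaim() {
        return claimed.compareAndSet(false, true);
    }

    public static void main(String[] args) {
        WorkOffer offer = new WorkOffer("order-1138");
        String[] replicas = { "kitchen-0", "kitchen-1", "kitchen-2" };
        // Broadcast: every replica sees the same offer and tries to claim it.
        for (String replica : replicas) {
            if (offer.tryClaim()) {
                System.out.println(replica + " claimed " + offer.requestId());
            } else {
                System.out.println(replica + " skipped " + offer.requestId());
            }
        }
    }
}
```

&lt;p&gt;Run sequentially like this, the first replica in the array wins and the other two skip the request. With concurrent callers the winner is whichever reaches &lt;code dir=&quot;auto&quot;&gt;compareAndSet&lt;/code&gt; first, but there is still exactly one.&lt;/p&gt;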
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;what-changes&quot;&gt;What Changes&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;The shift from script to service changes several things:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lifecycle management matters.&lt;/strong&gt; A script that crashes is simply rerun from scratch. A long-running service cannot treat failure that casually: it needs graceful shutdown, drain logic, and state recovery. The ensemble supports a drain mode in which it stops accepting new work, finishes in-flight tasks, and shuts down cleanly. On restart, it picks up queued work from durable sources.&lt;/p&gt;
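&lt;p&gt;As a rough illustration of drain mode, the sketch below tracks in-flight tasks behind a gate that can be closed to new work. &lt;code dir=&quot;auto&quot;&gt;DrainGate&lt;/code&gt; and its method names are assumptions made for this example, not part of the AgentEnsemble API.&lt;/p&gt;

```java
/** Hypothetical sketch of drain mode: refuse new work, wait for in-flight work. */
final class DrainGate {

    private boolean accepting = true;
    private int inFlight = 0;

    /** Called before a task starts; returns false once draining has begun. */
    synchronized boolean tryStart() {
        if (!accepting) {
            return false;
        }
        inFlight++;
        return true;
    }

    /** Called when a task completes; wakes any thread waiting in awaitDrained(). */
    synchronized void finish() {
        inFlight--;
        notifyAll();
    }

    /** Switches to drain mode: tasks already started may still finish. */
    synchronized void drain() {
        accepting = false;
    }

    /** Blocks until every in-flight task has called finish(). */
    synchronized void awaitDrained() throws InterruptedException {
        while (inFlight > 0) {
            wait();
        }
    }

    synchronized int inFlight() {
        return inFlight;
    }

    public static void main(String[] args) throws InterruptedException {
        DrainGate gate = new DrainGate();
        gate.tryStart();                      // one task is running
        gate.drain();                         // shutdown requested
        System.out.println(gate.tryStart());  // false: new work is refused
        gate.finish();                        // the running task completes
        gate.awaitDrained();                  // returns at once, nothing in flight
        System.out.println(gate.inFlight());  // 0
    }
}
```

&lt;p&gt;A shutdown hook would call &lt;code dir=&quot;auto&quot;&gt;drain()&lt;/code&gt; followed by &lt;code dir=&quot;auto&quot;&gt;awaitDrained()&lt;/code&gt;. Work refused during the drain stays in its durable source, which is what lets the restarted service pick it up again.&lt;/p&gt;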
&lt;p&gt;&lt;strong&gt;Proactive work becomes possible.&lt;/strong&gt; A script only does what you tell it to do. A service can schedule its own work — periodic inventory checks, health assessments, report generation. These scheduled tasks run on internal timers and broadcast results to interested subscribers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Observability changes.&lt;/strong&gt; A script that runs for 30 seconds needs a log. A service that runs for months needs a dashboard. The existing web module (WebSocket server, live trace streaming, late-join snapshot) extends naturally to the long-running model.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The human relationship changes.&lt;/strong&gt; A script blocks on human input and times out. A service has humans who connect and disconnect. They observe the current state, give direction, handle decisions that need authority, and leave. The system keeps running. This is a deep enough topic that the next post in this series will cover it in detail.&lt;/p&gt;
&lt;hr&gt;
&lt;div&gt;&lt;h2 id=&quot;tradeoffs&quot;&gt;Tradeoffs&lt;/h2&gt;&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Complexity vs capability.&lt;/strong&gt; A script is simple: invoke it, get a result. A service requires infrastructure — Kubernetes, queues, monitoring, lifecycle management. If your workload is “run this pipeline once and give me the output,” the service model is unnecessary overhead.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Always-on cost.&lt;/strong&gt; A script uses resources only while it runs. A service uses resources continuously, even when idle. For intermittent workloads, the cost calculus favors one-shot execution with on-demand scaling.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;State management.&lt;/strong&gt; Scripts are stateless by nature — they start fresh every time. Services accumulate state: queued work, scheduled tasks, shared memory, connection state. This state needs to be durable, recoverable, and observable.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;When to use which.&lt;/strong&gt; The one-shot mode is right for discrete, bounded problems. The long-running mode is right when the workload is continuous, when multiple domains need to communicate, when humans need to observe and participate without blocking, and when the system needs to be always-on.&lt;/p&gt;
&lt;p&gt;Both modes coexist. An ensemble that runs as a long-running service can still execute individual tasks in one-shot mode internally. The architecture does not force a choice — it extends the existing model.&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;This is the first post in a three-part arc on the Ensemble Network architecture planned for v3.0.0. The next post covers cross-ensemble delegation — how ensembles share tasks and tools across service boundaries, and why the contract between them is natural language, not typed schemas.&lt;/p&gt;
&lt;p&gt;The &lt;a href=&quot;https://agentensemble.net/design/ensemble-network/&quot;&gt;design document&lt;/a&gt; covers the full architecture.&lt;/p&gt;
&lt;p&gt;AgentEnsemble is open-source under the MIT license.&lt;/p&gt;</content:encoded><category>java</category><category>ai</category><category>agents</category><category>architecture</category></item></channel></rss>