Introduction
neuron provides composable building blocks for AI agents in Rust. Each block is an independent crate, versioned and published separately. Pull one block without buying the whole stack.
Philosophy: serde, not serde_json
neuron is to agent frameworks what serde is to serde_json. It defines traits
(Provider, Tool, ContextStrategy) and provides foundational implementations.
An SDK layer composes these blocks into opinionated workflows.
Every Rust and Python agent framework converges on the same ~300-line while loop. The differentiation is never the loop — it’s the blocks around it: context management, tool pipelines, durability, runtime. Nobody ships those blocks independently. That’s the gap neuron fills.
What’s Included
neuron ships the following crates:
| Crate | Purpose |
|---|---|
| `neuron-types` | Core traits and types — `Provider`, `Tool`, `ContextStrategy`, `Message` |
| `neuron-provider-anthropic` | Anthropic Messages API (streaming, tool use, server-side compaction) |
| `neuron-provider-openai` | OpenAI Chat Completions + Embeddings API |
| `neuron-provider-ollama` | Ollama local inference API |
| `neuron-tool` | `ToolRegistry` with composable middleware pipeline |
| `neuron-tool-macros` | `#[neuron_tool]` attribute macro |
| `neuron-context` | Compaction strategies, token counting, system prompt injection |
| `neuron-loop` | Configurable `AgentLoop` with streaming, cancellation, parallel tools |
| `neuron-mcp` | Model Context Protocol client and server (stdio + Streamable HTTP) |
| `neuron-runtime` | Sessions, guardrails, `TracingHook`, `GuardrailHook`, `DurableContext` |
| `neuron-otel` | OpenTelemetry instrumentation with GenAI semantic conventions (`gen_ai.*` spans) |
| `neuron` | Umbrella crate with feature flags for all of the above |
Who Is This For?
- Rust developers building AI-powered applications who want control over each layer of the stack
- Framework authors who need well-tested building blocks to compose into higher-level abstractions
- AI agents (like Claude Code) that need to understand, evaluate, and work with the codebase
What neuron Is NOT
neuron is the layer below frameworks. It does not provide:
- CLI, TUI, or GUI applications
- Opinionated agent framework (compose one from the blocks)
- RAG pipeline (use the `EmbeddingProvider` trait with your own retrieval)
- Workflow engine (integrate with Temporal/Restate via `DurableContext`)
- Retry middleware (use tower or your durable engine’s retry policy)
Next Steps
- Installation — add neuron to your project
- Quickstart — build your first agent in 50 lines
- Core Concepts — understand the five key abstractions
Installation
Using the Umbrella Crate
The fastest way to get started is the neuron umbrella crate with feature flags:
[dependencies]
neuron = { version = "*", features = ["anthropic"] }
Or install via cargo:
cargo add neuron --features anthropic
Feature Flags
| Feature | Enables | Default |
|---|---|---|
| `anthropic` | `neuron-provider-anthropic` | Yes |
| `openai` | `neuron-provider-openai` | No |
| `ollama` | `neuron-provider-ollama` | No |
| `mcp` | `neuron-mcp` (Model Context Protocol) | No |
| `runtime` | `neuron-runtime` (sessions, guardrails) | No |
| `otel` | `neuron-otel` (OpenTelemetry instrumentation) | No |
| `full` | All of the above | No |
Using Individual Crates
Each neuron crate is independently published. Use them directly for finer control over dependencies:
[dependencies]
neuron-types = "*"
neuron-provider-openai = "*"
neuron-tool = "*"
neuron-loop = "*"
This pulls in only what you need — no transitive dependency on providers you don’t use.
Minimum Supported Rust Version
neuron requires Rust 1.90+ (edition 2024). It uses native async traits
(RPITIT) and requires no #[async_trait] macro.
Environment Variables
Each provider loads credentials from environment variables via from_env():
| Provider | Environment Variable |
|---|---|
| Anthropic | ANTHROPIC_API_KEY |
| OpenAI | OPENAI_API_KEY |
| Ollama | OLLAMA_HOST (default: http://localhost:11434) |
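The `from_env()` pattern can be sketched with `std::env`. This is an illustrative, self-contained example — `load_with_default` is a hypothetical helper, not part of the neuron API:

```rust
use std::env;

/// Read a credential or endpoint from the environment, falling back to a
/// documented default where one exists (as Ollama does with `OLLAMA_HOST`).
/// Hypothetical helper for illustration only.
fn load_with_default(var: &str, default: &str) -> String {
    env::var(var).unwrap_or_else(|_| default.to_string())
}

fn main() {
    // Without OLLAMA_HOST set, the documented default applies.
    let host = load_with_default("OLLAMA_HOST", "http://localhost:11434");
    println!("Ollama host: {host}");
}
```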
Quickstart
Build a working AI agent in ~50 lines of Rust.
Prerequisites
- Rust 1.90+
- An API key for Anthropic or OpenAI (set as `ANTHROPIC_API_KEY` or `OPENAI_API_KEY`)
The Agent
use neuron::prelude::*;
use neuron_provider_anthropic::Anthropic;
use neuron_tool::ToolRegistry;
use neuron_loop::AgentLoop;
use neuron_context::SlidingWindowStrategy;
use neuron_types::*;
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};
// 1. Define a tool
struct GetWeather;
impl Tool for GetWeather {
const NAME: &'static str = "get_weather";
type Args = WeatherArgs;
type Output = String;
type Error = std::io::Error;
fn definition(&self) -> ToolDefinition {
ToolDefinition {
name: "get_weather".to_string(),
title: None,
description: "Get the current weather for a city".to_string(),
input_schema: schemars::schema_for!(WeatherArgs).into(),
output_schema: None,
annotations: None,
cache_control: None,
}
}
async fn call(&self, args: WeatherArgs, _ctx: &ToolContext) -> Result<String, std::io::Error> {
Ok(format!("Weather in {}: 72°F, sunny", args.city))
}
}
#[derive(Debug, Deserialize, JsonSchema)]
struct WeatherArgs {
/// The city to get weather for
city: String,
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// 2. Set up a provider
let provider = Anthropic::from_env()?;
// 3. Register tools
let mut tools = ToolRegistry::new();
tools.register(GetWeather);
// 4. Create the context strategy
let context = SlidingWindowStrategy::new(10, 100_000);
// 5. Build and run the agent loop
let mut agent = AgentLoop::builder(provider, context)
.tools(tools)
.system_prompt("You are a helpful weather assistant.")
.max_turns(5)
.build();
let ctx = ToolContext::default();
let result = agent.run(Message::user("What's the weather in San Francisco?"), &ctx).await?;
println!("{}", result.response);
Ok(())
}
What Just Happened?
- Provider — `Anthropic::from_env()` creates an API client from `ANTHROPIC_API_KEY`
- Tool — `GetWeather` implements the `Tool` trait with typed args and output
- Registry — `ToolRegistry` stores tools and handles JSON deserialization
- Context — `SlidingWindowStrategy` keeps the conversation within token limits
- Loop — `AgentLoop` drives the conversation: send message, get response, execute tools, repeat
The agent loop handles multi-turn tool use automatically. When Claude calls get_weather,
the loop executes the tool and sends the result back. The loop continues until Claude
responds without tool calls or hits max_turns.
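The control flow just described can be modeled in a simplified, synchronous sketch. The local stub types below (`ModelReply`, `run_loop`) are not the real `neuron_loop` API — they just show the termination conditions: a text response ends the loop, and `max_turns` bounds the tool-call cycle.

```rust
// What the model can return on each turn (simplified local stub).
enum ModelReply {
    ToolCall { name: String, input: String },
    Text(String),
}

// Synchronous stand-in for the agent loop: call the model, execute tools,
// repeat until a final text response or the turn budget runs out.
fn run_loop<F>(mut model: F, max_turns: usize) -> Option<String>
where
    F: FnMut(&[String]) -> ModelReply, // stand-in for a Provider call
{
    let mut transcript: Vec<String> = vec!["user: What's the weather?".into()];
    for _turn in 0..max_turns {
        match model(&transcript[..]) {
            ModelReply::ToolCall { name, input } => {
                // Execute the tool and feed the result back (stubbed here).
                transcript.push(format!("tool {name}({input}) -> 72F"));
            }
            ModelReply::Text(answer) => return Some(answer), // final response
        }
    }
    None // hit max_turns without a final answer
}

fn main() {
    // A fake model: calls a tool once, then answers.
    let mut calls = 0;
    let answer = run_loop(
        |_| {
            calls += 1;
            if calls == 1 {
                ModelReply::ToolCall { name: "get_weather".into(), input: "SF".into() }
            } else {
                ModelReply::Text("It's 72F and sunny.".into())
            }
        },
        5,
    );
    println!("{answer:?}");
}
```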
Next Steps
- Core Concepts — understand Provider, Tool, ContextStrategy, and more
- Tools Guide — the `#[neuron_tool]` macro, middleware, and advanced patterns
- Providers Guide — switching between Anthropic, OpenAI, and Ollama
Core Concepts
neuron is built around five core abstractions. Each is a trait defined in
neuron-types with one or more implementations in satellite crates.
Provider
The Provider trait abstracts LLM API calls. Each provider is its own crate.
pub trait Provider: Send + Sync {
async fn complete(&self, request: CompletionRequest) -> Result<CompletionResponse, ProviderError>;
async fn complete_stream(&self, request: CompletionRequest) -> Result<StreamHandle, ProviderError>;
}
Implementations: Anthropic, OpenAi, Ollama. All support from_env() for
credential loading.
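Because every provider implements the same trait, application code can stay backend-agnostic. A simplified synchronous model of that idea (the real `Provider` trait is async and lives in `neuron-types`; the types below are local stand-ins):

```rust
// Simplified, synchronous model of the Provider abstraction.
trait Provider {
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

// A test double — any backend (Anthropic, OpenAI, Ollama) would slot in here.
struct MockProvider;

impl Provider for MockProvider {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {prompt}"))
    }
}

// Generic over any Provider — swapping backends needs no changes here.
fn ask<P: Provider>(provider: &P, prompt: &str) -> Result<String, String> {
    provider.complete(prompt)
}

fn main() {
    let reply = ask(&MockProvider, "What is Rust?").unwrap();
    println!("{reply}");
}
```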
Tool
The Tool trait defines a function the model can call. Tools have typed
arguments (via schemars for JSON Schema) and typed outputs.
pub trait Tool: Send + Sync {
const NAME: &'static str;
type Args: DeserializeOwned + JsonSchema;
type Output: Serialize;
type Error: std::error::Error;
fn definition(&self) -> ToolDefinition;
async fn call(&self, args: Self::Args, ctx: &ToolContext) -> Result<Self::Output, Self::Error>;
}
The ToolRegistry stores tools with type erasure (ToolDyn) and runs them
through a composable middleware pipeline.
ContextStrategy
The ContextStrategy trait manages conversation history to stay within token
limits.
pub trait ContextStrategy: Send + Sync {
fn should_compact(&self, messages: &[Message], token_count: usize) -> bool;
async fn compact(&self, messages: Vec<Message>) -> Result<Vec<Message>, ContextError>;
fn token_estimate(&self, messages: &[Message]) -> usize;
}
Implementations: SlidingWindowStrategy (drop oldest messages),
ToolResultClearingStrategy (clear tool outputs), CompositeStrategy (chain
multiple strategies).
ObservabilityHook
Hooks observe the agent loop lifecycle without altering it (unless they terminate).
pub trait ObservabilityHook: Send + Sync {
async fn on_event(&self, event: HookEvent<'_>) -> HookAction;
}
HookAction is Continue, Skip, or Terminate(String). Implementations:
TracingHook (structured tracing spans), GuardrailHook (input/output
guardrails as hooks).
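A minimal sketch of the hook pattern, using simplified local types (the real `ObservabilityHook` is async and receives a richer `HookEvent`). The hook below is a hypothetical turn-budget guardrail:

```rust
// Simplified local stand-ins for the hook types.
enum HookEvent<'a> {
    TurnStart { turn: usize },
    ToolCall { name: &'a str },
}

#[derive(Debug, PartialEq)]
enum HookAction {
    Continue,
    Terminate(String),
}

// Hypothetical hook: terminate the loop once a turn budget is exhausted.
struct TurnBudgetHook { max_turns: usize }

impl TurnBudgetHook {
    fn on_event(&self, event: HookEvent<'_>) -> HookAction {
        match event {
            HookEvent::TurnStart { turn } if turn >= self.max_turns => {
                HookAction::Terminate(format!("turn budget {} exhausted", self.max_turns))
            }
            _ => HookAction::Continue, // observe without altering the loop
        }
    }
}

fn main() {
    let hook = TurnBudgetHook { max_turns: 3 };
    println!("{:?}", hook.on_event(HookEvent::ToolCall { name: "get_weather" }));
    println!("{:?}", hook.on_event(HookEvent::TurnStart { turn: 3 }));
}
```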
DurableContext
Wraps side effects (LLM calls, tool execution) for durable engines like Temporal or Restate.
pub trait DurableContext: Send + Sync {
fn execute_llm_call(&self, request: CompletionRequest, options: ActivityOptions)
-> impl Future<Output = Result<CompletionResponse, DurableError>> + Send;
fn execute_tool(&self, tool_name: &str, input: Value, ctx: &ToolContext, options: ActivityOptions)
-> impl Future<Output = Result<ToolOutput, DurableError>> + Send;
}
This enables journal-based replay and recovery for long-running agents.
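The journal idea in miniature: each side effect is keyed by a step id, and on replay the journaled result is returned instead of re-executing the effect. This is a local sketch of the concept, not the `DurableContext` API (real engines like Temporal persist the journal durably):

```rust
use std::collections::HashMap;

// A toy in-memory journal keyed by step id.
struct Journal {
    entries: HashMap<u64, String>,
}

impl Journal {
    fn new() -> Self { Journal { entries: HashMap::new() } }

    /// Run `effect` at `step`, or return the journaled result on replay.
    fn execute(&mut self, step: u64, effect: impl FnOnce() -> String) -> String {
        self.entries.entry(step).or_insert_with(effect).clone()
    }
}

fn main() {
    let mut journal = Journal::new();
    let first = journal.execute(1, || "llm response A".to_string());
    // Replay of step 1: the effect closure is NOT called again.
    let replayed = journal.execute(1, || "a different response".to_string());
    println!("{first} / {replayed}");
}
```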
Tools
The tool system in neuron lets you give LLMs the ability to call into your Rust
code. You define strongly-typed tools, register them in a ToolRegistry, and
optionally wrap execution with middleware for logging, validation, or permissions.
Quick example
use neuron_tool::{neuron_tool, ToolRegistry};
use neuron_types::ToolContext;
#[neuron_tool(name = "lookup", description = "Look up a value by key")]
async fn lookup(
/// The key to look up
key: String,
_ctx: &ToolContext,
) -> Result<String, std::io::Error> {
Ok(format!("value for {key}"))
}
#[tokio::main]
async fn main() {
let mut registry = ToolRegistry::new();
registry.register(LookupTool);
let ctx = ToolContext::default();
let output = registry
.execute("lookup", serde_json::json!({"key": "foo"}), &ctx)
.await
.unwrap();
println!("{:?}", output.content);
}
Core traits
Tool – strongly typed
The Tool trait is the primary way to define a tool. It uses Rust’s type system
to enforce correct input/output handling at compile time.
pub trait Tool: Send + Sync {
const NAME: &'static str;
type Args: DeserializeOwned + JsonSchema + Send;
type Output: Serialize;
type Error: std::error::Error + Send + 'static;
fn definition(&self) -> ToolDefinition;
fn call(&self, args: Self::Args, ctx: &ToolContext) -> impl Future<Output = Result<Self::Output, Self::Error>> + Send;
}
Key points:
- `NAME` – a unique identifier the LLM uses to invoke the tool.
- `Args` – must derive `Deserialize` and `schemars::JsonSchema` so the registry can generate a JSON Schema for the LLM and deserialize its input.
- `Output` – must implement `Serialize`; the blanket `ToolDyn` impl serializes it to JSON automatically.
- `definition()` – returns a `ToolDefinition` containing the name, description, and JSON Schema. The LLM sees this to decide when to call the tool.
ToolDyn – type-erased
Every Tool automatically implements ToolDyn via a blanket impl. ToolDyn is
the dyn-compatible version that the ToolRegistry stores internally:
pub trait ToolDyn: Send + Sync {
fn name(&self) -> &str;
fn definition(&self) -> ToolDefinition;
fn call_dyn(&self, input: serde_json::Value, ctx: &ToolContext) -> WasmBoxedFuture<'_, Result<ToolOutput, ToolError>>;
}
The blanket impl handles JSON deserialization of Args, calling Tool::call,
serializing the Output into ToolOutput, and mapping errors to ToolError.
The #[neuron_tool] macro
For simple tools, the neuron_tool attribute macro reduces boilerplate. It
generates the Args struct, Tool struct, and Tool impl from a single
annotated async function:
use neuron_tool::neuron_tool;
use neuron_types::ToolContext;
#[derive(Debug, serde::Serialize)]
struct WeatherOutput { temperature: f64, conditions: String }
#[derive(Debug, thiserror::Error)]
#[error("weather error: {0}")]
struct WeatherError(String);
#[neuron_tool(name = "get_weather", description = "Get current weather for a city")]
async fn get_weather(
/// City name (e.g. "San Francisco")
city: String,
_ctx: &ToolContext,
) -> Result<WeatherOutput, WeatherError> {
// The macro generates GetWeatherTool and GetWeatherArgs automatically
Ok(WeatherOutput { temperature: 72.0, conditions: "sunny".into() })
}
The macro generates:
- `GetWeatherArgs` – a struct with `#[derive(Deserialize, JsonSchema)]`
- `GetWeatherTool` – a unit struct implementing `Tool`
- Doc comments on function parameters become JSON Schema descriptions
Register the generated struct: registry.register(GetWeatherTool).
ToolRegistry
The registry stores tools and executes them through an optional middleware chain.
use neuron_tool::ToolRegistry;
let mut registry = ToolRegistry::new();
// Register a strongly-typed tool (auto-erased to ToolDyn)
registry.register(MyTool);
// Register a pre-erased tool (e.g. from MCP bridge)
registry.register_dyn(arc_tool_dyn);
// Get all definitions to send to the LLM
let defs: Vec<ToolDefinition> = registry.definitions();
// Execute a tool by name with JSON input
let output = registry.execute("my_tool", json_input, &tool_ctx).await?;
// Look up a specific tool
let tool: Option<Arc<dyn ToolDyn>> = registry.get("my_tool");
ToolContext
Every tool call receives a ToolContext providing runtime information:
| Field | Type | Description |
|---|---|---|
| `cwd` | `PathBuf` | Current working directory |
| `session_id` | `String` | Session identifier |
| `environment` | `HashMap<String, String>` | Key-value environment |
| `cancellation_token` | `CancellationToken` | Cooperative cancellation |
| `progress_reporter` | `Option<Arc<dyn ProgressReporter>>` | Progress feedback for long-running tools |
ToolContext implements Default with the current directory, an empty session
ID, an empty environment, and a fresh cancellation token.
Middleware
Middleware wraps tool execution with cross-cutting concerns. The pattern is
identical to axum’s from_fn – each middleware receives a Next that it can
call to continue the chain, or skip to short-circuit.
Writing middleware with closures
use neuron_tool::{tool_middleware_fn, ToolRegistry};
let logging = tool_middleware_fn(|call, ctx, next| {
Box::pin(async move {
println!("calling tool: {}", call.name);
let result = next.run(call, ctx).await;
println!("tool completed: is_error={}", result.as_ref().map(|o| o.is_error).unwrap_or(true));
result
})
});
let mut registry = ToolRegistry::new();
registry.add_middleware(logging);
Writing middleware as a struct
use neuron_tool::middleware::{ToolMiddleware, ToolCall, Next};
use neuron_types::{ToolContext, ToolError, ToolOutput, WasmBoxedFuture};
struct RateLimiter { /* ... */ }
impl ToolMiddleware for RateLimiter {
fn process<'a>(
&'a self,
call: &'a ToolCall,
ctx: &'a ToolContext,
next: Next<'a>,
) -> WasmBoxedFuture<'a, Result<ToolOutput, ToolError>> {
Box::pin(async move {
// Check rate limit, then proceed
next.run(call, ctx).await
})
}
}
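How a `Next`-style chain composes can be shown in a simplified synchronous form (the real `neuron_tool` middleware is async and uses boxed futures; the types below are local stand-ins). Each middleware receives the input plus the rest of the chain; calling `next` continues, returning early short-circuits:

```rust
// Local, synchronous stand-ins for the middleware chain.
type Handler = Box<dyn Fn(&str) -> String>;
type Middleware = Box<dyn Fn(&str, &Handler) -> String>;

// Fold the middlewares around the tool, outermost first.
fn compose(middlewares: Vec<Middleware>, tool: Handler) -> Handler {
    middlewares.into_iter().rev().fold(tool, |next, mw| {
        Box::new(move |input: &str| mw(input, &next))
    })
}

fn main() {
    let logging: Middleware = Box::new(|input, next| {
        let out = next(input); // continue the chain
        format!("[logged] {out}")
    });
    let validate: Middleware = Box::new(|input, next| {
        if input.is_empty() {
            return "error: empty input".to_string(); // short-circuit
        }
        next(input)
    });
    let tool: Handler = Box::new(|input| format!("result({input})"));

    // Registration order = execution order: logging wraps everything.
    let chain = compose(vec![logging, validate], tool);
    println!("{}", chain("query")); // [logged] result(query)
    println!("{}", chain(""));      // [logged] error: empty input
}
```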
Input validation middleware
A common use case is intercepting tool calls to validate input arguments before
the tool executes. When validation fails, returning ToolError::ModelRetry gives
the model a hint so it can self-correct rather than crashing the loop.
Here is a closure-based validation middleware that checks URL and numeric range arguments:
use neuron_tool::{tool_middleware_fn, ToolRegistry};
use neuron_types::ToolError;
let mut registry = ToolRegistry::new();
// Input validation middleware — rejects invalid arguments with a hint
// so the model can self-correct
registry.add_middleware(tool_middleware_fn(|call, ctx, next| {
Box::pin(async move {
// Validate URL arguments
if let Some(url) = call.input.get("url").and_then(|v| v.as_str()) {
if !url.starts_with("https://") {
return Err(ToolError::ModelRetry(
format!("url must start with https://, got '{url}'")
));
}
}
// Validate numeric ranges
if let Some(count) = call.input.get("count").and_then(|v| v.as_u64()) {
if count == 0 || count > 100 {
return Err(ToolError::ModelRetry(
format!("count must be 1-100, got {count}")
));
}
}
// Input is valid — proceed to the tool
next.run(call, ctx).await
})
}));
The middleware reads fields from call.input (a serde_json::Value) and returns
early with a validation hint when constraints are violated. Because it uses
ToolError::ModelRetry, the agentic loop converts the message into an error tool
result that the model sees as feedback – it can then retry the call with
corrected arguments.
For the struct-based approach, implement ToolMiddleware the same way as the
RateLimiter example above, placing validation logic inside the process method
and returning Err(ToolError::ModelRetry(hint)) on failure.
ToolError variants for validation
Choose the right error variant depending on whether the model can recover:
| Variant | Behavior | Use when |
|---|---|---|
| `ModelRetry(hint)` | Loop sends the hint back to the model as an error tool result. The model retries with corrected arguments. | Validation errors the model can fix: bad format, out-of-range values, missing optional fields |
| `InvalidInput(msg)` | Propagates as `LoopError::Tool` and stops the loop. | Unrecoverable issues: impossible argument combinations, security violations, malformed JSON |
Use ModelRetry as the default for input validation. Reserve InvalidInput for
cases where no amount of retrying will produce valid input.
Scoping validation to specific tools
Use per-tool middleware to apply validation only where it is needed:
// This validation runs only when the "fetch_page" tool is called
registry.add_tool_middleware("fetch_page", tool_middleware_fn(|call, ctx, next| {
Box::pin(async move {
if let Some(url) = call.input.get("url").and_then(|v| v.as_str()) {
if !url.starts_with("https://") {
return Err(ToolError::ModelRetry(
format!("fetch_page requires an https:// URL, got '{url}'")
));
}
}
next.run(call, ctx).await
})
}));
Middleware execution order
Middleware executes in registration order, wrapping tool calls from outside in:
- Global middleware (registered with `add_middleware`) runs first
- Per-tool middleware (registered with `add_tool_middleware`) runs next
- The actual tool executes last
registry.add_middleware(logging_middleware); // Runs first for ALL tools
registry.add_tool_middleware("search", auth_mw); // Runs second, only for "search"
// The tool itself runs last
Built-in middleware
neuron-tool ships built-in middleware implementations:
- `PermissionChecker` – checks a `PermissionPolicy` before each tool call. Returns `ToolError::PermissionDenied` on `Deny` or `Ask` decisions.
- `OutputFormatter` – truncates tool output exceeding a character limit. Useful to prevent large tool results from consuming the context window.
- `SchemaValidator` – validates tool call inputs against their JSON Schema before execution. Catches missing required fields and type mismatches.
- `TimeoutMiddleware` – wraps tool calls with `tokio::time::timeout`. Configurable default timeout and per-tool overrides for tools with different latency characteristics.
- `StructuredOutputValidator` – validates tool input against the tool’s JSON Schema and returns `ToolError::ModelRetry` on failure, giving the model a chance to self-correct with the validation error as a hint.
- `RetryLimitedValidator` – wraps `StructuredOutputValidator` with a maximum retry count. After the retry limit is exhausted, converts `ModelRetry` to `ToolError::InvalidInput` to stop the loop rather than retrying indefinitely.
use neuron_tool::builtin::{PermissionChecker, OutputFormatter, SchemaValidator};
// Truncate outputs longer than 10,000 characters
registry.add_middleware(OutputFormatter::new(10_000));
// Validate inputs before execution
let validator = SchemaValidator::new(&registry);
registry.add_middleware(validator);
TimeoutMiddleware
Wraps each tool call with tokio::time::timeout. If a tool exceeds the
configured duration, the call is cancelled and returns
ToolError::ExecutionFailed with a timeout message.
use std::time::Duration;
use neuron_tool::builtin::TimeoutMiddleware;
// Default timeout of 30 seconds for all tools
let timeout = TimeoutMiddleware::new(Duration::from_secs(30))
// Override for specific tools that need more time
.with_tool_timeout("slow_search", Duration::from_secs(120))
.with_tool_timeout("code_execution", Duration::from_secs(300));
registry.add_middleware(timeout);
The middleware checks the tool name from the ToolCall and uses the per-tool
timeout if one was configured, otherwise the default. This is useful when most
tools are fast but a few (external API calls, code execution) need longer
deadlines.
StructuredOutputValidator
Validates tool input JSON against the tool’s JSON Schema before execution. When
validation fails, it returns ToolError::ModelRetry with the validation errors
as a hint, giving the model a chance to fix its arguments and retry.
use neuron_tool::builtin::StructuredOutputValidator;
// Validate tool input against a JSON Schema, with up to 3 retries
let schema = serde_json::json!({
"type": "object",
"required": ["result"],
"properties": { "result": { "type": "string" } }
});
let validator = StructuredOutputValidator::new(schema, 3);
registry.add_middleware(validator);
Unlike SchemaValidator (which returns ToolError::InvalidInput on failure),
StructuredOutputValidator uses the ModelRetry self-correction pattern. The
model receives the validation errors as feedback and can retry with corrected
arguments. This is directly inspired by Pydantic AI’s validation-retry loop.
RetryLimitedValidator
Wraps StructuredOutputValidator with a maximum retry count. After the model
has retried a specified number of times, the validator converts ModelRetry
to ToolError::InvalidInput, stopping the self-correction loop and propagating
the error.
use neuron_tool::builtin::{RetryLimitedValidator, StructuredOutputValidator};
// Create a structured validator, then wrap it with a retry limit
let schema = serde_json::json!({
"type": "object",
"required": ["result"],
"properties": { "result": { "type": "string" } }
});
let inner = StructuredOutputValidator::new(schema, 3);
let validator = RetryLimitedValidator::new(inner);
registry.add_middleware(validator);
This prevents infinite retry loops when the model consistently produces invalid
input. After the inner validator’s retry limit is exhausted,
RetryLimitedValidator converts ModelRetry to ToolError::InvalidInput.
ToolError::ModelRetry
The ModelRetry variant enables self-correction. When a tool returns
Err(ToolError::ModelRetry(hint)), the agentic loop converts the hint into
an error tool result and sends it back to the model. The model sees the hint
and can retry with corrected arguments.
use neuron_types::ToolError;
// Inside a tool's call() method:
if !is_valid_query(&args.query) {
return Err(ToolError::ModelRetry(
"Query must be a valid SQL SELECT statement. \
You provided a DELETE statement.".to_string()
));
}
This does not propagate as a LoopError – the loop continues with the model
receiving the hint as feedback.
Implementing Tool manually
When you need full control (custom schemas, complex error types), implement Tool
directly instead of using the macro:
use neuron_types::{Tool, ToolContext, ToolDefinition};
use serde::Deserialize;
#[derive(Debug, Deserialize, schemars::JsonSchema)]
struct SearchArgs {
query: String,
max_results: Option<usize>,
}
struct SearchTool { api_key: String }
impl Tool for SearchTool {
const NAME: &'static str = "search";
type Args = SearchArgs;
type Output = Vec<String>;
type Error = std::io::Error;
fn definition(&self) -> ToolDefinition {
ToolDefinition {
name: "search".into(),
title: Some("Web Search".into()),
description: "Search the web for information".into(),
input_schema: serde_json::to_value(
schemars::schema_for!(SearchArgs)
).unwrap(),
output_schema: None,
annotations: None,
cache_control: None,
}
}
async fn call(&self, args: SearchArgs, _ctx: &ToolContext) -> Result<Vec<String>, std::io::Error> {
let max = args.max_results.unwrap_or(5);
Ok(vec![format!("Result for '{}' (max {})", args.query, max)])
}
}
API reference
- `neuron_tool` on docs.rs
- `neuron_tool_macros` on docs.rs
- `Tool` trait in `neuron_types`
- `ToolDyn` trait in `neuron_types`
Context management
neuron-context provides strategies for keeping conversation history within token
limits. When context grows too large, a ContextStrategy compacts messages –
dropping old ones, clearing tool results, or summarizing via an LLM. The crate
also includes token estimation, system prompt injection, and persistent context
sections.
Quick example
use neuron_context::{SlidingWindowStrategy, TokenCounter};
use neuron_types::{ContextStrategy, Message};
let strategy = SlidingWindowStrategy::new(
10, // keep the last 10 non-system messages
100_000, // compact when tokens exceed 100k
);
let messages = vec![
Message::system("You are a helpful assistant."),
Message::user("Hello"),
Message::assistant("Hi there!"),
// ... many more messages ...
];
let token_count = strategy.token_estimate(&messages);
if strategy.should_compact(&messages, token_count) {
let compacted = strategy.compact(messages).await?;
// compacted retains system messages + the last 10 non-system messages
}
The ContextStrategy trait
All strategies implement this trait from neuron-types:
pub trait ContextStrategy: Send + Sync {
/// Whether compaction should be triggered.
fn should_compact(&self, messages: &[Message], token_count: usize) -> bool;
/// Compact the message list to reduce token usage.
fn compact(&self, messages: Vec<Message>) -> impl Future<Output = Result<Vec<Message>, ContextError>> + Send;
/// Estimate the token count for a list of messages.
fn token_estimate(&self, messages: &[Message]) -> usize;
}
The agentic loop (AgentLoop) calls these methods between turns:
- `token_estimate()` to get the current count
- `should_compact()` to decide if action is needed
- `compact()` to reduce the message list
Built-in strategies
SlidingWindowStrategy
Keeps system messages plus the most recent N non-system messages. Simple and predictable – older messages are dropped entirely.
use neuron_context::{SlidingWindowStrategy, TokenCounter};
// Keep last 20 non-system messages, trigger at 100k tokens
let strategy = SlidingWindowStrategy::new(20, 100_000);
// With a custom token counter (e.g. different chars-per-token ratio)
let counter = TokenCounter::with_ratio(3.5);
let strategy = SlidingWindowStrategy::with_counter(20, 100_000, counter);
What compaction actually does. SlidingWindowStrategy partitions messages
by role: system messages are always preserved regardless of the window size,
and the window count applies only to non-system messages. Here is a concrete
before/after showing a compaction with SlidingWindowStrategy::new(2, 500):
Before compaction (7 messages, ~800 tokens):
[system] "You are a helpful assistant."
[user] "What is Rust?"
[asst] "Rust is a systems programming language..."
[user] "How about memory safety?"
[asst] "Rust uses ownership and borrowing..."
[user] "What about async?"
[asst] "Rust supports async/await via futures..."
After compaction with SlidingWindowStrategy::new(2, 500):
[system] "You are a helpful assistant." <- always preserved
[user] "What about async?" <- last 2 non-system messages
[asst] "Rust supports async/await..." <- last 2 non-system messages
The first four non-system messages are dropped entirely. The system message
survives because the implementation unconditionally retains all system messages
before applying the sliding window to the remaining conversation. See
neuron-context/examples/compaction.rs
for a runnable demo.
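The partition-and-window behavior described above can be sketched in a few lines with a local `Message` type (not the `neuron-types` one): system messages are always kept, and only the last `window` non-system messages survive.

```rust
// Local stand-in for a chat message.
#[derive(Clone, Debug, PartialEq)]
struct Message { role: &'static str, text: &'static str }

// Keep all system messages, plus the last `window` non-system messages.
fn sliding_window(messages: Vec<Message>, window: usize) -> Vec<Message> {
    let (system, rest): (Vec<_>, Vec<_>) =
        messages.into_iter().partition(|m| m.role == "system");
    let keep_from = rest.len().saturating_sub(window);
    system.into_iter().chain(rest.into_iter().skip(keep_from)).collect()
}

fn main() {
    let messages = vec![
        Message { role: "system", text: "You are a helpful assistant." },
        Message { role: "user", text: "What is Rust?" },
        Message { role: "assistant", text: "A systems language..." },
        Message { role: "user", text: "What about async?" },
        Message { role: "assistant", text: "async/await via futures..." },
    ];
    // Keep system + the last 2 non-system messages (5 messages -> 3).
    for m in sliding_window(messages, 2) {
        println!("[{}] {}", m.role, m.text);
    }
}
```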
ToolResultClearingStrategy
Replaces old tool result content with "[tool result cleared]" while preserving
the tool_use_id so the conversation still makes semantic sense. Keeps the most
recent N tool results intact.
This is effective when tool outputs are large (file contents, API responses) but the model only needs the recent ones to stay coherent.
use neuron_context::ToolResultClearingStrategy;
// Keep the 2 most recent tool results intact, clear older ones
let strategy = ToolResultClearingStrategy::new(2, 100_000);
SummarizationStrategy
Uses an LLM provider to summarize old messages, replacing them with a single summary message. Preserves the most recent N messages verbatim.
This produces the highest-quality compaction but costs an additional LLM call.
use neuron_context::SummarizationStrategy;
// Summarize old messages, keep the 5 most recent verbatim
let strategy = SummarizationStrategy::new(provider, 5, 100_000);
The summarization prompt asks the LLM to summarize concisely, focusing on key
information, decisions made, and tool call results. The summary is wrapped in a
[Summary of earlier conversation] prefix.
CompositeStrategy
Chains multiple strategies in order, applying each one until the token budget is met. After each strategy runs, the token count is re-estimated; iteration stops early if below the threshold.
Because ContextStrategy uses RPITIT (not dyn-compatible), strategies must be
wrapped in BoxedStrategy before composing:
use neuron_context::{
CompositeStrategy, SlidingWindowStrategy, ToolResultClearingStrategy,
strategies::BoxedStrategy,
};
let strategy = CompositeStrategy::new(vec![
// First: clear old tool results (cheap, often sufficient)
BoxedStrategy::new(ToolResultClearingStrategy::new(2, 100_000)),
// Second: drop old messages if still over budget
BoxedStrategy::new(SlidingWindowStrategy::new(10, 100_000)),
], 100_000);
This ordering is a best practice: try cheaper strategies first (clearing tool results), then progressively more aggressive ones (dropping messages, summarizing).
TokenCounter
A heuristic token estimator using a configurable characters-per-token ratio. The default ratio of 4.0 characters per token approximates GPT-family and Claude models.
use neuron_context::TokenCounter;
let counter = TokenCounter::new(); // 4.0 chars/token (default)
let counter = TokenCounter::with_ratio(3.5); // Custom ratio
// Estimate tokens for plain text
let tokens = counter.estimate_text("Hello, world!");
// Estimate tokens for a message list
let tokens = counter.estimate_messages(&messages);
// Estimate tokens for tool definitions
let tokens = counter.estimate_tools(&tool_definitions);
The counter estimates different content block types:
| Content type | Estimation method |
|---|---|
| `Text` | `len / chars_per_token` |
| `Thinking` | Thinking text length |
| `ToolUse` | Name + serialized input |
| `ToolResult` | Sum of content items |
| `Image` | Fixed 300 tokens |
| `Document` | Fixed 500 tokens |
| `Compaction` | Content text length |
Each message adds a fixed 4-token overhead for role markers.
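The heuristic in miniature: characters divided by a chars-per-token ratio, plus the fixed 4-token per-message overhead. This is a local sketch of the arithmetic, not the real `TokenCounter` (whether it rounds up exactly as here is an assumption):

```rust
// Characters / ratio, rounded up.
fn estimate_text(text: &str, chars_per_token: f64) -> usize {
    (text.len() as f64 / chars_per_token).ceil() as usize
}

// Sum per-message estimates, each with a fixed 4-token role-marker overhead.
fn estimate_messages(messages: &[&str], chars_per_token: f64) -> usize {
    messages
        .iter()
        .map(|m| estimate_text(m, chars_per_token) + 4)
        .sum()
}

fn main() {
    // "Hello, world!" is 13 chars -> ceil(13 / 4.0) = 4 tokens.
    println!("{}", estimate_text("Hello, world!", 4.0));
    // Two short messages: (4 + 4) + (1 + 4) = 13 tokens.
    println!("{}", estimate_messages(&["Hello, world!", "Hi"], 4.0));
}
```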
SystemInjector
Injects additional system prompt content based on turn count or token thresholds. Useful for reminders (“be concise”) or context-aware instructions that only apply under certain conditions.
use neuron_context::{SystemInjector, InjectionTrigger};
let mut injector = SystemInjector::new();
// Remind the model to be concise every 5 turns
injector.add_rule(
InjectionTrigger::EveryNTurns(5),
"Reminder: keep responses concise.".into(),
);
// Warn when context is getting large
injector.add_rule(
InjectionTrigger::OnTokenThreshold(50_000),
"Context is getting long. Summarize when possible.".into(),
);
// Check each turn -- returns a Vec of triggered content strings
let injections: Vec<String> = injector.check(turn_number, token_count);
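Trigger evaluation can be sketched with simplified local types (not the `SystemInjector` API; in particular, firing `EveryNTurns` on a modulo check is an assumption):

```rust
// Local stand-ins for the trigger rules.
enum Trigger {
    EveryNTurns(usize),
    OnTokenThreshold(usize),
}

struct Injector {
    rules: Vec<(Trigger, String)>,
}

impl Injector {
    // Return the content of every rule that fires for this turn/token count.
    fn check(&self, turn: usize, tokens: usize) -> Vec<String> {
        self.rules
            .iter()
            .filter(|(trigger, _)| match trigger {
                Trigger::EveryNTurns(n) => turn > 0 && turn % n == 0,
                Trigger::OnTokenThreshold(t) => tokens >= *t,
            })
            .map(|(_, content)| content.clone())
            .collect()
    }
}

fn main() {
    let injector = Injector {
        rules: vec![
            (Trigger::EveryNTurns(5), "Reminder: keep responses concise.".into()),
            (Trigger::OnTokenThreshold(50_000), "Context is getting long.".into()),
        ],
    };
    println!("{:?}", injector.check(5, 60_000)); // both rules fire
    println!("{:?}", injector.check(3, 1_000));  // neither fires
}
```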
PersistentContext
Aggregates named context sections and renders them into a single structured string. Use this to build system prompts from multiple independent sources (role definition, rules, domain knowledge) with explicit ordering.
use neuron_context::{PersistentContext, ContextSection};
let mut ctx = PersistentContext::new();
ctx.add_section(ContextSection {
label: "Role".into(),
content: "You are a senior Rust engineer.".into(),
priority: 0, // lower = rendered first
});
ctx.add_section(ContextSection {
label: "Output rules".into(),
content: "Always include code examples.".into(),
priority: 10,
});
let system_prompt = ctx.render();
// ## Role
// You are a senior Rust engineer.
//
// ## Output rules
// Always include code examples.
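The priority-ordered rendering shown above boils down to a sort plus a join. A self-contained sketch with local types (not the `neuron-context` API):

```rust
// Local stand-in for a context section.
struct Section {
    label: String,
    content: String,
    priority: i32, // lower = rendered first
}

// Sort by priority, then render each section as "## Label\ncontent".
fn render(mut sections: Vec<Section>) -> String {
    sections.sort_by_key(|s| s.priority);
    sections
        .iter()
        .map(|s| format!("## {}\n{}", s.label, s.content))
        .collect::<Vec<_>>()
        .join("\n\n")
}

fn main() {
    let prompt = render(vec![
        Section { label: "Output rules".into(), content: "Always include code examples.".into(), priority: 10 },
        Section { label: "Role".into(), content: "You are a senior Rust engineer.".into(), priority: 0 },
    ]);
    println!("{prompt}");
}
```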
Server-side context management
Some providers (Anthropic) support server-side context compaction. Instead of the client compacting messages, the server pauses generation, compacts context internally, and resumes.
neuron supports this via three types in neuron-types:
- `ContextManagement` – configuration sent in `CompletionRequest` to enable server-side compaction.
- `ContentBlock::Compaction` – a content block containing the compacted summary, emitted by the server.
- `StopReason::Compaction` – signals that the server paused to compact. The agentic loop automatically continues when it sees this stop reason.
use neuron_types::{CompletionRequest, ContextManagement, ContextEdit};
let request = CompletionRequest {
context_management: Some(ContextManagement {
edits: vec![ContextEdit::Compact {
strategy: "compact_20260112".into(),
}],
}),
..Default::default()
};
When AgentLoop receives StopReason::Compaction, it appends the assistant’s
message (which may contain ContentBlock::Compaction) and loops again without
treating it as a final response.
Choosing a strategy
| Strategy | Token cost | Quality | Best for |
|---|---|---|---|
| SlidingWindowStrategy | None | Low (drops context) | Short conversations, prototyping |
| ToolResultClearingStrategy | None | Medium (preserves flow) | Tool-heavy agents with large outputs |
| SummarizationStrategy | 1 LLM call | High (semantic summary) | Long conversations needing continuity |
| CompositeStrategy | Varies | High (layered) | Production agents with mixed workloads |
| Server-side compaction | Provider-managed | Provider-dependent | Anthropic users who prefer server management |
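To make the cheapest row concrete, here is a standalone sketch of sliding-window truncation in plain Rust. It is not the actual `SlidingWindowStrategy` internals, which also enforce a token budget; this only shows the message-count dimension.

```rust
/// Keep only the most recent `max_messages` entries. A real strategy
/// would also respect a token budget and preserve the system prompt.
fn sliding_window(messages: Vec<String>, max_messages: usize) -> Vec<String> {
    let len = messages.len();
    if len <= max_messages {
        return messages;
    }
    // Drop the oldest messages; everything before the window is lost,
    // which is why the table rates quality as "Low".
    messages.into_iter().skip(len - max_messages).collect()
}

fn main() {
    let history: Vec<String> = (1..=5).map(|i| format!("msg {i}")).collect();
    let kept = sliding_window(history, 3);
    assert_eq!(kept, vec!["msg 3", "msg 4", "msg 5"]);
    println!("{kept:?}");
}
```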
API reference
Providers
Provider crates implement the Provider trait from neuron-types, giving you
a uniform interface to call any LLM. neuron ships three providers –
Anthropic, OpenAI, and Ollama – each in its own crate, following the serde
pattern: trait in core, implementation in a satellite.
Quick example
use neuron_provider_anthropic::Anthropic;
use neuron_types::{CompletionRequest, Message, Provider};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let provider = Anthropic::from_env()?;
let request = CompletionRequest {
messages: vec![Message::user("What is Rust?")],
max_tokens: Some(256),
..Default::default()
};
let response = provider.complete(request).await?;
println!("{}", response.message.content[0]); // ContentBlock::Text(...)
println!("Tokens: {} in, {} out", response.usage.input_tokens, response.usage.output_tokens);
Ok(())
}
The Provider trait
pub trait Provider: Send + Sync {
fn complete(&self, request: CompletionRequest)
-> impl Future<Output = Result<CompletionResponse, ProviderError>> + Send;
fn complete_stream(&self, request: CompletionRequest)
-> impl Future<Output = Result<StreamHandle, ProviderError>> + Send;
}
Key design points:
- Uses RPITIT (return position impl trait in trait) – Rust 2024 native async. No `#[async_trait]` needed.
- Not object-safe by design. Use generics `<P: Provider>` to compose. This avoids the overhead of boxing futures while keeping the API clean.
- `complete()` returns a full `CompletionResponse` with the message, token usage, and stop reason.
- `complete_stream()` returns a `StreamHandle` whose `receiver` field is a `tokio::sync::mpsc::Receiver<StreamEvent>` that yields text deltas, tool use blocks, usage stats, and a final `MessageComplete` event.
Anthropic (neuron-provider-anthropic)
Client for the Anthropic Messages API.
Construction
use neuron_provider_anthropic::Anthropic;
// From environment variable (ANTHROPIC_API_KEY)
let provider = Anthropic::from_env()?;
// With explicit API key
let provider = Anthropic::new("sk-ant-...");
// Builder-style configuration
let provider = Anthropic::new("sk-ant-...")
.model("claude-opus-4-5")
.base_url("https://api.anthropic.com");
Configuration
| Method | Default | Description |
|---|---|---|
new(api_key) | – | Create with explicit key |
from_env() | – | Read ANTHROPIC_API_KEY from environment |
.model(name) | claude-sonnet-4-20250514 | Default model when request has empty model field |
.base_url(url) | https://api.anthropic.com | Override for proxies or testing |
Features
- Full content block mapping: text, thinking, tool use/result, images, documents, compaction
- Server-side context management via `ContextManagement` request field
- SSE streaming with manual parser
- Cache control on system prompts and tool definitions
- `ToolChoice::Required` maps to Anthropic’s `{"type": "any"}`
OpenAI (neuron-provider-openai)
Client for the OpenAI Chat Completions API. Also implements EmbeddingProvider
for the Embeddings API.
Construction
use neuron_provider_openai::OpenAi;
// From environment variable (OPENAI_API_KEY, optional OPENAI_ORG_ID)
let provider = OpenAi::from_env()?;
// With explicit API key
let provider = OpenAi::new("sk-...");
// Builder-style configuration
let provider = OpenAi::new("sk-...")
.model("gpt-4o")
.base_url("https://api.openai.com")
.organization("org-...");
Configuration
| Method | Default | Description |
|---|---|---|
new(api_key) | – | Create with explicit key |
from_env() | – | Read OPENAI_API_KEY (and optional OPENAI_ORG_ID) |
.model(name) | gpt-4o | Default model |
.base_url(url) | https://api.openai.com | Override for Azure, proxies, or testing |
.organization(org) | None | Sent as OpenAI-Organization header |
Embeddings
OpenAi also implements the EmbeddingProvider trait:
use neuron_types::{EmbeddingProvider, EmbeddingRequest};
use neuron_provider_openai::OpenAi;
let provider = OpenAi::from_env()?;
let request = EmbeddingRequest {
model: "text-embedding-3-small".into(),
input: vec!["Hello world".into(), "Goodbye world".into()],
dimensions: Some(256), // optional dimension reduction
..Default::default()
};
let response = provider.embed(request).await?;
// response.embeddings: Vec<Vec<f32>> -- one vector per input
// response.usage: EmbeddingUsage { prompt_tokens, total_tokens }
The EmbeddingProvider trait is separate from Provider because not all
embedding models support chat completion and vice versa. The OpenAi struct
implements both.
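A typical next step with embedding vectors is similarity ranking. The helper below is ordinary Rust, independent of the neuron API; `response.embeddings` is where the input vectors would come from.

```rust
/// Cosine similarity between two embedding vectors of equal length.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must have the same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0; // define similarity against a zero vector as 0
    }
    dot / (norm_a * norm_b)
}

fn main() {
    // In practice these would be rows of response.embeddings.
    let hello = vec![0.1, 0.9, 0.0];
    let goodbye = vec![0.2, 0.8, 0.1];
    println!("{:.3}", cosine_similarity(&hello, &goodbye));
}
```

With `dimensions: Some(256)` as in the example above, all vectors share the reduced dimension, so they remain directly comparable.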
Features
- SSE streaming with `data: [DONE]` sentinel
- System prompts mapped to `role: "developer"` (OpenAI convention)
- Tool calls in `choices[0].message.tool_calls` array format
- `ToolChoice::Required` maps to OpenAI’s `"required"`
- Stream options include `include_usage: true` for token stats
Ollama (neuron-provider-ollama)
Client for the Ollama Chat API. Designed for local models with no authentication required by default.
Construction
use neuron_provider_ollama::Ollama;
// Default: localhost:11434, no auth
let provider = Ollama::new();
// From environment (reads OLLAMA_HOST if set)
let provider = Ollama::from_env()?;
// Builder-style configuration
let provider = Ollama::new()
.model("llama3.2")
.base_url("http://remote-host:11434")
.keep_alive("5m");
Configuration
| Method | Default | Description |
|---|---|---|
new() | – | Create with defaults (no auth needed) |
from_env() | – | Read OLLAMA_HOST for base URL |
.model(name) | llama3.2 | Default model |
.base_url(url) | http://localhost:11434 | Override for remote instances |
.keep_alive(duration) | None (server default) | Model memory residency ("5m", "0" to unload) |
Features
- NDJSON streaming (newline-delimited JSON, not SSE)
- No authentication by default (Ollama runs locally)
- Synthesizes tool call IDs with UUID (Ollama does not provide them natively)
- `keep_alive` controls how long the model stays in GPU memory
- Tool definitions use the same format as OpenAI (adopted by Ollama)
Provider + AgentLoop integration
The most common use of a provider is plugging it into an AgentLoop – the
commodity agentic while-loop that handles tool dispatch, context management, and
multi-turn conversation. Here is a complete, self-contained example using OpenAI:
use neuron_context::SlidingWindowStrategy;
use neuron_loop::AgentLoop;
use neuron_provider_openai::OpenAi;
use neuron_tool::ToolRegistry;
use neuron_types::ToolContext;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// 1. Build the provider
let provider = OpenAi::from_env()?.model("gpt-4o");
// 2. Choose a context strategy (keep last 20 messages, up to 100k tokens)
let context = SlidingWindowStrategy::new(20, 100_000);
// 3. Create a tool registry (empty here -- add tools as needed)
let tools = ToolRegistry::new();
// 4. Assemble the agent loop
let mut agent = AgentLoop::builder(provider, context)
.system_prompt("You are a helpful assistant.")
.max_turns(10)
.tools(tools)
.build();
// 5. Run with a plain text message
let ctx = ToolContext::default();
let result = agent.run_text("Hello!", &ctx).await?;
println!("{}", result.response);
println!("Turns: {}, Tokens: {} in / {} out",
result.turns, result.usage.input_tokens, result.usage.output_tokens);
Ok(())
}
This pattern is identical for every Provider implementation. Replace
OpenAi::from_env()? with Anthropic::from_env()? or Ollama::from_env()?
and nothing else changes – the builder, context strategy, tool registry, and
run call all stay the same.
Implementing a custom provider
If none of the built-in provider crates fit your needs – for example, you want
to integrate a proprietary LLM service or a local inference engine – you can
implement the Provider trait directly:
use std::future::Future;
use neuron_types::{
CompletionRequest, CompletionResponse, ContentBlock, Message,
Provider, ProviderError, Role, StopReason, StreamHandle, TokenUsage,
};
/// A minimal provider that calls a hypothetical LLM API.
pub struct MyProvider {
api_key: String,
}
impl MyProvider {
pub fn new(api_key: impl Into<String>) -> Self {
Self {
api_key: api_key.into(),
}
}
}
impl Provider for MyProvider {
fn complete(
&self,
request: CompletionRequest,
) -> impl Future<Output = Result<CompletionResponse, ProviderError>> + Send {
let api_key = self.api_key.clone();
async move {
// In a real implementation, serialize `request` and send it
// to your LLM API using reqwest, hyper, etc.
let response_text = format!(
"Echo: {}",
request.messages.last().map(|m| m.content[0].to_string()).unwrap_or_default()
);
Ok(CompletionResponse {
id: "resp-001".to_string(),
model: "my-model-v1".to_string(),
message: Message::assistant(response_text),
usage: TokenUsage {
input_tokens: 10,
output_tokens: 20,
..Default::default()
},
stop_reason: StopReason::EndTurn,
})
}
}
fn complete_stream(
&self,
_request: CompletionRequest,
) -> impl Future<Output = Result<StreamHandle, ProviderError>> + Send {
async {
Err(ProviderError::InvalidRequest(
"streaming not supported".to_string(),
))
}
}
}
CompletionResponse fields
| Field | Type | Meaning |
|---|---|---|
id | String | Unique identifier from the LLM API (e.g., "msg_01XFDUDYJgAACzvnptvVoYEL"). Used for logging and deduplication. |
model | String | The model name that processed the request (e.g., "gpt-4o", "claude-sonnet-4-20250514"). |
message | Message | The assistant response. Construct with Message::assistant("text") or manually as Message { role: Role::Assistant, content: vec![ContentBlock::Text(...)] }. |
usage | TokenUsage | Token counts for the request. Use ..Default::default() for optional fields (cache_read_tokens, cache_creation_tokens, reasoning_tokens, iterations) when your API does not report them. |
stop_reason | StopReason | Why generation stopped. EndTurn for normal completion, ToolUse when the model wants to call tools, MaxTokens if the response was truncated by the token limit. |
The Provider trait requires WasmCompatSend + WasmCompatSync, which are
equivalent to Send + Sync on native targets. On WASM, these bounds are
automatically satisfied so your provider can compile for both environments.
Error handling
All providers map errors to ProviderError, which classifies errors as retryable
or terminal:
use neuron_types::ProviderError;
match provider.complete(request).await {
Ok(response) => { /* ... */ }
Err(e) if e.is_retryable() => {
// Network, RateLimit, ModelLoading, Timeout, ServiceUnavailable
// Safe to retry with backoff
}
Err(e) => {
// Authentication, InvalidRequest, ModelNotFound, InsufficientResources
// Do not retry -- fix the root cause
}
}
ProviderError variants
| Variant | Retryable | Description |
|---|---|---|
Network(source) | Yes | Connection reset, DNS failure |
RateLimit { retry_after } | Yes | Provider rate limit hit |
ModelLoading(msg) | Yes | Cold start, model still loading |
Timeout(duration) | Yes | Request timed out |
ServiceUnavailable(msg) | Yes | Temporary provider outage |
Authentication(msg) | No | Bad API key or permissions |
InvalidRequest(msg) | No | Malformed request |
ModelNotFound(msg) | No | Requested model does not exist |
InsufficientResources(msg) | No | Quota or limit exceeded |
StreamError(msg) | No | Error during streaming |
Other(source) | No | Catch-all |
neuron does not include built-in retry logic. Use is_retryable() with your
own retry strategy, tower middleware, or a durable execution engine.
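One such retry strategy can be sketched as a synchronous wrapper around an `is_retryable()`-style classifier. Everything below is a standalone illustration: `FakeError` stands in for `ProviderError`, and a real agent would use async backoff rather than `thread::sleep`.

```rust
use std::{thread, time::Duration};

/// Stand-in for ProviderError: only the retryable/terminal split matters here.
#[derive(Debug)]
struct FakeError {
    retryable: bool,
}

impl FakeError {
    fn is_retryable(&self) -> bool {
        self.retryable
    }
}

/// Retry `op` up to `max_attempts` times with exponential backoff,
/// but only when the error classifies itself as retryable.
fn retry_with_backoff<T>(
    max_attempts: u32,
    mut op: impl FnMut() -> Result<T, FakeError>,
) -> Result<T, FakeError> {
    let mut delay = Duration::from_millis(1);
    for attempt in 1..=max_attempts {
        match op() {
            Ok(v) => return Ok(v),
            Err(e) if e.is_retryable() && attempt < max_attempts => {
                thread::sleep(delay);
                delay *= 2; // exponential backoff
            }
            Err(e) => return Err(e), // terminal, or attempts exhausted
        }
    }
    unreachable!("loop always returns when max_attempts >= 1");
}

fn main() {
    let mut calls = 0;
    let result = retry_with_backoff(3, || {
        calls += 1;
        if calls < 3 {
            Err(FakeError { retryable: true }) // e.g. RateLimit
        } else {
            Ok("response")
        }
    });
    assert_eq!(result.unwrap(), "response");
    assert_eq!(calls, 3);
}
```

The design choice to keep retries out of the providers means the same policy (or a tower layer, or a durable engine) can wrap every provider uniformly.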
Streaming
All providers support streaming via complete_stream(), which returns a
StreamHandle:
use futures::StreamExt;
use neuron_types::StreamEvent;
let handle = provider.complete_stream(request).await?;
let mut stream = handle.receiver;
while let Some(event) = stream.recv().await {
match event {
StreamEvent::TextDelta(text) => print!("{text}"),
StreamEvent::ToolUse { id, name, input } => { /* tool call */ }
StreamEvent::Usage(usage) => { /* token stats */ }
StreamEvent::MessageComplete(message) => { /* final assembled message */ }
StreamEvent::Error(err) => { /* stream error */ }
_ => {}
}
}
The transport differs by provider:
| Provider | Transport | Format |
|---|---|---|
| Anthropic | Server-Sent Events (SSE) | event: + data: lines |
| OpenAI | Server-Sent Events (SSE) | data: lines, data: [DONE] sentinel |
| Ollama | NDJSON | One JSON object per line |
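The framing difference in the table can be illustrated with plain string handling. This is a sketch of the parsing concern only, not the actual provider parsers (which also handle partial chunks and JSON decoding).

```rust
/// Extract payloads from an SSE body: take `data:` lines, stop at `[DONE]`.
fn sse_payloads(body: &str) -> Vec<&str> {
    body.lines()
        .filter_map(|line| line.strip_prefix("data: "))
        .take_while(|payload| *payload != "[DONE]")
        .collect()
}

/// NDJSON framing: every non-empty line is a complete JSON object.
fn ndjson_payloads(body: &str) -> Vec<&str> {
    body.lines().filter(|l| !l.is_empty()).collect()
}

fn main() {
    let sse = "event: delta\ndata: {\"text\":\"Hi\"}\ndata: [DONE]\n";
    assert_eq!(sse_payloads(sse), vec!["{\"text\":\"Hi\"}"]);

    let ndjson = "{\"text\":\"Hi\"}\n{\"done\":true}\n";
    assert_eq!(ndjson_payloads(ndjson), vec!["{\"text\":\"Hi\"}", "{\"done\":true}"]);
    println!("framing ok");
}
```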
Swapping providers
Because all providers implement the same Provider trait, swapping is a
one-line change:
use neuron_context::SlidingWindowStrategy;
use neuron_loop::AgentLoop;
// Switch from Anthropic...
// let provider = Anthropic::from_env()?;
// ...to OpenAI:
let provider = OpenAi::from_env()?;
// Everything else stays the same
let agent = AgentLoop::builder(provider, SlidingWindowStrategy::new(20, 100_000))
.system_prompt("You are a helpful assistant.")
.build();
The model field in CompletionRequest defaults to empty, which makes the
provider use its configured default model. Set it explicitly when you need
a specific model within a run.
API reference
- `neuron_provider_anthropic` on docs.rs
- `neuron_provider_openai` on docs.rs
- `neuron_provider_ollama` on docs.rs
- `Provider` trait in `neuron_types`
- `EmbeddingProvider` trait in `neuron_types`
The agent loop
AgentLoop is the commodity while loop at the center of every agent. It
composes a Provider, a ToolRegistry, and a ContextStrategy into a loop
that calls the LLM, executes tools, manages context, and repeats until the model
returns a final text response or a limit is reached.
Quick example
use neuron_context::SlidingWindowStrategy;
use neuron_loop::AgentLoop;
use neuron_tool::ToolRegistry;
use neuron_types::ToolContext;
let provider = Anthropic::from_env()?;
let context = SlidingWindowStrategy::new(20, 100_000);
let mut tools = ToolRegistry::new();
tools.register(MySearchTool);
tools.register(MyCalculateTool);
let mut agent = AgentLoop::builder(provider, context)
.tools(tools)
.system_prompt("You are a helpful research assistant.")
.max_turns(15)
.parallel_tool_execution(true)
.build();
let ctx = ToolContext::default();
let result = agent.run_text("Find the population of Tokyo", &ctx).await?;
println!("Response: {}", result.response);
println!("Turns: {}, Tokens: {} in / {} out",
result.turns, result.usage.input_tokens, result.usage.output_tokens);
Building an AgentLoop
The builder pattern
AgentLoop::builder(provider, context) returns an AgentLoopBuilder with
sensible defaults. Only the provider and context strategy are required.
let agent = AgentLoop::builder(provider, context)
.tools(registry) // ToolRegistry (default: empty)
.system_prompt("You are helpful.") // SystemPrompt (default: empty)
.max_turns(10) // Option<usize> (default: None = unlimited)
.parallel_tool_execution(true) // bool (default: false)
.usage_limits(limits) // UsageLimits (default: no limits)
.hook(my_logging_hook) // ObservabilityHook (can add multiple)
.durability(my_durable_ctx) // DurableContext (optional)
.build();
Direct construction
You can also construct directly when you need to set the full LoopConfig:
use neuron_loop::{AgentLoop, LoopConfig};
use neuron_types::SystemPrompt;
let config = LoopConfig {
system_prompt: SystemPrompt::Text("You are a code reviewer.".into()),
max_turns: Some(20),
parallel_tool_execution: true,
..Default::default()
};
let agent = AgentLoop::new(provider, tools, context, config);
Running the loop
run() – drive to completion
Appends the user message, then loops until the model returns a text-only response or the turn limit is reached.
let result = agent.run(Message::user("Hello!"), &tool_ctx).await?;
// result: AgentResult { response, messages, usage, turns }
run_text() – convenience for text input
Wraps a &str into a Message::user() and calls run():
let result = agent.run_text("What is 2 + 2?", &tool_ctx).await?;
run_stream() – streaming output
Uses provider.complete_stream() for real-time token output. Returns a
channel receiver that yields StreamEvents:
let mut rx = agent.run_stream(Message::user("Explain Rust ownership"), &tool_ctx).await;
while let Some(event) = rx.recv().await {
match event {
StreamEvent::TextDelta(text) => print!("{text}"),
StreamEvent::ToolUse { name, .. } => println!("\n[calling {name}...]"),
StreamEvent::Usage(usage) => println!("\n[{} tokens]", usage.output_tokens),
StreamEvent::MessageComplete(_) => println!("\n[done]"),
StreamEvent::Error(err) => eprintln!("Error: {err}"),
_ => {}
}
}
Tool execution is handled between streaming turns. The loop streams the LLM response, executes any tool calls, appends results, and streams the next turn.
run_step() – one turn at a time
Returns a StepIterator that lets you advance the loop manually. Between turns
you can inspect messages, inject new ones, and modify the tool registry.
let mut steps = agent.run_step(Message::user("Plan a trip"), &tool_ctx);
while let Some(turn) = steps.next().await {
match turn {
TurnResult::ToolsExecuted { calls, results } => {
println!("Executed {} tools", calls.len());
// Optionally inject guidance between turns
steps.inject_message(Message::user("Focus on budget options."));
}
TurnResult::FinalResponse(result) => {
println!("Final: {}", result.response);
}
TurnResult::CompactionOccurred { old_tokens, new_tokens } => {
println!("Compacted: {old_tokens} -> {new_tokens} tokens");
}
TurnResult::MaxTurnsReached => {
println!("Hit turn limit");
}
TurnResult::Error(e) => {
eprintln!("Error: {e}");
}
}
}
StepIterator exposes:
- `next()` – advance one turn
- `messages()` – view current conversation
- `inject_message(msg)` – add a message between turns
- `tools_mut()` – modify the tool registry between turns
Distinguishing text responses from tool calls
TurnResult is the key abstraction for telling apart a direct LLM message from
a tool-call round trip. When the model returns plain text and no tool calls, the
iterator yields TurnResult::FinalResponse containing the finished
AgentResult. When the model requests one or more tool calls, the loop executes
them and yields TurnResult::ToolsExecuted with the calls and their results.
The loop handles dispatch automatically — you just match on the variant.
let mut steps = agent.run_step(Message::user("What's 2 + 2?"), &tool_ctx);
while let Some(turn) = steps.next().await {
match turn {
TurnResult::ToolsExecuted { calls, results } => {
// The model requested tool calls — they've been executed
for (call_id, tool_name, input) in &calls {
println!("Model called tool '{tool_name}' with {input}");
}
// results contains the ContentBlock::ToolResult for each call
// The loop automatically sends these back to the model
}
TurnResult::FinalResponse(result) => {
// The model returned a text response — no more tool calls
println!("Final answer: {}", result.response);
println!("Total turns: {}", result.turns);
}
TurnResult::CompactionOccurred { old_tokens, new_tokens } => {
println!("Context compacted: {old_tokens} → {new_tokens} tokens");
// Loop continues automatically
}
TurnResult::MaxTurnsReached => {
println!("Turn limit reached without a final response");
}
TurnResult::Error(e) => {
eprintln!("Loop error: {e}");
}
}
}
If you only need the final result and don’t need turn-by-turn control, use
run() or run_text() instead — they drive the loop to completion and return
AgentResult directly.
AgentResult
Returned by run(), run_text(), and TurnResult::FinalResponse:
pub struct AgentResult {
pub response: String, // Final text response from the model
pub messages: Vec<Message>, // Full conversation history
pub usage: TokenUsage, // Cumulative token usage across all turns
pub turns: usize, // Number of turns completed
}
Loop lifecycle
Each iteration of the loop follows this sequence:
1. Check cancellation – if `tool_ctx.cancellation_token` is cancelled, return `LoopError::Cancelled`
2. Check max turns – if the turn limit is reached, return `LoopError::MaxTurns`
3. Check usage limits – if any token, request, or tool call limit is exceeded, return `LoopError::UsageLimitExceeded`
4. Fire `LoopIteration` hooks
5. Check context compaction – call `context.should_compact()` and `context.compact()` if needed
6. Build `CompletionRequest` from current messages, system prompt, and tool definitions
7. Fire `PreLlmCall` hooks
8. Call the provider (or durable context if set)
9. Fire `PostLlmCall` hooks
10. Accumulate token usage
11. Check stop reason:
    - `StopReason::Compaction` – append message and continue the loop
    - `StopReason::EndTurn` or no tool calls – extract text and return `AgentResult`
    - `StopReason::ToolUse` – proceed to tool execution
12. Check cancellation again before tool execution
13. Execute tool calls (parallel or sequential), firing `PreToolExecution` and `PostToolExecution` hooks for each
14. Check usage limits – verify tool call count against limit
15. Append tool results as a user message and loop back to step 1
How tool call processing works
When the LLM decides to use a tool, the loop handles the entire dispatch cycle automatically. Here is exactly what happens at the code level:
Step 1: LLM returns tool calls. The provider responds with
StopReason::ToolUse and one or more ContentBlock::ToolUse blocks in the
assistant message. Each block contains a name, input (JSON arguments), and
a unique id.
// Inside the loop — the LLM response contains tool calls:
// response.stop_reason == StopReason::ToolUse
// response.message.content == [
// ContentBlock::Text("Let me look that up."),
// ContentBlock::ToolUse { id: "call_1", name: "get_weather", input: {"city": "Tokyo"} },
// ]
Step 2: Extract tool calls. The loop filters the assistant message for
ContentBlock::ToolUse blocks and collects them as (id, name, input) tuples.
The full assistant message (including any text) is appended to the conversation.
let tool_calls: Vec<_> = response.message.content.iter()
.filter_map(|block| {
if let ContentBlock::ToolUse { id, name, input } = block {
Some((id.clone(), name.clone(), input.clone()))
} else {
None
}
})
.collect();
// Append the assistant message (with both text and tool use blocks)
self.messages.push(response.message.clone());
Step 3: Execute tools via the registry. Each tool call is dispatched to the
ToolRegistry, which finds the matching tool by name, deserializes the JSON
input, and calls the tool’s call() method. Pre- and post-execution hooks fire
around each call.
// For each tool call, the loop calls execute_single_tool:
// 1. Fire PreToolExecution hooks (can Skip or Terminate)
// 2. Call self.tools.execute(tool_name, input, tool_ctx)
// 3. Fire PostToolExecution hooks
// 4. Wrap the ToolOutput into a ContentBlock::ToolResult
If parallel_tool_execution is true and there are multiple tool calls, all
calls run concurrently via futures::future::join_all. Otherwise they execute
sequentially.
Step 4: Append results and continue. The tool results are collected into
ContentBlock::ToolResult blocks (each linked back to the original call by
tool_use_id) and appended as a User message. The loop then continues —
the LLM sees the tool results and can respond with text or call more tools.
// Each tool result looks like:
// ContentBlock::ToolResult {
// tool_use_id: "call_1",
// content: [ContentItem::Text("{\"temp\": 22, \"conditions\": \"sunny\"}")],
// is_error: false,
// }
// All results are appended as a single user message
self.messages.push(Message {
role: Role::User,
content: tool_result_blocks,
});
// Loop continues → LLM sees results → responds or calls more tools
Special case — ToolError::ModelRetry. If a tool returns
Err(ToolError::ModelRetry(hint)), the loop does not propagate an error.
Instead, it converts the hint into a ToolResult with is_error: true. The
model receives the hint and can retry with corrected arguments:
// Tool returns: Err(ToolError::ModelRetry("city must be a valid name, got '123'"))
// Loop converts to: ContentBlock::ToolResult {
// tool_use_id: "call_1",
// content: [ContentItem::Text("city must be a valid name, got '123'")],
// is_error: true,
// }
// Model sees the error and retries: get_weather({"city": "Tokyo"})
Complete flow diagram
User: "What's the weather in Tokyo?"
│
▼
┌─────────────────────────────────────────────┐
│ Turn 1: LLM call │
│ Request: [User: "What's the weather..."] │
│ Response: ToolUse(get_weather, {city: │
│ "Tokyo"}) │
│ StopReason: ToolUse │
└──────────────────┬──────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ Tool execution │
│ Registry dispatches get_weather │
│ Tool returns: {temp: 22, conditions: │
│ "sunny"} │
│ Result appended as ToolResult message │
└──────────────────┬──────────────────────────┘
│
▼
┌─────────────────────────────────────────────┐
│ Turn 2: LLM call │
│ Request: [User, Assistant(ToolUse), │
│ User(ToolResult)] │
│ Response: "It's 22°C and sunny in Tokyo." │
│ StopReason: EndTurn │
└──────────────────┬──────────────────────────┘
│
▼
AgentResult {
response: "It's 22°C and sunny in Tokyo.",
turns: 2,
...
}
Cancellation
The loop checks ToolContext.cancellation_token at two points:
- Top of each iteration (before the max turns check)
- Before tool execution (after the LLM returns tool calls)
use tokio_util::sync::CancellationToken;
let token = CancellationToken::new();
let ctx = ToolContext {
cancellation_token: token.clone(),
..Default::default()
};
// Cancel from another task
tokio::spawn(async move {
tokio::time::sleep(Duration::from_secs(30)).await;
token.cancel();
});
match agent.run_text("Long task...", &ctx).await {
Err(LoopError::Cancelled) => println!("Cancelled!"),
Ok(result) => println!("{}", result.response),
Err(e) => eprintln!("{e}"),
}
Parallel tool execution
When LoopConfig.parallel_tool_execution is true and the LLM returns multiple
tool calls in a single response, all calls execute concurrently via
futures::future::join_all. When false (the default), tools execute
sequentially in order.
let agent = AgentLoop::builder(provider, context)
.parallel_tool_execution(true)
.tools(registry)
.build();
Parallel execution applies to run() and run_step(). Streaming (run_stream())
always executes tools sequentially.
Usage limits
UsageLimits enforces token and request budgets on the agent loop. When any
limit is exceeded, the loop returns LoopError::UsageLimitExceeded with a
message describing which limit was hit.
use neuron_loop::AgentLoop;
use neuron_types::UsageLimits;
let limits = UsageLimits::default()
.with_input_tokens_limit(500_000)
.with_output_tokens_limit(50_000)
.with_total_tokens_limit(600_000)
.with_request_limit(25)
.with_tool_calls_limit(100);
let agent = AgentLoop::builder(provider, context)
.tools(registry)
.usage_limits(limits)
.build();
Each field is optional – set only the limits you care about. Unset limits are not enforced.
| Limit | Checked against |
|---|---|
input_tokens_limit | Cumulative TokenUsage.input_tokens across all turns |
output_tokens_limit | Cumulative TokenUsage.output_tokens across all turns |
total_tokens_limit | Sum of cumulative input + output tokens |
request_limit | Number of LLM calls made (incremented each turn) |
tool_calls_limit | Number of tool executions (incremented per tool call) |
The loop checks limits at two points:
- Before each LLM call – checks token and request limits against accumulated usage
- After tool execution – checks the tool call count against the limit
When a limit is exceeded, the loop stops immediately and returns
LoopError::UsageLimitExceeded with a descriptive message (e.g.,
"output token limit exceeded: 50123 > 50000").
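The check itself is simple bookkeeping. Here is a standalone sketch, not the neuron internals; the field names mirror `UsageLimits` but the types are simplified.

```rust
/// Minimal mirror of the limit check: compare cumulative usage
/// against optional budgets and report the first violation.
struct Limits {
    output_tokens_limit: Option<u64>,
    request_limit: Option<u64>,
}

struct Usage {
    output_tokens: u64,
    requests: u64,
}

fn check(limits: &Limits, usage: &Usage) -> Result<(), String> {
    if let Some(max) = limits.output_tokens_limit {
        if usage.output_tokens > max {
            return Err(format!("output token limit exceeded: {} > {}", usage.output_tokens, max));
        }
    }
    if let Some(max) = limits.request_limit {
        if usage.requests > max {
            return Err(format!("request limit exceeded: {} > {}", usage.requests, max));
        }
    }
    Ok(()) // unset limits are simply not enforced
}

fn main() {
    let limits = Limits { output_tokens_limit: Some(50_000), request_limit: Some(25) };
    let usage = Usage { output_tokens: 50_123, requests: 10 };
    assert_eq!(
        check(&limits, &usage).unwrap_err(),
        "output token limit exceeded: 50123 > 50000"
    );
}
```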
You can also construct UsageLimits directly:
use neuron_types::UsageLimits;
let limits = UsageLimits {
input_tokens_limit: Some(500_000),
output_tokens_limit: Some(50_000),
total_tokens_limit: None,
request_limit: Some(25),
tool_calls_limit: None,
};
Or use LoopConfig directly:
use neuron_loop::LoopConfig;
use neuron_types::UsageLimits;
let config = LoopConfig {
usage_limits: Some(UsageLimits::default()
.with_total_tokens_limit(1_000_000)
.with_request_limit(50)),
..Default::default()
};
Context compaction
The loop supports two independent compaction mechanisms:
Client-side compaction
Uses the ContextStrategy you provide. Between turns, the loop calls
should_compact() and compact() to reduce message history when tokens exceed
the configured threshold.
// SlidingWindow compacts by dropping old messages
let agent = AgentLoop::builder(provider, SlidingWindowStrategy::new(20, 100_000))
.build();
Server-side compaction
When the provider returns StopReason::Compaction, the loop automatically
continues without treating it as a final response. The compacted content
arrives in ContentBlock::Compaction within the assistant’s message.
No configuration is needed in the loop – it handles this transparently. Set
CompletionRequest.context_management on the provider side to enable it.
ToolError::ModelRetry
When a tool returns Err(ToolError::ModelRetry(hint)), the loop converts it
to a ToolOutput with is_error: true and the hint as content. The model
receives the hint and can retry with corrected arguments.
This does not propagate as LoopError::Tool. The loop continues normally,
giving the model a chance to self-correct.
Observability hooks
Add hooks to observe or control loop behavior. Hooks receive events at each step
and return HookAction::Continue, HookAction::Skip, or HookAction::Terminate.
use neuron_types::{ObservabilityHook, HookEvent, HookAction, HookError};
struct TokenBudgetHook { max_tokens: usize }
impl ObservabilityHook for TokenBudgetHook {
async fn on_event(&self, event: HookEvent<'_>) -> Result<HookAction, HookError> {
match event {
HookEvent::PostLlmCall { response } => {
if response.usage.output_tokens > self.max_tokens {
return Ok(HookAction::Terminate {
reason: "token budget exceeded".into(),
});
}
}
_ => {}
}
Ok(HookAction::Continue)
}
}
let agent = AgentLoop::builder(provider, context)
.hook(TokenBudgetHook { max_tokens: 10_000 })
.build();
Hook events
| Event | Fired when | Skip/Terminate behavior |
|---|---|---|
LoopIteration { turn } | Start of each turn | Terminate stops the loop |
PreLlmCall { request } | Before calling the provider | Terminate stops the loop |
PostLlmCall { response } | After receiving the response | Terminate stops the loop |
PreToolExecution { tool_name, input } | Before each tool call | Skip returns rejection as tool result |
PostToolExecution { tool_name, output } | After each tool call | Terminate stops the loop |
ContextCompaction { old_tokens, new_tokens } | After context is compacted | Terminate stops the loop |
Durable execution
For crash-recoverable agents, set a DurableContext on the loop. When present,
LLM calls go through DurableContext::execute_llm_call and tool calls go
through DurableContext::execute_tool, enabling journaling and replay by engines
like Temporal, Restate, or Inngest.
let agent = AgentLoop::builder(provider, context)
.durability(my_temporal_context)
.build();
The loop handles the durable/non-durable split transparently. All other behavior (hooks, compaction, cancellation) works the same way.
Error handling
run() and run_text() return Result<AgentResult, LoopError>:
| Variant | Cause |
|---|---|
LoopError::Provider(e) | LLM call failed |
LoopError::Tool(e) | Tool execution failed (except ModelRetry) |
LoopError::Context(e) | Context compaction failed |
LoopError::MaxTurns(n) | Turn limit reached |
LoopError::UsageLimitExceeded(msg) | Token, request, or tool call budget exceeded |
LoopError::HookTerminated(reason) | A hook returned Terminate |
LoopError::Cancelled | Cancellation token was triggered |
run_stream() sends errors as StreamEvent::Error on the channel instead of
returning them as Result.
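Handling these variants at the call site might look like the following sketch, with the variant payloads taken from the table above:

```rust
use neuron_loop::LoopError;

match agent.run_text("Summarize the report", &ctx).await {
    Ok(result) => println!("{}", result.response),
    // Turn limit: often worth surfacing partial progress to the user
    Err(LoopError::MaxTurns(n)) => eprintln!("stopped after {n} turns"),
    // A hook decided to stop the loop (e.g. a guardrail or token budget)
    Err(LoopError::HookTerminated(reason)) => eprintln!("terminated: {reason}"),
    Err(LoopError::Cancelled) => eprintln!("cancelled"),
    // Provider, tool, and context errors are usually terminal for this run
    Err(e) => return Err(e.into()),
}
```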
API reference
MCP Integration
neuron-mcp connects your agent to external tool servers using the
Model Context Protocol (MCP). It wraps the
rmcp crate (the official Rust MCP SDK) and bridges MCP tools into neuron’s
ToolRegistry so they appear like any other tool to the agent loop.
Quick Example
use std::sync::Arc;
use neuron_mcp::{McpClient, McpToolBridge, StdioConfig};
use neuron_tool::ToolRegistry;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Connect to an MCP server via stdio
let client = Arc::new(McpClient::connect_stdio(StdioConfig {
command: "npx".to_string(),
args: vec!["-y".to_string(), "@modelcontextprotocol/server-filesystem".to_string(), "/tmp".to_string()],
env: vec![],
}).await?);
// Discover tools and register them
let tools = McpToolBridge::discover(&client).await?;
let mut registry = ToolRegistry::new();
for tool in tools {
registry.register_dyn(tool);
}
Ok(())
}
API Walkthrough
McpClient
McpClient manages the connection to an MCP server. Two transports are
supported:
Stdio – spawns a child process and communicates over stdin/stdout:
use neuron_mcp::{McpClient, StdioConfig};
let client = McpClient::connect_stdio(StdioConfig {
command: "npx".to_string(),
args: vec!["-y".to_string(), "@modelcontextprotocol/server-everything".to_string()],
env: vec![("NODE_ENV".to_string(), "production".to_string())],
}).await?;
Streamable HTTP – connects to a remote MCP server over HTTP with SSE:
use neuron_mcp::{McpClient, HttpConfig};
let client = McpClient::connect_http(HttpConfig {
url: "http://localhost:8080/mcp".to_string(),
auth_header: Some("Bearer my-token".to_string()),
headers: vec![],
}).await?;
Once connected, McpClient provides methods for all MCP operations:
| Method | Description |
|---|---|
list_tools(cursor) | List available tools (paginated) |
list_all_tools() | List all tools (fetches every page) |
call_tool(name, arguments) | Call a tool with a JSON argument map |
call_tool_json(name, value) | Convenience: accepts serde_json::Value |
list_resources(cursor) | List available resources |
read_resource(uri) | Read a resource by URI |
list_prompts(cursor) | List available prompt templates |
get_prompt(name, arguments) | Retrieve an expanded prompt |
is_closed() | Check if the transport is closed |
peer() | Access the underlying rmcp peer for advanced use |
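For instance, once connected you might enumerate a server's tools and call one directly. The tool name and arguments below are illustrative for a filesystem server:

```rust
use serde_json::json;

// Enumerate every tool the server exposes (fetches all pages)
let tools = client.list_all_tools().await?;
println!("server exposes {} tools", tools.len());

// Invoke one by name with JSON arguments
let result = client
    .call_tool_json("read_file", json!({ "path": "/tmp/notes.txt" }))
    .await?;
```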
McpToolBridge
McpToolBridge bridges a single MCP tool into neuron’s ToolDyn trait. When
the agent loop calls a bridged tool, the call is forwarded to the MCP server
via McpClient::call_tool.
The typical workflow uses McpToolBridge::discover(), which lists all tools
from the server and returns them as Arc<dyn ToolDyn> ready for registration:
use std::sync::Arc;
use neuron_mcp::{McpClient, McpToolBridge};
use neuron_tool::ToolRegistry;
let client = Arc::new(McpClient::connect_stdio(config).await?);
// Discover returns Vec<Arc<dyn ToolDyn>>
let bridges = McpToolBridge::discover(&client).await?;
let mut registry = ToolRegistry::new();
for bridge in bridges {
registry.register_dyn(bridge);
}
An equivalent convenience method exists on McpClient itself:
let tools = McpClient::discover_tools(&client).await?;
You can also bridge a single known tool manually:
use neuron_mcp::McpToolBridge;
let bridge = McpToolBridge::new(Arc::clone(&client), tool_definition);
registry.register_dyn(Arc::new(bridge));
McpServer
McpServer does the reverse: it exposes a neuron ToolRegistry as an MCP
server, making your tools available to any MCP client.
use neuron_mcp::McpServer;
use neuron_tool::ToolRegistry;
let mut registry = ToolRegistry::new();
// ... register your tools ...
let server = McpServer::new(registry)
.with_name("my-agent-tools")
.with_version("1.0.0")
.with_instructions("Tools for file manipulation");
// Serve over stdio (blocks until client disconnects)
server.serve_stdio().await?;
The server handles tools/list and tools/call MCP requests by delegating
to the underlying ToolRegistry.
Configuration Types
- StdioConfig – command, args, env for spawning a child process
- HttpConfig – url, auth_header, headers for HTTP connections
- PaginatedList<T> – generic wrapper with items and next_cursor
MCP-Specific Types
These types represent MCP protocol objects:
- McpResource – uri, name, title, description, mime_type
- McpResourceContents – uri, mime_type, text or blob
- McpPrompt – name, title, description, arguments
- McpPromptArgument – name, description, required
Error Handling
All MCP operations return Result<_, McpError>. The variants are:
- McpError::Connection – failed to connect (process spawn or HTTP)
- McpError::Initialization – MCP handshake failed
- McpError::ToolCall – a tool call returned an error
- McpError::Transport – transport-level communication error
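A call site might branch on these variants like so. This is a sketch; it assumes each variant wraps a displayable source error:

```rust
use neuron_mcp::McpError;

match client.call_tool_json("read_file", args).await {
    Ok(output) => { /* use the tool output */ }
    // The server ran the tool and it failed: often worth surfacing
    // to the model so it can adjust its arguments
    Err(McpError::ToolCall(e)) => eprintln!("tool call failed: {e}"),
    // Transport problems may warrant reconnecting to the server
    Err(McpError::Transport(e)) => eprintln!("transport error: {e}"),
    Err(e) => return Err(e.into()),
}
```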
Advanced Usage
Mixing MCP and Native Tools
MCP tools and native tools live side by side in the same ToolRegistry. The
agent loop cannot tell the difference:
use std::sync::Arc;
use neuron_mcp::{McpClient, McpToolBridge, StdioConfig};
use neuron_tool::ToolRegistry;
let mut registry = ToolRegistry::new();
// Register a native tool
registry.register(MyNativeTool);
// Register MCP tools from a filesystem server
let fs_client = Arc::new(McpClient::connect_stdio(fs_config).await?);
for tool in McpToolBridge::discover(&fs_client).await? {
registry.register_dyn(tool);
}
// Register MCP tools from a different server
let db_client = Arc::new(McpClient::connect_http(db_config).await?);
for tool in McpToolBridge::discover(&db_client).await? {
registry.register_dyn(tool);
}
// All tools are now available to the agent loop
Accessing the Raw rmcp Peer
For operations not covered by McpClient’s methods, access the underlying
rmcp::Peer directly:
let peer = client.peer();
// Use any rmcp method directly
Tool Annotations
MCP tools can carry behavioral annotations (read-only, destructive, idempotent,
open-world). These are preserved during bridging and available on the
ToolDefinition:
let tools = client.list_all_tools().await?;
for tool in &tools {
if let Some(ann) = &tool.annotations {
println!("{}: read_only={:?}", tool.name, ann.read_only_hint);
}
}
API Docs
Full API documentation: neuron-mcp on docs.rs
Runtime
neuron-runtime provides production infrastructure for agents: session
persistence, input/output guardrails, structured observability, durable
execution, and sandboxed tool execution.
Quick Example
use std::path::PathBuf;
use neuron_runtime::*;
use neuron_types::Message;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Persist sessions to disk
let storage = FileSessionStorage::new(PathBuf::from("./sessions"));
let mut session = Session::new("s-1", PathBuf::from("."));
session.messages.push(Message::user("Hello"));
storage.save(&session).await?;
// Load it back later
let loaded = storage.load("s-1").await?;
println!("{} messages", loaded.messages.len());
Ok(())
}
Sessions
Sessions store conversation message history along with metadata (timestamps,
token usage, custom state). The SessionStorage trait defines how sessions
are persisted.
Session Type
use neuron_runtime::Session;
let mut session = Session::new("chat-42", "/home/user/project".into());
session.messages.push(Message::user("What is Rust?"));
session.state.custom.insert("theme".to_string(), serde_json::json!("dark"));
A Session contains:
| Field | Type | Description |
|---|---|---|
id | String | Unique session identifier |
messages | Vec<Message> | Conversation history |
state | SessionState | Working directory, token usage, event count, custom metadata |
created_at | DateTime<Utc> | Creation timestamp |
updated_at | DateTime<Utc> | Last update timestamp |
SessionState holds mutable runtime data: cwd, token_usage, event_count,
and a custom map for arbitrary key-value metadata.
SessionStorage Trait
pub trait SessionStorage: Send + Sync {
async fn save(&self, session: &Session) -> Result<(), StorageError>;
async fn load(&self, id: &str) -> Result<Session, StorageError>;
async fn list(&self) -> Result<Vec<SessionSummary>, StorageError>;
async fn delete(&self, id: &str) -> Result<(), StorageError>;
}
Two implementations ship with the crate:
InMemorySessionStorage – backed by Arc<RwLock<HashMap>>, suitable for
testing and short-lived processes:
let storage = InMemorySessionStorage::new();
storage.save(&session).await?;
FileSessionStorage – one JSON file per session at
{directory}/{session_id}.json. Creates the directory on first save:
let storage = FileSessionStorage::new(PathBuf::from("./sessions"));
storage.save(&session).await?;
// Creates ./sessions/chat-42.json
Session Summaries
Session::summary() returns a lightweight SessionSummary without the full
message history – useful for listing sessions:
let summaries = storage.list().await?;
for s in &summaries {
println!("{}: {} messages, created {}", s.id, s.message_count, s.created_at);
}
Session persistence with AgentLoop
Session persistence and durable execution are two complementary layers.
SessionStorage saves conversation state between runs: when the process exits,
the session is written to disk (or another backend), and a new process can load
it later to resume. DurableContext protects individual operations during a
run: if the process crashes mid-tool-call, the durable engine journals and
replays to recover. Used together, DurableContext guards the current run while
SessionStorage carries state across runs.
use neuron_runtime::{Session, FileSessionStorage, SessionStorage};
use neuron_loop::AgentLoop;
use neuron_types::Message;
// --- Save after a conversation ---
let result = agent.run_text("Hello!", &ctx).await?;
let mut session = Session::new("session-123", std::env::current_dir()?);
session.messages = result.messages.clone();
session.state.token_usage = result.usage.clone();
let storage = FileSessionStorage::new("./sessions".into());
storage.save(&session).await?;
// --- Resume later (new process) ---
let storage = FileSessionStorage::new("./sessions".into());
let loaded = storage.load("session-123").await?;
// Build a new agent and continue the conversation
let mut agent = AgentLoop::builder(provider, context)
.tools(tools)
.system_prompt("You are a helpful assistant.")
.build();
// Feed the loaded history back by running with the conversation context
// The previous messages provide continuity
let resume_msg = Message::user("Continue where we left off.");
let result = agent.run(resume_msg, &ctx).await?;
AgentResult.messages contains the full conversation history including tool
calls and results, so saving it preserves the complete context. When you load
and resume, the model sees the entire prior exchange — tool invocations, tool
outputs, assistant reasoning — giving it full continuity without re-executing
any previous steps.
Guardrails
Guardrails are safety checks that run on input (before it reaches the LLM) or output (before it reaches the user).
GuardrailResult
Every guardrail check returns one of three outcomes:
- Pass – input/output is acceptable
- Tripwire(reason) – immediately halt execution
- Warn(reason) – allow execution but log a warning
InputGuardrail and OutputGuardrail
use std::future::Future;
use neuron_runtime::{InputGuardrail, GuardrailResult};
struct NoSecrets;
impl InputGuardrail for NoSecrets {
fn check(&self, input: &str) -> impl Future<Output = GuardrailResult> + Send {
async move {
if input.contains("API_KEY") || input.contains("sk-") {
GuardrailResult::Tripwire("Input contains a secret".to_string())
} else {
GuardrailResult::Pass
}
}
}
}
Output guardrails use the same pattern via the OutputGuardrail trait.
Running Multiple Guardrails
Use run_input_guardrails and run_output_guardrails to evaluate a sequence.
They return the first non-Pass result, or Pass if all checks pass:
use neuron_runtime::{run_input_guardrails, ErasedInputGuardrail};
let no_secrets = NoSecrets;
let no_sql = NoSqlInjection;
let guardrails: Vec<&dyn ErasedInputGuardrail> = vec![&no_secrets, &no_sql];
let result = run_input_guardrails(&guardrails, user_input).await;
if result.is_tripwire() {
// Reject the input
}
GuardrailHook
GuardrailHook wraps guardrails as an ObservabilityHook, integrating them
directly into the agent loop lifecycle:
- Input guardrails fire on HookEvent::PreLlmCall
- Output guardrails fire on HookEvent::PostLlmCall
- Tripwire maps to HookAction::Terminate
- Warn logs via tracing::warn! and returns HookAction::Continue
- Pass returns HookAction::Continue
use neuron_runtime::GuardrailHook;
use neuron_loop::AgentLoop;
let hook = GuardrailHook::new()
.input_guardrail(NoSecrets)
.output_guardrail(NoProfanity);
let mut agent = AgentLoop::builder(provider, context)
.tools(registry)
.build();
agent.add_hook(hook);
Complete guardrail integration with AgentLoop
The InputGuardrail example above shows how to check user input. Output
guardrails follow the same trait pattern via OutputGuardrail. Here is a
complete output guardrail that detects PII (email addresses and phone numbers)
in the model’s response, wired into AgentLoop end-to-end.
Implement the guardrail:
use std::future::Future;
use neuron_runtime::{OutputGuardrail, GuardrailResult};
struct NoPiiOutput;
impl OutputGuardrail for NoPiiOutput {
fn check(&self, output: &str) -> impl Future<Output = GuardrailResult> + Send {
async move {
// Check for email addresses
if output.contains('@') && output.contains('.') {
return GuardrailResult::Tripwire(
"Response contains a potential email address".to_string(),
);
}
// Check for phone number patterns (sequences of 10+ digits)
let digit_count = output.chars().filter(|c| c.is_ascii_digit()).count();
if digit_count >= 10 {
return GuardrailResult::Tripwire(
"Response contains a potential phone number".to_string(),
);
}
GuardrailResult::Pass
}
}
}
Wire it into AgentLoop:
use neuron_runtime::GuardrailHook;
use neuron_loop::{AgentLoop, LoopError};
let guardrail_hook = GuardrailHook::builder()
.output_guardrail(NoPiiOutput)
.build();
let mut agent = AgentLoop::builder(provider, context)
.tools(tools)
.hook(guardrail_hook)
.build();
// Handle guardrail rejection
match agent.run_text("What's John's email?", &ctx).await {
Ok(result) => println!("Response: {}", result.response),
Err(LoopError::HookTerminated(reason)) => {
println!("Guardrail blocked: {reason}");
// Present safe fallback to user
}
Err(e) => eprintln!("Other error: {e}"),
}
Guardrails are gates, not transformers — they accept (Pass), reject
(Tripwire), or flag (Warn), but do not modify content. To transform output,
post-process the AgentResult after run() returns.
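For example, a hypothetical post-processing helper (plain Rust, not part of neuron) could redact email-like tokens from result.response after the run completes:

```rust
/// Replace whitespace-delimited tokens containing '@' and '.' with a
/// placeholder. Deliberately crude: a real redactor would use a vetted
/// PII-detection library rather than string matching.
fn redact_emails(text: &str) -> String {
    text.split(' ')
        .map(|word| {
            if word.contains('@') && word.contains('.') {
                "[redacted]"
            } else {
                word
            }
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```

Then apply it after the run: `let safe = redact_emails(&result.response);`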
TracingHook
TracingHook is a concrete ObservabilityHook that emits structured
tracing events for every stage of the agent loop.
Wire it to any tracing-compatible subscriber for stdout logging,
OpenTelemetry export, or custom collectors.
use neuron_runtime::TracingHook;
let hook = TracingHook::new();
// Add to agent loop: agent.add_hook(hook);
TracingHook always returns HookAction::Continue – it observes but never
controls execution. It maps 8 hook events to structured spans:
| Event | Level | Span name |
|---|---|---|
LoopIteration | DEBUG | neuron.loop.iteration |
PreLlmCall | DEBUG | neuron.llm.pre_call |
PostLlmCall | DEBUG | neuron.llm.post_call |
PreToolExecution | DEBUG | neuron.tool.pre_execution |
PostToolExecution | DEBUG | neuron.tool.post_execution |
ContextCompaction | INFO | neuron.context.compaction |
SessionStart | INFO | neuron.session.start |
SessionEnd | INFO | neuron.session.end |
Set RUST_LOG=debug to see all events:
RUST_LOG=debug cargo run --example tracing_hook -p neuron-runtime
PermissionPolicy
The PermissionPolicy trait approves or denies tool calls before execution.
It returns a PermissionDecision:
- Allow – proceed with the tool call
- Deny(reason) – reject the call
- Ask(prompt) – ask the user for confirmation
use neuron_types::{PermissionPolicy, PermissionDecision};
struct ReadOnlyPolicy;
impl PermissionPolicy for ReadOnlyPolicy {
fn check(&self, tool_name: &str, _input: &serde_json::Value) -> PermissionDecision {
match tool_name {
"read_file" | "list_dir" => PermissionDecision::Allow,
_ => PermissionDecision::Deny(format!("{tool_name} is not allowed in read-only mode")),
}
}
}
DurableContext
DurableContext wraps LLM calls and tool execution so durable engines
(Temporal, Restate, Inngest) can journal, replay, and recover from crashes.
The Trait
pub trait DurableContext: Send + Sync {
async fn execute_llm_call(&self, request: CompletionRequest, options: ActivityOptions) -> Result<CompletionResponse, DurableError>;
async fn execute_tool(&self, tool_name: &str, input: Value, ctx: &ToolContext, options: ActivityOptions) -> Result<ToolOutput, DurableError>;
async fn wait_for_signal<T: DeserializeOwned>(&self, signal_name: &str, timeout: Duration) -> Result<Option<T>, DurableError>;
fn should_continue_as_new(&self) -> bool;
async fn continue_as_new(&self, state: Value) -> Result<(), DurableError>;
async fn sleep(&self, duration: Duration);
fn now(&self) -> DateTime<Utc>;
}
LocalDurableContext
For local development and testing, LocalDurableContext passes through to the
provider and tools directly – no journaling, no replay:
use std::sync::Arc;
use neuron_runtime::LocalDurableContext;
use neuron_tool::ToolRegistry;
let provider = Arc::new(my_provider);
let tools = Arc::new(ToolRegistry::new());
let durable = LocalDurableContext::new(provider, tools);
// Use in the agent loop
agent.set_durability(durable);
In production, swap LocalDurableContext for a Temporal or Restate
implementation. The calling code stays the same.
ActivityOptions
Controls timeout and retry behavior for durable activities:
use neuron_types::{ActivityOptions, RetryPolicy};
use std::time::Duration;
let options = ActivityOptions {
start_to_close_timeout: Duration::from_secs(30),
heartbeat_timeout: Some(Duration::from_secs(10)),
retry_policy: Some(RetryPolicy {
initial_interval: Duration::from_secs(1),
backoff_coefficient: 2.0,
maximum_attempts: 3,
maximum_interval: Duration::from_secs(30),
non_retryable_errors: vec!["Authentication".to_string()],
}),
};
Sandbox
The Sandbox trait wraps tool execution with isolation – filesystem
restrictions, network limits, or container boundaries:
use neuron_runtime::{Sandbox, NoOpSandbox};
// NoOpSandbox passes through directly (no isolation)
let sandbox = NoOpSandbox;
let output = sandbox.execute_tool(&*tool, input, &ctx).await?;
Implement Sandbox for your own isolation strategy:
use neuron_runtime::Sandbox;
use neuron_types::{ToolDyn, ToolContext, ToolOutput, SandboxError};
struct DockerSandbox { image: String }
impl Sandbox for DockerSandbox {
async fn execute_tool(
&self,
tool: &dyn ToolDyn,
input: serde_json::Value,
ctx: &ToolContext,
) -> Result<ToolOutput, SandboxError> {
// Spawn a container, execute tool inside, return output
todo!()
}
}
API Docs
Full API documentation: neuron-runtime on docs.rs
Embeddings
neuron provides a provider-agnostic EmbeddingProvider trait for generating
text embeddings, with an OpenAI implementation in neuron-provider-openai.
Quick Example
use neuron_provider_openai::OpenAi;
use neuron_types::{EmbeddingProvider, EmbeddingRequest};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = OpenAi::from_env()?;
let response = client.embed(EmbeddingRequest {
model: "text-embedding-3-small".to_string(),
input: vec![
"Rust is a systems programming language.".to_string(),
"Python is great for scripting.".to_string(),
],
dimensions: None,
..Default::default()
}).await?;
println!("Got {} embeddings", response.embeddings.len());
println!("First vector has {} dimensions", response.embeddings[0].len());
println!("Tokens used: {}", response.usage.total_tokens);
Ok(())
}
API Walkthrough
EmbeddingProvider Trait
The trait is defined in neuron-types and kept separate from Provider
because not all embedding models support chat completions and not all chat
providers support embeddings. Implement both on a single struct when a provider
supports both capabilities.
pub trait EmbeddingProvider: Send + Sync {
async fn embed(
&self,
request: EmbeddingRequest,
) -> Result<EmbeddingResponse, EmbeddingError>;
}
The trait uses RPITIT (return position impl trait in trait) and is not
object-safe. Use generics <E: EmbeddingProvider> for composition.
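In practice that means composing over a generic parameter rather than a trait object. A hypothetical indexing helper might look like this:

```rust
use neuron_types::{EmbeddingProvider, EmbeddingRequest, EmbeddingError};

// Generic over E because the trait is not object-safe, so
// `&dyn EmbeddingProvider` would not compile.
async fn embed_corpus<E: EmbeddingProvider>(
    provider: &E,
    docs: Vec<String>,
) -> Result<Vec<Vec<f32>>, EmbeddingError> {
    let response = provider
        .embed(EmbeddingRequest {
            model: "text-embedding-3-small".to_string(),
            input: docs,
            ..Default::default()
        })
        .await?;
    Ok(response.embeddings)
}
```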
EmbeddingRequest
pub struct EmbeddingRequest {
/// The embedding model (e.g. "text-embedding-3-small").
pub model: String,
/// Text inputs to embed. Multiple strings are batched into one API call.
pub input: Vec<String>,
/// Optional output dimensionality (not all models support this).
pub dimensions: Option<usize>,
/// Provider-specific extra fields forwarded verbatim.
pub extra: HashMap<String, serde_json::Value>,
}
- model – the model identifier. The OpenAI implementation defaults to text-embedding-3-small when this is empty.
- input – a batch of strings. Each string produces one embedding vector in the response. Batching multiple inputs in a single request is more efficient than making separate calls.
- dimensions – reduces the output dimensionality when supported by the model (e.g., OpenAI's text-embedding-3-small supports 256, 512, or 1536).
- extra – a map of provider-specific fields merged directly into the request body. Useful for options not covered by the common fields.
EmbeddingResponse
pub struct EmbeddingResponse {
/// One embedding vector per input string, in the same order.
pub embeddings: Vec<Vec<f32>>,
/// The model that generated the embeddings.
pub model: String,
/// Token usage statistics.
pub usage: EmbeddingUsage,
}
The embeddings vector is always in the same order as the input vector in
the request. Each inner Vec<f32> is a dense floating-point embedding.
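Dense vectors like these are typically compared with cosine similarity. The helper below is plain Rust, not part of neuron:

```rust
/// Cosine similarity between two equal-length embedding vectors.
/// Returns 0.0 when either vector has zero magnitude.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embeddings must have equal dimensions");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}
```

For example, `cosine_similarity(&response.embeddings[0], &response.embeddings[1])` scores how semantically close the first two inputs are.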
EmbeddingUsage
pub struct EmbeddingUsage {
/// Number of tokens in the input.
pub prompt_tokens: usize,
/// Total tokens consumed.
pub total_tokens: usize,
}
EmbeddingError
All embedding operations return Result<_, EmbeddingError>. The variants are:
| Variant | Description | Retryable? |
|---|---|---|
Authentication(String) | Invalid API key or forbidden | No |
RateLimit { retry_after } | Provider rate limit hit | Yes |
InvalidRequest(String) | Bad model name, empty input, etc. | No |
Network(source) | Connection failure, DNS error | Yes |
Other(source) | Catch-all for unexpected errors | Depends |
Use error.is_retryable() to decide whether to retry:
match client.embed(request).await {
Ok(response) => { /* use embeddings */ }
Err(e) if e.is_retryable() => { /* back off and retry */ }
Err(e) => { /* terminal error, report to user */ }
}
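Building on is_retryable(), a capped retry loop with exponential backoff might look like this sketch. It assumes EmbeddingRequest implements Clone, and the delay constants are arbitrary:

```rust
use std::time::Duration;

let mut attempts = 0;
let response = loop {
    match client.embed(request.clone()).await {
        Ok(r) => break r,
        // Transient failure: back off 2s, 4s, 8s, then give up
        Err(e) if e.is_retryable() && attempts < 3 => {
            attempts += 1;
            tokio::time::sleep(Duration::from_secs(1 << attempts)).await;
        }
        // Terminal error (bad request, auth): propagate immediately
        Err(e) => return Err(e.into()),
    }
};
```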
OpenAI Implementation
neuron-provider-openai implements EmbeddingProvider on the same OpenAi
struct that implements Provider. No additional setup is needed – the
embedding calls reuse the same API key, base URL, and HTTP client.
use neuron_provider_openai::OpenAi;
use neuron_types::{EmbeddingProvider, EmbeddingRequest};
// Same client for both chat completions and embeddings
let client = OpenAi::new("sk-...")
.base_url("https://api.openai.com");
// Chat completion
let chat_response = client.complete(completion_request).await?;
// Embedding
let embed_response = client.embed(EmbeddingRequest {
model: "text-embedding-3-small".to_string(),
input: vec!["Hello world".to_string()],
..Default::default()
}).await?;
Default Model
When EmbeddingRequest.model is empty, the OpenAI implementation defaults to
text-embedding-3-small.
Controlling Dimensions
Use the dimensions field to reduce output size. Smaller embeddings use less
storage and are faster to compare, at the cost of some accuracy:
let response = client.embed(EmbeddingRequest {
model: "text-embedding-3-small".to_string(),
input: vec!["hello".to_string()],
dimensions: Some(256), // Default is 1536 for this model
..Default::default()
}).await?;
assert_eq!(response.embeddings[0].len(), 256);
Provider-Specific Options
Pass extra fields that the OpenAI API supports but neuron does not model explicitly:
use std::collections::HashMap;
let mut extra = HashMap::new();
extra.insert("user".to_string(), serde_json::json!("user-123"));
let response = client.embed(EmbeddingRequest {
model: "text-embedding-3-large".to_string(),
input: vec!["text to embed".to_string()],
extra,
..Default::default()
}).await?;
Implementing a Custom EmbeddingProvider
To add embedding support for a new provider, implement the trait in your provider crate:
use std::future::Future;
use neuron_types::{EmbeddingProvider, EmbeddingRequest, EmbeddingResponse, EmbeddingError, EmbeddingUsage};
struct MyEmbeddingProvider { /* ... */ }
impl EmbeddingProvider for MyEmbeddingProvider {
fn embed(
&self,
request: EmbeddingRequest,
) -> impl Future<Output = Result<EmbeddingResponse, EmbeddingError>> + Send {
async move {
// Call your embedding API
let vectors = call_my_api(&request.input).await?;
Ok(EmbeddingResponse {
embeddings: vectors,
model: request.model,
usage: EmbeddingUsage {
prompt_tokens: 0,
total_tokens: 0,
},
})
}
}
}
API Docs
Full API documentation:
- Trait and types: neuron-types on docs.rs
- OpenAI implementation: neuron-provider-openai on docs.rs
Testing Agents
neuron is designed for testability. Every block – providers, tools, context strategies, guardrails – can be tested independently without real API calls.
Quick Example
use std::sync::Mutex;
use neuron_types::*;
struct MockProvider {
responses: Mutex<Vec<CompletionResponse>>,
}
impl Provider for MockProvider {
async fn complete(&self, _req: CompletionRequest) -> Result<CompletionResponse, ProviderError> {
let mut responses = self.responses.lock().unwrap();
Ok(responses.remove(0))
}
async fn complete_stream(&self, _req: CompletionRequest) -> Result<StreamHandle, ProviderError> {
Err(ProviderError::InvalidRequest("mock does not stream".into()))
}
}
Testing Strategies
1. Mock Providers
A mock provider returns fixed CompletionResponse values in sequence. This
lets you test agent behavior without network calls or API keys.
Single-turn response (model ends the conversation):
fn end_turn_response(text: &str) -> CompletionResponse {
CompletionResponse {
id: "mock-1".to_string(),
model: "mock".to_string(),
message: Message::assistant(text),
usage: TokenUsage::default(),
stop_reason: StopReason::EndTurn,
}
}
Tool-calling response (model requests a tool call):
fn tool_call_response(tool_name: &str, tool_id: &str, args: serde_json::Value) -> CompletionResponse {
CompletionResponse {
id: "mock-2".to_string(),
model: "mock".to_string(),
message: Message {
role: Role::Assistant,
content: vec![ContentBlock::ToolUse {
id: tool_id.to_string(),
name: tool_name.to_string(),
input: args,
}],
},
usage: TokenUsage::default(),
stop_reason: StopReason::ToolUse,
}
}
Multi-turn mock – queue responses to simulate a full conversation:
let provider = MockProvider {
responses: Mutex::new(vec![
// Turn 1: model calls a tool
tool_call_response("get_weather", "call-1", serde_json::json!({"city": "Tokyo"})),
// Turn 2: model responds with the final answer
end_turn_response("The weather in Tokyo is 72F and sunny."),
]),
};
2. Testing Tools Independently
Tools implement a trait with typed arguments and outputs. Test them directly without involving a provider or loop:
use neuron_types::{Tool, ToolContext};
#[tokio::test]
async fn test_weather_tool() {
let tool = GetWeather;
let ctx = ToolContext::default();
let result = tool.call(WeatherArgs { city: "Tokyo".to_string() }, &ctx).await;
assert!(result.is_ok());
assert!(result.unwrap().contains("Tokyo"));
}
ToolContext::default() provides sensible defaults (cwd from the environment,
empty session ID, fresh cancellation token). Override fields when your tool
depends on them:
let ctx = ToolContext {
session_id: "test-session".to_string(),
cwd: PathBuf::from("/tmp/test"),
..Default::default()
};
3. Testing Tools via the Registry
To test the full JSON serialization/deserialization path through the
ToolRegistry:
use neuron_tool::ToolRegistry;
use neuron_types::ToolContext;
#[tokio::test]
async fn test_tool_via_registry() {
let mut registry = ToolRegistry::new();
registry.register(GetWeather);
let ctx = ToolContext::default();
let input = serde_json::json!({"city": "London"});
let output = registry.execute("get_weather", input, &ctx).await.unwrap();
assert!(!output.is_error);
// Check structured output
let text = &output.content[0];
match text {
neuron_types::ContentItem::Text(t) => assert!(t.contains("London")),
_ => panic!("expected text content"),
}
}
4. Testing Context Strategies
Context strategies are pure functions on message lists. Test them with synthetic data:
use neuron_context::SlidingWindowStrategy;
use neuron_types::{ContextStrategy, Message};
#[tokio::test]
async fn test_sliding_window() {
let strategy = SlidingWindowStrategy::new(3, 100_000);
// Create a long conversation
let messages: Vec<Message> = (0..10)
.map(|i| Message::user(format!("Message {i}")))
.collect();
assert!(strategy.should_compact(&messages, 150_000));
let compacted = strategy.compact(messages).await.unwrap();
assert!(compacted.len() <= 3);
}
5. Testing Guardrails
Guardrails are async functions on strings – no provider needed:
use neuron_runtime::{InputGuardrail, GuardrailResult};
#[tokio::test]
async fn test_no_secrets_guardrail() {
let guardrail = NoSecrets;
let result = guardrail.check("What is Rust?").await;
assert!(result.is_pass());
let result = guardrail.check("My API_KEY is abc123").await;
assert!(result.is_tripwire());
}
6. Testing the Full Agent Loop
Combine a mock provider with real tools to test the complete agent loop:
use std::sync::Mutex;
use neuron_loop::AgentLoop;
use neuron_tool::ToolRegistry;
use neuron_context::SlidingWindowStrategy;
use neuron_types::*;
#[tokio::test]
async fn test_agent_loop_with_tool_call() {
// Set up mock provider with two responses:
// 1. Model calls the echo tool
// 2. Model produces a final answer
let provider = MockProvider {
responses: Mutex::new(vec![
tool_call_response("echo", "call-1", serde_json::json!({"text": "hello"})),
end_turn_response("The echo tool returned: hello"),
]),
};
let mut tools = ToolRegistry::new();
tools.register(EchoTool);
let context = SlidingWindowStrategy::new(10, 100_000);
let mut agent = AgentLoop::builder(provider, context)
.tools(tools)
.system_prompt("You are a test agent.")
.max_turns(5)
.build();
let ctx = ToolContext::default();
let result = agent.run(Message::user("Echo hello"), &ctx).await.unwrap();
assert_eq!(result.turns, 2);
assert!(result.response.contains("hello"));
}
7. HTTP-Level Integration Tests with wiremock
For testing actual HTTP request/response mapping without calling the real API,
use wiremock to stand up a local mock server:
use wiremock::{Mock, MockServer, ResponseTemplate};
use wiremock::matchers::{method, path};
use neuron_provider_openai::OpenAi;
use neuron_types::*;
#[tokio::test]
async fn test_openai_provider_http() {
let server = MockServer::start().await;
// Mock the OpenAI completions endpoint
Mock::given(method("POST"))
.and(path("/v1/chat/completions"))
.respond_with(ResponseTemplate::new(200).set_body_json(serde_json::json!({
"id": "chatcmpl-123",
"model": "gpt-4o",
"choices": [{
"index": 0,
"message": { "role": "assistant", "content": "Hello!" },
"finish_reason": "stop"
}],
"usage": { "prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15 }
})))
.mount(&server)
.await;
let client = OpenAi::new("test-key").base_url(server.uri());
let response = client.complete(CompletionRequest {
model: "gpt-4o".to_string(),
messages: vec![Message::user("Hi")],
..Default::default()
}).await.unwrap();
assert_eq!(response.stop_reason, StopReason::EndTurn);
}
This tests the full serialization/deserialization path through the provider implementation without any network calls to OpenAI.
Testing Patterns Summary
| What to test | Approach | Needs API key? |
|---|---|---|
| Individual tools | Call tool.call(args, ctx) directly | No |
| Tool JSON path | Use ToolRegistry::execute() | No |
| Context strategy | Call should_compact() / compact() with synthetic messages | No |
| Guardrails | Call guardrail.check(text) | No |
| Single-turn agent | Mock provider + AgentLoop::run() | No |
| Multi-turn agent | Mock provider with queued responses | No |
| Provider HTTP mapping | wiremock + real provider | No |
| End-to-end integration | Real provider + real tools | Yes |
Tips
- Use `..Default::default()` on `CompletionRequest`, `TokenUsage`, and `ToolContext` to avoid breaking tests when new fields are added.
- Keep mock providers simple: `Mutex<Vec<CompletionResponse>>` covers most patterns.
- Test `ToolError::ModelRetry` by returning it from a mock tool – verify the loop converts it to an error tool result and the model gets another chance.
- Use `StopReason::EndTurn` for final responses and `StopReason::ToolUse` for tool-calling turns in your mock data.
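The queue-backed mock pattern above can be sketched in isolation. The types below are toy stand-ins for neuron-types' `CompletionResponse` (the real struct has more fields); only the `Mutex<Vec<...>>` scripting pattern is the point:

```rust
use std::sync::Mutex;

// Toy stand-in for neuron-types' CompletionResponse -- real fields differ.
#[derive(Clone, Debug, PartialEq)]
struct CompletionResponse {
    text: String,
}

// Queue-backed mock: each call pops the next scripted response.
struct MockProvider {
    responses: Mutex<Vec<CompletionResponse>>,
}

impl MockProvider {
    fn new(mut responses: Vec<CompletionResponse>) -> Self {
        // Reverse so pop() yields responses in the order they were scripted.
        responses.reverse();
        Self { responses: Mutex::new(responses) }
    }

    fn complete(&self) -> CompletionResponse {
        self.responses
            .lock()
            .unwrap()
            .pop()
            .expect("mock ran out of scripted responses")
    }
}
```

Because the mock panics when the script runs dry, a test that makes an unexpected extra LLM call fails loudly instead of hanging.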
API Docs
Full API documentation:
- Types: neuron-types on docs.rs
- Tool registry: neuron-tool on docs.rs
- Agent loop: neuron-loop on docs.rs
Observability
neuron provides observability through the ObservabilityHook trait and the
neuron-otel crate, which implements OpenTelemetry instrumentation following
the GenAI semantic conventions.
The ObservabilityHook trait
The ObservabilityHook trait (defined in neuron-types) is the extension point
for logging, metrics, and telemetry. Hooks receive events at each step of the
agent loop and can observe or control execution.
pub trait ObservabilityHook: Send + Sync {
fn on_event(&self, event: HookEvent<'_>) -> impl Future<Output = Result<HookAction, HookError>> + Send;
}
See the agent loop guide for details on hook
events and the HookAction enum.
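To make the hook shape concrete, here is a minimal counting hook. This is a sketch with toy stand-ins: the real `on_event` is async, the real `HookEvent<'_>` borrows loop state, and the real `HookAction`/`HookError` enums have more variants than shown here:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Toy stand-ins for the neuron-types definitions (assumed shapes).
#[derive(Debug)]
enum HookEvent {
    PreLlmCall,
    PostToolExecution { tool_name: String },
}

enum HookAction {
    Continue,
}

// A counting hook: observes events without altering execution.
struct CountingHook {
    events_seen: AtomicUsize,
}

impl CountingHook {
    // Synchronous for brevity; the real trait method returns a Future.
    fn on_event(&self, event: HookEvent) -> HookAction {
        self.events_seen.fetch_add(1, Ordering::Relaxed);
        eprintln!("hook saw: {event:?}");
        HookAction::Continue
    }
}
```

Returning `HookAction::Continue` lets the loop proceed; a guardrail-style hook would return a terminating action instead.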
neuron-otel – OpenTelemetry instrumentation
neuron-otel provides OtelHook, an ObservabilityHook implementation that
emits structured tracing spans using the OpenTelemetry GenAI semantic conventions
(gen_ai.* attributes).
Quick example
use neuron_otel::OtelHook;
use neuron_loop::AgentLoop;
let agent = AgentLoop::builder(provider, context)
.tools(registry)
.hook(OtelHook::default())
.build();
That’s it. OtelHook emits spans for every LLM call, tool execution, and loop
iteration, with attributes following the gen_ai.* namespace.
What gets traced
OtelHook emits spans at each hook event:
| Event | Span name | Key attributes |
|---|---|---|
| `LoopIteration` | `gen_ai.loop.iteration` | `gen_ai.loop.turn` |
| `PreLlmCall` / `PostLlmCall` | `gen_ai.chat` | `gen_ai.request.model`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, `gen_ai.response.stop_reason` |
| `PreToolExecution` / `PostToolExecution` | `gen_ai.execute_tool` | `gen_ai.tool.name`, `gen_ai.tool.is_error` |
| `ContextCompaction` | `gen_ai.context.compaction` | `gen_ai.context.old_tokens`, `gen_ai.context.new_tokens` |
Configuration
OtelHook uses the standard tracing crate subscriber model. Configure your
tracing pipeline as usual with tracing-opentelemetry and the OpenTelemetry
SDK:
use neuron_otel::OtelHook;
use opentelemetry::trace::TracerProvider;
use tracing_subscriber::prelude::*;
// Set up your OpenTelemetry pipeline (exporter, batch processor, etc.)
let tracer_provider = /* your OTel setup */;
// Install tracing-opentelemetry layer
tracing_subscriber::registry()
.with(tracing_opentelemetry::layer().with_tracer(
tracer_provider.tracer("neuron")
))
.init();
// Add the hook to your agent
let agent = AgentLoop::builder(provider, context)
.hook(OtelHook::default())
.build();
OtelHook does not configure the OpenTelemetry pipeline itself – it only
emits tracing spans. You bring your own exporter (Jaeger, OTLP, Zipkin, etc.)
and configure it through the standard OpenTelemetry SDK.
GenAI semantic conventions
The span attributes follow the emerging OpenTelemetry GenAI semantic conventions specification. Key attributes include:
- `gen_ai.system` – the provider system (e.g., `"anthropic"`, `"openai"`)
- `gen_ai.request.model` – the model identifier
- `gen_ai.usage.input_tokens` – input token count
- `gen_ai.usage.output_tokens` – output token count
- `gen_ai.response.stop_reason` – why the model stopped generating
- `gen_ai.tool.name` – the name of the tool being called
Using with neuron-runtime’s TracingHook
neuron-runtime also ships a TracingHook for basic tracing span emission.
OtelHook and TracingHook serve different purposes:
- `TracingHook` – lightweight, emits simple `tracing` spans for local debugging. No GenAI semantic conventions. Ships with `neuron-runtime`.
- `OtelHook` – full OpenTelemetry instrumentation with GenAI semantic conventions. Designed for production observability pipelines. Ships with `neuron-otel`.
You can use both simultaneously – they are independent hooks:
use neuron_otel::OtelHook;
use neuron_runtime::TracingHook;
let agent = AgentLoop::builder(provider, context)
.hook(TracingHook::default()) // Local debug logging
.hook(OtelHook::default()) // Production OTel export
.build();
Installation
Add neuron-otel directly:
[dependencies]
neuron-otel = "0.2"
Or use the umbrella crate with the otel feature:
[dependencies]
neuron = { version = "0.2", features = ["anthropic", "otel"] }
API reference
- neuron-otel on docs.rs
Design Decisions
neuron’s architecture reflects a set of deliberate trade-offs. This page explains the key decisions and the reasoning behind them.
“serde, not serde_json”
neuron is a library of building blocks, not a framework.
The serde crate defines the Serialize and Deserialize traits.
serde_json implements them for JSON. neuron follows the same pattern:
neuron-types defines the Provider, Tool, and ContextStrategy traits.
Provider crates (neuron-provider-anthropic, neuron-provider-openai, etc.)
implement them.
This means you can pull in a single block – say, neuron-tool for the tool
registry and middleware pipeline – without buying into an opinionated agent
framework. You compose the blocks yourself, or use a framework built on top.
The scope test: If removing a feature forces every user to reimplement 200+ lines of non-trivial code (type erasure, middleware chaining, protocol handling), it belongs in neuron. If removing it forces 20-50 lines of straightforward composition, it belongs in an SDK layer above.
Block decomposition: one crate, one concern
Each crate owns exactly one concern:
| Crate | Concern |
|---|---|
| `neuron-types` | Types and trait definitions (zero logic) |
| `neuron-provider-anthropic` | Anthropic API implementation |
| `neuron-provider-openai` | OpenAI API implementation |
| `neuron-provider-ollama` | Ollama (local models) implementation |
| `neuron-tool` | Tool registry, type erasure, middleware |
| `neuron-mcp` | MCP protocol bridge (wraps rmcp) |
| `neuron-context` | Context compaction strategies |
| `neuron-loop` | The agentic while-loop |
| `neuron-runtime` | Sessions, guardrails, durability |
| `neuron` | Umbrella re-export |
Crates depend only on neuron-types and the crates directly below them in the
dependency graph. No circular dependencies. Adding a new provider never touches
the tool system. Adding a new compaction strategy never touches the loop.
Provider-per-crate (the serde pattern)
The Provider trait lives in neuron-types. Each cloud API gets its own crate:
// neuron-types/src/traits.rs
pub trait Provider: Send + Sync {
fn complete(
&self,
request: CompletionRequest,
) -> impl Future<Output = Result<CompletionResponse, ProviderError>> + Send;
fn complete_stream(
&self,
request: CompletionRequest,
) -> impl Future<Output = Result<StreamHandle, ProviderError>> + Send;
}
The trait is intentionally not object-safe (it uses RPITIT). You compose with
generics (fn run<P: Provider>(provider: &P)), which gives the compiler full
visibility for optimization.
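The generic-composition pattern can be shown self-contained. The trait and provider below are toy stand-ins (not the real neuron-types definitions), and the tiny executor exists only to drive an immediately-ready future in an example:

```rust
use std::future::Future;
use std::pin::pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Toy RPITIT trait: `complete` returns an unboxed `impl Future`,
// so there is no per-call heap allocation.
trait Provider {
    fn complete(&self, prompt: &str) -> impl Future<Output = String> + Send;
}

struct EchoProvider;

impl Provider for EchoProvider {
    async fn complete(&self, prompt: &str) -> String {
        format!("echo: {prompt}")
    }
}

// Composition is generic, not `dyn`: the compiler sees the concrete type.
async fn run<P: Provider>(provider: &P, prompt: &str) -> String {
    provider.complete(prompt).await
}

// Minimal executor for immediately-ready futures (example scaffolding only).
fn block_on_ready<F: Future>(fut: F) -> F::Output {
    fn raw() -> RawWaker {
        fn clone(_: *const ()) -> RawWaker { raw() }
        fn noop(_: *const ()) {}
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    let waker = unsafe { Waker::from_raw(raw()) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(out) => out,
        Poll::Pending => panic!("future was not immediately ready"),
    }
}
```

In real code you would drive `run` with tokio; the point is that `run::<EchoProvider>` is monomorphized, with no trait-object indirection.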
Why not a single provider crate with feature flags? Because provider APIs evolve independently. An Anthropic-specific feature (prompt caching, extended thinking) should not force a recompile of OpenAI code. Separate crates give you separate version timelines.
Message structure: flat struct over variant-per-role
neuron uses a flat Message struct:
pub struct Message {
pub role: Role,
pub content: Vec<ContentBlock>,
}
The alternative – one enum variant per role (UserMessage, AssistantMessage,
SystemMessage) – creates a combinatorial explosion of conversion code. Rig
uses the variant-per-role approach and needs roughly 300 lines of conversion
logic per provider. The flat struct maps naturally to every provider API we
studied (Anthropic, OpenAI, Ollama) with minimal translation.
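A toy version of the flat model shows why the mapping is a straight projection rather than a variant-by-variant conversion. The types below are simplified stand-ins (the real `ContentBlock` has tool-use, image, and other variants), and the "wire pair" is a hypothetical chat-completions-style target:

```rust
// Toy versions of the flat message model.
#[derive(Clone, Copy)]
enum Role {
    User,
    Assistant,
}

enum ContentBlock {
    Text(String),
}

struct Message {
    role: Role,
    content: Vec<ContentBlock>,
}

// Mapping to a chat-style wire pair is a straight projection --
// no per-role enum variants to convert between.
fn to_wire(msg: &Message) -> (&'static str, String) {
    let role = match msg.role {
        Role::User => "user",
        Role::Assistant => "assistant",
    };
    let text = msg
        .content
        .iter()
        .map(|b| match b { ContentBlock::Text(t) => t.as_str() })
        .collect::<Vec<_>>()
        .join("\n");
    (role, text)
}
```

Adding a provider adds one projection like `to_wire`, not a new conversion matrix over role variants.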
Tool middleware: axum’s from_fn, not tower’s Service/Layer
The tool middleware pipeline uses a callback-based pattern identical to axum’s
middleware::from_fn:
async fn logging_middleware(
tool_name: &str,
input: serde_json::Value,
ctx: &ToolContext,
next: ToolMiddlewareNext<'_>,
) -> Result<ToolOutput, ToolError> {
println!("calling {tool_name}");
let result = next.run(tool_name, input, ctx).await;
println!("result: {result:?}");
result
}
tower’s Service and Layer traits are designed for high-throughput
request/response pipelines where the overhead of trait objects and Pin<Box<...>>
matters. Tool calls happen at most a few times per LLM turn. The axum-style
callback is simpler to write, simpler to read, and validated by the tokio team
for exactly this kind of middleware.
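The `next.run(...)` chaining can be sketched without async machinery. This is a synchronous toy (string in, string out); the real pipeline is async and uses neuron's `ToolContext`/`ToolError` types:

```rust
// Terminal handler: the actual tool invocation at the end of the chain.
type Handler = Box<dyn Fn(&str, &str) -> String>;
// A middleware receives the call plus a `Next` that runs the rest of the chain.
type Middleware = Box<dyn for<'a> Fn(&str, &str, Next<'a>) -> String>;

struct Next<'a> {
    rest: &'a [Middleware],
    terminal: &'a Handler,
}

impl Next<'_> {
    fn run(self, tool_name: &str, input: &str) -> String {
        match self.rest.split_first() {
            // Pass control to the next middleware, with the tail of the chain.
            Some((mw, rest)) => mw(tool_name, input, Next { rest, terminal: self.terminal }),
            // Chain exhausted: invoke the tool itself.
            None => (self.terminal)(tool_name, input),
        }
    }
}

fn execute(middlewares: &[Middleware], terminal: &Handler, tool: &str, input: &str) -> String {
    Next { rest: middlewares, terminal }.run(tool, input)
}

// Example middleware, mirroring logging_middleware above.
fn logging(name: &str, input: &str, next: Next<'_>) -> String {
    let out = next.run(name, input);
    format!("[logged] {out}")
}

fn demo() -> String {
    let terminal: Handler = Box::new(|name: &str, input: &str| format!("{name}({input})"));
    let middlewares: Vec<Middleware> = vec![Box::new(logging)];
    execute(&middlewares, &terminal, "echo", "hi")
}
```

Each middleware decides whether and when to call `next.run`, so short-circuiting (permission denials, cached results) falls out of the pattern for free.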
DurableContext wraps side effects, not just observes them
Early designs had a single DurabilityHook that observed LLM calls and tool
executions. This fails for Temporal replay: an observation hook cannot prevent
a side effect from re-executing during replay.
The solution is DurableContext, which wraps side effects:
pub trait DurableContext: Send + Sync {
fn execute_llm_call(
&self,
request: CompletionRequest,
options: ActivityOptions,
) -> impl Future<Output = Result<CompletionResponse, DurableError>> + Send;
fn execute_tool(
&self,
tool_name: &str,
input: serde_json::Value,
ctx: &ToolContext,
options: ActivityOptions,
) -> impl Future<Output = Result<ToolOutput, DurableError>> + Send;
}
When a DurableContext is present, the agentic loop calls through it instead of
directly calling the provider or tools. The durable engine (Temporal, Restate,
Inngest) can journal the result, and on replay, return the journaled result
without re-executing the side effect.
A separate ObservabilityHook trait handles logging, metrics, and telemetry.
It returns HookAction (Continue, Skip, or Terminate) but does not wrap
execution.
RPITIT native async traits
neuron uses Rust 2024 edition with native impl Future return types in traits
(RPITIT). There is no #[async_trait] anywhere in the codebase:
pub trait Provider: Send + Sync {
fn complete(
&self,
request: CompletionRequest,
) -> impl Future<Output = Result<CompletionResponse, ProviderError>> + Send;
}
This avoids the heap allocation that #[async_trait] forces (one Box::pin
per call). The trade-off is that these traits are not object-safe – you must
use generics, not dyn Provider. For type-erased dispatch, neuron provides
ToolDyn with an explicit Box::pin at the erasure boundary only.
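The erasure-boundary pattern can be sketched self-contained. The trait names mirror neuron's, but the shapes are simplified assumptions (`String` in, `String` out), and the tiny executor exists only to drive a ready future in an example:

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Typed trait (toy stand-in): RPITIT, not object-safe.
trait Tool {
    fn call(&self, input: String) -> impl Future<Output = String> + Send;
}

// Object-safe companion: boxes the future exactly once, at the erasure boundary.
trait ToolDyn: Send + Sync {
    fn call_dyn(&self, input: String) -> Pin<Box<dyn Future<Output = String> + Send + '_>>;
}

// Blanket impl: every typed Tool is usable as a trait object via ToolDyn.
impl<T: Tool + Send + Sync> ToolDyn for T {
    fn call_dyn(&self, input: String) -> Pin<Box<dyn Future<Output = String> + Send + '_>> {
        Box::pin(self.call(input))
    }
}

struct EchoTool;

impl Tool for EchoTool {
    async fn call(&self, input: String) -> String {
        format!("echo: {input}")
    }
}

// Minimal executor for immediately-ready futures (example scaffolding only).
fn block_on_ready<F: Future>(fut: F) -> F::Output {
    fn raw() -> RawWaker {
        fn clone(_: *const ()) -> RawWaker { raw() }
        fn noop(_: *const ()) {}
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    let waker = unsafe { Waker::from_raw(raw()) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(out) => out,
        Poll::Pending => panic!("future was not immediately ready"),
    }
}
```

A registry can then hold heterogeneous `Box<dyn ToolDyn>` values while tool authors implement only the typed, allocation-free `Tool` trait.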
ToolError::ModelRetry for self-correction
Adopted from Pydantic AI’s pattern, ModelRetry lets a tool tell the model
to try again with different arguments:
pub enum ToolError {
NotFound(String),
InvalidInput(String),
ExecutionFailed(Box<dyn std::error::Error + Send + Sync>),
PermissionDenied(String),
Cancelled,
ModelRetry(String), // <-- hint for the model
}
When a tool returns ModelRetry("date must be in YYYY-MM-DD format"), the
loop does not propagate this as an error. Instead, it converts the hint into
an error tool result and sends it back to the model. The model sees the hint,
adjusts its arguments, and calls the tool again.
This keeps self-correction logic out of the tool implementation. The tool just says “try again, here’s why” and the loop handles the retry protocol.
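The loop-side conversion can be sketched with toy types (the real loop builds a tool-result content block and feeds it back to the provider; the struct here is a stand-in):

```rust
// Toy subset of ToolError -- the real enum has more variants.
#[derive(Debug)]
enum ToolError {
    ModelRetry(String),
    ExecutionFailed(String),
}

// Stand-in for the tool-result content sent back to the model.
struct ToolResult {
    content: String,
    is_error: bool,
}

fn to_tool_result(result: Result<String, ToolError>) -> Result<ToolResult, ToolError> {
    match result {
        Ok(content) => Ok(ToolResult { content, is_error: false }),
        // ModelRetry is not propagated -- the hint goes back to the model
        // as an error tool result so it can self-correct.
        Err(ToolError::ModelRetry(hint)) => Ok(ToolResult { content: hint, is_error: true }),
        // Other errors bubble up to the caller.
        Err(e) => Err(e),
    }
}
```

Only `ModelRetry` is absorbed; every other tool error still surfaces to the caller as a loop error.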
Server-side context compaction
The Anthropic API supports server-side context management: the client sends a
context_management field, and the server may respond with
StopReason::Compaction plus a ContentBlock::Compaction summary.
neuron models this with dedicated types:
pub struct ContextManagement {
pub edits: Vec<ContextEdit>,
}
pub enum ContextEdit {
Compact { strategy: String },
}
pub enum StopReason {
EndTurn,
ToolUse,
MaxTokens,
StopSequence,
ContentFilter,
Compaction, // <-- server compacted context
}
pub enum ContentBlock {
// ...
Compaction { content: String },
}
When the loop receives StopReason::Compaction, it continues automatically –
the server has already compacted the context, and the response contains the
compaction summary. Token usage during compaction is tracked per-iteration via
UsageIteration.
This is distinct from client-side compaction (the ContextStrategy trait),
which the loop manages locally. Both can coexist: the provider handles
server-side compaction transparently, while the context strategy handles
client-side compaction when needed.
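The loop's dispatch on stop reasons can be sketched as a pure function. Variant names follow the enum above (abridged); `LoopStep` is a hypothetical name for illustration:

```rust
// Abridged StopReason -- only the variants the dispatch cares about here.
#[derive(Debug, PartialEq)]
enum StopReason {
    EndTurn,
    ToolUse,
    Compaction,
}

#[derive(Debug, PartialEq)]
enum LoopStep {
    Done,
    RunTools,
    // Server compacted the context; keep going with the summary in history.
    Continue,
}

fn next_step(stop: StopReason) -> LoopStep {
    match stop {
        StopReason::EndTurn => LoopStep::Done,
        StopReason::ToolUse => LoopStep::RunTools,
        StopReason::Compaction => LoopStep::Continue,
    }
}
```

The key point: `Compaction` is not a terminal state, so the loop resumes without any caller involvement.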
Dependency Graph
neuron’s crates form a strict upward-pointing dependency tree. Every arrow
points toward the foundation (neuron-types), never downward. There are no
circular dependencies.
The graph
neuron-types (zero deps, the foundation)
neuron-tool-macros (zero deps, proc macro)
^
|-- neuron-provider-* (each implements Provider trait)
|-- neuron-otel (OTel instrumentation, GenAI semantic conventions)
|-- neuron-context (compaction strategies, token counting)
+-- neuron-tool (Tool trait, registry, middleware; optional dep on neuron-tool-macros)
^
|-- neuron-mcp (wraps rmcp, bridges to Tool trait)
|-- neuron-loop (provider loop with tool dispatch)
+-- neuron-runtime (sessions, DurableContext, guardrails, sandbox)
^
neuron (umbrella re-export)
^
YOUR PROJECT (SDK, CLI, TUI, GUI)
Layer by layer
neuron-types (foundation)
Zero dependencies on other neuron crates. Contains all types and trait definitions:
- Types: `Message`, `CompletionRequest`, `CompletionResponse`, `TokenUsage`, `ToolDefinition`, `ToolOutput`, `ContentBlock`, `StopReason`
- Traits: `Provider`, `EmbeddingProvider`, `Tool`, `ToolDyn`, `ContextStrategy`, `ObservabilityHook`, `DurableContext`, `PermissionPolicy`
- Errors: `ProviderError`, `ToolError`, `LoopError`, `ContextError`, `DurableError`, `HookError`, `McpError`, `EmbeddingError`, `StorageError`, `SandboxError`
Every other crate depends on neuron-types. Nothing else.
Provider crates (leaf nodes)
Each provider crate implements the Provider trait for one API:
| Crate | Provider |
|---|---|
| `neuron-provider-anthropic` | Anthropic Messages API |
| `neuron-provider-openai` | OpenAI Chat Completions / Responses API |
| `neuron-provider-ollama` | Ollama local inference |
Provider crates depend only on neuron-types (plus their HTTP client and
API-specific serialization). They never depend on each other or on higher-level
neuron crates.
Adding a new provider means creating a new crate that implements Provider.
No existing code changes.
neuron-otel (leaf node)
Implements the ObservabilityHook trait using OpenTelemetry tracing spans.
Emits structured spans with gen_ai.* attributes for LLM calls, tool
executions, and loop iterations, following the emerging OpenTelemetry GenAI
semantic conventions specification.
Depends only on neuron-types (plus tracing and opentelemetry for span
emission). Like provider crates, it is a leaf node with no knowledge of other
neuron crates.
neuron-tool-macros (leaf node)
Proc macro crate providing #[neuron_tool] for deriving Tool implementations
from annotated async functions. Zero workspace dependencies.
neuron-tool (leaf node)
Implements the tool system:
- `ToolRegistry` – stores `Arc<dyn ToolDyn>` for dynamic dispatch
- Tool middleware pipeline (axum-style `from_fn`)
- Type erasure via the `ToolDyn` blanket impl
Depends on neuron-types and optionally on neuron-tool-macros (via macros
feature flag).
neuron-mcp
Wraps the rmcp crate (the official Rust MCP SDK) and bridges MCP tools into
neuron’s ToolDyn trait. Depends on neuron-types, neuron-tool, and rmcp.
neuron-context (leaf node)
Implements ContextStrategy for client-side context compaction. Some strategies
(like summarization) optionally use a Provider for LLM calls, but the
dependency is on the trait, not on any concrete provider crate.
neuron-loop
The agentic while-loop that composes a provider and tool registry. This is the ~300-line commodity loop that every agent framework converges on. It depends on:
- `neuron-types` (for trait definitions)
- `neuron-tool` (for `ToolRegistry`)
The loop is generic over <P: Provider, C: ContextStrategy> and accepts a
ToolRegistry. neuron-context is a dev-dependency only (for tests).
neuron-runtime
Adds cross-cutting runtime concerns:
- Sessions – persistent conversation state via `StorageError`-aware backends
- `DurableContext` – wraps side effects for Temporal/Restate replay
- ObservabilityHook – logging, metrics, telemetry
- Guardrails – input/output validation
- PermissionPolicy – tool call authorization
- Sandbox – isolated tool execution environments
Depends on neuron-types and neuron-tool. neuron-loop and neuron-context
are dev-dependencies only (for tests).
neuron (umbrella)
Re-exports public items from all crates under a single neuron dependency.
Feature flags control which provider crates are included:
[dependencies]
neuron = { version = "0.2", features = ["anthropic", "openai"] }
Design rules
Arrows only point up. A crate at layer N may depend on crates at layer N-1
or below, never at layer N or above. This is enforced by Cargo.toml
dependencies – circular dependencies are a compile error in Rust.
Each block knows only about neuron-types and the blocks it directly depends on.
neuron-tool has no idea that neuron-loop exists. neuron-provider-anthropic
has no idea that neuron-runtime exists. This means you can use any block
independently.
Provider crates are fully independent. Provider crates do not depend on the
tool crate, the MCP crate, or each other. neuron-mcp, neuron-loop, and
neuron-runtime share a dependency on neuron-tool but are independent of
each other.
Practical implications
Using just the tool system:
[dependencies]
neuron-types = "0.2"
neuron-tool = "0.2"
Using just a provider for raw LLM calls:
[dependencies]
neuron-types = "0.2"
neuron-provider-anthropic = "0.2"
Using the full stack:
[dependencies]
neuron = { version = "0.2", features = ["anthropic", "openai", "mcp"] }
The dependency graph ensures that pulling in one block never forces you to compile unrelated blocks.
Comparison with Other Frameworks
neuron takes a different approach from most agent frameworks. This page compares its architecture with other popular options in the Rust and Python ecosystems.
Honest note: neuron is at an early stage. This comparison focuses on architectural differences, not feature completeness. Where other frameworks have more mature implementations, we say so.
Summary matrix
| neuron (Rust) | Rig (Rust) | ADK-Rust (Google) | OpenAI Agents SDK (Python) | Pydantic AI (Python) | |
|---|---|---|---|---|---|
| Architecture | Independent crates (building blocks) | Monolithic library | Multi-crate with DAG engine | Single package | Single package |
| Provider abstraction | Trait in types crate, impl per crate | Trait + built-in impls | Google-focused, extensible | OpenAI-only | Multi-provider |
| Tool system | Typed trait + type erasure + middleware | Typed trait, no middleware | Typed with annotations | Function decorators | Function decorators with typed args |
| Middleware | axum-style from_fn pipeline | None | None | Hooks | None |
| Usage limits | UsageLimits (tokens, requests, tool calls) | None | None | None | UsageLimits (tokens, requests) |
| Tool timeouts | TimeoutMiddleware (per-tool configurable) | None | None | None | None |
| Context management | Client-side + server-side compaction | Manual | Built-in | Built-in | Manual |
| Durable execution | DurableContext trait (Temporal/Restate) | None | None | None | None |
| Async model | RPITIT (native, no alloc) | #[async_trait] (boxed) | #[async_trait] (boxed) | Python async | Python async |
| OpenTelemetry | neuron-otel with GenAI semantic conventions | None | None | Built-in tracing | None |
| MCP support | Via neuron-mcp (wraps rmcp) | Community | Limited | Built-in | Limited |
| Graph/DAG | Not included (SDK layer) | Not included | LangGraph port | Not included | Not included |
| Maturity | Early | Established | Early | Established | Established |
Detailed comparisons
Rig (Rust)
Rig is the most established Rust agent framework. It provides a solid multi-provider abstraction and a typed tool system.
Where Rig excels:
- Mature ecosystem with multiple provider implementations
- Good documentation and examples
- Proven in production use cases
Where neuron differs:
- Crate independence. Rig is a monolithic library – you depend on `rig-core` and get everything. neuron lets you pull in just the tool system, or just a provider, without the rest.
- Message model. Rig uses a variant-per-role enum (`UserMessage`, `AssistantMessage`), which requires roughly 300 lines of conversion code per provider. neuron uses a flat `Message { role, content }` struct that maps directly to every API.
- Tool middleware. Rig has no middleware pipeline. Adding logging, rate limiting, or permission checks requires wrapping each tool individually. neuron’s middleware pipeline applies cross-cutting concerns to all tools.
- Async model. Rig uses `#[async_trait]`, which heap-allocates on every call. neuron uses RPITIT (native Rust 2024 async traits) with zero overhead for non-erased dispatch.
ADK-Rust (Google’s Agent Development Kit)
Google’s ADK-Rust is a multi-crate Rust framework that includes a port of LangGraph’s DAG execution engine.
Where ADK-Rust excels:
- Comprehensive multi-crate architecture
- Built-in DAG/graph orchestration for complex workflows
- Strong Google Cloud integration
Where neuron differs:
- No graph layer. ADK-Rust’s LangGraph port is its most complex component and, based on community feedback, its least-used. neuron deliberately omits graph orchestration – most agent use cases are sequential loops, not DAGs.
- Block independence. ADK-Rust’s crates have tighter coupling than neuron’s. neuron’s leaf crates (providers, tools, MCP) have zero knowledge of each other.
- Durable execution. neuron’s `DurableContext` trait is designed specifically for Temporal/Restate integration. ADK-Rust does not have a durability abstraction.
OpenAI Agents SDK (Python)
The OpenAI Agents SDK provides a clean Python API for building agents with strong support for handoff protocols between agents.
Where Agents SDK excels:
- Elegant handoff protocol for multi-agent systems
- Built-in MCP support
- Well-documented, easy to get started
- Built-in tracing
Where neuron differs:
- Language. neuron is Rust, giving you compile-time safety, zero-cost abstractions, and predictable performance. The Agents SDK is Python-only.
- Provider lock-in. The Agents SDK is designed for OpenAI’s API. neuron’s `Provider` trait is provider-agnostic from the foundation.
- Building blocks vs. framework. The Agents SDK is an opinionated framework with a specific agent lifecycle model. neuron gives you the pieces to build your own lifecycle.
Pydantic AI (Python)
Pydantic AI brings typed tool arguments and structured output validation to
Python agents. neuron adopted its ModelRetry self-correction pattern.
Where Pydantic AI excels:
- Typed tool arguments with runtime validation (Pydantic models)
- Multi-provider support
- Clean API for structured output
- The `ModelRetry` pattern for tool self-correction
Where neuron differs:
- Compile-time types. Pydantic validates at runtime. neuron’s `Tool` trait uses `schemars::JsonSchema` for schema generation and `serde::Deserialize` for deserialization, both checked at compile time.
- ModelRetry adoption. neuron’s `ToolError::ModelRetry(String)` is directly inspired by Pydantic AI. When a tool returns `ModelRetry`, the hint is converted to an error tool result so the model can self-correct.
- UsageLimits adoption. neuron’s `UsageLimits` is inspired by Pydantic AI’s budget enforcement, extended with tool call limits.
- Middleware. Pydantic AI has no tool middleware pipeline. neuron provides `TimeoutMiddleware`, `StructuredOutputValidator`, and `RetryLimitedValidator` as composable middleware.
What neuron does not do
Being honest about scope:
- neuron is not a framework. It does not give you a `run_agent()` function that handles everything. You compose the blocks.
- neuron does not include a CLI, TUI, or GUI. Those are built on top of the blocks.
- neuron does not include RAG pipelines. Retrieval is a tool or context strategy implementation, not a core block.
- neuron does not include sub-agent orchestration. Multi-agent handoff is straightforward composition of `AgentLoop` + `ToolRegistry` and belongs in an SDK layer.
Choosing the right tool
- If you want a batteries-included Rust agent framework today: Rig is more mature and has a larger ecosystem.
- If you want composable building blocks you can adopt incrementally: neuron lets you use exactly the pieces you need.
- If you need durable execution (Temporal/Restate): neuron is the only Rust option with a dedicated `DurableContext` trait.
- If you work primarily in Python: Pydantic AI and the OpenAI Agents SDK are excellent choices with larger communities.
- If you need DAG/graph orchestration: ADK-Rust includes a LangGraph port. neuron does not include a graph layer by design.
Error Handling
All neuron error types live in neuron-types and use thiserror for
derivation. This page documents every error enum, its variants, and how to
handle them.
Error hierarchy
LoopError (top-level, from the agentic loop)
|-- ProviderError (LLM provider failures)
|-- ToolError (tool execution failures)
+-- ContextError (context compaction failures)
    +-- ProviderError (when summarization fails)
DurableError (durable execution failures)
HookError (observability hook failures)
McpError (MCP protocol failures)
EmbeddingError (embedding provider failures)
StorageError (session storage failures)
SandboxError (sandbox execution failures)
LoopError is the primary error type you encounter when running the agentic
loop. It wraps ProviderError, ToolError, and ContextError via From
implementations, so ? propagation works naturally.
The remaining error types (DurableError, HookError, McpError,
EmbeddingError, StorageError, SandboxError) are standalone – they appear
in their respective subsystems and do not nest under LoopError.
ProviderError
Errors from LLM provider operations (completions and streaming).
pub enum ProviderError {
// --- Retryable ---
Network(Box<dyn std::error::Error + Send + Sync>),
RateLimit { retry_after: Option<Duration> },
ModelLoading(String),
Timeout(Duration),
ServiceUnavailable(String),
// --- Terminal ---
Authentication(String),
InvalidRequest(String),
ModelNotFound(String),
InsufficientResources(String),
// --- Other ---
StreamError(String),
Other(Box<dyn std::error::Error + Send + Sync>),
}
Variants
| Variant | Description | Retryable? |
|---|---|---|
| `Network` | Connection reset, DNS failure, TLS error. Wraps the underlying transport error. | Yes |
| `RateLimit` | Provider returned 429. `retry_after` contains the suggested delay if the API provided one. | Yes |
| `ModelLoading` | Model is cold-starting (common with Ollama and serverless endpoints). | Yes |
| `Timeout` | Request exceeded the configured timeout. Contains the duration that elapsed. | Yes |
| `ServiceUnavailable` | Provider returned 503 or equivalent. | Yes |
| `Authentication` | Invalid API key, expired token, or insufficient permissions (401/403). | No |
| `InvalidRequest` | Malformed request: bad parameters, unsupported model configuration, schema violations. | No |
| `ModelNotFound` | The requested model identifier does not exist on this provider. | No |
| `InsufficientResources` | Quota exceeded or billing limit reached. Distinct from rate limiting. | No |
| `StreamError` | Error during SSE streaming after the connection was established. | No |
| `Other` | Catch-all for provider-specific errors that do not fit other variants. | No |
is_retryable()
impl ProviderError {
pub fn is_retryable(&self) -> bool {
matches!(
self,
Self::Network(_)
| Self::RateLimit { .. }
| Self::ModelLoading(_)
| Self::Timeout(_)
| Self::ServiceUnavailable(_)
)
}
}
Use is_retryable() to decide whether to retry a failed request. neuron does
not include built-in retry logic – use tower::retry, a durable engine’s retry
policy, or a simple loop:
let mut attempts = 0;
let response = loop {
match provider.complete(request.clone()).await {
Ok(resp) => break resp,
Err(e) if e.is_retryable() && attempts < 3 => {
attempts += 1;
tokio::time::sleep(Duration::from_secs(1 << attempts)).await;
}
Err(e) => return Err(e),
}
};
EmbeddingError
Errors from embedding provider operations.
pub enum EmbeddingError {
Authentication(String),
RateLimit { retry_after: Option<Duration> },
InvalidRequest(String),
Network(Box<dyn std::error::Error + Send + Sync>),
Other(Box<dyn std::error::Error + Send + Sync>),
}
Variants
| Variant | Description | Retryable? |
|---|---|---|
| `Authentication` | Invalid API key or expired token. | No |
| `RateLimit` | Provider returned 429. | Yes |
| `InvalidRequest` | Bad input (e.g., empty input array, unsupported model). | No |
| `Network` | Connection-level failure. | Yes |
| `Other` | Catch-all. | No |
is_retryable()
impl EmbeddingError {
pub fn is_retryable(&self) -> bool {
matches!(self, Self::RateLimit { .. } | Self::Network(_))
}
}
ToolError
Errors from tool operations (registration, validation, execution).
pub enum ToolError {
NotFound(String),
InvalidInput(String),
ExecutionFailed(Box<dyn std::error::Error + Send + Sync>),
PermissionDenied(String),
Cancelled,
ModelRetry(String),
}
Variants
| Variant | Description |
|---|---|
| `NotFound` | The tool name in the model’s `ToolUse` block does not match any registered tool. |
| `InvalidInput` | The JSON arguments failed deserialization into the tool’s `Args` type. |
| `ExecutionFailed` | The tool ran but returned an error. Wraps the tool’s specific error type. |
| `PermissionDenied` | The `PermissionPolicy` denied this tool call. |
| `Cancelled` | The tool execution was cancelled via the `CancellationToken` in `ToolContext`. |
| `ModelRetry` | The tool is requesting the model to retry with different arguments. |
ModelRetry: the self-correction pattern
ModelRetry is special. It does not propagate as an error to the caller.
Instead, the agentic loop intercepts it and converts the hint string into an
error tool result that is sent back to the model:
use neuron_types::ToolError;
// Inside a tool implementation:
fn validate_date(input: &str) -> Result<(), ToolError> {
if !input.contains('-') {
return Err(ToolError::ModelRetry(
"Date must be in YYYY-MM-DD format, e.g. 2025-01-15".into()
));
}
Ok(())
}
The model sees the hint as a tool result with is_error: true and can adjust
its next tool call accordingly. This keeps self-correction logic simple: the
tool says what went wrong, and the loop handles the retry protocol.
LoopError
The top-level error type returned by the agentic loop.
pub enum LoopError {
Provider(ProviderError),
Tool(ToolError),
Context(ContextError),
MaxTurns(usize),
UsageLimitExceeded(String),
HookTerminated(String),
Cancelled,
}
Variants
| Variant | Description |
|---|---|
| `Provider` | An LLM call failed. Check `is_retryable()` on the inner `ProviderError`. |
| `Tool` | A tool call failed (excluding `ModelRetry`, which is handled internally). |
| `Context` | Context compaction failed. |
| `MaxTurns` | The loop hit the configured turn limit. Contains the limit value. |
| `UsageLimitExceeded` | A token, request, or tool call budget was exceeded. Contains a descriptive message (e.g., `"output token limit exceeded: 50123 > 50000"`). |
| `HookTerminated` | An `ObservabilityHook` returned `HookAction::Terminate`. Contains the reason. |
| `Cancelled` | The loop’s cancellation token was triggered. |
From implementations
LoopError implements From<ProviderError>, From<ToolError>, and
From<ContextError>, so you can use ? to propagate errors from any of these
subsystems:
use neuron_types::{LoopError, ProviderError};
fn example() -> Result<(), LoopError> {
let provider_result: Result<_, ProviderError> = Err(
ProviderError::Authentication("invalid key".into())
);
provider_result?; // Automatically converted to LoopError::Provider
Ok(())
}
Handling LoopError
```rust
use neuron_types::LoopError;

match loop_result {
    Ok(response) => { /* success */ }
    Err(LoopError::Provider(e)) if e.is_retryable() => {
        // Transient provider failure -- retry the whole loop or
        // let a durable engine handle it.
    }
    Err(LoopError::Provider(e)) => {
        // Terminal provider failure -- fix config and retry.
        eprintln!("Provider error: {e}");
    }
    Err(LoopError::MaxTurns(limit)) => {
        // The agent ran for too many turns without completing.
        eprintln!("Hit {limit} turn limit");
    }
    Err(LoopError::UsageLimitExceeded(msg)) => {
        // A token, request, or tool call budget was exceeded.
        eprintln!("Usage limit: {msg}");
    }
    Err(LoopError::HookTerminated(reason)) => {
        // A guardrail or hook stopped the loop.
        eprintln!("Terminated: {reason}");
    }
    Err(LoopError::Cancelled) => {
        // Graceful shutdown via cancellation token.
    }
    Err(e) => {
        eprintln!("Loop error: {e}");
    }
}
```
ContextError
Errors from context management operations.
```rust
pub enum ContextError {
    CompactionFailed(String),
    Provider(ProviderError),
}
```
| Variant | Description |
|---|---|
CompactionFailed | The compaction strategy itself failed (e.g., produced invalid output). |
Provider | A provider call during summarization-based compaction failed. Wraps ProviderError, so you can check is_retryable() on the inner error. |
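Because ContextError::Provider wraps a ProviderError, a caller can delegate the retry decision to the inner error's classification. The sketch below illustrates this with stand-in enums that mirror the shapes described here (the RateLimited variant is an assumption for illustration; the real types live in neuron-types):

```rust
// Sketch: classifying a ContextError for retry. Both enums are stand-ins
// mirroring the neuron-types shapes; RateLimited is an assumed variant.
#[derive(Debug)]
pub enum ProviderError {
    RateLimited,
    Authentication(String),
}

impl ProviderError {
    pub fn is_retryable(&self) -> bool {
        matches!(self, ProviderError::RateLimited)
    }
}

#[derive(Debug)]
pub enum ContextError {
    CompactionFailed(String),
    Provider(ProviderError),
}

/// Compaction-strategy bugs are terminal; provider hiccups may be transient.
pub fn should_retry_compaction(e: &ContextError) -> bool {
    match e {
        ContextError::CompactionFailed(_) => false,
        ContextError::Provider(p) => p.is_retryable(),
    }
}
```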
DurableError
Errors from durable execution operations (Temporal, Restate, Inngest).
```rust
pub enum DurableError {
    ActivityFailed(String),
    Cancelled,
    SignalTimeout,
    ContinueAsNew(String),
    Other(Box<dyn std::error::Error + Send + Sync>),
}
```
| Variant | Description |
|---|---|
ActivityFailed | A durable activity (LLM call or tool execution) failed after exhausting retries. |
Cancelled | The workflow was cancelled externally. |
SignalTimeout | wait_for_signal() timed out waiting for an external signal. |
ContinueAsNew | The workflow needs to continue as a new execution to avoid history bloat. |
Other | Catch-all for engine-specific errors. |
McpError
Errors from MCP (Model Context Protocol) operations.
```rust
pub enum McpError {
    Connection(String),
    Initialization(String),
    ToolCall(String),
    Transport(String),
    Other(Box<dyn std::error::Error + Send + Sync>),
}
```
| Variant | Description |
|---|---|
Connection | Failed to connect to the MCP server. |
Initialization | The MCP handshake (initialize / initialized) failed. |
ToolCall | An MCP tools/call request failed. |
Transport | Transport-level error (stdio pipe broken, HTTP connection dropped). |
Other | Catch-all. |
HookError
Errors from observability hooks.
```rust
pub enum HookError {
    Failed(String),
    Other(Box<dyn std::error::Error + Send + Sync>),
}
```
| Variant | Description |
|---|---|
Failed | The hook encountered an error during execution. |
Other | Catch-all for hook-specific errors. |
Hook errors do not stop the loop by default. The loop logs them and continues.
To stop the loop from a hook, return HookAction::Terminate instead of
returning an error.
StorageError
Errors from session storage operations.
```rust
pub enum StorageError {
    NotFound(String),
    Serialization(String),
    Io(std::io::Error),
    Other(Box<dyn std::error::Error + Send + Sync>),
}
```
| Variant | Description |
|---|---|
NotFound | The requested session does not exist in storage. |
Serialization | Failed to serialize or deserialize session data. |
Io | Filesystem I/O error (for file-based storage backends). |
Other | Catch-all for backend-specific errors. |
SandboxError
Errors from sandbox operations (isolated tool execution environments).
```rust
pub enum SandboxError {
    ExecutionFailed(String),
    SetupFailed(String),
    Other(Box<dyn std::error::Error + Send + Sync>),
}
```
| Variant | Description |
|---|---|
ExecutionFailed | Tool execution failed within the sandbox. |
SetupFailed | Sandbox creation or teardown failed. |
Other | Catch-all. |
Design principles
Two levels max. Error enums are at most two levels deep. LoopError::Context
wraps ContextError, which wraps ProviderError. There is no deeper nesting.
This keeps match arms readable.
thiserror everywhere. Every error enum derives thiserror::Error. Display
messages are concise and include the variant’s data. Source errors are linked
with #[source] or #[from] for proper error chain reporting.
Retryable classification at the source. ProviderError and EmbeddingError
provide is_retryable() because they know which failures are transient. Callers
do not need to pattern-match on specific variants to decide whether to retry.
No built-in retry. neuron exposes is_retryable() but does not include
retry middleware. Use tower::retry, a durable engine’s retry policy, or write
a simple loop. Retry logic is inherently policy-specific (backoff strategy, max
attempts, circuit breaking) and belongs in the application layer.
ModelRetry is not an error. Despite living in ToolError, ModelRetry is a
control flow signal, not a failure. The loop intercepts it before it reaches
the caller. If you handle ToolError directly (outside the loop), treat
ModelRetry as a hint to feed back to the model, not as an error to log.
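If you do execute tools directly, the handling can be sketched as folding each outcome into the content of the next tool result. The ToolError stand-in below mirrors the variants discussed above (ExecutionFailed is an assumed variant name for an ordinary failure):

```rust
// Sketch: handling ToolError outside the agentic loop. This stand-in enum
// mirrors the neuron-types shape; ExecutionFailed is an assumed variant.
#[derive(Debug)]
pub enum ToolError {
    ModelRetry(String),
    ExecutionFailed(String),
}

/// Fold a tool outcome into (content, is_error) for the next model turn.
/// ModelRetry is treated as a hint to feed back, not a failure to log.
pub fn to_tool_result(outcome: Result<String, ToolError>) -> (String, bool) {
    match outcome {
        Ok(output) => (output, false),
        Err(ToolError::ModelRetry(hint)) => (hint, true),
        Err(ToolError::ExecutionFailed(msg)) => (format!("tool failed: {msg}"), true),
    }
}
```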