Error Handling

All neuron error types live in neuron-types and use thiserror for derivation. This page documents every error enum, its variants, and how to handle them.

Error hierarchy

LoopError                       (top-level, from the agentic loop)
    |-- ProviderError           (LLM provider failures)
    |-- ToolError               (tool execution failures)
    +-- ContextError            (context compaction failures)
            +-- ProviderError   (when summarization fails)

DurableError                    (durable execution failures)
HookError                       (observability hook failures)
McpError                        (MCP protocol failures)
EmbeddingError                  (embedding provider failures)
StorageError                    (session storage failures)
SandboxError                    (sandbox execution failures)

LoopError is the primary error type you encounter when running the agentic loop. It wraps ProviderError, ToolError, and ContextError via From implementations, so ? propagation works naturally.

The remaining error types (DurableError, HookError, McpError, EmbeddingError, StorageError, SandboxError) are standalone – they appear in their respective subsystems and do not nest under LoopError.


ProviderError

Errors from LLM provider operations (completions and streaming).

pub enum ProviderError {
    // --- Retryable ---
    Network(Box<dyn std::error::Error + Send + Sync>),
    RateLimit { retry_after: Option<Duration> },
    ModelLoading(String),
    Timeout(Duration),
    ServiceUnavailable(String),

    // --- Terminal ---
    Authentication(String),
    InvalidRequest(String),
    ModelNotFound(String),
    InsufficientResources(String),

    // --- Other ---
    StreamError(String),
    Other(Box<dyn std::error::Error + Send + Sync>),
}

Variants

| Variant | Description | Retryable? |
| --- | --- | --- |
| Network | Connection reset, DNS failure, TLS error. Wraps the underlying transport error. | Yes |
| RateLimit | Provider returned 429. retry_after contains the suggested delay if the API provided one. | Yes |
| ModelLoading | Model is cold-starting (common with Ollama and serverless endpoints). | Yes |
| Timeout | Request exceeded the configured timeout. Contains the duration that elapsed. | Yes |
| ServiceUnavailable | Provider returned 503 or equivalent. | Yes |
| Authentication | Invalid API key, expired token, or insufficient permissions (401/403). | No |
| InvalidRequest | Malformed request: bad parameters, unsupported model configuration, schema violations. | No |
| ModelNotFound | The requested model identifier does not exist on this provider. | No |
| InsufficientResources | Quota exceeded or billing limit reached. Distinct from rate limiting. | No |
| StreamError | Error during SSE streaming after the connection was established. | No |
| Other | Catch-all for provider-specific errors that do not fit other variants. | No |

is_retryable()

impl ProviderError {
    pub fn is_retryable(&self) -> bool {
        matches!(
            self,
            Self::Network(_)
                | Self::RateLimit { .. }
                | Self::ModelLoading(_)
                | Self::Timeout(_)
                | Self::ServiceUnavailable(_)
        )
    }
}

Use is_retryable() to decide whether to retry a failed request. neuron does not include built-in retry logic – use tower::retry, a durable engine’s retry policy, or a simple loop:

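use std::time::Duration;

// Retry transient failures up to 3 times with exponential backoff (2s, 4s, 8s).
let mut attempts = 0;
let response = loop {
    match provider.complete(request.clone()).await {
        Ok(resp) => break resp,
        Err(e) if e.is_retryable() && attempts < 3 => {
            attempts += 1;
            tokio::time::sleep(Duration::from_secs(1 << attempts)).await;
        }
        // Terminal error, or retries exhausted: give up.
        Err(e) => return Err(e),
    }
};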
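
The loop above uses a fixed backoff schedule and ignores the retry_after hint that RateLimit may carry. As a minimal sketch of one way to take the hint into account, the helper below computes the delay separately from the loop. The ProviderError here is a local stand-in that mirrors only a subset of the real enum so the example compiles on its own, and retry_delay, its parameters, and the 60-second cap are hypothetical choices, not part of neuron.

```rust
use std::time::Duration;

// Local stand-in mirroring a subset of `ProviderError` so the sketch
// compiles on its own; in real code, import the enum from `neuron_types`.
#[derive(Debug)]
pub enum ProviderError {
    RateLimit { retry_after: Option<Duration> },
    Timeout(Duration),
    Authentication(String),
}

impl ProviderError {
    pub fn is_retryable(&self) -> bool {
        matches!(self, Self::RateLimit { .. } | Self::Timeout(_))
    }
}

/// Hypothetical helper: how long to wait before retry number `attempt`
/// (1-based), or `None` if the caller should give up.
pub fn retry_delay(err: &ProviderError, attempt: u32, max_attempts: u32) -> Option<Duration> {
    if !err.is_retryable() || attempt > max_attempts {
        return None;
    }
    // Exponential backoff: 2s, 4s, 8s, ... capped at 60s (an arbitrary cap).
    let backoff = Duration::from_secs(1u64 << attempt.min(6)).min(Duration::from_secs(60));
    match err {
        // Never wait less than the provider's own hint when it sent one.
        ProviderError::RateLimit { retry_after: Some(hint) } => Some((*hint).max(backoff)),
        _ => Some(backoff),
    }
}
```

Inside a retry loop, a `Some(delay)` result would be passed to the sleep call and a `None` result would return the error to the caller. Keeping the policy in a pure function makes it easy to unit-test without a provider or a runtime.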

EmbeddingError

Errors from embedding provider operations.

pub enum EmbeddingError {
    Authentication(String),
    RateLimit { retry_after: Option<Duration> },
    InvalidRequest(String),
    Network(Box<dyn std::error::Error + Send + Sync>),
    Other(Box<dyn std::error::Error + Send + Sync>),
}

Variants

| Variant | Description | Retryable? |
| --- | --- | --- |
| Authentication | Invalid API key or expired token. | No |
| RateLimit | Provider returned 429. | Yes |
| InvalidRequest | Bad input (e.g., empty input array, unsupported model). | No |
| Network | Connection-level failure. | Yes |
| Other | Catch-all. | No |

is_retryable()

impl EmbeddingError {
    pub fn is_retryable(&self) -> bool {
        matches!(self, Self::RateLimit { .. } | Self::Network(_))
    }
}

ToolError

Errors from tool operations (registration, validation, execution).

pub enum ToolError {
    NotFound(String),
    InvalidInput(String),
    ExecutionFailed(Box<dyn std::error::Error + Send + Sync>),
    PermissionDenied(String),
    Cancelled,
    ModelRetry(String),
}

Variants

| Variant | Description |
| --- | --- |
| NotFound | The tool name in the model’s ToolUse block does not match any registered tool. |
| InvalidInput | The JSON arguments failed deserialization into the tool’s Args type. |
| ExecutionFailed | The tool ran but returned an error. Wraps the tool’s specific error type. |
| PermissionDenied | The PermissionPolicy denied this tool call. |
| Cancelled | The tool execution was cancelled via the CancellationToken in ToolContext. |
| ModelRetry | The tool is requesting the model to retry with different arguments. |

ModelRetry: the self-correction pattern

ModelRetry is special. It does not propagate as an error to the caller. Instead, the agentic loop intercepts it and converts the hint string into an error tool result that is sent back to the model:

use neuron_types::ToolError;

// Inside a tool implementation:
fn validate_date(input: &str) -> Result<(), ToolError> {
    if !input.contains('-') {
        return Err(ToolError::ModelRetry(
            "Date must be in YYYY-MM-DD format, e.g. 2025-01-15".into()
        ));
    }
    Ok(())
}

The model sees the hint as a tool result with is_error: true and can adjust its next tool call accordingly. This keeps self-correction logic simple: the tool says what went wrong, and the loop handles the retry protocol.
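
To make the interception step concrete, here is a rough sketch of what the loop's side of this protocol could look like. It is not neuron's actual implementation: ToolError is a local stand-in mirroring a subset of the enum above, and ToolResult and intercept are hypothetical names introduced only for illustration.

```rust
// Local stand-ins so the sketch compiles on its own. `ToolError` mirrors a
// subset of the enum above; `ToolResult` and `intercept` are hypothetical.
#[derive(Debug)]
pub enum ToolError {
    InvalidInput(String),
    Cancelled,
    ModelRetry(String),
}

/// Hypothetical shape of the tool result that is sent back to the model.
#[derive(Debug, PartialEq)]
pub struct ToolResult {
    pub content: String,
    pub is_error: bool,
}

/// Turn a tool outcome into either a result for the model (`Ok`) or an
/// error that propagates to the caller (`Err`).
pub fn intercept(outcome: Result<String, ToolError>) -> Result<ToolResult, ToolError> {
    match outcome {
        Ok(content) => Ok(ToolResult { content, is_error: false }),
        // The hint becomes an error tool result; the loop keeps running.
        Err(ToolError::ModelRetry(hint)) => Ok(ToolResult { content: hint, is_error: true }),
        // Every other variant is a real failure and is passed through.
        Err(other) => Err(other),
    }
}
```

The point of the sketch is the asymmetry: ModelRetry lands on the `Ok` side of the result, so it never surfaces as a LoopError::Tool, while every other ToolError variant stays on the `Err` side.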


LoopError

The top-level error type returned by the agentic loop.

pub enum LoopError {
    Provider(ProviderError),
    Tool(ToolError),
    Context(ContextError),
    MaxTurns(usize),
    UsageLimitExceeded(String),
    HookTerminated(String),
    Cancelled,
}

Variants

| Variant | Description |
| --- | --- |
| Provider | An LLM call failed. Check is_retryable() on the inner ProviderError. |
| Tool | A tool call failed (excluding ModelRetry, which is handled internally). |
| Context | Context compaction failed. |
| MaxTurns | The loop hit the configured turn limit. Contains the limit value. |
| UsageLimitExceeded | A token, request, or tool call budget was exceeded. Contains a descriptive message (e.g., "output token limit exceeded: 50123 > 50000"). |
| HookTerminated | An ObservabilityHook returned HookAction::Terminate. Contains the reason. |
| Cancelled | The loop’s cancellation token was triggered. |

From implementations

LoopError implements From<ProviderError>, From<ToolError>, and From<ContextError>, so you can use ? to propagate errors from any of these subsystems:

use neuron_types::{LoopError, ProviderError};

fn example() -> Result<(), LoopError> {
    let provider_result: Result<_, ProviderError> = Err(
        ProviderError::Authentication("invalid key".into())
    );
    provider_result?; // Automatically converted to LoopError::Provider
    Ok(())
}

Handling LoopError

use neuron_types::LoopError;

match loop_result {
    Ok(response) => { /* success */ }
    Err(LoopError::Provider(e)) if e.is_retryable() => {
        // Transient provider failure -- retry the whole loop or
        // let a durable engine handle it.
    }
    Err(LoopError::Provider(e)) => {
        // Terminal provider failure -- fix config and retry.
        eprintln!("Provider error: {e}");
    }
    Err(LoopError::MaxTurns(limit)) => {
        // The agent ran for too many turns without completing.
        eprintln!("Hit {limit} turn limit");
    }
    Err(LoopError::UsageLimitExceeded(msg)) => {
        // A token, request, or tool call budget was exceeded.
        eprintln!("Usage limit: {msg}");
    }
    Err(LoopError::HookTerminated(reason)) => {
        // A guardrail or hook stopped the loop.
        eprintln!("Terminated: {reason}");
    }
    Err(LoopError::Cancelled) => {
        // Graceful shutdown via cancellation token.
    }
    Err(e) => {
        eprintln!("Loop error: {e}");
    }
}

ContextError

Errors from context management operations.

pub enum ContextError {
    CompactionFailed(String),
    Provider(ProviderError),
}

| Variant | Description |
| --- | --- |
| CompactionFailed | The compaction strategy itself failed (e.g., produced invalid output). |
| Provider | A provider call during summarization-based compaction failed. Wraps ProviderError, so you can check is_retryable() on the inner error. |
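
Because a ProviderError can reach the caller either directly or through ContextError, a retry decision at the top level has to look in both places. The sketch below shows one way to do that with a single or-pattern. The three enums are local stand-ins mirroring a subset of the real types so the example compiles on its own, and is_transient is a hypothetical helper, not a neuron API.

```rust
use std::time::Duration;

// Local stand-ins mirroring a subset of the enums on this page; in real
// code these come from `neuron_types`.
#[derive(Debug)]
pub enum ProviderError {
    Timeout(Duration),
    Authentication(String),
}

impl ProviderError {
    pub fn is_retryable(&self) -> bool {
        matches!(self, Self::Timeout(_))
    }
}

#[derive(Debug)]
pub enum ContextError {
    CompactionFailed(String),
    Provider(ProviderError),
}

#[derive(Debug)]
pub enum LoopError {
    Provider(ProviderError),
    Context(ContextError),
    MaxTurns(usize),
}

/// Hypothetical helper: true when the loop failed because of a transient
/// provider error, whether it surfaced directly or during compaction.
pub fn is_transient(err: &LoopError) -> bool {
    match err {
        // Both paths end at a `ProviderError`, so one arm covers them.
        LoopError::Provider(e) | LoopError::Context(ContextError::Provider(e)) => e.is_retryable(),
        _ => false,
    }
}
```

A caller could use such a helper in place of the first arm of the match under "Handling LoopError", so that a timeout during summarization is retried the same way as a timeout during a normal completion.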

DurableError

Errors from durable execution operations (Temporal, Restate, Inngest).

pub enum DurableError {
    ActivityFailed(String),
    Cancelled,
    SignalTimeout,
    ContinueAsNew(String),
    Other(Box<dyn std::error::Error + Send + Sync>),
}

| Variant | Description |
| --- | --- |
| ActivityFailed | A durable activity (LLM call or tool execution) failed after exhausting retries. |
| Cancelled | The workflow was cancelled externally. |
| SignalTimeout | wait_for_signal() timed out waiting for an external signal. |
| ContinueAsNew | The workflow needs to continue as a new execution to avoid history bloat. |
| Other | Catch-all for engine-specific errors. |

McpError

Errors from MCP (Model Context Protocol) operations.

pub enum McpError {
    Connection(String),
    Initialization(String),
    ToolCall(String),
    Transport(String),
    Other(Box<dyn std::error::Error + Send + Sync>),
}

| Variant | Description |
| --- | --- |
| Connection | Failed to connect to the MCP server. |
| Initialization | The MCP handshake (initialize / initialized) failed. |
| ToolCall | An MCP tools/call request failed. |
| Transport | Transport-level error (stdio pipe broken, HTTP connection dropped). |
| Other | Catch-all. |

HookError

Errors from observability hooks.

pub enum HookError {
    Failed(String),
    Other(Box<dyn std::error::Error + Send + Sync>),
}

| Variant | Description |
| --- | --- |
| Failed | The hook encountered an error during execution. |
| Other | Catch-all for hook-specific errors. |

Hook errors do not stop the loop by default. The loop logs them and continues. To stop the loop from a hook, return HookAction::Terminate instead of returning an error.
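
The difference between a failing hook and a terminating hook can be sketched as a small dispatch function. This is an illustration of the behavior described above, not the loop's real code: HookAction and HookError are local stand-ins, the Continue variant and the reason string carried by Terminate are assumptions, and run_hooks and its log parameter are hypothetical.

```rust
// Local stand-ins; the real types live in `neuron_types`. The `Continue`
// variant and the payload of `Terminate` are assumptions for this sketch.
#[derive(Debug, PartialEq)]
pub enum HookAction {
    Continue,
    Terminate(String),
}

#[derive(Debug)]
pub enum HookError {
    Failed(String),
}

/// Hypothetical dispatcher: process hook outcomes in order and return the
/// termination reason if any hook asked the loop to stop.
pub fn run_hooks(
    outcomes: Vec<Result<HookAction, HookError>>,
    log: &mut Vec<String>,
) -> Option<String> {
    for outcome in outcomes {
        match outcome {
            Ok(HookAction::Continue) => {}
            // An explicit request to stop ends processing immediately.
            Ok(HookAction::Terminate(reason)) => return Some(reason),
            // A hook error is recorded and the loop carries on.
            Err(HookError::Failed(msg)) => log.push(format!("hook failed: {msg}")),
        }
    }
    None
}
```

In this sketch a `Some(reason)` return is the point where a loop would produce LoopError::HookTerminated, while a failed hook only leaves a log entry behind.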


StorageError

Errors from session storage operations.

pub enum StorageError {
    NotFound(String),
    Serialization(String),
    Io(std::io::Error),
    Other(Box<dyn std::error::Error + Send + Sync>),
}

| Variant | Description |
| --- | --- |
| NotFound | The requested session does not exist in storage. |
| Serialization | Failed to serialize or deserialize session data. |
| Io | Filesystem I/O error (for file-based storage backends). |
| Other | Catch-all for backend-specific errors. |

SandboxError

Errors from sandbox operations (isolated tool execution environments).

pub enum SandboxError {
    ExecutionFailed(String),
    SetupFailed(String),
    Other(Box<dyn std::error::Error + Send + Sync>),
}

| Variant | Description |
| --- | --- |
| ExecutionFailed | Tool execution failed within the sandbox. |
| SetupFailed | Sandbox creation or teardown failed. |
| Other | Catch-all. |

Design principles

Two levels max. Error enums nest at most two levels deep. The deepest chain is LoopError::Context wrapping ContextError, which in turn may wrap ProviderError; nothing nests further than that. This keeps match arms readable.

thiserror everywhere. Every error enum derives thiserror::Error. Display messages are concise and include the variant’s data. Source errors are linked with #[source] or #[from] for proper error chain reporting.
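
For readers unfamiliar with what those attributes provide, the following std-only sketch writes out by hand roughly what a thiserror derive with #[error], #[source], and #[from] generates. It uses a two-variant stand-in for StorageError; the Display messages are made up, whether the real Io variant uses #[from] is an assumption, and error_chain is a hypothetical helper for walking the chain.

```rust
use std::error::Error;
use std::fmt;

// Two-variant stand-in for `StorageError`; the real enum derives
// `thiserror::Error` instead of the manual impls below.
#[derive(Debug)]
pub enum StorageError {
    NotFound(String),
    Io(std::io::Error),
}

// Roughly what `#[error("...")]` generates (message text is made up).
impl fmt::Display for StorageError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::NotFound(id) => write!(f, "session not found: {id}"),
            Self::Io(e) => write!(f, "storage I/O error: {e}"),
        }
    }
}

// Roughly what `#[source]` or `#[from]` generates: expose the inner error.
impl Error for StorageError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::Io(e) => Some(e),
            Self::NotFound(_) => None,
        }
    }
}

// What `#[from]` adds on top: a `From` impl so `?` converts automatically.
impl From<std::io::Error> for StorageError {
    fn from(e: std::io::Error) -> Self {
        Self::Io(e)
    }
}

/// Hypothetical helper: collect the message of an error and of every
/// error reachable through `source()`.
pub fn error_chain(err: &dyn Error) -> Vec<String> {
    let mut out = vec![err.to_string()];
    let mut current = err.source();
    while let Some(inner) = current {
        out.push(inner.to_string());
        current = inner.source();
    }
    out
}
```

Error reporters that walk `source()` can then print the underlying cause beneath the top-level message, which is the "proper error chain reporting" the principle refers to.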

Retryable classification at the source. ProviderError and EmbeddingError provide is_retryable() because they know which failures are transient. Callers do not need to pattern-match on specific variants to decide whether to retry.

No built-in retry. neuron exposes is_retryable() but does not include retry middleware. Use tower::retry, a durable engine’s retry policy, or write a simple loop. Retry logic is inherently policy-specific (backoff strategy, max attempts, circuit breaking) and belongs in the application layer.

ModelRetry is not an error. Despite living in ToolError, ModelRetry is a control flow signal, not a failure. The loop intercepts it before it reaches the caller. If you handle ToolError directly (outside the loop), treat ModelRetry as a hint to feed back to the model, not as an error to log.