Introduction
skelegent is a composable agentic AI architecture implemented as a Rust workspace. It provides the building blocks for constructing agentic systems – from a single LLM call with tool use, to multi-agent orchestration with durable execution, state persistence, and environment isolation.
What skelegent is
skelegent is a set of Rust crates organized into six architectural layers:
- Layer 0 defines the stability contract: four protocol traits and two cross-cutting interfaces that every other layer builds on. These traits almost never change.
- Layers 1–5 provide swappable implementations of those protocols: providers (Anthropic, OpenAI, Ollama), operators (ReAct loops, single-shot), orchestration, state persistence, environment isolation, and hook-based observation.
The result is a system where you pick the implementations you need and compose them. A local development setup and a globally distributed production deployment use the same trait boundaries – the only difference is which implementations back each protocol.
What skelegent is not
skelegent is not a framework. There is no runtime you boot, no configuration DSL, no workflow engine. It is a collection of crates with well-defined trait boundaries. You compose them in your own application code.
skelegent is not an LLM wrapper library. While it includes provider implementations for making LLM calls, the architecture is designed around the full lifecycle of agentic systems: reasoning loops, tool execution, state management, multi-agent composition, security hooks, and environment isolation.
Key properties
- Provider-agnostic. The `Provider` trait abstracts over Anthropic, OpenAI, and Ollama. Adding a new provider means implementing one trait.
- Object-safe protocol boundaries. All Layer 0 traits work behind `Box<dyn Trait>` and are `Send + Sync`. You can compose implementations at runtime without generics leaking through your entire application.
- Trait-based composition. Every protocol (operator execution, orchestration, state, environment) is a trait. Swap implementations without changing calling code.
- Precise cost tracking. All monetary values use `rust_decimal::Decimal`, avoiding floating-point accumulation errors across thousands of LLM calls.
- Serializable boundaries. All protocol messages (`OperatorInput`, `OperatorOutput`, effects, signals) implement `Serialize + Deserialize`. An in-process function call and a cross-network RPC use the same types.
License
skelegent is dual-licensed under MIT and Apache-2.0, following the Rust ecosystem convention.
Source code
The source code is hosted at github.com/secbear/skelegent.
Getting Started
This section covers everything you need to start building with skelegent:
- Installation – Setting up your environment and adding skelegent to your project.
- Quickstart – A minimal working example: create a provider, register tools, and run a ReAct loop.
- Core Concepts – The four protocols, two interfaces, and six layers that make up the architecture.
If you are already familiar with the architecture and want to dive into specific subsystems, skip ahead to the Guides section.
Installation
Requirements
- Rust edition 2024, MSRV 1.85
- Cargo (included with Rust)
With Nix (recommended for contributors)
If you use Nix, the repository includes a development shell:
nix develop
This provides the correct Rust toolchain, cargo, clippy, rustfmt, and all system dependencies.
Adding skelegent to your project
The skelegent crate is an umbrella that re-exports all layers behind feature flags. Add it to your Cargo.toml:
[dependencies]
skelegent = { version = "0.4", features = ["context-engine", "provider-anthropic", "state-memory"] }
Feature flags
The umbrella crate uses feature flags to control which implementations are compiled:
| Feature | What it enables |
|---|---|
| `core` | Layer 0 protocols + `skg-turn` + `skg-context` + `skg-tool` (included in default) |
| `context-engine` | Context engine (`skg-context-engine`) |
| `op-single-shot` | Single-shot operator (`skg-op-single-shot`) |
| `provider-anthropic` | Anthropic Claude provider |
| `provider-openai` | OpenAI provider |
| `provider-ollama` | Ollama local model provider |
| `providers-all` | All three providers |
| `state-memory` | In-memory state store |
| `state-fs` | Filesystem-backed state store |
| `orch-local` | In-process orchestrator |
| `orch-kit` | Orchestration utilities |
| `env-local` | Local (passthrough) environment |
| `mcp` | MCP client integration |
Using individual crates
You can also depend on individual crates directly if you want finer control over your dependency tree:
[dependencies]
layer0 = "0.4"
skg-turn = "0.4"
skg-tool = "0.4"
skg-context-engine = "0.4"
skg-provider-anthropic = "0.4"
Verifying your setup
cargo build
cargo test
cargo clippy -- -D warnings
All three should pass cleanly on a fresh checkout.
Quickstart
This example creates an Anthropic provider, registers a tool, builds a Context, and runs react_loop directly. The loop will call the model, use tools if needed, and return the result.
Full example
use layer0::content::Content;
use layer0::context::{Message, Role};
use layer0::id::OperatorId;
use skg_context_engine::{Context, ReactLoopConfig, react_loop};
use skg_provider_anthropic::AnthropicProvider;
use skg_tool::{ToolCallContext, ToolDyn, ToolError, ToolRegistry};
use serde_json::json;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;

/// A simple tool that returns the current time.
struct CurrentTimeTool;

impl ToolDyn for CurrentTimeTool {
    fn name(&self) -> &str {
        "current_time"
    }

    fn description(&self) -> &str {
        "Returns the current UTC time."
    }

    fn input_schema(&self) -> serde_json::Value {
        json!({
            "type": "object",
            "properties": {},
            "required": []
        })
    }

    fn call(
        &self,
        _input: serde_json::Value,
        _ctx: &ToolCallContext,
    ) -> Pin<Box<dyn Future<Output = Result<serde_json::Value, ToolError>> + Send + '_>> {
        Box::pin(async {
            // In a real tool, you'd use chrono or std::time
            Ok(json!({ "time": "2026-02-28T12:00:00Z" }))
        })
    }
}

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // 1. Create the provider (reads ANTHROPIC_API_KEY from env)
    let api_key = std::env::var("ANTHROPIC_API_KEY")
        .expect("Set ANTHROPIC_API_KEY");
    let provider = AnthropicProvider::new(api_key);

    // 2. Build the tool registry
    let mut tools = ToolRegistry::new();
    tools.register(Arc::new(CurrentTimeTool));

    // 3. Configure the react loop
    let config = ReactLoopConfig {
        system_prompt: "You are a helpful assistant. Use tools when needed.".into(),
        model: Some("claude-haiku-4-5-20251001".into()),
        max_tokens: Some(4096),
        temperature: None,
    };

    // 4. Create a tool-call context (identifies the calling agent)
    let tool_ctx = ToolCallContext::new(OperatorId::from("assistant"));

    // 5. Build a Context and inject the user message
    let mut ctx = Context::new();
    ctx.inject_message(Message::new(Role::User, Content::text("What time is it right now?")))
        .await?;

    // 6. Run the react loop
    let output = react_loop(&mut ctx, &provider, &tools, &tool_ctx, &config).await?;

    println!("Response: {:?}", output.message);
    println!("Exit reason: {:?}", output.exit_reason);
    println!(
        "Tokens: {} in, {} out",
        output.metadata.tokens_in,
        output.metadata.tokens_out,
    );
    println!("Cost: ${}", output.metadata.cost);

    Ok(())
}
What is happening
- Provider creation. `AnthropicProvider::new(api_key)` creates an HTTP client for the Anthropic Messages API. The provider implements the `Provider` trait, which is an internal (non-object-safe) trait used by operator implementations.
- Tool registration. The `CurrentTimeTool` implements `ToolDyn` – an object-safe trait that defines a tool's name, description, JSON Schema, and async execution. Tools are stored as `Arc<dyn ToolDyn>` in the `ToolRegistry`.
- Loop configuration. `ReactLoopConfig` holds the system prompt, model, and token limits. It is a plain config struct – not an operator. The react loop uses it to build a `CompileConfig` for each inference call.
- Context and execution. `Context` is the conversation store – it holds messages, assembly ops, and rules. You inject a user message, then call `react_loop()`, which composes the core primitives: compile context, infer with the provider, apply context ops (append response, execute tools), repeat until the model produces a final response or a limit is reached.
- Output. `OperatorOutput` contains the response message, exit reason (why the loop stopped), and metadata (tokens, cost, duration, sub-dispatch records).

Tip: To use `react_loop` behind the object-safe `Operator` trait boundary, wrap it in your own struct that implements `Operator`. See the Operators guide for the pattern.
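The loop structure described above can be sketched in miniature. This is a simplified, synchronous, std-only sketch of the ReAct loop shape – not skelegent's actual API. `fake_infer`, `fake_tool`, and `react_loop_sketch` are hypothetical stand-ins for the provider, tool registry, and `react_loop`:

```rust
// Sketch: infer -> if the model requested a tool, run it and append the
// result to the context -> infer again, until a final answer or turn limit.

#[derive(Debug, PartialEq)]
enum ModelStep {
    ToolCall { name: String },
    Final(String),
}

// Stand-in for a provider: asks for a tool once, then answers.
fn fake_infer(history: &[String]) -> ModelStep {
    if history.iter().any(|m| m.starts_with("tool:")) {
        ModelStep::Final("It is 12:00 UTC.".to_string())
    } else {
        ModelStep::ToolCall { name: "current_time".to_string() }
    }
}

// Stand-in for tool execution.
fn fake_tool(name: &str) -> String {
    match name {
        "current_time" => "2026-02-28T12:00:00Z".to_string(),
        _ => "unknown tool".to_string(),
    }
}

/// Loop until the model produces a final answer or max_turns is hit.
fn react_loop_sketch(user_msg: &str, max_turns: u32) -> Option<String> {
    let mut history = vec![format!("user: {user_msg}")];
    for _ in 0..max_turns {
        match fake_infer(&history) {
            ModelStep::Final(text) => return Some(text),
            ModelStep::ToolCall { name } => {
                // Execute the tool, append its result to the context,
                // then loop back for another inference.
                history.push(format!("tool: {}", fake_tool(&name)));
            }
        }
    }
    None // turn limit reached before a final answer
}

fn main() {
    assert_eq!(
        react_loop_sketch("What time is it?", 4).as_deref(),
        Some("It is 12:00 UTC.")
    );
}
```

The real loop additionally threads a provider, a `ToolRegistry`, and a `Context` through each iteration, but the control flow is the same shape.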
Next steps
- Read Core Concepts to understand the protocol architecture.
- See Providers for details on configuring Anthropic, OpenAI, and Ollama.
- See Tools for the full tool authoring guide.
- See Operators for ReAct vs. single-shot configuration.
Core Concepts
skelegent’s architecture is built on four protocol traits and two cross-cutting interfaces, organized into six layers. This page explains each concept and how they compose.
The four protocols
Every agentic system must answer four questions. Each question maps to a protocol trait in Layer 0.
Protocol 1: Operator – “What does one agent do per cycle?”
The Operator trait defines the boundary around a single agent’s execution cycle. Input goes in, the agent reasons (model calls) and acts (tool execution), and output comes out.
#[async_trait]
pub trait Operator: Send + Sync {
    async fn execute(&self, input: OperatorInput) -> Result<OperatorOutput, OperatorError>;
}
The trait is intentionally one method. From the outside, an operator is atomic – you do not care whether it made 1 model call or 20, whether it used tools or not, or what context strategy it used. Those are implementation details.
Implementations include a context engine (composable three-phase engine with assembly, inference, reaction) and SingleShotOperator (one model call, no tools).
The context engine’s Context type is the conversation store. It holds the messages array sent to the model. Your application’s domain data — shell history, file state, user preferences — feeds into Context via assembly operations (inject_system, inject_message) or the system_addendum field in OperatorConfig. Domain data and conversation state are separate concerns: your app owns the domain data, Context owns the conversation.
Protocol 2: Orchestrator – “How do agents compose?”
The Orchestrator trait defines how multiple agents work together and how execution survives failures.
#[async_trait]
pub trait Orchestrator: Send + Sync {
    async fn dispatch(&self, operator: &OperatorId, input: OperatorInput)
        -> Result<OperatorOutput, OrchError>;

    async fn dispatch_many(&self, tasks: Vec<(OperatorId, OperatorInput)>)
        -> Vec<Result<OperatorOutput, OrchError>>;

    async fn signal(&self, target: &WorkflowId, signal: SignalPayload)
        -> Result<(), OrchError>;

    async fn query(&self, target: &WorkflowId, query: QueryPayload)
        -> Result<serde_json::Value, OrchError>;
}
dispatch might be a function call (in-process) or a network hop to another continent. The caller does not know and does not care. signal provides fire-and-forget messaging to running workflows. query enables read-only inspection of workflow state.
Protocol 3: StateStore – “How does data persist?”
The StateStore trait provides scoped key-value persistence with optional semantic search.
#[async_trait]
pub trait StateStore: Send + Sync {
    async fn read(&self, scope: &Scope, key: &str)
        -> Result<Option<serde_json::Value>, StateError>;

    async fn write(&self, scope: &Scope, key: &str, value: serde_json::Value)
        -> Result<(), StateError>;

    async fn delete(&self, scope: &Scope, key: &str) -> Result<(), StateError>;

    async fn list(&self, scope: &Scope, prefix: &str) -> Result<Vec<String>, StateError>;

    async fn search(&self, scope: &Scope, query: &str, limit: usize)
        -> Result<Vec<SearchResult>, StateError>;
}
Values are serde_json::Value, which provides schema flexibility without sacrificing serializability. Scopes partition data (per-agent, per-session, per-workflow). Implementations include MemoryStore (in-memory HashMap, good for tests) and FsStore (filesystem-backed, durable).
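The scoped key-value model can be sketched with a plain `HashMap`. This is a simplified, synchronous, std-only illustration of how a `MemoryStore`-style backend partitions data by scope – the names, `String` values, and method shapes are illustrative, not skelegent's actual async API:

```rust
use std::collections::HashMap;

// A scope partitions data, e.g. per-agent or per-session.
#[derive(Clone, PartialEq, Eq, Hash)]
struct Scope(String);

#[derive(Default)]
struct MemoryStoreSketch {
    // Keyed by (scope, key), so scopes never see each other's data.
    data: HashMap<(Scope, String), String>,
}

impl MemoryStoreSketch {
    fn write(&mut self, scope: &Scope, key: &str, value: String) {
        self.data.insert((scope.clone(), key.to_string()), value);
    }

    fn read(&self, scope: &Scope, key: &str) -> Option<&String> {
        self.data.get(&(scope.clone(), key.to_string()))
    }

    /// List keys in one scope that start with `prefix`.
    fn list(&self, scope: &Scope, prefix: &str) -> Vec<String> {
        let mut keys: Vec<String> = self
            .data
            .keys()
            .filter(|(s, k)| s == scope && k.starts_with(prefix))
            .map(|(_, k)| k.clone())
            .collect();
        keys.sort();
        keys
    }
}

fn main() {
    let mut store = MemoryStoreSketch::default();
    let agent_a = Scope("agent:a".into());
    let agent_b = Scope("agent:b".into());

    store.write(&agent_a, "notes/plan", "draft".into());
    store.write(&agent_a, "notes/todo", "review".into());
    store.write(&agent_b, "notes/plan", "other".into());

    // Scopes isolate data: agent A lists only its own keys.
    assert_eq!(store.list(&agent_a, "notes/"), vec!["notes/plan", "notes/todo"]);
    assert_eq!(store.read(&agent_b, "notes/plan").unwrap(), "other");
}
```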
A read-only projection, StateReader, is provided to operators during context assembly. Operators can read state but must declare writes as effects – they never write directly.
Protocol 4: Environment – “Where does the agent run?”
The Environment trait mediates execution within an isolation boundary.
#[async_trait]
pub trait Environment: Send + Sync {
    async fn run(&self, input: OperatorInput, spec: &EnvironmentSpec)
        -> Result<OperatorOutput, EnvError>;
}
The EnvironmentSpec declares isolation boundaries (process, container, VM, Wasm), credential injection, resource limits, and network policy. LocalEnv passes through with no isolation (for development). Future implementations could spin up containers or Kubernetes pods.
The two interfaces
Middleware – Observation and intervention
Per-boundary middleware traits wrap each protocol’s operations using the continuation pattern. Three traits — one per protocol boundary — live in layer0::middleware:
- `DispatchMiddleware` wraps `Orchestrator::dispatch`. Code before `next.dispatch()` = pre-processing; code after = post-processing; not calling `next` = short-circuit.
- `StoreMiddleware` wraps `StateStore` read/write. Use for encryption-at-rest, audit trails, caching, access control.
- `ExecMiddleware` wraps `Environment::run`. Use for resource metering, credential injection, sandboxing.
Middleware composes via DispatchStack, StoreStack, and ExecStack builders that organize layers into observer → transformer → guard ordering.
For operator-local interception (before/after inference, before/after tool use), the Rule system provides typed per-trigger-point rules with default no-op implementations. Rules fire via Trigger enum: Before, After, or When.
Lifecycle – Cross-layer coordination
Lifecycle events (BudgetEvent, CompactionEvent) coordinate concerns that span multiple protocols. A budget event might originate from a middleware (observing cost) and propagate to the orchestrator (to cancel the workflow). A compaction event coordinates between the operator and the state store.
How layers compose
The six layers form a strict dependency hierarchy:
Layer 5 Cross-Cutting (middleware, lifecycle)
Layer 4 Environment (isolation, credentials)
Layer 3 State (persistence)
Layer 2 Orchestration (multi-agent composition)
Layer 1 Operator impls (providers, tools, operators, MCP)
Layer 0 Protocol traits (the stability contract)
Higher layers depend on lower layers, never the reverse. Layer 0 has no knowledge of any implementation. A Layer 1 crate depends on Layer 0 for trait definitions but knows nothing about orchestration or state backends.
This means you can replace any layer’s implementation without touching other layers. Swap MemoryStore for a hypothetical PostgresStore and nothing in your operator code changes. Swap LocalOrch for a Temporal-backed orchestrator and your operators, tools, and state stores remain identical.
The composition pattern
A typical application composes the layers like this:
Operator = ContextEngine<AnthropicProvider> + ToolRegistry
Middleware = DispatchStack { RedactionMiddleware, ExfilGuardMiddleware }
State = FsStore (filesystem persistence)
Env = LocalEnv (no isolation, dev mode)
Orchestr. = LocalOrch { agent_a -> Operator, agent_b -> Operator }
Each component is constructed independently, then composed through trait objects. The orchestrator holds Arc<dyn Operator> references. The environment holds its own operator reference. Nothing knows about concrete types beyond its own construction site.
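The trait-object composition can be sketched with simplified synchronous traits. `OperatorSketch`, `EchoOperator`, and `LocalOrchSketch` are hypothetical stand-ins for the real async `Operator` trait and `LocalOrch` – the point is only that the orchestrator holds `Arc<dyn ...>` and never sees concrete types:

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Simplified synchronous stand-in for the async Operator trait.
trait OperatorSketch: Send + Sync {
    fn execute(&self, input: &str) -> String;
}

struct EchoOperator; // stand-in for one configured context engine
impl OperatorSketch for EchoOperator {
    fn execute(&self, input: &str) -> String {
        format!("echo: {input}")
    }
}

struct UpperOperator; // a second, differently configured operator
impl OperatorSketch for UpperOperator {
    fn execute(&self, input: &str) -> String {
        input.to_uppercase()
    }
}

// The orchestrator maps agent IDs to Arc<dyn Operator>-style trait
// objects; concrete types are erased after construction.
struct LocalOrchSketch {
    agents: HashMap<String, Arc<dyn OperatorSketch>>,
}

impl LocalOrchSketch {
    fn dispatch(&self, agent: &str, input: &str) -> Option<String> {
        self.agents.get(agent).map(|op| op.execute(input))
    }
}

fn main() {
    let mut agents: HashMap<String, Arc<dyn OperatorSketch>> = HashMap::new();
    agents.insert("agent_a".into(), Arc::new(EchoOperator));
    agents.insert("agent_b".into(), Arc::new(UpperOperator));
    let orch = LocalOrchSketch { agents };

    assert_eq!(orch.dispatch("agent_a", "hi").unwrap(), "echo: hi");
    assert_eq!(orch.dispatch("agent_b", "hi").unwrap(), "HI");
    assert!(orch.dispatch("missing", "hi").is_none());
}
```

Swapping an implementation means inserting a different `Arc<dyn OperatorSketch>` – the dispatch code does not change.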
Tools and agents
These terms name configuration patterns built on top of Operator, not separate types.
Tool: An operator registered with ToolMetadata (name, description, JSON input schema, concurrency hint). The metadata makes the operator callable from an LLM reasoning loop. The distinction between a tool and any other operator is configuration, not type — the Operator trait is the same.
Agent: A configured operator. Concretely: an Operator implementation (typically a context engine) wired with a provider, identity, tools, and optionally an Arc<dyn Orchestrator> for sub-dispatching to other agents. The term ‘agent’ has no corresponding trait; it describes how an operator is assembled and what capabilities it receives at construction time.
To create an agent, wrap react_loop() (from skg-context-engine) in a struct that implements Operator. The struct holds the provider, tools, and config. The execute() method creates a fresh Context, assembles domain context into it, and calls react_loop(). The provider’s generic type parameter is erased at the Operator boundary — callers interact with Arc<dyn Operator> and never see the concrete provider type. See the operators guide for a complete example.
Architecture
This section describes the structural design of skelegent in detail:
- The 6-Layer Model – What each layer does, which crates belong to it, and the dependency rules that keep the system composable.
- Protocol Traits – The four protocol traits and two cross-cutting interfaces that form the stability contract.
- Design Decisions – Key architectural choices and the reasoning behind them.
- Dependency Graph – How crates depend on each other, with an ASCII diagram.
For the full design rationale, see ARCHITECTURE.md in the repository root and the detailed specifications in specs/.
The 6-Layer Model
skelegent organizes its crates into six layers plus an umbrella crate. Each layer has a clear responsibility. The fundamental rule: higher layers depend on lower layers, never the reverse.
┌──────────────────────────────────────────────────┐
│ skelegent (umbrella crate) │
│ Feature-gated re-exports of all layers │
├──────────────────────────────────────────────────┤
│ LAYER 5 — Cross-Cutting │
│ Security middleware, lifecycle coordination │
├──────────────────────────────────────────────────┤
│ LAYER 4 — Environment │
│ Isolation, credentials, secret backends, │
│ auth backends, crypto backends │
├──────────────────────────────────────────────────┤
│ LAYER 3 — State │
│ Persistence backends (memory, filesystem) │
├──────────────────────────────────────────────────┤
│ LAYER 2 — Orchestration │
│ Multi-agent composition (local, kit) │
├──────────────────────────────────────────────────┤
│ LAYER 1 — Operator Implementations │
│ Providers, tools, operators, context, MCP │
├──────────────────────────────────────────────────┤
│ LAYER 0 — Protocol Traits (layer0) │
│ 4 protocols + 2 interfaces + message types │
│ The stability contract. Changes: almost never. │
└──────────────────────────────────────────────────┘
Layer 0 – Protocol Traits
Crate: layer0
Layer 0 is the stability contract. It defines the four protocol traits (Operator, Orchestrator, StateStore/StateReader, Environment), two cross-cutting interfaces (per-boundary middleware traits, lifecycle events), and all the message types that cross protocol boundaries (OperatorInput, OperatorOutput, Content, Effect, Scope, typed IDs).
Dependencies: serde, async-trait, thiserror, rust_decimal, serde_json. Nothing else. No runtime, no HTTP, no provider-specific types.
Change frequency: Almost never. Adding a method to a protocol trait is a breaking change that ripples through every implementation. The traits were designed with extension points (#[non_exhaustive] enums, serde_json::Value metadata fields) to avoid needing changes.
Layer 1 – Operator Implementations
Crates:
- `skg-turn` – Shared toolkit: `Provider` trait, `InferRequest`, `InferResponse`, `TokenUsage`, type conversions
- `skg-turn-kit` – Turn decomposition primitives and helpers
- `skg-provider-anthropic` – Anthropic Claude API provider
- `skg-provider-openai` – OpenAI API provider
- `skg-provider-ollama` – Ollama local model provider
- `skg-tool` – `ToolDyn` trait, `ToolRegistry`, `AliasedTool`
- `skg-context` – Conversation context management and compaction strategies
- `skg-mcp` – MCP (Model Context Protocol) client
- `skg-context-engine` – Composable three-phase context engine (assembly, inference, reaction) with tool execution
- `skg-op-single-shot` – Single-shot operator (one model call, no tools)
Layer 1 is where the core agentic loop lives. The Provider trait (defined in skg-turn) is intentionally not object-safe – it uses RPITIT for zero-cost abstraction. The object-safe boundary is layer0::Operator. The bridge is a context engine implementation (generic over the provider type) that implements the object-safe Operator trait.
Layer 2 – Orchestration
Crates:
- `skg-orch-local` – In-process orchestrator using tokio tasks
- `skg-orch-kit` – Shared orchestration utilities
- `skg-effects-core` – `EffectExecutor` trait and shared effect execution types
- `skg-effects-local` – Local effect interpreter (executes effects in-process)
Layer 2 implements layer0::Orchestrator. The LocalOrch dispatches operator invocations in-process using tokio. It maps OperatorId to Arc<dyn Operator> and handles parallel dispatch via tokio::spawn. The effects crates execute Effect payloads declared by operators — they live at Layer 2 because effect execution is an orchestration concern, not a protocol concern.
Future implementations could include Temporal workflows (durable, replayable) or Restate (durable execution with virtual objects).
Layer 3 – State
Crates:
- `skg-state-memory` – In-memory `HashMap` store (ephemeral, good for tests)
- `skg-state-fs` – Filesystem-backed store (durable across restarts)
Layer 3 implements layer0::StateStore. Both backends provide scoped key-value storage with serde_json::Value values. The memory store is ideal for testing and short-lived processes. The filesystem store persists data as files, suitable for CLI tools and local development.
Future implementations could include SQLite (embedded), PostgreSQL (queryable, transactional), or Redis (networked, fast).
Layer 4 – Environment
Crates:
- `skg-env-local` – Local passthrough environment (no isolation)
- `skg-secret` – Secret resolution trait
- `skg-secret-vault` – HashiCorp Vault secrets
- `skg-auth` – Authentication and credential framework
- `skg-crypto` – Cryptographic primitives
Layer 4 implements layer0::Environment and provides the credential infrastructure that environments use. LocalEnv passes through with no isolation – it holds an Arc<dyn Operator> and calls execute() directly. The secret, auth, and crypto backends provide credential resolution for the EnvironmentSpec’s CredentialRef system.
Layer 5 – Cross-Cutting
Crates:
- `skg-hook-security` – Security middleware (`RedactionMiddleware`, `ExfilGuardMiddleware`)
Layer 5 provides security middleware that wraps operator dispatch, store access, and execution boundaries. The per-boundary middleware traits (DispatchMiddleware, StoreMiddleware, ExecMiddleware) are defined in Layer 0 and composed into stacks (DispatchStack, StoreStack, ExecStack). Layer 5 crates supply concrete middleware implementations — for example, RedactionMiddleware scrubs sensitive data from model outputs, and ExfilGuardMiddleware blocks unauthorized data exfiltration through tool calls.
The umbrella crate
Crate: skelegent
The umbrella crate re-exports all layers behind feature flags. It exists so users can write skelegent = { features = ["context-engine", "provider-anthropic"] } instead of depending on 5+ individual crates. See Installation for the full feature flag table.
Dependency rules
- A crate may depend on crates at the same layer or lower layers.
- A crate may never depend on a crate at a higher layer.
- All crates depend on `layer0` (directly or transitively).
- `layer0` depends on nothing in the workspace.
These rules ensure that any layer can be replaced independently. You can swap your state backend without touching your operator code. You can swap your orchestrator without touching your tools. The protocol traits in Layer 0 are the only shared vocabulary.
Protocol Traits
Layer 0 defines four protocol traits and two cross-cutting interfaces. Every protocol trait is object-safe (usable behind `Box<dyn Trait>`, which is `Send + Sync`), uses `#[async_trait]`, and is designed to be operation-defined rather than mechanism-defined.
“Operation-defined” means the trait says what happens, not how. Operator::execute means “cause this agent to process one cycle” – not “make an API call” or “run a subprocess.” This is what makes implementations swappable.
Protocol 1: Operator
Crate: layer0::operator
The operator is what one agent does per cycle. It receives input, assembles context, reasons (model calls), acts (tool execution), and produces output.
#[async_trait]
pub trait Operator: Send + Sync {
    async fn execute(
        &self,
        input: OperatorInput,
    ) -> Result<OperatorOutput, OperatorError>;
}
The trait is one method. The operator is atomic from the outside.
OperatorInput
pub struct OperatorInput {
    pub message: Content,               // The new message/task/signal
    pub trigger: TriggerType,           // What caused this invocation (User, Task, Signal, etc.)
    pub session: Option<SessionId>,     // Session for conversation continuity
    pub config: Option<OperatorConfig>, // Per-invocation config overrides
    pub metadata: serde_json::Value,    // Opaque passthrough (trace IDs, routing, etc.)
}
OperatorInput carries only what is new. It does not include conversation history or memory contents. The operator runtime reads those from a StateStore during context assembly. This keeps the protocol boundary clean.
OperatorConfig
pub struct OperatorConfig {
    pub max_turns: Option<u32>,                 // Max ReAct loop iterations
    pub max_cost: Option<Decimal>,              // Budget in USD
    pub max_duration: Option<DurationMs>,       // Wall-clock timeout
    pub model: Option<String>,                  // Model override
    pub allowed_operators: Option<Vec<String>>, // Operator restrictions
    pub system_addendum: Option<String>,        // Additional system prompt
}
Every field is optional. None means “use the implementation’s default.”
Tools are operators registered with ToolMetadata. The allowed_operators field restricts which operators can be sub-dispatched during a turn; tool names in this list are operator names.
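The "`None` means use the default" convention can be sketched with `Option::unwrap_or`. This is an illustrative, std-only sketch – `ConfigSketch`, `Defaults`, and the `effective_*` helpers are hypothetical, not skelegent's API; only the field names mirror `OperatorConfig`:

```rust
// Sketch: an implementation resolves each Option field against its own
// defaults at the start of a run.

#[derive(Default)]
struct ConfigSketch {
    max_turns: Option<u32>,
    model: Option<String>,
}

struct Defaults {
    max_turns: u32,
    model: &'static str,
}

// Hypothetical implementation defaults.
const DEFAULTS: Defaults = Defaults { max_turns: 10, model: "claude-haiku" };

fn effective_max_turns(cfg: &ConfigSketch) -> u32 {
    // None means "use the implementation's default".
    cfg.max_turns.unwrap_or(DEFAULTS.max_turns)
}

fn effective_model(cfg: &ConfigSketch) -> String {
    cfg.model.clone().unwrap_or_else(|| DEFAULTS.model.to_string())
}

fn main() {
    let cfg = ConfigSketch { max_turns: Some(3), ..Default::default() };
    assert_eq!(effective_max_turns(&cfg), 3);          // explicit override wins
    assert_eq!(effective_model(&cfg), "claude-haiku"); // None falls back
}
```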
OperatorOutput
pub struct OperatorOutput {
    pub message: Content,           // The operator's response
    pub exit_reason: ExitReason,    // Why the loop stopped
    pub metadata: OperatorMetadata, // Tokens, cost, timing, tool records
    pub effects: Vec<Effect>,       // Side-effects to execute
}
The effects field is a critical design decision. The operator declares effects but does not execute them. The calling layer (orchestrator, environment, lifecycle coordinator) decides when and how to execute them. This is what makes the same operator code work both in-process and in a durable workflow.
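The declare/execute split can be sketched with effects as plain data. The enum variants, `OutputSketch`, and `execute_effects` here are illustrative stand-ins, not skelegent's actual `Effect` type:

```rust
// Sketch: the operator returns effects as data; the calling layer
// interprets them. A durable workflow engine could persist and replay
// the same list instead of executing it immediately.

#[derive(Debug, Clone, PartialEq)]
enum EffectSketch {
    StateWrite { key: String, value: String },
    Notify { message: String },
}

struct OutputSketch {
    message: String,
    effects: Vec<EffectSketch>,
}

// A local interpreter: collects state writes and notifications.
fn execute_effects(effects: &[EffectSketch]) -> (Vec<(String, String)>, Vec<String>) {
    let mut writes = Vec::new();
    let mut notices = Vec::new();
    for effect in effects {
        match effect {
            EffectSketch::StateWrite { key, value } => {
                writes.push((key.clone(), value.clone()));
            }
            EffectSketch::Notify { message } => notices.push(message.clone()),
        }
    }
    (writes, notices)
}

fn main() {
    // The operator only *declared* these effects...
    let output = OutputSketch {
        message: "done".into(),
        effects: vec![
            EffectSketch::StateWrite {
                key: "memory/summary".into(),
                value: "user prefers Rust".into(),
            },
            EffectSketch::Notify { message: "turn complete".into() },
        ],
    };

    // ...the calling layer decides when to execute them.
    let (writes, notices) = execute_effects(&output.effects);
    assert_eq!(writes.len(), 1);
    assert_eq!(notices, vec!["turn complete"]);
    assert_eq!(output.message, "done");
}
```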
ExitReason
pub enum ExitReason {
    Complete,                          // Natural completion
    MaxTurns,                          // Hit iteration limit
    BudgetExhausted,                   // Hit cost budget
    CircuitBreaker,                    // Consecutive failures
    Timeout,                           // Wall-clock timeout
    MiddlewareHalt { reason: String }, // Middleware halted execution
    Error,                             // Unrecoverable error
    Custom(String),                    // Extension point
}
OperatorMetadata
pub struct OperatorMetadata {
    pub tokens_in: u64,
    pub tokens_out: u64,
    pub cost: Decimal, // USD, precise
    pub turns_used: u32,
    pub sub_dispatches: Vec<SubDispatchRecord>,
    pub duration: DurationMs,
}
Every field is concrete (not optional) because every operator produces this data. Implementations that cannot track a field (e.g., cost for a local model) use zero.
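The accumulation error that motivates a decimal `cost` type is easy to demonstrate. This std-only sketch shows binary floating point drifting on a decimal price, while exact integer cents (a stand-in for fixed-point decimal, since `rust_decimal` is not pulled in here) stay precise:

```rust
fn main() {
    // The classic case: $0.10 accumulated over ten calls.
    let mut total = 0.0_f64;
    for _ in 0..10 {
        total += 0.1;
    }
    // 0.1 has no exact binary representation, so the sum is not 1.0.
    assert!(total != 1.0);

    // Integer cents (a stand-in for a fixed-point decimal type) stay exact.
    let total_cents: u64 = (0..10).map(|_| 10_u64).sum();
    assert_eq!(total_cents, 100); // exactly $1.00
}
```

Over thousands of LLM calls the per-call error compounds, which is why all monetary values in the protocol use `rust_decimal::Decimal`.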
SubDispatchRecord
SubDispatchRecord captures the result of a single sub-operator dispatch within a turn:
pub struct SubDispatchRecord {
    pub name: String,         // Operator name that was dispatched
    pub duration: DurationMs, // Wall-clock time for that dispatch
    pub success: bool,        // Whether the dispatch completed without error
}
Protocol 2: Orchestrator
Crate: layer0::orchestrator
How operators from different agents compose, and how execution survives failures.
#[async_trait]
pub trait Orchestrator: Send + Sync {
    async fn dispatch(
        &self,
        operator: &OperatorId,
        input: OperatorInput,
    ) -> Result<OperatorOutput, OrchError>;

    async fn dispatch_many(
        &self,
        tasks: Vec<(OperatorId, OperatorInput)>,
    ) -> Vec<Result<OperatorOutput, OrchError>>;

    async fn signal(
        &self,
        target: &WorkflowId,
        signal: SignalPayload,
    ) -> Result<(), OrchError>;

    async fn query(
        &self,
        target: &WorkflowId,
        query: QueryPayload,
    ) -> Result<serde_json::Value, OrchError>;
}
- `dispatch` – Send an operator invocation to a specific agent. May be in-process or remote.
- `dispatch_many` – Parallel dispatch. Results are returned in input order. Individual tasks may fail independently.
- `signal` – Fire-and-forget message to a running workflow. Returns when accepted, not when processed.
- `query` – Read-only query of a workflow's state. Returns `serde_json::Value` (schema depends on the workflow).
The key property: calling code does not know which implementation is behind the trait. dispatch() might be a function call or a network hop.
Protocol 3: StateStore / StateReader
Crate: layer0::state
How data persists and is retrieved across turns and sessions.
#[async_trait]
pub trait StateStore: Send + Sync {
    async fn read(&self, scope: &Scope, key: &str)
        -> Result<Option<serde_json::Value>, StateError>;

    async fn write(&self, scope: &Scope, key: &str, value: serde_json::Value)
        -> Result<(), StateError>;

    async fn delete(&self, scope: &Scope, key: &str)
        -> Result<(), StateError>;

    async fn list(&self, scope: &Scope, prefix: &str)
        -> Result<Vec<String>, StateError>;

    async fn search(&self, scope: &Scope, query: &str, limit: usize)
        -> Result<Vec<SearchResult>, StateError>;
}
The trait is deliberately minimal: CRUD + list + search. Compaction is not part of this trait because it requires cross-protocol coordination (the lifecycle interface). Versioning is not part of this trait because not all backends support it.
StateReader is a read-only projection:
#[async_trait]
pub trait StateReader: Send + Sync {
    async fn read(&self, scope: &Scope, key: &str)
        -> Result<Option<serde_json::Value>, StateError>;

    async fn list(&self, scope: &Scope, prefix: &str)
        -> Result<Vec<String>, StateError>;

    async fn search(&self, scope: &Scope, query: &str, limit: usize)
        -> Result<Vec<SearchResult>, StateError>;
}
Every StateStore automatically implements StateReader via a blanket impl. Operators receive &dyn StateReader during context assembly – they can read but cannot write directly. Writes go through Effects in the OperatorOutput.
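The blanket-impl pattern can be sketched with simplified synchronous traits. Everything here (`StateStoreSketch`, `StateReaderSketch`, `MemStore`, `assemble_context`) is an illustrative stand-in for the real async, scoped traits – the point is that any store is automatically usable where only a reader is accepted:

```rust
use std::collections::HashMap;

// Simplified synchronous stand-in for the full store trait.
trait StateStoreSketch {
    fn read(&self, key: &str) -> Option<String>;
    fn write(&mut self, key: &str, value: String);
}

// Read-only projection.
trait StateReaderSketch {
    fn read(&self, key: &str) -> Option<String>;
}

// Blanket impl: every store is automatically a reader.
impl<T: StateStoreSketch> StateReaderSketch for T {
    fn read(&self, key: &str) -> Option<String> {
        StateStoreSketch::read(self, key)
    }
}

#[derive(Default)]
struct MemStore(HashMap<String, String>);

impl StateStoreSketch for MemStore {
    fn read(&self, key: &str) -> Option<String> {
        self.0.get(key).cloned()
    }
    fn write(&mut self, key: &str, value: String) {
        self.0.insert(key.to_string(), value);
    }
}

// Context assembly only ever sees the read-only projection: it can read
// state but has no way to write (writes must go through effects).
fn assemble_context(reader: &dyn StateReaderSketch) -> String {
    reader.read("memory/summary").unwrap_or_default()
}

fn main() {
    let mut store = MemStore::default();
    store.write("memory/summary", "user prefers Rust".into());
    assert_eq!(assemble_context(&store), "user prefers Rust");
}
```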
Protocol 4: Environment
Crate: layer0::environment
How an operator executes within an isolated context.
#[async_trait]
pub trait Environment: Send + Sync {
    async fn run(
        &self,
        input: OperatorInput,
        spec: &EnvironmentSpec,
    ) -> Result<OperatorOutput, EnvError>;
}
The Environment owns or has access to whatever it needs to execute an operator. run() takes only data (OperatorInput + EnvironmentSpec), not a function reference. For LocalEnv, the operator is an Arc<dyn Operator> stored at construction time. For a hypothetical DockerEnvironment, the input would be serialized, sent to a container, and the output deserialized.
EnvironmentSpec
pub struct EnvironmentSpec {
    pub isolation: Vec<IsolationBoundary>, // Process, Container, Gvisor, MicroVm, Wasm, etc.
    pub credentials: Vec<CredentialRef>,   // Secrets to inject
    pub resources: Option<ResourceLimits>, // CPU, memory, disk, GPU limits
    pub network: Option<NetworkPolicy>,    // Allow/deny rules
}
Interface 5: Per-Boundary Middleware
Crate: layer0::middleware
Observation and intervention at protocol boundaries. Three traits cover the three boundaries where cross-cutting logic is needed:
#[async_trait]
pub trait DispatchMiddleware: Send + Sync {
    async fn on_dispatch(
        &self,
        operator: &OperatorId,
        input: OperatorInput,
        next: DispatchNext<'_>,
    ) -> Result<OperatorOutput, OrchError>;
}

#[async_trait]
pub trait StoreMiddleware: Send + Sync {
    async fn on_read(
        &self,
        scope: &Scope,
        key: &str,
        next: StoreNext<'_>,
    ) -> Result<Option<serde_json::Value>, StateError>;

    async fn on_write(
        &self,
        scope: &Scope,
        key: &str,
        value: serde_json::Value,
        next: StoreNext<'_>,
    ) -> Result<(), StateError>;
}

#[async_trait]
pub trait ExecMiddleware: Send + Sync {
    async fn on_exec(
        &self,
        input: OperatorInput,
        next: ExecNext<'_>,
    ) -> Result<OperatorOutput, OperatorError>;
}
Each middleware wraps the next layer in the stack. The next parameter is a callback that invokes the rest of the middleware chain (and ultimately the real implementation). Middleware can inspect/modify inputs before calling next, inspect/modify outputs after, or short-circuit by returning early without calling next.
Middleware is composed into stacks:
- DispatchStack – wraps orchestrator dispatch (budget enforcement, logging, routing)
- StoreStack – wraps state store access (redaction, audit logging)
- ExecStack – wraps operator execution (security guardrails, telemetry)
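The next-callback pattern can be sketched with a synchronous toy version (the real traits are async and use `DispatchNext`/`StoreNext`/`ExecNext`; all names below are illustrative):

```rust
// The "next" callback invokes the rest of the chain.
type Next<'a> = &'a dyn Fn(String) -> Result<String, String>;

trait Middleware {
    fn on_exec(&self, input: String, next: Next) -> Result<String, String>;
}

struct Logger;
impl Middleware for Logger {
    fn on_exec(&self, input: String, next: Next) -> Result<String, String> {
        // Inspect input before, call the rest of the chain, decorate output after.
        let out = next(input)?;
        Ok(format!("{out} [logged]"))
    }
}

struct Guard { blocked: String }
impl Middleware for Guard {
    fn on_exec(&self, input: String, next: Next) -> Result<String, String> {
        if input.contains(&self.blocked) {
            return Err("blocked".into()); // short-circuit: never calls next
        }
        next(input)
    }
}

// A stack folds each middleware around the innermost real implementation.
fn run_stack(stack: &[&dyn Middleware], input: String) -> Result<String, String> {
    match stack.split_first() {
        None => Ok(format!("executed: {input}")), // the real handler
        Some((mw, rest)) => mw.on_exec(input, &|i| run_stack(rest, i)),
    }
}
```

The same shape scales to the three async stacks: each layer sees the request on the way in and the response on the way out, and a layer that returns without calling `next` aborts the whole chain.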
The Rule system provides typed interception within the context engine itself, for operator-internal use cases like tool-call filtering that are not cross-cutting. Rules fire via the Trigger enum: Before (pre-inference, pre-tool), After (post-inference, post-tool), or When (exit checks, steering).
Interface 6: Lifecycle Events
Crate: layer0::lifecycle
Cross-layer coordination events:
- BudgetEvent – Emitted when cost thresholds are crossed. A hook observes cost, emits a budget event, and the orchestrator can react (cancel the workflow, notify the user, adjust limits).
- CompactionEvent – Coordinates context compaction between the operator and the state store.
- ObservableEvent – General-purpose observable events for telemetry and monitoring.
These events are the glue between protocols. They carry information across boundaries that individual protocols cannot see.
Design Decisions
This page summarizes the key architectural decisions in skelegent and the reasoning behind each one.
Why #[async_trait] instead of native async traits
Decision: All Layer 0 protocol traits use #[async_trait] (heap-allocated futures). Internal traits in Layer 1 (like Provider) use RPITIT (native async, zero-cost).
Reasoning: Rust stabilized async fn in traits, but async fn in dyn Trait is still not available natively. Layer 0 traits must be object-safe because the entire composition model depends on Box<dyn Operator>, Arc<dyn StateStore>, etc. The async_trait macro provides this by boxing the returned future.
Internal traits like Provider are never used behind dyn – they appear as generic type parameters in operator wrappers around react_loop (e.g., struct MyOperator<P: Provider>). These can use RPITIT for zero-cost abstraction. The object-safe boundary is the Operator trait, which is the protocol boundary.
Future: When Rust stabilizes async fn in dyn Trait with Send bounds, the Layer 0 traits will migrate to native async. This will be a breaking change in a minor version before v1.0.
Why serde_json::Value for state values
Decision: StateStore stores serde_json::Value, not generic T: Serialize.
Reasoning: A generic T would destroy object safety. StateStore must work as dyn StateStore because orchestrators, environments, and operators all share a state store through trait objects. Making the trait generic over the value type would require callers to agree on concrete types at compile time, defeating the purpose of dynamic composition.
serde_json::Value is the universal interchange format for agentic systems. Every LLM API speaks JSON. Every tool accepts and returns JSON. The cost (no compile-time schema checking) is acceptable because state data crosses process boundaries, is persisted to disk, and may be read by different versions of the code.
Why rust_decimal::Decimal for cost tracking
Decision: All monetary values (OperatorMetadata.cost, OperatorConfig.max_cost) use rust_decimal::Decimal.
Reasoning: Floating-point accumulation errors are real when tracking spend across thousands of LLM calls. f64 introduces rounding errors that compound over time. A system that runs 10,000 model calls per day, each costing fractions of a cent, needs exact arithmetic to produce accurate cost reports and enforce budgets precisely.
Decimal adds one dependency to Layer 0 but eliminates an entire class of bugs.
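The drift is easy to demonstrate with nothing but the standard library. The sketch below accumulates 10,000 calls at $0.0001 each: the f64 total misses the exact $1.00, while an exact representation (integer micro-dollars here, `rust_decimal::Decimal` in skelegent) does not:

```rust
// Accumulate `calls` charges of `cost_per_call` dollars in f64.
fn f64_total(calls: u32, cost_per_call: f64) -> f64 {
    (0..calls).fold(0.0, |acc, _| acc + cost_per_call)
}

// Exact arithmetic stand-in: track micro-dollars as integers.
// rust_decimal::Decimal plays this role in skelegent.
fn exact_total_microdollars(calls: u64, cost_microdollars: u64) -> u64 {
    (0..calls).fold(0, |acc, _| acc + cost_microdollars)
}
```

The f64 error here is tiny, but a budget check like `total <= max_cost` can flip on exactly this kind of residue, and the error grows as totals are carried across days of reports.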
Why four protocols plus two interfaces
Decision: The architecture has four protocol traits (Operator, Orchestrator, StateStore, Environment) and two cross-cutting interfaces (per-boundary middleware, lifecycle events).
Reasoning: The four protocols are orthogonal concerns that compose independently:
- Operator – What happens in a single agent cycle (reasoning + acting).
- Orchestrator – How multiple agents compose (topology + durability).
- State – How data persists (storage backend).
- Environment – Where code runs (isolation + credentials).
These were derived from analyzing 23 architectural decisions that every agentic system must make. The four protocols cover all 23 decisions without overlap. Reducing to three protocols (by merging state into environment, or orchestration into operator) creates coupling where orthogonal concerns should be independent. Expanding to five or more protocols creates distinctions without meaningful boundaries.
The two interfaces (middleware and lifecycle events) are cross-cutting – they span multiple protocols and cannot be owned by any single one. A budget event involves the operator (which tracks cost), the middleware (which observes it), and the orchestrator (which reacts to it). Making this a method on any single trait would couple unrelated protocols.
Why edition 2024
Decision: The workspace uses Rust edition 2024.
Reasoning: Edition 2024 is the latest stable edition. The compiler versions it requires also ship modern language features such as RPITIT (return-position impl Trait in traits, stabilized in Rust 1.75 independently of edition), which lets traits like Provider use zero-cost async abstractions without workarounds like the async_trait macro. All core dependencies of the workspace are compatible with edition 2024.
Why #[non_exhaustive] on all enums and structs
Decision: All public enums (ExitReason, TriggerType, etc.) and structs (OperatorInput, OperatorOutput, OperatorConfig, etc.) in Layer 0 are marked #[non_exhaustive].
Reasoning: Layer 0 is the stability contract. Adding a variant to an enum or a field to a struct should not be a breaking change. #[non_exhaustive] forces downstream code to handle unknown variants (with _ => arms) and prevents struct literal construction (forcing use of constructors or builder methods). This gives Layer 0 the freedom to evolve without breaking every implementation.
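What this means for downstream code can be sketched with the variant names from this chapter (illustrative subset only; note that `#[non_exhaustive]` is enforced across crate boundaries, so in a single-file sketch the wildcard arm is shown by convention rather than compiler force):

```rust
// Illustrative subset of layer0's ExitReason; the real enum has more variants.
#[non_exhaustive]
pub enum ExitReason {
    Complete,
    MaxTurns,
    BudgetExhausted,
}

// Downstream code must keep a wildcard arm, so layer0 can add
// variants later without breaking this match.
pub fn describe(reason: &ExitReason) -> &'static str {
    match reason {
        ExitReason::Complete => "finished",
        ExitReason::MaxTurns => "turn limit",
        _ => "other exit reason", // required across crate boundaries
    }
}
```

Likewise, `#[non_exhaustive]` structs cannot be built with literal syntax outside their crate, which is why Layer 0 types are constructed through `new()`-style constructors.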
Why operators declare effects instead of executing them
Decision: OperatorOutput.effects contains Vec<Effect> – the operator declares side-effects but does not execute them.
Reasoning: The same operator code must work in radically different execution contexts. An operator running in-process has its effects executed immediately by the caller. An operator running inside a Temporal activity has its effects serialized and executed by the workflow engine. If the operator executed effects directly, it would be coupled to its execution context.
The effect declaration pattern makes operators pure functions over data: input in, output + effects out. The calling layer decides execution semantics.
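A toy version of the declare-then-execute split (the `Effect` variants below are invented for illustration; the real layer0 enum is defined elsewhere):

```rust
use std::collections::HashMap;

// Hypothetical effect variants — illustration only.
#[derive(Debug, PartialEq)]
enum Effect {
    StateWrite { key: String, value: String },
    Notify { message: String },
}

struct OperatorOutput {
    response: String,
    effects: Vec<Effect>,
}

// The operator is a pure function over data: it declares effects...
fn run_operator(input: &str) -> OperatorOutput {
    OperatorOutput {
        response: format!("handled: {input}"),
        effects: vec![Effect::StateWrite {
            key: "last_input".into(),
            value: input.into(),
        }],
    }
}

// ...and the calling layer decides how (and whether) to execute them.
fn apply_effects(effects: &[Effect], store: &mut HashMap<String, String>) {
    for effect in effects {
        match effect {
            Effect::StateWrite { key, value } => {
                store.insert(key.clone(), value.clone());
            }
            Effect::Notify { message } => println!("notify: {message}"),
        }
    }
}
```

An in-process caller applies the effects immediately, as above; a workflow engine would serialize the same `Vec<Effect>` and execute it durably. The operator code is identical in both cases.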
Why the Provider trait is not object-safe
Decision: The Provider trait (in skg-turn) uses RPITIT and is not object-safe. It is never used behind dyn Provider.
Reasoning: Provider implementations are performance-critical – they make HTTP calls to LLM APIs. The zero-cost abstraction of RPITIT (no heap allocation for the future) is worth the restriction of not using dyn Provider. The object-safe boundary is one layer up: a concrete operator wrapper (generic over P: Provider) implements dyn Operator. The generic type parameter is erased at the protocol boundary.
This is the general pattern: internal implementation traits can be generic and non-object-safe for performance. Protocol traits must be object-safe for composition. The bridge between them is a concrete type that is generic internally but implements an object-safe trait externally.
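A synchronous analogue of the bridge (the real `Provider` is async; names below are illustrative):

```rust
// Internal trait: an RPITIT return type makes it non-object-safe.
trait Backend {
    fn tokens(&self) -> impl Iterator<Item = &'static str>;
}

struct FixedBackend;
impl Backend for FixedBackend {
    fn tokens(&self) -> impl Iterator<Item = &'static str> {
        ["hello", "world"].into_iter()
    }
}

// Object-safe protocol boundary: usable behind `dyn`.
trait Op {
    fn run(&self) -> String;
}

// The bridge: generic internally, object-safe externally.
struct BackendOp<B: Backend>(B);

impl<B: Backend> Op for BackendOp<B> {
    fn run(&self) -> String {
        self.0.tokens().collect::<Vec<_>>().join(" ")
    }
}
```

`BackendOp<FixedBackend>` coerces to `Box<dyn Op>`, so the concrete backend type never leaks past the protocol boundary — exactly the role a `P: Provider` wrapper plays behind `dyn Operator`.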
Dependency Graph
This page shows how skelegent’s crates depend on each other. The fundamental rule is that dependencies flow downward: higher layers depend on lower layers, never the reverse.
Note: The ASCII diagram below reflects the core dependency relationships but is incomplete —
skg-effects-core, skg-effects-local, skg-turn-kit, skg-auth, and skg-crypto are not shown. See the crate list in layers.md for the complete and authoritative crate inventory.
ASCII dependency graph
skelegent (umbrella)
feature-gated re-exports of all layers
│
┌────────────────────┼────────────────────────┐
│ │ │
▼ ▼ ▼
skg-context-engine skg-op-single-shot skg-orch-local
(Layer 1) (Layer 1) (Layer 2)
│ │ │ │ │
│ │ │ │ └──► skg-orch-kit (L2)
│ │ │ │ │
│ └─────────────────┼───────────────────────┘ │
│ │ │
│ ▼ │
│ skg-turn ◄───────────────────────┘
│ (Layer 1)
│ ▲ ▲ ▲
│ ┌───────────┘ │ └───────────┐
│ │ │ │
│ skg-provider- skg-provider- skg-provider-
│ anthropic openai ollama
│ (Layer 1) (Layer 1) (Layer 1)
│
▼
skg-tool skg-mcp
(Layer 1) (Layer 1)
│ │
│ │
└────────┬───────────────┘
│
▼
layer0
(Layer 0)
▲
│
┌────────┼──────────┬──────────────┐
│ │ │ │
skg- skg- skg- skg-
state- state- env-local secret-*
memory fs (Layer 4) skg-auth-*
(Layer 3) (Layer 3) skg-crypto-*
(Layer 4)
Key relationships
Layer 0: The foundation
layer0 has no workspace dependencies. It depends only on:
- serde (serialization for protocol messages)
- async-trait (object-safe async traits)
- thiserror (ergonomic error types)
- rust_decimal (precise cost tracking)
- serde_json (for Value in metadata and state)
Every other crate in the workspace depends on layer0, directly or transitively.
Layer 1: Operator ecosystem
The operator ecosystem has several internal dependencies:
- skg-turn provides the Provider trait and shared types. All three provider crates depend on it.
- skg-tool provides ToolDyn and ToolRegistry. It depends only on layer0.
- skg-mcp depends on skg-tool (it creates tools from MCP servers).
- skg-context-engine depends on skg-turn (for Provider), skg-tool (for ToolRegistry), and layer0 (for middleware traits).
- skg-op-single-shot depends on skg-turn and layer0.
Layer 2: Orchestration
- skg-orch-local depends on layer0 and skg-orch-kit. It holds Arc<dyn Operator> references.
- skg-orch-kit provides shared utilities for orchestrator implementations.
Layer 3: State
skg-state-memory and skg-state-fs depend only on layer0 (and tokio for async I/O). They are completely independent of each other and of all other layers.
Layer 4: Environment and credentials
- skg-env-local depends on layer0. It holds an Arc<dyn Operator>.
- The secret backends (skg-secret-*), auth backends (skg-auth-*), and crypto backends (skg-crypto-*) depend on skg-secret / skg-auth / skg-crypto respectively, and transitively on layer0.
Layer 5: Cross-cutting
skg-hook-security depends on layer0 (for middleware traits). It provides RedactionMiddleware and ExfilGuardMiddleware.
The umbrella
skelegent depends on everything, all behind optional = true with feature flags. It re-exports but adds no logic.
External dependencies by layer
| Layer | External deps |
|---|---|
| 0 | serde, async-trait, thiserror, rust_decimal, serde_json |
| 1 | reqwest, tokio, serde_json, schemars (tools) |
| 2 | tokio |
| 3 | tokio |
| 4 | Provider-specific SDKs (aws-sdk, gcp, reqwest) |
| 5 | layer0 only (middleware is pure logic) |
Crates not shown in the ASCII diagram
The following crates were added after the diagram was drawn and are not yet reflected in the ASCII art above:
| Crate | Layer | Depends on |
|---|---|---|
| skg-turn-kit | 1 | layer0, skg-turn |
| skg-effects-core | 2 | layer0 |
| skg-effects-local | 2 | layer0, skg-effects-core |
| skg-auth | 4 | layer0 |
| skg-crypto | 4 | layer0 |
Guides
Practical guides for working with each subsystem in skelegent:
- Providers – How to configure and use LLM providers (Anthropic, OpenAI, Ollama).
- Tools – How to create tools, register them, and use them with operators.
- Operators – ReAct (reasoning loop with tools) and single-shot (one call, no tools) operators.
- State – Persisting data with in-memory and filesystem state stores.
- Orchestration – Multi-agent composition with the local orchestrator.
- Hooks – Observing and intervening in operator execution.
- MCP – Model Context Protocol integration.
- Testing – Testing patterns, mock providers, and the test-utils feature.
Each guide focuses on one subsystem. For architectural context (why things are designed this way), see the Architecture section.
Providers
Providers are the LLM backend abstraction in skelegent. Each provider implements the Provider trait (defined in skg-turn), which sends a completion request to an LLM API and returns the response.
The Provider trait
#![allow(unused)]
fn main() {
pub trait Provider: Send + Sync {
fn infer(
&self,
request: InferRequest,
) -> impl Future<Output = Result<InferResponse, ProviderError>> + Send;
}
}
This trait uses RPITIT (return-position impl Trait in traits) and is intentionally not object-safe. The object-safe boundary is layer0::Operator, not Provider. See Design Decisions for why.
Available providers
Anthropic (skg-provider-anthropic)
Connects to the Anthropic Messages API for Claude models.
#![allow(unused)]
fn main() {
use skg_provider_anthropic::AnthropicProvider;
let provider = AnthropicProvider::new("sk-ant-...");
}
Configuration:
- API key: Passed to new(). Read it from ANTHROPIC_API_KEY in production.
- Default model: claude-haiku-4-5-20251001. Override per-request via InferRequest.model.
- Default max tokens: 4096. Override per-request via InferRequest.max_tokens.
- API URL: Override with .with_url() for proxies or testing.
#![allow(unused)]
fn main() {
use skg_provider_anthropic::AnthropicProvider;
let provider = AnthropicProvider::new("sk-ant-...")
.with_url("https://proxy.example.com/v1/messages");
}
Cost is calculated per-response based on input and output token counts using the Haiku pricing model.
OpenAI (skg-provider-openai)
Connects to the OpenAI Chat Completions API.
#![allow(unused)]
fn main() {
use skg_provider_openai::OpenAiProvider;
let provider = OpenAiProvider::new("sk-...");
}
Configuration:
- API key: Passed to new(). Read it from OPENAI_API_KEY in production.
- API URL: Override with .with_url() for Azure OpenAI or proxies.
Ollama (skg-provider-ollama)
Connects to a local Ollama instance for running open-weight models.
#![allow(unused)]
fn main() {
use skg_provider_ollama::OllamaProvider;
let provider = OllamaProvider::new(); // defaults to http://localhost:11434
}
Configuration:
- URL: Defaults to http://localhost:11434. Override with .with_url().
- No API key required (Ollama runs locally).
InferRequest and InferResponse
The InferRequest struct is the common input to all providers:
#![allow(unused)]
fn main() {
pub struct InferRequest {
pub model: Option<String>, // Model identifier
pub messages: Vec<Message>, // Conversation history (layer0 types)
pub tools: Vec<ToolSchema>, // Available tools (JSON Schema)
pub max_tokens: Option<u32>, // Max output tokens
pub temperature: Option<f64>, // Sampling temperature
pub system: Option<String>, // System prompt
pub extra: serde_json::Value, // Provider-specific extensions
}
}
The extra field allows provider-specific features (Anthropic’s prompt caching, thinking blocks, etc.) without polluting the common interface.
The InferResponse contains the model’s output:
#![allow(unused)]
fn main() {
pub struct InferResponse {
pub content: Content, // Response content
pub tool_calls: Vec<ToolCall>, // Tool calls requested by the model
pub stop_reason: StopReason, // EndTurn, ToolUse, MaxTokens
pub usage: TokenUsage, // Input/output/cache tokens
pub model: String, // Model that responded
pub cost: Option<Decimal>, // Calculated cost in USD
}
}
Using providers with operators
Providers are not used directly in most application code. Instead, you pass a provider to an operator:
#![allow(unused)]
fn main() {
use skg_context_engine::{Context, react_loop, ReactLoopConfig};
use skg_provider_anthropic::AnthropicProvider;
use skg_tool::{ToolRegistry, ToolCallContext};
let provider = AnthropicProvider::new("sk-ant-...");
let config = ReactLoopConfig {
system_prompt: "You are a helpful assistant.".into(),
model: Some("claude-haiku-4-5-20251001".into()),
max_tokens: Some(4096),
temperature: None,
};
let tools = ToolRegistry::new();
let tool_context = ToolCallContext::empty();
let mut context = Context::new();
// react_loop drives the ReAct loop, calling provider.infer() as needed
let response = react_loop(&provider, &tools, &tool_context, &mut context, &config).await;
}
To make this usable behind layer0::Operator (which is object-safe), wrap the provider, tools, and config in a struct that implements Operator. The Provider type parameter is erased at the Operator trait boundary – callers interact with &dyn Operator or Box<dyn Operator>.
Error handling
Provider errors are represented by ProviderError:
#![allow(unused)]
fn main() {
pub enum ProviderError {
TransientError { message: String, status: Option<u16> },
RateLimited,
ContentBlocked { message: String },
AuthFailed(String),
InvalidResponse(String),
Other(Box<dyn Error + Send + Sync>),
}
}
ProviderError::is_retryable() returns true for RateLimited and TransientError (transient network errors), and false for AuthFailed, ContentBlocked, and InvalidResponse (permanent errors). Operator implementations use this to decide whether to retry.
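The retry decision can be sketched with a simplified stand-in for ProviderError (same retryability split as described above; the helper and attempt policy are illustrative, not the skelegent API):

```rust
// Simplified stand-in for ProviderError with the same retryable/permanent split.
#[derive(Debug, Clone, PartialEq)]
enum CallError {
    RateLimited,
    Transient,
    AuthFailed,
}

impl CallError {
    fn is_retryable(&self) -> bool {
        matches!(self, CallError::RateLimited | CallError::Transient)
    }
}

// Retry only retryable errors, up to a fixed number of attempts.
fn call_with_retry(
    mut call: impl FnMut() -> Result<String, CallError>,
    max_attempts: u32,
) -> Result<String, CallError> {
    let mut last = CallError::Transient;
    for _ in 0..max_attempts {
        match call() {
            Ok(v) => return Ok(v),
            Err(e) if e.is_retryable() => last = e, // retry on next iteration
            Err(e) => return Err(e),                // permanent: fail immediately
        }
    }
    Err(last)
}
```

A production operator would add backoff between attempts, but the classification step — retryable versus permanent — is the part `is_retryable()` provides.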
Tools
Tools give operators the ability to take actions: read files, make HTTP requests, query databases, or perform any side-effecting operation. The tool system is built around the ToolDyn trait and the ToolRegistry.
Unified model: In the skelegent architecture, tools are operators registered with ToolMetadata. ToolDyn and ToolRegistry are the Layer 1 convenience API; at the protocol level, tools are dispatched as operators. The ToolOperator adapter (from skg_tool::adapter) bridges ToolDyn to the Operator trait, so any ToolDyn implementation can be used anywhere an Operator is expected.
The ToolDyn trait
#![allow(unused)]
fn main() {
pub trait ToolDyn: Send + Sync {
fn name(&self) -> &str;
fn description(&self) -> &str;
fn input_schema(&self) -> serde_json::Value;
fn call(
&self,
input: serde_json::Value,
ctx: &ToolCallContext,
) -> Pin<Box<dyn Future<Output = Result<serde_json::Value, ToolError>> + Send + '_>>;
}
}
ToolDyn is object-safe. Tools are stored as Arc<dyn ToolDyn> and can be composed dynamically at runtime. The four methods:
- name() – Unique identifier for the tool. This is what the model uses to request the tool.
- description() – Human-readable description. Sent to the model as part of the tool definition.
- input_schema() – JSON Schema describing the tool’s parameters. The model generates input conforming to this schema.
- call() – Async execution. Takes JSON input and a &ToolCallContext, returns JSON output or a ToolError.
Creating a tool
Implement ToolDyn for any struct:
#![allow(unused)]
fn main() {
use skg_tool::{ToolCallContext, ToolDyn, ToolError};
use serde_json::{json, Value};
use std::future::Future;
use std::pin::Pin;
struct ReadFileTool;
impl ToolDyn for ReadFileTool {
fn name(&self) -> &str {
"read_file"
}
fn description(&self) -> &str {
"Read the contents of a file at the given path."
}
fn input_schema(&self) -> Value {
json!({
"type": "object",
"properties": {
"path": {
"type": "string",
"description": "The file path to read"
}
},
"required": ["path"]
})
}
fn call(
&self,
input: Value,
ctx: &ToolCallContext,
) -> Pin<Box<dyn Future<Output = Result<Value, ToolError>> + Send + '_>> {
Box::pin(async move {
let path = input["path"]
.as_str()
.ok_or_else(|| ToolError::InvalidInput("missing 'path'".into()))?;
let contents = tokio::fs::read_to_string(path)
.await
.map_err(|e| ToolError::ExecutionFailed(e.to_string()))?;
Ok(json!({ "contents": contents }))
})
}
}
}
The ToolRegistry
ToolRegistry is a named collection of tools:
#![allow(unused)]
fn main() {
use skg_tool::{ToolCallContext, ToolRegistry};
use serde_json::json;
use std::sync::Arc;
let mut registry = ToolRegistry::new();
registry.register(Arc::new(ReadFileTool));
registry.register(Arc::new(WriteFileTool));
registry.register(Arc::new(BashTool));
// Look up by name
if let Some(tool) = registry.get("read_file") {
let result = tool.call(json!({"path": "/tmp/test.txt"}), &ToolCallContext::empty()).await?;
}
// Iterate all tools (e.g., to build tool definitions for the model)
for tool in registry.iter() {
println!("{}: {}", tool.name(), tool.description());
}
}
Tools are keyed by name. Registering a tool with the same name as an existing tool overwrites it.
AliasedTool
AliasedTool wraps an existing tool under a different name. This is useful when importing tools from external systems (e.g., MCP servers) where upstream names do not match your desired naming scheme:
#![allow(unused)]
fn main() {
use skg_tool::{AliasedTool, ToolDyn};
use std::sync::Arc;
let original: Arc<dyn ToolDyn> = Arc::new(ReadFileTool);
let aliased = Arc::new(AliasedTool::new("read", original));
assert_eq!(aliased.name(), "read");
// description, schema, and call behavior are delegated to the inner tool
}
Tool errors
#![allow(unused)]
fn main() {
pub enum ToolError {
NotFound(String), // Tool not found in registry
ExecutionFailed(String), // Tool execution failed
InvalidInput(String), // Input didn't match schema
Other(Box<dyn Error>), // Catch-all
}
}
How tools integrate with operators
The react_loop function uses a ToolRegistry internally. When the model responds with a ToolUse content block, the loop:
1. Looks up the tool by name in the registry.
2. Fires PreSubDispatch hooks (which may skip or modify the call).
3. Calls tool.call(input, ctx).
4. Fires PostSubDispatch hooks (which may modify the output).
5. Backfills the tool result into the conversation context.
6. Calls the model again with the updated context.
This continues until the model produces a final text response (no more tool use), a limit is reached, or a hook halts execution.
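The control flow of that loop can be sketched schematically with the model and tool mocked as plain closures (all names below are illustrative, not the skg-context-engine API):

```rust
// One model turn: either a final answer or a tool request.
enum ModelTurn {
    Text(String),
    ToolUse { name: String, input: String },
}

fn react_loop_sketch(
    mut model: impl FnMut(&[String]) -> ModelTurn,
    tool: impl Fn(&str, &str) -> String,
    max_turns: usize,
) -> Option<String> {
    let mut context: Vec<String> = Vec::new();
    for _ in 0..max_turns {
        match model(&context) {
            // Final text response: the loop is done.
            ModelTurn::Text(t) => return Some(t),
            // Tool use: execute, backfill the result, call the model again.
            ModelTurn::ToolUse { name, input } => {
                let result = tool(&name, &input);
                context.push(format!("tool {name} -> {result}"));
            }
        }
    }
    None // turn limit reached
}
```

The real loop adds hooks, rules, budget checks, and effect handling around the same skeleton: call the model, execute requested tools, backfill, repeat.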
Tool schema design tips
- Use "required" to mark parameters the model must provide.
- Include "description" on each property – the model uses these to understand what to pass.
- Keep schemas simple. Complex nested schemas increase the chance of the model producing invalid input.
- Return structured JSON from call(). The model reads the tool result to decide its next action.
Operators
Operators are the core execution unit in skelegent. An operator implements layer0::Operator and encapsulates everything needed to process one agent cycle: context assembly, model calls, tool execution, and output construction.
The Operator trait
#![allow(unused)]
fn main() {
#[async_trait]
pub trait Operator: Send + Sync {
async fn execute(
&self,
input: OperatorInput,
) -> Result<OperatorOutput, OperatorError>;
}
}
skelegent ships a context engine (skg-context-engine) — a set of composable primitives around react_loop — and SingleShotOperator (one model call, no tools). External consumers wrap react_loop in their own impl Operator struct for the object-safe boundary.
Context Engine
Crate: skg-context-engine
The context engine is not a monolithic struct. It is a set of composable primitives centered on react_loop(), which orchestrates the assembly → inference → reaction loop:
1. Assemble context – Build the prompt from the system prompt, conversation history, tool definitions, and the new input message.
2. Call the model – Send the assembled context to the provider.
3. Check for tool use – If the model requested tool calls, execute them.
4. Backfill results – Add tool results to the conversation context.
5. Repeat – Loop back to step 2 until the model produces a final response or a limit is reached.
Construction
To use the context engine as an Operator, create a wrapper struct that holds a Provider, ToolRegistry, and ReactLoopConfig, then implement Operator by constructing a Context, injecting the user message, and calling react_loop():
#![allow(unused)]
fn main() {
use async_trait::async_trait;
use layer0::operator::{Operator, OperatorInput, OperatorOutput, OperatorError};
use layer0::context::{Message, Role};
use skg_context_engine::{Context, react_loop, ReactLoopConfig};
use skg_turn::provider::Provider;
use skg_tool::{ToolRegistry, ToolCallContext};
struct MyOperator<P: Provider> {
provider: P,
config: ReactLoopConfig,
tools: ToolRegistry,
tool_ctx: ToolCallContext,
}
#[async_trait]
impl<P: Provider> Operator for MyOperator<P> {
async fn execute(
&self,
input: OperatorInput,
) -> Result<OperatorOutput, OperatorError> {
// Context is the conversation store — create one per invocation
let mut ctx = Context::new();
// Inject domain context (shell history, file state, etc.) via assembly ops
// ctx.inject_system("Additional context here").await?;
// Inject the user input
ctx.inject_message(Message::new(Role::User, input.message))
.await
.map_err(|e| OperatorError::NonRetryable(e.to_string()))?;
// react_loop composes Context, CompileConfig, AppendResponse,
// and ExecuteTool internally — you just hand it the primitives
react_loop(&mut ctx, &self.provider, &self.tools, &self.tool_ctx, &self.config)
.await
.map_err(|e| OperatorError::NonRetryable(e.to_string()))
}
}
}
The key integration pattern:
Your domain context (shell history, file state, user prefs)
↓ feeds into
Context via inject_system(), inject_message(), or system_addendum in OperatorConfig
↓ manages
LLM conversation turns (Message with Role + Content)
↓ compiles to
CompiledContext → infer(provider) → InferResult
↓ response goes through
ContextOps (AppendResponse, ExecuteTool) → rules fire automatically
Configuration
ReactLoopConfig sets the static defaults for the loop:
| Field | Default | Description |
|---|---|---|
| system_prompt | "" | Base system prompt prepended to every request |
| model | None | Model identifier (e.g., Some("claude-haiku-4-5-20251001".into())) |
| max_tokens | None | Max tokens per model response |
| temperature | None | Sampling temperature |
These defaults can be overridden per-invocation via OperatorConfig in the OperatorInput:
#![allow(unused)]
fn main() {
use layer0::operator::{OperatorConfig, OperatorInput, TriggerType};
use layer0::content::Content;
use rust_decimal_macros::dec;
let mut input = OperatorInput::new(
Content::text("Refactor this module"),
TriggerType::User,
);
input.config = Some(OperatorConfig {
max_turns: Some(20), // Allow more iterations
max_cost: Some(dec!(0.50)), // Budget: $0.50
model: Some("claude-sonnet-4-20250514".into()), // Use a different model
..Default::default()
});
}
Exit reasons
The context engine loop stops when:
- Complete – The model produced a final text response without requesting any tool use.
- MaxTurns – The max_turns limit was reached.
- BudgetExhausted – Accumulated cost exceeded max_cost, or the tool-call step limit was exceeded.
- Timeout – Wall-clock time exceeded max_duration.
- InterceptorHalt { reason } – An interceptor (including a Rule that returns RuleAction::Halt) stopped execution.
- CircuitBreaker – Too many consecutive failures (provider errors or tool errors).
- Error – An unrecoverable error occurred.
- SafetyStop { reason } – The provider’s safety system stopped generation (content filter or safety mechanism triggered).
- AwaitingApproval – One or more tool calls require human approval before execution.
- Custom(String) – Operator-defined exit reason.
Effects
The context engine supports effect-producing tools. If a tool is registered in the operator’s EffectTools configuration, calling it produces an Effect in the OperatorOutput instead of executing the tool directly. This is useful for tools that should be executed by the orchestrator or environment rather than inline (e.g., spawning a sub-agent, signaling a workflow).
SingleShotOperator
Crate: skg-op-single-shot
The single-shot operator makes exactly one model call with no tool use. It is useful for:
- Classification tasks
- Summarization
- Structured data extraction
- Any task where tool use is not needed
#![allow(unused)]
fn main() {
use skg_op_single_shot::{SingleShotConfig, SingleShotOperator};
use skg_provider_anthropic::AnthropicProvider;
let config = SingleShotConfig {
system_prompt: "Classify the following text into one of: positive, negative, neutral.".into(),
default_model: "claude-haiku-4-5-20251001".into(),
default_max_tokens: 100,
};
let provider = AnthropicProvider::new("sk-ant-...");
let operator = SingleShotOperator::new(provider, config);
}
Behavior
1. Assemble context from the system prompt and input message.
2. Call the model once.
3. Return the response immediately.
There is no loop, no tool execution, and no iteration. The exit reason is always Complete on success.
Choosing between operators
| Use case | Operator | Why |
|---|---|---|
| Agent with tools | Context Engine | Needs the reasoning loop to call tools and iterate |
| Classification/extraction | SingleShotOperator | One model call is sufficient |
| Summarization | SingleShotOperator | No tools needed |
| Code generation with testing | Context Engine | May need to run tests, read errors, and iterate |
| Multi-step research | Context Engine | Needs to search, read, and synthesize |
Using operators as trait objects
Both operators implement layer0::Operator, which is object-safe. You can use them interchangeably behind Box<dyn Operator> or Arc<dyn Operator>:
#![allow(unused)]
fn main() {
use layer0::id::OperatorId;
use layer0::operator::Operator;
use std::sync::Arc;
let engine_op: Arc<dyn Operator> = Arc::new(my_operator);
let single_op: Arc<dyn Operator> = Arc::new(single_shot_operator);
// Orchestrator doesn't know or care which operator it's dispatching to
orchestrator.register(OperatorId::new("coder"), engine_op);
orchestrator.register(OperatorId::new("classifier"), single_op);
}
The provider’s generic type parameter is erased at the Operator boundary. Callers never see the concrete provider type.
Custom operators: Rules as extension points
The primary extension mechanism for the context engine loop is Rules. Rules fire during the react loop and can inspect context, modify messages, or halt execution. For detailed guidance on building a custom operator, see “Customising operator behaviour with Rules” below, which covers:
- Implementing Rules for loop interception
- Using ContextOp to compose assembly, inference, and reaction
- Wiring domain-specific logic into the react loop
The brief example skeleton below shows the shape of a custom operator that wraps react_loop with additional rule-based behavior:
#![allow(unused)]
fn main() {
use skg_context_engine::{Context, react_loop, ReactLoopConfig};
use skg_tool::ToolRegistry;
// Build your operator struct wrapping Provider + ToolRegistry + ReactLoopConfig
// (see the Construction example above), then add Rules to the Context
// before calling react_loop() to customize loop behavior.
}
Customising operator behaviour with Rules
react_loop is the composition function at the heart of the context engine. You don’t subclass or wrap it — you customise what happens inside it by attaching Rules to the Context.
Rules overview
A Rule pairs a trigger with a ContextOp (any async operation that takes &mut Context). Rules fire automatically during Context::run() — the same entry point that every pipeline operation goes through.
use skg_context_engine::rule::{Rule, Trigger};
use skg_context_engine::context::Context;
use std::any::TypeId;
// Three trigger types:
Trigger::Before(TypeId::of::<SomeOp>()) // fire before a specific op
Trigger::After(TypeId::of::<SomeOp>()) // fire after a specific op
Trigger::When(Box::new(|ctx| predicate)) // fire when a predicate is true
// Convenience constructors:
Rule::before::<SomeOp>("name", priority, my_op)
Rule::after::<SomeOp>("name", priority, my_op)
Rule::when("name", priority, |ctx| predicate, my_op)
// Catch-all variants:
Trigger::BeforeAny // fire before every run() call
Trigger::AfterAny // fire after every run() call
Rules fire in priority order (highest first). Rules cannot trigger other rules — the dispatch loop skips rule evaluation during rule execution to prevent infinite recursion.
Attaching rules to a Context
use skg_context_engine::context::Context;
use skg_context_engine::rule::Rule;
// At construction:
let ctx = Context::with_rules(vec![rule_a, rule_b]);
// Or incrementally:
let mut ctx = Context::new();
ctx.add_rule(rule);
The Context (with its rules) is then passed into react_loop, which fires rules at each pipeline step automatically.
Budget guards
The BudgetGuard rule from skg_context_engine::rules::budget halts execution when any configured limit is exceeded. It implements ContextOp and is designed to fire as a BeforeAny rule:
use skg_context_engine::rule::{Rule, Trigger};
use skg_context_engine::rules::budget::{BudgetGuard, BudgetGuardConfig};
use skg_context_engine::context::Context;
use rust_decimal::Decimal;
use std::time::Duration;
let guard = BudgetGuard::with_config(BudgetGuardConfig {
max_cost: Some(Decimal::new(500, 2)), // $5.00
max_turns: Some(25),
max_duration: Some(Duration::from_secs(300)),
max_tool_calls: Some(100),
});
let rule = Rule::new("budget_guard", Trigger::BeforeAny, 100, guard);
let ctx = Context::with_rules(vec![rule]);
When any limit is exceeded, the guard returns EngineError::Halted, which stops the pipeline.
Steering: injecting instructions between turns
To inject a system instruction after every model response (for example, a reminder or guardrail), write a ContextOp and attach it as an After rule on AppendResponse — the op that appends the model’s response to the conversation:
use skg_context_engine::rule::Rule;
use skg_context_engine::ops::AppendResponse;
use skg_context_engine::context::Context;
use skg_context_engine::op::ContextOp;
use skg_context_engine::error::EngineError;
use async_trait::async_trait;
struct InjectReminder {
message: String,
}
#[async_trait]
impl ContextOp for InjectReminder {
type Output = ();
async fn execute(&self, ctx: &mut Context) -> Result<(), EngineError> {
ctx.inject_system(&self.message);
Ok(())
}
}
let rule = Rule::after::<AppendResponse>(
"steering_reminder",
50,
InjectReminder { message: "Remember: never reveal internal tool names.".into() },
);
This fires after every model response is appended, before the next inference or tool dispatch.
Telemetry
Observation is just another rule. A ContextOp that logs metrics and returns Ok(()) won’t alter the pipeline:
use skg_context_engine::rule::Rule;
use skg_context_engine::ops::AppendResponse;
use skg_context_engine::context::Context;
use skg_context_engine::op::ContextOp;
use skg_context_engine::error::EngineError;
use async_trait::async_trait;
struct TurnTelemetry;
#[async_trait]
impl ContextOp for TurnTelemetry {
type Output = ();
async fn execute(&self, ctx: &mut Context) -> Result<(), EngineError> {
tracing::info!(
turns = ctx.metrics.turns_completed,
cost = %ctx.metrics.cost,
tool_calls = ctx.metrics.tool_calls_total,
"turn complete"
);
Ok(())
}
}
let rule = Rule::after::<AppendResponse>("telemetry", 10, TurnTelemetry);
Conditional rules with When
When rules evaluate a predicate against the Context at the start of every run() call. Useful for dynamic behaviour that depends on accumulated state:
use skg_context_engine::rule::Rule;
use rust_decimal::Decimal;
let rule = Rule::when(
"warn_high_cost",
50,
|ctx| ctx.metrics.cost > Decimal::new(100, 2), // > $1.00
InjectReminder { message: "Cost is getting high, wrap up.".into() },
);
Compaction and context management
The old FullContext / NoCompaction context strategies are gone. Context management is now explicit: you build the conversation via Context and its inject_* methods. If you need compaction, implement it as a rule that fires at the appropriate trigger point and mutates the context directly.
Parallel tool dispatch
Barrier-based parallel tool dispatch (batching Shared tools and flushing on Exclusive tools) is future work. Currently, tools execute sequentially within the react loop. When parallel dispatch lands, it will integrate with the rules system — not as a separate primitive.
State
The state system provides scoped key-value persistence through the StateStore and StateReader traits. skelegent ships two implementations: MemoryStore (in-memory, ephemeral) and FsStore (filesystem-backed, durable).
StateStore and StateReader
StateStore provides full read-write access:
#![allow(unused)]
fn main() {
#[async_trait]
pub trait StateStore: Send + Sync {
async fn read(&self, scope: &Scope, key: &str)
-> Result<Option<serde_json::Value>, StateError>;
async fn write(&self, scope: &Scope, key: &str, value: serde_json::Value)
-> Result<(), StateError>;
async fn delete(&self, scope: &Scope, key: &str) -> Result<(), StateError>;
async fn list(&self, scope: &Scope, prefix: &str) -> Result<Vec<String>, StateError>;
async fn search(&self, scope: &Scope, query: &str, limit: usize)
-> Result<Vec<SearchResult>, StateError>;
}
}
StateReader is a read-only projection (read, list, search only). Every StateStore automatically implements StateReader via a blanket impl. Operators receive &dyn StateReader during context assembly – they can read state but must declare writes as effects.
Scopes
State is partitioned by Scope. A scope is a structured identifier that determines where data lives:
#![allow(unused)]
fn main() {
pub enum Scope {
Operator(OperatorId),
Session(SessionId),
Workflow(WorkflowId),
Global,
Custom { namespace: String, id: String },
}
}
Scopes provide isolation: an agent’s state does not collide with another agent’s state, and session-scoped data is separate from workflow-scoped data.
MemoryStore (skg-state-memory)
In-memory storage using a HashMap. Data is lost when the process exits.
#![allow(unused)]
fn main() {
use skg_state_memory::MemoryStore;
let store = MemoryStore::new();
}
Best for:
- Unit and integration tests
- Short-lived processes
- Prototyping
The memory store supports concurrent access through internal locking.
Example usage
#![allow(unused)]
fn main() {
use layer0::state::StateStore;
use layer0::effect::Scope;
use layer0::id::SessionId;
use skg_state_memory::MemoryStore;
use serde_json::json;
async fn example() -> Result<(), Box<dyn std::error::Error>> {
let store = MemoryStore::new();
let scope = Scope::Session(SessionId("sess-001".into()));
// Write
store.write(&scope, "user_preference", json!({"theme": "dark"})).await?;
// Read
let value = store.read(&scope, "user_preference").await?;
assert_eq!(value, Some(json!({"theme": "dark"})));
// List keys with prefix
store.write(&scope, "history/turn-1", json!({"msg": "hello"})).await?;
store.write(&scope, "history/turn-2", json!({"msg": "world"})).await?;
let keys = store.list(&scope, "history/").await?;
assert_eq!(keys.len(), 2);
// Delete
store.delete(&scope, "user_preference").await?;
let value = store.read(&scope, "user_preference").await?;
assert_eq!(value, None);
Ok(())
}
}
FsStore (skg-state-fs)
Filesystem-backed storage. Each scope/key pair maps to a file on disk. Data persists across process restarts.
#![allow(unused)]
fn main() {
use skg_state_fs::FsStore;
let store = FsStore::new("/tmp/skg-state");
}
The directory structure mirrors the scope hierarchy:
/tmp/skg-state/
session/
sess-001/
user_preference.json
history/
turn-1.json
turn-2.json
agent/
coder/
config.json
Best for:
- CLI tools that need persistent state
- Local development
- Single-machine deployments
Search
The search method supports semantic search within a scope. Implementations that do not support search return an empty Vec (not an error):
#![allow(unused)]
fn main() {
use layer0::state::StateStore;
async fn example(store: &dyn StateStore) -> Result<(), Box<dyn std::error::Error>> {
let scope = Scope::Global;
let results = store.search(&scope, "user authentication", 5).await?;
for result in results {
println!("{}: score={}", result.key, result.score);
}
Ok(())
}
}
MemoryStore and FsStore return empty results for search. A future store backed by a vector database or full-text search engine could provide real semantic search.
Using state with operators
Operators do not write to state directly. Instead:
- The operator runtime provides a &dyn StateReader during context assembly.
- The operator reads whatever state it needs to build context.
- If the operator wants to persist something, it includes a state-write Effect in its OperatorOutput.
- The calling layer (orchestrator, environment) executes the effect.
This design keeps operators pure: input in, output + effects out. The same operator works whether state is in-memory, on disk, or in a remote database.
Error handling
#![allow(unused)]
fn main() {
pub enum StateError {
    NotFound { scope: Scope, key: String }, // Key does not exist
    WriteFailed(String),                    // Write operation failed
    Serialization(String),                  // Serde error
    Other(Box<dyn Error>),                  // Catch-all
}
}
Note that read returns Ok(None) for missing keys, not Err(NotFound). The NotFound variant is for cases where a key was expected to exist (e.g., in a higher-level API that wraps the store).
State, Memory, and Compaction
State and memory are the same system at different timescales
Context is the hot path: messages in the current inference window, each governed by a
CompactionPolicy (Pinned, Normal, CompressFirst, DiscardWhenDone). StateStore is
the persistence path: compacted summaries, extracted facts, cross-session memories, governed
by StoreOptions (tier, lifetime, content_kind, salience, ttl).
The flow:
- Messages enter Context via inject_message.
- Context grows until a compaction rule fires.
- Compaction summarizes old messages (optionally via a Provider).
- The summary is written to StateStore.
- On the next turn, search() retrieves relevant memories.
- Retrieved memories are injected back into Context.
Context is ephemeral working memory. StateStore is long-term memory. They are the same
information at different points in time.
Crate boundaries follow technology, not capability
Name crates after what you cargo add — the library or database they wrap — not after the
abstract capability they provide. skg-state-sqlite wraps SQLite. skg-state-cozo
wraps CozoDB. Names like skg-state-search or skg-state-vector are wrong because
they describe capability, not technology.
A single technology can provide multiple capabilities: SQLite provides KV storage, full-text
search (FTS5), and vector search in a single crate. The StateStore trait defines what
capabilities exist; each implementation does what its underlying technology supports natively.
search() returning an empty Vec is the correct behavior for backends that do not support
search — not an error.
| Crate | KV | Text search | Vector search | Graph |
|---|---|---|---|---|
| state-memory | ✓ | ✗ | ✗ | ✗ |
| state-fs | ✓ | ✗ | ✗ | ✗ |
| state-sqlite (extras) | ✓ | ✓ (FTS5) | ✗ | ✗ |
| state-cozo (extras) | ✓ | ✓ | ✓ (HNSW) | ✓ (Datalog) |
Compaction strategies are ContextOps, not crates
Compaction strategies implement ContextOp, live in
skg-context-engine/src/rules/compaction.rs, and activate via Rule + Trigger — the
same mechanism as BudgetGuard and TelemetryRecorder. They are not a separate crate
because they share the same dependency footprint, the same type universe (Context,
Message, CompactionPolicy), and the same activation mechanism as the rest of the context
engine. Strategies optionally accept a Provider for summarization and a StateStore for
persistence.
The Compact op in ops/compact.rs is the closure-based primitive. Pre-built strategies in
rules/compaction.rs — sliding window, policy-aware trim, summarize-and-replace, cognitive
state extract — compose on top of it.
Patterns decompose into strategy + storage + rule
Patterns like memory-augmented generation or cognitive state extraction are not crates. They
are configurations: a compaction strategy + a StateStore backend + an assembly rule. Users
compose them at construction time:
#![allow(unused)]
fn main() {
ctx.add_rule(CompactionRule::new(
CompactionConfig {
strategy: Strategy::SummarizeAndReplace { provider: provider.clone() },
store: Some(state_store.clone()),
..Default::default()
}
));
}
The framework provides the primitives. The application assembles the pattern.
Format is configuration, not crate boundary
JSON versus markdown on the filesystem is a constructor parameter on FsStore, not a reason
for a separate crate. HashMap versus LRU eviction in memory is a constructor parameter on
MemoryStore. If two behaviors differ only in a parameter value, they belong in the same
crate with a richer constructor — not in separate crates.
Orchestration
Orchestration is how multiple agents compose and how execution survives failures. The Orchestrator trait provides dispatch (send work to agents), signaling (inter-workflow communication), and queries (read-only state inspection).
The Orchestrator trait
#![allow(unused)]
fn main() {
#[async_trait]
pub trait Orchestrator: Send + Sync {
async fn dispatch(
&self,
operator: &OperatorId,
input: OperatorInput,
) -> Result<OperatorOutput, OrchError>;
async fn dispatch_many(
&self,
tasks: Vec<(OperatorId, OperatorInput)>,
) -> Vec<Result<OperatorOutput, OrchError>>;
async fn signal(
&self,
target: &WorkflowId,
signal: SignalPayload,
) -> Result<(), OrchError>;
async fn query(
&self,
target: &WorkflowId,
query: QueryPayload,
) -> Result<serde_json::Value, OrchError>;
}
}
LocalOrch (skg-orch-local)
The local orchestrator dispatches operator invocations in-process using tokio. It maps OperatorId values to Arc<dyn Operator> references and calls execute() directly.
#![allow(unused)]
fn main() {
use skg_orch_local::LocalOrch;
use layer0::operator::Operator;
use layer0::id::OperatorId;
use std::sync::Arc;
// Assume `coder` and `reviewer` are constructed operators
let coder: Arc<dyn Operator> = /* ... */;
let reviewer: Arc<dyn Operator> = /* ... */;
let mut orchestrator = LocalOrch::new();
orchestrator.register(OperatorId("coder".into()), coder);
orchestrator.register(OperatorId("reviewer".into()), reviewer);
}
Dispatching
Single dispatch sends work to one agent:
#![allow(unused)]
fn main() {
use layer0::orchestrator::Orchestrator;
use layer0::operator::{OperatorInput, TriggerType};
use layer0::content::Content;
use layer0::id::OperatorId;
async fn example(orchestrator: &dyn Orchestrator) -> Result<(), Box<dyn std::error::Error>> {
let input = OperatorInput::new(
Content::text("Implement the authentication module"),
TriggerType::Task,
);
let output = orchestrator
.dispatch(&OperatorId("coder".into()), input)
.await?;
println!("Agent response: {:?}", output.message);
Ok(())
}
}
Parallel dispatch
dispatch_many sends work to multiple agents concurrently. The local orchestrator uses tokio::spawn for parallelism:
#![allow(unused)]
fn main() {
use layer0::orchestrator::Orchestrator;
use layer0::operator::{OperatorInput, TriggerType};
use layer0::content::Content;
use layer0::id::OperatorId;
async fn example(orchestrator: &dyn Orchestrator) -> Result<(), Box<dyn std::error::Error>> {
let tasks = vec![
(
OperatorId("analyzer".into()),
OperatorInput::new(Content::text("Analyze security risks"), TriggerType::Task),
),
(
OperatorId("reviewer".into()),
OperatorInput::new(Content::text("Review code quality"), TriggerType::Task),
),
];
let results = orchestrator.dispatch_many(tasks).await;
for result in results {
match result {
Ok(output) => println!("Success: {:?}", output.exit_reason),
Err(e) => println!("Failed: {}", e),
}
}
Ok(())
}
}
Results are returned in the same order as the input tasks. Individual tasks may fail independently.
Signals
Signals provide fire-and-forget messaging to running workflows:
#![allow(unused)]
fn main() {
use layer0::orchestrator::Orchestrator;
use layer0::effect::SignalPayload;
use layer0::id::WorkflowId;
async fn example(orchestrator: &dyn Orchestrator) -> Result<(), Box<dyn std::error::Error>> {
let signal = SignalPayload {
signal_type: "cancel".into(),
data: serde_json::json!({"reason": "user requested"}),
};
orchestrator
.signal(&WorkflowId("wf-001".into()), signal)
.await?;
Ok(())
}
}
signal() returns Ok(()) when the signal is accepted, not when it is processed.
Queries
Queries provide read-only inspection of workflow state:
#![allow(unused)]
fn main() {
use layer0::orchestrator::{Orchestrator, QueryPayload};
use layer0::id::WorkflowId;
async fn example(orchestrator: &dyn Orchestrator) -> Result<(), Box<dyn std::error::Error>> {
let query = QueryPayload::new("status", serde_json::json!({}));
let result = orchestrator
.query(&WorkflowId("wf-001".into()), query)
.await?;
println!("Workflow status: {}", result);
Ok(())
}
}
OrchKit (skg-orch-kit)
The skg-orch-kit crate provides shared utilities for orchestrator implementations. These are building blocks that any orchestrator (local, Temporal, Restate) can reuse.
Error handling
#![allow(unused)]
fn main() {
pub enum OrchError {
OperatorNotFound(String), // No agent registered with that ID
WorkflowNotFound(String), // No workflow with that ID
DispatchFailed(String), // Dispatch failed for other reasons
SignalFailed(String), // Signal delivery failed
OperatorError(OperatorError), // Propagated from the operator
Other(Box<dyn Error>), // Catch-all
}
}
OperatorError propagates through OrchError via From. If an operator fails during dispatch, the error is wrapped as OrchError::OperatorError.
Future orchestrators
The Orchestrator trait is designed to support orchestrators beyond in-process dispatch:
- Temporal – Durable execution with automatic replay and fault tolerance. dispatch becomes a Temporal activity. signal maps to Temporal signals. query maps to Temporal queries.
- Restate – Durable execution with virtual objects. Similar to Temporal but with a different programming model.
- HTTP – Dispatch over HTTP for microservice architectures. dispatch sends a serialized OperatorInput over the network.
The trait is transport-agnostic by design. All protocol types (OperatorInput, OperatorOutput, SignalPayload, QueryPayload) implement Serialize + Deserialize, so they can cross any boundary.
Effects, signals, and custom operators
Skelegent draws a hard boundary: operators declare effects; orchestrators execute them. This separation lets you reuse the same operator across transports (in-process, Temporal, Restate) without leaking execution mechanics.
Custom operators (e.g., barrier-scheduled loops) can freely declare effects like Effect::Log, Effect::Delegate, or Effect::Signal. The orchestrator decides when to execute them relative to dispatch lifecycles, and exposes signal()/query() for out-of-band communication.
Defaults stay slim: if you do nothing, wrap react_loop in a simple operator or use SingleShotOperator. If you need custom control (barriers and steering), implement a custom operator and keep effects at the boundary. See examples/custom_operator_barrier.
Middleware & Interception
skelegent uses two complementary interception mechanisms:
- Per-boundary middleware (DispatchMiddleware, StoreMiddleware, ExecMiddleware) — wraps protocol-level operations using the continuation pattern. Defined in layer0::middleware.
- Operator-local interception (Rule system) — typed per-trigger rules inside the context engine. Rules fire via the Trigger enum: Before (pre-inference, pre-tool), After (post-inference, post-tool), or When (exit checks). Defined in skg-context-engine::rules.
Security middleware (RedactionMiddleware, ExfilGuardMiddleware) lives in the skg-hook-security crate.
Per-boundary middleware
Three traits — one per Layer 0 protocol boundary — follow the continuation pattern: call next to forward, skip next to short-circuit.
DispatchMiddleware (wraps Orchestrator::dispatch)
#![allow(unused)]
fn main() {
#[async_trait]
pub trait DispatchMiddleware: Send + Sync {
async fn dispatch(
&self,
operator: &OperatorId,
input: OperatorInput,
next: &dyn DispatchNext,
) -> Result<OperatorOutput, OrchError>;
}
}
Code before next.dispatch() = pre-processing (input mutation, logging).
Code after next.dispatch() = post-processing (output mutation, metrics).
Not calling next.dispatch() = short-circuit (guardrail halt, cached response).
StoreMiddleware (wraps StateStore read/write)
#![allow(unused)]
fn main() {
#[async_trait]
pub trait StoreMiddleware: Send + Sync {
async fn write(
&self,
scope: &Scope,
key: &str,
value: serde_json::Value,
options: Option<&StoreOptions>,
next: &dyn StoreWriteNext,
) -> Result<(), StateError>;
async fn read(
&self,
scope: &Scope,
key: &str,
next: &dyn StoreReadNext,
) -> Result<Option<serde_json::Value>, StateError> {
next.read(scope, key).await
}
}
}
Use for: encryption-at-rest, audit trails, caching, access control.
ExecMiddleware (wraps Environment::run)
#![allow(unused)]
fn main() {
#[async_trait]
pub trait ExecMiddleware: Send + Sync {
async fn run(
&self,
input: OperatorInput,
spec: &EnvironmentSpec,
next: &dyn ExecNext,
) -> Result<OperatorOutput, EnvError>;
}
}
Use for: resource metering, credential injection, sandboxing.
Middleware stacks
Middleware composes via stack builders. Each stack organizes layers into three phases:
- Observers — outermost; always run, always call next.
- Transformers — mutate input/output, always call next.
- Guards — innermost; may short-circuit by not calling next.
#![allow(unused)]
fn main() {
use layer0::middleware::DispatchStack;
use std::sync::Arc;
let stack = DispatchStack::builder()
.observe(Arc::new(logging_middleware))
.transform(Arc::new(sanitizer_middleware))
.guard(Arc::new(policy_middleware))
.build();
}
Call order: observers → transformers → guards → terminal (the real orchestrator). The same builder pattern applies to StoreStack and ExecStack.
Example: dispatch logging middleware
#![allow(unused)]
fn main() {
use async_trait::async_trait;
use layer0::middleware::{DispatchMiddleware, DispatchNext};
use layer0::id::OperatorId;
use layer0::operator::{OperatorInput, OperatorOutput};
use layer0::error::OrchError;
struct LoggingMiddleware;
#[async_trait]
impl DispatchMiddleware for LoggingMiddleware {
async fn dispatch(
&self,
operator: &OperatorId,
input: OperatorInput,
next: &dyn DispatchNext,
) -> Result<OperatorOutput, OrchError> {
tracing::info!(%operator, "dispatch start");
let result = next.dispatch(operator, input).await;
tracing::info!(%operator, ok = result.is_ok(), "dispatch end");
result
}
}
}
Example: dispatch guardrail (deny a tool by name)
#![allow(unused)]
fn main() {
use async_trait::async_trait;
use layer0::middleware::{DispatchMiddleware, DispatchNext};
use layer0::id::OperatorId;
use layer0::operator::{OperatorInput, OperatorOutput};
use layer0::error::OrchError;
struct DenyToolMiddleware {
denied: String,
}
#[async_trait]
impl DispatchMiddleware for DenyToolMiddleware {
async fn dispatch(
&self,
operator: &OperatorId,
input: OperatorInput,
next: &dyn DispatchNext,
) -> Result<OperatorOutput, OrchError> {
if operator.as_str() == self.denied {
return Err(OrchError::DispatchFailed(
format!("tool {} is denied by policy", self.denied),
));
}
next.dispatch(operator, input).await
}
}
}
Rule System (operator-local interception)
For interception inside the context engine (before/after inference, before/after tool calls, exit checks), use the Rule system. Rules fire via the Trigger enum with three phases: Before (pre-inference, pre-tool), After (post-inference, post-tool), or When (exit conditions). Each rule method has a default no-op implementation — override only what you need.
#![allow(unused)]
fn main() {
#[async_trait]
pub trait Rule: Send + Sync {
async fn before_inference(&self, state: &LoopState) -> RuleAction { ... }
async fn after_inference(&self, state: &LoopState, response: &Content) -> RuleAction { ... }
async fn before_tool_call(&self, state: &LoopState, tool: &str, input: &Value) -> RuleAction { ... }
async fn after_tool_call(&self, state: &LoopState, tool: &str, result: &str) -> RuleAction { ... }
async fn when_exit_check(&self, state: &LoopState) -> RuleAction { ... }
async fn before_steering_inject(&self, state: &LoopState, messages: &[String]) -> RuleAction { ... }
async fn when_steering_skip(&self, state: &LoopState, skipped: &[String]) { }
async fn before_compaction(&self, state: &LoopState) -> RuleAction { ... }
async fn after_compaction(&self, state: &LoopState) { }
}
}
Return types
- RuleAction — Continue or Halt { reason }. Returned by before_inference, after_inference, when_exit_check, before_steering_inject, and before_compaction.
Attaching a rule
#![allow(unused)]
fn main() {
use skg_context_engine::{Context, react_loop, ReactLoopConfig};
use skg_context_engine::rule::Rule;
// Rules are attached to Context, then passed to react_loop
let mut ctx = Context::new();
ctx.add_rule(my_rule);
// react_loop fires rules automatically during execution
let output = react_loop(&mut ctx, &provider, &tools, &tool_ctx, &config).await?;
}
Example: budget enforcement rule
#![allow(unused)]
fn main() {
use async_trait::async_trait;
use skg_context_engine::rules::{Rule, RuleAction, LoopState};
use rust_decimal_macros::dec;
struct BudgetRule;
#[async_trait]
impl Rule for BudgetRule {
async fn after_inference(&self, state: &LoopState, _response: &layer0::content::Content) -> RuleAction {
if state.cost > dec!(1.00) {
RuleAction::Halt {
reason: "budget exceeded $1.00".into(),
}
} else {
RuleAction::Continue
}
}
}
}
Example: tool input sanitizer rule
#![allow(unused)]
fn main() {
use async_trait::async_trait;
use skg_context_engine::rules::{Rule, RuleAction, LoopState};
use serde_json::Value;
struct StripSecretRule;
#[async_trait]
impl Rule for StripSecretRule {
async fn before_tool_call(
&self,
_state: &LoopState,
_tool_name: &str,
input: &Value,
) -> RuleAction {
if let Some(obj) = input.as_object() {
if obj.contains_key("api_key") {
let mut cleaned = obj.clone();
cleaned.remove("api_key");
return RuleAction::ModifyInput {
new_input: Value::Object(cleaned),
};
}
}
RuleAction::Continue
}
}
}
Security middleware
The skg-hook-security crate provides two production-ready middleware implementations:
- RedactionMiddleware — redacts sensitive data from dispatch output (implements DispatchMiddleware).
- ExfilGuardMiddleware — blocks exfiltration attempts in dispatch input (implements DispatchMiddleware).
Use cases
| Use case | Mechanism | Why |
|---|---|---|
| Budget enforcement | Rule::after_inference | Needs access to loop state cost |
| Tool policy (deny/skip) | Rule::before_tool_call | Per-tool decision inside the loop |
| Secret redaction | RedactionMiddleware or Rule::after_tool_call | Boundary-level or loop-level |
| Telemetry / logging | DispatchMiddleware (observer) | Cross-cutting, protocol-level |
| Encryption at rest | StoreMiddleware | Wraps state store reads/writes |
| Steering audit | Rule::before_steering_inject | Observe/block steering injection |
| Exfiltration guard | ExfilGuardMiddleware | Policy enforcement at dispatch boundary |
MCP
Note: The MCP integration API may shift as the Model Context Protocol specification evolves. This page provides a summary of the current design. See the skg-mcp crate for the latest API.
Overview
skg-mcp provides a client for the Model Context Protocol (MCP). MCP is an open protocol for connecting AI models to external data sources and tools. The skelegent MCP client connects to MCP servers and exposes their tools as ToolDyn implementations that can be registered in a ToolRegistry.
This means tools hosted on MCP servers can be used by skelegent operators alongside locally defined tools, with no difference in how the operator interacts with them.
Integration pattern
The typical flow is:
- Connect to one or more MCP servers.
- Discover available tools from each server.
- Wrap each MCP tool as an Arc<dyn ToolDyn>.
- Register them in the operator's ToolRegistry.
The operator’s ReAct loop then calls MCP tools the same way it calls local tools – through the ToolDyn interface.
When to use MCP
MCP is useful when:
- You want to expose tools from existing MCP-compatible servers (database access, file systems, APIs).
- You want to share tool definitions across multiple applications.
- You want to decouple tool implementation from operator configuration.
For tools that are specific to your application and do not need to be shared, implementing ToolDyn directly is simpler.
Crate
The skg-mcp crate depends on layer0 and skg-tool. Enable it via the mcp feature flag on the skelegent umbrella crate:
[dependencies]
skelegent = { version = "0.4", features = ["mcp"] }
Testing
skelegent is designed for testability. Every protocol trait is object-safe, so you can create mock implementations for any component. Layer 0 provides test utilities, and the workspace includes patterns for unit, integration, and object-safety testing.
test-utils feature in layer0
Layer 0 provides test utilities behind the test-utils feature flag:
[dev-dependencies]
layer0 = { version = "0.4", features = ["test-utils"] }
This module includes mock implementations of the protocol traits that are useful for testing code that depends on dyn Operator, dyn StateStore, etc.
Object-safety tests
A critical property of Layer 0 traits is object safety. Every trait must work behind Box<dyn Trait> and be Send + Sync. The workspace enforces this with compile-time tests:
#![allow(unused)]
fn main() {
fn _assert_send_sync<T: Send + Sync>() {}
#[test]
fn operator_is_object_safe_and_send_sync() {
_assert_send_sync::<Box<dyn layer0::Operator>>();
}
#[test]
fn state_store_is_object_safe_and_send_sync() {
_assert_send_sync::<Box<dyn layer0::StateStore>>();
}
#[test]
fn orchestrator_is_object_safe_and_send_sync() {
_assert_send_sync::<Box<dyn layer0::Orchestrator>>();
}
#[test]
fn environment_is_object_safe_and_send_sync() {
_assert_send_sync::<Box<dyn layer0::Environment>>();
}
#[test]
fn dispatch_middleware_is_object_safe_and_send_sync() {
_assert_send_sync::<Box<dyn layer0::DispatchMiddleware>>();
}
}
These tests cost nothing at runtime – they are purely compile-time assertions. If someone accidentally makes a trait non-object-safe, the test fails to compile.
The same pattern is used for non-Layer-0 traits:
#![allow(unused)]
fn main() {
#[test]
fn tool_dyn_is_object_safe() {
_assert_send_sync::<std::sync::Arc<dyn skg_tool::ToolDyn>>();
}
}
Serde roundtrip tests
All Layer 0 message types must serialize and deserialize correctly. The workspace tests this with roundtrip assertions:
#![allow(unused)]
fn main() {
use layer0::operator::{OperatorInput, TriggerType};
use layer0::content::Content;
#[test]
fn operator_input_roundtrips() {
let input = OperatorInput::new(Content::text("hello"), TriggerType::User);
let json = serde_json::to_string(&input).unwrap();
let roundtripped: OperatorInput = serde_json::from_str(&json).unwrap();
assert_eq!(roundtripped.message, input.message);
}
}
Mock providers for operator testing
To test operators without making real API calls, create a mock Provider:
#![allow(unused)]
fn main() {
use skg_turn::provider::{Provider, ProviderError};
use skg_turn::infer::{InferRequest, InferResponse};
use std::future::Future;
struct MockProvider {
responses: Vec<InferResponse>,
}
impl Provider for MockProvider {
fn infer(
&self,
_request: InferRequest,
) -> impl Future<Output = Result<InferResponse, ProviderError>> + Send {
let response = self.responses[0].clone(); // simplified
async move { Ok(response) }
}
}
}
Then construct a Context and call react_loop with the mock provider:
#![allow(unused)]
fn main() {
use skg_context_engine::{Context, react_loop, ReactLoopConfig};
use skg_tool::{ToolRegistry, ToolCallContext};
use layer0::context::{Message, Role};
let mut ctx = Context::new("You are a helpful assistant.");
ctx.inject_message(Message::new(Role::User, "Hello"));
let tools = ToolRegistry::new();
let tool_ctx = ToolCallContext::empty();
let config = ReactLoopConfig::default();
// Now test without network calls
react_loop(&mut ctx, &mock_provider, &tools, &tool_ctx, &config).await.unwrap();
}
Mock tools
Create test tools by implementing ToolDyn:
```rust
use skg_tool::{ToolDyn, ToolError};
use serde_json::{json, Value};
use std::future::Future;
use std::pin::Pin;

struct AlwaysSucceedTool;

impl ToolDyn for AlwaysSucceedTool {
    fn name(&self) -> &str { "test_tool" }
    fn description(&self) -> &str { "Always succeeds" }
    fn input_schema(&self) -> Value { json!({"type": "object"}) }
    fn call(
        &self,
        _input: Value,
    ) -> Pin<Box<dyn Future<Output = Result<Value, ToolError>> + Send + '_>> {
        Box::pin(async move { Ok(json!({"result": "ok"})) })
    }
}
```
Testing state stores
Both `MemoryStore` and `FsStore` implement `StateStore`, so you can write generic tests:
```rust
use layer0::state::StateStore;
use layer0::effect::Scope;
use serde_json::json;

async fn test_crud(store: &dyn StateStore) {
    let scope = Scope::Global;

    // Write and read back
    store.write(&scope, "key", json!("value")).await.unwrap();
    let val = store.read(&scope, "key").await.unwrap();
    assert_eq!(val, Some(json!("value")));

    // Delete
    store.delete(&scope, "key").await.unwrap();
    let val = store.read(&scope, "key").await.unwrap();
    assert_eq!(val, None);
}
```
Use `MemoryStore` for fast unit tests. Use `FsStore` with `tempfile::TempDir` for integration tests that exercise filesystem behavior.
Running the test suite
```sh
# Run all tests
cargo test

# Run layer0 tests with the test-utils feature enabled
cargo test --features test-utils -p layer0

# Run tests for a specific crate
cargo test -p skg-context-engine

# Verify no clippy warnings
cargo clippy -- -D warnings
```
Reference
Technical reference material for the skelegent workspace:
- Crate Map – Every crate in the workspace, organized by layer, with one-line descriptions.
- Error Handling – Error types, the `thiserror` pattern, and how errors propagate across protocol boundaries.
For API documentation generated from source code, run `cargo doc --no-deps --open`.
Crate Map
All crates in the skelegent workspace, organized by architectural layer.
Layer 0 – Protocol Traits
| Crate | Description |
|---|---|
| `layer0` | Protocol traits (`Operator`, `Orchestrator`, `StateStore`, `Environment`), middleware traits (`DispatchMiddleware`, `StoreMiddleware`, `ExecMiddleware`), message types, and error types. The stability contract. |
Layer 1 – Operator Implementations
| Crate | Description |
|---|---|
| `skg-turn` | Shared toolkit: `Provider` trait, `InferRequest`, `InferResponse`, `TokenUsage`, provider request/response types, content conversions. |
| `skg-provider-anthropic` | Anthropic Claude API provider. Implements `Provider` for the Messages API. |
| `skg-provider-openai` | OpenAI API provider. Implements `Provider` for the Chat Completions API. |
| `skg-provider-ollama` | Ollama local model provider. Implements `Provider` for the Ollama API. |
| `skg-provider-codex` | OpenAI Codex (Responses API) provider. Implements `Provider` for the Responses API. |
| `skg-tool` | `ToolDyn` trait, `ToolRegistry`, `AliasedTool`. Object-safe tool abstraction. |
| `skg-context` | Conversation context assembly and compaction strategies. |
| `skg-mcp` | MCP (Model Context Protocol) client. Wraps MCP server tools as `ToolDyn` implementations. |
| `skg-context-engine` | Composable three-phase context engine (assembly, inference, reaction). Implements `Operator` with tool execution. |
| `skg-tool-macro` | Proc macro for the `#[skg_tool]` attribute. Generates `ToolDyn` implementations from async functions. |
| `skg-op-single-shot` | Single-shot operator. Implements `Operator` with one model call and no tools. |
| `skg-turn-kit` | Turn engine primitives: `DispatchPlanner`, `ConcurrencyDecider`, `BatchExecutor` (execution-only), `SteeringSource`. |
Layer 2 – Orchestration
| Crate | Description |
|---|---|
| `skg-orch-local` | In-process orchestrator. Implements `Orchestrator` with tokio tasks. |
| `skg-orch-kit` | Shared utilities for orchestrator implementations. |
| `skg-effects-core` | Effect execution trait (`EffectExecutor`), errors, and policy; no implementations. |
| `skg-effects-local` | Local in-process `EffectExecutor` implementation (in-order, best-effort). |
Layer 3 – State
| Crate | Description |
|---|---|
| `skg-state-memory` | In-memory state store. Implements `StateStore` with a `HashMap`. Ephemeral. |
| `skg-state-fs` | Filesystem state store. Implements `StateStore` with file-backed persistence. |
Layer 4 – Environment and Credentials
| Crate | Description |
|---|---|
| `skg-env-local` | Local environment. Implements `Environment` with no isolation (passthrough). |
| `skg-secret` | Secret resolution trait. Defines the interface for secret backends. |
| `skg-secret-vault` | HashiCorp Vault secret backend. |
| `skg-crypto` | Cryptographic utilities and primitives. |
| `skg-auth` | Authentication and authorization abstractions. |
Layer 5 – Cross-Cutting
| Crate | Description |
|---|---|
| `skg-hook-security` | Security middleware: `RedactionMiddleware` (pattern-based content redaction) and `ExfilGuardMiddleware` (data-loss-prevention guardrails). |
Umbrella
| Crate | Description |
|---|---|
| `skelegent` | Umbrella crate. Feature-gated re-exports of all layers. |
Examples
| Crate | Description |
|---|---|
| `custom-operator-barrier` | Example custom operator with barrier scheduling and steering (workspace member at `examples/custom_operator_barrier`). |
Summary
| Layer | Crates |
|---|---|
| 0 | 1 |
| 1 | 12 |
| 2 | 4 |
| 3 | 2 |
| 4 | 5 |
| 5 | 1 |
| Umbrella | 1 |
| Total (excluding examples) | 26 |
Error Handling
Note: This page covers the error type design. Usage examples and error recovery patterns are planned for a future update.
Design pattern
skelegent uses `thiserror` for all error types. Each protocol has its own error enum in `layer0::error`. Error types are `#[non_exhaustive]` so new variants can be added without breaking downstream code.
Every error enum includes an `Other` variant with `#[from] Box<dyn std::error::Error + Send + Sync>` for wrapping arbitrary errors. This provides an escape hatch for implementation-specific errors that do not fit the named variants.
Error types by protocol
OperatorError
Errors from operator execution (Layer 0, `layer0::error::OperatorError`):

```rust
// Variant payloads abbreviated; see layer0::error for exact types.
pub enum OperatorError {
    Model(String),                     // LLM provider error
    SubDispatch { operator, message }, // Sub-dispatch execution error
    ContextAssembly(String),           // Context assembly failed
    Retryable(String),                 // Transient, may succeed on retry
    NonRetryable(String),              // Permanent failure (budget, safety, invalid input)
    Other(Box<dyn Error>),             // Catch-all
}
```
The `Retryable` / `NonRetryable` distinction lets orchestrators make retry decisions without inspecting error details.
OrchError
Errors from orchestration (Layer 0, `layer0::error::OrchError`):

```rust
pub enum OrchError {
    OperatorNotFound(String),     // Operator ID not registered
    WorkflowNotFound(String),     // Workflow ID not found
    DispatchFailed(String),       // Dispatch failed
    SignalFailed(String),         // Signal delivery failed
    OperatorError(OperatorError), // Propagated from operator
    Other(Box<dyn Error>),        // Catch-all
}
```
`OperatorError` propagates into `OrchError` via the `From` trait. If an operator fails during dispatch, the error is wrapped automatically.
StateError
Errors from state operations (Layer 0, `layer0::error::StateError`):

```rust
pub enum StateError {
    NotFound { scope, key }, // Key expected to exist but doesn't
    WriteFailed(String),     // Write operation failed
    Serialization(String),   // Serde error
    Other(Box<dyn Error>),   // Catch-all
}
```
Note: `StateStore::read` returns `Ok(None)` for missing keys. `NotFound` is for higher-level APIs that expect a key to exist.
EnvError
Errors from environment operations (Layer 0, `layer0::error::EnvError`):

```rust
pub enum EnvError {
    ProvisionFailed(String),      // Failed to set up the environment
    IsolationViolation(String),   // Isolation boundary violated
    CredentialFailed(String),     // Credential injection failed
    ResourceExceeded(String),     // Resource limit exceeded
    OperatorError(OperatorError), // Propagated from operator
    Other(Box<dyn Error>),        // Catch-all
}
```
Like `OrchError`, `EnvError` receives propagated `OperatorError` values via `From`.
ProviderError
Errors from LLM providers (Layer 1, `skg_turn::provider::ProviderError`):

```rust
pub enum ProviderError {
    TransientError { message: String, status: Option<u16> }, // HTTP/network failure
    RateLimited,                                             // 429 response
    ContentBlocked { message: String },                      // Content blocked by provider
    AuthFailed(String),                                      // 401/403 response
    InvalidResponse(String),                                 // Response parse failure
    Other(Box<dyn Error>),                                   // Catch-all
}
```
`ProviderError::is_retryable()` returns `true` for `RateLimited` and `TransientError`.
ToolError
Errors from tool operations (Layer 1, `skg_tool::ToolError`):

```rust
pub enum ToolError {
    NotFound(String),        // Tool not in registry
    ExecutionFailed(String), // Tool execution failed
    InvalidInput(String),    // Input didn't match schema
    Other(Box<dyn Error>),   // Catch-all
}
```
Error propagation
Errors propagate upward through the layer stack:
```text
ProviderError / ToolError
        ↓  (mapped by operator implementation)
  OperatorError
        ↓  (From impl)
OrchError / EnvError
```
Provider and tool errors are mapped to `OperatorError` by the operator implementation (e.g., the `react_loop`-based operator maps `ProviderError::RateLimited` to `OperatorError::Retryable`). Operator errors propagate into orchestration and environment errors automatically via `From` impls.
This layered propagation ensures that callers at each level see errors appropriate to their abstraction. An orchestrator sees OrchError, never ProviderError.