Tool System

1. The Tool System Philosophy
2. Tool Interface
3. Writing a Tool - Complete Example
4. Tool Registry
5. Tool Call Loop
6. Tool Definitions and Schema
7. Tool Availability
8. CLI Tools (TypeCLI)
9. Using Tools with Orchestration
10. Safety and Observability
Tool Safety Contract

1. The Tool System Philosophy

pagantic draws a hard line between what the model decides and what the harness executes. This separation is the foundation of the entire tool system.

Model decides WHEN to call tools - this decision is probabilistic, driven by the model's understanding of the user's request and the available tool definitions.
Harness decides WHAT gets executed - tool dispatch is deterministic. The harness looks up the tool by name in the registry and runs it. No ambiguity, no inference.
All side effects live in tools, never in the model - the model produces text and tool calls. Tools produce side effects: file I/O, HTTP requests, database queries, shell commands.
Tools are registered, dispatched by name, and observable - every tool execution goes through a single dispatch point (the registry), making it easy to log, trace, and audit.

Design principle: If you can't observe it, you can't debug it. Every tool call is recorded with timing, arguments, and errors through the observer layer.

This split means you can swap models without touching tool code, and you can add tools without retraining models. The contract between them is a JSON schema describing parameters - nothing more.

2. Tool Interface

Every tool in pagantic implements a single interface with four methods. The interface lives in layers/04_tool:

// Tool is the contract each tool must satisfy.
type Tool interface {
    // Info returns tool metadata.
    Info() ToolInfo
    // Definition returns typed tool schema for the model.
    Definition() core.ToolDefinition
    // Execute runs tool with parsed args from model.
    Execute(args map[string]any) (string, error)
    // Available reports if tool is ready now.
    Available() (bool, string)
}

Each method has a clear responsibility:

Method	Purpose	Called by
`Info()`	Returns name, type, and description for logging and status display	Registry, observer
`Definition()`	Returns JSON Schema the model uses to construct tool call arguments	Registry (sent to model)
`Execute()`	Runs the tool with parsed arguments; returns result string or error	Registry dispatch
`Available()`	Reports whether tool can run right now; returns reason if not	Registry (filters definitions)

Supporting Types

type ToolInfo struct {
    Name        string
    Type        ToolType
    Description string
}

type ToolType string

const (
    TypeGo  ToolType = "go"   // Pure Go implementation
    TypeCLI ToolType = "cli"  // Wraps external binary
)

ToolType is metadata only - it does not change execution behavior. It exists so observers and dashboards can distinguish pure-Go tools from tools that shell out to external binaries.

3. Writing a Tool - Complete Example

Here is a complete dice roller tool with detailed comments explaining every method. Use this as a template for your own tools.

package main

import (
    "fmt"
    "math/rand"

    core "github.com/miroslav-matejovsky/pagantic/layers/00_core"
    "github.com/miroslav-matejovsky/pagantic/layers/04_tool"
)

// diceTool implements tool.Tool for rolling dice.
type diceTool struct{}

// Info returns metadata used for logging and status display.
func (d *diceTool) Info() tool.ToolInfo {
    return tool.ToolInfo{
        Name:        "roll_dice",
        Type:        tool.TypeGo,
        Description: "Rolls N dice with S sides each",
    }
}

// Definition returns the schema that tells the model what parameters
// this tool accepts. The model uses this to construct ToolCall arguments.
func (d *diceTool) Definition() core.ToolDefinition {
    return core.ToolDefinition{
        Name:        "roll_dice",
        Description: "Roll dice. Returns sum and individual results.",
        Parameters: core.Schema{
            Type: "object",
            Properties: map[string]core.Schema{
                "count": {
                    Type:        "integer",
                    Description: "Number of dice to roll (1-10)",
                },
                "sides": {
                    Type:        "integer",
                    Description: "Number of sides per die (4, 6, 8, 10, 12, 20)",
                },
            },
            Required: []string{"count", "sides"},
        },
    }
}

// Execute runs when the model calls this tool.
// args are parsed from model's JSON - keys match Parameters schema.
func (d *diceTool) Execute(args map[string]any) (string, error) {
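    // JSON numbers are float64; the two-value assertion avoids a panic
    // when an argument is missing or has the wrong type.
    c, ok := args["count"].(float64)
    if !ok {
        return "", fmt.Errorf("count must be a number")
    }
    s, ok := args["sides"].(float64)
    if !ok {
        return "", fmt.Errorf("sides must be a number")
    }
    count, sides := int(c), int(s)

    if count < 1 || count > 10 {
        return "", fmt.Errorf("count must be 1-10, got %d", count)
    }
    if sides < 2 {
        // rand.Intn panics when its argument is not positive.
        return "", fmt.Errorf("sides must be at least 2, got %d", sides)
    }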

    results := make([]int, count)
    sum := 0
    for i := range results {
        results[i] = rand.Intn(sides) + 1
        sum += results[i]
    }

    return fmt.Sprintf("Rolled %dd%d: %v (sum: %d)", count, sides, results, sum), nil
}

// Available reports whether this tool can run right now.
// Return (false, "reason") to temporarily disable a tool.
func (d *diceTool) Available() (bool, string) {
    return true, ""
}

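JSON number gotcha: All JSON numbers arrive as float64 in Go's map[string]any. Convert integer parameters with int(args["key"].(float64)), and prefer the two-value form (v, ok := args["key"].(float64)), because the single-value assertion panics when the key is missing or holds another type. Forgetting the float64 step is the most common tool bug.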
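The conversion can be centralized in one helper. The sketch below assumes a hypothetical intArg function (it is not part of pagantic) that reads an integer parameter from decoded JSON arguments and returns an error instead of panicking:

```go
package main

import (
	"fmt"
	"math"
)

// intArg is a hypothetical helper (not part of pagantic). It reads an
// integer parameter from decoded JSON arguments without panicking.
func intArg(args map[string]any, key string) (int, error) {
	raw, ok := args[key]
	if !ok {
		return 0, fmt.Errorf("missing argument %q", key)
	}
	f, ok := raw.(float64) // encoding/json decodes every number as float64
	if !ok {
		return 0, fmt.Errorf("argument %q must be a number, got %T", key, raw)
	}
	if f != math.Trunc(f) {
		return 0, fmt.Errorf("argument %q must be a whole number, got %v", key, f)
	}
	return int(f), nil
}

func main() {
	args := map[string]any{"count": 2.0, "sides": "six"}

	count, err := intArg(args, "count")
	fmt.Println(count, err)

	_, err = intArg(args, "sides")
	fmt.Println(err)
}
```

Inside Execute, a call such as count, err := intArg(args, "count") would then replace the inline type assertion.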

Naming convention: Tool names should be snake_case and verb-first: roll_dice, search_web, read_file. The model uses these names to decide which tool to call, so make them descriptive.

4. Tool Registry

The Registry is the central dispatch point for all tools. It holds tools by name, filters by availability, and dispatches execution.

// Create registry from tools
registry := tool.NewRegistry(&diceTool{}, &searchTool{}, &calcTool{})

// Get definitions for model (only available tools)
defs := registry.Definitions()

// Execute by name (dispatched by harness, not model)
output, err := registry.Execute("roll_dice", map[string]any{
    "count": 2.0,
    "sides": 6.0,
})

// Check availability of all tools
statuses := registry.CheckAvailability()
for _, s := range statuses {
    fmt.Printf("%s: available=%v reason=%s\n", s.Name, s.Available, s.Reason)
}

Registry Methods

Method	Returns	Description
`NewRegistry(tools ...Tool)`	`*Registry`	Builds registry from variadic tool list
`Definitions()`	`[]ToolDefinition`	Definitions for available tools only (sent to model)
`AllDefinitions()`	`[]ToolDefinition`	Definitions for all tools regardless of availability
`Execute(name, args)`	`(string, error)`	Dispatch by name; returns error if tool unknown
`CheckAvailability()`	`[]ToolStatus`	Availability state for all tools, sorted by name
`Tools()`	`[]Tool`	All registered tools, sorted by name

Single dispatch point: All tool executions go through Registry.Execute(). This makes it trivial to add logging, rate limiting, or access control in one place.

5. Tool Call Loop

The AgentLoop drives the tool call loop. When the model produces a tool call instead of text, the harness executes the tool and feeds the result back. This continues until the model produces a text response or MaxToolIterations is reached (default: 20).

User: "Roll 2d6 and tell me the result"
  |
  v
Model -> ToolCall{Name: "roll_dice", Args: {count: 2, sides: 6}}
  |
  v
Harness executes: registry.Execute("roll_dice", args)
  |
  v
Result: "Rolled 2d6: [3 5] (sum: 8)"
  |
  v
Tool result appended to messages as ToolResultMessage
  |
  v
Model (again) -> "You rolled 2d6 and got 3 and 5, for a total of 8!"
  |
  v
Final text response returned to user

How It Works Internally

1. User message is appended to conversation memory.
2. All messages (system + history + user) are sent to the inference engine.
3. If the model returns tool calls, each call is executed via the registry.
4. Each tool result is appended as a ToolResultMessage.
5. Messages are sent to the engine again (loop iteration).
6. Steps 3-5 repeat until the model returns text (no tool calls).
7. If MaxToolIterations is exceeded, an error is returned.

Iteration guard: The default MaxToolIterations is 20. Set it lower for simple agents, higher for complex multi-step workflows. If exceeded, the loop returns an error rather than running forever.

Multiple Tool Calls Per Turn

Models can return multiple tool calls in a single response. The harness executes all of them and appends all results before the next inference call. This lets the model request several independent pieces of information in one turn instead of spending one loop iteration per call.

6. Tool Definitions and Schema

Tool definitions use core.ToolDefinition and core.Schema to describe what a tool does and what parameters it accepts. These get serialized to JSON Schema and sent to the model so it knows how to construct tool call arguments.

core.ToolDefinition{
    Name:        "search_web",
    Description: "Search the web for information",
    Parameters: core.Schema{
        Type: "object",
        Properties: map[string]core.Schema{
            "query": {
                Type:        "string",
                Description: "Search query",
            },
            "max_results": {
                Type:        "integer",
                Description: "Maximum results to return (default: 5)",
            },
        },
        Required: []string{"query"},
    },
}

Schema Type

The core.Schema type is a JSON Schema subset that covers the common cases for tool parameters:

type Schema struct {
    Type        string            `json:"type,omitempty"`
    Description string            `json:"description,omitempty"`
    Properties  map[string]Schema `json:"properties,omitempty"`
    Required    []string          `json:"required,omitempty"`
    Enum        []string          `json:"enum,omitempty"`
    Items       *Schema           `json:"items,omitempty"`
    Default     any               `json:"default,omitempty"`
}

Field	Use
`Type`	`"object"`, `"string"`, `"integer"`, `"number"`, `"boolean"`, `"array"`
`Properties`	Named parameters when Type is `"object"`
`Required`	List of parameter names the model must always provide
`Enum`	Constrain a string parameter to specific values
`Items`	Schema for array elements when Type is `"array"`
`Default`	Default value hint for optional parameters

Write good descriptions. The model reads your Description fields to decide when and how to call a tool. Be specific about ranges, formats, and defaults. Vague descriptions lead to bad tool calls.

7. Tool Availability

Tools can be conditionally available. The Available() method lets a tool report whether it can run right now and, if it cannot, why. This is useful for tools that depend on external resources, configuration, or runtime conditions.

type fileTool struct {
    dir string
}

func (f *fileTool) Available() (bool, string) {
    if f.dir == "" {
        return false, "no working directory configured"
    }
    info, err := os.Stat(f.dir)
    if err != nil || !info.IsDir() {
        return false, fmt.Sprintf("directory not accessible: %s", f.dir)
    }
    return true, ""
}

How Availability Affects the Registry

registry.Definitions() returns only available tools. The model never sees unavailable tools.
registry.AllDefinitions() returns all tools regardless of availability.
registry.CheckAvailability() returns a ToolStatus for each tool with its name, type, description, availability flag, and reason.

// ToolStatus is availability state for one tool.
type ToolStatus struct {
    Name        string
    Type        ToolType
    Description string
    Available   bool
    Reason      string
}

Dynamic availability: Availability is checked every time Definitions() is called. A tool can become available or unavailable between loop iterations. For example, a tool might become unavailable after a rate limit is hit.

Unavailable tools can still be executed. Registry.Execute() does not check availability - it only checks that the tool exists. The availability filter applies only to definition listing. This is by design: the model should not see unavailable tools, but if a tool becomes unavailable mid-loop, the harness can still try to execute a previously-issued call.
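The two behaviors can be seen side by side in a small model. In this sketch, miniTool and miniRegistry are hypothetical reductions written for this example: Definitions filters on Available(), Execute checks only that the tool exists, and the tool becomes unavailable once an assumed call budget is used up.

```go
package main

import (
	"fmt"
	"sort"
)

// miniTool is a reduced stand-in for the Tool interface.
type miniTool struct {
	name      string
	remaining int // calls left before the tool reports itself unavailable
}

func (t *miniTool) Available() (bool, string) {
	if t.remaining <= 0 {
		return false, "rate limit reached"
	}
	return true, ""
}

func (t *miniTool) Execute() (string, error) {
	t.remaining--
	return t.name + " ok", nil
}

// miniRegistry is hypothetical: it mirrors the documented behavior only.
type miniRegistry struct{ tools map[string]*miniTool }

// Definitions lists only tools whose Available() returns true.
func (r *miniRegistry) Definitions() []string {
	var names []string
	for name, t := range r.tools {
		if ok, _ := t.Available(); ok {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

// Execute checks existence only, not availability.
func (r *miniRegistry) Execute(name string) (string, error) {
	t, ok := r.tools[name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", name)
	}
	return t.Execute()
}

func main() {
	r := &miniRegistry{tools: map[string]*miniTool{
		"search_web": {name: "search_web", remaining: 1},
		"roll_dice":  {name: "roll_dice", remaining: 100},
	}}
	fmt.Println(r.Definitions()) // both tools are listed
	_, _ = r.Execute("search_web") // uses up the budget
	fmt.Println(r.Definitions()) // search_web is no longer listed
	out, err := r.Execute("search_web")
	fmt.Println(out, err) // the call is still dispatched
}
```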

8. CLI Tools (TypeCLI)

CLI tools wrap external binaries. They implement the same Tool interface but shell out to a subprocess for execution. Mark them with TypeCLI so observers know they depend on external programs.

// Tool that wraps an external binary
type grepTool struct{}

func (g *grepTool) Info() tool.ToolInfo {
    return tool.ToolInfo{
        Name:        "grep_files",
        Type:        tool.TypeCLI,  // marks as external binary wrapper
        Description: "Search files using grep",
    }
}

func (g *grepTool) Definition() core.ToolDefinition {
    return core.ToolDefinition{
        Name:        "grep_files",
        Description: "Search files recursively for a pattern",
        Parameters: core.Schema{
            Type: "object",
            Properties: map[string]core.Schema{
                "pattern": {
                    Type:        "string",
                    Description: "Regular expression pattern to search for",
                },
            },
            Required: []string{"pattern"},
        },
    }
}

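func (g *grepTool) Execute(args map[string]any) (string, error) {
    pattern, ok := args["pattern"].(string)
    if !ok || pattern == "" {
        return "", fmt.Errorf("pattern must be a non-empty string")
    }
    // "-e" keeps a pattern that starts with "-" from being parsed as a flag.
    cmd := exec.Command("grep", "-r", "-e", pattern, ".")
    output, err := cmd.CombinedOutput()
    // Note: grep exits with status 1 when nothing matches, which surfaces here as a non-nil err.
    return string(output), err
}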

func (g *grepTool) Available() (bool, string) {
    _, err := exec.LookPath("grep")
    if err != nil {
        return false, "grep binary not found in PATH"
    }
    return true, ""
}

CLI tool availability: Use exec.LookPath in Available() to check if the binary exists. This way the tool gracefully disappears from the model's view when the binary is missing.

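Security consideration: CLI tools run subprocesses with arguments chosen by the model. exec.Command does not invoke a shell, so shell metacharacters are not interpreted, but a value that begins with "-" can still be read as an option by the target binary. Validate and sanitize arguments before passing them to exec.Command, and never pass raw model output through without validation.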

9. Using Tools with Orchestration

Tools are wired into the orchestration layer through the registry. Both AgentLoop and SpecializedLoop accept a *tool.Registry.

AgentLoop - Multi-turn with tools

agent := orchestrate.NewAgentLoop(orchestrate.LoopConfig{
    Engine:   engine,
    Tools:    registry,  // enable tool calls
    OnToolResult: func(name, output string) {
        fmt.Printf("[TOOL] %s: %s\n", name, output)
    },
})

The OnToolResult callback fires after each tool execution. Use it for logging, progress display, or streaming tool output to users.

SpecializedLoop - Two-phase: tools then structured output

sl := orchestrate.NewSpecializedLoop(orchestrate.SpecializedConfig{
    Engine: engine,
    Tools:  registry,   // Phase 1: tool calls
    Schema: schema,     // Phase 2: structured output
})
// Phase 1: model uses tools to gather info
// Phase 2: model produces structured JSON using gathered info

SpecializedLoop creates a fresh AgentLoop internally for each call. In phase 1, the model uses tools to gather information. In phase 2, tools are removed and the model produces structured JSON output conforming to the given schema.

Two-phase design: Separating tool use from structured output prevents the model from trying to produce JSON while also issuing tool calls. Phase 1 is free-form; phase 2 is constrained.

LoopConfig Fields

Field	Type	Description
`Engine`	`inference.Engine`	LLM inference backend (required)
`Tools`	`*tool.Registry`	Tool registry; nil disables tool calls
`SystemPrompt`	`string`	System message prepended to all requests
`MaxToolIterations`	`int`	Max tool-call loop rounds; 0 uses default (20)
`OnToolResult`	`func(name, output string)`	Callback after each tool execution
`Observer`	`observe.EventLog`	Event log for observability

10. Safety and Observability

pagantic treats tool observability as a first-class concern. The ToolExecutor wraps the registry with event recording:

ToolExecutor

// ToolExecutor wraps registry with observability.
type ToolExecutor struct {
    Registry *Registry
    Observer observe.EventLog
}

// Execute runs tool call and records events.
func (te *ToolExecutor) Execute(ctx context.Context, call core.ToolCall) core.ToolResult {
    start := time.Now()
    te.record(observe.Event{
        Timestamp: start,
        Layer:     "tool",
        Action:    "execute_start",
        Data: map[string]any{
            "call_id": call.ID,
            "name":    call.Name,
            "args":    call.Arguments,
        },
    })

    // ... execute tool ...

    te.record(observe.Event{
        Timestamp: time.Now(),
        Layer:     "tool",
        Action:    "execute_end",
        Data: map[string]any{
            "call_id": call.ID,
            "name":    call.Name,
            "args":    call.Arguments,
        },
        Duration: time.Since(start),
        Error:    err,
    })

    return result
}

Safety Properties

Single dispatch point - All tool executions go through the registry. One place to add guards, rate limits, or access control.
Full event recording - Every tool execution is recorded with call ID, name, arguments, duration, and any error. Events use execute_start and execute_end actions.
Context cancellation - ToolExecutor checks ctx.Err() before execution. If the context is cancelled, the tool is not run and a cancellation event is recorded.
Error as data, not exceptions - Tool errors are returned as tool result messages with IsError: true. The model sees the error text and can retry or adjust its approach.

// ToolResult carries output back to the model.
type ToolResult struct {
    CallID  string
    Name    string
    Content string
    IsError bool  // model sees error text and can react
}

Error recovery: When a tool returns an error, the model receives the error text as a tool result. Smart models will adjust their approach - try different arguments, use a different tool, or explain the failure to the user. This is more resilient than crashing the loop.

Observer Events

Events recorded by the tool layer follow this structure:

observe.Event{
    Timestamp: time.Now(),
    Layer:     "tool",           // always "tool" for tool events
    Action:    "execute_start",  // or "execute_end", "execute_cancelled"
    Data: map[string]any{
        "call_id": "call_abc123",
        "name":    "roll_dice",
        "args":    map[string]any{"count": 2, "sides": 6},
    },
    Duration:  elapsed,          // only on execute_end
    Error:     err,              // nil on success
}

Debugging tip: Filter observer events by Layer: "tool" to see all tool activity. The call_id field lets you correlate start and end events for the same call.

Tool Safety Contract

Tools are deterministic execution points with well-defined safety properties. The safety contract formalizes isolation, idempotency, argument validation, and interaction with retry behavior.

Tool Safety Metadata

Each tool declares safety metadata that orchestration uses for execution decisions:

// ToolSafety is the safety metadata a tool declares.
type ToolSafety struct {
    SideEffects SideEffectLevel // none | read | write | external
    Idempotent  bool            // Safe to retry without duplicating side effects
    Requires    []string        // Required capabilities: "filesystem", "network", etc.
}

Side Effect Level	Description	Example
`none`	Pure computation, no I/O	Calculator, text formatter
`read`	Reads external state, no mutations	File reader, API query
`write`	Mutates local state	File writer, database insert
`external`	Calls external services with side effects	Email sender, webhook trigger

Idempotency and Retry

Orchestration uses the idempotency flag to decide retry behavior when a tool call fails:

Idempotent tools (Idempotent: true) - safe to retry automatically. The orchestration loop can re-execute the tool call without risk of duplicate side effects.
Non-idempotent tools (Idempotent: false) - retry requires explicit approval or a different recovery path. The loop reports a ToolFailure and lets the caller decide.

Retry contract. Tool retries require idempotency. Non-idempotent tool failures are terminal within the current loop iteration unless the caller provides explicit recovery logic.
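The retry contract can be expressed as a small policy function. This is a sketch under stated assumptions: executeWithRetry, the toolSafety type, and the attempt limit are hypothetical names for this example, and the real orchestration layer may apply the rule differently.

```go
package main

import (
	"errors"
	"fmt"
)

// toolSafety carries only the field this sketch needs.
type toolSafety struct{ Idempotent bool }

// executeWithRetry is a hypothetical policy: idempotent tools get up to
// maxAttempts tries, non-idempotent tools get exactly one. It returns the
// output, the number of attempts made, and the final error.
func executeWithRetry(s toolSafety, maxAttempts int, run func() (string, error)) (string, int, error) {
	if !s.Idempotent || maxAttempts < 1 {
		maxAttempts = 1
	}
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		out, err := run()
		if err == nil {
			return out, attempt, nil
		}
		lastErr = err
	}
	return "", maxAttempts, fmt.Errorf("gave up after %d attempt(s): %w", maxAttempts, lastErr)
}

// flaky returns a function that fails its first n calls, then succeeds.
func flaky(n int) func() (string, error) {
	calls := 0
	return func() (string, error) {
		calls++
		if calls <= n {
			return "", errors.New("transient failure")
		}
		return "ok", nil
	}
}

func main() {
	out, attempts, err := executeWithRetry(toolSafety{Idempotent: true}, 3, flaky(2))
	fmt.Println(out, attempts, err)

	out, attempts, err = executeWithRetry(toolSafety{Idempotent: false}, 3, flaky(2))
	fmt.Println(out, attempts, err)
}
```

With the same flaky tool, the idempotent call succeeds on its third attempt while the non-idempotent call fails after one, leaving recovery to the caller.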

Argument Validation

Tools validate arguments deterministically before execution:

Schema validation - arguments checked against the tool's parameter Schema
Tool-specific checks - custom validation logic in the tool implementation (e.g., file path exists, URL is reachable)
Fail-fast - validation errors return immediately without executing the tool body

Argument validation errors map to ToolFailure with error code TOOL_EXECUTION_FAILED and details indicating which argument failed.
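A schema check of this kind might look like the following sketch. The validateArgs and matches functions are hypothetical, the schema struct is a trimmed copy of core.Schema, and rejecting unknown arguments is a choice made for this example rather than documented behavior.

```go
package main

import (
	"fmt"
	"sort"
)

// schema is a trimmed copy of core.Schema with the fields used here.
type schema struct {
	Type       string
	Properties map[string]schema
	Required   []string
}

// validateArgs is a hypothetical pre-execution check: required keys must
// be present, unknown keys are rejected, and values must match their type.
func validateArgs(s schema, args map[string]any) error {
	for _, key := range s.Required {
		if _, ok := args[key]; !ok {
			return fmt.Errorf("missing required argument %q", key)
		}
	}
	keys := make([]string, 0, len(args))
	for k := range args {
		keys = append(keys, k)
	}
	sort.Strings(keys) // fixed order, so the first reported error is deterministic
	for _, key := range keys {
		prop, ok := s.Properties[key]
		if !ok {
			return fmt.Errorf("unknown argument %q", key)
		}
		if !matches(prop.Type, args[key]) {
			return fmt.Errorf("argument %q must be %s, got %T", key, prop.Type, args[key])
		}
	}
	return nil
}

// matches reports whether a decoded JSON value fits a schema type name.
func matches(typ string, val any) bool {
	switch typ {
	case "string":
		_, ok := val.(string)
		return ok
	case "boolean":
		_, ok := val.(bool)
		return ok
	case "number":
		_, ok := val.(float64)
		return ok
	case "integer":
		f, ok := val.(float64)
		return ok && f == float64(int64(f))
	case "array":
		_, ok := val.([]any)
		return ok
	case "object":
		_, ok := val.(map[string]any)
		return ok
	}
	return false
}

// searchSchema matches the search_web parameters shown earlier.
func searchSchema() schema {
	return schema{
		Type: "object",
		Properties: map[string]schema{
			"query":       {Type: "string"},
			"max_results": {Type: "integer"},
		},
		Required: []string{"query"},
	}
}

func main() {
	s := searchSchema()
	fmt.Println(validateArgs(s, map[string]any{"query": "go", "max_results": 5.0}))
	fmt.Println(validateArgs(s, map[string]any{"max_results": 2.5}))
}
```

Running the check before Execute gives the fail-fast behavior described above: a bad argument produces an error naming the offending key, and the tool body never runs.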

Tool Sandbox

ToolSandbox is a conceptual boundary defining what a tool is allowed to access:

Filesystem constraints - which paths a tool can read/write
Network constraints - which hosts/ports a tool can reach
Process constraints - whether a tool can spawn subprocesses

Documentation-level. ToolSandbox is a conceptual model for reasoning about tool safety. Runtime enforcement is not built in but can be implemented via wrapper tools that check constraints before delegating to the inner tool.

Timeout and Rate Limiting

ToolTimeoutPolicy - per-tool deadline. If a tool exceeds its timeout, execution is cancelled and a ToolFailure with code TOOL_TIMEOUT is reported.
ToolRateLimitPolicy - per-tool rate limit (documentation-level). Limits calls per time window to prevent abuse of external APIs.

Mid-Loop Unavailability

When a tool becomes unavailable during an active tool loop:

The tool's Available() check returns false
The registry reports the tool as unavailable
The orchestration loop receives a ToolFailure with code TOOL_UNAVAILABLE
If other tools can satisfy the model's intent, the loop continues with available tools
If the unavailable tool is critical, the loop reports the failure to the caller

Correlation and Observability

Tool execution events carry correlation identifiers linking them to the inference response that triggered the tool call:

request_id - ties to the SystemRequest
tool_call_id - matches the ToolCall.ID from the inference result
step_name - when inside a PlanExecutor step

This enables full causality tracing: inference response -> tool call -> tool result -> next inference.
