
1. The Tool System Philosophy

pagantic draws a hard line between what the model decides and what the harness executes. This separation is the foundation of the entire tool system.

Design principle: If you can't observe it, you can't debug it. Every tool call is recorded with timing, arguments, and errors through the observer layer.

This split means you can swap models without touching tool code, and you can add tools without retraining models. The contract between them is a JSON schema describing parameters - nothing more.

2. Tool Interface

Every tool in pagantic implements a single interface with four methods. The interface lives in layers/04_tool:

// Tool is the contract each tool must satisfy.
type Tool interface {
    // Info returns tool metadata.
    Info() ToolInfo
    // Definition returns typed tool schema for the model.
    Definition() core.ToolDefinition
    // Execute runs tool with parsed args from model.
    Execute(args map[string]any) (string, error)
    // Available reports if tool is ready now.
    Available() (bool, string)
}

Each method has a clear responsibility:

Method | Purpose | Called by
Info() | Returns name, type, and description for logging and status display | Registry, observer
Definition() | Returns JSON Schema the model uses to construct tool call arguments | Registry (sent to model)
Execute() | Runs the tool with parsed arguments; returns result string or error | Registry dispatch
Available() | Reports whether tool can run right now; returns reason if not | Registry (filters definitions)

Supporting Types

type ToolInfo struct {
    Name        string
    Type        ToolType
    Description string
}

type ToolType string

const (
    TypeGo  ToolType = "go"   // Pure Go implementation
    TypeCLI ToolType = "cli"  // Wraps external binary
)
ToolType is metadata only - it does not change execution behavior. It exists so observers and dashboards can distinguish pure-Go tools from tools that shell out to external binaries.

3. Writing a Tool - Complete Example

Here is a complete dice roller tool with detailed comments explaining every method. Use this as a template for your own tools.

package main

import (
    "fmt"
    "math/rand"

    core "github.com/miroslav-matejovsky/pagantic/layers/00_core"
    "github.com/miroslav-matejovsky/pagantic/layers/04_tool"
)

// diceTool implements tool.Tool for rolling dice.
type diceTool struct{}

// Info returns metadata used for logging and status display.
func (d *diceTool) Info() tool.ToolInfo {
    return tool.ToolInfo{
        Name:        "roll_dice",
        Type:        tool.TypeGo,
        Description: "Rolls N dice with S sides each",
    }
}

// Definition returns the schema that tells the model what parameters
// this tool accepts. The model uses this to construct ToolCall arguments.
func (d *diceTool) Definition() core.ToolDefinition {
    return core.ToolDefinition{
        Name:        "roll_dice",
        Description: "Roll dice. Returns sum and individual results.",
        Parameters: core.Schema{
            Type: "object",
            Properties: map[string]core.Schema{
                "count": {
                    Type:        "integer",
                    Description: "Number of dice to roll (1-10)",
                },
                "sides": {
                    Type:        "integer",
                    Description: "Number of sides per die (4, 6, 8, 10, 12, 20)",
                },
            },
            Required: []string{"count", "sides"},
        },
    }
}

// Execute runs when the model calls this tool.
// args are parsed from model's JSON - keys match Parameters schema.
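func (d *diceTool) Execute(args map[string]any) (string, error) {
    // JSON numbers arrive as float64. The two-value assertion returns an
    // error for a missing or mistyped argument instead of panicking.
    countF, ok := args["count"].(float64)
    if !ok {
        return "", fmt.Errorf("count must be a number")
    }
    sidesF, ok := args["sides"].(float64)
    if !ok {
        return "", fmt.Errorf("sides must be a number")
    }
    count, sides := int(countF), int(sidesF)

    if count < 1 || count > 10 {
        return "", fmt.Errorf("count must be 1-10, got %d", count)
    }
    // rand.Intn panics when its argument is <= 0, so reject bad sides early.
    if sides < 2 {
        return "", fmt.Errorf("sides must be at least 2, got %d", sides)
    }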

    results := make([]int, count)
    sum := 0
    for i := range results {
        results[i] = rand.Intn(sides) + 1
        sum += results[i]
    }

    return fmt.Sprintf("Rolled %dd%d: %v (sum: %d)", count, sides, results, sum), nil
}

// Available reports whether this tool can run right now.
// Return (false, "reason") to temporarily disable a tool.
func (d *diceTool) Available() (bool, string) {
    return true, ""
}
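JSON number gotcha: All JSON numbers arrive as float64 in Go's map[string]any. For integer parameters, assert with the two-value form, f, ok := args["key"].(float64), and then convert with int(f). The single-value form int(args["key"].(float64)) panics when the key is missing or holds another type. Forgetting the float64 step is the most common tool bug.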
Naming convention: Tool names should be snake_case and verb-first: roll_dice, search_web, read_file. The model uses these names to decide which tool to call, so make them descriptive.
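
The float64 conversion can be wrapped once and reused across tools. The following is a minimal sketch; intArg is a hypothetical helper written for this example and is not part of pagantic.

```go
package main

import (
    "fmt"
    "math"
)

// intArg is a hypothetical helper (not a pagantic API). It reads an integer
// parameter from decoded JSON arguments and returns an error instead of
// panicking when the value is missing, mistyped, or fractional.
func intArg(args map[string]any, key string) (int, error) {
    raw, ok := args[key]
    if !ok {
        return 0, fmt.Errorf("missing argument %q", key)
    }
    f, ok := raw.(float64) // encoding/json decodes every JSON number as float64
    if !ok {
        return 0, fmt.Errorf("argument %q must be a number, got %T", key, raw)
    }
    if f != math.Trunc(f) {
        return 0, fmt.Errorf("argument %q must be a whole number, got %v", key, f)
    }
    return int(f), nil
}

func main() {
    args := map[string]any{"count": 2.0, "sides": "six"}

    count, err := intArg(args, "count")
    fmt.Println(count, err)

    _, err = intArg(args, "sides")
    fmt.Println(err)
}
```

Inside Execute, a call such as count, err := intArg(args, "count") then replaces the bare type assertion, and the error can be returned to the model as-is.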

4. Tool Registry

The Registry is the central dispatch point for all tools. It holds tools by name, filters by availability, and dispatches execution.

// Create registry from tools
registry := tool.NewRegistry(&diceTool{}, &searchTool{}, &calcTool{})

// Get definitions for model (only available tools)
defs := registry.Definitions()

// Execute by name (dispatched by harness, not model)
output, err := registry.Execute("roll_dice", map[string]any{
    "count": 2.0,
    "sides": 6.0,
})

// Check availability of all tools
statuses := registry.CheckAvailability()
for _, s := range statuses {
    fmt.Printf("%s: available=%v reason=%s\n", s.Name, s.Available, s.Reason)
}

Registry Methods

Method | Returns | Description
NewRegistry(tools ...Tool) | *Registry | Builds registry from variadic tool list
Definitions() | []ToolDefinition | Definitions for available tools only (sent to model)
AllDefinitions() | []ToolDefinition | Definitions for all tools regardless of availability
Execute(name, args) | (string, error) | Dispatch by name; returns error if tool unknown
CheckAvailability() | []ToolStatus | Availability state for all tools, sorted by name
Tools() | []Tool | All registered tools, sorted by name
Single dispatch point: All tool executions go through Registry.Execute(). This makes it trivial to add logging, rate limiting, or access control in one place.
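
To make the dispatch idea concrete, here is a small self-contained sketch of a name-keyed registry. It is an illustration only, not pagantic's implementation: the Tool interface is trimmed to three methods, and AvailableNames and Dispatched are names invented for this example.

```go
package main

import (
    "fmt"
    "sort"
)

// Tool is a trimmed stand-in for the interface described earlier.
type Tool interface {
    Name() string
    Execute(args map[string]any) (string, error)
    Available() (bool, string)
}

// Registry is an illustrative sketch, not pagantic's Registry.
type Registry struct {
    tools      map[string]Tool
    Dispatched []string // every executed tool name, recorded in one place
}

func NewRegistry(tools ...Tool) *Registry {
    r := &Registry{tools: make(map[string]Tool, len(tools))}
    for _, t := range tools {
        r.tools[t.Name()] = t
    }
    return r
}

// AvailableNames plays the role of Definitions(): it lists only tools that
// currently report themselves available, sorted by name.
func (r *Registry) AvailableNames() []string {
    var names []string
    for name, t := range r.tools {
        if ok, _ := t.Available(); ok {
            names = append(names, name)
        }
    }
    sort.Strings(names)
    return names
}

// Execute dispatches by name and fails only when the tool is unknown.
// Because every call passes through here, logging or rate limiting
// could be added at this single point.
func (r *Registry) Execute(name string, args map[string]any) (string, error) {
    t, ok := r.tools[name]
    if !ok {
        return "", fmt.Errorf("unknown tool %q", name)
    }
    r.Dispatched = append(r.Dispatched, name)
    return t.Execute(args)
}

// echoTool is a toy tool used only in this example.
type echoTool struct {
    name  string
    ready bool
}

func (e echoTool) Name() string { return e.name }

func (e echoTool) Execute(args map[string]any) (string, error) {
    return fmt.Sprintf("%s called with %d args", e.name, len(args)), nil
}

func (e echoTool) Available() (bool, string) {
    if !e.ready {
        return false, "disabled for the example"
    }
    return true, ""
}

func main() {
    reg := NewRegistry(
        echoTool{name: "roll_dice", ready: true},
        echoTool{name: "search_web", ready: false},
    )
    fmt.Println(reg.AvailableNames())

    out, err := reg.Execute("roll_dice", map[string]any{"count": 2.0})
    fmt.Println(out, err)

    _, err = reg.Execute("read_file", nil)
    fmt.Println(err)
}
```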

5. Tool Call Loop

The AgentLoop drives the tool call loop. When the model produces a tool call instead of text, the harness executes the tool and feeds the result back. This continues until the model produces a text response or MaxToolIterations is reached (default: 20).

User: "Roll 2d6 and tell me the result"
    |
    v
Model -> ToolCall{Name: "roll_dice", Args: {count: 2, sides: 6}}
    |
    v
Harness executes: registry.Execute("roll_dice", args)
    |
    v
Result: "Rolled 2d6: [3 5] (sum: 8)"
    |
    v
Tool result appended to messages as ToolResultMessage
    |
    v
Model (again) -> "You rolled 2d6 and got 3 and 5, for a total of 8!"
    |
    v
Final text response returned to user

How It Works Internally

  1. User message is appended to conversation memory.
  2. All messages (system + history + user) are sent to the inference engine.
  3. If the model returns tool calls, each call is executed via the registry.
  4. Each tool result is appended as a ToolResultMessage.
  5. Messages are sent to the engine again (loop iteration).
  6. Steps 3-5 repeat until the model returns text (no tool calls).
  7. If MaxToolIterations is exceeded, an error is returned.
Iteration guard: The default MaxToolIterations is 20. Set it lower for simple agents, higher for complex multi-step workflows. If exceeded, the loop returns an error rather than running forever.
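
The steps above can be sketched as a short loop. Everything in this example is a simplified stand-in chosen for illustration (Engine, Executor, Message, Reply, runLoop, and ErrMaxIterations are not pagantic types), and the sketch counts one iteration per inference call that requests tools, which may differ from how AgentLoop counts.

```go
package main

import (
    "errors"
    "fmt"
)

// Simplified stand-ins for illustration; not pagantic types.
type ToolCall struct {
    Name string
    Args map[string]any
}

// Reply is what the engine returns: tool calls, or final text.
type Reply struct {
    Text  string
    Calls []ToolCall
}

type Message struct {
    Role    string
    Content string
}

type Engine func(messages []Message) Reply

type Executor func(call ToolCall) (string, error)

var ErrMaxIterations = errors.New("max tool iterations exceeded")

func runLoop(engine Engine, execute Executor, user string, maxIter int) (string, error) {
    messages := []Message{{Role: "user", Content: user}} // step 1
    for i := 0; i < maxIter; i++ {
        reply := engine(messages) // steps 2 and 5
        if len(reply.Calls) == 0 {
            return reply.Text, nil // step 6: text ends the loop
        }
        for _, call := range reply.Calls { // step 3
            out, err := execute(call)
            if err != nil {
                out = "error: " + err.Error() // the model sees the error text
            }
            messages = append(messages, Message{Role: "tool", Content: out}) // step 4
        }
    }
    return "", ErrMaxIterations // step 7
}

func main() {
    // Scripted engine: ask for a tool once, then answer with text.
    engine := func(messages []Message) Reply {
        last := messages[len(messages)-1]
        if last.Role == "tool" {
            return Reply{Text: "Result: " + last.Content}
        }
        return Reply{Calls: []ToolCall{{Name: "roll_dice"}}}
    }
    execute := func(call ToolCall) (string, error) {
        return "Rolled 2d6: [3 5] (sum: 8)", nil
    }

    out, err := runLoop(engine, execute, "Roll 2d6", 20)
    fmt.Println(out, err)
}
```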

Multiple Tool Calls Per Turn

Models can return multiple tool calls in a single response. The harness executes all of them and appends all results before the next inference call. This lets the model gather several pieces of information in one turn instead of spending one inference round trip per tool.

6. Tool Definitions and Schema

Tool definitions use core.ToolDefinition and core.Schema to describe what a tool does and what parameters it accepts. These get serialized to JSON Schema and sent to the model so it knows how to construct tool call arguments.

core.ToolDefinition{
    Name:        "search_web",
    Description: "Search the web for information",
    Parameters: core.Schema{
        Type: "object",
        Properties: map[string]core.Schema{
            "query": {
                Type:        "string",
                Description: "Search query",
            },
            "max_results": {
                Type:        "integer",
                Description: "Maximum results to return (default: 5)",
            },
        },
        Required: []string{"query"},
    },
}

Schema Type

The core.Schema type is a JSON Schema subset that covers the common cases for tool parameters:

type Schema struct {
    Type        string            `json:"type,omitempty"`
    Description string            `json:"description,omitempty"`
    Properties  map[string]Schema `json:"properties,omitempty"`
    Required    []string          `json:"required,omitempty"`
    Enum        []string          `json:"enum,omitempty"`
    Items       *Schema           `json:"items,omitempty"`
    Default     any               `json:"default,omitempty"`
}
Field | Use
Type | "object", "string", "integer", "number", "boolean", "array"
Properties | Named parameters when Type is "object"
Required | List of parameter names the model must always provide
Enum | Constrain a string parameter to specific values
Items | Schema for array elements when Type is "array"
Default | Default value hint for optional parameters
Write good descriptions. The model reads your Description fields to decide when and how to call a tool. Be specific about ranges, formats, and defaults. Vague descriptions lead to bad tool calls.
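
To see how Enum, Items, and Default end up in the JSON sent to the model, the sketch below copies the Schema struct locally so it runs without pagantic and marshals a small parameter schema. The sortParams and encode helpers and the "order" and "fields" parameters are invented for this example.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// Schema is a local copy of the struct shown above so the example is
// self-contained; in real code you would use core.Schema.
type Schema struct {
    Type        string            `json:"type,omitempty"`
    Description string            `json:"description,omitempty"`
    Properties  map[string]Schema `json:"properties,omitempty"`
    Required    []string          `json:"required,omitempty"`
    Enum        []string          `json:"enum,omitempty"`
    Items       *Schema           `json:"items,omitempty"`
    Default     any               `json:"default,omitempty"`
}

// sortParams is a hypothetical parameter schema: a required string array
// and an optional enum with a default hint.
func sortParams() Schema {
    return Schema{
        Type: "object",
        Properties: map[string]Schema{
            "fields": {
                Type:        "array",
                Description: "Field names to sort by, in priority order",
                Items:       &Schema{Type: "string"},
            },
            "order": {
                Type:        "string",
                Description: "Sort direction (default: asc)",
                Enum:        []string{"asc", "desc"},
                Default:     "asc",
            },
        },
        Required: []string{"fields"},
    }
}

// encode returns the compact JSON form of a schema.
func encode(s Schema) (string, error) {
    b, err := json.Marshal(s)
    return string(b), err
}

func main() {
    pretty, err := json.MarshalIndent(sortParams(), "", "  ")
    if err != nil {
        panic(err)
    }
    fmt.Println(string(pretty))
}
```

Fields left at their zero value are omitted from the output because every field carries omitempty, so a plain string parameter serializes to little more than its type and description.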

7. Tool Availability

Tools can be conditionally available. The Available() method lets a tool report whether it can run right now and, if not, why. This is useful for tools that depend on external resources, configuration, or runtime conditions.

type fileTool struct {
    dir string
}

func (f *fileTool) Available() (bool, string) {
    if f.dir == "" {
        return false, "no working directory configured"
    }
    info, err := os.Stat(f.dir)
    if err != nil || !info.IsDir() {
        return false, fmt.Sprintf("directory not accessible: %s", f.dir)
    }
    return true, ""
}

How Availability Affects the Registry

// ToolStatus is availability state for one tool.
type ToolStatus struct {
    Name        string
    Type        ToolType
    Description string
    Available   bool
    Reason      string
}
Dynamic availability: Availability is checked every time Definitions() is called. A tool can become available or unavailable between loop iterations. For example, a tool might become unavailable after a rate limit is hit.
Unavailable tools can still be executed. Registry.Execute() does not check availability - it only checks that the tool exists. The availability filter applies only to definition listing. This is by design: the model should not see unavailable tools, but if a tool becomes unavailable mid-loop, the harness can still try to execute a previously-issued call.
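
As an illustration of availability changing between loop iterations, here is a sketch of a tool with a fixed call budget. quotaTool is hypothetical and implements only the two methods relevant here. Because Registry.Execute() does not check availability, Execute re-checks the budget itself.

```go
package main

import (
    "fmt"
    "sync"
)

// quotaTool is a hypothetical tool with a fixed call budget. It becomes
// unavailable once the budget is spent.
type quotaTool struct {
    mu        sync.Mutex
    remaining int
}

// Available reports false, with a reason, once the budget is exhausted.
func (q *quotaTool) Available() (bool, string) {
    q.mu.Lock()
    defer q.mu.Unlock()
    if q.remaining <= 0 {
        return false, "call quota exhausted"
    }
    return true, ""
}

// Execute guards the budget on its own, since a call issued before the
// tool became unavailable can still reach it.
func (q *quotaTool) Execute(args map[string]any) (string, error) {
    q.mu.Lock()
    defer q.mu.Unlock()
    if q.remaining <= 0 {
        return "", fmt.Errorf("call quota exhausted")
    }
    q.remaining--
    return "ok", nil
}

func main() {
    q := &quotaTool{remaining: 2}
    for i := 1; i <= 3; i++ {
        ok, reason := q.Available()
        out, err := q.Execute(nil)
        fmt.Printf("call %d: available=%v reason=%q output=%q err=%v\n", i, ok, reason, out, err)
    }
}
```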

8. CLI Tools (TypeCLI)

CLI tools wrap external binaries. They implement the same Tool interface but shell out to a subprocess for execution. Mark them with TypeCLI so observers know they depend on external programs.

// Tool that wraps an external binary
type grepTool struct{}

func (g *grepTool) Info() tool.ToolInfo {
    return tool.ToolInfo{
        Name:        "grep_files",
        Type:        tool.TypeCLI,  // marks as external binary wrapper
        Description: "Search files using grep",
    }
}

func (g *grepTool) Definition() core.ToolDefinition {
    return core.ToolDefinition{
        Name:        "grep_files",
        Description: "Search files recursively for a pattern",
        Parameters: core.Schema{
            Type: "object",
            Properties: map[string]core.Schema{
                "pattern": {
                    Type:        "string",
                    Description: "Regular expression pattern to search for",
                },
            },
            Required: []string{"pattern"},
        },
    }
}

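// Execute needs the fmt and os/exec imports.
func (g *grepTool) Execute(args map[string]any) (string, error) {
    pattern, ok := args["pattern"].(string)
    if !ok || pattern == "" {
        return "", fmt.Errorf("pattern must be a non-empty string")
    }
    // "-e" marks the next argument as the pattern, so a pattern that
    // starts with "-" is not parsed as a grep option.
    cmd := exec.Command("grep", "-r", "-e", pattern, ".")
    output, err := cmd.CombinedOutput()
    // grep exits with status 1 when nothing matches; that is not a failure.
    if exitErr, ok := err.(*exec.ExitError); ok && exitErr.ExitCode() == 1 {
        return "no matches found", nil
    }
    return string(output), err
}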

func (g *grepTool) Available() (bool, string) {
    _, err := exec.LookPath("grep")
    if err != nil {
        return false, "grep binary not found in PATH"
    }
    return true, ""
}
CLI tool availability: Use exec.LookPath in Available() to check if the binary exists. This way the tool gracefully disappears from the model's view when the binary is missing.
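Security consideration: CLI tools run subprocesses with arguments chosen by the model. exec.Command does not invoke a shell, so shell metacharacters are not interpreted, but an argument that begins with "-" can still be read as an option by the target program. Validate and sanitize arguments before passing them to exec.Command, and never pass raw model output to a subprocess without validation.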
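
One way to apply this advice is to validate the pattern and build the argument list in a single function. The sketch below is illustrative: validatePattern, grepArgs, and the 256-byte limit are assumptions made for this example, not pagantic APIs.

```go
package main

import (
    "fmt"
    "strings"
)

// maxPatternLen is an arbitrary limit chosen for this sketch.
const maxPatternLen = 256

// validatePattern rejects patterns that are empty, oversized, or contain
// NUL or newline characters (grep treats newlines as pattern separators).
func validatePattern(p string) error {
    switch {
    case p == "":
        return fmt.Errorf("pattern must not be empty")
    case len(p) > maxPatternLen:
        return fmt.Errorf("pattern longer than %d bytes", maxPatternLen)
    case strings.ContainsAny(p, "\x00\n"):
        return fmt.Errorf("pattern must not contain NUL or newline characters")
    }
    return nil
}

// grepArgs builds the grep argument list. "-e" marks the pattern
// explicitly and "--" ends option parsing before the directory, so
// neither value can be mistaken for a flag.
func grepArgs(pattern, dir string) ([]string, error) {
    if err := validatePattern(pattern); err != nil {
        return nil, err
    }
    return []string{"-r", "-e", pattern, "--", dir}, nil
}

func main() {
    for _, p := range []string{"TODO", "-v", "a\nb"} {
        args, err := grepArgs(p, ".")
        fmt.Printf("%q -> %q, err=%v\n", p, args, err)
    }
}
```

The resulting slice can be passed as exec.Command("grep", args...), which keeps validation and argument construction in one tested place.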

9. Using Tools with Orchestration

Tools are wired into the orchestration layer through the registry. Both AgentLoop and SpecializedLoop accept a *tool.Registry.

AgentLoop - Multi-turn with tools

agent := orchestrate.NewAgentLoop(orchestrate.LoopConfig{
    Engine:   engine,
    Tools:    registry,  // enable tool calls
    OnToolResult: func(name, output string) {
        fmt.Printf("[TOOL] %s: %s\n", name, output)
    },
})

The OnToolResult callback fires after each tool execution. Use it for logging, progress display, or streaming tool output to users.

SpecializedLoop - Two-phase: tools then structured output

sl := orchestrate.NewSpecializedLoop(orchestrate.SpecializedConfig{
    Engine: engine,
    Tools:  registry,   // Phase 1: tool calls
    Schema: schema,     // Phase 2: structured output
})
// Phase 1: model uses tools to gather info
// Phase 2: model produces structured JSON using gathered info

SpecializedLoop creates a fresh AgentLoop internally for each call. In phase 1, the model uses tools to gather information. In phase 2, tools are removed and the model produces structured JSON output conforming to the given schema.

Two-phase design: Separating tool use from structured output prevents the model from trying to produce JSON while also issuing tool calls. Phase 1 is free-form; phase 2 is constrained.

LoopConfig Fields

Field | Type | Description
Engine | inference.Engine | LLM inference backend (required)
Tools | *tool.Registry | Tool registry; nil disables tool calls
SystemPrompt | string | System message prepended to all requests
MaxToolIterations | int | Max tool-call loop rounds; 0 uses default (20)
OnToolResult | func(name, output string) | Callback after each tool execution
Observer | observe.EventLog | Event log for observability

10. Safety and Observability

pagantic treats tool observability as a first-class concern. The ToolExecutor wraps the registry with event recording:

ToolExecutor

// ToolExecutor wraps registry with observability.
type ToolExecutor struct {
    Registry *Registry
    Observer observe.EventLog
}

// Execute runs tool call and records events.
func (te *ToolExecutor) Execute(ctx context.Context, call core.ToolCall) core.ToolResult {
    start := time.Now()
    te.record(observe.Event{
        Timestamp: start,
        Layer:     "tool",
        Action:    "execute_start",
        Data: map[string]any{
            "call_id": call.ID,
            "name":    call.Name,
            "args":    call.Arguments,
        },
    })

    // ... execute tool ...

    te.record(observe.Event{
        Timestamp: time.Now(),
        Layer:     "tool",
        Action:    "execute_end",
        Data: map[string]any{
            "call_id": call.ID,
            "name":    call.Name,
            "args":    call.Arguments,
        },
        Duration: time.Since(start),
        Error:    err,
    })

    return result
}

Safety Properties

// ToolResult carries output back to the model.
type ToolResult struct {
    CallID  string
    Name    string
    Content string
    IsError bool  // model sees error text and can react
}
Error recovery: When a tool returns an error, the model receives the error text as a tool result with IsError set. A capable model can then adjust its approach: try different arguments, use a different tool, or explain the failure to the user. This is more resilient than crashing the loop.

Observer Events

Events recorded by the tool layer follow this structure:

observe.Event{
    Timestamp: time.Now(),
    Layer:     "tool",           // always "tool" for tool events
    Action:    "execute_start",  // or "execute_end", "execute_cancelled"
    Data: map[string]any{
        "call_id": "call_abc123",
        "name":    "roll_dice",
        "args":    map[string]any{"count": 2, "sides": 6},
    },
    Duration:  elapsed,          // only on execute_end
    Error:     err,              // nil on success
}
Debugging tip: Filter observer events by Layer: "tool" to see all tool activity. The call_id field lets you correlate start and end events for the same call.
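
As an example of correlating by call_id, the sketch below scans a slice of events and reports calls that started but never finished. The Event struct is a local stand-in that mirrors only the fields used here, and unfinishedCalls is a hypothetical helper.

```go
package main

import (
    "fmt"
    "sort"
)

// Event is a local stand-in for observe.Event with only the fields
// this example needs.
type Event struct {
    Layer  string
    Action string
    Data   map[string]any
}

// unfinishedCalls returns the call IDs that have an execute_start event
// but no matching execute_end or execute_cancelled event, sorted.
func unfinishedCalls(events []Event) []string {
    open := map[string]bool{}
    for _, ev := range events {
        if ev.Layer != "tool" {
            continue // ignore events from other layers
        }
        id, ok := ev.Data["call_id"].(string)
        if !ok {
            continue
        }
        switch ev.Action {
        case "execute_start":
            open[id] = true
        case "execute_end", "execute_cancelled":
            delete(open, id)
        }
    }
    ids := make([]string, 0, len(open))
    for id := range open {
        ids = append(ids, id)
    }
    sort.Strings(ids)
    return ids
}

func main() {
    events := []Event{
        {Layer: "tool", Action: "execute_start", Data: map[string]any{"call_id": "call_a"}},
        {Layer: "tool", Action: "execute_start", Data: map[string]any{"call_id": "call_b"}},
        {Layer: "inference", Action: "request"},
        {Layer: "tool", Action: "execute_end", Data: map[string]any{"call_id": "call_a"}},
    }
    fmt.Println(unfinishedCalls(events)) // prints [call_b]
}
```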

Tool Safety Contract

Tools are deterministic execution points with well-defined safety properties. The safety contract formalizes isolation, idempotency, argument validation, and interaction with retry behavior.

Tool Safety Metadata

Each tool declares safety metadata that orchestration uses for execution decisions:

// ToolSafety is the safety metadata for a tool.
type ToolSafety struct {
    SideEffects SideEffectLevel // none | read | write | external
    Idempotent  bool            // safe to retry without repeating side effects
    Requires    []string        // required capabilities: "filesystem", "network", etc.
}

Side Effect Level | Description | Example
none | Pure computation, no I/O | Calculator, text formatter
read | Reads external state, no mutations | File reader, API query
write | Mutates local state | File writer, database insert
external | Calls external services with side effects | Email sender, webhook trigger

Idempotency and Retry

Orchestration uses the idempotency flag to decide retry behavior when a tool call fails.

Retry contract. Tool retries require idempotency. Non-idempotent tool failures are terminal within the current loop iteration unless the caller provides explicit recovery logic.
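
A minimal sketch of this contract: retry only when the tool is declared idempotent, otherwise stop after the first failure. The executeWithRetry function, its attempt counting, and the trimmed ToolSafety struct are assumptions of this example; the document does not specify how pagantic implements retries.

```go
package main

import (
    "errors"
    "fmt"
)

// ToolSafety is trimmed to the one field this example needs.
type ToolSafety struct {
    Idempotent bool
}

var errTemporary = errors.New("temporary failure")

// executeWithRetry runs the tool up to maxAttempts times when it is
// idempotent and exactly once otherwise. It returns the output, the
// number of attempts made, and the last error.
func executeWithRetry(safety ToolSafety, maxAttempts int, run func() (string, error)) (string, int, error) {
    attempts := 1
    if safety.Idempotent && maxAttempts > 1 {
        attempts = maxAttempts
    }
    var lastErr error
    for i := 1; i <= attempts; i++ {
        out, err := run()
        if err == nil {
            return out, i, nil
        }
        lastErr = err
    }
    return "", attempts, lastErr
}

func main() {
    failures := 2
    flaky := func() (string, error) {
        if failures > 0 {
            failures--
            return "", errTemporary
        }
        return "ok", nil
    }

    out, n, err := executeWithRetry(ToolSafety{Idempotent: true}, 3, flaky)
    fmt.Println(out, n, err)

    failures = 2
    out, n, err = executeWithRetry(ToolSafety{Idempotent: false}, 3, flaky)
    fmt.Println(out, n, err)
}
```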

Argument Validation

Tools validate arguments deterministically before execution:

  1. Schema validation - arguments checked against the tool's parameter Schema
  2. Tool-specific checks - custom validation logic in the tool implementation (e.g., file path exists, URL is reachable)
  3. Fail-fast - validation errors return immediately without executing the tool body

Argument validation errors map to ToolFailure with error code TOOL_EXECUTION_FAILED and details indicating which argument failed.
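
The schema check in step 1 could look like the sketch below. It is an illustration under stated assumptions: Schema is trimmed to three fields, the ToolFailure shape (Code plus Details) is guessed from the description above, and rejecting unknown arguments is a choice made for this example.

```go
package main

import (
    "fmt"
    "math"
    "sort"
)

// Schema is trimmed to the fields this example needs.
type Schema struct {
    Type       string
    Properties map[string]Schema
    Required   []string
}

// ToolFailure is an assumed shape; only the error code is taken from the text.
type ToolFailure struct {
    Code    string
    Details string
}

func (f *ToolFailure) Error() string { return f.Code + ": " + f.Details }

func fail(format string, a ...any) error {
    return &ToolFailure{Code: "TOOL_EXECUTION_FAILED", Details: fmt.Sprintf(format, a...)}
}

// matchesType checks a decoded JSON value against a schema type name.
func matchesType(want string, v any) bool {
    switch want {
    case "string":
        _, ok := v.(string)
        return ok
    case "integer":
        f, ok := v.(float64)
        return ok && f == math.Trunc(f)
    case "number":
        _, ok := v.(float64)
        return ok
    case "boolean":
        _, ok := v.(bool)
        return ok
    case "array":
        _, ok := v.([]any)
        return ok
    case "object":
        _, ok := v.(map[string]any)
        return ok
    }
    return false
}

// validateArgs checks required names first, then each argument's type.
// Names are visited in sorted order so the reported error is deterministic.
func validateArgs(schema Schema, args map[string]any) error {
    for _, name := range schema.Required {
        if _, ok := args[name]; !ok {
            return fail("missing required argument %q", name)
        }
    }
    names := make([]string, 0, len(args))
    for name := range args {
        names = append(names, name)
    }
    sort.Strings(names)
    for _, name := range names {
        prop, known := schema.Properties[name]
        if !known {
            return fail("unknown argument %q", name)
        }
        if !matchesType(prop.Type, args[name]) {
            return fail("argument %q must be of type %s", name, prop.Type)
        }
    }
    return nil
}

func main() {
    dice := Schema{
        Type:       "object",
        Properties: map[string]Schema{"count": {Type: "integer"}, "sides": {Type: "integer"}},
        Required:   []string{"count", "sides"},
    }
    fmt.Println(validateArgs(dice, map[string]any{"count": 2.0, "sides": 6.0}))
    fmt.Println(validateArgs(dice, map[string]any{"count": 2.0}))
    fmt.Println(validateArgs(dice, map[string]any{"count": "two", "sides": 6.0}))
}
```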

Tool Sandbox

ToolSandbox is a conceptual boundary defining what a tool is allowed to access.

Documentation-level. ToolSandbox is a conceptual model for reasoning about tool safety. Runtime enforcement is not built in but can be implemented via wrapper tools that check constraints before delegating to the inner tool.
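
A wrapper tool of the kind mentioned above might look like the following sketch, which confines a "path" argument to a root directory before delegating. The sandboxedTool type, the single-method Tool interface, and the "path" parameter are assumptions made for this example. The check is lexical only and does not resolve symlinks.

```go
package main

import (
    "fmt"
    "path/filepath"
    "strings"
)

// Tool is trimmed to the one method this example needs.
type Tool interface {
    Execute(args map[string]any) (string, error)
}

// sandboxedTool is a hypothetical wrapper that only lets the inner tool
// see paths under root.
type sandboxedTool struct {
    inner Tool
    root  string
}

func (s sandboxedTool) Execute(args map[string]any) (string, error) {
    p, ok := args["path"].(string)
    if !ok {
        return "", fmt.Errorf("path must be a string")
    }
    if filepath.IsAbs(p) {
        return "", fmt.Errorf("absolute paths are not allowed: %s", p)
    }
    full := filepath.Join(s.root, p) // Join also cleans ".." segments
    rel, err := filepath.Rel(s.root, full)
    if err != nil || rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
        return "", fmt.Errorf("path escapes sandbox root: %s", p)
    }
    // Pass a copy with the resolved path so the caller's map is untouched.
    safe := make(map[string]any, len(args))
    for k, v := range args {
        safe[k] = v
    }
    safe["path"] = full
    return s.inner.Execute(safe)
}

// echoPathTool is a toy inner tool that reports the path it was given.
type echoPathTool struct{}

func (echoPathTool) Execute(args map[string]any) (string, error) {
    return fmt.Sprintf("would read %v", args["path"]), nil
}

func main() {
    t := sandboxedTool{inner: echoPathTool{}, root: "/srv/data"}
    for _, p := range []string{"notes/a.txt", "../etc/passwd", "/etc/passwd"} {
        out, err := t.Execute(map[string]any{"path": p})
        fmt.Printf("%q -> %q, err=%v\n", p, out, err)
    }
}
```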

Timeout and Rate Limiting

Mid-Loop Unavailability

When a tool becomes unavailable during an active tool loop:

  1. The tool's Available() check returns false
  2. The registry reports the tool as unavailable
  3. The orchestration loop receives a ToolFailure with code TOOL_UNAVAILABLE
  4. If other tools can satisfy the model's intent, the loop continues with available tools
  5. If the unavailable tool is critical, the loop reports the failure to the caller

Correlation and Observability

Tool execution events carry correlation identifiers linking them to the inference response that triggered the tool call.

This enables full causality tracing: inference response -> tool call -> tool result -> next inference.