Layer 4 - Deterministic capability execution outside the model
pagantic draws a hard line between what the model decides and what the harness executes. This separation is the foundation of the entire tool system.
This split means you can swap models without touching tool code, and you can add tools without retraining models. The contract between them is a JSON schema describing parameters - nothing more.
Every tool in pagantic implements a single interface with four methods.
The interface lives in layers/04_tool:
// Tool is the contract each tool must satisfy.
type Tool interface {
// Info returns tool metadata.
Info() ToolInfo
// Definition returns typed tool schema for the model.
Definition() core.ToolDefinition
// Execute runs tool with parsed args from model.
Execute(args map[string]any) (string, error)
// Available reports if tool is ready now.
Available() (bool, string)
}
Each method has a clear responsibility:
| Method | Purpose | Called by |
|---|---|---|
| Info() | Returns name, type, and description for logging and status display | Registry, observer |
| Definition() | Returns JSON Schema the model uses to construct tool call arguments | Registry (sent to model) |
| Execute() | Runs the tool with parsed arguments; returns result string or error | Registry dispatch |
| Available() | Reports whether tool can run right now; returns reason if not | Registry (filters definitions) |
type ToolInfo struct {
Name string
Type ToolType
Description string
}
type ToolType string
const (
TypeGo ToolType = "go" // Pure Go implementation
TypeCLI ToolType = "cli" // Wraps external binary
)
Here is a complete dice roller tool with detailed comments explaining every method. Use this as a template for your own tools.
package main
import (
"fmt"
"math/rand"
core "github.com/miroslav-matejovsky/pagantic/layers/00_core"
tool "github.com/miroslav-matejovsky/pagantic/layers/04_tool"
)
// diceTool implements tool.Tool for rolling dice.
type diceTool struct{}
// Info returns metadata used for logging and status display.
func (d *diceTool) Info() tool.ToolInfo {
return tool.ToolInfo{
Name: "roll_dice",
Type: tool.TypeGo,
Description: "Rolls N dice with S sides each",
}
}
// Definition returns the schema that tells the model what parameters
// this tool accepts. The model uses this to construct ToolCall arguments.
func (d *diceTool) Definition() core.ToolDefinition {
return core.ToolDefinition{
Name: "roll_dice",
Description: "Roll dice. Returns sum and individual results.",
Parameters: core.Schema{
Type: "object",
Properties: map[string]core.Schema{
"count": {
Type: "integer",
Description: "Number of dice to roll (1-10)",
},
"sides": {
Type: "integer",
Description: "Number of sides per die (4, 6, 8, 10, 12, 20)",
},
},
Required: []string{"count", "sides"},
},
}
}
// Execute runs when the model calls this tool.
// args are parsed from model's JSON - keys match Parameters schema.
func (d *diceTool) Execute(args map[string]any) (string, error) {
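// JSON numbers decode to float64. The two-value assertion returns an
// error for a missing or mistyped argument instead of panicking.
countF, ok := args["count"].(float64)
if !ok {
return "", fmt.Errorf("count must be a number")
}
sidesF, ok := args["sides"].(float64)
if !ok {
return "", fmt.Errorf("sides must be a number")
}
count, sides := int(countF), int(sidesF)
if count < 1 || count > 10 {
return "", fmt.Errorf("count must be 1-10, got %d", count)
}
// rand.Intn panics when its argument is not positive.
if sides < 1 {
return "", fmt.Errorf("sides must be at least 1, got %d", sides)
}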
results := make([]int, count)
sum := 0
for i := range results {
results[i] = rand.Intn(sides) + 1
sum += results[i]
}
return fmt.Sprintf("Rolled %dd%d: %v (sum: %d)", count, sides, results, sum), nil
}
// Available reports whether this tool can run right now.
// Return (false, "reason") to temporarily disable a tool.
func (d *diceTool) Available() (bool, string) {
return true, ""
}
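JSON numbers arrive as float64 in Go's map[string]any. For integer parameters, assert args["key"].(float64) and then convert with int(). Prefer the two-value form of the assertion so a missing or mistyped argument returns an error instead of panicking. Forgetting the float64 step is the most common tool bug.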
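The float64 rule above can be wrapped in a small helper so each tool does not repeat the assertion. The sketch below is an illustration only: intArg is a hypothetical name and is not part of pagantic.

```go
package main

import (
	"fmt"
	"math"
)

// intArg is a hypothetical helper, not part of pagantic. It reads an integer
// parameter from decoded JSON arguments, where every number is a float64.
func intArg(args map[string]any, key string) (int, error) {
	raw, present := args[key]
	if !present {
		return 0, fmt.Errorf("missing argument %q", key)
	}
	f, ok := raw.(float64)
	if !ok {
		return 0, fmt.Errorf("argument %q must be a number, got %T", key, raw)
	}
	// Reject values such as 2.5 rather than silently truncating them.
	if f != math.Trunc(f) {
		return 0, fmt.Errorf("argument %q must be a whole number, got %v", key, f)
	}
	return int(f), nil
}

func main() {
	args := map[string]any{"count": 2.0, "sides": "six"}

	count, err := intArg(args, "count")
	fmt.Println(count, err)

	_, err = intArg(args, "sides")
	fmt.Println(err)
}
```

Inside Execute, a call such as intArg(args, "count") then replaces the inline assertion and turns malformed arguments into ordinary errors.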
Name tools in snake_case and verb-first: roll_dice, search_web, read_file. The model uses these names to decide which tool to call, so make them descriptive.
The Registry is the central dispatch point for all tools.
It holds tools by name, filters by availability, and dispatches execution.
// Create registry from tools
registry := tool.NewRegistry(&diceTool{}, &searchTool{}, &calcTool{})
// Get definitions for model (only available tools)
defs := registry.Definitions()
// Execute by name (dispatched by harness, not model)
output, err := registry.Execute("roll_dice", map[string]any{
"count": 2.0,
"sides": 6.0,
})
// Check availability of all tools
statuses := registry.CheckAvailability()
for _, s := range statuses {
fmt.Printf("%s: available=%v reason=%s\n", s.Name, s.Available, s.Reason)
}
| Method | Returns | Description |
|---|---|---|
| NewRegistry(tools ...Tool) | *Registry | Builds registry from variadic tool list |
| Definitions() | []ToolDefinition | Definitions for available tools only (sent to model) |
| AllDefinitions() | []ToolDefinition | Definitions for all tools regardless of availability |
| Execute(name, args) | (string, error) | Dispatch by name; returns error if tool unknown |
| CheckAvailability() | []ToolStatus | Availability state for all tools, sorted by name |
| Tools() | []Tool | All registered tools, sorted by name |
All tool execution goes through Registry.Execute(). This makes it trivial to add logging, rate limiting, or access control in one place.
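As a rough illustration of the single-dispatch-point idea, the sketch below wraps anything with the documented Execute(name, args) signature in an allowlist check and a call counter. The names executor, guardedExecutor, and fakeRegistry are hypothetical and exist only for this example; fakeRegistry stands in for the real registry.

```go
package main

import (
	"errors"
	"fmt"
)

// executor mirrors the Execute(name, args) signature documented for the
// registry; it is declared locally so the sketch compiles on its own.
type executor interface {
	Execute(name string, args map[string]any) (string, error)
}

// guardedExecutor is a hypothetical wrapper: it enforces an allowlist and
// counts permitted calls before delegating to the wrapped executor.
type guardedExecutor struct {
	next    executor
	allowed map[string]bool
	calls   map[string]int
}

func (g *guardedExecutor) Execute(name string, args map[string]any) (string, error) {
	if !g.allowed[name] {
		return "", fmt.Errorf("tool %q is not permitted", name)
	}
	g.calls[name]++
	return g.next.Execute(name, args)
}

// fakeRegistry stands in for the real registry in this example.
type fakeRegistry struct{}

func (fakeRegistry) Execute(name string, args map[string]any) (string, error) {
	if name != "roll_dice" {
		return "", errors.New("unknown tool: " + name)
	}
	return "Rolled 2d6", nil
}

func main() {
	g := &guardedExecutor{
		next:    fakeRegistry{},
		allowed: map[string]bool{"roll_dice": true},
		calls:   map[string]int{},
	}

	out, err := g.Execute("roll_dice", map[string]any{"count": 2.0, "sides": 6.0})
	fmt.Println(out, err)

	_, err = g.Execute("delete_files", nil)
	fmt.Println(err)
}
```

Because the wrapper has the same method shape as the thing it wraps, several such layers can be stacked without changing any tool code.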
The AgentLoop drives the tool call loop. When the model
produces a tool call instead of text, the harness executes the tool and
feeds the result back. This continues until the model produces a text
response or MaxToolIterations is reached (default: 20).
Each tool result is fed back to the model as a ToolResultMessage.
The default MaxToolIterations is 20. Set it lower for simple agents, higher for complex multi-step workflows. If it is exceeded, the loop returns an error rather than running forever.
Models can return multiple tool calls in a single response. The harness executes all of them and appends all results before the next inference call. This allows the model to gather information in parallel.
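The loop described above can be sketched in a few lines. Everything here is a simplified stand-in rather than pagantic's AgentLoop: toolCall, modelReply, inferFunc, execFunc, and runLoop are hypothetical names, the transcript is a plain string slice, and the sketch assumes one iteration equals one inference round.

```go
package main

import (
	"errors"
	"fmt"
)

// All types here are simplified stand-ins, not pagantic's real types.
type toolCall struct {
	ID, Name string
	Args     map[string]any
}

// modelReply is either final text or a batch of tool calls.
type modelReply struct {
	Text  string
	Calls []toolCall
}

// inferFunc plays the role of the model; it sees the tool results so far.
type inferFunc func(transcript []string) modelReply

// execFunc plays the role of Registry.Execute.
type execFunc func(name string, args map[string]any) (string, error)

var errTooManyIterations = errors.New("max tool iterations exceeded")

// runLoop performs at most maxIter inference rounds. Every tool call in a
// reply is executed and recorded before the next inference.
func runLoop(infer inferFunc, exec execFunc, maxIter int) (string, error) {
	var transcript []string
	for i := 0; i < maxIter; i++ {
		reply := infer(transcript)
		if len(reply.Calls) == 0 {
			return reply.Text, nil // a text response ends the loop
		}
		for _, c := range reply.Calls {
			out, err := exec(c.Name, c.Args)
			if err != nil {
				out = "error: " + err.Error() // the model sees the failure
			}
			transcript = append(transcript, c.ID+": "+out)
		}
	}
	return "", errTooManyIterations
}

// scriptedModel requests two tool calls once, then answers in text.
func scriptedModel(transcript []string) modelReply {
	if len(transcript) == 0 {
		return modelReply{Calls: []toolCall{
			{ID: "call_1", Name: "roll_dice"},
			{ID: "call_2", Name: "roll_dice"},
		}}
	}
	return modelReply{Text: fmt.Sprintf("done after %d tool results", len(transcript))}
}

// fixedTool always succeeds with the same output.
func fixedTool(name string, args map[string]any) (string, error) {
	return "4", nil
}

func main() {
	fmt.Println(runLoop(scriptedModel, fixedTool, 20))
}
```

The sketch runs calls from one response sequentially; whether the real harness runs them concurrently is not stated in the text above.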
Tool definitions use core.ToolDefinition and core.Schema
to describe what a tool does and what parameters it accepts. These get
serialized to JSON Schema and sent to the model so it knows how to
construct tool call arguments.
core.ToolDefinition{
Name: "search_web",
Description: "Search the web for information",
Parameters: core.Schema{
Type: "object",
Properties: map[string]core.Schema{
"query": {
Type: "string",
Description: "Search query",
},
"max_results": {
Type: "integer",
Description: "Maximum results to return (default: 5)",
},
},
Required: []string{"query"},
},
}
The core.Schema type is a JSON Schema subset that covers
the common cases for tool parameters:
type Schema struct {
Type string `json:"type,omitempty"`
Description string `json:"description,omitempty"`
Properties map[string]Schema `json:"properties,omitempty"`
Required []string `json:"required,omitempty"`
Enum []string `json:"enum,omitempty"`
Items *Schema `json:"items,omitempty"`
Default any `json:"default,omitempty"`
}
| Field | Use |
|---|---|
| Type | "object", "string", "integer", "number", "boolean", "array" |
| Properties | Named parameters when Type is "object" |
| Required | List of parameter names the model must always provide |
| Enum | Constrain a string parameter to specific values |
| Items | Schema for array elements when Type is "array" |
| Default | Default value hint for optional parameters |
The model reads Description fields to decide when and how to call a tool. Be specific about ranges, formats, and defaults. Vague descriptions lead to bad tool calls.
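To show how Enum, Items, and Default look once serialized, the sketch below copies the Schema struct shown above into a standalone program and marshals a made-up parameter set. weatherParams and encode are hypothetical names, and the weather parameters are invented purely for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Schema is copied from the struct shown above so the example compiles alone.
type Schema struct {
	Type        string            `json:"type,omitempty"`
	Description string            `json:"description,omitempty"`
	Properties  map[string]Schema `json:"properties,omitempty"`
	Required    []string          `json:"required,omitempty"`
	Enum        []string          `json:"enum,omitempty"`
	Items       *Schema           `json:"items,omitempty"`
	Default     any               `json:"default,omitempty"`
}

// weatherParams is a hypothetical parameter schema with specific
// descriptions, an enum-constrained string, and an array of strings.
func weatherParams() Schema {
	return Schema{
		Type: "object",
		Properties: map[string]Schema{
			"cities": {
				Type:        "array",
				Description: "City names to look up (1-5 entries)",
				Items:       &Schema{Type: "string"},
			},
			"unit": {
				Type:        "string",
				Description: "Temperature unit; defaults to celsius",
				Enum:        []string{"celsius", "fahrenheit"},
				Default:     "celsius",
			},
		},
		Required: []string{"cities"},
	}
}

// encode returns the compact JSON form of a schema.
func encode(s Schema) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", fmt.Errorf("encode schema: %w", err)
	}
	return string(b), nil
}

func main() {
	out, err := encode(weatherParams())
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(out)
}
```

Fields left at their zero value are dropped by omitempty, so the output carries only the constraints that were set.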
Tools can be conditionally available. The Available() method
lets a tool report whether it can run right now and why not. This is
useful for tools that depend on external resources, configuration, or
runtime conditions.
type fileTool struct {
dir string
}
func (f *fileTool) Available() (bool, string) {
if f.dir == "" {
return false, "no working directory configured"
}
info, err := os.Stat(f.dir)
if err != nil || !info.IsDir() {
return false, fmt.Sprintf("directory not accessible: %s", f.dir)
}
return true, ""
}
- registry.Definitions() returns only available tools. The model never sees unavailable tools.
- registry.AllDefinitions() returns all tools regardless of availability.
- registry.CheckAvailability() returns a ToolStatus for each tool with its name, type, description, availability flag, and reason.

// ToolStatus is availability state for one tool.
type ToolStatus struct {
Name string
Type ToolType
Description string
Available bool
Reason string
}
Availability is checked each time Definitions() is called. A tool can become available or unavailable between loop iterations. For example, a tool might become unavailable after a rate limit is hit.
Registry.Execute() does not check availability - it only
checks that the tool exists. The availability filter applies only to
definition listing. This is by design: the model should not see
unavailable tools, but if a tool becomes unavailable mid-loop, the
harness can still try to execute a previously-issued call.
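The rate-limit case can be sketched as a tool with a fixed call budget. quotaTool is a hypothetical example that shows only Execute and Available; a real tool would also implement Info and Definition. Since Registry.Execute() does not check availability, this sketch has Execute enforce the budget itself. It is not safe for concurrent use.

```go
package main

import "fmt"

// quotaTool is a hypothetical tool that reports itself unavailable once a
// fixed call budget is spent. It is not safe for concurrent use.
type quotaTool struct {
	remaining int
}

// Execute guards the budget itself, because a previously issued call may
// still be dispatched after the tool has become unavailable.
func (q *quotaTool) Execute(args map[string]any) (string, error) {
	if q.remaining <= 0 {
		return "", fmt.Errorf("call budget exhausted")
	}
	q.remaining--
	return "ok", nil
}

// Available hides the tool from definition listings once the budget is gone.
func (q *quotaTool) Available() (bool, string) {
	if q.remaining <= 0 {
		return false, "call budget exhausted"
	}
	return true, ""
}

func main() {
	qt := &quotaTool{remaining: 1}
	fmt.Println(qt.Available())

	out, err := qt.Execute(nil)
	fmt.Println(out, err)

	fmt.Println(qt.Available())
}
```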
CLI tools wrap external binaries. They implement the same Tool
interface but shell out to a subprocess for execution. Mark them with
TypeCLI so observers know they depend on external programs.
// Tool that wraps an external binary
type grepTool struct{}
func (g *grepTool) Info() tool.ToolInfo {
return tool.ToolInfo{
Name: "grep_files",
Type: tool.TypeCLI, // marks as external binary wrapper
Description: "Search files using grep",
}
}
func (g *grepTool) Definition() core.ToolDefinition {
return core.ToolDefinition{
Name: "grep_files",
Description: "Search files recursively for a pattern",
Parameters: core.Schema{
Type: "object",
Properties: map[string]core.Schema{
"pattern": {
Type: "string",
Description: "Regular expression pattern to search for",
},
},
Required: []string{"pattern"},
},
}
}
func (g *grepTool) Execute(args map[string]any) (string, error) {
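// A checked assertion returns an error for a missing or non-string pattern instead of panicking.
pattern, ok := args["pattern"].(string)
if !ok || pattern == "" {
return "", fmt.Errorf("pattern must be a non-empty string")
}
// -e marks the next argument as the pattern, so a value starting with "-" is not parsed as a grep option.
cmd := exec.Command("grep", "-r", "-e", pattern, ".")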
output, err := cmd.CombinedOutput()
return string(output), err
}
func (g *grepTool) Available() (bool, string) {
_, err := exec.LookPath("grep")
if err != nil {
return false, "grep binary not found in PATH"
}
return true, ""
}
Use exec.LookPath in Available() to check if the binary exists. This way the tool gracefully disappears from the model's view when the binary is missing.
Validate arguments before handing them to exec.Command. Never pass raw model output as shell arguments without validation.
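One possible shape for that validation step is sketched below. validatePattern and maxPatternLen are hypothetical names, and the length limit is arbitrary. The sketch uses Go's regexp package as a syntax check, which is an assumption: Go's RE2 syntax is not identical to grep's, so it may reject some patterns that grep would accept.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// maxPatternLen is an arbitrary limit chosen for this sketch.
const maxPatternLen = 200

// validatePattern is a hypothetical pre-flight check for a model-supplied
// search pattern. It returns nil when the pattern looks safe to pass on.
func validatePattern(p string) error {
	switch {
	case p == "":
		return fmt.Errorf("pattern is empty")
	case len(p) > maxPatternLen:
		return fmt.Errorf("pattern is %d bytes, limit is %d", len(p), maxPatternLen)
	case strings.ContainsAny(p, "\x00\n"):
		return fmt.Errorf("pattern contains a NUL byte or newline")
	}
	// Go's regexp syntax differs from grep's, so this is only an
	// approximate syntax check.
	if _, err := regexp.Compile(p); err != nil {
		return fmt.Errorf("pattern is not a valid regular expression: %w", err)
	}
	return nil
}

func main() {
	for _, p := range []string{"TODO|FIXME", "", "(unclosed"} {
		fmt.Printf("%q: %v\n", p, validatePattern(p))
	}
}
```

Calling a check like this at the top of Execute keeps rejected input from ever reaching the subprocess.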
Tools are wired into the orchestration layer through the registry.
Both AgentLoop and SpecializedLoop accept
a *tool.Registry.
agent := orchestrate.NewAgentLoop(orchestrate.LoopConfig{
Engine: engine,
Tools: registry, // enable tool calls
OnToolResult: func(name, output string) {
fmt.Printf("[TOOL] %s: %s\n", name, output)
},
})
The OnToolResult callback fires after each tool execution.
Use it for logging, progress display, or streaming tool output to users.
sl := orchestrate.NewSpecializedLoop(orchestrate.SpecializedConfig{
Engine: engine,
Tools: registry, // Phase 1: tool calls
Schema: schema, // Phase 2: structured output
})
// Phase 1: model uses tools to gather info
// Phase 2: model produces structured JSON using gathered info
SpecializedLoop creates a fresh AgentLoop
internally for each call. In phase 1, the model uses tools to gather
information. In phase 2, tools are removed and the model produces
structured JSON output conforming to the given schema.
| Field | Type | Description |
|---|---|---|
| Engine | inference.Engine | LLM inference backend (required) |
| Tools | *tool.Registry | Tool registry; nil disables tool calls |
| SystemPrompt | string | System message prepended to all requests |
| MaxToolIterations | int | Max tool-call loop rounds; 0 uses default (20) |
| OnToolResult | func(name, output string) | Callback after each tool execution |
| Observer | observe.EventLog | Event log for observability |
pagantic treats tool observability as a first-class concern.
The ToolExecutor wraps the registry with event recording:
// ToolExecutor wraps registry with observability.
type ToolExecutor struct {
Registry *Registry
Observer observe.EventLog
}
// Execute runs tool call and records events.
func (te *ToolExecutor) Execute(ctx context.Context, call core.ToolCall) core.ToolResult {
start := time.Now()
te.record(observe.Event{
Timestamp: start,
Layer: "tool",
Action: "execute_start",
Data: map[string]any{
"call_id": call.ID,
"name": call.Name,
"args": call.Arguments,
},
})
// ... execute tool ...
te.record(observe.Event{
Timestamp: time.Now(),
Layer: "tool",
Action: "execute_end",
Data: map[string]any{
"call_id": call.ID,
"name": call.Name,
"args": call.Arguments,
},
Duration: time.Since(start),
Error: err,
})
return result
}
- Every tool call is recorded with execute_start and execute_end actions.
- ToolExecutor checks ctx.Err() before execution. If the context is cancelled, the tool is not run and a cancellation event is recorded.
- A failed tool call returns a ToolResult with IsError: true. The model sees the error text and can retry or adjust its approach.

// ToolResult carries output back to the model.
type ToolResult struct {
CallID string
Name string
Content string
IsError bool // model sees error text and can react
}
Events recorded by the tool layer follow this structure:
observe.Event{
Timestamp: time.Now(),
Layer: "tool", // always "tool" for tool events
Action: "execute_start", // or "execute_end", "execute_cancelled"
Data: map[string]any{
"call_id": "call_abc123",
"name": "roll_dice",
"args": map[string]any{"count": 2, "sides": 6},
},
Duration: elapsed, // only on execute_end
Error: err, // nil on success
}
Filter events by Layer: "tool" to see all tool activity. The call_id field lets you correlate start and end events for the same call.
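As an example of correlating events by call_id, the sketch below finds tool calls that started but never finished. The event type is a reduced stand-in for observe.Event, and unfinishedCalls, report, and sampleEvents are hypothetical names. The non-tool event in the sample uses an invented layer name.

```go
package main

import "fmt"

// event is a reduced stand-in for observe.Event with only the fields used here.
type event struct {
	Layer  string
	Action string
	Data   map[string]any
}

// unfinishedCalls returns, in start order, the call_id of every tool call
// that has an execute_start event but no matching execute_end.
func unfinishedCalls(events []event) []string {
	ended := map[string]bool{}
	for _, e := range events {
		if e.Layer == "tool" && e.Action == "execute_end" {
			if id, ok := e.Data["call_id"].(string); ok {
				ended[id] = true
			}
		}
	}
	var open []string
	for _, e := range events {
		if e.Layer != "tool" || e.Action != "execute_start" {
			continue
		}
		if id, ok := e.Data["call_id"].(string); ok && !ended[id] {
			open = append(open, id)
		}
	}
	return open
}

// report summarizes the result for a log line.
func report(events []event) string {
	open := unfinishedCalls(events)
	return fmt.Sprintf("%d unfinished tool call(s): %v", len(open), open)
}

// sampleEvents builds a small log; the "inference" layer name is invented.
func sampleEvents() []event {
	return []event{
		{Layer: "tool", Action: "execute_start", Data: map[string]any{"call_id": "call_a"}},
		{Layer: "tool", Action: "execute_start", Data: map[string]any{"call_id": "call_b"}},
		{Layer: "inference", Action: "request", Data: map[string]any{}},
		{Layer: "tool", Action: "execute_end", Data: map[string]any{"call_id": "call_a"}},
	}
}

func main() {
	fmt.Println(report(sampleEvents()))
}
```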
Tools are deterministic execution points with well-defined safety properties. The safety contract formalizes isolation, idempotency, argument validation, and interaction with retry behavior.
Each tool declares safety metadata that orchestration uses for execution decisions:
// ToolSafety is the safety metadata for a tool.
type ToolSafety struct {
SideEffects SideEffectLevel // none | read | write | external
Idempotent bool // Safe to retry without side effects
Requires []string // Required capabilities: "filesystem", "network", etc.
}
| Side Effect Level | Description | Example |
|---|---|---|
| none | Pure computation, no I/O | Calculator, text formatter |
| read | Reads external state, no mutations | File reader, API query |
| write | Mutates local state | File writer, database insert |
| external | Calls external services with side effects | Email sender, webhook trigger |
Orchestration uses the idempotency flag to decide retry behavior when a tool call fails:
- Idempotent tools (Idempotent: true) - safe to retry automatically. The orchestration loop can re-execute the tool call without risk of duplicate side effects.
- Non-idempotent tools (Idempotent: false) - retry requires explicit approval or a different recovery path. The loop reports a ToolFailure and lets the caller decide.
Tools validate arguments deterministically before execution. Argument validation errors map to ToolFailure with error code TOOL_EXECUTION_FAILED and details indicating which argument failed.
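The retry rule for idempotent and non-idempotent tools can be sketched as a small decision function. This is an assumption-laden illustration: shouldRetry, the SideEffect constant names, and the maxAttempts cap are all hypothetical, and only the ToolSafety fields come from the metadata listed earlier.

```go
package main

import "fmt"

// SideEffectLevel and ToolSafety follow the metadata fields listed earlier;
// the constant names are invented for this sketch.
type SideEffectLevel string

const (
	SideEffectNone     SideEffectLevel = "none"
	SideEffectRead     SideEffectLevel = "read"
	SideEffectWrite    SideEffectLevel = "write"
	SideEffectExternal SideEffectLevel = "external"
)

type ToolSafety struct {
	SideEffects SideEffectLevel
	Idempotent  bool
	Requires    []string
}

// shouldRetry is a hypothetical policy: retry automatically only when the
// tool is idempotent and attempts remain; otherwise hand the failure back.
// attempt is the number of attempts already made.
func shouldRetry(s ToolSafety, attempt, maxAttempts int) (bool, string) {
	if !s.Idempotent {
		return false, "not idempotent: report ToolFailure to the caller"
	}
	if attempt >= maxAttempts {
		return false, fmt.Sprintf("gave up after %d attempts", attempt)
	}
	return true, fmt.Sprintf("retrying (attempt %d of %d)", attempt+1, maxAttempts)
}

func main() {
	reader := ToolSafety{SideEffects: SideEffectRead, Idempotent: true, Requires: []string{"filesystem"}}
	mailer := ToolSafety{SideEffects: SideEffectExternal, Idempotent: false, Requires: []string{"network"}}

	fmt.Println(shouldRetry(reader, 1, 3))
	fmt.Println(shouldRetry(mailer, 1, 3))
}
```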
ToolSandbox is a conceptual boundary defining what a tool is allowed to access. If a tool exceeds its time limit, TOOL_TIMEOUT is reported.
When a tool becomes unavailable during an active tool loop:
- The Available() check returns false.
- The failure is reported as TOOL_UNAVAILABLE.
Tool execution events carry correlation identifiers linking them to the inference response that triggered the tool call:
- request_id - ties to the SystemRequest
- tool_call_id - matches the ToolCall.ID from the inference result
- step_name - when inside a PlanExecutor step
This enables full causality tracing: inference response -> tool call -> tool result -> next inference.