pagantic

Probabilistic Agentic Control System

LLM harness with deterministic control architecture around probabilistic inference.

Overview

pagantic wraps LLM inference with a deterministic control system. Models are probabilistic - they generate tokens stochastically. The harness adds deterministic scaffolding: timeouts, tool execution, output validation, context retrieval, structured output enforcement, candidate reranking, and redundant inference with voting.

Inspired by the article "Harness engineering for coding agent users". Uses kronk for local LLM engine access.

Architecture

Layered system (core at layer 0 plus ten layers above it) with explicit architectural boundaries, adapter-based I/O, and a dedicated engine wrapper. Dependencies flow downward only. No circular imports. Architecture rules are enforced by go-arch-lint.

          +---------------------------+
          |         Adapters          |
          |      cli | tui | api      |
          +-------------+-------------+
                        |
                +-------v-------+
                |  orchestrate  |  Control loop, agent loops
                +-------+-------+
                        |
    +-------+-------+---+---+-------+-------+
    |       |       |       |       |       |
 context  tool constraint rerank validate prompt
    |       |       |       |       |       |
    +-------+-------+---+---+-------+-------+
                        |
                +-------v-------+
                |   inference   |  Engine interface
                +-------+-------+
                        |
                +-------v-------+
                |     kronk     |  SDK adapter
                +-------+-------+
                        |
                +-------v-------+
                |     core      |  Shared domain types
                +---------------+

Dependency Rules

Each layer lives under layers/ with a numeric prefix enforcing direction
Higher layers may depend on lower layers, never the reverse
Cross-layer communication uses interfaces and structural typing
Adapters depend on orchestrate and select layers - never on each other
Core has zero dependencies - it is shared vocabulary only

Layers

Layer	Package	Purpose
0	core	Shared domain types - Message, ToolCall, Schema, TokenUsage
1	inference	Engine interface for model inference
2	orchestrate	Control loop - AgentLoop, SpecializedLoop, PlanExecutor, RedundantLoop
3	context	Knowledge retrieval - Retriever, ContextBuilder
4	tool	Tool registry and execution
5	constraint	Output enforcement - JSON validation, repair, schema, GBNF grammar
6	rerank	Candidate scoring and reranking
7	validate	Guardrails, rule validation, retry policy
8	prompt	Prompt construction - SystemPrompt, InstructionSet, Template
9	memory	State management - ConversationBuffer, SessionState, WorkingMemory
10	observe	Tracing, metrics, event logging, cost tracking

Layer 0 - core: shared vocabulary

Shared domain types used across all layers. Simple data structures with no behavior beyond construction helpers. No package may depend on core for logic - only for type definitions.

Key types:

Message - conversation message with Role, Content, ToolCalls
Schema - JSON Schema subset for structured output and tool parameters
ToolCall - assistant's request to execute a tool
ToolDefinition - tool metadata and parameter schema
TokenUsage - token consumption and throughput metrics

// One constructor per role (distinct names so the snippet compiles in one scope)
userMsg := core.NewUserMessage("Hello")
systemMsg := core.NewSystemMessage("You are helpful.")
assistantMsg := core.NewAssistantMessage("Hi there!")
toolMsg := core.NewToolResultMessage(callID, "search", "results...")

Layer 1 - inference: execution substrate

Accepts structured prompt input and produces raw model output. Knows nothing about tools, workflows, schemas beyond transport, or business rules.

Key types:

Engine - core inference abstraction: Infer(ctx, Request) (*Result, error)
Request - typed input: Messages, Tools, Schema, Grammar, MaxTokens, Temperature
Result - typed output: Content, ToolCalls, Messages, Usage
StreamHandler - callbacks for streaming: OnContent, OnReasoning, OnToolCall

result, err := engine.Infer(ctx, inference.Request{
    Messages:  messages,
    MaxTokens: 2048,
    Grammar:   `root ::= "yes" | "no"`,  // GBNF decoder constraint
})

Layer 2 - orchestrate: control loop

Drives work across many inference steps. Breaks requests into steps, routes work, enforces order, manages retries and tool loops.

Key types:

AgentLoop - stateful multi-turn agent with tool call loop
SpecializedLoop - stateless single-shot with optional tools then schema-constrained output
PlanExecutor - runs ExecutionPlan steps with pluggable handlers
RedundantLoop - N-version inference with voting for reliability
ContextProvider - interface for context retrieval (structural typing)
CandidateReranker - interface for reranking (structural typing)
VotingStrategy - MajorityVoting, UnanimityVoting

// Multi-turn chat with tools
agent := orchestrate.NewAgentLoop(orchestrate.LoopConfig{
    Engine:       engine,
    Tools:        registry,
    SystemPrompt: "You are helpful.",
})
result, err := agent.Chat(ctx, "Roll 2d6")

// Structured single-shot
sl := orchestrate.NewSpecializedLoop(orchestrate.SpecializedConfig{
    Engine: engine, Schema: schema, Grammar: grammarStr,
})
result, err := sl.Call(ctx, "Classify this text")

// Redundant inference
rl := orchestrate.NewRedundantLoop(orchestrate.RedundantConfig{
    Engine: engine, Schema: schema, N: 3, Voting: orchestrate.MajorityVoting{},
})
result, err := rl.Call(ctx, prompt)

Layer 3 - context: knowledge retrieval

Gives the model relevant, bounded knowledge. The model never acts on unconstrained knowledge.

Key types:

Retriever - interface: Retrieve(ctx, query, maxResults) ([]Chunk, error)
InMemoryRetriever - keyword-matching retriever for bounded domains
ContextBuilder - assembles retrieved chunks into system messages
Document - source document with Content and Source
Chunk - scored document fragment

retriever := pctx.NewInMemoryRetriever(
    pctx.Document{Content: "Go interfaces are implicit.", Source: "go-docs"},
)
builder := &pctx.ContextBuilder{Retriever: retriever, MaxChunks: 3}
msgs, err := builder.Build(ctx, "How do interfaces work?")

Layer 4 - tool: deterministic capability

Runs operations outside the model. All side effects live here, never in the model.

Key types:

Tool - interface: Info, Definition, Execute, Available
Registry - groups tools and dispatches execution by name
ToolExecutor - adds observability around registry calls
ToolType - TypeGo (pure Go) or TypeCLI (external binary)

registry := tool.NewRegistry(&myTool{})
output, err := registry.Execute("tool_name", args)
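
Dispatch by name can be sketched without the library. The block below is a minimal illustration, not pagantic's implementation: the `tool` interface is a reduced, hypothetical stand-in for Tool (only a name and an Execute method), and `echoTool` exists only for the demo.

```go
package main

import "fmt"

// tool is a reduced, hypothetical stand-in for the Tool interface.
type tool interface {
	Name() string
	Execute(args map[string]any) (string, error)
}

// registry maps tool names to implementations.
type registry struct {
	tools map[string]tool
}

func newRegistry(tools ...tool) *registry {
	r := &registry{tools: make(map[string]tool, len(tools))}
	for _, t := range tools {
		r.tools[t.Name()] = t
	}
	return r
}

// Execute looks the tool up by name and runs it; unknown names are an error.
func (r *registry) Execute(name string, args map[string]any) (string, error) {
	t, ok := r.tools[name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", name)
	}
	return t.Execute(args)
}

// echoTool returns its "text" argument unchanged (demo only).
type echoTool struct{}

func (echoTool) Name() string { return "echo" }

func (echoTool) Execute(args map[string]any) (string, error) {
	text, ok := args["text"].(string)
	if !ok {
		return "", fmt.Errorf("echo: missing string argument %q", "text")
	}
	return text, nil
}

func main() {
	r := newRegistry(echoTool{})
	out, err := r.Execute("echo", map[string]any{"text": "hello"})
	fmt.Println(out, err)
	_, err = r.Execute("missing", nil)
	fmt.Println(err)
}
```

Returning an error for an unknown name keeps a hallucinated tool call from the model a recoverable event rather than a panic.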

Layer 5 - constraint: output enforcement

Enforces deterministic output at the system boundary. No unstructured text crosses.

Key types:

OutputValidator - validates raw model output
JSONValidator - checks and optionally repairs JSON
SchemaValidator - validates JSON against core.Schema
RepairJSON - closes truncated JSON (missing braces, brackets, quotes)
NormalizeEnumValues - rewrites enum strings to canonical case
GrammarDefinition - holds GBNF grammar for decoder constraints
DecoderConstraint - interface for decoder-level enforcement
GrammarConstraint - wraps GrammarDefinition as DecoderConstraint

// Post-hoc validation
sv := constraint.NewSchemaValidator(schema)
result := sv.Validate(jsonOutput)

// Decoder constraint (prevents invalid tokens)
grammar := constraint.GrammarDefinition{
    Name:    "sentiment",
    Grammar: `root ::= "positive" | "negative" | "neutral"`,
}
err := grammar.Validate()  // checks for root rule

Layer 6 - rerank: candidate evaluation

Scores retrieved or generated candidates, reorders by relevance, picks best subset. Secondary reasoning layer correcting initial approximations.

Key types:

Candidate - scored item with Content, Score, Source, Metadata
CandidateSet - groups candidates with original query
RelevanceScorer - interface: Score(ctx, query, candidates)
SimpleScorer - keyword overlap scorer (dev/testing only)
Reranker - combines scorer with selection policy
SelectionPolicy - TopK and MinScore thresholds

reranker := &rerank.Reranker{
    Scorer: &rerank.SimpleScorer{},
    Policy: rerank.SelectionPolicy{TopK: 3, MinScore: 0.1},
}
results, err := reranker.Rerank(ctx, rerank.CandidateSet{
    Query: "interfaces", Candidates: candidates,
})
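
The scoring and selection steps can be illustrated in isolation. The sketch below assumes that keyword overlap means "fraction of distinct query words found in the candidate" and that selection applies MinScore before TopK; SimpleScorer and SelectionPolicy may differ. All names here (`tokens`, `overlap`, `rerank`, `scored`) are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

type scored struct {
	Content string
	Score   float64
}

// tokens lowercases s and splits it on anything that is not a letter or digit.
func tokens(s string) []string {
	return strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// overlap returns the fraction of distinct query words that occur in content.
func overlap(query, content string) float64 {
	want := make(map[string]bool)
	for _, w := range tokens(query) {
		want[w] = true
	}
	if len(want) == 0 {
		return 0
	}
	have := make(map[string]bool)
	for _, w := range tokens(content) {
		have[w] = true
	}
	hits := 0
	for w := range want {
		if have[w] {
			hits++
		}
	}
	return float64(hits) / float64(len(want))
}

// rerank scores each item, drops scores below minScore, sorts best first,
// and keeps at most topK results.
func rerank(query string, contents []string, topK int, minScore float64) []scored {
	if topK < 0 {
		topK = 0
	}
	var out []scored
	for _, c := range contents {
		if s := overlap(query, c); s >= minScore {
			out = append(out, scored{Content: c, Score: s})
		}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	if len(out) > topK {
		out = out[:topK]
	}
	return out
}

func main() {
	docs := []string{
		"Go interfaces are implicit.",
		"Channels pass values between goroutines.",
		"Interfaces describe behavior.",
	}
	for _, r := range rerank("How do interfaces work?", docs, 3, 0.1) {
		fmt.Printf("%.2f %s\n", r.Score, r.Content)
	}
}
```

The stable sort keeps equally scored candidates in retrieval order, so repeated runs give the same ranking.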

Layer 7 - validate: guardrails

Guards final output against hard constraints, rules, and repair paths. Two kinds: deterministic validation and inferential validation.

Key types:

RuleValidator - runs deterministic rule checks on output
Rule - named check function
SemanticValidator - interface for LLM-backed validation
RepairStrategy - interface for fixing invalid output
JSONRepairStrategy - wraps constraint.RepairJSON
RetryPolicy - configurable retry with backoff

rv := validate.NewRuleValidator(
    validate.Rule{Name: "not-empty", Check: func(s string) error {
        if s == "" { return fmt.Errorf("empty output") }
        return nil
    }},
)
errs := rv.Validate(output)

rp := &validate.RetryPolicy{MaxRetries: 3, Backoff: time.Second}
err := rp.Execute(ctx, func() error { return doWork() })

Layer 8 - prompt: feedforward control

Shapes model behavior before execution through structured prompt construction. Keeps prompt assembly explicit, testable, and reusable.

Key types:

SystemPrompt - builds system message from base + instructions + policy
InstructionSet - named group of rules for system prompt
Template - prompt with variable placeholders
ContextPolicy - controls what context is included

sp := &prompt.SystemPrompt{
    Base: "You are a helpful assistant.",
    Instructions: []prompt.InstructionSet{
        {Name: "Safety", Rules: []string{"Never reveal secrets", "Be polite"}},
    },
}
msg := sp.Build()

tmpl := &prompt.Template{
    Raw:       "Translate {{.Text}} to {{.Language}}",
    Variables: map[string]string{"Text": "hello", "Language": "French"},
}
result, err := tmpl.Render()
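
The `{{.Text}}` placeholders match the syntax of Go's standard `text/template` package. Whether Template uses that package internally is an assumption; the sketch below only shows how the same rendering could be done with the standard library, with `render` as a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render fills {{.Name}} placeholders from vars and fails on a missing key
// instead of silently emitting an empty value.
func render(raw string, vars map[string]string) (string, error) {
	tmpl, err := template.New("prompt").Option("missingkey=error").Parse(raw)
	if err != nil {
		return "", fmt.Errorf("parse template: %w", err)
	}
	var b strings.Builder
	if err := tmpl.Execute(&b, vars); err != nil {
		return "", fmt.Errorf("render template: %w", err)
	}
	return b.String(), nil
}

func main() {
	out, err := render("Translate {{.Text}} to {{.Language}}", map[string]string{
		"Text":     "hello",
		"Language": "French",
	})
	fmt.Println(out, err)
}
```

Failing on a missing variable keeps an incomplete prompt from reaching the model unnoticed, which fits the goal of testable prompt assembly.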

Layer 9 - memory: state management

Keeps state outside prompt text. State stays explicit so other layers can inspect, reset, persist, and version it.

Key types:

ConversationBuffer - message history with optional size limit
SessionState - thread-safe key-value store across steps
WorkingMemory - transient per-step context and results

buf := memory.NewConversationBuffer(50)  // keep last 50 messages
buf.Append(core.NewUserMessage("hello"))

state := memory.NewSessionState()
state.Set("phase", "analysis")
val, ok := state.Get("phase")
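
The two ideas here, a size-limited history and a thread-safe key-value store, can be sketched in a few lines. This is an illustration under assumptions, not the memory package: it assumes the buffer drops the oldest messages first and uses plain strings in place of core.Message. `boundedBuffer` and `sessionState` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// boundedBuffer keeps at most limit entries, discarding the oldest first.
// A limit of zero or less means unbounded.
type boundedBuffer struct {
	limit int
	items []string
}

func (b *boundedBuffer) Append(msg string) {
	b.items = append(b.items, msg)
	if b.limit > 0 && len(b.items) > b.limit {
		// Copy the tail so the dropped prefix can be garbage collected.
		b.items = append([]string(nil), b.items[len(b.items)-b.limit:]...)
	}
}

// sessionState is a key-value store guarded by a read-write mutex.
type sessionState struct {
	mu   sync.RWMutex
	data map[string]string
}

func newSessionState() *sessionState {
	return &sessionState{data: make(map[string]string)}
}

func (s *sessionState) Set(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
}

func (s *sessionState) Get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

func main() {
	buf := &boundedBuffer{limit: 2}
	for _, m := range []string{"first", "second", "third"} {
		buf.Append(m)
	}
	fmt.Println(buf.items)

	state := newSessionState()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			state.Set(fmt.Sprintf("step-%d", i), "done")
		}(i)
	}
	wg.Wait()
	v, ok := state.Get("step-3")
	fmt.Println(v, ok)
}
```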

Layer 10 - observe: observability

Makes the system measurable and debuggable. No execution without traceability.

Key types:

EventLog - interface: Record(Event), Events()
TraceRecorder - interface: StartSpan returns context + Span
MetricsCollector - interface: latency, tokens, counters
CostTracker - interface: usage recording and cost computation
InMemoryEventLog, InMemoryTracer, InMemoryMetrics - in-memory implementations
NoopEventLog, NoopTracer - no-op implementations

log := &observe.InMemoryEventLog{}
log.Record(observe.Event{
    Timestamp: time.Now(), Layer: "tool", Action: "execute",
    Data: map[string]any{"name": "search"}, Duration: elapsed,
})
events := log.Events()

Adapters

Adapters are thin boundary layers between external interaction channels and the internal execution core. Each adapter is stateless, deterministic, and replaceable.

Adapter	Purpose	Key Types
cli	Single-shot command-line execution. Read prompt, run inference, write result, exit.	`Runner`, `RunConfig`, `ReadPrompt`
tui	Interactive terminal UI. REPL with commands, streaming chat, ANSI colors.	`AgentREPL`, `AgentConfig`, `Command`
api	Service interface contracts. Request/Response types, validation, streaming, errors.	`ChatRequest`, `ChatResponse`, `APIError`

CLI Adapter

runner := cli.NewRunner(cli.RunConfig{
    Engine:       engine,
    SystemPrompt: "You are helpful.",
    Out:          os.Stdout,
})
err := runner.Run(ctx, "What is Go?")

TUI Adapter

repl := tui.NewAgentREPL(tui.AgentConfig{
    Title:        "my-agent",
    Banner:       "Welcome! Type /help for commands.",
    SystemPrompt: "You are helpful.",
    EngineLoader: func(ctx context.Context) (inference.Engine, func(), error) {
        krn, cleanup, err := kronk.Load(ctx, kronk.Config{ModelSource: model})
        if err != nil {
            return nil, nil, err // do not wrap a failed load in an adapter
        }
        return kronk.NewAdapter(krn, nil), cleanup, nil
    },
})
repl.Run(ctx)

Engine (kronk)

The kronk package wraps the Ardan Labs kronk SDK for local LLM inference. It handles library installation, model downloading, and engine initialization through a single Load call.

Usage

krn, cleanup, err := kronk.Load(ctx, kronk.Config{
    ModelSource: "unsloth/gemma-4-E4B-it",
})
if err != nil {
    return err // handle the load failure before deferring cleanup
}
defer cleanup()

engine := kronk.NewAdapter(krn, nil)
// engine satisfies inference.Engine

How It Works

Load downloads kronk runtime libraries (llama.cpp bindings)
Downloads model files from HuggingFace
Initializes inference engine with model
Returns kronk instance + cleanup function

Adapter implements inference.Engine by converting between pagantic core types and kronk's model.D format. Streaming is handled via channel-based ChatStreaming.

Features

GBNF Grammar Constraints

GBNF (GGML BNF) grammars constrain token generation at the decoder level. Unlike post-hoc validation, which checks output after generation, grammar constraints prevent invalid tokens from being generated at all.

// Define grammar
grammar := constraint.GrammarDefinition{
    Name:    "sentiment",
    Grammar: `root ::= "{" ws "\"sentiment\"" ws ":" ws val ws "}"
ws  ::= [ \t\n]*
val ::= "\"positive\"" | "\"negative\"" | "\"neutral\""`,
}
err := grammar.Validate()  // checks for root rule

// Use in SpecializedLoop
sl := orchestrate.NewSpecializedLoop(orchestrate.SpecializedConfig{
    Engine:  engine,
    Schema:  schema,
    Grammar: grammar.GrammarString(),
})

ExecutionPlan and Step Routing

PlanExecutor runs typed steps in sequence. Output of step N feeds input of step N+1. Handlers are pluggable per step type.

plan := orchestrate.ExecutionPlan{
    Steps: []orchestrate.Step{
        {Name: "retrieve", Type: orchestrate.StepRetrieve, Input: query},
        {Name: "rerank",   Type: orchestrate.StepRerank},
        {Name: "infer",    Type: orchestrate.StepInfer},
    },
}
executor := orchestrate.NewPlanExecutor(map[orchestrate.StepType]orchestrate.StepHandler{
    orchestrate.StepRetrieve: orchestrate.RetrieveHandler(contextProvider),
    orchestrate.StepRerank:   rerankHandler,
    orchestrate.StepInfer:    inferHandler,
})
results, err := executor.Execute(ctx, plan)

Redundant Inference (TMR / N-Version)

RedundantLoop runs the same inference N times and applies a voting strategy to pick the best result. Trades latency for reliability.

rl := orchestrate.NewRedundantLoop(orchestrate.RedundantConfig{
    Engine: engine,
    Schema: schema,
    N:      3,                            // 3 candidates
    Voting: orchestrate.MajorityVoting{}, // pick most frequent
})
result, err := rl.Call(ctx, prompt)
// result.Confidence = 1.0 if all 3 agree
// result.Candidates has all 3 raw outputs
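
Majority voting itself is a small deterministic function. The sketch below assumes confidence is the fraction of candidates that agree with the winner (consistent with 1.0 when all agree) and that ties go to the candidate seen first; MajorityVoting may define both differently. `majorityVote` is a hypothetical name.

```go
package main

import "fmt"

// majorityVote returns the most frequent candidate and the fraction of
// candidates that agree with it. Ties go to the candidate seen first.
func majorityVote(candidates []string) (winner string, confidence float64) {
	if len(candidates) == 0 {
		return "", 0
	}
	counts := make(map[string]int, len(candidates))
	for _, c := range candidates {
		counts[c]++
	}
	// Scan in input order so tie-breaking does not depend on map iteration.
	best := 0
	for _, c := range candidates {
		if counts[c] > best {
			winner, best = c, counts[c]
		}
	}
	return winner, float64(best) / float64(len(candidates))
}

func main() {
	winner, conf := majorityVote([]string{"positive", "positive", "neutral"})
	fmt.Printf("%s %.2f\n", winner, conf)
}
```

Exact string comparison only works when candidates are canonical, which is one reason to pair voting with a schema or grammar constraint so equivalent answers serialize identically.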

Reranking Pipeline

Two-stage retrieval: broad recall (retriever) then precise selection (reranker). Standard pattern in production RAG pipelines.

reranker := &rerank.Reranker{
    Scorer: &rerank.SimpleScorer{},
    Policy: rerank.SelectionPolicy{TopK: 3, MinScore: 0.1},
}
results, err := reranker.Rerank(ctx, rerank.CandidateSet{
    Query:      "interfaces",
    Candidates: candidates,
})

Examples

CLI Examples

Example	Description	Key Concepts
`cli/simple-query`	Single-shot query, simplest pattern	AgentLoop, Engine, CLI adapter
`cli/context-query`	Query with context retrieval (RAG)	InMemoryRetriever, ContextBuilder, ContextProvider
`cli/grammar-query`	GBNF grammar-constrained output	GrammarDefinition, DecoderConstraint, SpecializedLoop
`cli/rerank-query`	Query with document reranking	PlanExecutor, Reranker, StepHandler
`cli/redundant-query`	Redundant inference with voting	RedundantLoop, MajorityVoting, TMR pattern

TUI Examples

Example	Description	Key Concepts
`tui/simple-chat`	Minimal interactive chat REPL	AgentREPL, streaming, ANSI output
`tui/tool-use`	Chat with custom Go tool (dice roller)	Tool interface, Registry, tool-call loop
`tui/structured-output`	Schema-constrained JSON output	SpecializedLoop, SchemaValidator, RepairJSON
`tui/context-chat`	Chat with per-turn context retrieval	ContextProvider, ephemeral context injection

Running Examples

# CLI examples (pass prompt as argument or pipe)
go run examples/cli/simple-query/main.go "What is Go?"
go run examples/cli/context-query/main.go "What is pagantic?"
go run examples/cli/grammar-query/main.go "I love this!"
go run examples/cli/rerank-query/main.go "How do interfaces work?"
go run examples/cli/redundant-query/main.go "Great product!"

# Or use Task runner
task run-simple-query -- "What is Go?"
task run-grammar-query -- "I love this!"
task run-rerank-query -- "How do interfaces work?"
task run-redundant-query -- "Great product!"

# TUI examples (interactive)
task run-simple-chat
task run-tool-use
task run-structured-output
task run-context-chat

Getting Started

Prerequisites

Go 1.26+
C compiler (for kronk/llama.cpp bindings)
Internet connection (first run downloads model files)
Task runner (optional, for convenience commands)

Installation

git clone https://github.com/miroslav-matejovsky/pagantic.git
cd pagantic
go mod tidy

First Query

go run examples/cli/simple-query/main.go "What is the capital of France?"

First run downloads kronk libraries and model files (may take several minutes). Subsequent runs use cached files.

Development

# Run all checks (vet, lint, arch-lint, tests)
task fast

# Run specific checks
task vet
task lint
task arch-lint
task test

# Install Go tools
task go-tools

System Contracts

pagantic defines stable boundary contracts at the system edge. Adapters map external input into a SystemRequest and map SystemResponse back to the user. No adapter accesses internal layers directly.

Contract	Purpose
SystemRequest	Input envelope: messages, mode, hints, output contract, correlation ids
SystemResponse	Output envelope: content, structured output, confidence, validation, error
SystemError	Canonical error with failure category, retryable flag, details
Execution Lifecycle	State machine: INIT -> PLAN -> PREPARE -> EXECUTE -> VALIDATE -> COMPLETE
Failure Taxonomy	Seven error categories with stable codes and recovery paths
Observability Correlation	request_id, session_id, trace_id, span_id for full timeline reconstruction
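
The lifecycle row can be read as a linear state machine. The sketch below encodes only the happy path listed in the table; failure and retry transitions are not described here, so they are left out. The identifiers (`state`, `successor`, `advance`) are hypothetical and do not come from the Contracts document.

```go
package main

import "fmt"

type state string

const (
	stInit     state = "INIT"
	stPlan     state = "PLAN"
	stPrepare  state = "PREPARE"
	stExecute  state = "EXECUTE"
	stValidate state = "VALIDATE"
	stComplete state = "COMPLETE"
)

// successor encodes the happy-path order from the contract table.
var successor = map[state]state{
	stInit:     stPlan,
	stPlan:     stPrepare,
	stPrepare:  stExecute,
	stExecute:  stValidate,
	stValidate: stComplete,
}

// advance returns the next state, or an error for a terminal or unknown state.
func advance(s state) (state, error) {
	next, ok := successor[s]
	if !ok {
		return s, fmt.Errorf("no transition out of %q", s)
	}
	return next, nil
}

func main() {
	s := stInit
	for {
		fmt.Println(s)
		next, err := advance(s)
		if err != nil {
			break // COMPLETE is terminal
		}
		s = next
	}
}
```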

Full specifications in Contracts. Term definitions in Glossary.

Design Principles

Deterministic Control Around Probabilistic Inference

The model is probabilistic - it generates tokens stochastically. Everything else is deterministic: tool execution, validation, context retrieval, output repair, reranking, and voting. The harness mediates between deterministic control and probabilistic generation.

Structural Typing for Cross-Layer Integration

Layers communicate through interfaces defined by the consumer, not the provider. orchestrate.ContextProvider is satisfied by context.ContextBuilder without orchestrate importing the context package. The same pattern applies to CandidateReranker. This keeps the dependency graph clean and allows independent evolution.

Explicit State

All state is explicit and inspectable. ConversationBuffer holds message history. SessionState holds key-value data. WorkingMemory holds transient step results. No hidden state in closures or globals.

No Magic

Every inference call goes through the harness. Every tool call is logged. Every output is validated. The system is transparent, debuggable, and predictable.

Layers Enforce Boundaries

Numeric prefixes on layer directories encode dependency direction, and go-arch-lint checks it in CI. No runtime surprises from circular imports.