pagantic

Probabilistic Agentic Control System

LLM harness with deterministic control architecture around probabilistic inference.

Overview

pagantic wraps LLM inference with a deterministic control system. Models are probabilistic - they generate tokens stochastically. The harness adds deterministic scaffolding: timeouts, tool execution, output validation, context retrieval, structured output enforcement, candidate reranking, and redundant inference with voting.

Inspired by "Harness engineering for coding agent users". Uses kronk for local LLM engine access.

Architecture

Layered system (layers 0 through 10, listed under Layers) with explicit architectural boundaries, adapter-based I/O, and a dedicated engine wrapper. Dependencies flow downward only. No circular imports. Architecture rules are enforced by go-arch-lint.

        +---------------------------+
        |         Adapters          |
        |      cli | tui | api      |
        +-------------+-------------+
                      |
              +-------v-------+
              |  orchestrate  |  Control loop, agent loops
              +-------+-------+
                      |
   +--------+---------+---------+---------+---------+
   |        |         |         |         |         |
context   tool   constraint  rerank   validate   prompt
   |        |         |         |         |         |
   +--------+---------+---------+---------+---------+
                      |
              +-------v-------+
              |   inference   |  Engine interface
              +-------+-------+
                      |
              +-------v-------+
              |     kronk     |  SDK adapter
              +-------+-------+
                      |
              +-------v-------+
              |     core      |  Shared domain types
              +---------------+

Dependency Rules

Layers

| Layer | Package     | Purpose |
|-------|-------------|---------|
| 0     | core        | Shared domain types - Message, ToolCall, Schema, TokenUsage |
| 1     | inference   | Engine interface for model inference |
| 2     | orchestrate | Control loop - AgentLoop, SpecializedLoop, PlanExecutor, RedundantLoop |
| 3     | context     | Knowledge retrieval - Retriever, ContextBuilder |
| 4     | tool        | Tool registry and execution |
| 5     | constraint  | Output enforcement - JSON validation, repair, schema, GBNF grammar |
| 6     | rerank      | Candidate scoring and reranking |
| 7     | validate    | Guardrails, rule validation, retry policy |
| 8     | prompt      | Prompt construction - SystemPrompt, InstructionSet, Template |
| 9     | memory      | State management - ConversationBuffer, SessionState, WorkingMemory |
| 10    | observe     | Tracing, metrics, event logging, cost tracking |

Layer 0 - core shared vocabulary

Shared domain types used across all layers. Simple data structures with no behavior beyond construction helpers. No package may depend on core for logic - only for type definitions.

Key types:

// Roles: one constructor per message role
userMsg := core.NewUserMessage("Hello")
systemMsg := core.NewSystemMessage("You are helpful.")
assistantMsg := core.NewAssistantMessage("Hi there!")
toolMsg := core.NewToolResultMessage(callID, "search", "results...")

Layer 1 - inference execution substrate

Accepts structured prompt input and produces raw model output. Knows nothing about tools, workflows, schemas beyond transport, or business rules.

Key types:

result, err := engine.Infer(ctx, inference.Request{
    Messages:  messages,
    MaxTokens: 2048,
    Grammar:   `root ::= "yes" | "no"`,  // GBNF decoder constraint
})

Layer 2 - orchestrate control loop

Drives work across many inference steps. Breaks requests into steps, routes work, enforces order, manages retries and tool loops.

Key types:

// Multi-turn chat with tools
agent := orchestrate.NewAgentLoop(orchestrate.LoopConfig{
    Engine:       engine,
    Tools:        registry,
    SystemPrompt: "You are helpful.",
})
result, err := agent.Chat(ctx, "Roll 2d6")

// Structured single-shot
sl := orchestrate.NewSpecializedLoop(orchestrate.SpecializedConfig{
    Engine: engine, Schema: schema, Grammar: grammarStr,
})
result, err := sl.Call(ctx, "Classify this text")

// Redundant inference
rl := orchestrate.NewRedundantLoop(orchestrate.RedundantConfig{
    Engine: engine, Schema: schema, N: 3, Voting: orchestrate.MajorityVoting{},
})
result, err := rl.Call(ctx, prompt)

Layer 3 - context knowledge retrieval

Gives the model relevant, bounded knowledge. The model never acts on unconstrained knowledge.

Key types:

retriever := pctx.NewInMemoryRetriever(
    pctx.Document{Content: "Go interfaces are implicit.", Source: "go-docs"},
)
builder := &pctx.ContextBuilder{Retriever: retriever, MaxChunks: 3}
msgs, err := builder.Build(ctx, "How do interfaces work?")

Layer 4 - tool deterministic capability

Runs operations outside the model. All side effects live here, never in the model.

Key types:

registry := tool.NewRegistry(&myTool{})
output, err := registry.Execute("tool_name", args)
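
The registry pattern above can be sketched in a few lines. This is a minimal illustration, not pagantic's actual tool package: the `tool` interface, `registry` type, and `echoTool` are hypothetical names, and the argument shape (`map[string]any`) is an assumption.

```go
package main

import "fmt"

// tool is a hypothetical interface: a named, deterministic operation.
type tool interface {
	Name() string
	Execute(args map[string]any) (string, error)
}

// registry maps tool names to implementations.
type registry struct{ tools map[string]tool }

func newRegistry(tools ...tool) *registry {
	r := &registry{tools: make(map[string]tool, len(tools))}
	for _, t := range tools {
		r.tools[t.Name()] = t
	}
	return r
}

// Execute looks the tool up by name and runs it; unknown names are an error,
// so a model cannot trigger side effects that were never registered.
func (r *registry) Execute(name string, args map[string]any) (string, error) {
	t, ok := r.tools[name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", name)
	}
	return t.Execute(args)
}

// echoTool is a toy tool that returns its "text" argument.
type echoTool struct{}

func (echoTool) Name() string { return "echo" }

func (echoTool) Execute(args map[string]any) (string, error) {
	text, ok := args["text"].(string)
	if !ok {
		return "", fmt.Errorf("echo: missing string argument %q", "text")
	}
	return text, nil
}

func main() {
	r := newRegistry(echoTool{})
	out, err := r.Execute("echo", map[string]any{"text": "hi"})
	fmt.Println(out, err)
}
```

Keeping lookup and execution in one place gives the harness a single point where tool calls can be validated and logged.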

Layer 5 - constraint output enforcement

Enforces deterministic output at the system boundary. No unstructured text crosses.

Key types:

// Post-hoc validation
sv := constraint.NewSchemaValidator(schema)
result := sv.Validate(jsonOutput)

// Decoder constraint (prevents invalid tokens)
grammar := constraint.GrammarDefinition{
    Name:    "sentiment",
    Grammar: `root ::= "positive" | "negative" | "neutral"`,
}
err := grammar.Validate()  // checks for root rule

Layer 6 - rerank candidate evaluation

Scores retrieved or generated candidates, reorders them by relevance, and picks the best subset. Acts as a secondary reasoning layer that corrects initial approximations.

Key types:

reranker := &rerank.Reranker{
    Scorer: &rerank.SimpleScorer{},
    Policy: rerank.SelectionPolicy{TopK: 3, MinScore: 0.1},
}
results, err := reranker.Rerank(ctx, rerank.CandidateSet{
    Query: "interfaces", Candidates: candidates,
})

Layer 7 - validate guardrails

Guards final output against hard constraints, rules, and repair paths. Two kinds: deterministic validation and inferential validation.

Key types:

rv := validate.NewRuleValidator(
    validate.Rule{Name: "not-empty", Check: func(s string) error {
        if s == "" { return fmt.Errorf("empty output") }
        return nil
    }},
)
errs := rv.Validate(output)

rp := &validate.RetryPolicy{MaxRetries: 3, Backoff: time.Second}
err := rp.Execute(ctx, func() error { return doWork() })
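
A retry policy like the one above could look roughly as follows. This sketch assumes that MaxRetries counts retries after the first attempt and that Backoff is a fixed delay; the real validate.RetryPolicy may differ on both points. The `retry` and `countAttempts` functions are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// retry calls fn until it succeeds, the retries are used up, or ctx is done.
// Assumption: maxRetries is the number of retries after the first attempt.
func retry(ctx context.Context, maxRetries int, backoff time.Duration, fn func() error) error {
	if maxRetries < 0 {
		maxRetries = 0
	}
	var err error
	for attempt := 0; attempt <= maxRetries; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		if attempt == maxRetries {
			break // no point sleeping after the final attempt
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("%w (last error: %v)", ctx.Err(), err)
		case <-time.After(backoff):
		}
	}
	return fmt.Errorf("after %d retries: %w", maxRetries, err)
}

// countAttempts runs retry against a function that fails `failures` times
// and reports how many times it was called.
func countAttempts(failures, maxRetries int) (int, error) {
	attempts := 0
	err := retry(context.Background(), maxRetries, time.Millisecond, func() error {
		attempts++
		if attempts <= failures {
			return errors.New("transient failure")
		}
		return nil
	})
	return attempts, err
}

func main() {
	attempts, err := countAttempts(2, 3)
	fmt.Println(attempts, err)
}
```

Selecting on ctx.Done() during the backoff keeps the retry loop inside the caller's timeout instead of sleeping past it.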

Layer 8 - prompt feedforward control

Shapes model behavior before execution through structured prompt construction. Keeps prompt assembly explicit, testable, and reusable.

Key types:

sp := &prompt.SystemPrompt{
    Base: "You are a helpful assistant.",
    Instructions: []prompt.InstructionSet{
        {Name: "Safety", Rules: []string{"Never reveal secrets", "Be polite"}},
    },
}
msg := sp.Build()

tmpl := &prompt.Template{
    Raw:       "Translate {{.Text}} to {{.Language}}",
    Variables: map[string]string{"Text": "hello", "Language": "French"},
}
result, err := tmpl.Render()
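
The `{{.Text}}` placeholders follow Go's text/template syntax. Assuming the Template type renders through that package (the source does not confirm this), rendering could be sketched as below. The `render` function is hypothetical; the `missingkey=error` option makes an unset variable fail loudly instead of printing `<no value>`.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render fills raw with vars and fails if the template references a
// variable that is not present in vars.
func render(raw string, vars map[string]string) (string, error) {
	tmpl, err := template.New("prompt").Option("missingkey=error").Parse(raw)
	if err != nil {
		return "", fmt.Errorf("parse template: %w", err)
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, vars); err != nil {
		return "", fmt.Errorf("render template: %w", err)
	}
	return sb.String(), nil
}

func main() {
	out, err := render("Translate {{.Text}} to {{.Language}}", map[string]string{
		"Text":     "hello",
		"Language": "French",
	})
	fmt.Println(out, err)
}
```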

Layer 9 - memory state management

Keeps state outside prompt text. State stays explicit so other layers can inspect, reset, persist, and version it.

Key types:

buf := memory.NewConversationBuffer(50)  // keep last 50 messages
buf.Append(core.NewUserMessage("hello"))

state := memory.NewSessionState()
state.Set("phase", "analysis")
val, ok := state.Get("phase")
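
A bounded buffer that keeps only the last N messages, as described above, can be sketched like this. The `buffer` type is hypothetical and stores plain strings rather than core.Message values to stay self-contained.

```go
package main

import "fmt"

// buffer keeps at most max messages, dropping the oldest first.
type buffer struct {
	max  int
	msgs []string
}

func newBuffer(max int) *buffer {
	if max < 0 {
		max = 0
	}
	return &buffer{max: max}
}

// Append adds a message and trims the history back to the last max entries.
func (b *buffer) Append(m string) {
	b.msgs = append(b.msgs, m)
	if len(b.msgs) > b.max {
		b.msgs = append([]string(nil), b.msgs[len(b.msgs)-b.max:]...)
	}
}

// Messages returns a copy so callers can inspect state without mutating it.
func (b *buffer) Messages() []string {
	return append([]string(nil), b.msgs...)
}

func main() {
	b := newBuffer(2)
	for _, m := range []string{"first", "second", "third"} {
		b.Append(m)
	}
	fmt.Println(b.Messages())
}
```

Returning a copy from Messages is one way to keep the state inspectable without letting other layers modify it by accident.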

Layer 10 - observe observability

Makes the system measurable and debuggable. No execution without traceability.

Key types:

log := &observe.InMemoryEventLog{}
log.Record(observe.Event{
    Timestamp: time.Now(), Layer: "tool", Action: "execute",
    Data: map[string]any{"name": "search"}, Duration: elapsed,
})
events := log.Events()

Adapters

Adapters are thin boundary layers between external interaction channels and the internal execution core. Each adapter is stateless, deterministic, and replaceable.

| Adapter | Purpose | Key Types |
|---------|---------|-----------|
| cli     | Single-shot command-line execution. Read prompt, run inference, write result, exit. | Runner, RunConfig, ReadPrompt |
| tui     | Interactive terminal UI. REPL with commands, streaming chat, ANSI colors. | AgentREPL, AgentConfig, Command |
| api     | Service interface contracts. Request/Response types, validation, streaming, errors. | ChatRequest, ChatResponse, APIError |

CLI Adapter

runner := cli.NewRunner(cli.RunConfig{
    Engine:       engine,
    SystemPrompt: "You are helpful.",
    Out:          os.Stdout,
})
err := runner.Run(ctx, "What is Go?")

TUI Adapter

repl := tui.NewAgentREPL(tui.AgentConfig{
    Title:        "my-agent",
    Banner:       "Welcome! Type /help for commands.",
    SystemPrompt: "You are helpful.",
    EngineLoader: func(ctx context.Context) (inference.Engine, func(), error) {
        krn, cleanup, err := kronk.Load(ctx, kronk.Config{ModelSource: model})
        if err != nil {
            return nil, nil, err // do not wrap a failed load in an adapter
        }
        return kronk.NewAdapter(krn, nil), cleanup, nil
    },
})
repl.Run(ctx)

Engine (kronk)

The kronk package wraps the Ardan Labs kronk SDK for local LLM inference. It handles library installation, model downloading, and engine initialization through a single Load call.

Usage

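krn, cleanup, err := kronk.Load(ctx, kronk.Config{
    ModelSource: "unsloth/gemma-4-E4B-it",
})
if err != nil {
    log.Fatalf("load engine: %v", err) // check the error before using krn or cleanup
}
defer cleanup()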

engine := kronk.NewAdapter(krn, nil)
// engine satisfies inference.Engine

How It Works

  1. Load downloads kronk runtime libraries (llama.cpp bindings)
  2. Downloads model files from HuggingFace
  3. Initializes inference engine with model
  4. Returns kronk instance + cleanup function

Adapter implements inference.Engine by converting between pagantic core types and kronk's model.D format. Streaming is handled via channel-based ChatStreaming.

Features

GBNF Grammar Constraints

GBNF (GGML BNF) grammars constrain token generation at the decoder level. Unlike post-hoc validation, which checks output after generation, grammar constraints prevent invalid tokens from being generated at all.

// Define grammar
grammar := constraint.GrammarDefinition{
    Name:    "sentiment",
    Grammar: `root ::= "{" ws "\"sentiment\"" ws ":" ws val ws "}"
ws  ::= [ \t\n]*
val ::= "\"positive\"" | "\"negative\"" | "\"neutral\""`,
}
err := grammar.Validate()  // checks for root rule

// Use in SpecializedLoop
sl := orchestrate.NewSpecializedLoop(orchestrate.SpecializedConfig{
    Engine:  engine,
    Schema:  schema,
    Grammar: grammar.GrammarString(),
})
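
The comment on `grammar.Validate()` says it checks for a root rule. How the library does this is not shown; one plausible minimal check, sketched here with the hypothetical function `hasRootRule`, scans each line for a rule definition named `root`.

```go
package main

import (
	"fmt"
	"strings"
)

// hasRootRule reports whether the grammar defines a rule named "root".
// It only inspects the text before "::=" on each line, so it is a
// structural smoke test, not a full GBNF parser.
func hasRootRule(grammar string) bool {
	for _, line := range strings.Split(grammar, "\n") {
		name, _, ok := strings.Cut(line, "::=")
		if ok && strings.TrimSpace(name) == "root" {
			return true
		}
	}
	return false
}

func main() {
	g := "root ::= \"positive\" | \"negative\" | \"neutral\""
	fmt.Println(hasRootRule(g))
	fmt.Println(hasRootRule("val ::= \"x\""))
}
```

A check like this catches a missing entry point early, before the grammar is handed to the decoder.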

ExecutionPlan and Step Routing

PlanExecutor runs typed steps in sequence. Output of step N feeds input of step N+1. Handlers are pluggable per step type.

plan := orchestrate.ExecutionPlan{
    Steps: []orchestrate.Step{
        {Name: "retrieve", Type: orchestrate.StepRetrieve, Input: query},
        {Name: "rerank",   Type: orchestrate.StepRerank},
        {Name: "infer",    Type: orchestrate.StepInfer},
    },
}
executor := orchestrate.NewPlanExecutor(map[orchestrate.StepType]orchestrate.StepHandler{
    orchestrate.StepRetrieve: orchestrate.RetrieveHandler(contextProvider),
    orchestrate.StepRerank:   rerankHandler,
    orchestrate.StepInfer:    inferHandler,
})
results, err := executor.Execute(ctx, plan)
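
The chaining rule (output of step N feeds step N+1) can be illustrated with a small stand-alone executor. All names here (`step`, `handler`, `runPlan`, `demo`) are hypothetical, and steps carry plain strings; the real PlanExecutor's types are not shown in this document.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

type stepType string

type step struct {
	Name  string
	Type  stepType
	Input string // optional; empty means "use the previous step's output"
}

type handler func(ctx context.Context, input string) (string, error)

// runPlan executes steps in order and returns every step's output.
func runPlan(ctx context.Context, steps []step, handlers map[stepType]handler) ([]string, error) {
	results := make([]string, 0, len(steps))
	prev := ""
	for _, s := range steps {
		if err := ctx.Err(); err != nil {
			return results, err // stop promptly on timeout or cancellation
		}
		h, ok := handlers[s.Type]
		if !ok {
			return results, fmt.Errorf("step %q: no handler for type %q", s.Name, s.Type)
		}
		in := s.Input
		if in == "" {
			in = prev
		}
		out, err := h(ctx, in)
		if err != nil {
			return results, fmt.Errorf("step %q: %w", s.Name, err)
		}
		results = append(results, out)
		prev = out
	}
	return results, nil
}

func upper(_ context.Context, in string) (string, error)   { return strings.ToUpper(in), nil }
func exclaim(_ context.Context, in string) (string, error) { return in + "!", nil }

// demo wires two toy handlers into a two-step plan.
func demo() ([]string, error) {
	steps := []step{
		{Name: "shout", Type: "upper", Input: "hello"},
		{Name: "punctuate", Type: "exclaim"},
	}
	return runPlan(context.Background(), steps, map[stepType]handler{
		"upper": upper, "exclaim": exclaim,
	})
}

func main() {
	res, err := demo()
	fmt.Println(res, err)
}
```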

Redundant Inference (TMR / N-Version)

RedundantLoop runs the same inference N times and applies a voting strategy to pick the best result. Trades latency for reliability.

rl := orchestrate.NewRedundantLoop(orchestrate.RedundantConfig{
    Engine: engine,
    Schema: schema,
    N:      3,                            // 3 candidates
    Voting: orchestrate.MajorityVoting{}, // pick most frequent
})
result, err := rl.Call(ctx, prompt)
// result.Confidence = 1.0 if all 3 agree
// result.Candidates has all 3 raw outputs
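
Majority voting with a confidence score can be sketched as follows. This is an illustration of the idea, not the code behind orchestrate.MajorityVoting; the `majorityVote` function and its first-seen tie-break are assumptions.

```go
package main

import "fmt"

// majorityVote returns the most frequent candidate and the fraction of
// candidates that agree with it. Ties go to the candidate seen first.
func majorityVote(candidates []string) (winner string, confidence float64) {
	if len(candidates) == 0 {
		return "", 0
	}
	counts := make(map[string]int)
	for _, c := range candidates {
		counts[c]++
	}
	best := 0
	// Iterate the slice rather than the map so the tie-break is stable.
	for _, c := range candidates {
		if counts[c] > best {
			best = counts[c]
			winner = c
		}
	}
	return winner, float64(best) / float64(len(candidates))
}

func main() {
	winner, conf := majorityVote([]string{"positive", "positive", "neutral"})
	fmt.Printf("%s %.2f\n", winner, conf) // prints "positive 0.67"
}
```

With N = 3 the confidence is 1.0 only when all three outputs agree, which matches the comment in the snippet above. Exact string comparison only works well when the output space is small, which is one reason to combine voting with a schema or grammar.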

Reranking Pipeline

Two-stage retrieval: broad recall (retriever) then precise selection (reranker). Standard pattern in production RAG pipelines.

reranker := &rerank.Reranker{
    Scorer: &rerank.SimpleScorer{},
    Policy: rerank.SelectionPolicy{TopK: 3, MinScore: 0.1},
}
results, err := reranker.Rerank(ctx, rerank.CandidateSet{
    Query:      "interfaces",
    Candidates: candidates,
})
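
The selection policy (score, drop anything below MinScore, keep the TopK best) can be sketched independently of the library. The scorer below is a hypothetical term-overlap function, not rerank.SimpleScorer, whose scoring method is not described here.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type scored struct {
	Text  string
	Score float64
}

// overlapScore is a toy scorer: the fraction of query terms that occur
// in the candidate (case-insensitive substring match).
func overlapScore(query, candidate string) float64 {
	terms := strings.Fields(strings.ToLower(query))
	if len(terms) == 0 {
		return 0
	}
	text := strings.ToLower(candidate)
	hits := 0
	for _, t := range terms {
		if strings.Contains(text, t) {
			hits++
		}
	}
	return float64(hits) / float64(len(terms))
}

// rerank scores candidates, filters by minScore, sorts by descending
// score, and keeps at most topK results.
func rerank(query string, candidates []string, topK int, minScore float64) []scored {
	out := make([]scored, 0, len(candidates))
	for _, c := range candidates {
		if s := overlapScore(query, c); s >= minScore {
			out = append(out, scored{c, s})
		}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	if topK >= 0 && len(out) > topK {
		out = out[:topK]
	}
	return out
}

func main() {
	docs := []string{
		"Rust has traits.",
		"Interfaces in Java are explicit.",
		"Go interfaces are implicit.",
	}
	for _, r := range rerank("go interfaces", docs, 2, 0.1) {
		fmt.Printf("%.2f %s\n", r.Score, r.Text)
	}
}
```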

Examples

CLI Examples

| Example | Description | Key Concepts |
|---------|-------------|--------------|
| cli/simple-query | Single-shot query, simplest pattern | AgentLoop, Engine, CLI adapter |
| cli/context-query | Query with context retrieval (RAG) | InMemoryRetriever, ContextBuilder, ContextProvider |
| cli/grammar-query | GBNF grammar-constrained output | GrammarDefinition, DecoderConstraint, SpecializedLoop |
| cli/rerank-query | Query with document reranking | PlanExecutor, Reranker, StepHandler |
| cli/redundant-query | Redundant inference with voting | RedundantLoop, MajorityVoting, TMR pattern |

TUI Examples

| Example | Description | Key Concepts |
|---------|-------------|--------------|
| tui/simple-chat | Minimal interactive chat REPL | AgentREPL, streaming, ANSI output |
| tui/tool-use | Chat with custom Go tool (dice roller) | Tool interface, Registry, tool-call loop |
| tui/structured-output | Schema-constrained JSON output | SpecializedLoop, SchemaValidator, RepairJSON |
| tui/context-chat | Chat with per-turn context retrieval | ContextProvider, ephemeral context injection |

Running Examples

# CLI examples (pass prompt as argument or pipe)
go run examples/cli/simple-query/main.go "What is Go?"
go run examples/cli/context-query/main.go "What is pagantic?"
go run examples/cli/grammar-query/main.go "I love this!"
go run examples/cli/rerank-query/main.go "How do interfaces work?"
go run examples/cli/redundant-query/main.go "Great product!"

# Or use Task runner
task run-simple-query -- "What is Go?"
task run-grammar-query -- "I love this!"
task run-rerank-query -- "How do interfaces work?"
task run-redundant-query -- "Great product!"

# TUI examples (interactive)
task run-simple-chat
task run-tool-use
task run-structured-output
task run-context-chat

Getting Started

Prerequisites

Installation

git clone https://github.com/miroslav-matejovsky/pagantic.git
cd pagantic
go mod tidy

First Query

go run examples/cli/simple-query/main.go "What is the capital of France?"

First run downloads kronk libraries and model files (may take several minutes). Subsequent runs use cached files.

Development

# Run all checks (vet, lint, arch-lint, tests)
task fast

# Run specific checks
task vet
task lint
task arch-lint
task test

# Install Go tools
task go-tools

System Contracts

pagantic defines stable boundary contracts at the system edge. Adapters map external input into a SystemRequest and map SystemResponse back to the user. No adapter accesses internal layers directly.

| Contract | Purpose |
|----------|---------|
| SystemRequest | Input envelope: messages, mode, hints, output contract, correlation ids |
| SystemResponse | Output envelope: content, structured output, confidence, validation, error |
| SystemError | Canonical error with failure category, retryable flag, details |
| Execution Lifecycle | State machine: INIT -> PLAN -> PREPARE -> EXECUTE -> VALIDATE -> COMPLETE |
| Failure Taxonomy | Seven error categories with stable codes and recovery paths |
| Observability Correlation | request_id, session_id, trace_id, span_id for full timeline reconstruction |

Full specifications in Contracts. Term definitions in Glossary.
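
The execution lifecycle listed in the table can be pictured as a linear state machine. The sketch below encodes only the happy path named there; failure and retry transitions are defined in the Contracts specification and are not modeled. The `state` type and `next` function are hypothetical.

```go
package main

import "fmt"

type state int

const (
	stateInit state = iota
	statePlan
	statePrepare
	stateExecute
	stateValidate
	stateComplete
)

var stateNames = [...]string{"INIT", "PLAN", "PREPARE", "EXECUTE", "VALIDATE", "COMPLETE"}

func (s state) String() string {
	if s < 0 || int(s) >= len(stateNames) {
		return fmt.Sprintf("state(%d)", int(s))
	}
	return stateNames[s]
}

// next returns the successor state, or an error when s is terminal or invalid.
func next(s state) (state, error) {
	if s < stateInit || s >= stateComplete {
		return s, fmt.Errorf("no transition out of %s", s)
	}
	return s + 1, nil
}

func main() {
	s := stateInit
	for {
		fmt.Println(s)
		n, err := next(s)
		if err != nil {
			break // COMPLETE is terminal
		}
		s = n
	}
}
```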

Design Principles

Deterministic Control Around Probabilistic Inference

The model is probabilistic - it generates tokens stochastically. Everything else is deterministic: tool execution, validation, context retrieval, output repair, reranking, and voting. The harness mediates between deterministic control and probabilistic generation.

Structural Typing for Cross-Layer Integration

Layers communicate through interfaces defined by the consumer, not the provider. orchestrate.ContextProvider is satisfied by context.ContextBuilder without import. Same pattern for CandidateReranker. This keeps the dependency graph clean and allows independent evolution.
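
Consumer-defined interfaces rely on Go's implicit interface satisfaction. The single-file sketch below stands in for two packages: `contextProvider` plays the role of an interface declared in the consuming layer, and `staticBuilder` plays a provider type that never mentions it. Both names and the simplified method signature are hypothetical.

```go
package main

import "fmt"

// contextProvider is declared by the consumer and describes only what the
// consumer needs.
type contextProvider interface {
	Build(query string) ([]string, error)
}

// staticBuilder is a provider-side type. It does not import or name
// contextProvider; it satisfies the interface through its method set alone.
type staticBuilder struct{ docs []string }

func (b staticBuilder) Build(query string) ([]string, error) {
	return b.docs, nil
}

// answer depends on the interface, so any type with a matching Build method
// can be passed in, including test fakes.
func answer(p contextProvider, query string) (string, error) {
	chunks, err := p.Build(query)
	if err != nil {
		return "", fmt.Errorf("build context: %w", err)
	}
	return fmt.Sprintf("%s [%d context chunks]", query, len(chunks)), nil
}

func main() {
	b := staticBuilder{docs: []string{"Go interfaces are implicit."}}
	out, err := answer(b, "How do interfaces work?")
	fmt.Println(out, err)
}
```

Because the provider never imports the consumer's package, the wiring happens only where both are visible, typically in an adapter or in main.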

Explicit State

All state is explicit and inspectable. ConversationBuffer holds message history. SessionState holds key-value data. WorkingMemory holds transient step results. No hidden state in closures or globals.

No Magic

Every inference call goes through the harness. Every tool call is logged. Every output is validated. The system is transparent, debuggable, and predictable.

Layers Enforce Boundaries

Numeric prefixes on layer directories make the dependency direction explicit. go-arch-lint checks the rules at CI time, so boundary violations and circular imports are caught before they reach runtime.