Probabilistic Agentic Control System
LLM harness with deterministic control architecture around probabilistic inference.
pagantic wraps LLM inference with a deterministic control system. Models are probabilistic - they generate tokens stochastically. The harness adds deterministic scaffolding: timeouts, tool execution, output validation, context retrieval, structured output enforcement, candidate reranking, and redundant inference with voting.
Inspired by Harness engineering for coding agent users. Uses kronk for local LLM engine access.
10-layer system with explicit architectural boundaries, adapter-based I/O,
and a dedicated engine wrapper. Dependencies flow downward only. No circular
imports. Architecture rules enforced by go-arch-lint.
Each layer lives under layers/ with a numeric prefix that enforces dependency direction.

| Layer | Package | Purpose |
|---|---|---|
| 0 | core | Shared domain types - Message, ToolCall, Schema, TokenUsage |
| 1 | inference | Engine interface for model inference |
| 2 | orchestrate | Control loop - AgentLoop, SpecializedLoop, PlanExecutor, RedundantLoop |
| 3 | context | Knowledge retrieval - Retriever, ContextBuilder |
| 4 | tool | Tool registry and execution |
| 5 | constraint | Output enforcement - JSON validation, repair, schema, GBNF grammar |
| 6 | rerank | Candidate scoring and reranking |
| 7 | validate | Guardrails, rule validation, retry policy |
| 8 | prompt | Prompt construction - SystemPrompt, InstructionSet, Template |
| 9 | memory | State management - ConversationBuffer, SessionState, WorkingMemory |
| 10 | observe | Tracing, metrics, event logging, cost tracking |
Shared domain types used across all layers. Simple data structures with no behavior beyond construction helpers. No package may depend on core for logic - only for type definitions.
Key types:
- Message - conversation message with Role, Content, ToolCalls
- Schema - JSON Schema subset for structured output and tool parameters
- ToolCall - assistant's request to execute a tool
- ToolDefinition - tool metadata and parameter schema
- TokenUsage - token consumption and throughput metrics

// Roles: one constructor per message role
userMsg := core.NewUserMessage("Hello")
systemMsg := core.NewSystemMessage("You are helpful.")
assistantMsg := core.NewAssistantMessage("Hi there!")
toolMsg := core.NewToolResultMessage(callID, "search", "results...")
Accepts structured prompt input and produces raw model output. Knows nothing about tools, workflows, schemas beyond transport, or business rules.
Key types:
- Engine - core inference abstraction: Infer(ctx, Request) (*Result, error)
- Request - typed input: Messages, Tools, Schema, Grammar, MaxTokens, Temperature
- Result - typed output: Content, ToolCalls, Messages, Usage
- StreamHandler - callbacks for streaming: OnContent, OnReasoning, OnToolCall

result, err := engine.Infer(ctx, inference.Request{
	Messages: messages,
	MaxTokens: 2048,
	Grammar: `root ::= "yes" | "no"`, // GBNF decoder constraint
})
Drives work across many inference steps. Breaks requests into steps, routes work, enforces order, manages retries and tool loops.
Key types:
- AgentLoop - stateful multi-turn agent with tool call loop
- SpecializedLoop - stateless single-shot with optional tools then schema-constrained output
- PlanExecutor - runs ExecutionPlan steps with pluggable handlers
- RedundantLoop - N-version inference with voting for reliability
- ContextProvider - interface for context retrieval (structural typing)
- CandidateReranker - interface for reranking (structural typing)
- VotingStrategy - MajorityVoting, UnanimityVoting

// Multi-turn chat with tools
agent := orchestrate.NewAgentLoop(orchestrate.LoopConfig{
	Engine: engine,
	Tools: registry,
	SystemPrompt: "You are helpful.",
})
result, err := agent.Chat(ctx, "Roll 2d6")

// Structured single-shot
sl := orchestrate.NewSpecializedLoop(orchestrate.SpecializedConfig{
	Engine: engine, Schema: schema, Grammar: grammarStr,
})
result, err := sl.Call(ctx, "Classify this text")

// Redundant inference
rl := orchestrate.NewRedundantLoop(orchestrate.RedundantConfig{
	Engine: engine, Schema: schema, N: 3, Voting: orchestrate.MajorityVoting{},
})
result, err := rl.Call(ctx, prompt)
Gives the model relevant, bounded knowledge. The model never acts on unconstrained knowledge.
Key types:
- Retriever - interface: Retrieve(ctx, query, maxResults) ([]Chunk, error)
- InMemoryRetriever - keyword-matching retriever for bounded domains
- ContextBuilder - assembles retrieved chunks into system messages
- Document - source document with Content and Source
- Chunk - scored document fragment

retriever := pctx.NewInMemoryRetriever(
	pctx.Document{Content: "Go interfaces are implicit.", Source: "go-docs"},
)
builder := &pctx.ContextBuilder{Retriever: retriever, MaxChunks: 3}
msgs, err := builder.Build(ctx, "How do interfaces work?")
Runs operations outside the model. All side effects live here, never in the model.
Key types:
- Tool - interface: Info, Definition, Execute, Available
- Registry - groups tools and dispatches execution by name
- ToolExecutor - adds observability around registry calls
- ToolType - TypeGo (pure Go) or TypeCLI (external binary)

registry := tool.NewRegistry(&myTool{})
output, err := registry.Execute("tool_name", args)
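Dispatch by name can be sketched with a map from tool name to implementation. The Tool interface below is reduced to two methods for brevity, and echoTool is a hypothetical tool; neither matches pagantic's actual signatures.

```go
package main

import "fmt"

// Tool is a reduced, hypothetical version of the interface described above.
type Tool interface {
	Name() string
	Execute(args map[string]any) (string, error)
}

// Registry maps tool names to implementations.
type Registry struct {
	tools map[string]Tool
}

func NewRegistry(tools ...Tool) *Registry {
	r := &Registry{tools: make(map[string]Tool, len(tools))}
	for _, t := range tools {
		r.tools[t.Name()] = t
	}
	return r
}

// Execute looks the tool up by name and rejects unknown names, so a
// hallucinated tool call fails loudly instead of being ignored.
func (r *Registry) Execute(name string, args map[string]any) (string, error) {
	t, ok := r.tools[name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", name)
	}
	return t.Execute(args)
}

// echoTool is a hypothetical tool that returns its "text" argument.
type echoTool struct{}

func (echoTool) Name() string { return "echo" }

func (echoTool) Execute(args map[string]any) (string, error) {
	text, ok := args["text"].(string)
	if !ok {
		return "", fmt.Errorf("echo: missing string argument %q", "text")
	}
	return text, nil
}

func main() {
	registry := NewRegistry(echoTool{})
	out, err := registry.Execute("echo", map[string]any{"text": "hi"})
	fmt.Println(out, err)
}
```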
Enforces deterministic output at the system boundary. No unstructured text crosses.
Key types:
- OutputValidator - validates raw model output
- JSONValidator - checks and optionally repairs JSON
- SchemaValidator - validates JSON against core.Schema
- RepairJSON - closes truncated JSON (missing braces, brackets, quotes)
- NormalizeEnumValues - rewrites enum strings to canonical case
- GrammarDefinition - holds GBNF grammar for decoder constraints
- DecoderConstraint - interface for decoder-level enforcement
- GrammarConstraint - wraps GrammarDefinition as DecoderConstraint

// Post-hoc validation
sv := constraint.NewSchemaValidator(schema)
result := sv.Validate(jsonOutput)

// Decoder constraint (prevents invalid tokens)
grammar := constraint.GrammarDefinition{
	Name: "sentiment",
	Grammar: `root ::= "positive" | "negative" | "neutral"`,
}
err := grammar.Validate() // checks for root rule
Scores retrieved or generated candidates, reorders by relevance, picks best subset. Secondary reasoning layer correcting initial approximations.
Key types:
- Candidate - scored item with Content, Score, Source, Metadata
- CandidateSet - groups candidates with original query
- RelevanceScorer - interface: Score(ctx, query, candidates)
- SimpleScorer - keyword overlap scorer (dev/testing only)
- Reranker - combines scorer with selection policy
- SelectionPolicy - TopK and MinScore thresholds

reranker := &rerank.Reranker{
	Scorer: &rerank.SimpleScorer{},
	Policy: rerank.SelectionPolicy{TopK: 3, MinScore: 0.1},
}
results, err := reranker.Rerank(ctx, rerank.CandidateSet{
	Query: "interfaces", Candidates: candidates,
})
Guards final output against hard constraints, rules, and repair paths. Two kinds: deterministic validation and inferential validation.
Key types:
- RuleValidator - runs deterministic rule checks on output
- Rule - named check function
- SemanticValidator - interface for LLM-backed validation
- RepairStrategy - interface for fixing invalid output
- JSONRepairStrategy - wraps constraint.RepairJSON
- RetryPolicy - configurable retry with backoff

rv := validate.NewRuleValidator(
	validate.Rule{Name: "not-empty", Check: func(s string) error {
		if s == "" {
			return fmt.Errorf("empty output")
		}
		return nil
	}},
)
errs := rv.Validate(output)

rp := &validate.RetryPolicy{MaxRetries: 3, Backoff: time.Second}
err := rp.Execute(ctx, func() error { return doWork() })
Shapes model behavior before execution through structured prompt construction. Keeps prompt assembly explicit, testable, and reusable.
Key types:
- SystemPrompt - builds system message from base + instructions + policy
- InstructionSet - named group of rules for system prompt
- Template - prompt with variable placeholders
- ContextPolicy - controls what context is included

sp := &prompt.SystemPrompt{
	Base: "You are a helpful assistant.",
	Instructions: []prompt.InstructionSet{
		{Name: "Safety", Rules: []string{"Never reveal secrets", "Be polite"}},
	},
}
msg := sp.Build()

tmpl := &prompt.Template{
	Raw: "Translate {{.Text}} to {{.Language}}",
	Variables: map[string]string{"Text": "hello", "Language": "French"},
}
result, err := tmpl.Render()
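The {{.Text}} placeholder syntax matches Go's standard text/template package. Whether pagantic uses that package internally is an assumption; the hypothetical render function below only shows how such a template can be rendered and made to fail on a missing variable.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render (hypothetical) fills the placeholders in raw from vars.
func render(raw string, vars map[string]string) (string, error) {
	// missingkey=error makes an unset variable fail rendering
	// instead of passing silently into the prompt.
	tmpl, err := template.New("prompt").Option("missingkey=error").Parse(raw)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	if err := tmpl.Execute(&b, vars); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	out, err := render("Translate {{.Text}} to {{.Language}}",
		map[string]string{"Text": "hello", "Language": "French"})
	fmt.Println(out, err)

	_, err = render("Translate {{.Text}} to {{.Language}}",
		map[string]string{"Text": "hello"})
	fmt.Println(err != nil) // the missing Language variable is reported
}
```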
Keeps state outside prompt text. State stays explicit so other layers can inspect, reset, persist, and version it.
Key types:
- ConversationBuffer - message history with optional size limit
- SessionState - thread-safe key-value store across steps
- WorkingMemory - transient per-step context and results

buf := memory.NewConversationBuffer(50) // keep last 50 messages
buf.Append(core.NewUserMessage("hello"))

state := memory.NewSessionState()
state.Set("phase", "analysis")
val, ok := state.Get("phase")
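As a rough illustration of these two types, the sketch below implements a size-limited message buffer and a mutex-guarded key-value store. The type names mirror the ones above, but the fields and method bodies are assumptions, not pagantic's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// Message is a simplified stand-in for core.Message.
type Message struct{ Role, Content string }

// ConversationBuffer keeps at most limit messages; limit <= 0 means unbounded.
type ConversationBuffer struct {
	limit    int
	messages []Message
}

func NewConversationBuffer(limit int) *ConversationBuffer {
	return &ConversationBuffer{limit: limit}
}

// Append adds a message and drops the oldest ones beyond the limit.
func (b *ConversationBuffer) Append(m Message) {
	b.messages = append(b.messages, m)
	if b.limit > 0 && len(b.messages) > b.limit {
		b.messages = b.messages[len(b.messages)-b.limit:]
	}
}

// Messages returns a copy so callers cannot mutate the buffer.
func (b *ConversationBuffer) Messages() []Message {
	return append([]Message(nil), b.messages...)
}

// SessionState is a key-value store that is safe for concurrent use.
type SessionState struct {
	mu   sync.RWMutex
	data map[string]any
}

func NewSessionState() *SessionState {
	return &SessionState{data: make(map[string]any)}
}

func (s *SessionState) Set(key string, value any) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
}

func (s *SessionState) Get(key string) (any, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

func main() {
	buf := NewConversationBuffer(2)
	for _, text := range []string{"a", "b", "c"} {
		buf.Append(Message{Role: "user", Content: text})
	}
	fmt.Println(buf.Messages()) // only the two newest messages remain

	state := NewSessionState()
	state.Set("phase", "analysis")
	fmt.Println(state.Get("phase"))
}
```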
Makes system measurable and debuggable. No execution without traceability.
Key types:
- EventLog - interface: Record(Event), Events()
- TraceRecorder - interface: StartSpan returns context + Span
- MetricsCollector - interface: latency, tokens, counters
- CostTracker - interface: usage recording and cost computation
- InMemoryEventLog, InMemoryTracer, InMemoryMetrics - in-memory implementations
- NoopEventLog, NoopTracer - no-op implementations

log := &observe.InMemoryEventLog{}
log.Record(observe.Event{
	Timestamp: time.Now(), Layer: "tool", Action: "execute",
	Data: map[string]any{"name": "search"}, Duration: elapsed,
})
events := log.Events()
Adapters are thin boundary layers between external interaction channels and the internal execution core. Each adapter is stateless, deterministic, and replaceable.
| Adapter | Purpose | Key Types |
|---|---|---|
| cli | Single-shot command-line execution. Read prompt, run inference, write result, exit. | Runner, RunConfig, ReadPrompt |
| tui | Interactive terminal UI. REPL with commands, streaming chat, ANSI colors. | AgentREPL, AgentConfig, Command |
| api | Service interface contracts. Request/Response types, validation, streaming, errors. | ChatRequest, ChatResponse, APIError |
runner := cli.NewRunner(cli.RunConfig{
	Engine: engine,
	SystemPrompt: "You are helpful.",
	Out: os.Stdout,
})
err := runner.Run(ctx, "What is Go?")

repl := tui.NewAgentREPL(tui.AgentConfig{
	Title: "my-agent",
	Banner: "Welcome! Type /help for commands.",
	SystemPrompt: "You are helpful.",
	EngineLoader: func(ctx context.Context) (inference.Engine, func(), error) {
		krn, cleanup, err := kronk.Load(ctx, kronk.Config{ModelSource: model})
		if err != nil {
			return nil, nil, err // do not wrap a failed load in an adapter
		}
		return kronk.NewAdapter(krn, nil), cleanup, nil
	},
})
repl.Run(ctx)
The kronk package wraps the Ardan Labs kronk SDK for local LLM inference.
It handles library installation, model downloading, and engine initialization through
a single Load call.
krn, cleanup, err := kronk.Load(ctx, kronk.Config{
	ModelSource: "unsloth/gemma-4-E4B-it",
})
if err != nil {
	return err // check the error before deferring cleanup
}
defer cleanup()
engine := kronk.NewAdapter(krn, nil)
// engine satisfies inference.Engine
Load downloads the kronk runtime libraries (llama.cpp bindings).
Adapter implements inference.Engine by converting between
pagantic core types and kronk's model.D format. Streaming is handled
via channel-based ChatStreaming.
GBNF (GGML BNF) grammars constrain token generation at the decoder level. Unlike post-hoc validation which checks output after generation, grammar constraints prevent invalid tokens from being generated at all.
// Define grammar
grammar := constraint.GrammarDefinition{
	Name: "sentiment",
	Grammar: `root ::= "{" ws "\"sentiment\"" ws ":" ws val ws "}"
ws ::= [ \t\n]*
val ::= "\"positive\"" | "\"negative\"" | "\"neutral\""`,
}
err := grammar.Validate() // checks for root rule

// Use in SpecializedLoop
sl := orchestrate.NewSpecializedLoop(orchestrate.SpecializedConfig{
	Engine: engine,
	Schema: schema,
	Grammar: grammar.GrammarString(),
})
PlanExecutor runs typed steps in sequence. Output of step N feeds input
of step N+1. Handlers are pluggable per step type.
plan := orchestrate.ExecutionPlan{
	Steps: []orchestrate.Step{
		{Name: "retrieve", Type: orchestrate.StepRetrieve, Input: query},
		{Name: "rerank", Type: orchestrate.StepRerank},
		{Name: "infer", Type: orchestrate.StepInfer},
	},
}
executor := orchestrate.NewPlanExecutor(map[orchestrate.StepType]orchestrate.StepHandler{
	orchestrate.StepRetrieve: orchestrate.RetrieveHandler(contextProvider),
	orchestrate.StepRerank: rerankHandler,
	orchestrate.StepInfer: inferHandler,
})
results, err := executor.Execute(ctx, plan)
RedundantLoop runs the same inference N times and applies a voting
strategy to pick the best result. Trades latency for reliability.
rl := orchestrate.NewRedundantLoop(orchestrate.RedundantConfig{
	Engine: engine,
	Schema: schema,
	N: 3, // 3 candidates
	Voting: orchestrate.MajorityVoting{}, // pick most frequent
})
result, err := rl.Call(ctx, prompt)
// result.Confidence = 1.0 if all 3 agree
// result.Candidates has all 3 raw outputs
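Majority voting over N outputs reduces to counting identical candidates. In the sketch below, majorityVote is a hypothetical function, and defining confidence as the winner's share of the votes is an assumption consistent with "1.0 if all 3 agree". A real implementation would likely normalize structured outputs before comparing them.

```go
package main

import "fmt"

// majorityVote (hypothetical) returns the most frequent candidate and the
// fraction of candidates that agreed with it. Ties go to the candidate
// that appears first, which keeps the result deterministic.
func majorityVote(candidates []string) (winner string, confidence float64) {
	if len(candidates) == 0 {
		return "", 0
	}
	counts := make(map[string]int, len(candidates))
	for _, c := range candidates {
		counts[c]++
	}
	best := 0
	for _, c := range candidates { // slice order, not map order
		if counts[c] > best {
			winner, best = c, counts[c]
		}
	}
	return winner, float64(best) / float64(len(candidates))
}

func main() {
	winner, confidence := majorityVote([]string{"positive", "negative", "positive"})
	fmt.Printf("%s %.2f\n", winner, confidence) // positive 0.67
}
```

Iterating the slice rather than the map avoids Go's randomized map order, so the same inputs always produce the same winner.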
Two-stage retrieval: broad recall (retriever) then precise selection (reranker). Standard pattern in production RAG pipelines.
reranker := &rerank.Reranker{
	Scorer: &rerank.SimpleScorer{},
	Policy: rerank.SelectionPolicy{TopK: 3, MinScore: 0.1},
}
results, err := reranker.Rerank(ctx, rerank.CandidateSet{
	Query: "interfaces",
	Candidates: candidates,
})
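The selection stage can be sketched as score, filter by MinScore, sort, and cut to TopK. The scorer below uses word overlap, in the spirit of the dev-only SimpleScorer; overlapScore and rerank are hypothetical functions and the exact scoring formula is an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type Candidate struct {
	Content string
	Score   float64
}

type SelectionPolicy struct {
	TopK     int
	MinScore float64
}

// overlapScore returns the fraction of distinct query words that occur
// as whole words in content (case-insensitive, basic punctuation trimmed).
func overlapScore(query, content string) float64 {
	contentWords := make(map[string]bool)
	for _, w := range strings.Fields(strings.ToLower(content)) {
		contentWords[strings.Trim(w, ".,;:!?")] = true
	}
	seen := make(map[string]bool)
	hits := 0
	for _, w := range strings.Fields(strings.ToLower(query)) {
		if seen[w] {
			continue
		}
		seen[w] = true
		if contentWords[w] {
			hits++
		}
	}
	if len(seen) == 0 {
		return 0
	}
	return float64(hits) / float64(len(seen))
}

// rerank scores every candidate, drops those below MinScore, orders the
// rest best-first, and keeps at most TopK (TopK <= 0 keeps all).
func rerank(query string, candidates []Candidate, policy SelectionPolicy) []Candidate {
	scored := make([]Candidate, 0, len(candidates))
	for _, c := range candidates {
		c.Score = overlapScore(query, c.Content)
		if c.Score >= policy.MinScore {
			scored = append(scored, c)
		}
	}
	sort.SliceStable(scored, func(i, j int) bool { return scored[i].Score > scored[j].Score })
	if policy.TopK > 0 && len(scored) > policy.TopK {
		scored = scored[:policy.TopK]
	}
	return scored
}

func main() {
	candidates := []Candidate{
		{Content: "Channels are typed."},
		{Content: "Interfaces in Go are implicit."},
		{Content: "Go is compiled."},
	}
	for _, c := range rerank("go interfaces", candidates, SelectionPolicy{TopK: 2, MinScore: 0.1}) {
		fmt.Printf("%.2f %s\n", c.Score, c.Content)
	}
}
```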
| Example | Description | Key Concepts |
|---|---|---|
| cli/simple-query | Single-shot query, simplest pattern | AgentLoop, Engine, CLI adapter |
| cli/context-query | Query with context retrieval (RAG) | InMemoryRetriever, ContextBuilder, ContextProvider |
| cli/grammar-query | GBNF grammar-constrained output | GrammarDefinition, DecoderConstraint, SpecializedLoop |
| cli/rerank-query | Query with document reranking | PlanExecutor, Reranker, StepHandler |
| cli/redundant-query | Redundant inference with voting | RedundantLoop, MajorityVoting, TMR pattern |

| Example | Description | Key Concepts |
|---|---|---|
| tui/simple-chat | Minimal interactive chat REPL | AgentREPL, streaming, ANSI output |
| tui/tool-use | Chat with custom Go tool (dice roller) | Tool interface, Registry, tool-call loop |
| tui/structured-output | Schema-constrained JSON output | SpecializedLoop, SchemaValidator, RepairJSON |
| tui/context-chat | Chat with per-turn context retrieval | ContextProvider, ephemeral context injection |

# CLI examples (pass prompt as argument or pipe)
go run examples/cli/simple-query/main.go "What is Go?"
go run examples/cli/context-query/main.go "What is pagantic?"
go run examples/cli/grammar-query/main.go "I love this!"
go run examples/cli/rerank-query/main.go "How do interfaces work?"
go run examples/cli/redundant-query/main.go "Great product!"
# Or use Task runner
task run-simple-query -- "What is Go?"
task run-grammar-query -- "I love this!"
task run-rerank-query -- "How do interfaces work?"
task run-redundant-query -- "Great product!"
# TUI examples (interactive)
task run-simple-chat
task run-tool-use
task run-structured-output
task run-context-chat
git clone https://github.com/miroslav-matejovsky/pagantic.git
cd pagantic
go mod tidy
go run examples/cli/simple-query/main.go "What is the capital of France?"
First run downloads kronk libraries and model files (may take several minutes). Subsequent runs use cached files.
# Run all checks (vet, lint, arch-lint, tests)
task fast
# Run specific checks
task vet
task lint
task arch-lint
task test
# Install Go tools
task go-tools
pagantic defines stable boundary contracts at the system edge. Adapters map external input into a SystemRequest and map SystemResponse back to the user. No adapter accesses internal layers directly.
| Contract | Purpose |
|---|---|
| SystemRequest | Input envelope: messages, mode, hints, output contract, correlation ids |
| SystemResponse | Output envelope: content, structured output, confidence, validation, error |
| SystemError | Canonical error with failure category, retryable flag, details |
| Execution Lifecycle | State machine: INIT -> PLAN -> PREPARE -> EXECUTE -> VALIDATE -> COMPLETE |
| Failure Taxonomy | Seven error categories with stable codes and recovery paths |
| Observability Correlation | request_id, session_id, trace_id, span_id for full timeline reconstruction |
Full specifications in Contracts. Term definitions in Glossary.
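The execution lifecycle in the table reads as a linear state machine. The sketch below encodes only the forward path listed there; the state constants and the transition function are hypothetical, and any failure or retry transitions defined in the full contract are deliberately left out.

```go
package main

import "fmt"

type State int

const (
	Init State = iota
	Plan
	Prepare
	Execute
	Validate
	Complete
)

var stateNames = [...]string{"INIT", "PLAN", "PREPARE", "EXECUTE", "VALIDATE", "COMPLETE"}

func (s State) String() string {
	if s < 0 || int(s) >= len(stateNames) {
		return fmt.Sprintf("State(%d)", int(s))
	}
	return stateNames[s]
}

// transition (hypothetical) allows only the single forward step from the
// table above and treats COMPLETE as terminal.
func transition(from, to State) error {
	if from == Complete {
		return fmt.Errorf("%s is terminal", from)
	}
	if to != from+1 {
		return fmt.Errorf("illegal transition %s -> %s", from, to)
	}
	return nil
}

func main() {
	state := Init
	for state != Complete {
		next := state + 1
		if err := transition(state, next); err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s -> %s\n", state, next)
		state = next
	}
	fmt.Println(transition(Init, Execute)) // skipping states is rejected
}
```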
The model is probabilistic - it generates tokens stochastically. Everything else is deterministic: tool execution, validation, context retrieval, output repair, reranking, and voting. The harness mediates between deterministic control and probabilistic generation.
Layers communicate through interfaces defined by the consumer, not the provider.
orchestrate.ContextProvider is satisfied by context.ContextBuilder
without import. Same pattern for CandidateReranker. This keeps the
dependency graph clean and allows independent evolution.
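Go's structural typing is what makes consumer-defined interfaces work: a type satisfies an interface by having the right methods, without naming it. The single-file sketch below collapses both packages into one and simplifies the Build signature (the real one takes a context), so treat the details as assumptions.

```go
package main

import "fmt"

// --- consumer side (would live in the orchestrate package) ---

// ContextProvider is declared where it is used, listing only what the consumer needs.
type ContextProvider interface {
	Build(query string) ([]string, error)
}

// answer depends on the interface, never on a concrete builder.
func answer(p ContextProvider, query string) (string, error) {
	msgs, err := p.Build(query)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("%d context message(s) for %q", len(msgs), query), nil
}

// --- provider side (would live in the context package) ---

// ContextBuilder never mentions ContextProvider; having a matching
// Build method is enough to satisfy it.
type ContextBuilder struct {
	Docs []string
}

func (b *ContextBuilder) Build(query string) ([]string, error) {
	return b.Docs, nil
}

// Compile-time check, normally placed in wiring code that imports both packages.
var _ ContextProvider = (*ContextBuilder)(nil)

func main() {
	out, err := answer(&ContextBuilder{Docs: []string{"Go interfaces are implicit."}}, "interfaces")
	fmt.Println(out, err)
}
```

Because the provider package imports nothing from the consumer, either side can change its internals without touching the other as long as the method set still matches.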
All state is explicit and inspectable. ConversationBuffer holds message history. SessionState holds key-value data. WorkingMemory holds transient step results. No hidden state in closures or globals.
Every inference call goes through the harness. Every tool call is logged. Every output is validated. The system is transparent, debuggable, and predictable.
Numeric prefixes on layer directories enforce dependency direction. The arch-lint tool checks at CI time. No runtime surprises from circular imports.