Controlling LLM output through decoder-level enforcement and post-hoc validation
LLM output is probabilistic. Left unconstrained, a model emits whatever tokens its probability distribution favors. Two fundamentally different approaches exist for controlling this: decoder-level enforcement, which prevents invalid tokens from being generated, and post-hoc validation, which checks output after generation.
pagantic supports both, and they compose: grammar constrains generation, then schema validates structure, then repair fixes edge cases, then rules check semantics.
The constraint layer (Layer 5) handles deterministic enforcement at the system boundary. The validate layer (Layer 7) guards final output with rules, semantic checks, and retry loops. Together they form a defense-in-depth strategy for output correctness.
GBNF (GGML BNF) is the grammar format used by llama.cpp. When a grammar is passed to the inference engine, the decoder can only produce tokens that match the grammar rules. This is fundamentally different from post-hoc validation: invalid output is prevented, not detected.
GrammarDefinition holds a GBNF grammar for decoder-level output constraints:
type GrammarDefinition struct {
// Name identifies this grammar for logging and debugging.
Name string
// Grammar is the GBNF grammar string. Must contain a root rule.
Grammar string
}
// GrammarString returns the raw GBNF text for decoder enforcement.
func (g GrammarDefinition) GrammarString() string
// Validate checks grammar for basic structural issues.
// Returns error if grammar is empty or missing root rule.
func (g GrammarDefinition) Validate() error
Simple yes/no constraint - model can only output one of two words:
root ::= "yes" | "no"
Sentiment JSON - structured output with enum values:
root ::= "{" ws "\"sentiment\"" ws ":" ws val ws "}"
ws ::= [ \t\n]*
val ::= "\"positive\"" | "\"negative\"" | "\"neutral\""
Multi-field JSON - name and age object:
root ::= "{" ws "\"name\"" ws ":" ws string ws "," ws "\"age\"" ws ":" ws number ws "}"
ws ::= [ \t\n]*
string ::= "\"" [a-zA-Z ]+ "\""
number ::= [0-9]+
grammar := constraint.GrammarDefinition{
Name: "sentiment",
Grammar: `root ::= "positive" | "negative" | "neutral"`,
}
err := grammar.Validate()
// Checks:
// 1. Grammar is not empty
// 2. Grammar has a root rule (exact "root" LHS, not just prefix)
Internally, hasRootRule parses each line, extracts the left-hand side of ::=, and requires an exact match to "root". This prevents false positives like rootRule ::= ... from passing validation.
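The root-rule check described above can be sketched as follows. This is a minimal illustration, assuming one rule per line and a plain `::=` separator; the function body is a guess at the approach, not the library's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// hasRootRule reports whether some line defines a rule whose left-hand
// side is exactly "root". Illustrative only; not the library's code.
func hasRootRule(grammar string) bool {
	for _, line := range strings.Split(grammar, "\n") {
		lhs, _, found := strings.Cut(line, "::=")
		if !found {
			continue // blank line or continuation of a previous rule
		}
		if strings.TrimSpace(lhs) == "root" {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasRootRule(`root ::= "yes" | "no"`))     // true
	fmt.Println(hasRootRule(`rootRule ::= "yes" | "no"`)) // false
	fmt.Println(hasRootRule(`answer ::= "yes" | "no"`))   // false
}
```

Comparing the trimmed left-hand side for equality, rather than testing for a `root` prefix, is what rejects `rootRule`.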
The standalone ValidateGrammar function provides the same checks without needing a GrammarDefinition struct:
err := constraint.ValidateGrammar(`root ::= "yes" | "no"`)
// nil - valid grammar
err = constraint.ValidateGrammar(`answer ::= "yes" | "no"`)
// error: constraint: grammar missing root rule (must contain 'root ::=')
DecoderConstraint represents a constraint enforceable at the decoder level. Implementations return a grammar string that the inference engine uses to restrict token generation:
type DecoderConstraint interface {
// GrammarString returns GBNF grammar for decoder enforcement.
GrammarString() string
}
// GrammarConstraint wraps GrammarDefinition as a DecoderConstraint.
type GrammarConstraint struct {
Definition GrammarDefinition
}
func (gc GrammarConstraint) GrammarString() string {
return gc.Definition.GrammarString()
}
grammar := constraint.GrammarDefinition{
Name: "sentiment",
Grammar: `root ::= "{" ws "\"sentiment\"" ws ":" ws val ws "}"
ws ::= [ \t\n]*
val ::= "\"positive\"" | "\"negative\"" | "\"neutral\""`,
}
if err := grammar.Validate(); err != nil {
log.Fatal(err)
}
sl := orchestrate.NewSpecializedLoop(orchestrate.SpecializedConfig{
Engine: engine,
Schema: schema,
Grammar: grammar.GrammarString(), // passed to decoder
})
SchemaValidator validates JSON output against core.Schema definitions. It checks structure, types, required fields, and enum membership:
sv := constraint.NewSchemaValidator(core.Schema{
Type: "object",
Properties: map[string]core.Schema{
"sentiment": {Type: "string", Enum: []string{"positive", "negative", "neutral"}},
"confidence": {Type: "number"},
},
Required: []string{"sentiment", "confidence"},
})
result := sv.Validate(`{"sentiment":"positive","confidence":0.95}`)
// result.Valid == true
// result.Errors == nil
result = sv.Validate(`{"sentiment":"maybe"}`)
// result.Valid == false
// result.Errors contains: missing required field "confidence",
// field "sentiment": must be one of [positive negative neutral]
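The validator applies these checks:
- Every field listed in Required must exist in the object.
- Type must be one of string, number, integer, boolean, object, array.
- When an Items schema is defined, each array element is validated against it.
Validate returns a ValidationResult:
type ValidationResult struct {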
Valid bool // true when output passes all checks
Errors []string // list of validation failures
Output string // original or repaired output
}
Empty schemas (no type, properties, required, enum, or items) always pass validation. This lets you use SchemaValidator in pipelines where schema is optional.
Models sometimes produce truncated or malformed JSON, especially when hitting token limits. RepairJSON fixes common issues:
// Truncated JSON (missing closing braces)
repaired := constraint.RepairJSON(`{"name": "Alice", "age": 30`)
// Result: `{"name": "Alice", "age": 30}`
// Missing closing bracket
repaired = constraint.RepairJSON(`[1, 2, 3`)
// Result: `[1, 2, 3]`
// Unclosed string
repaired = constraint.RepairJSON(`{"name": "Alice`)
// Result: `{"name": "Alice"}`
The repair algorithm walks the string character by character:
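- An expected closer is recorded each time { or [ is opened.
- The most recent closer is discarded when } or ] is encountered.
- String state is tracked on each ".
- At the end of input, an open string and any remaining closers are appended.
Call json.Valid after repair to confirm the result is valid JSON.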
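The character walk can be sketched as below. This is a simplified illustration under stated assumptions: `repairJSON` is a hypothetical stand-in for `constraint.RepairJSON`, and it only closes open strings, objects, and arrays. It does not handle trailing commas or a dangling key without a value, which is why the result is checked with `json.Valid`.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// repairJSON appends whatever closers a truncated JSON text still owes.
// Hypothetical sketch; not the library's implementation.
func repairJSON(s string) string {
	var closers []byte // stack of closers still owed
	inString, escaped := false, false
	for i := 0; i < len(s); i++ {
		c := s[i]
		if inString {
			switch {
			case escaped:
				escaped = false // this byte was escaped; skip it
			case c == '\\':
				escaped = true
			case c == '"':
				inString = false
			}
			continue
		}
		switch c {
		case '"':
			inString = true
		case '{':
			closers = append(closers, '}')
		case '[':
			closers = append(closers, ']')
		case '}', ']':
			if n := len(closers); n > 0 && closers[n-1] == c {
				closers = closers[:n-1]
			}
		}
	}
	var b strings.Builder
	b.WriteString(s)
	if inString {
		b.WriteByte('"') // close the unterminated string first
	}
	for i := len(closers) - 1; i >= 0; i-- {
		b.WriteByte(closers[i]) // innermost closer goes first
	}
	return b.String()
}

func main() {
	for _, in := range []string{`{"name": "Alice", "age": 30`, `[1, 2, 3`, `{"name": "Alice`} {
		out := repairJSON(in)
		fmt.Printf("%s -> %s (valid: %v)\n", in, out, json.Valid([]byte(out)))
	}
}
```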
Models sometimes produce enum values with wrong casing. NormalizeEnumValues rewrites them to the canonical casing defined in the schema:
schema := core.Schema{
Type: "object",
Properties: map[string]core.Schema{
"sentiment": {Type: "string", Enum: []string{"positive", "negative", "neutral"}},
},
}
normalized := constraint.NormalizeEnumValues(
`{"sentiment":"Positive"}`, // model output "Positive"
schema,
)
// Result: `{"sentiment":"positive"}` // normalized to canonical "positive"
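One way such normalization could work is sketched below, assuming a flat object and a hypothetical `enums` map in place of `core.Schema`. It is an illustration rather than the library's code. Note that re-encoding a Go map emits keys in sorted order, so field order may differ from the input.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"strings"
)

// normalizeEnums rewrites top-level string fields to the canonical casing
// listed in enums (field name -> allowed values). Hypothetical helper.
func normalizeEnums(output string, enums map[string][]string) string {
	dec := json.NewDecoder(strings.NewReader(output))
	dec.UseNumber() // keep numbers as literals instead of float64
	var obj map[string]any
	if err := dec.Decode(&obj); err != nil {
		return output // not a JSON object: leave untouched
	}
	for field, allowed := range enums {
		val, ok := obj[field].(string)
		if !ok {
			continue
		}
		for _, canonical := range allowed {
			if strings.EqualFold(val, canonical) {
				obj[field] = canonical
				break
			}
		}
	}
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(false)
	if err := enc.Encode(obj); err != nil {
		return output
	}
	return strings.TrimSpace(buf.String()) // Encode appends a newline
}

func main() {
	enums := map[string][]string{"sentiment": {"positive", "negative", "neutral"}}
	fmt.Println(normalizeEnums(`{"sentiment":"Positive"}`, enums))
	fmt.Println(normalizeEnums(`{"sentiment":"maybe"}`, enums)) // no match: value kept
}
```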
- Output is parsed with json.Decoder using UseNumber() to preserve numeric precision.
- Enum values are matched case-insensitively with strings.EqualFold.
JSONValidator combines JSON validity checking with optional repair. It implements the OutputValidator interface:
type OutputValidator interface {
Validate(output string) ValidationResult
}
type JSONValidator struct {
AttemptRepair bool // when true, tries RepairJSON before failing
}
jv := constraint.NewJSONValidator(true) // enable repair
result := jv.Validate(`{"name": "Alice"`)
// AttemptRepair is true, so:
// 1. Detects invalid JSON
// 2. Runs RepairJSON -> `{"name": "Alice"}`
// 3. Validates repaired output
// result.Valid == true
// result.Output == `{"name": "Alice"}`
When AttemptRepair is false, invalid JSON immediately returns a validation failure without attempting repair.
RuleValidator runs deterministic rule checks on final output. Each rule is a named function that returns an error on failure:
rv := validate.NewRuleValidator(
validate.Rule{
Name: "not-empty",
Check: func(s string) error {
if strings.TrimSpace(s) == "" {
return fmt.Errorf("output is empty")
}
return nil
},
},
validate.Rule{
Name: "max-length",
Check: func(s string) error {
if len(s) > 10000 {
return fmt.Errorf("output exceeds 10000 bytes") // len counts bytes, not characters
}
return nil
},
},
validate.Rule{
Name: "valid-json",
Check: func(s string) error {
if !json.Valid([]byte(s)) {
return fmt.Errorf("output is not valid JSON")
}
return nil
},
},
)
errs := rv.Validate(output)
// errs is []error, one per failed rule
// nil rules (Check == nil) are skipped
- Each Check returns nil for pass, non-nil error for fail.
- An empty RuleValidator returns no errors (safe to call on zero value).
- Passing a rule with a nil Check to NewRuleValidator has no effect.
SemanticValidator provides LLM-backed validation for content quality. Use it when deterministic rules are not enough: checking factual consistency with context, tone matching, or hallucination detection:
type SemanticValidator interface {
// Validate says if output fits intent.
Validate(ctx context.Context, output string, intent string) (valid bool, reason string, err error)
}
The interface takes three parameters:
- ctx - context for cancellation and timeouts
- output - the model output to validate
- intent - what the output should accomplish (used by the LLM to judge quality)
It returns valid (pass/fail), reason (human-readable explanation), and err (infrastructure error).
TODO in the validate package.
When output fails validation, a RepairStrategy attempts to fix it before retrying:
type RepairStrategy interface {
// Repair returns fixed output or error.
Repair(ctx context.Context, output string, errors []string) (string, error)
}
The interface receives context, the broken output, and the list of validation errors that triggered repair. This allows strategies to target specific failure modes.
strategy := &validate.JSONRepairStrategy{}
repaired, err := strategy.Repair(ctx, brokenJSON, validationErrors)
// Wraps constraint.RepairJSON
// Returns error if repaired output is still not valid JSON
A strategy can use the errors parameter to decide which repair approach to try. For example, a strategy might handle "missing required field" differently from "invalid type".
When validation fails and repair cannot fix the output, retry the entire inference:
rp := &validate.RetryPolicy{
MaxRetries: 3, // extra tries after first attempt
Backoff: time.Second, // wait time between retries
}
err := rp.Execute(ctx, func() error {
result, err := engine.Infer(ctx, req)
if err != nil {
return err // retried
}
if !isValid(result) {
return fmt.Errorf("invalid output") // retried
}
return nil // success, stops retrying
})
- Runs the function up to MaxRetries + 1 times (1 initial + N retries).
- Waits the Backoff duration between each retry.
- Checks ctx.Err() before each attempt and during backoff waits.
- Returns nil immediately on first success.
- A zero-value RetryPolicy executes the function exactly once (safe zero value).
All constraints compose in a layered pipeline. Here is a complete example showing grammar, schema, repair, and validation working together:
// 1. Define GBNF grammar (decoder-level)
grammar := constraint.GrammarDefinition{
Name: "analysis",
Grammar: `root ::= "{" ws items ws "}"
ws ::= [ \t\n]*
items ::= item ("," ws item)*
item ::= "\"" key "\"" ws ":" ws value
key ::= [a-z_]+
value ::= "\"" [a-zA-Z ]+ "\"" | [0-9]+`,
}
// 2. Define JSON schema (post-hoc validation)
schema := core.Schema{
Type: "object",
Properties: map[string]core.Schema{
"sentiment": {Type: "string", Enum: []string{"positive", "negative", "neutral"}},
"confidence": {Type: "number"},
},
Required: []string{"sentiment", "confidence"},
}
// 3. Use SpecializedLoop (handles grammar + schema + repair + validation)
sl := orchestrate.NewSpecializedLoop(orchestrate.SpecializedConfig{
Engine: engine,
Schema: schema,
Grammar: grammar.GrammarString(),
})
result, err := sl.Call(ctx, "Analyze: Great product!")
// Constraint pipeline:
// 1. GBNF grammar constrains token generation -> valid JSON structure
// 2. JSON repair fixes any truncation -> complete JSON
// 3. Enum normalization -> canonical casing
// 4. Schema validation -> structural correctness
For custom pipelines outside SpecializedLoop, compose constraints explicitly:
// Step 1: Get model output (with grammar if available)
output := runInference(ctx, prompt, grammar.GrammarString())
// Step 2: Repair truncated JSON
jv := constraint.NewJSONValidator(true) // attempt repair
result := jv.Validate(output)
if !result.Valid {
return fmt.Errorf("JSON repair failed: %v", result.Errors)
}
output = result.Output
// Step 3: Normalize enum casing
output = constraint.NormalizeEnumValues(output, schema)
// Step 4: Validate against schema
sv := constraint.NewSchemaValidator(schema)
schemaResult := sv.Validate(output)
if !schemaResult.Valid {
return fmt.Errorf("schema validation failed: %v", schemaResult.Errors)
}
// Step 5: Run business rules
rv := validate.NewRuleValidator(customRules...)
if errs := rv.Validate(output); len(errs) > 0 {
return fmt.Errorf("rule validation failed: %v", errs)
}
| Constraint | When to use | Layer |
|---|---|---|
| GBNF Grammar | Strict output format, decoder enforcement | 5 - constraint |
| Schema Validation | JSON structure and type checking | 5 - constraint |
| JSON Repair | Truncated model output | 5 - constraint |
| Enum Normalization | Inconsistent casing from model | 5 - constraint |
| JSON Validator | Combined validity check with optional repair | 5 - constraint |
| Rule Validation | Custom deterministic checks on final output | 7 - validate |
| Semantic Validation | Content quality, factual consistency, hallucination | 7 - validate |
| Repair Strategy | Fixing output that failed validation | 7 - validate |
| Retry Policy | Transient failures, invalid output after repair | 7 - validate |
pagantic defines seven failure categories. Constraint and validation layers emit specific categories that map to the canonical failure taxonomy.
| Error Code | Category | Trigger | Recovery |
|---|---|---|---|
| CONSTRAINT_GRAMMAR_REJECTED | ConstraintFailure | GBNF grammar rejected model output at decoder level | Re-infer (grammar will constrain next attempt) |
| CONSTRAINT_SCHEMA_INVALID | ConstraintFailure | SchemaValidator found output does not match JSON schema | JSON repair + re-validate, or re-infer |
| CONSTRAINT_JSON_INVALID | ConstraintFailure | Output is not valid JSON (truncated, malformed) | RepairJSON + re-validate |
| CONSTRAINT_ENUM_UNRECOGNIZED | ConstraintFailure | Enum value not in allowed set after normalization | Re-infer with tighter prompt |

| Error Code | Category | Trigger | Recovery |
|---|---|---|---|
| VALIDATION_RULE_FAILED | ValidationFailure | RuleValidator deterministic check failed | RepairStrategy + retry, or re-infer |
| VALIDATION_SEMANTIC_FAILED | ValidationFailure | SemanticValidator detected hallucination or nonsense | Re-infer with adjusted prompt |
The constraint pipeline is the ordered sequence of enforcement stages applied to model output. It is a first-class abstraction, not ad hoc composition.
Stages execute in order. Each stage receives the output (possibly repaired by a previous stage) and produces a pass/fail result:
| Stage | Layer | When Applied | Outcome |
|---|---|---|---|
| 1. Decoder Constraint | constraint (L05) | During inference (GBNF grammar) | Tokens constrained at generation time |
| 2. JSON Repair | constraint (L05) | Post-inference, if JSON invalid | Repaired JSON or failure |
| 3. Enum Normalization | constraint (L05) | Post-repair, if schema has enums | Normalized enum values |
| 4. Schema Validation | constraint (L05) | Post-normalization | Pass or ConstraintFailure |
| 5. Rule Validation | validate (L07) | Post-schema validation | Pass or ValidationFailure |
| 6. Semantic Validation | validate (L07) | Optional, post-rule validation | Pass or ValidationFailure (with confidence) |
Not all stages apply to every execution mode:
| Stage | chat | structured | plan | redundant |
|---|---|---|---|---|
| Decoder Constraint | - | Optional | Per step | Optional |
| JSON Repair | - | Yes | Per step | Yes |
| Enum Normalization | - | Yes | Per step | Yes |
| Schema Validation | - | Required | Per step | Required |
| Rule Validation | Optional | Optional | Per step | Optional |
| Semantic Validation | Optional | Optional | Per step | Optional |
Every structured output path must declare:
- OutputContract.RepairAllowed
Confidence is an optional score on SystemResponse indicating output reliability. It is a system-wide concept, not limited to any single pattern.
| Source | How Computed | Range | Pattern |
|---|---|---|---|
| Voting | Agreement ratio among N redundant inferences | [0..1] | RedundantLoop |
| Validation | Binary: 1.0 if all rules pass, 0.0 if any fail. Semantic validation may provide a continuous score. | [0..1] | Any structured pattern |
| Retrieval | Coverage metric: average retrieval score or fraction of query terms matched | [0..1] | Patterns using ContextProvider |
Confidence can be used as: