Controlled generation with closure gating
This page is a theory note. It expands the topic in short chapters and defines terminology without duplicating the formal specification documents.
The diagram has a transparent background and is intended to be read together with the caption and the sections below.
Related wiki pages: VM, event stream, context scope, bounded closure, LLM, macro-unit.
Related specs: DS004, DS010, DS011.
Overview
VSAVM does not treat generation as unconstrained continuation. “Generation” is split into two mechanisms:
- Surface realization (implemented): render VM/closure results into text without introducing new claims.
- Continuation proposals (experimental, DS011): a learned macro-unit language model can propose bytes to continue a prompt, but proposals must respect budgets and (when VM state is available) claim validation.
Implemented today: rendering checked artifacts
The default path for answering is: execute → close → render. The renderer is intentionally simple and deterministic so it cannot “invent” facts.
- VM produces a result containing claims, mode (strict/conditional/indeterminate), and optional trace references.
- Claim gating selects which claims are allowed to be rendered (see GenerationService + ClaimGate.filterClaims).
- Deterministic rendering converts each claim into text lines (see ClaimRenderer).
- Mode adaptation prefixes or suppresses output in conditional/indeterminate modes (see ModeAdapter + UncertaintyMarker).
- Audit hint appends a minimal trace note when traces exist (see TraceExplainer).
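The render path above can be illustrated with a minimal sketch. Every function and field name here (filterCheckedClaims, renderClaimLines, the checked flag, the exact prefix text) is a hypothetical stand-in chosen for illustration, not the actual GenerationService, ClaimGate, or ClaimRenderer API. The sketch also assumes that conditional mode prefixes each line and indeterminate mode suppresses claims, which the text above does not specify.

```javascript
// Hypothetical stand-ins for the stages listed above; this is not VSAVM's real API.

// Claim gating: keep only claims that the closure step marked as checked (assumed flag).
function filterCheckedClaims(claims) {
  return claims.filter((claim) => claim.checked === true);
}

// Deterministic rendering: one fixed-format line per claim, sorted for a stable order.
function renderClaimLines(claims) {
  return claims.map((c) => `${c.subject} ${c.predicate} ${c.object}.`).sort();
}

// Mode adaptation (assumed policy): prefix in conditional mode, suppress in indeterminate mode.
function adaptToMode(lines, mode) {
  if (mode === 'indeterminate') return ['No conclusion within the current budget.'];
  if (mode === 'conditional') return lines.map((line) => `Under current assumptions: ${line}`);
  return lines; // strict mode: render as-is
}

// Audit hint: append a minimal note only when trace references exist.
function appendAuditHint(lines, traceRefs) {
  if (!traceRefs || traceRefs.length === 0) return lines;
  return [...lines, `(trace: ${traceRefs.length} step(s) recorded)`];
}

function renderResult(result) {
  const allowed = filterCheckedClaims(result.claims);
  const lines = renderClaimLines(allowed);
  const adapted = adaptToMode(lines, result.mode);
  return appendAuditHint(adapted, result.traceRefs).join('\n');
}

const exampleResult = {
  mode: 'conditional',
  traceRefs: ['t1', 't2'],
  claims: [
    { subject: 'socrates', predicate: 'is', object: 'mortal', checked: true },
    { subject: 'socrates', predicate: 'is', object: 'immortal', checked: false },
  ],
};
const renderedText = renderResult(exampleResult);
```

Because every stage is a pure function of the VM result, the same result always renders to the same text, and an unchecked claim has no path into the output.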
DS011 continuation: macro-units + budgets + validation
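DS011 defines an optional continuation loop built on macro-units: frequently occurring byte sequences discovered by minimum description length (MDL) compression. Macro-units are reversible: each one expands deterministically into the original bytes, so a macro-unit sequence never encodes anything the underlying byte stream does not.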
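To make the reversibility property concrete, here is a small sketch of a macro-unit table with a round trip from bytes to tokens and back. The table contents, the id range starting at 256, and the greedy longest-match encoder are assumptions made for illustration; DS011 may discover and apply macro-units differently.

```javascript
// Hypothetical macro-unit table: id -> the byte sequence it stands for.
// Real macro-units would be discovered by compression; these are hard-coded.
const macroUnitTable = new Map([
  [256, [116, 104, 101, 32]], // "the "
  [257, [105, 110, 103]],     // "ing"
]);

// Greedy longest-match encoding: bytes -> tokens.
// Tokens below 256 are raw bytes; tokens from 256 up are macro-unit ids.
function encodeWithMacroUnits(bytes) {
  const tokens = [];
  let i = 0;
  while (i < bytes.length) {
    let bestId = -1;
    let bestLen = 0;
    for (const [id, seq] of macroUnitTable) {
      const matches = seq.length > bestLen && seq.every((b, k) => bytes[i + k] === b);
      if (matches) {
        bestId = id;
        bestLen = seq.length;
      }
    }
    if (bestId >= 0) {
      tokens.push(bestId);
      i += bestLen;
    } else {
      tokens.push(bytes[i]);
      i += 1;
    }
  }
  return tokens;
}

// Deterministic expansion: tokens -> original bytes.
function expandMacroUnits(tokens) {
  const out = [];
  for (const t of tokens) {
    if (t < 256) out.push(t);
    else out.push(...macroUnitTable.get(t));
  }
  return out;
}

// ASCII-only helpers keep the example free of platform-specific APIs.
const asciiBytes = (s) => Array.from(s, (ch) => ch.charCodeAt(0));
const asciiText = (bytes) => String.fromCharCode(...bytes);

const unitTokens = encodeWithMacroUnits(asciiBytes('the king sings'));
const roundTripText = asciiText(expandMacroUnits(unitTokens));
```

The encoder is a design choice that can vary; the expansion is the part that has to be deterministic, since it is what lets a proposal made in macro-unit space be checked as plain bytes.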
The current implementation lives in src/training/outer-loop/macro-unit-model.mjs and is exercised by eval_tinyLLM. It is a byte-level model with:
- Kneser–Ney smoothing and backoff over bounded n-gram orders
- Streaming training (trainStream) with pruning and optional sampling
- Time/token budgets (budgetMs, maxTokens)
- Repetition penalties and n-gram blocking to reduce degenerate loops
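The last item can be sketched as a rescoring pass over candidate tokens. This is an illustrative sketch only: the function names, the default trigram order, the penalty value, and the lookback size are assumptions, not values taken from macro-unit-model.mjs.

```javascript
// Would appending `candidate` complete an n-gram that already occurs in `emitted`?
function repeatsNgram(emitted, candidate, n) {
  if (emitted.length < n - 1) return false;
  const tail = emitted.slice(emitted.length - (n - 1));
  const ngram = [...tail, candidate];
  for (let i = 0; i + n <= emitted.length; i++) {
    if (ngram.every((tok, k) => emitted[i + k] === tok)) return true;
  }
  return false;
}

// Rescore candidates: drop blocked ones, then subtract a penalty for each
// occurrence of the candidate token in the recent lookback window.
function rescoreCandidates(candidates, emitted, { n = 3, penalty = 0.5, lookback = 8 } = {}) {
  const recent = emitted.slice(-lookback);
  return candidates
    .filter((c) => !repeatsNgram(emitted, c.token, n))
    .map((c) => {
      const count = recent.filter((t) => t === c.token).length;
      return { token: c.token, score: c.score - penalty * count };
    })
    .sort((a, b) => b.score - a.score);
}

const emittedTokens = [1, 2, 3, 1, 2];
const rawCandidates = [
  { token: 3, score: -0.25 }, // would repeat the trigram 1,2,3, so it is blocked
  { token: 2, score: -0.5 },  // allowed, but seen twice recently, so it is penalized
  { token: 9, score: -1 },    // allowed and unseen, so its score is unchanged
];
const rescored = rescoreCandidates(rawCandidates, emittedTokens);
```

Blocking is a hard filter while the penalty is a soft preference, so together they remove exact loops without forbidding legitimate reuse of a common token.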
Step-by-step continuation loop
- Collect context: take a bounded window of recent bytes (the prompt tail).
- Optionally encode VM state: compute a deterministic conditioning signature (see VMStateConditioner).
- Propose candidates: generate top-K macro-unit proposals (and keep a byte-level fallback).
- Validate candidates: if VM state is available, pass each proposal through ClaimGate.validateMacroUnit.
- Select next: choose the best remaining candidate under the decoding policy and budget.
- Append bytes and repeat until budget or stop condition ends the loop.
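The loop above can be sketched as follows. This is a minimal illustration under stated assumptions: continueBytes, its option names other than budgetMs and maxTokens, the toy proposer, and the toy validator are all hypothetical, and JSON.stringify is only a placeholder for whatever signature VMStateConditioner actually computes.

```javascript
// Hypothetical continuation loop; names and defaults are illustrative, not the VSAVM API.
function continueBytes(prompt, options) {
  const {
    propose, validate, vmState = null,
    windowSize = 16, topK = 3, maxTokens = 8, budgetMs = 50, stopByte = 46, // 46 is "."
  } = options;
  const out = [...prompt];
  const deadline = Date.now() + budgetMs;
  let steps = 0;
  while (steps < maxTokens && Date.now() < deadline) {
    // Collect context: a bounded window over the most recent bytes.
    const context = out.slice(-windowSize);
    // Optionally encode VM state (placeholder signature for this sketch).
    const signature = vmState ? JSON.stringify(vmState) : null;
    // Propose candidates: keep only the top-K proposals.
    const proposals = propose(context, signature).slice(0, topK);
    // Validate candidates only when VM state is available.
    const allowed = vmState ? proposals.filter((p) => validate(p, vmState)) : proposals;
    if (allowed.length === 0) break;
    // Select next: highest score wins under this simple greedy policy.
    const best = allowed.reduce((a, b) => (b.score > a.score ? b : a));
    // Append bytes and repeat until a budget or the stop condition ends the loop.
    out.push(...best.bytes);
    steps += 1;
    if (best.bytes.includes(stopByte)) break;
  }
  return out;
}

const toAscii = (s) => Array.from(s, (ch) => ch.charCodeAt(0));
const toText = (bytes) => String.fromCharCode(...bytes);

// Toy proposer: two macro-unit candidates plus a single-byte fallback.
function toyPropose(context) {
  if (toText(context).endsWith('is ')) {
    return [
      { bytes: toAscii('immortal.'), score: 0.9 },
      { bytes: toAscii('mortal.'), score: 0.6 },
      { bytes: toAscii('m'), score: 0.1 }, // byte-level fallback
    ];
  }
  return [{ bytes: toAscii('.'), score: 0.1 }];
}

// Toy validator: reject any proposal containing a phrase the VM state forbids.
function toyValidate(proposal, vmState) {
  return !vmState.forbidden.some((phrase) => toText(proposal.bytes).includes(phrase));
}

const promptBytes = toAscii('socrates is ');
const gatedText = toText(continueBytes(promptBytes, {
  propose: toyPropose, validate: toyValidate, vmState: { forbidden: ['immortal'] },
}));
const ungatedText = toText(continueBytes(promptBytes, { propose: toyPropose, validate: toyValidate }));
```

With VM state present, the highest-scoring proposal is rejected and the loop falls back to the next valid one; without VM state, the same proposer runs ungated, which is why the gate is described as conditional on VM state being available.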
Budgets and “thinking more”
Budgets are explicit and user-controlled. Increasing a budget changes what the system is allowed to assert:
- Closure budgets (DS004) control how far the VM explores consequences (depth/steps/branches/time).
- Continuation budgets (DS011) control how long the macro-unit model is allowed to run (time/tokens).
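In both cases, the budget is not a cosmetic knob: it sets the horizon within which the non-contradiction promise holds. A result is only guaranteed free of contradictions as far as the budget allowed the system to look.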
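A small sketch shows how a closure budget changes what can be asserted. It assumes a toy rule format (pairs meaning "from implies to"), a depth-only budget, and a naive contradiction check on facts prefixed with "not "; none of this is taken from DS004.

```javascript
// Hypothetical bounded closure: expand consequences breadth-first up to maxDepth rounds.
function boundedClosure(facts, rules, maxDepth) {
  const known = new Set(facts);
  let frontier = [...facts];
  let depth = 0;
  while (frontier.length > 0 && depth < maxDepth) {
    const next = [];
    for (const fact of frontier) {
      for (const [from, to] of rules) {
        if (from === fact && !known.has(to)) {
          known.add(to);
          next.push(to);
        }
      }
    }
    frontier = next;
    depth += 1;
  }
  // A non-empty frontier means some derived facts were never expanded,
  // so the result may be incomplete at this budget.
  return { known, mayBeIncomplete: frontier.length > 0 };
}

// Naive check: a contradiction is a fact x that appears together with "not x".
function findContradiction(known) {
  for (const fact of known) {
    if (known.has(`not ${fact}`)) return fact;
  }
  return null;
}

const toyRules = [['p', 'q'], ['q', 'r'], ['r', 'not p']];
const shallow = boundedClosure(['p'], toyRules, 2);
const deep = boundedClosure(['p'], toyRules, 4);
const shallowConflict = findContradiction(shallow.known); // null: not visible at depth 2
const deepConflict = findContradiction(deep.known);       // 'p': visible once depth reaches 3
```

At depth 2 the closure looks consistent but is flagged as possibly incomplete; at depth 4 it reaches a fixed point and exposes the conflict. The promise made at depth 2 covers only the first two rounds of consequences.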
Why separators matter (DS010)
Controlled generation depends on scope boundaries. DS010 discovers structural separators so facts, rules, and continuations remain localized to the current structural region instead of bleeding across unrelated regions.
References
- Beam search (Wikipedia)
- Transitive closure (Wikipedia)
- Verification and validation (Wikipedia)
- Natural language generation (Wikipedia)