Faithful surface realization
This page is a theory note. It expands the topic in short chapters and defines terminology without duplicating the formal specification documents.
The diagram has a transparent background and is intended to be read together with the caption and the sections below.
Related wiki pages: VM, event stream, VSA, bounded closure, consistency contract.
Overview
Decoding is a common place where systems silently reintroduce hallucinations. VSAVM treats decoding as surface realization of internal artifacts: if the VM did not derive a claim (or the budget was insufficient to verify it), the realizer is not allowed to present it as an unconditional fact.
Implemented today: deterministic rendering pipeline
The current implementation is intentionally conservative and deterministic. It lives in src/generation/generation-service.mjs and is used when rendering a query result.
- Claim selection: take the closure result’s claims (already produced by VM execution and bounded closure).
- Claim rendering: convert each claim term into a stable textual form (see ClaimRenderer).
- Uncertainty marking: if the closure mode is conditional, add lightweight qualifiers (see UncertaintyMarker).
- Mode adaptation: strict = render as-is; conditional = prefix with “Conditional”; indeterminate = return an explicit indeterminate message (see ModeAdapter).
- Audit hint: optionally append a short trace note when trace references exist (see TraceExplainer).
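The five steps above can be sketched as one pure function. This is a minimal illustration, not the code in src/generation/generation-service.mjs: the closure result shape (mode, claims, traceRefs), the claim term shape (predicate, args), the exact output wording, and the function names renderClaim and realize are all assumptions made for the example.

```javascript
// Hypothetical sketch of the rendering pipeline. Names and data shapes are
// assumptions for illustration, not the actual generation-service.mjs API.

// Claim rendering: a stable textual form (arguments keep their given order).
function renderClaim(claim) {
  return `${claim.predicate}(${claim.args.join(", ")})`;
}

function realize(closureResult) {
  const { mode, claims = [], traceRefs = [] } = closureResult;

  // Mode adaptation: indeterminate results get an explicit message.
  if (mode === "indeterminate") {
    return "Indeterminate: no answer could be derived.";
  }

  // Claim selection: only claims already present in the closure result.
  const lines = claims.map(renderClaim);

  // Uncertainty marking: conditional claims are prefixed, strict ones are as-is.
  const body = mode === "conditional"
    ? lines.map((line) => `Conditional: ${line}`)
    : lines;

  // Audit hint: optional trace note when trace references exist.
  if (traceRefs.length > 0) {
    body.push(`(trace: ${traceRefs.length} step(s) available for audit)`);
  }
  return body.join("\n");
}

const exampleOutput = realize({
  mode: "conditional",
  claims: [{ predicate: "parent", args: ["alice", "bob"] }],
  traceRefs: ["t1"],
});
console.log(exampleOutput);
```

Because the function reads only its argument and has no randomness or external state, the same closure result always yields the same text, which is the property the deterministic pipeline relies on.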
Fidelity preservation mechanisms
Fidelity is preserved by construction, not by a separate “truth checker”:
- No new claims: the renderer only formats claims already present in the closure result.
- Explicit boundary behavior: conditional and indeterminate modes are surfaced in the output text, not hidden.
- Trace pointers: when present, traces provide a hook for auditing what was executed.
What is realized
The VM can produce a result mode, a set of claims (terms), conflicts, assumptions, and trace references. Today’s realizer focuses on claims-to-text. Higher-level report formatting can be built on top, but the invariant remains: every asserted line must correspond to a checked internal artifact.
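The invariant can be checked mechanically. The sketch below assumes a hypothetical claim shape (predicate, args) and a hypothetical helper findUnbackedLines; it only illustrates the idea that any emitted line without a matching claim counts as a violation.

```javascript
// Hypothetical invariant check: every asserted line must map back to a claim
// in the closure result. Claim shape and function names are assumptions.
function claimToText(claim) {
  return `${claim.predicate}(${claim.args.join(", ")})`;
}

// Returns the output lines that have no corresponding claim.
function findUnbackedLines(outputLines, claims) {
  const allowed = new Set(claims.map(claimToText));
  return outputLines.filter((line) => !allowed.has(line));
}

const checkedClaims = [{ predicate: "capital", args: ["france", "paris"] }];

// A faithful output has no unbacked lines.
const faithful = findUnbackedLines(["capital(france, paris)"], checkedClaims);

// An output with an extra, never-derived detail is flagged.
const embellished = findUnbackedLines(
  ["capital(france, paris)", "population(paris, large)"],
  checkedClaims
);

console.log(faithful.length, embellished);
```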
Continuation is separate from decoding (DS011)
DS011 adds an optional macro-unit language model that can generate byte continuations under budgets. This is exercised in eval_tinyLLM for fair comparisons against a TensorFlow baseline. Importantly:
- Continuation is not the default answer path for VM query answering.
- Continuation must not be treated as truth: when VM state is available, proposals are expected to be gated by claim validation and/or closure checks.
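One way to picture the gating is a filter that separates proposals the VM can back from those it cannot. This is a sketch under assumptions: gateProposals is a hypothetical name, proposals are plain strings, and the validator callback stands in for whatever claim validation or closure check DS011 actually specifies.

```javascript
// Hypothetical gate: a continuation proposal is asserted only when a validator
// (standing in for VM claim validation or a closure check) accepts it.
function gateProposals(proposals, isValidated) {
  const asserted = [];
  const withheld = [];
  for (const proposal of proposals) {
    if (isValidated(proposal)) {
      asserted.push(proposal);
    } else {
      withheld.push(proposal);
    }
  }
  return { asserted, withheld };
}

// Assumed stand-in for VM state: the set of claims the VM has derived.
const derivedClaims = new Set(["capital(france, paris)"]);

const gated = gateProposals(
  ["capital(france, paris)", "capital(france, lyon)"],
  (proposal) => derivedClaims.has(proposal)
);
console.log(gated.asserted, gated.withheld);
```

Withheld proposals are kept rather than discarded so that a caller could still surface them as unverified text instead of presenting them as facts.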
Quality assurance and validation
For surface realization, validation is mostly structural:
- Determinism: strict mode output is deterministic for the same input and VM state.
- Mode correctness: conditional/indeterminate outputs must visibly reflect the response mode.
- Trace presence: when traces exist, the output surfaces that there is auditable evidence.
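These three checks lend themselves to a small structural test helper. The sketch below is illustrative only: checkRealizer, the closure result fields (mode, claims, traceRefs), and the marker words it searches for ("conditional", "indeterminate", "trace") are assumptions, not the project's test suite.

```javascript
// Hypothetical structural checks for a realizer function. Field names and
// marker words are assumptions for illustration.
function checkRealizer(realizeFn, closureResult) {
  const failures = [];
  const output = realizeFn(closureResult);
  const lower = output.toLowerCase();

  // Determinism: in strict mode, a second call must give identical text.
  if (closureResult.mode === "strict" && realizeFn(closureResult) !== output) {
    failures.push("determinism");
  }

  // Mode correctness: the response mode must be visible in the text.
  if (closureResult.mode === "conditional" && !lower.includes("conditional")) {
    failures.push("mode correctness");
  }
  if (closureResult.mode === "indeterminate" && !lower.includes("indeterminate")) {
    failures.push("mode correctness");
  }

  // Trace presence: existing traces must be surfaced.
  const traceRefs = closureResult.traceRefs ?? [];
  if (traceRefs.length > 0 && !lower.includes("trace")) {
    failures.push("trace presence");
  }
  return failures;
}

// A deliberately faulty stub realizer that hides the response mode.
const hidesMode = (result) => result.claims.join("\n");

const report = checkRealizer(hidesMode, {
  mode: "conditional",
  claims: ["p(a)"],
  traceRefs: [],
});
console.log(report); // reports a mode correctness failure
```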
Why constraints matter
Without constraints, a fluent realizer can add plausible details that were never derived. Constraints turn the correctness contract into an end-to-end property: not only is the internal reasoning checked, but the emitted text is guaranteed to be a rendering of checked state rather than an additional source of information.
Audit and user trust
Faithful realization supports audit. When the user asks why a claim was made, the system can point to the underlying fact identifiers and trace steps. When it cannot justify a claim, it must degrade to conditional or indeterminate outputs rather than inventing.
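A sketch of this degrade-or-justify behavior follows. The evidence index, its fields (factIds, traceSteps), the function name explainClaim, and the output wording are hypothetical; the point is only that a missing justification yields an explicit indeterminate answer rather than an invented one.

```javascript
// Hypothetical audit lookup: justify a claim from recorded evidence, or
// degrade to an explicit indeterminate answer. All names are assumptions.
function explainClaim(claimText, evidenceIndex) {
  const evidence = evidenceIndex.get(claimText);

  // No recorded justification: degrade instead of inventing one.
  if (!evidence || evidence.factIds.length === 0) {
    return {
      mode: "indeterminate",
      text: `Indeterminate: no justification recorded for "${claimText}".`,
    };
  }

  return {
    mode: "strict",
    text:
      `${claimText}: supported by facts [${evidence.factIds.join(", ")}]` +
      ` via trace steps [${evidence.traceSteps.join(", ")}]`,
  };
}

// Assumed evidence index keyed by rendered claim text.
const evidenceIndex = new Map([
  ["parent(alice, bob)", { factIds: ["f12"], traceSteps: ["t3", "t4"] }],
]);

const justified = explainClaim("parent(alice, bob)", evidenceIndex);
const unjustified = explainClaim("parent(alice, carol)", evidenceIndex);
console.log(justified.text);
console.log(unjustified.text);
```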