Large Language Model (LLM)
This wiki entry defines a term used across VSAVM and explains why it matters in the architecture.
[Diagram: the operational meaning of the term inside VSAVM.]
Related wiki pages: VM, event stream, VSA, bounded closure, consistency contract.
Definition
A large language model is typically a neural network trained to predict the next token (or next segment) of text. In VSAVM, “LLM-like” describes the interface (interactive continuation), not the source of truth.
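For intuition, next-token prediction reduces to scoring every vocabulary item given the context and picking (or sampling) a continuation. A toy sketch, with made-up scores standing in for a trained network:

```python
import numpy as np

# Toy next-token predictor over a five-word vocabulary. A real LLM
# replaces the hard-coded logits with a trained neural network's output.
vocab = ["the", "vm", "executes", "claims", "<eos>"]
logits = np.array([0.1, 0.2, 2.5, 0.7, 0.3])  # made-up scores for context "the vm"

probs = np.exp(logits) / np.exp(logits).sum()  # softmax over the vocabulary
print(vocab[int(np.argmax(probs))])            # greedy continuation -> "executes"
```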
Role in VSAVM
VSAVM uses continuation prediction as a proposal mechanism, but correctness is owned by the VM and bounded closure (see the sketch after this list):
- Proposals: continuations can come from a neural model (baseline) or from VSAVM’s own macro-unit model (DS011).
- Acceptance: factual claims are only emitted when supported by executable state and the correctness contract (DS004).
- Boundary behavior: when budgets are insufficient, outputs must degrade to conditional or indeterminate instead of “guessing”.
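A minimal sketch of that propose/accept/degrade split. The names here (propose_continuation, vm_supports, Verdict) are hypothetical stand-ins, not the actual VSAVM APIs:

```python
from enum import Enum

class Verdict(Enum):
    CLAIM = "claim"              # supported by executable state
    CONDITIONAL = "conditional"  # budget ran out before verification finished
    INDETERMINATE = "indeterminate"

def answer(prompt, propose_continuation, vm_supports, budget):
    """Propose-then-verify: fluent proposals never bypass the VM check."""
    proposal = propose_continuation(prompt)            # neural or macro-unit model
    supported, budget_exhausted = vm_supports(proposal, budget)
    if supported:
        return Verdict.CLAIM, proposal                 # emit as a checked claim
    if budget_exhausted:
        return Verdict.CONDITIONAL, proposal           # degrade instead of guessing
    return Verdict.INDETERMINATE, None
```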
Mechanics and implications
In this repository, two “LLM-like” paths exist (a dispatch sketch follows the list):
- Query answering (default): execute a query in the VM, run bounded closure, then render checked claims deterministically.
- Continuation (DS011, evaluation harness): train a macro-unit language model to continue byte sequences under budgets. This is compared against a small TensorFlow Transformer baseline in eval_tinyLLM.
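How the two paths might be dispatched, as a hedged sketch: vm, closure, renderer, and macro_lm are hypothetical handles, not the repository's real entry points.

```python
def respond(request, vm, closure, renderer, macro_lm, budget):
    # Query answering (default): VM execution plus bounded closure own correctness.
    if request.kind == "query":
        state = vm.execute(request.query, budget)
        checked = closure.run(state, budget)    # bounded closure over facts
        return renderer.render(checked)         # deterministic rendering of checked claims
    # Continuation (DS011): language-model continuation under the same budget.
    return macro_lm.continue_bytes(request.prompt, budget)
```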
The important implication is that fluency is never treated as truth. Continuation quality is measured as a language-model metric (perplexity, reference match, repetition), while correctness for claims is measured via VM/closure.
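As an example of the language-model side, perplexity is the exponential of the mean negative log-likelihood over the predicted tokens; a minimal self-contained computation:

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood) over predicted tokens."""
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# Hypothetical log-probabilities a model assigned to the reference tokens.
print(perplexity([math.log(0.5), math.log(0.25), math.log(0.5)]))  # ~2.52
```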
Practical evaluation (eval_tinyLLM)
The eval_tinyLLM suite exists to make “more realistic” comparisons reproducible while keeping the codebase dependency-light (a sketch of the dataset keying follows this list):
- Prepare a dataset split under a deterministic datasetId (size-based, keyed by maxBytes and split settings).
- Train VSAVM macro-units (streaming) and optionally persist facts.
- Train the TensorFlow baseline on the same dataset.
- Compare both engines under identical budgets per prompt and write a timestamped HTML report to eval_tinyLLM/results/.
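A sketch of how a deterministic, size-based datasetId could be derived. The hashing scheme here is an assumption; only the maxBytes/split keying comes from this page:

```python
import hashlib
import json

def derive_dataset_id(max_bytes: int, split: dict) -> str:
    """Deterministic dataset key: identical settings yield an identical id,
    so a prepared split can be found again in the cache.
    (Hypothetical scheme; the repository may derive the id differently.)"""
    payload = json.dumps({"maxBytes": max_bytes, **split}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()[:12]

print(derive_dataset_id(1_000_000, {"train": 0.9, "eval": 0.1}))
```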
Artifacts are cached under eval_tinyLLM/cache/datasets/ and eval_tinyLLM/cache/models/ so multiple dataset sizes and multiple trained models can coexist.
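The cache could then be addressed like this; the subdirectory naming beyond datasets/ and models/ is hypothetical:

```python
from pathlib import Path

CACHE = Path("eval_tinyLLM/cache")

def dataset_dir(dataset_id: str) -> Path:
    # One directory per dataset key, so multiple dataset sizes coexist.
    return CACHE / "datasets" / dataset_id

def model_dir(engine: str, dataset_id: str, run: str) -> Path:
    # One directory per (engine, dataset, run), so multiple trained
    # models coexist without clobbering each other.
    return CACHE / "models" / engine / f"{dataset_id}-{run}"

print(dataset_dir("a1b2c3d4e5f6"))
print(model_dir("vsavm", "a1b2c3d4e5f6", "run1"))
```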
Further reading
LLMs are a fast-moving field. VSAVM’s design goal is to combine LLM-like interaction with an executable substrate and explicit boundary behavior.
References
- Large language model (Wikipedia)
- Language model (Wikipedia)
- Natural language generation (Wikipedia)