Structural scope isolation
VSAVM prevents context bleeding by deriving scope only from structural separators in the data (DS010 / NFS11). Scope is a structural path, not a hand-assigned topic label.
Problem: polysemy and source mixing
Real corpora contain repeated strings that refer to different things in different places, and they often contain mutually incompatible statements. If the system treats all assertions as globally active, contradiction checks either explode in number or become meaningless, because statements that were never meant to be compared are reported as conflicts.
- Polysemy: the same surface form appears in multiple sources with different referents.
- Quoted passages: a quote can contradict the narrator without being an error.
- Alternative versions: two documents can disagree without either being “wrong” inside its own context.
Design rule (enforced in code)
Scopes must emerge from structure. A scope is derived from separators such as document boundaries, headings, paragraphs, speakers, scenes, functions, and other structural cuts present in the input stream.
The implementation rejects any attempt to create a scope whose path starts with 'domain' (that is, ['domain', ...]). This is enforced by createScopeId in src/core/types/identifiers.mjs.
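The rule can be pictured with a minimal sketch. This is not the actual createScopeId from src/core/types/identifiers.mjs: the name createScopeIdSketch, the returned object shape, and the error messages are all assumptions made for illustration.

```javascript
// Hypothetical sketch; the real createScopeId may differ in shape and error handling.
function createScopeIdSketch(path) {
  if (!Array.isArray(path) || path.length === 0) {
    throw new TypeError('scope path must be a non-empty array');
  }
  if (!path.every((token) => typeof token === 'string' && token.length > 0)) {
    throw new TypeError('scope path tokens must be non-empty strings');
  }
  // The design rule: a scope may never be rooted in a domain label.
  if (path[0] === 'domain') {
    throw new Error("scope paths must not start with 'domain'");
  }
  // Freeze a copy so the path cannot be mutated after creation.
  return Object.freeze({ path: Object.freeze([...path]) });
}

const sectionScope = createScopeIdSketch(['document', 'doc_0042', 'section_3']);
console.log(sectionScope.path.join('/')); // document/doc_0042/section_3
```

Rejecting the path at construction time means no later component has to re-check the rule.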
What a scope looks like
Scope IDs are hierarchical paths. The exact tokens depend on the modality ingest layer, but they must always correspond to structural separators (never to a domain name).
['document', 'doc_0042', 'section_3', 'paragraph_12']
['file', 'src/parser.mjs', 'function_parse', 'block_2']
['conversation', 'turn_17', 'assistant', 'sentence_2']
['dataset', 'record_128']
Containment is prefix-based. For example, ['dataset','record_128'] contains all deeper scopes such as ['dataset','record_128','byte_17'].
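Prefix containment can be sketched in a few lines. The helper name scopeContains is hypothetical, and treating a scope as containing itself is an assumption; the text above only states that deeper scopes are contained.

```javascript
// Hypothetical helper: true when `inner` equals `outer` or lies beneath it.
function scopeContains(outer, inner) {
  if (outer.length > inner.length) return false;
  return outer.every((token, i) => token === inner[i]);
}

const record = ['dataset', 'record_128'];
console.log(scopeContains(record, ['dataset', 'record_128', 'byte_17'])); // true
console.log(scopeContains(record, ['dataset', 'record_129']));            // false
// Tokens are compared whole, so 'record_12' is not a prefix of 'record_128'.
console.log(scopeContains(['dataset', 'record_12'], record));             // false
```

Comparing whole tokens rather than joined strings avoids false matches between paths whose tokens merely share leading characters.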
How scope is derived at runtime (current implementation)
- Ingest events with optional contextPath arrays (preferred) using DS007 event structures.
- Detect separators with the DS010 VSA detector (detectStructuralSeparators) when explicit context is missing, or to add a gradient-based boundary signal.
- Create a ScopeId at a position using createStructuralScopeId(events, position, separators):
  - If the event has a contextPath, use it directly.
  - Otherwise, build a fallback path from the strongest detected separators (e.g., ['stream','section_boundary_120','minor_boundary_318']).
- Attach scope to facts so closure and contradiction checks are localized.
- Query within a scope by selecting a structural region (via scope containment) instead of merging unrelated regions.
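The scope-creation step can be sketched as follows. This is an illustration, not the real createStructuralScopeId. The function name deriveScopePathSketch, the separator fields (position, strength, kind), the maxDepth parameter, and the use of position as an index into events are all assumptions.

```javascript
// Hypothetical sketch of "use contextPath if present, otherwise fall back to separators".
function deriveScopePathSketch(events, position, separators, maxDepth = 2) {
  const event = events[position];
  // Preferred: an explicit structural path supplied at ingest time.
  if (event && Array.isArray(event.contextPath) && event.contextPath.length > 0) {
    return [...event.contextPath];
  }
  // Fallback: keep the strongest separators at or before the position...
  const strongest = separators
    .filter((s) => s.position <= position)
    .sort((a, b) => b.strength - a.strength)
    .slice(0, maxDepth);
  // ...and order them by position so earlier cuts come first in the path.
  strongest.sort((a, b) => a.position - b.position);
  return ['stream', ...strongest.map((s) => `${s.kind}_${s.position}`)];
}

const events = Array.from({ length: 400 }, () => ({ type: 'token' }));
events[10] = { type: 'token', contextPath: ['document', 'doc_0042', 'section_3'] };
const separators = [
  { position: 120, strength: 0.9, kind: 'section_boundary' },
  { position: 318, strength: 0.4, kind: 'minor_boundary' },
];

console.log(deriveScopePathSketch(events, 10, separators));
// explicit path: document / doc_0042 / section_3
console.log(deriveScopePathSketch(events, 330, separators));
// fallback path: stream / section_boundary_120 / minor_boundary_318
```

The fallback here is deliberately naive: it ranks by strength alone and does not check that the chosen boundaries actually nest.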
Context selection is structural, not topical
When the user wants “the other meaning”, the system switches by structural reference (document/section/file/record), not by picking a domain.
User: "In record_12, what does 'Python' refer to?"
System: [answers within scope ['dataset','record_12']]
User: "Now answer using record_99 instead."
System: [answers within scope ['dataset','record_99']]
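The exchange above amounts to running the same lookup against two different structural regions. The sketch below uses made-up fact records and referents purely for illustration; the fact shape and the lookup function are hypothetical and are not part of VSAVM's API.

```javascript
// Illustrative data only: the referents are invented for this example.
const facts = [
  { scope: ['dataset', 'record_12'], term: 'Python', referent: 'programming language' },
  { scope: ['dataset', 'record_99'], term: 'Python', referent: 'snake' },
];

// A fact is visible when its scope lies inside the selected scope (prefix test).
const inScope = (selected, scope) =>
  selected.length <= scope.length && selected.every((t, i) => t === scope[i]);

function lookup(term, selectedScope) {
  return facts
    .filter((f) => f.term === term && inScope(selectedScope, f.scope))
    .map((f) => f.referent);
}

console.log(lookup('Python', ['dataset', 'record_12'])); // only the record_12 referent
console.log(lookup('Python', ['dataset', 'record_99'])); // only the record_99 referent
```

Neither call needs a topic label: the caller names a structural region, and the facts outside it are never considered.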
Why this matters for correctness
- Contradictions become computable: a conflict is opposing polarity for the same canonical FactId within the same scope.
- Separator errors are localized: segmentation mistakes do not contaminate the entire corpus.
- Modality agnostic: the boundary tokens differ across modalities, but the rule (“structure only”) stays the same.
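The conflict definition above (opposing polarity, same canonical FactId, same scope) can be sketched as a single pass over the facts. The fact shape { factId, polarity, scope }, the boolean polarity, and the function name findConflicts are assumptions. The sketch compares scopes for exact equality and leaves any containment-based policy aside.

```javascript
// Hypothetical sketch: report each (scope, factId) pair that carries both polarities.
function findConflicts(facts) {
  const firstPolarity = new Map(); // key -> polarity first seen for that key
  const reported = new Set();      // keys already reported as conflicting
  const conflicts = [];
  for (const { factId, polarity, scope } of facts) {
    // The key includes the scope, so other scopes never collide with this one.
    const key = JSON.stringify([scope, factId]);
    if (!firstPolarity.has(key)) {
      firstPolarity.set(key, polarity);
    } else if (firstPolarity.get(key) !== polarity && !reported.has(key)) {
      reported.add(key);
      conflicts.push({ factId, scope });
    }
  }
  return conflicts;
}

const sectionA = ['document', 'doc_a', 'section_1'];
const sectionB = ['document', 'doc_b', 'section_1'];
const sample = [
  { factId: 'f1', polarity: true, scope: sectionA },
  { factId: 'f1', polarity: false, scope: sectionB }, // different scope: no conflict
  { factId: 'f2', polarity: true, scope: sectionA },
  { factId: 'f2', polarity: false, scope: sectionA }, // same scope: conflict
];
console.log(findConflicts(sample).map((c) => c.factId)); // only f2 is reported
```

Because the key includes the scope, two documents that disagree never produce a conflict unless the disagreement occurs inside one structural region.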
Related: Structural boundaries and scope, DS010, Specs