Experiment 4: Consensus vs Naive List

Goal: show why multi-column consensus is not just “more of the same.” We ingest a clean stream, then we query with noisy windows. Each column sees a different noisy version of the same window. Consensus should recover the correct location more reliably than any single view or a naive list scan.

Plain-language version: many small witnesses see the story with different errors. Voting should beat a single witness and should beat a naive scan that only accepts exact matches.

How this experiment works

1. Generate a clean story and store it in all columns.
2. Create a noisy query window by corrupting some tokens.
3. Ask each column to localize that window.
4. Compare single-column answers, a consensus vote, and a naive list scan.
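The voting step can be sketched as a plurality count over the per-column answers. This is a minimal illustration, not the VSABrains implementation: the name `consensusVote`, the convention that a column returns `null` when it cannot localize the window, and the tie-break (the first answer seen wins) are all assumptions.

```javascript
// Hypothetical sketch: plurality vote over the locations proposed by each column.
// columnAnswers holds one entry per column: a location, or null for "no answer".
function consensusVote(columnAnswers) {
  const counts = new Map();
  for (const location of columnAnswers) {
    if (location === null || location === undefined) continue; // column abstained
    counts.set(location, (counts.get(location) || 0) + 1);
  }

  let best = null;
  let bestVotes = 0;
  // Map iterates in insertion order, so on a tie the first-seen location wins.
  for (const [location, votes] of counts) {
    if (votes > bestVotes) {
      best = location;
      bestVotes = votes;
    }
  }
  return { location: best, votes: bestVotes };
}
```

With five columns where three agree, `consensusVote([42, 42, 17, 42, 99])` returns location 42 with 3 votes, even though two columns are wrong.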

What “noise” means

Noise is the fraction of tokens in the query window that are randomly replaced. Higher noise simulates sloppy input or uncertain observations.
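One way to picture this is the sketch below. It assumes tokens are integers in `0..vocabSize-1`, that each token is replaced independently with probability `noiseRate`, and that a replacement always differs from the original token. The source does not say whether the experiment uses a per-token probability or an exact fraction, so treat `corruptWindow` and its parameters as hypothetical.

```javascript
// Hypothetical sketch: corrupt a query window by random token replacement.
// rng is any function returning a number in [0, 1); it defaults to Math.random.
function corruptWindow(queryWindow, noiseRate, vocabSize, rng = Math.random) {
  return queryWindow.map((token) => {
    if (rng() >= noiseRate) return token; // keep this token unchanged
    // Shift by 1..vocabSize-1 (mod vocabSize) so the result is never the original.
    const offset = 1 + Math.floor(rng() * (vocabSize - 1));
    return (token + offset) % vocabSize;
  });
}
```

At `noiseRate = 0` the window comes back unchanged; at `noiseRate = 1` every token is replaced.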

Baselines: “Single Column” means using just column 0. “Best Column” is the best-performing individual column, with no voting; it is an optimistic upper bound because the best column is only known after the fact. “Naive List” linearly scans the full token list and accepts only an exact window match.
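The naive list baseline can be sketched as a sliding-window scan that also counts token comparisons, which is the work proxy used for this baseline. This is an illustration under assumptions: `naiveListScan` and its `matchThreshold` parameter are hypothetical names (the parameter loosely mirrors `baselineMatchThreshold`), and the real baseline may terminate early or count work differently.

```javascript
// Hypothetical sketch: linear scan that accepts a position only when at least
// matchThreshold tokens agree. The default threshold is the full window length,
// which makes it an exact match.
function naiveListScan(stream, query, matchThreshold = query.length) {
  let comparisons = 0;
  let bestPosition = -1;
  let bestMatches = 0;

  for (let start = 0; start + query.length <= stream.length; start++) {
    let matches = 0;
    for (let i = 0; i < query.length; i++) {
      comparisons++; // one token comparison of proxy work
      if (stream[start + i] === query[i]) matches++;
    }
    // Keep the earliest position with the highest qualifying match count.
    if (matches >= matchThreshold && matches > bestMatches) {
      bestPosition = start;
      bestMatches = matches;
    }
  }
  return { position: bestPosition, comparisons };
}
```

In this sketch a single corrupted token is enough to make the exact-match scan return -1, which illustrates why the baseline accuracy falls so quickly under noise. The work grows linearly with stream length: a stream of N tokens and a window of w tokens costs (N - w + 1) × w comparisons here.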

“Proxy” metrics are rough lower-bound estimates, not exact measurements. They are useful for comparing scaling trends (linear scans vs indexed search) without claiming precise runtime or memory.

Metric	Value	Target	Status
Consensus Acc	0.992	> 0.90	Pass
Single Column Acc	0.769	—	Baseline
Best Column Acc	0.780	—	Baseline
Naive List Acc	0.183	—	Baseline
Consensus Gain (vs Single)	0.224	> 0.05	Pass
Consensus Gain (vs Naive)	0.809	> 0.05	Pass
List Comparisons / Query	1,188	Linear scan	Expected
VSA Storage (lower bound)	103,040 bytes	Proxy	Proxy
List Storage (lower bound)	1,600 bytes	Proxy	Proxy

Interpretation: consensus is much more robust to noise. Here it reaches 0.992 accuracy, while a single column reaches 0.769 and the naive list baseline reaches 0.183 under the same noisy queries. Even the best individual column reaches only 0.780, so the improvement comes from combining views rather than from one unusually good column. The large positive “Consensus Gain” values show that voting is doing real work, not just averaging identical answers.
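The relationship between these metrics can be sketched as follows. The trial format (`truth`, `columnAnswers`) and the function names are assumptions made for illustration, and the vote is a simple plurality, which may differ from the scoring VSABrains actually uses.

```javascript
// Hypothetical sketch: derive Single Column, Best Column, Consensus accuracy,
// and Consensus Gain from per-trial column answers.
function pluralityVote(answers) {
  const counts = new Map();
  let best = null;
  let bestVotes = 0;
  for (const a of answers) {
    if (a === null || a === undefined) continue;
    const votes = (counts.get(a) || 0) + 1;
    counts.set(a, votes);
    if (votes > bestVotes) {
      best = a;
      bestVotes = votes;
    }
  }
  return best;
}

function scoreTrials(trials) {
  const numColumns = trials[0].columnAnswers.length;
  const perColumnHits = new Array(numColumns).fill(0);
  let consensusHits = 0;

  for (const { truth, columnAnswers } of trials) {
    columnAnswers.forEach((answer, c) => {
      if (answer === truth) perColumnHits[c]++;
    });
    if (pluralityVote(columnAnswers) === truth) consensusHits++;
  }

  const n = trials.length;
  const single = perColumnHits[0] / n;          // "Single Column" = column 0
  const best = Math.max(...perColumnHits) / n;  // "Best Column", chosen in hindsight
  const consensus = consensusHits / n;
  return { single, best, consensus, gainVsSingle: consensus - single };
}
```

On toy data where each column is wrong on a different trial, consensus can exceed even the best column, which is the pattern the table shows (0.992 vs 0.780).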

Visual: Accuracy vs Noise

Figure: Accuracy versus noise for different column counts

Read left to right: noise goes up, the red line (1 column) drops, while the blue/green lines (5–9 columns) stay near the top.

Visual: Gain vs Columns (Noise = 0.35)

Figure: Consensus gain versus columns at noise 0.35

This plot holds noise at the highest swept level (0.35) and shows how much accuracy consensus recovers compared to a single column. Most of the gain appears by 5 columns.

Work / Speed Proxies (Why this beats a naive list)

We use a work proxy rather than a raw runtime claim. For the naive list, the work is “token comparisons per query.” For VSABrains, the proxy is “candidate locations scored per query.” Lower is better for both.

Work Proxy (noise=0.25, cols=5)	Naive List	VSABrains	Interpretation
Work per query	1,188 comparisons	5.67 locations scored	VSABrains does far less work
Baseline ÷ VSA work	~209×		Large win under noisy queries
Index hits per token	—	~1.14 candidates/token/column	Index stays tight
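The ratio row is plain division of the two work proxies. The snippet below reproduces it from the rounded values in the table; `workRatio` is a hypothetical helper name. The rounded inputs give about 209.5, consistent with the reported ~209×, which was presumably computed from unrounded measurements.

```javascript
// Sketch: baseline work divided by indexed work, using the rounded table values.
function workRatio(baselineWork, indexedWork) {
  if (indexedWork <= 0) throw new RangeError("indexedWork must be positive");
  return baselineWork / indexedWork;
}

const ratio = workRatio(1188, 5.67); // comparisons per query / locations scored per query
console.log(`~${Math.floor(ratio)}x`); // prints "~209x"
```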

Visual: Work Proxy (Noise = 0.25, Cols = 5)

Figure: Work proxy comparison at noise 0.25 with 5 columns

The naive list must compare against almost the whole stream, while VSABrains narrows the search to a tiny candidate set via the index.

Visual: Work Ratio vs Columns (Noise = 0.35)

Figure: Baseline work ratio versus columns at noise 0.35

Even at the highest noise level (0.35), the naive scan does roughly 125–860× more proxy work than the indexed VSABrains approach. More columns add some work, so the ratio shrinks as columns are added, but the gap remains very large.

Baseline ÷ VSA Work Ratio	Cols=1	Cols=5	Cols=9
Noise 0.15	~856×	~191×	~106×
Noise 0.25	~854×	~209×	~115×
Noise 0.35	~862×	~226×	~125×

Last run: 2026-01-27T12:59:38Z. Noise rate 0.25, window size 6, columns 5, baselineMatchThreshold=6 (exact match).

Sweep run: 2026-01-27T12:59:46Z. See eval/exp4-consensus/sweep.mjs for configuration.

Storage proxy caveat: the VSA storage proxy can be larger than the naive list in small configs. In the summary table above it is about 64× larger (103,040 bytes vs 1,600 bytes). The main advantage shown here is robustness plus much lower query work under noise.

Consensus Sweep (Columns × Noise)

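As noise increases, 1 column degrades while 5–9 columns remain strong. The gain columns show how much accuracy is recovered by voting. They appear to be measured against the single-column (column 0) baseline within the same multi-column run, as in the summary table, not against the Cols=1 Acc column, so a gain does not equal the difference between the accuracies in its row.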

Noise	Cols=1 Acc	Cols=5 Acc	Cols=5 Gain	Cols=9 Acc	Cols=9 Gain
0.15	0.939	0.997	+0.146	1.000	+0.151
0.25	0.901	0.992	+0.224	1.000	+0.276
0.35	0.852	0.976	+0.335	1.000	+0.312