VSABrains - Evaluation Status

Summary

Full Success

Last full run: 2026-01-26T20:45:03Z. DS003 success criteria are met under the current evaluation configuration.

Recent updates: Exp4 consensus, work proxies, and compression sweep were refreshed on 2026-01-27.

Evaluation configuration notes: Exp2 time localization is measured via replay-based window verification against fast maps (not through a separate learned time model). Exp3 uses a deterministic extractor stub (no external LLM), so extractor stability and latency under real LLMs are not yet validated.
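To make the Exp3 caveat concrete, here is a minimal sketch of what a deterministic extractor stub can look like. It is not the project's actual extractor: the `Fact` type, the verb list, and the `extract_facts` name are all hypothetical. The point is only that a rule-based stub returns identical output for identical input, so it cannot exhibit the variance, latency, or error modes of a live LLM.

```python
import re
from typing import NamedTuple


class Fact(NamedTuple):
    """Hypothetical fact triple; the real schema is not given in this document."""
    subject: str
    predicate: str
    obj: str


# Assumed toy grammar: "<Name> <verb> [the] <noun>" with a fixed verb list.
_PATTERN = re.compile(
    r"\b([A-Z][a-z]+) (enters|leaves|takes|drops) (?:the )?([a-z]+)\b"
)


def extract_facts(text: str) -> list[Fact]:
    """Rule-based extraction: no model call, no randomness, no network."""
    return [Fact(s, p, o) for s, p, o in _PATTERN.findall(text)]


if __name__ == "__main__":
    for fact in extract_facts("Alice enters the kitchen. Bob takes the key."):
        print(fact)
```

Because the stub is a pure function of its input, repeated runs over the same text agree exactly, which is why a pass on Exp3 says nothing yet about extractor stability under a real LLM.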

Exp6 uses a simulated semantic encoder to stress the frame/query pipeline for literature and dialogue.

Known Gaps

Area	Gap	Impact
Exp3 Extraction	Extractor is a deterministic stub, not a live LLM.	LLM variance, latency, and error rates are not validated.
Exp2 Time Localization	Measured via replay verification, not a learned or independent time model.	Success reflects deterministic verification, not predictive time inference.
Compression Sweep	Compression sweep now covers grid size and heavy-hitters `k`, but only on Exp4 proxies.	Compression effects on Exp1/Exp2 and real memory usage remain unvalidated.
Exp6 Encoder	Semantic encoder is simulated, not a production-grade model.	Frame quality reflects the generator, not a live LLM pipeline.
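As an illustration of the kind of sweep described in the Compression Sweep row, the sketch below varies a grid size and a heavy-hitters `k` over a toy event list. Everything here is an assumption made for illustration: the event format `(x, y, token)`, the modulo mapping of positions to cells, and the "retained mass" metric are hypothetical and are not taken from the VSABrains code or its Exp4 proxies.

```python
from collections import Counter, defaultdict
from itertools import product


def keep_heavy_hitters(counts: Counter, k: int) -> Counter:
    """Keep only the k most frequent tokens in a cell (top-k truncation)."""
    return Counter(dict(counts.most_common(k)))


def sweep(events, grid_sizes, ks):
    """Return {(grid_size, k): fraction of observations kept after truncation}.

    events: list of (x, y, token) with integer x, y (assumed format).
    Positions wrap onto a grid_size x grid_size grid via modulo (assumption).
    """
    results = {}
    for g, k in product(grid_sizes, ks):
        cells = defaultdict(Counter)
        for x, y, token in events:
            cells[(x % g, y % g)][token] += 1
        kept = sum(sum(keep_heavy_hitters(c, k).values()) for c in cells.values())
        results[(g, k)] = kept / len(events) if events else 1.0
    return results


if __name__ == "__main__":
    toy = [(0, 0, "a"), (0, 0, "a"), (0, 0, "b"), (1, 0, "c")]
    for key, value in sorted(sweep(toy, [1, 2], [1, 2]).items()):
        print(key, value)
```

A smaller grid forces more tokens to share a cell, and a smaller `k` discards more of each cell's tail, so the retained fraction falls as either shrinks. A proxy like this measures information kept, not real memory usage, which is the distinction the gap above draws.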