VSABrains

Evaluation Status

Current validation state against DS003 criteria.

Summary

Full Success

Last full run: 2026-01-26T20:45:03Z. DS003 success criteria are met under the current evaluation configuration.

Recent updates: Exp4 consensus, work proxies, and compression sweep were refreshed on 2026-01-27.

Evaluation configuration notes: Exp2 time localization is measured via replay-based window verification against fast maps (not through a separate learned time model). Exp3 uses a deterministic extractor stub (no external LLM), so extractor stability and latency under real LLMs are not yet validated.

Exp6 uses a simulated semantic encoder to stress-test the frame/query pipeline for literature and dialogue.

Experiments

Click an experiment to load its dedicated page with context, interpretation, and visuals.

Known Gaps

Area | Gap | Impact
Exp3 Extraction | Extractor is deterministic stub, not a live LLM. | LLM variance, latency, and error rates not validated.
Exp2 Time Localization | Measured via replay verification, not a learned or independent time model. | Success reflects deterministic verification, not predictive time inference.
Compression Sweep | Compression sweep now covers grid size and heavy-hitters k, but only on Exp4 proxies. | Compression effects on Exp1/Exp2 and real memory usage remain unvalidated.
Exp6 Encoder | Semantic encoder is simulated; not a production‑grade model. | Frame quality reflects the generator, not a live LLM pipeline.