The Anti-Transformer Manifesto
Why Bitsets? Why CPU? Understanding the philosophy behind the architecture.
The Elephant in the Room
Modern AI is dominated by the Transformer architecture. Transformers are designed for GPUs, relying on massive dense matrix multiplications and self-attention that scales as O(N²) in sequence length. This makes them heavy, static, and opaque.
The Core Bet: Sparsity by Design
BSP starts from a simple premise: Real-world concepts are sparse.
Out of 100,000 possible words, a sentence only uses ~10. Out of 1,000,000 visual features, a scene only contains ~50.
Transformers represent this by taking a vector of 100,000 zeros and putting non-zero floats in 10 slots. Then they multiply this mostly-zero vector by a dense matrix. This is mathematically correct but computationally wasteful.
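To make the waste concrete, here is a minimal NumPy sketch of the dense path. The vocabulary size is the one from the text; the hidden dimension and the three active IDs (reused in the example below) are illustrative choices, not BSP internals.

```python
import numpy as np

VOCAB = 100_000   # possible words, as in the text
HIDDEN = 64       # hypothetical hidden dimension, kept small for the demo

# The sentence activates ~10 of 100,000 slots; everything else stays zero.
x = np.zeros(VOCAB, dtype=np.float32)
x[[42, 105, 88]] = 1.0

W = np.random.randn(VOCAB, HIDDEN).astype(np.float32)

# The dense matmul touches every row of W, even though 99.99% of x is zero.
y = x @ W
useful = 3 * HIDDEN    # multiply-adds fed by nonzero inputs
total = VOCAB * HIDDEN # multiply-adds actually performed
print(f"wasted multiply-adds: {1 - useful / total:.4%}")
```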
BSP represents this as a Set (Bitset).
- Input: `{ "cat", "eats", "fish" }` → IDs `[42, 105, 88]`
- Representation: A list of 3 integers.
- Operations: Intersection, Union, Difference.
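The same data as sets, in a minimal plain-Python sketch; the second input and the ID 311 are invented for illustration:

```python
cat_eats_fish  = {42, 105, 88}    # { "cat", "eats", "fish" }
cat_eats_mouse = {42, 105, 311}   # hypothetical second input; 311 = "mouse"

shared = cat_eats_fish & cat_eats_mouse   # Intersection -> {42, 105}
either = cat_eats_fish | cat_eats_mouse   # Union        -> {42, 88, 105, 311}
novel  = cat_eats_mouse - cat_eats_fish   # Difference   -> {311}

# The same sets as machine-word bitsets (Python ints as arbitrary-width bitmasks):
a = (1 << 42) | (1 << 105) | (1 << 88)
b = (1 << 42) | (1 << 105) | (1 << 311)
shared_bits = a & b   # one bitwise AND; each step touches only the active IDs
```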
Visual Comparison
Figure 1: Dense Matrix Multiplication vs. Sparse Set Intersection
The "Learner" Philosophy (MDL)
How does it "learn" without Backpropagation? BSP relies on the Minimum Description Length (MDL) principle: the best model of the data is the one that compresses it most.
The brain is essentially a compression engine.
- If I see "The cat eats" and I predict "fish", and the next word is indeed "fish", I am not surprised. My internal model already "compressed" that pattern.
- If the next word is "lasagna", I am surprised.
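In set terms, anticipating the loop below, the two cases look like this (a toy sketch with the model's prediction hard-coded):

```python
predicted = {"fish"}                  # what the internal model expects next

surprise = {"fish"} - predicted       # set(): already compressed, no surprise
surprise = {"lasagna"} - predicted    # {"lasagna"}: new information to encode
```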
The Online Learning Loop
- Predict what comes next based on current Groups.
- Measure Surprise: `Input \ Predicted` (the set difference).
- Minimize Future Surprise:
  - If the pattern repeats, create a new Group combining these elements.
  - If a Group predicted wrongly, weaken its link.
This is Online Learning. There is no "Training Run". Every interaction updates the model instantly.
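A minimal sketch of this loop follows. The `Group` dataclass, the `salience` field, the 0.9 decay, and the repeat threshold of 2 are illustrative assumptions, not the actual BSP API:

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Group:
    members: frozenset        # concept IDs this Group binds together
    salience: float = 0.5     # link strength; weakened when the Group mispredicts

class Learner:
    def __init__(self) -> None:
        self.groups: list[Group] = []
        self.repeats: Counter = Counter()  # candidate patterns, counted until they repeat

    def step(self, context: set, actual: set) -> set:
        # 1. Predict: Groups overlapping the context vote in their remaining members.
        predicted: set = set()
        for g in self.groups:
            if g.members & context:
                predicted |= g.members - context

        # 2. Measure Surprise: Input \ Predicted.
        surprise = actual - predicted

        # 3. Minimize Future Surprise.
        if surprise:
            candidate = frozenset(context | surprise)
            self.repeats[candidate] += 1
            if self.repeats[candidate] >= 2:        # "if the pattern repeats"
                self.groups.append(Group(candidate))
        for g in self.groups:
            wrong = (g.members - context) - actual  # members it voted in, in error
            if g.members & context and wrong:
                g.salience *= 0.9                   # "weaken its link"

        return surprise

brain = Learner()
brain.step({"the", "cat", "eats"}, {"fish"})  # -> {"fish"}: surprised
brain.step({"the", "cat", "eats"}, {"fish"})  # pattern repeats -> new Group
brain.step({"the", "cat", "eats"}, {"fish"})  # -> set(): predicted, no surprise
```

Note that the last three lines are the "learns your name in the first sentence" behavior in miniature: there is no separate training phase, only `step` calls.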
Why This Matters
By shifting from Dense/Float/GPU to Sparse/Int/CPU, we unlock:
- Run Anywhere: Raspberry Pi, Browser, Old Laptop.
- Continuous Adaptation: The model learns your name in the first sentence and remembers it in the second.
- Interpretability: You can look at `Group #402` and see exactly: `Members: {cat, dog, pet}`, `Salience: 0.8`. No magic numbers.
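As a toy illustration of that inspectability, reusing the illustrative `Group` shape from the loop sketch above (the field names and values mirror the example, not real model state):

```python
from dataclasses import dataclass

@dataclass
class Group:                # same illustrative shape as the loop sketch above
    members: frozenset
    salience: float

group_402 = Group(members=frozenset({"cat", "dog", "pet"}), salience=0.8)

print(group_402.members)    # frozenset({'cat', 'dog', 'pet'})
print(group_402.salience)   # 0.8 -- a readable set and one number, no opaque weights
```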