AGISystem2 is built on Hyperdimensional Computing (HDC), a computational paradigm that represents information as high-dimensional binary vectors. This foundation enables deterministic, explainable reasoning with mathematical guarantees.
- **Deterministic:** Unlike probabilistic neural networks, HDC operations are fully deterministic. The same input always produces the same output, enabling perfect reproducibility and debugging.
- **Compositional:** Complex structures can be built from simple parts using only two operations (Bind and Bundle). This enables systematic construction and deconstruction of knowledge.
- **Robust:** High-dimensional representations are naturally tolerant of noise and errors; small perturbations don't significantly affect similarity comparisons.
- **Efficient:** Core operations (XOR, majority vote) are extremely fast on modern hardware, and similarity is computed via simple bit counting.
The fundamental data structure is a hypervector - a binary vector with thousands of dimensions (default: 32,768 bits = 4KB).
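A minimal sketch of a hypervector in numpy (names like `random_hypervector` and the `DIM` constant are illustrative, not part of AGISystem2's API). One `uint8` per bit is used for clarity; a production system would bit-pack eight bits per byte to reach the stated 4 KB footprint.

```python
import numpy as np

DIM = 32_768  # default dimension: 32,768 bits (4 KB when bit-packed)

def random_hypervector(rng: np.random.Generator) -> np.ndarray:
    """Sample a dense binary hypervector, one uint8 per bit for clarity."""
    return rng.integers(0, 2, size=DIM, dtype=np.uint8)

rng = np.random.default_rng(seed=42)
hv = random_hypervector(rng)
print(hv.shape)  # (32768,)
```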
In high-dimensional spaces, randomly generated vectors are almost orthogonal to each other:
| Property | Value (32K dimensions) |
|---|---|
| Expected similarity of random vectors | 0.500 |
| Standard deviation | ~0.003 |
| 99% confidence interval | [0.492, 0.508] |
| Probability of sim > 0.55 | < 0.0001% |
Implication: Any randomly initialized vector is "far enough" from all others to serve as a unique symbol.
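The near-orthogonality property in the table above is easy to check empirically. The sketch below (function names are illustrative) samples random vector pairs and confirms that their similarity clusters tightly around 0.5 with a standard deviation of roughly 0.5/√32768 ≈ 0.003:

```python
import numpy as np

DIM = 32_768
rng = np.random.default_rng(0)

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Normalized Hamming similarity: 1 - hamming_distance / dimension."""
    return 1.0 - np.count_nonzero(a != b) / DIM

# Sample 200 random pairs and look at the similarity distribution.
sims = np.array([
    similarity(rng.integers(0, 2, DIM, dtype=np.uint8),
               rng.integers(0, 2, DIM, dtype=np.uint8))
    for _ in range(200)
])
print(round(sims.mean(), 3), round(sims.std(), 4))  # ~0.5, ~0.003
```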
Bind (XOR). Purpose: create associations and tag concepts with roles. A role-filler pair such as Agent XOR John tags a concept with its role, and a full fact is encoded as:

loves(John, Mary) = Loves XOR (Pos1 XOR John) XOR (Pos2 XOR Mary)

Bundle (majority vote). Purpose: store multiple items in one vector (superposition).

Key property: the bundled result is similar to all of its inputs, enabling content-addressable memory.
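The two operations can be sketched as follows (a minimal illustration, not AGISystem2's actual implementation; `hv`, `bind`, and `bundle` are assumed names). Bind is plain XOR; Bundle is a bitwise majority vote, shown here for an odd number of inputs so no ties occur:

```python
import numpy as np

DIM = 32_768
rng = np.random.default_rng(1)

def hv() -> np.ndarray:
    return rng.integers(0, 2, DIM, dtype=np.uint8)

def bind(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Bind = XOR: result is dissimilar to both inputs, and invertible."""
    return a ^ b

def bundle(*vectors: np.ndarray) -> np.ndarray:
    """Bundle = bitwise majority vote (odd vector count assumed)."""
    counts = np.sum(vectors, axis=0)
    return (counts * 2 > len(vectors)).astype(np.uint8)

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - np.count_nonzero(a != b) / DIM

A, B, C = hv(), hv(), hv()
s = bundle(A, B, C)
# The bundle stays similar to each input (~0.75 for three inputs),
# while a bind result is unrelated (~0.5) to its inputs.
print(round(similarity(s, A), 2), round(similarity(bind(A, B), A), 2))
```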
similarity(A, B) = 1 - (hamming_distance(A, B) / dimension)
Output interpretation:

- `1.0`: identical vectors
- `0.5`: unrelated (random)
- `0.0`: inverse (bitwise NOT)
| Similarity Range | Interpretation | Action |
|---|---|---|
| > 0.80 | Strong match | Trust the result |
| 0.65 - 0.80 | Good match | Probably correct |
| 0.55 - 0.65 | Weak match | Verify if critical |
| < 0.55 | Poor/no match | Don't trust |
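The "simple bit counting" claim is concrete when vectors are bit-packed: similarity is one XOR plus a popcount. A sketch using numpy's `packbits`/`unpackbits` (the packed layout is an assumption for illustration; AGISystem2's storage format may differ):

```python
import numpy as np

DIM = 32_768

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Hamming similarity on bit-packed uint8 arrays of DIM // 8 bytes."""
    diff = np.bitwise_xor(a, b)           # differing bits are 1s
    hamming = int(np.unpackbits(diff).sum())  # popcount over all bytes
    return 1.0 - hamming / DIM

rng = np.random.default_rng(7)
a = np.packbits(rng.integers(0, 2, DIM, dtype=np.uint8))
print(similarity(a, a))                   # identical -> 1.0
print(similarity(a, np.bitwise_not(a)))   # bitwise NOT -> 0.0
```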
XOR is commutative, so binding alone cannot encode argument order; dedicated position vectors (Pos1, Pos2, ...) distinguish argument positions.

All queries reduce to this fundamental equation:

Answer = KB XOR Query⁻¹

where Query⁻¹ is the partial query (everything except the hole). Because in XOR every value is its own inverse, the inverse of the partial query is just the partial query itself, so unbinding is one more XOR.
```
# Stored fact
fact = Loves XOR (Pos1 XOR John) XOR (Pos2 XOR Mary)

# Query: Who loves Mary?  (query variable ?who in position 1)
partial = Loves XOR (Pos2 XOR Mary)

# Unbind from KB
candidate = KB XOR partial
          = fact XOR partial        (if KB contains fact, plus noise from other facts)
          = (Pos1 XOR John) XOR noise

# Extract answer
raw = Pos1 XOR candidate = John XOR noise

# Find best match
answer = mostSimilar(raw, vocabulary) = "John"
```
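The walkthrough above runs end to end as plain numpy. This is a self-contained sketch (vector names, the single-fact KB, and `most_similar` are illustrative choices, not AGISystem2 internals); with one fact in the KB the unbinding is exact, so the answer comes back with similarity 1.0:

```python
import numpy as np

DIM = 32_768
rng = np.random.default_rng(3)

def hv() -> np.ndarray:
    return rng.integers(0, 2, DIM, dtype=np.uint8)

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - np.count_nonzero(a != b) / DIM

Loves, Pos1, Pos2 = hv(), hv(), hv()
vocabulary = {"John": hv(), "Mary": hv(), "Alice": hv()}

# Stored fact: loves(John, Mary)
fact = Loves ^ (Pos1 ^ vocabulary["John"]) ^ (Pos2 ^ vocabulary["Mary"])
KB = fact  # single-fact KB for clarity; a real KB bundles many facts

# Query: who loves Mary?  Unbind the partial query, then the position tag.
partial = Loves ^ (Pos2 ^ vocabulary["Mary"])
raw = Pos1 ^ (KB ^ partial)

def most_similar(raw: np.ndarray, vocab: dict) -> str:
    return max(vocab, key=lambda name: similarity(raw, vocab[name]))

print(most_similar(raw, vocabulary))  # -> John
```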
Bundle capacity determines how many facts can be stored before accuracy degrades:
| Facts in KB | Expected Similarity | Quality |
|---|---|---|
| 10 | ~0.66 | Excellent |
| 50 | ~0.57 | Good |
| 100 | ~0.55 | Usable |
| 200 | ~0.535 | Threshold |
| 500 | ~0.52 | Poor (noise) |
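The degradation curve in the table can be reproduced with a quick experiment (a sketch under the same majority-vote bundling as above; odd fact counts are used so the vote never ties, hence 9/49/199 rather than the table's round numbers):

```python
import numpy as np

DIM = 32_768
rng = np.random.default_rng(5)

def bundle(vecs: list) -> np.ndarray:
    """Bitwise majority vote; assumes an odd number of vectors (no ties)."""
    return (np.sum(vecs, axis=0) * 2 > len(vecs)).astype(np.uint8)

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - np.count_nonzero(a != b) / DIM

# Bundle n random "facts" and measure how similar the KB stays to one member.
results = {}
for n in (9, 49, 199):
    facts = [rng.integers(0, 2, DIM, dtype=np.uint8) for _ in range(n)]
    kb = bundle(facts)
    results[n] = similarity(kb, facts[0])
    print(n, round(results[n], 3))
```

The printed similarities fall with the fact count, in line with the table: roughly 0.64 at 9 facts, 0.56 at 49, and 0.53 at 199.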
Traditional HDC uses permutation (bit rotation) to encode argument order. We use position vectors instead because they behave correctly under dimension extension: when a 16K vector is extended to 32K by cloning, permuting then extending does not equal extending then permuting, whereas XOR-based position bindings extend correctly because XOR distributes over cloning.
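A toy demonstration of that asymmetry (a sketch that assumes "cloning" means duplicating each bit in place, via `np.repeat`; tiny 16-bit vectors stand in for real hypervectors). XOR applied element-wise commutes with cloning, while a rotation-style permutation does not:

```python
import numpy as np

# Toy 16-bit vectors (real hypervectors are thousands of bits).
a = np.array([0, 1] * 8, dtype=np.uint8)
b = np.array([0, 0, 1, 1] * 4, dtype=np.uint8)

def clone_extend(v: np.ndarray) -> np.ndarray:
    """Extend to twice the dimension by duplicating each bit in place."""
    return np.repeat(v, 2)

def permute(v: np.ndarray) -> np.ndarray:
    """Rotation-style HDC permutation: shift every bit by one position."""
    return np.roll(v, 1)

# XOR distributes over cloning: bind-then-extend == extend-then-bind.
xor_commutes = np.array_equal(clone_extend(a ^ b),
                              clone_extend(a) ^ clone_extend(b))

# Permutation does not: permute-then-extend != extend-then-permute.
perm_commutes = np.array_equal(clone_extend(permute(a)),
                               permute(clone_extend(a)))

print(xor_commutes, perm_commutes)  # True False
```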