AGISystem2 is built on Hyperdimensional Computing (HDC), a computational paradigm that represents information as high-dimensional binary vectors. This foundation enables deterministic, explainable reasoning with mathematical guarantees.
- **Deterministic:** Unlike probabilistic neural networks, HDC operations are fully deterministic. The same input always produces the same output, enabling perfect reproducibility and debugging.
- **Compositional:** Complex structures can be built from simple parts using only two operations (Bind and Bundle). This enables systematic construction and deconstruction of knowledge.
- **Robust:** High-dimensional representations are naturally tolerant of noise and errors; small perturbations don't significantly affect similarity comparisons.
- **Efficient:** Core operations (XOR, majority vote) are extremely fast on modern hardware, and similarity is computed via simple bit counting.
The fundamental data structure is a hypervector - a binary vector with thousands of dimensions (default: 32,768 bits = 4KB).
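A minimal sketch of a hypervector in numpy (names like `random_hypervector` and the `DIM` constant are illustrative, not part of AGISystem2's API). One `uint8` per bit is used for clarity; a production system would bit-pack eight bits per byte to reach the stated 4 KB footprint.

```python
import numpy as np

DIM = 32_768  # default dimension: 32,768 bits (4 KB when bit-packed)

def random_hypervector(rng: np.random.Generator) -> np.ndarray:
    """Sample a dense binary hypervector, one uint8 per bit for clarity."""
    return rng.integers(0, 2, size=DIM, dtype=np.uint8)

rng = np.random.default_rng(seed=42)
hv = random_hypervector(rng)
print(hv.shape)  # (32768,)
```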
In high-dimensional spaces, randomly generated vectors are almost orthogonal to each other:
| Property | Value (32K dimensions) |
|---|---|
| Expected similarity of random vectors | 0.500 |
| Standard deviation | ~0.003 |
| 99% confidence interval | [0.492, 0.508] |
| Probability of sim > 0.55 | < 0.0001% |
Implication: Any randomly initialized vector is "far enough" from all others to serve as a unique symbol.
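The near-orthogonality property in the table above is easy to check empirically. The sketch below (function names are illustrative) samples random vector pairs and confirms that their similarity clusters tightly around 0.5 with a standard deviation of roughly 0.5/√32768 ≈ 0.003:

```python
import numpy as np

DIM = 32_768
rng = np.random.default_rng(0)

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Normalized Hamming similarity: 1 - hamming_distance / dimension."""
    return 1.0 - np.count_nonzero(a != b) / DIM

# Sample 200 random pairs and look at the similarity distribution.
sims = np.array([
    similarity(rng.integers(0, 2, DIM, dtype=np.uint8),
               rng.integers(0, 2, DIM, dtype=np.uint8))
    for _ in range(200)
])
print(round(sims.mean(), 3), round(sims.std(), 4))  # ~0.5, ~0.003
```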
Bind (XOR). Purpose: create associations and tag concepts with roles. A role-filler pair such as Agent XOR John tags a concept with its role, and a full fact is encoded as:

loves(John, Mary) = Loves XOR (Pos1 XOR John) XOR (Pos2 XOR Mary)

Bundle (majority vote). Purpose: store multiple items in one vector (superposition).

Key property: the bundled result is similar to all of its inputs, enabling content-addressable memory.
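The two operations can be sketched as follows (a minimal illustration, not AGISystem2's actual implementation; `hv`, `bind`, and `bundle` are assumed names). Bind is plain XOR; Bundle is a bitwise majority vote, shown here for an odd number of inputs so no ties occur:

```python
import numpy as np

DIM = 32_768
rng = np.random.default_rng(1)

def hv() -> np.ndarray:
    return rng.integers(0, 2, DIM, dtype=np.uint8)

def bind(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Bind = XOR: result is dissimilar to both inputs, and invertible."""
    return a ^ b

def bundle(*vectors: np.ndarray) -> np.ndarray:
    """Bundle = bitwise majority vote (odd vector count assumed)."""
    counts = np.sum(vectors, axis=0)
    return (counts * 2 > len(vectors)).astype(np.uint8)

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - np.count_nonzero(a != b) / DIM

A, B, C = hv(), hv(), hv()
s = bundle(A, B, C)
# The bundle stays similar to each input (~0.75 for three inputs),
# while a bind result is unrelated (~0.5) to its inputs.
print(round(similarity(s, A), 2), round(similarity(bind(A, B), A), 2))
```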
similarity(A, B) = 1 - (hamming_distance(A, B) / dimension)
Output interpretation:

- `1.0`: identical vectors
- `0.5`: unrelated (random)
- `0.0`: inverse (bitwise NOT)
| Similarity Range | Interpretation | Action |
|---|---|---|
| > 0.80 | Strong match | Trust the result |
| 0.65 - 0.80 | Good match | Probably correct |
| 0.55 - 0.65 | Weak match | Verify if critical |
| < 0.55 | Poor/no match | Don't trust |
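The "simple bit counting" claim is concrete when vectors are bit-packed: similarity is one XOR plus a popcount. A sketch using numpy's `packbits`/`unpackbits` (the packed layout is an assumption for illustration; AGISystem2's storage format may differ):

```python
import numpy as np

DIM = 32_768

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Hamming similarity on bit-packed uint8 arrays of DIM // 8 bytes."""
    diff = np.bitwise_xor(a, b)           # differing bits are 1s
    hamming = int(np.unpackbits(diff).sum())  # popcount over all bytes
    return 1.0 - hamming / DIM

rng = np.random.default_rng(7)
a = np.packbits(rng.integers(0, 2, DIM, dtype=np.uint8))
print(similarity(a, a))                   # identical -> 1.0
print(similarity(a, np.bitwise_not(a)))   # bitwise NOT -> 0.0
```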
XOR is commutative, so binding alone cannot encode argument order; dedicated position vectors (Pos1, Pos2, ...) distinguish argument positions.

All queries reduce to this fundamental equation:

Answer = KB XOR Query⁻¹

where Query⁻¹ is the partial query (everything except the hole). Because in XOR every value is its own inverse, the inverse of the partial query is just the partial query itself, so unbinding is one more XOR.
```
# Stored fact
fact = Loves XOR (Pos1 XOR John) XOR (Pos2 XOR Mary)

# Query: Who loves Mary?  (query variable ?who in position 1)
partial = Loves XOR (Pos2 XOR Mary)

# Unbind from KB
candidate = KB XOR partial
          = fact XOR partial        (if KB contains fact, plus noise from other facts)
          = (Pos1 XOR John) XOR noise

# Extract answer
raw = Pos1 XOR candidate = John XOR noise

# Find best match
answer = mostSimilar(raw, vocabulary) = "John"
```
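The walkthrough above runs end to end as plain numpy. This is a self-contained sketch (vector names, the single-fact KB, and `most_similar` are illustrative choices, not AGISystem2 internals); with one fact in the KB the unbinding is exact, so the answer comes back with similarity 1.0:

```python
import numpy as np

DIM = 32_768
rng = np.random.default_rng(3)

def hv() -> np.ndarray:
    return rng.integers(0, 2, DIM, dtype=np.uint8)

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - np.count_nonzero(a != b) / DIM

Loves, Pos1, Pos2 = hv(), hv(), hv()
vocabulary = {"John": hv(), "Mary": hv(), "Alice": hv()}

# Stored fact: loves(John, Mary)
fact = Loves ^ (Pos1 ^ vocabulary["John"]) ^ (Pos2 ^ vocabulary["Mary"])
KB = fact  # single-fact KB for clarity; a real KB bundles many facts

# Query: who loves Mary?  Unbind the partial query, then the position tag.
partial = Loves ^ (Pos2 ^ vocabulary["Mary"])
raw = Pos1 ^ (KB ^ partial)

def most_similar(raw: np.ndarray, vocab: dict) -> str:
    return max(vocab, key=lambda name: similarity(raw, vocab[name]))

print(most_similar(raw, vocabulary))  # -> John
```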
Bundle capacity determines how many facts can be stored before accuracy degrades:
| Facts in KB | Expected Similarity | Quality |
|---|---|---|
| 10 | ~0.66 | Excellent |
| 50 | ~0.57 | Good |
| 100 | ~0.55 | Usable |
| 200 | ~0.535 | Threshold |
| 500 | ~0.52 | Poor (noise) |
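The degradation curve in the table can be reproduced with a quick experiment (a sketch under the same majority-vote bundling as above; odd fact counts are used so the vote never ties, hence 9/49/199 rather than the table's round numbers):

```python
import numpy as np

DIM = 32_768
rng = np.random.default_rng(5)

def bundle(vecs: list) -> np.ndarray:
    """Bitwise majority vote; assumes an odd number of vectors (no ties)."""
    return (np.sum(vecs, axis=0) * 2 > len(vecs)).astype(np.uint8)

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    return 1.0 - np.count_nonzero(a != b) / DIM

# Bundle n random "facts" and measure how similar the KB stays to one member.
results = {}
for n in (9, 49, 199):
    facts = [rng.integers(0, 2, DIM, dtype=np.uint8) for _ in range(n)]
    kb = bundle(facts)
    results[n] = similarity(kb, facts[0])
    print(n, round(results[n], 3))
```

The printed similarities fall with the fact count, in line with the table: roughly 0.64 at 9 facts, 0.56 at 49, and 0.53 at 199.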
Traditional HDC uses permutation (bit rotation) to encode argument order. We use position vectors instead because they behave correctly under dimension extension: when a 16K vector is extended to 32K by cloning, permuting then extending does not equal extending then permuting, whereas XOR-based position bindings extend correctly because XOR distributes over cloning.
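A toy demonstration of that asymmetry (a sketch that assumes "cloning" means duplicating each bit in place, via `np.repeat`; tiny 16-bit vectors stand in for real hypervectors). XOR applied element-wise commutes with cloning, while a rotation-style permutation does not:

```python
import numpy as np

# Toy 16-bit vectors (real hypervectors are thousands of bits).
a = np.array([0, 1] * 8, dtype=np.uint8)
b = np.array([0, 0, 1, 1] * 4, dtype=np.uint8)

def clone_extend(v: np.ndarray) -> np.ndarray:
    """Extend to twice the dimension by duplicating each bit in place."""
    return np.repeat(v, 2)

def permute(v: np.ndarray) -> np.ndarray:
    """Rotation-style HDC permutation: shift every bit by one position."""
    return np.roll(v, 1)

# XOR distributes over cloning: bind-then-extend == extend-then-bind.
xor_commutes = np.array_equal(clone_extend(a ^ b),
                              clone_extend(a) ^ clone_extend(b))

# Permutation does not: permute-then-extend != extend-then-permute.
perm_commutes = np.array_equal(clone_extend(permute(a)),
                               permute(clone_extend(a)))

print(xor_commutes, perm_commutes)  # True False
```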