Architecture Part 5

Identity and Interning

The machinery that converts human-readable strings into machine-optimizable integers.

The Dual Identity Problem

A reasoning engine faces two conflicting requirements:

Stability: An identifier must not change between sessions. "User X" today must be "User X" tomorrow.
Performance: Indexing into an array is fast; looking up a key in a hash map is slower. We want small, dense integers (0..N) that can serve directly as array indices.

Stable IDs (hashes) are large and sparse, so they cannot index an array directly. Dense IDs are small but transient: they are valid only within the session that assigned them. CNL-PL solves this with a two-tier system.

Tier 1: ConceptualID (Stable)

A ConceptualID is a 64-bit integer derived from a concept's kind and its canonical string:
ID = Hash(Kind, CanonicalString)

This ID is computed deterministically. If you load the same theory on two different machines, the Concept "user" will have the exact same ConceptualID. This allows for safe serialization and distributed computing.
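
The derivation can be sketched as follows. The text does not name the hash function, so this sketch assumes SHA-256 truncated to 64 bits; the function name conceptual_id and the length-prefixed encoding of the kind are also assumptions made for illustration. Python's built-in hash() would not work here because it is randomized per process for strings.

```python
import hashlib


def conceptual_id(kind: str, canonical: str) -> int:
    """Derive a stable 64-bit ID from (kind, canonical string).

    Assumption: SHA-256 truncated to 8 bytes stands in for the
    unspecified Hash function.
    """
    kind_bytes = kind.encode("utf-8")
    # Length-prefix the kind so ("ab", "c") and ("a", "bc") produce different payloads.
    payload = len(kind_bytes).to_bytes(4, "big") + kind_bytes + canonical.encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    # Keep the first 8 bytes as an unsigned 64-bit integer.
    return int.from_bytes(digest[:8], "big")
```

Because the result depends only on the input bytes, two machines that load the same theory compute the same value for ("Concept", "user") without exchanging any state.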

Tier 2: DenseID (Runtime)

When a session starts, we load ConceptualIDs into an Interner. The Interner assigns a sequential DenseID (0, 1, 2...) to each unique Concept it encounters.
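
A minimal sketch of such an Interner, assuming it only needs to hand out sequential integers and map them back. The method names intern and resolve are hypothetical, not taken from CNL-PL.

```python
class Interner:
    """Assigns sequential DenseIDs (0, 1, 2, ...) to ConceptualIDs."""

    def __init__(self) -> None:
        self._dense_by_conceptual: dict[int, int] = {}  # ConceptualID -> DenseID
        self._conceptual_by_dense: list[int] = []       # DenseID -> ConceptualID

    def intern(self, conceptual_id: int) -> int:
        """Return the DenseID for conceptual_id, assigning the next one if unseen."""
        dense = self._dense_by_conceptual.get(conceptual_id)
        if dense is None:
            dense = len(self._conceptual_by_dense)
            self._dense_by_conceptual[conceptual_id] = dense
            self._conceptual_by_dense.append(conceptual_id)
        return dense

    def resolve(self, dense_id: int) -> int:
        """Map a DenseID back to its ConceptualID by list indexing."""
        return self._conceptual_by_dense[dense_id]
```

The hash-map lookup happens once per concept at load time; after that, the engine can carry only the small integer. Note that the DenseID a concept receives depends on load order, which is why it must not be persisted.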

The Runtime Engine works exclusively with DenseIDs (EntityID, PredID), so a lookup is plain array indexing:
KB.relations[PredID].rows[EntityID]
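
To illustrate the indexing expression, here is a minimal sketch assuming each relation keeps one row per EntityID. The text does not say what a row contains; this sketch assumes it is a set of related EntityIDs. The Relation and KB classes and the LIKES, ALICE, and BOB constants are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Relation:
    # rows[entity_id] is assumed to hold the EntityIDs related to that entity.
    rows: list[set[int]] = field(default_factory=list)


@dataclass
class KB:
    # relations[pred_id] is the storage for one predicate.
    relations: list[Relation] = field(default_factory=list)


LIKES = 0          # hypothetical PredID
ALICE, BOB = 0, 1  # hypothetical EntityIDs

# One relation with an empty row for each of the two entities.
kb = KB(relations=[Relation(rows=[set(), set()])])

# Both steps are list indexing; no hashing of names is involved.
kb.relations[LIKES].rows[ALICE].add(BOB)
```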

The SymbolTable maintains the mapping: DenseID <-> ConceptualID <-> String. This lets the engine rely on array indexing internally while still being able to print readable error messages and export portable data.
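
The three-way mapping might look like the sketch below. It assumes ConceptualIDs are computed elsewhere and passed in as plain integers; the method names register, name_of, conceptual_of, and dense_of are hypothetical.

```python
class SymbolTable:
    """Keeps DenseID <-> ConceptualID <-> String lookups for one session."""

    def __init__(self) -> None:
        self._dense_of: dict[int, int] = {}  # ConceptualID -> DenseID
        self._conceptual: list[int] = []     # DenseID -> ConceptualID
        self._names: list[str] = []          # DenseID -> canonical string

    def register(self, conceptual_id: int, name: str) -> int:
        """Record a symbol and return its DenseID (existing one if already known)."""
        dense = self._dense_of.get(conceptual_id)
        if dense is None:
            dense = len(self._conceptual)
            self._dense_of[conceptual_id] = dense
            self._conceptual.append(conceptual_id)
            self._names.append(name)
        return dense

    def name_of(self, dense_id: int) -> str:
        """For readable error messages."""
        return self._names[dense_id]

    def conceptual_of(self, dense_id: int) -> int:
        """For exporting portable data."""
        return self._conceptual[dense_id]

    def dense_of(self, conceptual_id: int) -> int:
        """For importing data that was exported by another session."""
        return self._dense_of[conceptual_id]
```

On export, each DenseID is translated to its ConceptualID through conceptual_of; on import, dense_of translates back, so the session-specific numbering never leaves the process.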