The DSL-In/DSL-Out Principle:
Everything that enters Spock is a DSL script.
Everything that exits Spock is a DSL script.
No black boxes. Full transparency.
The Explainability Problem
❌ Neural Networks
Input: "Is Socrates mortal?"
Output: "Yes (0.94 confidence)"
Explanation: "I processed your query through 175 billion parameters across 96 transformer layers with attention heads that... [incomprehensible]"
✓ Spock AGISystem2
Input: DSL script
Output: DSL trace + score
Explanation: "Here's every step I took, as executable code you can verify."
The Dual Output Strategy
Every Spock API call returns two outputs:
Dual Output Architecture (FS-07)
DSL Input → Spock Engine → resultTheory + executionTrace
DSL Input:
  @fact Socrates Is Human
  @query fact Evaluate Truth
Spock Engine: Parse → Execute → Trace → Score → Format Output
resultTheory: clean conclusion, for production use
executionTrace: complete step log, for debug/eval
Output Format
Trace Anatomy
Each step in the trace records:
1. @step1 Socrates Bind Human
   Inputs: Socrates (vector), Human (vector)
   Verb: Bind (Hadamard product)
   Output: @step1 (new vector)
   Effect: Creates "Socrates-Human" relationship
2. @step2 step1 Move Mortal
   Inputs: @step1 (from previous), Mortal (vector)
   Verb: Move (vector addition)
   Output: @step2 (new vector)
   Effect: Transitions toward mortality concept
3. @result step2 Evaluate Truth
   Inputs: @step2 (previous result), Truth (canonical)
   Verb: Evaluate (cosine projection)
   Output: 0.89 (scalar)
   Effect: Measures truth alignment
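The three verbs in the trace above can be sketched on plain Float32Arrays. This is a minimal illustration, not Spock's actual kernel: the function names (bind, move, evaluate) are hypothetical, "cosine projection" is assumed here to mean cosine similarity, and the four-dimensional vectors are toy values, so the result does not reproduce the 0.89 score shown in the trace.

```javascript
// Hypothetical helper: the source does not say how length mismatches are handled.
function checkSameLength(a, b) {
  if (a.length !== b.length) {
    throw new Error(`Vector length mismatch: ${a.length} vs ${b.length}`);
  }
}

// Bind: element-wise (Hadamard) product of two vectors.
function bind(a, b) {
  checkSameLength(a, b);
  const out = new Float32Array(a.length);
  for (let i = 0; i < a.length; i++) out[i] = a[i] * b[i];
  return out;
}

// Move: element-wise vector addition.
function move(a, b) {
  checkSameLength(a, b);
  const out = new Float32Array(a.length);
  for (let i = 0; i < a.length; i++) out[i] = a[i] + b[i];
  return out;
}

// Evaluate: cosine similarity between a vector and a reference vector
// (an assumed reading of "cosine projection"). Returns 0 for a zero vector.
function evaluate(v, reference) {
  checkSameLength(v, reference);
  let dot = 0;
  let normV = 0;
  let normR = 0;
  for (let i = 0; i < v.length; i++) {
    dot += v[i] * reference[i];
    normV += v[i] * v[i];
    normR += reference[i] * reference[i];
  }
  const denom = Math.sqrt(normV) * Math.sqrt(normR);
  return denom === 0 ? 0 : dot / denom;
}

// Toy vectors standing in for the symbols in the trace (illustrative values only).
const socrates = Float32Array.of(1, -1, 1, 1);
const human = Float32Array.of(1, 1, -1, 1);
const mortal = Float32Array.of(1, -1, -1, 1);
const truth = Float32Array.of(1, -1, -1, 1);

const step1 = bind(socrates, human);   // @step1 Socrates Bind Human
const step2 = move(step1, mortal);     // @step2 step1 Move Mortal
const result = evaluate(step2, truth); // @result step2 Evaluate Truth

console.log(result);
```

Each function is pure: it allocates a new output array and never mutates its inputs, which is what lets a logged step be re-run in isolation.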
Replayability
Any execution trace can be replayed to verify results:
Trace Replay Verification
Original Run (score: 0.89) → export trace → trace.dsl → replay → Replayed Run (score: 0.89 ✓)
Same seed + same trace = identical results
Determinism Guarantees
For full reproducibility, Spock supports seeded random generation:
const engine = createSpockEngine({
  dimensions: 512,
  seed: 42
});
Component | Deterministic? | How
Vector generation | ✓ Yes (with seed) | Mulberry32 PRNG with configurable seed
Symbol resolution | ✓ Yes | Fixed LIFO order through overlays
Execution order | ✓ Yes | Topological sort of dependency graph
Kernel operations | ✓ Yes | Pure functions on Float32Arrays
Planning | ✓ Yes (mostly) | Greedy selection; random_restart adds non-determinism
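To make the seeded vector generation concrete, here is a sketch using the widely circulated Mulberry32 algorithm. How Spock maps PRNG output onto vector components is not stated in this document, so generateVector and its uniform range of roughly [-1, 1] are assumptions for illustration only.

```javascript
// Mulberry32: a small 32-bit PRNG. Returns a function that yields
// floats in [0, 1); the same seed always yields the same sequence.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: fills a Float32Array with values spread over
// roughly [-1, 1]. The real component distribution is an assumption.
function generateVector(dimensions, rng) {
  const v = new Float32Array(dimensions);
  for (let i = 0; i < dimensions; i++) v[i] = rng() * 2 - 1;
  return v;
}

// Two generators built from the same seed produce identical vectors.
const first = generateVector(512, mulberry32(42));
const second = generateVector(512, mulberry32(42));
const identical = first.every((x, i) => x === second[i]);

console.log(identical ? "identical vectors" : "vectors differ");
```

Because the generator's only state is the seed-derived integer, recording the seed alongside a trace is enough to regenerate the same vectors on replay, provided symbols are generated in the same order.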
Evaluation Framework
The dual output enables automated evaluation:
Automated Evaluation Pipeline
Test Case (input.dsl, expected.dsl) → Spock Engine (run with seed) → resultTheory → Compare (actual vs expected) → ✓ PASS / ✗ FAIL
DSL output comparison enables exact regression testing
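The compare step of the pipeline above could look something like the following sketch. The names normalizeDsl and compareDsl are hypothetical, and the whitespace normalization is an assumption made for convenience; drop it if a byte-exact comparison is wanted.

```javascript
// Splits a DSL script into lines, trims each line, collapses runs of
// whitespace, and drops blank lines. (Assumed normalization rules.)
function normalizeDsl(script) {
  return script
    .split(/\r?\n/)
    .map((line) => line.trim().replace(/\s+/g, " "))
    .filter((line) => line.length > 0);
}

// Compares actual and expected DSL output line by line and reports
// every differing line (1-based), using null for a missing line.
function compareDsl(actual, expected) {
  const a = normalizeDsl(actual);
  const e = normalizeDsl(expected);
  const mismatches = [];
  const count = Math.max(a.length, e.length);
  for (let i = 0; i < count; i++) {
    if (a[i] !== e[i]) {
      mismatches.push({
        line: i + 1,
        expected: e[i] ?? null,
        actual: a[i] ?? null,
      });
    }
  }
  return { pass: mismatches.length === 0, mismatches };
}

// Example: expected.dsl content versus a resultTheory string from a run.
const expectedDsl = "@conclusion Query HasTruth 0.89";
const actualDsl = "  @conclusion   Query HasTruth 0.89\n";
const report = compareDsl(actualDsl, expectedDsl);

console.log(report.pass ? "PASS" : "FAIL", report.mismatches);
```

Comparing text rather than raw vectors keeps failures readable: a mismatch report points at the DSL line that changed instead of a numeric difference in a high-dimensional array.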
API Response Structure
{
  success: true,
  score: 0.89,
  resultTheory: "@conclusion Query HasTruth 0.89",
  executionTrace: "@step1 ... @step2 ... @result ...",
  symbols: Map { ... },
  scores: { truth: 0.89, confidence: 0.78 },
  trace: { ... }
}
Why This Matters
Spock AGISystem2 is fully auditable because:
Every input is a parseable DSL script
Every output is a parseable DSL script
Every step is logged with inputs, verb, and output
Traces can be replayed for verification
Deterministic seeds enable exact reproduction
No hidden state, no opaque transformations
For Researchers
This architecture enables:
Research Activity | How Spock Helps
Debugging reasoning | Step through trace, see exactly where it went wrong
Regression testing | Compare DSL outputs across versions
Ablation studies | Remove/modify steps in trace, replay, measure impact
Hybrid systems | LLM generates DSL, Spock executes with full trace
Formal verification | DSL can be analyzed statically for properties