The DSL-In/DSL-Out Principle:
Everything that enters Spock is a DSL script.
Everything that exits Spock is a DSL script.
No black boxes. Full transparency.

The Explainability Problem

❌ Neural Networks

Input: "Is Socrates mortal?"

Output: "Yes (0.94 confidence)"

Explanation: "I processed your query through 175 billion parameters across 96 transformer layers with attention heads that... [incomprehensible]"

✓ Spock AGISystem2

Input: DSL script

Output: DSL trace + score

Explanation: "Here's every step I took, as executable code you can verify."

The Dual Output Strategy

Every Spock API call returns two outputs:

Figure: Dual Output Architecture (FS-07)
DSL Input:
  @fact Socrates Is Human
  @query fact Evaluate Truth
Spock Engine: Parse → Execute → Trace → Score → Format
Output:
  resultTheory: clean conclusion, for production use
  executionTrace: complete step log, for debug/eval

Output Format

resultTheory (Production Output)
@conclusion Query HasTruth 0.92 # Clean, minimal, actionable
executionTrace (Debug Output)
# Full execution trace - replayable!
@step1 Socrates Identity _   # Create Socrates vector
@step2 Human Identity _      # Create Human vector
@step3 step1 Bind Is         # Bind with Is relation
@step4 step3 Add step2       # Add Human concept
@fact step4 Normalise _      # Normalize result
@eval fact Distance Truth    # Project onto Truth
@result eval Persist _       # → 0.92
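Both outputs use the same line shape, `@name subject Verb object` with an optional `#` comment, so a consumer can read the score out of resultTheory with a very small parser. The sketch below is illustrative only: `parseStatement` is a hypothetical helper, not part of the Spock API, and it assumes every statement has exactly four whitespace-separated tokens, as in the examples above.

```javascript
// Hypothetical helper (not a Spock API): split one DSL line of the form
//   @name subject Verb object   # optional comment
function parseStatement(line) {
  const hash = line.indexOf("#");
  const code = (hash === -1 ? line : line.slice(0, hash)).trim();
  const comment = hash === -1 ? "" : line.slice(hash + 1).trim();
  const tokens = code.split(/\s+/);
  // Assumption: a statement is exactly four tokens and starts with "@".
  if (tokens.length !== 4 || !tokens[0].startsWith("@")) {
    throw new Error(`Unrecognised statement: ${line}`);
  }
  const [name, subject, verb, object] = tokens;
  return { name: name.slice(1), subject, verb, object, comment };
}

// Read the production output shown above.
const conclusion = parseStatement(
  "@conclusion Query HasTruth 0.92  # Clean, minimal, actionable"
);
const conclusionScore = Number(conclusion.object);
console.log(conclusion.verb, conclusionScore);
```

The same function handles trace lines such as `@step1 Socrates Identity _`, where `_` fills the unused object slot.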

Trace Anatomy

Each step in the trace records:

1. @step1 Socrates Bind Human
   Inputs: Socrates (vector), Human (vector)
   Verb: Bind (Hadamard product)
   Output: @step1 (new vector)
   Effect: Creates "Socrates-Human" relationship
2. @step2 step1 Move Mortal
   Inputs: @step1 (from previous), Mortal (vector)
   Verb: Move (vector addition)
   Output: @step2 (new vector)
   Effect: Transitions toward mortality concept
3. @result step2 Evaluate Truth
   Inputs: @step2 (previous result), Truth (canonical)
   Verb: Evaluate (cosine projection)
   Output: 0.89 (scalar)
   Effect: Measures truth alignment
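The three verbs in this anatomy map onto simple element-wise operations. The sketch below works through them on tiny hand-picked vectors so the arithmetic is easy to check. It is an assumption-laden illustration: the function names `bind`, `move`, and `evaluate` are hypothetical, "cosine projection" is read here as plain cosine similarity, and the 4-dimensional vectors are made up rather than generated by Spock.

```javascript
// Bind: Hadamard (element-wise) product of two vectors.
function bind(a, b) {
  const out = new Float32Array(a.length);
  for (let i = 0; i < a.length; i++) out[i] = a[i] * b[i];
  return out;
}

// Move: element-wise vector addition.
function move(a, b) {
  const out = new Float32Array(a.length);
  for (let i = 0; i < a.length; i++) out[i] = a[i] + b[i];
  return out;
}

// Evaluate: cosine similarity (assumed meaning of "cosine projection").
function evaluate(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Illustrative 4-dimensional vectors (not real Spock symbols).
const socrates = Float32Array.of(1, -1, 1, -1);
const human = Float32Array.of(1, 1, -1, -1);
const mortal = Float32Array.of(0.5, 0.5, 0.5, 0.5);
const truth = Float32Array.of(1, 0, 0, 0);

const step1 = bind(socrates, human); // [1, -1, -1, 1]
const step2 = move(step1, mortal);   // [1.5, -0.5, -0.5, 1.5]
const result = evaluate(step2, truth); // 1.5 / sqrt(5), about 0.67
console.log(Array.from(step1), Array.from(step2), result);
```

Each function returns a new array and leaves its inputs untouched, which matches the "pure functions on Float32Arrays" property that makes every step independently checkable.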

Replayability

Any execution trace can be replayed to verify results:

Figure: Trace Replay Verification
Original run (score: 0.89) → export trace (trace.dsl) → replay → replayed run (score: 0.89) ✓
Same seed + same trace = identical results
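Replay verification can be sketched end to end with a toy interpreter: generate symbol vectors from a seed, execute the trace, then run it again and compare. Everything here is a stand-in under stated assumptions. `runTrace`, `symbolVector`, and the FNV-1a name hash are hypothetical choices for illustration; Mulberry32 is the PRNG this page names, but how Spock actually derives a per-symbol stream from the seed is not specified here.

```javascript
// Mulberry32: a small seeded PRNG that returns floats in [0, 1).
function mulberry32(seed) {
  let a = seed | 0;
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// FNV-1a string hash. Assumption: used only to give each symbol its own stream.
function fnv1a(text) {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0;
}

// Same name + same seed always yields the same vector in [-1, 1).
function symbolVector(name, seed, dimensions) {
  const rand = mulberry32(fnv1a(name) ^ seed);
  return Float32Array.from({ length: dimensions }, () => rand() * 2 - 1);
}

function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

const VERBS = {
  Bind: (a, b) => a.map((x, i) => x * b[i]),
  Move: (a, b) => a.map((x, i) => x + b[i]),
  Evaluate: cosine,
};

// Toy interpreter: executes "@name subject Verb object" lines in order.
function runTrace(traceText, seed, dimensions = 512) {
  const env = new Map();
  const resolve = (token) => {
    if (!env.has(token)) env.set(token, symbolVector(token, seed, dimensions));
    return env.get(token);
  };
  let last;
  for (const line of traceText.trim().split("\n")) {
    const [name, subject, verb, object] = line.trim().split(/\s+/);
    if (!(verb in VERBS)) throw new Error(`Unknown verb: ${verb}`);
    last = VERBS[verb](resolve(subject), resolve(object));
    env.set(name.slice(1), last);
  }
  return last;
}

const trace = [
  "@step1 Socrates Bind Human",
  "@step2 step1 Move Mortal",
  "@result step2 Evaluate Truth",
].join("\n");

const original = runTrace(trace, 42);
const replayed = runTrace(trace, 42);
console.log(original === replayed); // true: same seed + same trace
```

With random symbol vectors the score itself is arbitrary; the point of the sketch is that the comparison is exact equality, not a tolerance check.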

Determinism Guarantees

For full reproducibility, Spock supports seeded random generation:

const engine = createSpockEngine({
  dimensions: 512,
  seed: 42 // Deterministic vector generation
});
// With same seed, "Cat" always produces the same vector
// Traces are perfectly reproducible
Component | Deterministic? | How
Vector generation | ✓ Yes (with seed) | Mulberry32 PRNG with configurable seed
Symbol resolution | ✓ Yes | Fixed LIFO order through overlays
Execution order | ✓ Yes | Topological sort of dependency graph
Kernel operations | ✓ Yes | Pure functions on Float32Arrays
Planning | ✓ Yes (mostly) | Greedy selection; random_restart adds non-determinism
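A topological sort is only deterministic if ties between independent statements are broken the same way every time. The sketch below uses Kahn's algorithm and breaks ties by declaration order. It is one plausible reading of the "Execution order" row, not Spock's actual scheduler; `executionOrder` and the statement object shape are hypothetical.

```javascript
// statements: [{ name, subject, object }], where subject/object may name
// another statement (a dependency) or an external symbol such as "Socrates".
function executionOrder(statements) {
  const names = new Set(statements.map((s) => s.name));
  const indegree = new Map();
  const dependents = new Map();
  for (const s of statements) dependents.set(s.name, []);

  for (const s of statements) {
    // Only references to other statements count as dependencies.
    const deps = new Set([s.subject, s.object].filter((t) => names.has(t)));
    indegree.set(s.name, deps.size);
    for (const d of deps) dependents.get(d).push(s.name);
  }

  // Ready queue seeded in declaration order: this is the tie-break rule.
  const ready = statements
    .filter((s) => indegree.get(s.name) === 0)
    .map((s) => s.name);
  const order = [];
  while (ready.length > 0) {
    const current = ready.shift();
    order.push(current);
    for (const next of dependents.get(current)) {
      indegree.set(next, indegree.get(next) - 1);
      if (indegree.get(next) === 0) ready.push(next);
    }
  }
  if (order.length !== statements.length) {
    throw new Error("Dependency cycle detected");
  }
  return order;
}

// Statements declared out of order still resolve to a single fixed order.
const order = executionOrder([
  { name: "step4", subject: "step3", object: "step2" },
  { name: "step3", subject: "step1", object: "Is" },
  { name: "step1", subject: "Socrates", object: "_" },
  { name: "step2", subject: "Human", object: "_" },
]);
console.log(order.join(" → ")); // step1 → step2 → step3 → step4
```

Because the queue is first-in, first-out and seeded in declaration order, the same script always produces the same schedule, which is what trace replay relies on.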

Evaluation Framework

The dual output enables automated evaluation:

Figure: Automated Evaluation Pipeline
Test case (input.dsl, expected.dsl) → Spock Engine (run with seed → resultTheory) → compare actual vs expected → ✓ PASS / ✗ FAIL
DSL output comparison enables exact regression testing
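The comparison step of such a pipeline can be as simple as a line-by-line diff of two DSL strings. The sketch below is a hypothetical harness, not a Spock tool: `normalizeDsl` and `compareDsl` are made-up names, and the choice to ignore comments, blank lines, and extra whitespace before comparing is an assumption about what "exact" should mean for a regression test.

```javascript
// Strip "#" comments, collapse whitespace, and drop empty lines.
function normalizeDsl(text) {
  return text
    .split("\n")
    .map((line) => line.replace(/#.*$/, "").trim().replace(/\s+/g, " "))
    .filter((line) => line.length > 0);
}

// Compare two DSL outputs; report the first differing normalized line.
function compareDsl(actual, expected) {
  const a = normalizeDsl(actual);
  const e = normalizeDsl(expected);
  const length = Math.max(a.length, e.length);
  for (let i = 0; i < length; i++) {
    if (a[i] !== e[i]) {
      return {
        pass: false,
        line: i + 1, // 1-based index into the normalized lines
        actual: a[i] ?? null,
        expected: e[i] ?? null,
      };
    }
  }
  return { pass: true };
}

const expectedDsl = "@conclusion Query HasTruth 0.89";
const sameRun = compareDsl("@conclusion Query HasTruth 0.89  # seed 42", expectedDsl);
const driftedRun = compareDsl("@conclusion Query HasTruth 0.92", expectedDsl);
console.log(sameRun.pass ? "✓ PASS" : "✗ FAIL");
console.log(driftedRun.pass ? "✓ PASS" : "✗ FAIL", driftedRun);
```

Comparing text rather than floating-point scores means a changed step name or a reordered statement fails the test even when the final number happens to match.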

API Response Structure

{
  success: true,
  score: 0.89, // Quick numeric indicator
  resultTheory: "@conclusion Query HasTruth 0.89",
  executionTrace: "@step1 ... @step2 ... @result ...",
  // Legacy/debug fields
  symbols: Map { ... }, // All computed symbols
  scores: { truth: 0.89, confidence: 0.78 },
  trace: { ... } // Structured trace object
}

Why This Matters

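Spock AGISystem2 is fully auditable because every input and output is a DSL script, every execution step is recorded in a replayable trace, and seeded execution makes results reproducible.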

For Researchers

This architecture enables:

Research Activity | How Spock Helps
Debugging reasoning | Step through trace, see exactly where it went wrong
Regression testing | Compare DSL outputs across versions
Ablation studies | Remove/modify steps in trace, replay, measure impact
Hybrid systems | LLM generates DSL, Spock executes with full trace
Formal verification | DSL can be analyzed statically for properties