Induction is the process of learning general rules from specific examples. Given positive and negative examples, an induction engine discovers rules that classify the examples correctly, allowing the knowledge base to grow from data.

Status: Advanced reasoning is partial (DS06). Induction workflows are research-level and not fully implemented in the runtime.

1. Theoretical Foundation

Inductive Logic Programming (ILP):

Given: B (background knowledge), E⁺ (positive examples), E⁻ (negative examples)
Find: H (hypothesis) such that: B ∪ H ⊢ E⁺ and B ∪ H ⊬ E⁻

The learned hypothesis H must entail all positive examples while not entailing any negative examples.
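
Stated as a search problem, this is what the engine has to solve. The sketch below is schematic only: the types and the entails callback stand in for the knowledge-base machinery rather than reflecting the actual runtime.

// Schematic statement of the ILP task; Fact, Hypothesis, and `entails` are placeholders.
interface ILPTask<Fact> {
  background: Fact[];   // B
  positives: Fact[];    // E⁺
  negatives: Fact[];    // E⁻
}

// H is acceptable iff B ∪ H entails every positive example and no negative one.
function acceptable<Fact, Hypothesis>(
  task: ILPTask<Fact>,
  h: Hypothesis,
  entails: (background: Fact[], hypothesis: Hypothesis, example: Fact) => boolean
): boolean {
  return task.positives.every(e => entails(task.background, h, e)) &&
         task.negatives.every(e => !entails(task.background, h, e));
}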

Comparison of Inference Modes:

Mode       Given            Infer    Example
Deduction  Rule + Case      Result   All birds fly + Tweety is a bird → Tweety flies
Abduction  Rule + Result    Case     All birds fly + X flies → X might be a bird
Induction  Cases + Results  Rule     Robin flies, Sparrow flies, Eagle flies → Birds fly

2. Induction Process

Rule Induction Process:

Positive Examples (E⁺):  canFly Robin ✓   canFly Eagle ✓   canFly Sparrow ✓
Negative Example (E⁻):   canFly Penguin ✗

Background (B):          isA Robin Bird   isA Eagle Bird   isA Penguin Bird

Generalization:
  1. Find common properties
  2. Form candidate rules
  3. Test coverage
  4. Check negatives
  5. Rank by simplicity

Candidate Rules:
  H₁: Implies (isA ?x Bird) (canFly ?x)
      Coverage: 75% (3/4), Wrong: 1
  H₂: Implies (And (isA ?x Bird) (Not (isA ?x Penguin))) (canFly ?x)

Selected Rule (H₂): 100% coverage, 0 false positives. Exception learned: Penguins don't fly.
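
To make the coverage and negative checks concrete, here is a small scoring sketch (illustration only; the example encoding and the rule functions are assumptions, and coverage follows the definition given in section 4):

// Illustration: score the two candidate rules against the labeled examples.
interface Example { entity: string; isBird: boolean; isPenguin: boolean; canFly: boolean; }
type CandidateRule = (e: Example) => boolean;   // does the rule predict canFly?

const examples: Example[] = [
  { entity: 'Robin',   isBird: true, isPenguin: false, canFly: true  },
  { entity: 'Eagle',   isBird: true, isPenguin: false, canFly: true  },
  { entity: 'Sparrow', isBird: true, isPenguin: false, canFly: true  },
  { entity: 'Penguin', isBird: true, isPenguin: true,  canFly: false }
];

const h1: CandidateRule = e => e.isBird;                   // H₁
const h2: CandidateRule = e => e.isBird && !e.isPenguin;   // H₂ with the penguin exception

function scoreRule(rule: CandidateRule, data: Example[]) {
  const predicted = data.filter(e => rule(e));
  const truePositives  = predicted.filter(e => e.canFly).length;
  const falsePositives = predicted.filter(e => !e.canFly).length;
  const allPositives   = data.filter(e => e.canFly).length;
  return { coverage: truePositives / allPositives, falsePositives };
}

console.log(scoreRule(h1, examples)); // { coverage: 1, falsePositives: 1 }
console.log(scoreRule(h2, examples)); // { coverage: 1, falsePositives: 0 }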

3. Algorithm: Rule Induction

Algorithm: induce(E⁺, E⁻, B)
  1. Initialize: H = most general hypothesis (covers everything)
  2. Specialize: While H covers any negative example:
    • Find specializations of H that exclude negatives
    • Select best specialization (by information gain)
    • H = selected specialization
  3. Verify: Check that H still covers all positives
  4. Simplify: Remove redundant conditions
  5. Return: H with confidence score
Specialization Operators:
  • Add a condition (literal) to the rule body
  • Bind a variable to a constant (e.g. ?x → Penguin)
  • Unify two distinct variables
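
A rough sketch of the specialize-until-consistent loop above (illustration only, not the runtime implementation; the hypothesis and example types and all callbacks are placeholders):

// Minimal sketch of the specialize-until-consistent loop (illustration only).
// H = hypothesis type, E = example type; all callbacks are assumed helpers.
function induceSketch<H, E>(
  positives: E[],
  negatives: E[],
  mostGeneral: H,
  specializations: (h: H) => H[],
  covers: (h: H, e: E) => boolean,
  gain: (h: H, pos: E[], neg: E[]) => number
): H {
  let h = mostGeneral;                         // 1. start with the most general hypothesis

  // 2. specialize while any negative example is still covered
  while (negatives.some(e => covers(h, e))) {
    const candidates = specializations(h);     //    apply the operators listed above
    if (candidates.length === 0) break;        //    nothing left to try
    h = candidates.reduce((best, c) =>
      gain(c, positives, negatives) > gain(best, positives, negatives) ? c : best);
  }

  // 3. verify all positives are still covered; steps 4-5 (simplify, score) omitted here
  if (!positives.every(e => covers(h, e))) {
    throw new Error('No consistent hypothesis found with these operators');
  }
  return h;
}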

4. Evaluation Metrics

Coverage and Accuracy:

Coverage = |{e ∈ E⁺ | B ∪ H ⊢ e}| / |E⁺|
Accuracy = (TP + TN) / (TP + TN + FP + FN)
Metric            Formula            Meaning
Coverage          TP / (TP + FN)     What fraction of positives are covered?
Precision         TP / (TP + FP)     What fraction of predictions are correct?
Information Gain  H(D) − H(D|A)      How much does condition A reduce uncertainty?
MDL Score         L(H) + L(D|H)      Hypothesis size + exceptions (prefer simpler)
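
As a quick illustration (not part of the engine API), the table's metrics can be computed directly from true/false positive and negative counts:

// Illustration only: evaluation metrics from a confusion matrix.
interface Counts { tp: number; fp: number; tn: number; fn: number; }

const coverage  = (c: Counts) => c.tp / (c.tp + c.fn);
const precision = (c: Counts) => c.tp / (c.tp + c.fp);
const accuracy  = (c: Counts) => (c.tp + c.tn) / (c.tp + c.tn + c.fp + c.fn);

// Binary entropy, the H(·) used in the information-gain row above
const entropy = (p: number) =>
  p <= 0 || p >= 1 ? 0 : -(p * Math.log2(p) + (1 - p) * Math.log2(1 - p));

// Example: a rule that covers 47 of 50 examples correctly and 3 incorrectly
const stats: Counts = { tp: 47, fp: 3, tn: 0, fn: 0 };
console.log(coverage(stats));   // 1.0  - every positive example is covered
console.log(precision(stats));  // 0.94 - 3 of the 50 predictions are wrong
console.log(entropy(47 / 50));  // ≈ 0.33 bits of label uncertainty, H(D)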

5. Learning Strategies

Induction Strategies:

Top-Down (General-to-Specific)
  Start:    most general rule
  Action:   add conditions to exclude negative examples
  Examples: FOIL, Progol

Bottom-Up (Specific-to-General)
  Start:    specific examples
  Action:   generalize common patterns across examples
  Examples: GOLEM, CIGOL

Hybrid
  Combine both approaches: generalize from seed examples, then specialize to fit negatives
  Example:  AGISystem2
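
As a small illustration of the bottom-up direction (a simplified sketch, not the GOLEM or CIGOL algorithms; the atom encoding is an assumption), two ground facts can be generalized by replacing differing constants with variables:

// Simplified bottom-up generalization of two ground atoms (illustration only).
// A full least general generalization would reuse one variable per constant pair.
interface Atom { operator: string; args: string[]; }

function generalize(a: Atom, b: Atom): Atom | null {
  if (a.operator !== b.operator || a.args.length !== b.args.length) return null;
  const args = a.args.map((arg, i) => (arg === b.args[i] ? arg : '?v' + i));
  return { operator: a.operator, args };
}

console.log(generalize(
  { operator: 'parent', args: ['Alice', 'Bob'] },
  { operator: 'parent', args: ['Bob', 'Dave'] }
));
// => { operator: 'parent', args: ['?v0', '?v1'] }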

6. Handling Noise and Exceptions

Real-World Challenge: Data often contains noise (mislabeled examples) and exceptions (genuine outliers). Strict ILP fails on noisy data.

Probabilistic Relaxation:
// Instead of: H must cover ALL positives and NO negatives
// Allow: H may miss some positives and cover some negatives

Criterion: maximize(Coverage(H) - α × FalsePositives(H) - β × Complexity(H))

// Example: "Birds fly" covers 95% of examples
// Better than: "Birds except penguins, ostriches, emus, kiwis... fly"
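
A minimal sketch of this relaxed criterion (the weights and the statistics below are assumptions chosen for illustration):

// Illustration of score(H) = Coverage(H) - α·FalsePositiveRate(H) - β·Complexity(H).
interface HypothesisStats {
  coverage: number;           // fraction of positives covered (0..1)
  falsePositiveRate: number;  // fraction of examples wrongly covered (0..1)
  complexity: number;         // e.g. number of conditions or listed exceptions
}

const relaxedScore = (h: HypothesisStats, alpha = 1.0, beta = 0.05) =>
  h.coverage - alpha * h.falsePositiveRate - beta * h.complexity;

// "Birds fly": covers every flying bird, but also the ~5% of birds that don't fly
const simpleRule    = { coverage: 1.0, falsePositiveRate: 0.05, complexity: 1 };
// "Birds except penguins, ostriches, emus, kiwis ... fly": exact but much longer
const exceptionRule = { coverage: 1.0, falsePositiveRate: 0.0,  complexity: 5 };

console.log(relaxedScore(simpleRule));    // 1.0 - 0.05 - 0.05 = 0.90
console.log(relaxedScore(exceptionRule)); // 1.0 - 0.00 - 0.25 = 0.75 -> simpler rule wins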
Strategy           Description                          When to Use
Hard constraints   No false positives allowed           Clean data, critical domains
Soft constraints   Allow some error for simpler rules   Noisy data, pattern discovery
Exception lists    Learn rule + explicit exceptions     Few outliers, interpretability
Confidence scores  Attach probability to rules          Uncertain domains

7. Example: Learning Family Relations

Background Knowledge:
parent Alice Bob
parent Alice Carol
parent Bob Dave
parent Carol Eve
male Bob
male Dave
female Alice
female Carol
female Eve
Positive Examples:
grandparent Alice Dave ✓
grandparent Alice Eve ✓
Negative Examples:
grandparent Bob Dave ✗
grandparent Alice Bob ✗
Induced Rule:
@r_induced Implies (And (parent ?x ?y) (parent ?y ?z)) (grandparent ?x ?z)

// "X is grandparent of Z if X is parent of Y and Y is parent of Z"
// Coverage: 100%, Precision: 100%
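
Expressed through the InductionEngine API described in section 10, this example looks roughly like the sketch below (the exact encoding of the background facts is an assumption; only the induce call itself follows the API reference):

import { InductionEngine } from 'agisystem2/reasoning';

// Sketch: learn the grandparent rule from the family-relations example above.
const engine = new InductionEngine(session);

const rules = await engine.induce({
  positives: [
    { operator: 'grandparent', args: ['Alice', 'Dave'] },
    { operator: 'grandparent', args: ['Alice', 'Eve'] }
  ],
  negatives: [
    { operator: 'grandparent', args: ['Bob', 'Dave'] },
    { operator: 'grandparent', args: ['Alice', 'Bob'] }
  ],
  background: session.kb   // holds the parent/male/female facts listed above
});

// Expected rule, matching @r_induced:
//   Implies (And (parent ?x ?y) (parent ?y ?z)) (grandparent ?x ?z)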

8. Integration with Knowledge Base

Rule Integration Pipeline:
  1. Learn: Induce candidate rules from examples
  2. Validate: Test on held-out examples
  3. Check consistency: Verify no conflicts with existing KB
  4. Assign confidence: Based on coverage and validation
  5. Add to KB: With provenance marking
  6. Monitor: Track rule performance over time
// Induced rules are marked with provenance
@r_learned_001 Implies (isA ?x Bird) (canFly ?x)
  ; induced_from: examples/birds.sys2
  ; confidence: 0.95
  ; coverage: 47/50
  ; exceptions: [Penguin, Ostrich, Emu]
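
A sketch of the pipeline in code: the induce and validate calls follow the API reference in section 10, while the per-rule report shape, rule.id, and kb.addRule are hypothetical placeholders for KB machinery not documented here.

// Pipeline sketch: learn -> validate -> add with provenance (steps 3 and 6 omitted).
// report[rule.id], kb.addRule, and the metadata shape are hypothetical placeholders.
const candidates = await engine.induce(trainingExamples, { minCoverage: 0.8 });
const report = await engine.validate(candidates, heldOutExamples);

for (const rule of candidates) {
  const stats = report[rule.id];          // hypothetical per-rule validation statistics
  if (stats.coverage < 0.8) continue;     // reject rules that fail on held-out data

  session.kb.addRule(rule, {              // hypothetical KB call with provenance marking
    inducedFrom: 'examples/birds.sys2',
    confidence: stats.coverage,
    exceptions: stats.exceptions
  });
}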

9. Minimum Description Length (MDL)

MDL Principle: The best hypothesis is the one that minimizes the total description length.

MDL(H, D) = L(H) + L(D|H)
MDL Trade-off:
H₁: "Birds fly"
    L(H₁) = 2 tokens
    L(D|H₁) = 3 exceptions = 3 tokens
    MDL = 5

H₂: "Birds except penguins, ostriches, emus fly"
    L(H₂) = 5 tokens
    L(D|H₂) = 0 exceptions = 0 tokens
    MDL = 5

// Equal MDL - but H₁ is preferred for generalization
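
The arithmetic of the trade-off can be written out directly (the token counts are the ones from the example above):

// MDL(H, D) = L(H) + L(D|H): hypothesis length plus length of the exceptions it leaves.
const mdl = (hypothesisTokens: number, exceptionTokens: number) =>
  hypothesisTokens + exceptionTokens;

const h1 = mdl(2, 3);   // "Birds fly" (2 tokens) + 3 exceptions          => 5
const h2 = mdl(5, 0);   // "Birds except penguins, ostriches, emus fly"   => 5

console.log(h1, h2);    // 5 5 - equal MDL; ties go to the shorter, more general hypothesis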

10. API Reference

import { InductionEngine } from 'agisystem2/reasoning';

const engine = new InductionEngine(session);

// Learn from examples
const rules = await engine.induce({
  positives: [
    { operator: 'canFly', args: ['Robin'] },
    { operator: 'canFly', args: ['Eagle'] }
  ],
  negatives: [
    { operator: 'canFly', args: ['Penguin'] }
  ],
  background: session.kb
});

// With options (examples: the same { positives, negatives, background } object as above)
const tunedRules = await engine.induce(examples, {
  maxRules: 5,              // Max rules to learn
  minCoverage: 0.8,         // Minimum coverage required
  maxComplexity: 3,         // Max conditions per rule
  allowExceptions: true,    // Learn exception lists
  noiseThreshold: 0.05      // Tolerate 5% noise
});

// Validate learned rules
const validation = await engine.validate(rules, testExamples);

11. Related Documentation