Theory Consistency & Validation
Ensuring semantic integrity when composing massive, heterogeneous knowledge graphs.
The Integration Challenge
CNL-PL is designed to be modular. A "Theory" is rarely a monolith; it is a composition of layers. You typically start with a Base theory (time, simple agents), layer on a Domain theory (e.g., Legal or Medical), and finally import specific Ontologies (e.g., FIBO for finance or SNOMED for medicine).
This composition creates a massive challenge: Vocabulary Collision.
In the Semantic Web (RDF/OWL), collisions are handled via Namespaces (URIs). foaf:Person and schema:Person are technically distinct entities. However, in a Natural Language environment, users do not speak in URIs. They say "Every person is an agent."
If the system contains two definitions of "person", the statement becomes ambiguous. Does the user mean the FOAF definition or the Schema definition? Or the union of both?
CNL-PL takes a Unified Dictionary Approach. Within a single session, a word (token) must have a single, deterministic meaning. This requires a rigorous validation phase to detect and resolve conflicts before the system accepts any user input. This phase is handled by the Theory Checker.
The Validation Gate
The check-theories tool acts as the compiler's strict gatekeeper. It simulates the full loading process of a theory bundle—expanding all Load: directives, parsing every file, and building a virtual dictionary—to identify structural and semantic flaws.
It categorizes findings into Critical Errors (which prevent the system from starting) and Warnings (which allow startup but indicate potential logic bombs).
Critical Semantic Errors
These issues represent fundamental ambiguities that make parsing mathematically impossible or non-deterministic.
Type-Binary Predicate Conflict
Error
The Scenario: You import an ontology where "Group" is a class (owl:Class), and another where "Group" is a property (owl:ObjectProperty, e.g., "grouping items").
"Group" is a type.
"Group" is a binary predicate.
The Consequence: When the user writes "Group X", the parser cannot decide if this is a Noun Phrase (Subject: "The Group X...") or a Verb Phrase (Command: "Please group X..."). This breaks the SVO parsing guarantee.
The Fix: One of the terms must be renamed using a preprocessor directive: RenamePredicate: "group" -> "group-items".
Load-Time Rename Conflict
Error
The Scenario: You attempt to resolve a conflict, but you provide contradictory instructions.
RenameType: "agent" -> "actor".
RenameType: "agent" -> "proxy".
The Consequence: The compiler does not know which target word to use. The resulting dictionary state is undefined.
The Fix: Ensure a 1-to-1 mapping for all renames. Remove the conflicting directive.
Semantic Warnings
These issues allow the system to run, but they often indicate that the underlying logic may behave differently than the user expects. Ignoring them is dangerous for critical systems.
Duplicate Type with Different Parents
Warning
The Scenario:
File A: "Dog" is a subtype of "Animal".
File B: "Dog" is a subtype of "Pet".
The Consequence: The compiler merges these definitions. Dog becomes a subtype of BOTH Animal and Pet.
In many cases, this is acceptable (Multiple Inheritance). However, if Animal and Pet were intended to be disjoint trees, this merge effectively "bridges" two ontologies that might not be compatible. It can lead to surprising inferences where a Dog inherits properties from a context (e.g., "Asset") that the user didn't intend.
Duplicate Predicate with Different Constraints
Warning
The Scenario:
File A: The domain of "has member" is "Organization".
File B: The domain of "has member" is "Group".
The Consequence: The runtime must decide how to validate a sentence like X has member Y.
Depending on the strictness mode, it might enforce the Intersection (X must be BOTH an Organization and a Group) or the Union. This creates a "Shadow Schema" that exists nowhere in the source text but emerges from the collision. This is a common source of "Why is my valid data being rejected?" bugs.
Ambiguous Predicate Arity
Warning
The Scenario: "Run" is defined as a Unary Predicate ("Server runs") and a Binary Predicate ("Server runs Process").
The Consequence: While some parsers can handle this via lookahead, it significantly increases the "cognitive load" of the grammar and can lead to edge cases where an object is optional, causing the parser to misinterpret a following prepositional phrase as an argument. It is cleaner to separate them: "is running" vs "executes".
Benign Duplicates
Not all red text is bad. In a large modular system, some overlap is inevitable and harmless.
- Duplicate Type Declaration: If File A and File B both say
"Person" is a type(and nothing else), this is perfectly fine. The system treats them as the same concept. This is a common pattern when multiple modules depend on a shared core (like `base.cnl`). - Duplicate Predicate Declaration: Similar to types, if the constraints are identical (or absent), redefining a predicate is a no-op operation.
The Theory Checker reports these as "Benign Duplicates" or low-severity warnings just to make you aware of the coupling, but they do not affect runtime logic.
Resolution Strategies
When you encounter these errors, you have three main weapons:
- Renaming (The Scalpel): Use
RenameTypeorRenamePredicatein your loading file to shift one of the colliding concepts to a new name. This preserves the logic but disambiguates the vocabulary. - Alignment (The Bridge): Create a dedicated
alignment.cnlfile. Instead of renaming, explicitly declare the relationship:Rule: If X is a foaf-person, then X is a schema-person.This accepts the duplication but formalizes the translation. - Exclusion (The Hammer): If an imported ontology brings in too much noise, consider creating a "Mini" version (subset) of that ontology containing only the terms you strictly need, rather than importing the full 10,000-term library.