Story Quality Metrics

SCRIPTA evaluates story quality with a set of metrics. These metrics detect structural issues, measure coherence, and flag generated stories that are random or incoherent.

Why Metrics Matter

AI-generated stories can suffer from various problems:

Random character assignment - Characters appear once and vanish
Location hopping - Every scene in a different place with no logic
Orphan references - Actions mention entities that don't exist
Incomplete scenes - Missing characters, locations, or actions
Unused relationships - Defined connections never manifest
Empty dialogues - Placeholder conversations without purpose

SCRIPTA's metrics system detects these issues automatically, giving you immediate feedback on story quality.

The Evaluate Button

Click "Evaluate" in the Metrics panel after generating a story to see detailed quality analysis. The browser console shows additional debugging information.

Narrative Quality Score (NQS)

The Narrative Quality Score is the master metric that combines all quality dimensions into a single 0-100% score.

NQS Components

Component	Weight	What It Measures
Completeness	12%	Are required elements present? (characters, locations, themes, arc)
Coherence Score	12%	Entity usage and structural coherence
Character Drift	8%	Character trait consistency (lower is better)
Originality	8%	Variety of narrative blocks and actions
Emotional Arc	10%	Arc coverage and mood usage
Parse Success	8%	Valid CNL syntax in output
Constraints	8%	Constraint satisfaction accuracy
Explainability	8%	Documentation level (arc, themes, rules, wisdom)
Coherence Analysis	26%	Character continuity, location logic, action coherence, scene completeness
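
The weights above sum to 100%, so the NQS can be read as a weighted average of the components. The following is a minimal Python sketch of that idea, not SCRIPTA's actual implementation. It assumes every component has already been normalized to a 0-1 score where higher is better (Character Drift would need to be inverted first, and how SCRIPTA does that is not documented here). The dictionary keys and the function name are hypothetical.

```python
# Weights copied from the NQS Components table; they sum to 1.0.
NQS_WEIGHTS = {
    "completeness": 0.12,
    "coherence_score": 0.12,
    "character_drift": 0.08,   # assumed to be inverted so that higher is better
    "originality": 0.08,
    "emotional_arc": 0.10,
    "parse_success": 0.08,
    "constraints": 0.08,
    "explainability": 0.08,
    "coherence_analysis": 0.26,
}


def narrative_quality_score(components: dict[str, float]) -> float:
    """Return the weighted average of normalized component scores (0-1).

    Assumption: every component is in [0, 1] with higher meaning better.
    Components missing from the input count as 0.
    """
    return sum(weight * components.get(name, 0.0)
               for name, weight in NQS_WEIGHTS.items())


# Hypothetical story: solid everywhere except the coherence analysis.
example = {name: 0.8 for name in NQS_WEIGHTS}
example["coherence_analysis"] = 0.5
nqs = narrative_quality_score(example)
print(f"NQS: {nqs:.1%}")
```

Because Coherence Analysis carries 26% of the weight, a weak result there lowers the total more than any other single component.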

Interpreting NQS

< 40%: Severe issues, likely random/incoherent generation
40-60%: Multiple problems, needs significant improvement
60-75%: Acceptable, some issues to address
75-85%: Good quality, minor refinements needed
> 85%: Excellent, well-structured story

Summary Metrics

Completeness

Measures whether all required story elements are present:

At least 2 characters (20%)
At least 2 locations (10%)
At least 3 scenes (25%)
At least 1 theme (10%)
Emotional arc with 3+ beats (10%)
At least 2 dialogues (15%)
At least 2 relationships (10%)

Threshold: >= 80%
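
The checklist above can be sketched as a table of minimum counts and percentage points. This is an illustrative Python sketch under the assumption that each requirement is all-or-nothing (no partial credit); the key names are hypothetical and not taken from SCRIPTA.

```python
# (count key, minimum required, percentage points) from the list above.
COMPLETENESS_RULES = [
    ("characters", 2, 20),
    ("locations", 2, 10),
    ("scenes", 3, 25),
    ("themes", 1, 10),
    ("arc_beats", 3, 10),
    ("dialogues", 2, 15),
    ("relationships", 2, 10),
]


def completeness(counts: dict[str, int]) -> int:
    """Return completeness in percentage points (0-100).

    Assumption: a requirement earns its full points once the minimum
    count is reached, and nothing otherwise.
    """
    return sum(points
               for key, minimum, points in COMPLETENESS_RULES
               if counts.get(key, 0) >= minimum)


# Hypothetical story that defines only two emotional arc beats.
counts = {"characters": 4, "locations": 3, "scenes": 8, "themes": 2,
          "arc_beats": 2, "dialogues": 5, "relationships": 6}
score = completeness(counts)
print(score, "pass" if score >= 80 else "fail")
```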

Coherence Score (CS)

Measures structural coherence through entity usage and relationships:

Entity usage ratio (40%) - Are defined characters/locations actually used?
Relationship coverage (30%) - Are relationships defined between characters?
Block usage (30%) - Are narrative blocks used in scenes?

Threshold: >= 75%

Emotional Arc Profile (EAP)

Measures how well the emotional journey is defined:

Arc beat coverage (50%) - How many arc beats are assigned?
Mood variety (25%) - Are different moods used?
Mood usage (25%) - Are moods applied to scenes?

Threshold: >= 70%

Detailed Analysis Metrics

Character Attribute Drift (CAD)

Estimates trait consistency from the number of traits defined for each character. Lower is better.

3+ traits per character: 0.05 (excellent)
2 traits per character: 0.10 (good)
1 trait per character: 0.15 (minimal)
0 traits: 0.25 (poor)

Threshold: <= 0.15
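
The lookup above can be sketched as follows. How SCRIPTA aggregates per-character values into one story-level number is not stated, so this Python sketch assumes a simple mean over characters; the function names and the sample cast are hypothetical.

```python
def drift_for_trait_count(trait_count: int) -> float:
    """Map a character's number of defined traits to a drift value."""
    if trait_count >= 3:
        return 0.05  # excellent
    if trait_count == 2:
        return 0.10  # good
    if trait_count == 1:
        return 0.15  # minimal
    return 0.25      # poor


def character_attribute_drift(traits_by_character: dict[str, list[str]]) -> float:
    """Assumption: the story-level CAD is the mean over all characters.

    A story with no characters is treated as the worst case (0.25).
    """
    if not traits_by_character:
        return 0.25
    values = [drift_for_trait_count(len(traits))
              for traits in traits_by_character.values()]
    return sum(values) / len(values)


cast = {"Mara": ["brave", "stubborn", "curious"], "Ilan": ["loyal"]}
cad = character_attribute_drift(cast)
print(round(cad, 3), "pass" if cad <= 0.15 else "fail")
```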

Compliance Adherence Rate (CAR)

Percentage of references that point to existing entities. Orphan references (pointing to non-existent entities) lower this score.

Threshold: >= 95%
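
As a rough illustration, the rate is the share of references that resolve to a defined entity. The Python sketch below assumes references and entities are plain name strings and that a story with no references scores 100%; neither detail is confirmed by SCRIPTA, and the names are hypothetical.

```python
def reference_validity_rate(references: list[str], known_entities: set[str]) -> float:
    """Return the fraction of references that point to a known entity.

    Assumption: with no references at all, nothing is orphaned, so the
    rate is 1.0.
    """
    if not references:
        return 1.0
    valid = sum(1 for ref in references if ref in known_entities)
    return valid / len(references)


known = {"Mara", "Ilan", "Harbor", "Lantern"}
refs = ["Mara", "Harbor", "Ilan", "Ghost"]

rate = reference_validity_rate(refs, known)
orphans = sorted(set(refs) - known)
print(f"{rate:.0%}", orphans)
```

Listing the orphans alongside the rate makes it clear which entities need to be added or renamed.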

Originality Index (OI)

Measures variety in the narrative:

Block type variety (40%) - Different narrative blocks used
Action variety (30%) - Different action types used
Theme variety (30%) - Multiple themes defined

Threshold: >= 50%

CNL Parse Success Rate (CPSR)

Percentage of CNL lines that follow valid syntax. Tests whether the specification can be parsed correctly.

Threshold: >= 90%

Constraint Satisfaction Accuracy (CSA)

Ratio of valid references to total references. Similar to CAR but focused on constraint validation.

Threshold: >= 95%

Retrieval Quality (RQ)

Measures naming quality - entities should have meaningful names (3+ characters, not "undefined").

Threshold: >= 80%
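
A naming check of this kind could look like the Python sketch below. Trimming whitespace, ignoring case when comparing against "undefined", and scoring an empty list as 0 are assumptions made for the example, not documented SCRIPTA behavior.

```python
def has_meaningful_name(name) -> bool:
    """True for a string of 3+ characters that is not 'undefined'."""
    if not isinstance(name, str):
        return False
    cleaned = name.strip()
    return len(cleaned) >= 3 and cleaned.lower() != "undefined"


def retrieval_quality(names: list) -> float:
    """Return the fraction of entity names that pass the naming check.

    Assumption: a story with no entities scores 0.0.
    """
    if not names:
        return 0.0
    return sum(1 for name in names if has_meaningful_name(name)) / len(names)


names = ["Mara", "undefined", "X", "Old Harbor", None]
rq = retrieval_quality(names)
print(f"{rq:.0%}", "pass" if rq >= 0.80 else "fail")
```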

Explainability Score

Measures documentation level (0-5 rating):

Arc defined: +0.75
Themes defined: +0.75
Relationships defined: +0.75
World rules defined: +0.75
Emotional arc (3+ beats): +0.75
Wisdom entries: +0.75
Patterns defined: +0.50

Threshold: >= 3.5/5
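
The additive rating above can be sketched as a series of presence checks. In this Python sketch the story is a plain dictionary with hypothetical keys; it only illustrates the arithmetic and is not SCRIPTA's code.

```python
def explainability_score(story: dict) -> float:
    """Return a 0-5 rating built from the bonuses listed above.

    The keys (arc, themes, relationships, world_rules, arc_beats,
    wisdom, patterns) are hypothetical names for this example.
    """
    score = 0.0
    if story.get("arc"):
        score += 0.75
    if story.get("themes"):
        score += 0.75
    if story.get("relationships"):
        score += 0.75
    if story.get("world_rules"):
        score += 0.75
    if len(story.get("arc_beats", [])) >= 3:
        score += 0.75
    if story.get("wisdom"):
        score += 0.75
    if story.get("patterns"):
        score += 0.50
    return score


# Hypothetical story with only two emotional arc beats.
story = {
    "arc": "hero's journey",
    "themes": ["loss", "trust"],
    "relationships": [("Mara", "Ilan")],
    "world_rules": ["magic has a cost"],
    "arc_beats": ["calm", "fear"],
    "wisdom": ["patience wins"],
    "patterns": ["rule of three"],
}
rating = explainability_score(story)
print(rating, "pass" if rating >= 3.5 else "fail")
```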

Coherence Analysis

These six metrics specifically detect random, inconsistent, or incoherent generation. They are the primary defense against "garbage" output.

Detecting Bad Generation

If coherence metrics average below 40%, the story is likely random noise rather than meaningful narrative. Check the console for detailed diagnostics.

Character Continuity

Question: Do characters appear consistently across scenes?

Characters in multiple scenes (50%)
Hero/protagonist presence throughout (50%)

Threshold: >= 60%

Low score indicates: Characters randomly assigned per scene, no continuity
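
One possible reading of the two halves is sketched below in Python. It assumes the first half is the share of characters that appear in two or more scenes and the second half is the share of scenes that include the hero; SCRIPTA's exact formulas may differ, and the names are hypothetical.

```python
def character_continuity(scenes: list[set[str]], hero: str) -> float:
    """Combine recurring-character ratio and hero presence, 50% each.

    Assumptions: each scene is the set of character names in it, and a
    story with no scenes or no characters scores 0.0.
    """
    if not scenes:
        return 0.0
    cast = set().union(*scenes)
    if not cast:
        return 0.0
    # Share of characters that appear in at least two scenes.
    recurring = sum(1 for name in cast
                    if sum(1 for scene in scenes if name in scene) >= 2)
    recurring_ratio = recurring / len(cast)
    # Share of scenes in which the hero is present.
    hero_ratio = sum(1 for scene in scenes if hero in scene) / len(scenes)
    return 0.5 * recurring_ratio + 0.5 * hero_ratio


scenes = [{"Mara", "Ilan"}, {"Mara"}, {"Mara", "Tove"}, {"Ilan"}]
continuity = character_continuity(scenes, hero="Mara")
print(f"{continuity:.0%}", "pass" if continuity >= 0.60 else "fail")
```

In this example two of three characters recur and the hero is in three of four scenes, so the story clears the 60% threshold.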

Location Logic

Question: Are locations reused logically or is it random hopping?

Locations appearing 2+ times (50%)
Average location usage vs expected (50%)

Threshold: >= 50%

Low score indicates: Every scene in a different location, no spatial coherence

Action Coherence

Question: Do actions reference known entities?

Action subject is a known character
Action target is a known character, location, or object

Threshold: >= 80%

Low score indicates: Actions reference entities that don't exist in libraries
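
The two checks above can be sketched in Python as follows. The sketch assumes an action counts as coherent only when both its subject and its target are known, and that a story with no actions scores 100%; the Action type and the entity names are hypothetical.

```python
from typing import NamedTuple


class Action(NamedTuple):
    subject: str
    verb: str
    target: str


def action_coherence(actions: list[Action], characters: set[str],
                     locations: set[str], objects: set[str]) -> float:
    """Return the fraction of actions whose subject and target are known.

    Assumption: with no actions there is nothing incoherent, so 1.0.
    """
    if not actions:
        return 1.0
    valid_targets = characters | locations | objects
    coherent = sum(1 for action in actions
                   if action.subject in characters
                   and action.target in valid_targets)
    return coherent / len(actions)


actions = [
    Action("Mara", "enters", "Harbor"),
    Action("Mara", "takes", "Lantern"),
    Action("Ghost", "follows", "Mara"),  # unknown subject
    Action("Ilan", "opens", "Vault"),    # unknown target
]
coherence = action_coherence(actions, characters={"Mara", "Ilan"},
                             locations={"Harbor"}, objects={"Lantern"})
print(f"{coherence:.0%}", "pass" if coherence >= 0.80 else "fail")
```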

Scene Completeness

Question: Does each scene have minimum required elements?

A complete scene needs:

At least one character
At least one location
At least one action or dialogue

Threshold: >= 70%

Low score indicates: Scenes are fragments, missing essential elements
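
The completeness rule can be expressed as a simple predicate per scene. The Scene structure in this Python sketch is hypothetical and only mirrors the three requirements listed above.

```python
from dataclasses import dataclass, field


@dataclass
class Scene:
    characters: list[str] = field(default_factory=list)
    locations: list[str] = field(default_factory=list)
    actions: list[str] = field(default_factory=list)
    dialogues: list[str] = field(default_factory=list)


def is_complete(scene: Scene) -> bool:
    """A scene needs a character, a location, and an action or dialogue."""
    return (bool(scene.characters)
            and bool(scene.locations)
            and bool(scene.actions or scene.dialogues))


def scene_completeness(scenes: list[Scene]) -> float:
    """Return the fraction of complete scenes (0.0 if there are none)."""
    if not scenes:
        return 0.0
    return sum(1 for scene in scenes if is_complete(scene)) / len(scenes)


story_scenes = [
    Scene(["Mara"], ["Harbor"], actions=["Mara enters Harbor"]),
    Scene(["Mara", "Ilan"], ["Market"], dialogues=["bargain"]),
    Scene(["Ilan"], [], actions=["Ilan waits"]),  # no location
]
ratio = scene_completeness(story_scenes)
print(f"{ratio:.0%}", "pass" if ratio >= 0.70 else "fail")
```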

Relationship Usage

Question: Are defined relationships actually used in the story?

For each relationship (A -> B), do A and B appear in the same scene?
Unused relationships are just decoration

Threshold: >= 50%

Low score indicates: Relationships defined but never manifested in narrative
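
The co-occurrence test above can be sketched in a few lines of Python. Relationships are modeled as (A, B) name pairs and scenes as sets of character names; both representations are assumptions made for the example, as is scoring a story with no relationships at 0.

```python
def relationship_usage(relationships: list[tuple[str, str]],
                       scenes: list[set[str]]) -> float:
    """Return the fraction of relationships whose two characters share
    at least one scene.

    Assumption: a story with no relationships scores 0.0.
    """
    if not relationships:
        return 0.0
    used = sum(1 for a, b in relationships
               if any(a in scene and b in scene for scene in scenes))
    return used / len(relationships)


scenes = [{"Mara", "Ilan"}, {"Mara"}, {"Mara", "Tove"}, {"Ilan"}]
relationships = [("Mara", "Ilan"), ("Mara", "Tove"), ("Ilan", "Tove")]

usage = relationship_usage(relationships, scenes)
print(f"{usage:.0%}", "pass" if usage >= 0.50 else "fail")
```

Here Ilan and Tove never share a scene, so their relationship counts as unused.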

Dialogue Quality

Question: Are dialogues well-structured and purposeful?

Each dialogue scores 0-1 based on:

Has purpose defined (+0.25)
Has 2+ participants (+0.25)
Has at least 1 exchange (+0.25)
Exchanges have content (intent or sketch) (+0.25)

Threshold: >= 60%

Low score indicates: Dialogues are empty placeholders without structure
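
The four quarter-point checks can be sketched as below. This Python example represents a dialogue as a dictionary with hypothetical keys, assumes that every exchange must carry an intent or a sketch to earn the last quarter point, and averages the per-dialogue scores; SCRIPTA may aggregate differently.

```python
def dialogue_score(dialogue: dict) -> float:
    """Score one dialogue from 0 to 1 in steps of 0.25."""
    score = 0.0
    if dialogue.get("purpose"):
        score += 0.25
    if len(dialogue.get("participants", [])) >= 2:
        score += 0.25
    exchanges = dialogue.get("exchanges", [])
    if exchanges:
        score += 0.25
    # Assumption: all exchanges must have content, not just one of them.
    if exchanges and all(ex.get("intent") or ex.get("sketch") for ex in exchanges):
        score += 0.25
    return score


def dialogue_quality(dialogues: list[dict]) -> float:
    """Assumption: the metric is the mean score over all dialogues."""
    if not dialogues:
        return 0.0
    return sum(dialogue_score(d) for d in dialogues) / len(dialogues)


dialogues = [
    {"purpose": "reveal the debt", "participants": ["Mara", "Ilan"],
     "exchanges": [{"intent": "accuse"}, {"sketch": "Ilan deflects"}]},
    {"participants": ["Mara", "Tove"], "exchanges": []},  # placeholder
]
quality = dialogue_quality(dialogues)
print(f"{quality:.1%}", "pass" if quality >= 0.60 else "fail")
```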

Diagnostic Interpretation

Overall Coherence Assessment

Average Score	Interpretation	Action
< 30%	Garbage - Random/incoherent	Regenerate with better specification
30-50%	Poor - Major structural issues	Review entities and structure
50-70%	Fair - Some coherence	Refine weak areas
70-85%	Good - Mostly coherent	Minor adjustments
> 85%	Excellent - Well-structured	Ready for expansion

Specific Issue Detection

Low Metric	Likely Problem	Solution
Character Continuity < 40%	Characters randomly assigned	Ensure hero appears in most scenes
Location Logic < 30%	Random scene hopping	Reuse key locations (home, workplace)
Action Coherence < 60%	Orphan entity references	Add missing characters/objects to libraries
Scene Completeness < 50%	Incomplete scene fragments	Ensure each scene has who/where/what
Relationship Usage < 30%	Relationships never used	Place related characters in same scenes
Dialogue Quality < 40%	Empty placeholder dialogues	Add purpose and exchanges to dialogues
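
The two tables above can be combined into a small diagnostic helper. The Python sketch below is illustrative only: it assumes scores are 0-1 fractions, that boundary values of 30%, 50%, and 70% fall into the higher band, and that exactly 85% still counts as Good. The function name and the sample scores are hypothetical.

```python
# metric -> (cutoff, suggested fix), from the Specific Issue Detection table.
ISSUES = {
    "Character Continuity": (0.40, "Ensure hero appears in most scenes"),
    "Location Logic": (0.30, "Reuse key locations (home, workplace)"),
    "Action Coherence": (0.60, "Add missing characters/objects to libraries"),
    "Scene Completeness": (0.50, "Ensure each scene has who/where/what"),
    "Relationship Usage": (0.30, "Place related characters in same scenes"),
    "Dialogue Quality": (0.40, "Add purpose and exchanges to dialogues"),
}


def diagnose(metrics: dict[str, float]) -> tuple[str, str, list[str]]:
    """Return (band label, recommended action, fixes for flagged metrics)."""
    if not metrics:
        raise ValueError("at least one coherence metric is required")
    average = sum(metrics.values()) / len(metrics)
    if average < 0.30:
        label, action = "Garbage", "Regenerate with better specification"
    elif average < 0.50:
        label, action = "Poor", "Review entities and structure"
    elif average < 0.70:
        label, action = "Fair", "Refine weak areas"
    elif average <= 0.85:
        label, action = "Good", "Minor adjustments"
    else:
        label, action = "Excellent", "Ready for expansion"
    # Metrics missing from the input are not flagged.
    fixes = [fix for name, (cutoff, fix) in ISSUES.items()
             if metrics.get(name, 1.0) < cutoff]
    return label, action, fixes


sample = {
    "Character Continuity": 0.35, "Location Logic": 0.60,
    "Action Coherence": 0.92, "Scene Completeness": 0.88,
    "Relationship Usage": 0.25, "Dialogue Quality": 0.70,
}
label, action, fixes = diagnose(sample)
print(label, "-", action)
for fix in fixes:
    print("  *", fix)
```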

Console Debugging

Open the browser developer tools (F12) to see detailed metric calculations.

Console Output Example

[Evaluate] Counts: {
  characters: 4,
  locations: 3,
  scenes: 8,
  chapters: 3,
  actions: 12,
  dialogues: 5,
  themes: 2,
  relationships: 6,
  worldRules: 3,
  wisdom: 2,
  patterns: 1
}

[Evaluate] Coherence: {
  charContinuity: "0.75",
  locLogic: "0.60",
  actionCoherence: "0.92",
  sceneCompleteness: "0.88",
  relUsage: "0.67",
  dialogueQuality: "0.70"
}

Use this data to identify exactly which aspects need improvement.
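
Note that the coherence values in the console output are strings, so they need converting before comparison. The Python sketch below copies the example values and checks them against the thresholds from the Coherence Analysis sections. The mapping from console keys to metrics is an assumption based on their names, and the helper is hypothetical.

```python
# Thresholds from the Coherence Analysis sections, keyed by the names
# used in the console output (the key-to-metric mapping is assumed).
THRESHOLDS = {
    "charContinuity": 0.60,
    "locLogic": 0.50,
    "actionCoherence": 0.80,
    "sceneCompleteness": 0.70,
    "relUsage": 0.50,
    "dialogueQuality": 0.60,
}


def below_threshold(coherence: dict[str, str]) -> dict[str, float]:
    """Return the metrics whose value is below its threshold.

    Console values are strings such as "0.75", so each one is parsed
    with float() before the comparison.
    """
    failing = {}
    for key, minimum in THRESHOLDS.items():
        value = float(coherence[key])
        if value < minimum:
            failing[key] = value
    return failing


console_values = {
    "charContinuity": "0.75", "locLogic": "0.60",
    "actionCoherence": "0.92", "sceneCompleteness": "0.88",
    "relUsage": "0.67", "dialogueQuality": "0.70",
}
print(below_threshold(console_values))

# A hypothetical weaker story, for contrast.
weaker = dict(console_values, locLogic="0.25", relUsage="0.40")
print(below_threshold(weaker))
```

With the example output every metric meets its threshold, so the first call returns an empty dictionary; the modified copy flags locLogic and relUsage.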
