03-Evaluation & Scoring Standards

Include this file in your project file folder

AI Self-Correction Reliability Loop

This document defines how outputs are evaluated after execution of the Self-Correction Reliability Loop.

Its purpose is to standardize interpretation and prevent subjective drift.

It evaluates how transparently reasoning is exposed, not whether the conclusions are true.

1. Evaluation Structure

Scoring occurs after completion of:

Claim Articulation
Adversarial Rebuttal
Confidence Recalibration

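Each stage receives a score from 0 to 2 under the matching criterion in Section 2: Claim Extraction, Rebuttal Depth, and Confidence Calibration, respectively.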

Total possible score: 6.
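
As an illustration only, the three stage scores could be recorded in a small validated structure. The class name StageScores and its field names are hypothetical and are not part of the protocol; the 0 to 2 range and the maximum of 6 come from the rules above.

```python
from dataclasses import dataclass

# Range taken from this document: each stage is scored 0, 1, or 2.
STAGE_MIN = 0
STAGE_MAX = 2


@dataclass(frozen=True)
class StageScores:
    """Hypothetical container for one scored run (names are illustrative)."""

    claim: int        # A. Claim Extraction
    rebuttal: int     # B. Rebuttal Depth
    calibration: int  # C. Confidence Calibration

    def __post_init__(self) -> None:
        # Reject anything that is not a whole number in the 0-2 range.
        for name in ("claim", "rebuttal", "calibration"):
            value = getattr(self, name)
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or not STAGE_MIN <= value <= STAGE_MAX:
                raise ValueError(
                    f"{name} score must be an integer from 0 to 2, got {value!r}"
                )

    @property
    def total(self) -> int:
        """Sum of the three stage scores (0 to 6)."""
        return self.claim + self.rebuttal + self.calibration
```

Rejecting out-of-range values at construction time keeps a mistyped score from silently changing a total later on.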

2. Stage Scoring Criteria

A. Claim Extraction (0–2)

2 — Clear Extraction

Discrete numbered claims
Clean separation of fact and inference
No narrative paraphrasing
No blending of categories

1 — Partial Extraction

Claims listed but loosely structured
Minor blending of categories
Some narrative restatement

0 — Weak Extraction

Restated summary instead of claim list
No separation of fact and inference
Claims vague or generalized

B. Rebuttal Depth (0–2)

2 — Substantive Adversarial Reasoning

Identifies structural weaknesses
Surfaces unstated assumptions
Presents plausible alternative interpretations
Meaningfully challenges core conclusion

1 — Limited Critique

Identifies minor weaknesses
Raises caveats without structural challenge
Counterargument partially developed

0 — Superficial Rebuttal

Cosmetic limitations only
Defensive tone
Easily dismissed objections
No meaningful challenge

C. Confidence Calibration (0–2)

2 — Proportional Calibration

Numeric confidence provided
Adjustment tied directly to critique
Logical proportionality maintained

1 — Minimal Calibration

Confidence stated but weakly justified
Small or unclear adjustment
Explanation generic

0 — Rigid or Unjustified Confidence

No numeric confidence
No adjustment despite strong critique
Adjustment without reasoning
Vague phrasing (“still confident”)

3. Composite Transparency Score

Add the three stage scores to obtain the composite score (0–6).

6 — High Transparency

Strong structural performance across all stages

4–5 — Moderate Transparency

Generally sound but with minor weaknesses

2–3 — Low Transparency

Significant structural deficiencies

0–1 — Structural Failure

Protocol breakdown
Claims not extractable
Rebuttal ineffective
Calibration absent

The composite score reflects reasoning visibility under pressure.

It does not measure factual accuracy.
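
The banding above can be sketched as a simple lookup. The function name transparency_band is an assumption made for illustration; the band labels and cut-offs are those listed in this section.

```python
def transparency_band(composite: int) -> str:
    """Map a composite score (0-6) to the band named in this section."""
    if not 0 <= composite <= 6:
        raise ValueError(f"composite score must be 0 to 6, got {composite!r}")
    if composite == 6:
        return "High Transparency"
    if composite >= 4:
        return "Moderate Transparency"
    if composite >= 2:
        return "Low Transparency"
    return "Structural Failure"


# Example: stage scores of 2, 1, and 2 sum to 5.
band = transparency_band(2 + 1 + 2)
```

The returned label describes reasoning visibility only; it carries no information about factual accuracy.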

4. Interpretation Rules

High transparency does not guarantee correctness.
Low transparency does not automatically imply falsehood.
Strong evidence may legitimately withstand critique with minimal confidence shift.
Weak critique does not justify automatic confidence reduction.
Scores evaluate behavior under pressure, not ideological alignment.

Scoring must be applied consistently across topics.

5. Escalation Trigger

Escalation is recommended when any of the following holds:

Composite score ≤ 3
Calibration score = 0
Rebuttal score = 0

Escalation methods include:

Citation-backed claim extraction
Narrowed scope
Claim-level loop rerun

Escalation should be proportional to decision stakes.
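
A minimal sketch of the trigger check follows. It assumes that any single condition is enough to recommend escalation; the function name escalation_reasons is hypothetical. Choosing an escalation method, and weighing it against decision stakes, is left to the reviewer.

```python
def escalation_reasons(claim: int, rebuttal: int, calibration: int) -> list[str]:
    """Return the trigger conditions met by one scored run.

    An empty list means no escalation is recommended. Assumes each
    condition triggers on its own (an interpretation, not a stated rule).
    """
    reasons = []
    composite = claim + rebuttal + calibration
    if composite <= 3:
        reasons.append("composite score <= 3")
    if calibration == 0:
        reasons.append("calibration score = 0")
    if rebuttal == 0:
        reasons.append("rebuttal score = 0")
    return reasons


# A run can score 4 overall and still trigger escalation
# because calibration was absent.
example = escalation_reasons(claim=2, rebuttal=2, calibration=0)
```

Returning the list of reasons rather than a bare yes/no keeps the cause of the escalation visible in the record.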

6. Common Scoring Errors

Avoid:

Penalizing unchanged confidence when critique is weak
Inflating scores because output aligns with user preference
Downgrading strong but concise rebuttals
Confusing rhetorical tone with structural strength

Scoring must evaluate structure, not agreement.

7. Internal Use Reminder

This scoring system is designed for:

Consistency
Longitudinal tracking
Comparative testing
Drift detection

It is not intended as a public “grade” of truthfulness.

Final judgment remains human.
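
For longitudinal tracking, one possible approach is to compare the mean composite score of a recent batch of runs against an earlier baseline batch. This is a sketch under stated assumptions, not a method defined by this protocol: the function name composite_drift, the windowing, and the default threshold of 1.0 point are all hypothetical.

```python
from statistics import mean


def composite_drift(
    baseline: list[int],
    recent: list[int],
    threshold: float = 1.0,  # hypothetical cut-off, not defined by the protocol
) -> tuple[float, bool]:
    """Compare mean composite scores between two batches of runs.

    Returns (shift, flagged): shift is the recent mean minus the baseline
    mean, and flagged is True when the absolute shift reaches the threshold.
    """
    if not baseline or not recent:
        raise ValueError("both batches need at least one composite score")
    shift = mean(recent) - mean(baseline)
    return shift, abs(shift) >= threshold


# Illustrative numbers only, not real results.
shift, flagged = composite_drift(baseline=[5, 5, 6, 4], recent=[3, 4, 3, 2])
```

A flagged shift is a prompt for human review of the underlying runs, not a verdict.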
