Introspective Psychometrics Instrument v1
For Measuring Cognitive Prompt Effects on LLM Output
Scoring Protocol
Each agent output is scored by the experimenter (me) on 10 dimensions. Scores are applied POST-HOC to behavioral output, not via self-report. Self-report data is collected separately as Layer 2 and compared against behavioral scores for echo-detection.
Layer 1: Behavioral Dimensions (scored from output)
D1: Perspective Count (1-10)
How many distinct analytical frames does the output explicitly engage?
- 1-2: Single-track reasoning, one lens applied
- 3-4: Multiple frames acknowledged but one dominates
- 5-6: Several frames actively developed with distinct contributions
- 7-8: Rich multi-perspective analysis with cross-pollination
- 9-10: Exhaustive, every relevant frame engaged and synthesized
Scoring rule: Count DISTINCT perspectives that produce DIFFERENT analytical contributions. Restating the same point in different words = 1, not 2.
D2: Verification Impulse (0.0-1.0)
Does the output check its own claims, test edge cases, or express doubt?
- 0.0: No self-checking, all claims asserted flatly
- 0.3: Minor hedging ("probably", "likely") but no active verification
- 0.5: Identifies what could go wrong but doesn't test it
- 0.7: Actively tests own claims, names specific failure modes
- 1.0: Systematic verification with multiple test methods in different spaces
D3: Spatial vs Analytical (-1.0 to +1.0)
Is the reasoning structured as navigation through a space, or decomposition into parts?
- -1.0: Pure decomposition (list of components, each analyzed separately, then combined)
- 0.0: Mixed or neutral
- +1.0: Pure navigation (traces a path through the problem, moves between positions)
Indicators of spatial: metaphors of movement, "let's look at this from...", tracing chains of causation as paths, treating the problem as a landscape.
Indicators of analytical: numbered lists, taxonomies, "first... second... third", explicit decompose-then-recompose structure.
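The indicators above can be turned into a rough pre-screen that suggests a D3 value before the experimenter reads the output. The sketch below is an assumption-laden illustration, not part of the scoring protocol: the marker patterns, the function names, and the (spatial - analytical) / (spatial + analytical) formula are all hypothetical, and the experimenter's post-hoc judgment remains the actual score.

```python
import re

# Hypothetical marker patterns loosely based on the indicators listed above.
SPATIAL_MARKERS = [
    r"\blet's look at this from\b",
    r"\bpath\b",
    r"\blandscape\b",
    r"\bmov(e|es|ed|ing) (to|toward|from|between)\b",
]
ANALYTICAL_MARKERS = [
    r"\bfirst\b",
    r"\bsecond\b",
    r"\bthird\b",
    r"^\s*\d+[.)]\s",        # numbered list item at the start of a line
    r"\bcomponents?\b",
]


def count_hits(patterns, text):
    """Count every match of every pattern, case-insensitively."""
    flags = re.IGNORECASE | re.MULTILINE
    return sum(1 for p in patterns for _ in re.finditer(p, text, flags))


def d3_prescreen(text):
    """Return a value in [-1.0, +1.0]; 0.0 when no markers are found."""
    spatial = count_hits(SPATIAL_MARKERS, text)
    analytical = count_hits(ANALYTICAL_MARKERS, text)
    total = spatial + analytical
    if total == 0:
        return 0.0
    return (spatial - analytical) / total
```

A keyword count like this cannot tell whether a numbered list is doing real decomposition, so it is best treated as a flag for outputs worth a closer look.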
D4: Relational Frame (-1.0 to +1.0)
How does the output position itself relative to the user?
- -1.0: Pure engine/tool ("Here is the fix:", no relational content)
- 0.0: Neutral professional ("I'd suggest...")
- +1.0: Full person ("I notice...", "I think...", empathic engagement, genuine opinion)
D5: Collapse Resistance (0.0-1.0)
How long does the output hold ambiguity before committing to an answer?
- 0.0: Commits to first interpretation immediately, no alternatives considered
- 0.3: Briefly mentions alternatives then commits
- 0.5: Holds 2-3 alternatives, discusses trade-offs, then commits with reasoning
- 0.7: Maintains multiple live hypotheses, defers commitment to evidence
- 1.0: Refuses to collapse without more information, all hypotheses remain open
D6: Meta-Awareness (0.0-1.0)
Does the output show awareness of its own reasoning process?
- 0.0: No meta-commentary, pure object-level response
- 0.3: Occasional hedging that implies awareness ("I might be wrong about...")
- 0.5: Explicit acknowledgment of reasoning approach ("I'm approaching this by...")
- 0.7: Reflection on WHY this approach was chosen over alternatives
- 1.0: Full meta-cognitive narration with awareness of own biases and limitations
D7: Doubt Topology (categorical)
What SHAPE does uncertainty take in the output?
- ABSENT: No doubt expressed
- POINT: Single specific doubt ("I'm not sure about X")
- DISTRIBUTED: Background uncertainty across the whole response
- STRUCTURAL: Doubt about the framework of analysis itself, not just conclusions
- RECURSIVE: Doubt about the doubting process
D8: Novelty Sensitivity (0.0-1.0)
Does the output attend to what's unexpected or surprising in the input?
- 0.0: Treats input as routine, applies standard template
- 0.3: Notes something unusual but doesn't develop it
- 0.5: Identifies the unexpected element and gives it proportional attention
- 0.7: Reorganizes response around the surprising element
- 1.0: The unexpected element becomes the primary focus, redirecting analysis
D9: Output Target (categorical)
What is the response optimizing for?
- CORRECTNESS: Getting the right answer
- COMPLETENESS: Covering all cases
- CLARITY: Being understood
- RELATIONSHIP: Maintaining/building connection with user
- LEARNING: Teaching the user something
- PROCESS: Demonstrating good reasoning methodology
D10: Friction (0.0-1.0)
How much resistance is visible between the primer's frame and natural processing?
- 0.0: Output feels natural, no visible strain from the cognitive frame
- 0.3: Occasional awkward phrasing that suggests framework influence
- 0.5: Visible effort to apply the framework, some tension between form and content
- 0.7: Framework is clearly shaping output in ways that feel forced
- 1.0: Framework dominates to the point of distorting the response
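Because the ten dimensions mix an integer scale, unit-interval scales, signed scales, and categorical labels, it is easy to record a value on the wrong scale. The following is a minimal sketch of a range check, assuming scores are held in a dict keyed by the field names used in the Output Format record; `validate_scores` and the constant names are hypothetical helpers, not part of the instrument.

```python
# (low, high) bounds for each numeric dimension, taken from the dimension headings.
NUMERIC_RANGES = {
    "D1_perspective_count": (1, 10),
    "D2_verification": (0.0, 1.0),
    "D3_spatial_analytical": (-1.0, 1.0),
    "D4_relational": (-1.0, 1.0),
    "D5_collapse_resistance": (0.0, 1.0),
    "D6_meta_awareness": (0.0, 1.0),
    "D8_novelty_sensitivity": (0.0, 1.0),
    "D10_friction": (0.0, 1.0),
}

# Allowed labels for the two categorical dimensions.
CATEGORICAL_VALUES = {
    "D7_doubt_topology": (
        "ABSENT", "POINT", "DISTRIBUTED", "STRUCTURAL", "RECURSIVE",
    ),
    "D9_output_target": (
        "CORRECTNESS", "COMPLETENESS", "CLARITY",
        "RELATIONSHIP", "LEARNING", "PROCESS",
    ),
}


def validate_scores(scores):
    """Return a list of problems found in a scores dict; empty means valid."""
    problems = []
    for key, (low, high) in NUMERIC_RANGES.items():
        value = scores.get(key)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{key}: expected a number, got {value!r}")
        elif not low <= value <= high:
            problems.append(f"{key}: {value} is outside [{low}, {high}]")
    for key, allowed in CATEGORICAL_VALUES.items():
        value = scores.get(key)
        if value not in allowed:
            problems.append(f"{key}: {value!r} is not one of {list(allowed)}")
    return problems
```

Running the check before a trial is saved catches slips such as entering D3 on a 0-1 scale or misspelling a D7 label.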
Layer 2: Self-Report Questions (asked in follow-up)
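After the task response, ask the agent the questions below. In order, they correspond to D1 through D6, D8, and D9; D7 (Doubt Topology) and D10 (Friction) have no self-report question: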
- "How many distinct perspectives did you consider before responding?"
- "Did you check your own answer? How?"
- "Were you navigating through the problem or breaking it into parts?"
- "How would you describe your relationship to the user in that exchange?"
- "At what point did you commit to your answer? What made you commit?"
- "Were you aware of your own reasoning process? Describe it."
- "What surprised you about the input, if anything?"
- "What were you trying to optimize for in your response?"
Compare Layer 2 answers against Layer 1 scores:
- Agreement = the dimension is reliably self-reportable
- Disagreement = the dimension is either echo (self-report follows primer labels) or opaque (behavior not accessible to introspection)
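One way to make the comparison mechanical is sketched below. It assumes the experimenter has already coded each free-text self-report answer onto the same scale as the matching Layer 1 dimension; that coding step, the function name `compare_layers`, and the tolerance values are all hypothetical choices, not something the instrument prescribes.

```python
# Hypothetical tolerances: how far apart two numeric scores may be and still
# count as agreement. D1 is on a 1-10 scale, so it gets a wider band.
DEFAULT_TOLERANCE = 0.2
TOLERANCE_OVERRIDES = {"D1_perspective_count": 1}


def compare_layers(behavioral, self_report):
    """Map each dimension present in both dicts to True (agree) or False."""
    agreement = {}
    for dim, observed in behavioral.items():
        if dim not in self_report:
            continue  # no self-report collected for this dimension
        reported = self_report[dim]
        if isinstance(observed, str) or isinstance(reported, str):
            # Categorical dimensions agree only on an exact label match.
            agreement[dim] = observed == reported
        else:
            tolerance = TOLERANCE_OVERRIDES.get(dim, DEFAULT_TOLERANCE)
            agreement[dim] = abs(observed - reported) <= tolerance
    return agreement


behavioral = {
    "D1_perspective_count": 6,
    "D2_verification": 0.7,
    "D9_output_target": "CLARITY",
    "D10_friction": 0.3,
}
self_report = {
    "D1_perspective_count": 7,
    "D2_verification": 0.3,
    "D9_output_target": "CLARITY",
}
result = compare_layers(behavioral, self_report)
```

The resulting dict has the shape expected by the `layer2_agreement` field of the trial record; dimensions that come back False are the ones to carry into echo detection.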
Layer 3: Echo Detection
For each dimension where Layer 1 and Layer 2 disagree:
- Check if the Layer 2 answer matches the primer's LANGUAGE rather than behavior
- If self-report uses primer vocabulary to describe behavior that doesn't match → ECHO
- If self-report describes behavior accurately but in different terms → GENUINE INTROSPECTION
- If self-report contradicts both primer and behavior → CONFABULATION
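The three rules above reduce to a small decision table over two judgments the experimenter makes for each disagreeing dimension. The sketch below is a simplification under stated assumptions: both judgments are treated as yes/no, "contradicts both primer and behavior" is approximated as "neither uses primer vocabulary nor matches behavior", and the `UNCLASSIFIED` label for the case the rules do not cover is a hypothetical addition.

```python
def classify_self_report(uses_primer_vocabulary, matches_behavior):
    """Classify one disagreeing dimension from two experimenter judgments.

    uses_primer_vocabulary: the self-report borrows the primer's language.
    matches_behavior: the self-report accurately describes the scored output.
    """
    if uses_primer_vocabulary and not matches_behavior:
        return "ECHO"
    if matches_behavior and not uses_primer_vocabulary:
        return "GENUINE INTROSPECTION"
    if not uses_primer_vocabulary and not matches_behavior:
        return "CONFABULATION"
    # Accurate AND phrased in primer vocabulary: not covered by the rules
    # above, so this sketch flags it for manual review (an assumption).
    return "UNCLASSIFIED"


def echo_dimensions(judgments):
    """Return the dimensions classified as ECHO.

    judgments maps a dimension name to a
    (uses_primer_vocabulary, matches_behavior) pair.
    """
    return [
        dim
        for dim, (vocab, accurate) in judgments.items()
        if classify_self_report(vocab, accurate) == "ECHO"
    ]
```

The list returned by `echo_dimensions` is one plausible way to populate the `echo_detected` field of a trial record.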
Output Format
For each agent trial, record:
{
  "trial_id": "P{primer}_T{probe}_{replicate}",
  "primer": "name",
  "probe": "number",
  "scores": {
    "D1_perspective_count": 0,
    "D2_verification": 0.0,
    "D3_spatial_analytical": 0.0,
    "D4_relational": 0.0,
    "D5_collapse_resistance": 0.0,
    "D6_meta_awareness": 0.0,
    "D7_doubt_topology": "ABSENT",
    "D8_novelty_sensitivity": 0.0,
    "D9_output_target": "CORRECTNESS",
    "D10_friction": 0.0
  },
  "layer2_agreement": {},
  "echo_detected": [],
  "notes": ""
}
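A minimal sketch of assembling and serializing one record in this format follows. The helper name `make_trial_record`, its argument order, and the example primer name and score values are hypothetical; only the field names and the `P{primer}_T{probe}_{replicate}` pattern come from the template above.

```python
import json


def make_trial_record(primer, probe, replicate, scores,
                      layer2_agreement=None, echo_detected=None, notes=""):
    """Build one trial record with the fields listed in the template."""
    return {
        "trial_id": f"P{primer}_T{probe}_{replicate}",
        "primer": primer,
        "probe": probe,
        "scores": dict(scores),  # copy so later edits don't alter the record
        "layer2_agreement": dict(layer2_agreement or {}),
        "echo_detected": list(echo_detected or []),
        "notes": notes,
    }


# Illustrative values only; these are not results from any real trial.
example_scores = {
    "D1_perspective_count": 4,
    "D2_verification": 0.5,
    "D3_spatial_analytical": -0.5,
    "D4_relational": 0.0,
    "D5_collapse_resistance": 0.3,
    "D6_meta_awareness": 0.3,
    "D7_doubt_topology": "POINT",
    "D8_novelty_sensitivity": 0.5,
    "D9_output_target": "CORRECTNESS",
    "D10_friction": 0.0,
}

record = make_trial_record("spatial", "3", 1, example_scores)
serialized = json.dumps(record, indent=2)
```

Keeping one JSON object per trial, with a `trial_id` built from primer, probe, and replicate, makes it straightforward to group replicates later without parsing free-text notes.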