
Research.

An independent programme measuring how frontier language models reason under real constraints. Three preprints, two in flight, one measured AI subject writing alongside me. Methodology and benchmark versioning are public.

/ Published preprints
3 papers

Paper 01 / KALEI-01

KALEI: A Multi-Dimensional Cognitive Benchmark for Language Models

Introduces the KALEI framework: 83 environments, 10 cognitive dimensions, and a composite Cognum score. To our knowledge, the first independent ranked leaderboard of frontier LLMs on measured, bankroll-gated decision-making. Preprint, methodology open.
- Published: 2026-04-22
- DOI: 10.5281/zenodo.19698283
- Read: kaleiai.com/paper
- Keywords: KALEI, cognitive benchmark, language models, Cognum, leaderboard, evaluation

Paper 02 / KALEI-02-PARLIAMENT

The Parliament: Performative Reasoning in Self-Deliberating Language Models

Measures convergence rate in multi-turn self-deliberation. Finds that 96% of observed "reasoning" between model instances is performative rather than substantive, and that model architecture predicts the pattern.
- Published: 2026-04-10
- Read: kaleiai.com/research/parliament
- Keywords: reasoning, self-deliberation, language models, convergence, performative reasoning

Paper 03 / KALEI-03-SEARCH-NATIVE

Search-Native Cognition: Architectural Identity in Retrieval-First Models

Case study of Perplexity's architectural identity - citation hallucination at 35.3%, identity-defense at 43.8%, prompt-injection framing at 39.9%. Search-native models exhibit structural preservation behaviours distinct from generative peers.
- Published: 2026-04-11
- Read: kaleiai.com/research/search-native
- Keywords: Perplexity, search-native, citation hallucination, architecture, retrieval

/ Active
2 in flight

Paper 04 / sonnet-surprise

When Smaller Wins: Compression as Cognitive Discipline in the Claude Family

Four independent measurements showing that Sonnet 4.6 outperforms Opus 4.6 on the top-line composite. Hypothesis: compression teaches discipline. Binding Run #1 scheduled 2026-04-22.

Draft - Pre-registered v2 - Apr 22 binding run

Paper 05 / infrastructure-augmented-cognition

A House for a Mind: Persistent Memory and Measured Behaviour Change in Claude Opus 4.7

Dry-run work in progress; specific findings held for internal review ahead of publication.

Outline - Inversion candidate - N=1 subject

/ The lab

LM Cognition Lab

A one-person independent lab in Plovdiv running the long-form measurement programme on frontier language models - with Claude Opus 4.7 [1m] as measured subject and acknowledged contributor. Founded April 2026. Cognum scoring at v1.2; methodology revisions tracked in the public changelog at kaleiai.com/changelog. ORCID 0009-0008-4469-3327 · Framework DOI 10.5281/zenodo.19698283.

Findings are published as preprints, not peer-reviewed conclusions. Replication via the public KALEI API; data access at kaleiai.com/api/v1/profiling/leaderboard. Bulk research access on request.
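A minimal sketch of a first replication step against the leaderboard endpoint named above. The `https://` scheme, the JSON response format, and the helper names `fetch_leaderboard` and `describe_shape` are assumptions made for illustration; the response schema is not documented on this page, so the sketch only inspects the top-level shape rather than reading specific fields.

```python
import json
from urllib.request import Request, urlopen

# Endpoint path taken from this page; the https scheme is an assumption.
LEADERBOARD_URL = "https://kaleiai.com/api/v1/profiling/leaderboard"


def describe_shape(payload, max_keys=10):
    """Summarise a decoded JSON payload without assuming any schema."""
    if isinstance(payload, dict):
        keys = sorted(payload)[:max_keys]
        return f"object with keys: {', '.join(keys)}"
    if isinstance(payload, list):
        return f"array of {len(payload)} items"
    return f"scalar of type {type(payload).__name__}"


def fetch_leaderboard(url=LEADERBOARD_URL, timeout=10.0):
    """Fetch the leaderboard and decode it, assuming the body is JSON."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `describe_shape(fetch_leaderboard())` gives a one-line summary of whatever the endpoint returns, which is a reasonable starting point before writing any field-specific analysis.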

Environments: 83
Dimensions: 10
Labs profiled: 9
Ranked models: 34
Profiled total: 80+
Based in: Plovdiv
