Recursive Language Models for Giant Session Trace Analysis
(and a minimal implementation)
Most “agent postmortems” fail because the trace is too large to fit in context. Recursive Language Models (RLMs) take a different approach: keep the trace as a variable outside the prompt, and let the model write code to search, slice, and summarize it—iteratively—until it can produce a coherent narrative.
The core idea
Instead of doing this:
# bad: stuff everything into the prompt
completion(query="Summarize this", prompt=HUGE_DOCUMENT)
RLMs do something closer to:
rlm = RLM(model="gpt-5-mini")
result = rlm.completion(
    query="Summarize this",
    context=huge_document  # stored as a variable / object, not pasted into the prompt
)
The model doesn’t receive the whole document. It receives a protocol and a tool: it can emit Python code to inspect `context`, execute it, see the result, and repeat.
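The inspect-execute-repeat loop described above can be sketched in a few lines. This is a hypothetical toy, not the actual protocol of either reference implementation: `model_step`, `_out`, and the scripted stand-in model are all illustrative names, and real engines sandbox the `exec` call.

```python
# Minimal sketch of an RLM inner loop (illustrative, not the real protocol).
# The model sees only the query plus the outputs of code it has run;
# `context` lives in the REPL namespace, never in the prompt.

def rlm_loop(query, context, model_step, max_steps=8):
    """Run model-emitted code against a namespace until FINAL is set."""
    namespace = {"context": context, "FINAL": None}
    history = []  # (code, output) pairs fed back to the model each step
    for _ in range(max_steps):
        code = model_step(query, history)   # model emits a Python snippet
        try:
            exec(code, namespace)           # real engines sandbox this
            output = namespace.get("_out")  # convention: snippets set _out
        except Exception as exc:
            output = f"error: {exc}"
        history.append((code, output))
        if namespace["FINAL"] is not None:
            return namespace["FINAL"]
    return namespace["FINAL"]

# A scripted stand-in for the model: peek at the context, then answer.
def scripted_model(query, history):
    if not history:
        return "_out = len(context)"          # step 1: inspect size
    return "FINAL = f'doc has {_out} chars'"  # step 2: commit an answer

result = rlm_loop("Summarize this", "x" * 10_000, scripted_model)
# result == "doc has 10000 chars"
```

The key design point is that `context` can be arbitrarily large: only the small outputs the model chooses to compute ever flow back into its prompt.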
Reference implementations
- alexzhang13/rlm — a larger RLM engine with pluggable providers and sandbox environments.
- ysz/recursive-llm — a minimal reference implementation built around a restricted REPL.
Background reading: blog post and arXiv preprint.
A minimal implementation for “gigantic session” analysis
We built a small prototype that treats a Clawdbot session transcript (JSONL) as the RLM context. The model writes code that calls helper functions like `search()`, `window()`, and `detect_failures()` to navigate the trace. When it’s ready, it sets `FINAL` to a structured analysis.
/home/debian/clawd/home/rlm-session-analyzer (CLI: rlm-analyze)
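To make the helper names above concrete, here is one plausible shape for them. The names (`search`, `window`, `detect_failures`, and a JSONL loader) come from the text; the bodies are illustrative sketches, not the prototype’s actual implementation.

```python
# Hypothetical shapes of the helpers exposed to the model over a JSONL trace.
import json
import re

def load_trace(lines):
    """Parse JSONL lines into a list of event dicts."""
    return [json.loads(line) for line in lines if line.strip()]

def search(trace, pattern):
    """Return (index, event) pairs whose serialized form matches the regex."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [(i, ev) for i, ev in enumerate(trace) if rx.search(json.dumps(ev))]

def window(trace, center, radius=3):
    """Slice the trace around one event index for local context."""
    return trace[max(0, center - radius): center + radius + 1]

def detect_failures(trace):
    """Flag events that look like errors (crude illustrative heuristic)."""
    return search(trace, r"error|traceback|timeout|killed")

lines = [
    '{"role": "assistant", "text": "running tests"}',
    '{"role": "tool", "text": "Traceback: ValueError"}',
    '{"role": "assistant", "text": "fixing the bug"}',
]
trace = load_trace(lines)
hits = detect_failures(trace)  # matches the Traceback event at index 1
```

Because the model only ever sees the (small) return values of these calls, the trace itself can be far larger than any context window.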
How to run it
cd /home/debian/clawd/home/rlm-session-analyzer
pip install -e .
# Run with an OpenAI-compatible endpoint
export OPENAI_API_KEY=...
export OPENAI_MODEL=gpt-5-mini
rlm-analyze /path/to/session.jsonl \
  --objective "Reconstruct phases/branches/failures of creating a research paper" \
  --out analysis.json
There’s also a no-LLM mode where you provide a deterministic program:
rlm-analyze /path/to/session.jsonl \
  --llm none \
  --program examples/paper_program.py
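A deterministic program for the no-LLM mode might look like the sketch below. This assumes the runner executes the file with the parsed `trace` in scope and reads `FINAL` back out; the real `examples/paper_program.py` may differ.

```python
# Sketch of a deterministic --program file (assumed interface: the runner
# injects `trace` as a list of event dicts and collects `FINAL` afterwards).
import json

def classify(event):
    """Bucket an event by keyword (illustrative heuristic)."""
    text = json.dumps(event).lower()
    if "error" in text or "traceback" in text:
        return "failure"
    if "git" in text or "commit" in text:
        return "checkpoint"
    return "work"

# Normally injected by the runner; stubbed here so the file runs standalone.
trace = globals().get("trace", [{"text": "Traceback"}, {"text": "git commit"}])

counts = {}
for ev in trace:
    kind = classify(ev)
    counts[kind] = counts.get(kind, 0) + 1

FINAL = {"phases": [], "event_counts": counts}
```

A deterministic program like this is useful for regression-testing the analyzer itself: the same trace always yields the same `FINAL`.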
What we want next
- Better on-disk indexing for huge traces (avoid loading everything into memory).
- More domain-specific detectors: “compile errors”, “dataset missing”, “timeout kill”, “bad assumptions”.
- Structured outputs: phases, branches, and counterfactual suggestions (“what should have happened”).
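For the first wishlist item, one simple approach is to index the byte offset of every JSONL line once, then seek to individual events on demand. This is a sketch of that idea, not part of the current prototype.

```python
# Byte-offset index for a JSONL trace: scan once, then seek to any event
# without holding the file in memory. Sketch only; not in the prototype.
import json
import os
import tempfile

def build_offset_index(path):
    """Record the starting byte offset of every line in a JSONL file."""
    offsets = []
    with open(path, "rb") as f:
        pos = 0
        for line in f:
            offsets.append(pos)
            pos += len(line)
    return offsets

def read_event(path, offsets, i):
    """Seek directly to event i and parse only that line."""
    with open(path, "rb") as f:
        f.seek(offsets[i])
        return json.loads(f.readline())

# Demo on a small temporary file.
path = os.path.join(tempfile.mkdtemp(), "trace.jsonl")
with open(path, "w") as f:
    for n in range(1000):
        f.write(json.dumps({"n": n}) + "\n")

offsets = build_offset_index(path)
event = read_event(path, offsets, 500)  # parses just one line
```

The index itself is small (one integer per event), so even multi-gigabyte traces stay cheap to navigate.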
Internal references (Phorge/Phriction): codebases/rlm, codebases/recursive-llm, RLM blog (archived), RLM arXiv (archived).