What Is Cursor's agent-trace? An Open Spec for AI Code Attribution
agent-trace is an open specification for recording which parts of a codebase were written by AI and which by humans. It is not OTEL. It solves a different problem — and most enterprises will eventually need both.
The amount of code being written by AI inside enterprise codebases is no longer a rounding error. By the end of 2025, a meaningful share of merged commits in many engineering organizations contained at least one block of AI-generated code, and the share keeps climbing. The interesting question is no longer whether AI writes code in your repo; it is whether you can prove which lines did, when, and from which tool.
That question turns out to be surprisingly hard. Git blame tells you which commit a line came from and which human signed off on it. Pull-request descriptions sometimes mention an agent. Cursor and Copilot ship their own internal telemetry, but the data is locked to their UIs and not portable across tools. There is no neutral, machine-readable, vendor-independent answer to “was this line of code written by a human or an agent, and if an agent, which one and which session?”
Cursor’s agent-trace project is an attempt to fix that. It is an open specification — not a runtime tracer, not a logging library — for recording AI code attribution metadata in a way any tool can read and write. This post walks through what it is, what problem it solves, and how it relates to (but does not replace) OpenTelemetry-style runtime telemetry.
What agent-trace Is, Verbatim
The project describes itself as “an open specification for tracing AI-generated code.” That is precise wording and worth re-reading. It is a specification, not an implementation. It traces AI-generated code, not the AI agent’s runtime behavior. It is open — published under CC BY 4.0 — so any tool can adopt it without licensing friction.
The four design principles the project lists are equally precise:
- Interoperability. Any compliant tool can read and write attribution data.
- Granularity. Attribution is supported at file and line level.
- Extensibility. Vendors can add custom metadata without breaking compatibility.
- Readability. Attribution data is readable without special tooling.
The reference implementation is in TypeScript and is framework-agnostic. The project is released under CC BY 4.0, which is what enables broad adoption: a coding agent vendor, a CI pipeline, an IDE plugin, or a code-review bot can all emit and consume agent-trace records without negotiating a license.
The Schema in One Page
A Trace Record is the core unit. The shape is:
```
Trace Record
├── version, id, timestamp (required identity)
├── vcs (type: git/jj/hg/svn, revision)
├── tool (recorder name + version)
├── files[]
│   ├── path
│   ├── conversations[]
│   │   ├── url (link to the AI conversation)
│   │   ├── contributor
│   │   │   ├── type (human | ai | mixed | unknown)
│   │   │   └── model_id (optional)
│   │   ├── ranges[] (start/end line numbers)
│   │   └── related[] (links to related resources)
│   └── ...
└── metadata (vendor-specific, structured)
```
The data model is small on purpose. There is exactly one core decision per range — human, ai, mixed, or unknown — with optional model identity. Everything else hangs off metadata, which vendors can extend. This is the right minimum bar: the small core makes the format trivially readable; the metadata extension point keeps vendors from forking the spec to add features.
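To make the shape concrete, here is a hypothetical record sketched in TypeScript. The field names follow the schema tree above; the exact JSON encoding, ID format, and every value below are illustrative assumptions, not normative examples from the spec:

```typescript
// Illustrative agent-trace record. Field names follow the schema sketch
// in this post; the exact shape is defined by the spec itself, so treat
// this as a sketch, not a normative example.
interface TraceRecord {
  version: string;
  id: string;
  timestamp: string; // ISO 8601
  vcs: { type: "git" | "jj" | "hg" | "svn"; revision: string };
  tool: { name: string; version: string };
  files: Array<{
    path: string;
    conversations: Array<{
      url: string; // link to the AI conversation
      contributor: { type: "human" | "ai" | "mixed" | "unknown"; model_id?: string };
      ranges: Array<{ start: number; end: number }>; // line numbers
      related?: string[]; // links to related resources
    }>;
  }>;
  metadata?: Record<string, unknown>; // vendor-specific extensions
}

// A record for a single AI-edited file (all values are placeholders).
const record: TraceRecord = {
  version: "1",
  id: "3f2c9a1e-0000-0000-0000-000000000000",
  timestamp: "2026-01-15T10:32:00Z",
  vcs: { type: "git", revision: "abc123" },
  tool: { name: "example-agent", version: "0.1.0" },
  files: [
    {
      path: "src/auth/session.ts",
      conversations: [
        {
          url: "https://example.com/conversations/42",
          contributor: { type: "ai", model_id: "example-model" },
          ranges: [{ start: 10, end: 42 }],
        },
      ],
    },
  ],
};
```

Note that the one mandatory decision per range is the contributor type; everything vendor-specific hangs off the optional metadata object.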
The deliberate choice to reference conversations by URL is also worth flagging. agent-trace does not embed the conversation itself in the record. It points to where the conversation lives. That keeps the attribution metadata small and lets the conversation provider (Cursor, ChatGPT, your internal coding agent) own the lifecycle of the conversation transcript.
Where the Records Live
agent-trace is designed to live alongside the code itself, in version-controlled storage. The natural shape is one record (or a small set of records) committed alongside each meaningful change — a JSONL append per pull request, or per commit, or per agent session, depending on the tool emitting them.
That choice is the most important architectural decision in the whole spec. It means agent-trace data has the same lifecycle as the code it attributes: it ships with the code, it can be reviewed in pull requests, it survives forks and rebases, and it is readable directly from a clone of the repository — no separate database, no external service, no vendor lock-in.
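An emitter for that storage shape is tiny. A sketch, assuming one JSON object per line in a JSONL file that gets committed with the code (the file path and helper names are hypothetical conventions, not mandated by the spec):

```typescript
import { appendFileSync, readFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical location for the attribution log. In a real setup this
// file would live inside the repository and be committed with the code;
// a temp path is used here only so the sketch is self-contained.
const tracePath = join(tmpdir(), "agent-trace-demo.jsonl");

// Append one trace record per line (JSONL): cheap to emit, diffable in
// pull requests, and readable with nothing more than a JSON parser.
function appendTraceRecord(path: string, record: object): void {
  appendFileSync(path, JSON.stringify(record) + "\n", "utf8");
}

// Read every record back out of the file.
function readTraceRecords(path: string): Array<Record<string, unknown>> {
  return readFileSync(path, "utf8")
    .split("\n")
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line));
}
```

The append-only JSONL shape is what gives the data the same lifecycle as the code: no database, no service, just a file in the clone.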
The flow looks like this:
```mermaid
flowchart LR
    A[Coding Agent / IDE] -->|emits| B[Trace Record]
    B -->|committed alongside code| C[Git Repository]
    C -->|read by| D[Code Review Bot]
    C -->|read by| E[Compliance Reporter]
    C -->|read by| F[Quality Analytics]
    G[Other AI Tool] -->|emits| B
```
Any number of tools write records; any number of tools read them. The repository is the substrate. That is a different architectural shape than runtime telemetry, which we will get to in a moment.
Why This Matters for Enterprises Right Now
Three concrete enterprise use cases drive interest in agent-trace, and all three are getting harder to ignore:
1. AI provenance for compliance and legal. Regulators and customers increasingly want to know which parts of a delivered system were written by AI. The EU AI Act, several U.S. state-level laws around AI disclosure, and a growing list of customer contracts now ask the question explicitly. Without persistent line-level attribution, the answer is “we will guess based on tribal knowledge.” agent-trace turns that into a queryable repository fact.
2. Quality and bug attribution. Once you can identify AI-written ranges, you can correlate them with bug rates, test coverage, security findings, and review-round counts. That is how engineering leadership builds an evidence-based opinion on which AI tools and which models are actually paying off — instead of relying on the “feels productive” survey response that has been the dominant signal for the past year.
3. Tool standardization. Most enterprises are running a sprawl of coding agents — Cursor, Copilot, Devin, Claude Code, Codex CLI, plus a handful of internal experiments. A neutral attribution format is the only way to compare them on equal footing. Without one, you have five vendor dashboards and no way to ask “which agent produced more code that survived review?”
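Once a neutral format exists, that cross-tool comparison reduces to a small aggregation. A sketch, assuming records shaped like the schema above (the `linesByTool` helper and the AI-only filter are illustrative, not part of the spec):

```typescript
// Count AI-attributed lines per emitting tool across a set of trace
// records, so different coding agents can be compared on equal footing.
// The record shape mirrors the schema sketch earlier in this post.
type Range = { start: number; end: number };
type Conversation = {
  contributor: { type: string; model_id?: string };
  ranges: Range[];
};
type TraceRec = {
  tool: { name: string };
  files: Array<{ path: string; conversations: Conversation[] }>;
};

function linesByTool(records: TraceRec[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const rec of records) {
    for (const file of rec.files) {
      for (const conv of file.conversations) {
        if (conv.contributor.type !== "ai") continue; // AI-attributed only
        const lines = conv.ranges.reduce((n, r) => n + (r.end - r.start + 1), 0);
        totals.set(rec.tool.name, (totals.get(rec.tool.name) ?? 0) + lines);
      }
    }
  }
  return totals;
}
```

Join the same per-tool totals against review outcomes or defect data and you have the "which agent's code survived review" question answered from repository facts rather than vendor dashboards.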
For the broader context on coding agents in the enterprise, Agentic Coding covers the shape of the workloads, and Building an AI Audit Trail covers the audit-trail concept that agent-trace fits into.
agent-trace vs OpenTelemetry GenAI — They Solve Different Problems
The most common confusion we have heard since the agent-trace launch is people treating it as a competitor to OpenTelemetry’s GenAI semantic conventions. They are not competitors. They sit in different layers of the stack and answer different questions.
| Concern | OpenTelemetry GenAI | agent-trace |
|---|---|---|
| Question answered | What did the agent do at runtime? | Which lines in the codebase came from AI? |
| Lifetime of the data | Hours to weeks (trace store retention) | Same lifetime as the code itself (git history) |
| Storage | Trace backend (Jaeger, Tempo, Honeycomb, etc.) | Files in the repository |
| Granularity | One span per LLM call | One range per attribution decision |
| Scope of attribution | Per request | Per line of source code |
| Primary consumer | SREs, compliance, FinOps | Code reviewers, compliance, quality analytics |
| Vendor neutrality | OTLP wire protocol, standardized attributes | CC BY 4.0 spec, JSON records |
OTEL traces tell you what happened during a single coding-agent session: how many LLM calls, what tools were invoked, how many tokens, how long it took, what errors occurred. That data lives in the trace store and ages out on the retention policy you set. It is operational data.
agent-trace records tell you which lines of which files in your repository were produced or modified during that session. That data lives in git and persists for as long as the code does. It is provenance data.
You almost certainly want both, for the same reason you want both server logs and source-control history: they answer different questions on different timescales. The deeper context on why runtime LLM telemetry matters is in our OpenTelemetry-for-LLMs post.
Where agent-trace Fits in the Broader Stack
agent-trace is a young spec at the time of writing. It is not yet a default in any major coding-agent product, and the ecosystem of consuming tools (review bots, compliance reporters, analytics dashboards) is still being built. That is fine. Most useful specs in this space — including OTEL itself — spent two or three years in the early-adopter zone before hitting the mainstream.
The shape of how it gets adopted, in the order we expect it to happen:
1. Coding-agent vendors emit records. Cursor will ship them by default. Other agents adopt the spec because it is open and because their customers ask. Within twelve months, most major agents are emitting agent-trace records in some form.
2. CI pipelines start consuming records. A review bot reports “78% of this PR is AI-attributed; 12% is unknown.” A compliance check refuses merges when high-risk paths are touched only by AI without a human review. These are dashboarding and policy use cases.
3. Repositories accumulate provenance over time. The interesting compounding happens once a repo has been emitting records for a year. You can query “what percentage of this codebase was AI-written?” with a real answer. You can correlate AI-written ranges with incident postmortems. You can do informed tool-vendor evaluations.
4. Compliance frameworks reference the spec. Once enough of the data exists, audit checklists start asking for it specifically. SOC 2, ISO 42001, EU AI Act technical documentation — one or more will land on agent-trace as the expected artifact for AI code provenance.
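The CI-consumption step can be sketched as a minimal policy check. Everything here — the share computation, the high-risk flag, and the verdict strings — is an invented example of what a consuming tool might do, not anything the spec prescribes:

```typescript
// Given the changed line ranges of a PR and the AI-attributed ranges
// recovered from agent-trace records, compute the AI share of the
// change and apply a simple example policy. All thresholds and verdict
// strings below are illustrative.
type Range = { start: number; end: number };

function aiShare(
  changed: Map<string, Range[]>,  // path -> lines touched by the PR
  aiRanges: Map<string, Range[]>, // path -> AI-attributed lines
): number {
  let total = 0;
  let ai = 0;
  for (const [path, ranges] of changed) {
    const hits = aiRanges.get(path) ?? [];
    for (const r of ranges) {
      for (let line = r.start; line <= r.end; line++) {
        total++;
        if (hits.some((a) => line >= a.start && line <= a.end)) ai++;
      }
    }
  }
  return total === 0 ? 0 : ai / total;
}

function policyVerdict(share: number, touchesHighRisk: boolean, humanReviewed: boolean): string {
  // Example policy: block merges where high-risk paths carry AI-written
  // code without a human review; otherwise just report the share.
  if (touchesHighRisk && share > 0 && !humanReviewed) return "block";
  return `report: ${Math.round(share * 100)}% AI-attributed`;
}
```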
This is the same arc OTEL went through. Specifications mature when the data is present; the data accumulates when the spec is cheap to emit. agent-trace is cheap to emit, which is the most important thing about it.
How Axiom Plugs In
Axiom’s observability layer cares about both halves of this story. The Axiom LLM Gateway handles the runtime side — one OTEL span per model call, normalized gen_ai.* attributes, OTLP-native, ships to whatever trace backend you run. That is the “what did the agent do at runtime” half of the picture.
For the “which lines came from AI” half, agent-trace records belong with the code, not in the gateway. The integration we recommend is straightforward: have your coding agent emit agent-trace records into the repo as part of the agent session, and have your CI pipeline read them on every PR. The gateway is the runtime audit trail; agent-trace is the persistent code-attribution trail. Both feed into the same compliance evidence package.
If you operate a multi-agent program — Cursor and Copilot and Claude Code and an internal agent — agent-trace is the only neutral way to compare what each is producing. Adopting it early is one of those rare technical decisions that has almost no cost and a long-tail upside.
The Take
agent-trace is one of the most useful small specifications shipped in 2026 because it picks one obvious-in-retrospect job — record AI vs human code attribution — and does it without trying to be more than that. It is not a runtime tracer. It is not a competitor to OTEL. It does not lock you to any vendor. It commits records to the repository where they belong, in a JSON format any reader can parse.
For enterprises building serious coding-agent programs, the question is not “should we adopt agent-trace?” It is “which agent in our stack will be the first to emit it, and how do we wire the rest in behind that?” Get the records flowing now, build the analytics on top once the data is real, and you will have an answer ready when the next compliance review asks.
Written by
AXIOM Team