Files

140 lines
4.5 KiB
Markdown

# Runtime Receipts
This document sketches a future read-only receipt export for completed runtime
turns. It is a protocol note, not an implemented endpoint.
The goal is to let a local supervisor audit one completed turn without
screen-scraping the terminal transcript. A receipt should summarize the durable
runtime records that CodeWhale already owns: thread metadata, turn status, turn
items, event sequence lineage, usage when available, approval decisions, and
side-effect boundaries.
## Non-Goals
A receipt is not a safety certification, provider compatibility certification,
or hosted attestation. It must not call providers, execute tools, write memory,
write project files, mutate runtime state, or expose API keys.
Receipts should not export raw chain-of-thought or private reasoning by default.
When reasoning custody is represented, use stable item ids, counts, hashes, or
explicit `unavailable` fields rather than raw hidden content.
## Candidate Surfaces
Potential local-only surfaces:
```text
codewhale receipt export --thread <thread_id> --turn <turn_id> --format json
GET /v1/threads/{thread_id}/turns/{turn_id}/receipt
```
Both surfaces should share the existing runtime API auth boundary. They should
only read persisted runtime records and append-only events.
## Current Data Sources
The current runtime store already persists the core inputs a receipt builder
would need:
- `ThreadRecord`: model, workspace, mode, shell/trust/auto-approve flags,
title, task linkage, and latest turn metadata.
- `TurnRecord`: turn status, input summary, timestamps, duration, usage, error,
steer count, and item ids.
- `TurnItemRecord`: item kind, lifecycle status, summary, optional detail,
metadata, artifact refs, and item timestamps.
- `RuntimeEventRecord`: thread id, turn id, item id, event name, JSON payload,
timestamp, and monotonic `seq` values per runtime store.
Not every receipt field can be filled from those records today. If a provider or
store does not persist a value, the receipt should say `available: false` or
`unavailable`, not infer it from UI text.
## Draft Schema Shape
```json
{
"schema_id": "codewhale.conformance-receipt/v0",
"thread": {
"id": "thr_...",
"model": "deepseek-v4-pro",
"mode": "agent",
"auto_approve": false,
"trust_mode": false,
"allow_shell": false
},
"turn": {
"id": "turn_...",
"status": "completed",
"started_at": "2026-06-02T01:00:00Z",
"ended_at": "2026-06-02T01:00:12Z",
"duration_ms": 12000
},
"reasoning_custody": {
"raw_reasoning_exported": false,
"available": false,
"reason": "reasoning blocks are not persisted as receipt-ready records"
},
"tool_lineage": {
"tool_call_count": 1,
"tool_result_count": 1,
"unmatched_tool_call_ids": [],
"unmatched_tool_result_ids": []
},
"usage_evidence": {
"available": true,
"usage": {
"prompt_tokens": 123,
"completion_tokens": 45
},
"provider_cache_breakdown_available": false
},
"source_event_lineage": {
"first_seq": 10,
"last_seq": 42,
"event_count": 33,
"missing_event_ranges": []
},
"side_effect_boundary": {
"approval_required_count": 1,
"approval_allowed_count": 0,
"approval_denied_count": 1,
"command_execution_count": 0,
"file_change_count": 0,
"sandbox_denied_count": 0
},
"claim_ceiling": [
"local_receipt_only",
"not_safety_certification",
"not_provider_compatibility_certification"
]
}
```
## Builder Rules
A receipt builder should be deterministic and conservative:
1. Load the thread and turn by id, then reject mismatched `thread_id` values.
2. Load only item ids referenced by the turn.
3. Read event records for the thread and filter by `turn_id`.
4. Preserve event sequence boundaries with `first_seq`, `last_seq`, and any
detected gaps.
5. Count approval, command, file, sandbox, and tool events from typed records or
known event names only.
6. Mark unavailable evidence explicitly instead of deriving it from free-form
summaries.
7. Emit no raw tool output beyond existing item summaries unless a later schema
adds a separate redaction policy.
## Incremental Implementation Path
The safest implementation path is:
1. Land this protocol note and settle field names/non-goals.
2. Add protocol structs and JSON snapshot fixtures for completed, failed, and
approval-denied turns.
3. Add a pure builder over `ThreadRecord`, `TurnRecord`, `TurnItemRecord`, and
`RuntimeEventRecord`.
4. Expose the local runtime API endpoint.
5. Add the CLI export command and optional validation mode.