a528ea9824
When an OpenAI-compatible backend (vLLM, Ollama, LM Studio, Together AI,
SGLang, etc.) streams an assistant message containing multiple
tool_calls in a single round, only the **last** tool's
`Event::ToolCallStarted` fired. The preceding N-1 tool calls
executed and produced tool_result events, but never announced their
start to consumers (TUI / runtime API / embedder bridges), leaving
N-1 orphan tool_result blocks with no matching tool_use blocks in the
assistant history.
## Reproduction
```text
backend dispatches: 7 × write_file + 1 × exec_shell
log shows: 7 × ApprovalRequired events ✓
listeners receive: 1 × chat:tool_start, 7 × chat:tool_end
session history: 1 tool_use + 7 tool_result (6 orphans)
```
Tested against vLLM 0.7 + Qwen3.6-35B-A3B with a "scaffold 7-file Tauri
template" prompt. Any model+backend combo that emits batch tool_calls
trips this — typical when a single LLM round asks for multiple parallel
file writes or edits.
## Root cause
`run_turn` tracked the currently-streaming tool block with a single
`current_tool_index: Option<usize>`. The Anthropic-style adapter
(non-streaming response → events at `chat.rs::L1807`) emits
Start/Stop pairs in lockstep so the slot never overlaps. But the
OpenAI streaming parser (`chat.rs::L1954-2064`) emits every
`ContentBlockStart::ToolUse` as soon as a tool_call delta lands, then
batches every `ContentBlockStop` at `finish_reason`:
```text
Start { index: 0 } // tool #1
Delta { index: 0, .. }
Start { index: 1 } // tool #2 — overwrites current_tool_index
Delta { index: 1, .. }
…
Start { index: 6 } // current_tool_index = Some(6)
Delta { index: 6, .. }
Stop { index: 0 } // take() returns Some(6) ← wrong tool!
Stop { index: 1 } // take() returns None
Stop { index: 2 } // take() returns None
…
```
The first `Stop` consumes the last index and emits `ToolCallStarted`
for the wrong `tool_uses` entry; every subsequent `Stop` finds the
slot already `None` and skips the entire `if let Some(index) = …`
branch, dropping the announcement.
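
The overwrite can be reproduced in isolation. The sketch below is not the
real `run_turn` code: `Ev`, `announced_with_single_slot`, and
`openai_style_batch` are hypothetical names, and the event type is reduced
to the block index only. It assumes each Start stores `tool_uses.len()` in
the slot and each Stop calls `take()` on it, as described above.

```rust
/// Hypothetical, simplified stand-in for the stream events (not the real types).
#[allow(dead_code)] // the block index is deliberately ignored by the single-slot logic
#[derive(Debug, Clone, Copy)]
enum Ev {
    Start(u32),
    Stop(u32),
}

/// Replays events through a single `Option<usize>` slot and returns the
/// `tool_uses` positions whose start would have been announced.
fn announced_with_single_slot(events: &[Ev]) -> Vec<usize> {
    let mut current_tool_index: Option<usize> = None;
    let mut tool_uses_len = 0usize; // stands in for `tool_uses.len()`
    let mut announced = Vec::new();

    for ev in events {
        match *ev {
            Ev::Start(_) => {
                // Every Start overwrites whatever the slot held before.
                current_tool_index = Some(tool_uses_len);
                tool_uses_len += 1;
            }
            Ev::Stop(_) => {
                // Only the first Stop after a run of Starts finds a value.
                if let Some(index) = current_tool_index.take() {
                    announced.push(index);
                }
            }
        }
    }
    announced
}

/// Builds the OpenAI-style ordering: all Starts first, then all Stops.
fn openai_style_batch(n: u32) -> Vec<Ev> {
    (0..n).map(Ev::Start).chain((0..n).map(Ev::Stop)).collect()
}
```

Feeding `openai_style_batch(7)` through this function announces only
position 6, while a lockstep Start/Stop sequence announces every position,
which matches the difference between the two adapters.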
## Fix
Replace the single slot with a `HashMap<u32, usize>`
(`current_tool_indices`) that maps each stream block index to its
position in `tool_uses`:
- `ContentBlockStart::ToolUse` and `::ServerToolUse` insert the
`(event.index → tool_uses.len())` mapping.
- `InputJsonDelta` looks up by the `ContentBlockDelta` outer index.
- `ContentBlockStop` removes by the stop's index, so each Stop routes
to its own `tool_uses` entry regardless of arrival order.
Routing no longer depends on `current_block_kind` (which has the same
single-slot overwrite problem); `current_tool_indices.remove(&index)`
returning `Some(_)` already proves the Stop belongs to a tool block.
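
A minimal sketch of the map-based routing, under simplifying assumptions:
`Ev` and `route` are hypothetical names, `tool_uses` is reduced to a
`Vec<String>` of accumulated input JSON, and the real events carry more
fields than shown here. Only the `HashMap<u32, usize>` shape and the
insert / lookup / remove steps come from the description above.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for the stream events (not the real types).
#[derive(Debug, Clone, Copy)]
enum Ev {
    Start(u32),
    Delta(u32, &'static str),
    Stop(u32),
}

/// Routes events by block index. Returns the accumulated input per tool and
/// the `tool_uses` positions announced, in Stop arrival order.
fn route(events: &[Ev]) -> (Vec<String>, Vec<usize>) {
    let mut current_tool_indices: HashMap<u32, usize> = HashMap::new();
    let mut tool_uses: Vec<String> = Vec::new();
    let mut announced = Vec::new();

    for ev in events {
        match *ev {
            Ev::Start(block_index) => {
                // Record block index -> position before pushing the entry.
                current_tool_indices.insert(block_index, tool_uses.len());
                tool_uses.push(String::new());
            }
            Ev::Delta(block_index, fragment) => {
                // Look up by the delta's own index, so interleaving is safe.
                if let Some(&idx) = current_tool_indices.get(&block_index) {
                    tool_uses[idx].push_str(fragment);
                }
            }
            Ev::Stop(block_index) => {
                // `Some(_)` means this Stop closes a tool block; anything
                // else (e.g. a text block) is ignored here.
                if let Some(idx) = current_tool_indices.remove(&block_index) {
                    announced.push(idx);
                }
            }
        }
    }
    (tool_uses, announced)
}
```

With this shape, batched Stops, interleaved deltas, and a Stop for an
index that was never registered as a tool block all resolve without
touching any other tool's entry.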
## Tests
Added `batch_tool_calls_preserve_all_tool_use_indices` in
`core/engine/turn_loop.rs::tests` — feeds 7 Starts and 7 Stops through
the same `HashMap` API used by `run_turn`, asserts every index round-trips.
Manual end-to-end verification: vLLM + Qwen3.6-35B + 7-file Tauri
template prompt → frontend `messages` history now contains all 7
`write_file` tool_use blocks paired with their tool_result blocks.
Co-authored-by: hexin <he.xin@h3c.com>