741f36d2be
Closes the order-of-events race in #861 root cause 3: when the engine bursts events, the dispatch loop can pull `MessageComplete` off the channel ahead of `ThinkingComplete`.

Today's `MessageComplete` reads `app.last_reasoning.take()` to attach the reasoning block to the assistant message in `api_messages`. If `ThinkingComplete` has not fired yet, `last_reasoning` is `None` and the thinking content is dropped. DeepSeek V4 then returns HTTP 400 on the next turn because it requires `reasoning_content` replay for assistant messages that carry tool calls.

Adds a defensive head-of-handler drain in `MessageComplete`: when `streaming_thinking_active_entry.is_some()`, finalize the active thinking entry and stash the reasoning buffer into `last_reasoning` before the existing body runs. The drain is a no-op in the normal case where `ThinkingComplete` arrived first (the entry has already been cleared), so this branch is order-independent.

Adds `message_complete_drain_preserves_thinking_when_thinking_complete_lost`, which exercises the head-of-handler invariant directly: with a thinking entry still active and `last_reasoning` empty, the drain must move the buffer into `last_reasoning` before downstream reads.

Refs #861 (RC3). RC1 and RC2 are already addressed by the existing `finalize_current_streaming_thinking` plumbing in `apply_engine_error_to_app` and `start_streaming_thinking_block`; RC4 (streaming-time truncation affordance) is left out of this PR to keep the scope on the data-loss path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
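
Illustrative sketch of the head-of-handler drain described in this message, not the code from the diff. The `App` struct, the `Event` enum, the `reasoning_buffer` field, the `run` helper, and the body of `finalize_current_streaming_thinking` are simplified assumptions; only the identifier names `last_reasoning`, `streaming_thinking_active_entry`, `api_messages`, and `finalize_current_streaming_thinking` come from the message itself.

```rust
/// Hypothetical, pared-down stand-in for the real app state.
#[derive(Default)]
struct App {
    /// Index of the thinking entry still being streamed, if any.
    streaming_thinking_active_entry: Option<usize>,
    /// Text accumulated for the active thinking entry (assumed field name).
    reasoning_buffer: String,
    /// Reasoning that `MessageComplete` attaches to the assistant message.
    last_reasoning: Option<String>,
    /// Simplified `api_messages`: (content, reasoning_content).
    api_messages: Vec<(String, Option<String>)>,
}

/// Hypothetical subset of the engine events.
enum Event {
    ThinkingDelta(String),
    ThinkingComplete,
    MessageComplete(String),
}

impl App {
    /// Assumed behavior: clear the active entry and move its buffer into
    /// `last_reasoning`. Does nothing when no entry is active.
    fn finalize_current_streaming_thinking(&mut self) {
        if self.streaming_thinking_active_entry.take().is_some() {
            let buf = std::mem::take(&mut self.reasoning_buffer);
            if !buf.is_empty() {
                self.last_reasoning = Some(buf);
            }
        }
    }

    fn handle(&mut self, event: Event) {
        match event {
            Event::ThinkingDelta(chunk) => {
                if self.streaming_thinking_active_entry.is_none() {
                    self.streaming_thinking_active_entry = Some(0);
                }
                self.reasoning_buffer.push_str(&chunk);
            }
            Event::ThinkingComplete => self.finalize_current_streaming_thinking(),
            Event::MessageComplete(content) => {
                // Defensive drain: if `ThinkingComplete` has not been seen yet,
                // finalize the entry now so the `take()` below finds the buffer.
                if self.streaming_thinking_active_entry.is_some() {
                    self.finalize_current_streaming_thinking();
                }
                let reasoning = self.last_reasoning.take();
                self.api_messages.push((content, reasoning));
            }
        }
    }
}

/// Feed a sequence of events through a fresh `App` and return the
/// resulting messages (hypothetical helper for experimenting with order).
fn run(events: Vec<Event>) -> Vec<(String, Option<String>)> {
    let mut app = App::default();
    for event in events {
        app.handle(event);
    }
    app.api_messages
}
```

Under these assumptions, `run` yields the same single message with its reasoning attached whether `ThinkingComplete` is delivered before or after `MessageComplete`; a late `ThinkingComplete` finds no active entry and does nothing.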