mirror of
https://github.com/microsoft/vscode.git
synced 2026-07-04 05:45:47 +01:00
136b39f373
Adds GenAI semantic-convention streaming signals alongside the legacy copilot_chat.time_to_first_token, tagged with gen_ai.response.model for per-model slicing (microsoft/vscode#320651):

- gen_ai.request.stream (span attr, bool)
- gen_ai.response.time_to_first_chunk (span attr, seconds)
- gen_ai.client.operation.time_to_first_chunk (histogram metric)
- gen_ai.client.operation.time_per_output_chunk (histogram metric)

The main GitHub streaming path (chatMLFetcher) carries the full set including per-output-chunk latency, computed from inter-chunk gaps in FetchStreamRecorder. BYOK providers (Anthropic, Gemini) emit the stream attr + time_to_first_chunk only, as they lack per-chunk arrival timing. The Claude agent path inherits all signals via the shared chatMLFetcher proxy; the Copilot CLI path emits them natively from the runtime.

Updates agent_monitoring.md and adds unit tests.
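
For illustration only, the inter-chunk-gap computation mentioned above could look roughly like the sketch below. The names `ChunkTimingRecorder`, `recordChunk`, `finish`, and `StreamTimings` are hypothetical and are not the actual FetchStreamRecorder API. Reporting the per-chunk gaps in seconds is an assumption, chosen to match the unit stated for time_to_first_chunk.

```typescript
// Hypothetical sketch: NOT the real FetchStreamRecorder implementation.
// Timestamps are passed in explicitly (milliseconds) so the logic is deterministic.

interface StreamTimings {
  // Seconds from request start to the first chunk; undefined if no chunk arrived.
  timeToFirstChunkSec: number | undefined;
  // Seconds between consecutive chunks (assumed unit), one entry per gap.
  timePerOutputChunkSec: number[];
}

class ChunkTimingRecorder {
  private readonly requestStartMs: number;
  private readonly arrivalsMs: number[] = [];

  constructor(requestStartMs: number) {
    this.requestStartMs = requestStartMs;
  }

  // Call once per streamed chunk with its arrival time.
  recordChunk(nowMs: number): void {
    this.arrivalsMs.push(nowMs);
  }

  // Derive time-to-first-chunk and the inter-chunk gaps.
  finish(): StreamTimings {
    if (this.arrivalsMs.length === 0) {
      return { timeToFirstChunkSec: undefined, timePerOutputChunkSec: [] };
    }
    const gaps: number[] = [];
    for (let i = 1; i < this.arrivalsMs.length; i++) {
      gaps.push((this.arrivalsMs[i] - this.arrivalsMs[i - 1]) / 1000);
    }
    return {
      timeToFirstChunkSec: (this.arrivalsMs[0] - this.requestStartMs) / 1000,
      timePerOutputChunkSec: gaps,
    };
  }
}

// Usage: request starts at t=1000 ms, chunks arrive at 1250, 1300, and 1400 ms.
const recorder = new ChunkTimingRecorder(1000);
recorder.recordChunk(1250);
recorder.recordChunk(1300);
recorder.recordChunk(1400);
const timings = recorder.finish();
```

In a sketch like this, each gap would be recorded as one histogram observation, and a provider without per-chunk arrival timing would only ever populate the first-chunk value.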