vscode/extensions/copilot/docs
Zhichao Li 136b39f373 Emit gen_ai streaming OTel signals for chat responses
Adds GenAI semantic-convention streaming signals alongside the legacy
copilot_chat.time_to_first_token, tagged with gen_ai.response.model for
per-model slicing (microsoft/vscode#320651):

- gen_ai.request.stream (span attr, bool)
- gen_ai.response.time_to_first_chunk (span attr, seconds)
- gen_ai.client.operation.time_to_first_chunk (histogram metric)
- gen_ai.client.operation.time_per_output_chunk (histogram metric)

The main GitHub streaming path (chatMLFetcher) emits the full set,
including per-output-chunk latency, which FetchStreamRecorder computes
from inter-chunk gaps. BYOK providers (Anthropic, Gemini) emit only
gen_ai.request.stream and time_to_first_chunk because they lack
per-chunk arrival timing. The Claude agent path inherits all signals via
the shared chatMLFetcher proxy; the Copilot CLI path emits them natively
from the runtime.
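
For illustration only, the inter-chunk gap computation could look roughly
like the sketch below. It is not the actual FetchStreamRecorder code:
the class name ChunkTimingRecorder, its methods, and the millisecond
timestamps are all hypothetical, and the sketch assumes the caller
supplies a monotonic clock reading for each chunk.

```typescript
// Hypothetical sketch; names and shape are assumptions, not the real recorder.
class ChunkTimingRecorder {
  private readonly requestStartMs: number;
  private readonly arrivalsMs: number[] = [];

  constructor(requestStartMs: number) {
    this.requestStartMs = requestStartMs;
  }

  // Call once per streamed chunk with a monotonic timestamp in milliseconds.
  recordChunk(nowMs: number): void {
    this.arrivalsMs.push(nowMs);
  }

  // Seconds from request start to the first chunk, or undefined if no chunk arrived.
  timeToFirstChunkSeconds(): number | undefined {
    if (this.arrivalsMs.length === 0) {
      return undefined;
    }
    return (this.arrivalsMs[0] - this.requestStartMs) / 1000;
  }

  // Seconds between each consecutive pair of chunks: one value per gap,
  // so n chunks yield n - 1 values.
  interChunkGapsSeconds(): number[] {
    const gaps: number[] = [];
    for (let i = 1; i < this.arrivalsMs.length; i++) {
      gaps.push((this.arrivalsMs[i] - this.arrivalsMs[i - 1]) / 1000);
    }
    return gaps;
  }
}

// Usage: a request that starts at t=1000 ms and receives three chunks.
const recorder = new ChunkTimingRecorder(1000);
recorder.recordChunk(1250);
recorder.recordChunk(1300);
recorder.recordChunk(1400);
const ttfc = recorder.timeToFirstChunkSeconds();
const gaps = recorder.interChunkGapsSeconds();
```

In a design like this, each gap would be recorded as one sample of the
time_per_output_chunk histogram, while a path without per-chunk arrival
times could still report time_to_first_chunk from a single timestamp.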

Updates agent_monitoring.md and adds unit tests.
2026-06-09 17:21:11 -07:00