vscode

mirror of https://github.com/microsoft/vscode.git synced 2026-05-21 15:49:15 +01:00

Author	SHA1	Message	Date
Bhavya U	100cbe59eb	Background inline summarization v1 (#308923) * Refactor inline summarization handling in ToolCallingLoop * Refactor conversation summarization settings and improve logging in AgentIntent * Refactor agent intent to improve telemetry and remove obsolete test file * Refactor inline summarization handling: remove unused properties and related tests * Remove unused summarization instruction from AgentPromptProps interface * Refactor AgentIntentInvocation to streamline model capabilities handling in background summarization * Update debugName for background summarization to reflect inline context * Update logging message in AgentIntentInvocation for clarity and remove unused test suite for inline summarization	2026-04-10 03:16:57 +00:00
Bhavya U	80cb675c3b	Refactor enableThinking/reasoningEffort into IModelCapabilityOptions (#308387) * Refactor enableThinking/reasoningEffort into IModelCapabilityOptions * Strip reasoningEffort from search/execution subagent loops * Fix stale comment referencing options.enableThinking	2026-04-08 00:08:29 +00:00
Bhavya U	c75b65f00e	Revert "Refactor enableThinking/reasoningEffort into IModelCapabilityOptions" (#308330) Revert "Refactor enableThinking/reasoningEffort into IModelCapabilityOptions …" This reverts commit `6b33538396`.	2026-04-07 21:50:23 +00:00
Bhavya U	6b33538396	Refactor enableThinking/reasoningEffort into IModelCapabilityOptions (#308294)	2026-04-07 14:28:15 -07:00
Bhavya U	c7c0fac6f3	Inline summarization: summarize within the agent loop for maximum prompt cache hits (#4956 ) * Add inline summarization feature for agent conversation history - Introduced configuration option for inline summarization in package.json and configurationService.ts. - Updated agentIntent.ts to handle inline summarization logic during conversation. - Modified summarizedConversationHistory.tsx to support inline summarization instructions. - Enhanced tests to cover inline summarization scenarios and extraction of inline summaries. * Remove cache-friendly summarization prompt and related configurations * Refactor inline summarization handling in ToolCallingLoop and add summary application method * Add failure telemetry, deferred cleanup, and debugName tracking for inline summarization * Address PR review: fix empty string check, telemetry counts, cache token reporting, and test naming	2026-04-03 15:52:18 +00:00
Megan Rogge	68e9282f38	add `isSystemInitiated` and relevant changes (#4944) * add isSystemInitiated and relevant changes * simplify agent interaction type determination and add utility function for request kind mapping * update proposed api * resolve issue * Address feedback * update * revert unrelated changes * revert changes * fix tests	2026-04-03 00:14:21 +00:00
Zhichao Li	548eda26df	Add workspace metadata (git branch, commit, remote, file path) to OTel events and GH telemetry (#4844 ) * feat: define workspace metadata OTel attributes and resolver Add CopilotChatAttr constants for repo.head_branch_name, repo.head_commit_hash, repo.remote_url, and file.relative_path. Create WorkspaceOTelMetadata interface and resolveWorkspaceOTelMetadata() helper that synchronously resolves git metadata from activeRepository. Refs: microsoft/vscode#306397, microsoft/vscode-internalbacklog#7297 * feat: extend OTel edit events with optional workspace metadata Add optional WorkspaceOTelMetadata param to emitEditFeedbackEvent, emitEditHunkActionEvent, emitInlineDoneEvent, emitEditSurvivalEvent. Existing callers compile unchanged since the new param is optional. Refs: microsoft/vscode#306397 * feat: add workspace metadata to invoke_agent OTel span Inject IGitService into ToolCallingLoop and spread resolved workspace metadata (branch, commit, remote) onto the invoke_agent span attributes. Refs: microsoft/vscode#306397 * feat: extend EditSurvivalResult with workspace metadata Add workspace field to EditSurvivalResult interface and populate it in EditSurvivalReporter._report() using resolveWorkspaceOTelMetadata(). The reporter already injects IGitService and has the document URI, so no new DI is needed. Refs: microsoft/vscode#306397 * feat: inject IGitService into UserActions and pass workspace metadata Add workspace metadata to emitEditFeedbackEvent, emitEditHunkActionEvent, and emitInlineDoneEvent calls in UserFeedbackService using the file URI from each event action. Refs: microsoft/vscode#306397 * feat: pass workspace metadata to OTel survival events Forward res.workspace from EditSurvivalResult to emitEditSurvivalEvent at all 4 call sites: inline_chat, code_mapper, apply_patch, replace_string. 
Refs: microsoft/vscode#306397 * feat: add workspace metadata to GH telemetry edit events Add headBranchName, headCommitHash, remoteUrl, fileRelativePath to sendGHTelemetryEvent/sendEnhancedGHTelemetryEvent calls for: - inline.trackEditSurvival - fastApply/trackEditSurvival - applyPatch/trackEditSurvival - replaceString/trackEditSurvival - fastApply/editOutcome Refs: microsoft/vscode-internalbacklog#7297 * test: add unit tests for workspace metadata resolver and events Test resolveWorkspaceOTelMetadata (branch, commit, URL, relative path, edge cases) and workspaceMetadataToOTelAttributes (OTel key mapping). Add tests for emitEditFeedbackEvent and emitEditSurvivalEvent verifying workspace metadata is included/omitted correctly. Refs: microsoft/vscode#306397 * fix: propagate IGitService to ToolCallingLoop subclasses Pass the new IGitService constructor parameter through to super() in all 5 ToolCallingLoop subclasses: McpToolCallingLoop, CodebaseToolCallingLoop, DefaultToolCallingLoop, ExecutionSubagentToolCallingLoop, SearchSubagentToolCallingLoop. Refs: microsoft/vscode#306397 * fix: address review - URI-safe path, brace style, trim tests - Fix path prefix false-positive by using isEqualOrParent/relativePath instead of string startsWith (e.g. /repo matching /repo2/file.ts) - Expand one-line if blocks to multi-line per repo coding standards - Remove as-any mutation in test, remove trivial conversion tests, add test for path prefix false-positive edge case	2026-03-30 23:24:34 +00:00
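Note on the path-prefix fix in the entry above: a plain string `startsWith` check treats `/repo` as a parent of `/repo2/file.ts`, which is the false positive the commit describes. The following is a minimal sketch of a segment-wise comparison that avoids it. The helper names `splitSegments`, `isEqualOrParentPath`, and `relativePathWithin` are hypothetical stand-ins written for illustration; they are not the `isEqualOrParent`/`relativePath` utilities the commit refers to, whose exact signatures are not shown in this log.

```typescript
// Hypothetical helpers for illustration only; not the VS Code utilities named in the commit.

// Split a POSIX-style path into non-empty segments.
function splitSegments(p: string): string[] {
	return p.split('/').filter(segment => segment.length > 0);
}

// True when `candidate` equals `parent` or lies underneath it, compared segment by segment.
function isEqualOrParentPath(candidate: string, parent: string): boolean {
	const candidateSegments = splitSegments(candidate);
	const parentSegments = splitSegments(parent);
	if (parentSegments.length > candidateSegments.length) {
		return false;
	}
	return parentSegments.every((segment, i) => segment === candidateSegments[i]);
}

// Path of `candidate` relative to `parent`, or undefined when it is outside `parent`.
function relativePathWithin(parent: string, candidate: string): string | undefined {
	if (!isEqualOrParentPath(candidate, parent)) {
		return undefined;
	}
	return splitSegments(candidate).slice(splitSegments(parent).length).join('/');
}

// The naive check matches a sibling directory that merely shares a prefix.
const naive = '/repo2/file.ts'.startsWith('/repo');
// The segment-wise check does not.
const safe = isEqualOrParentPath('/repo2/file.ts', '/repo');
```

Comparing whole segments rather than raw characters is what removes the ambiguity: `repo` and `repo2` differ as segments even though one is a character prefix of the other.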
Christof Marti	3c2eb6a61b	Use one WebSocket per conversation (#4827)	2026-03-30 20:24:56 +00:00
Paul	41b7ef8514	Support dynamic prompt variables (#4742) * wip * fixes * update * update * updates * updates * clean * clean * clean * fix tests * update * fix test	2026-03-27 21:55:42 +00:00
Vijay Upadya	9a4dc95e9b	Support session references in troubleshoot skill + write models.json to debug logs (#4729) * session reference for troubleshoot * feedback update * handle non-local session schemes * handle non local path decode * fix test failure	2026-03-26 21:24:07 +00:00
Ulugbek Abdullaev	e47502f509	fix: remove flattenSingleChild/promoteMainEntry from CapturingToken (#4701 ) XtabProvider entries appeared unnested at the top level when the sibling NES MarkdownContentRequest entry was hidden (via setIsSkipped or cancellation). This happened because flattenSingleChild promoted the sole remaining child to the top level. Instead of adding the log context document as a child that gets 'promoted' into the parent, ChatPromptItem now directly owns the document via setMainEntry(). A MarkdownContentRequest whose debugName matches the token label is wired to the parent's icon and click command — never added as a tree child. Groups with no visible children render as non-expandable leaf items. This removes flattenSingleChild and promoteMainEntry from CapturingToken, making it a pure correlation token with no rendering hints. The tree view owns all rendering conventions.	2026-03-26 17:40:30 +00:00
Logan Ramos	844616523f	Add rate limit time (#4641)	2026-03-24 14:55:51 +00:00
Bhavya U	adeddfb164	Refactor thinking and effort control: per-request opt-in (#4515 ) * Refactor thinking and effort control: make per-request opt-in via enableThinking and reasoningEffort - Add reasoning_effort to IChatModelCapabilities from CAPI model list - Add supportsReasoningEffort on ChatEndpoint/IChatEndpoint - Add enableThinking and reasoningEffort to IMakeChatRequestOptions - Build configurationSchema on VS Code LM API models for model picker effort dropdown - Remove disableThinking, AnthropicThinkingEffort, ResponsesApiReasoningEffort configs - Thinking is off by default; callers opt in with enableThinking: true - Agent mode (toolCallingLoop): enables thinking, passes reasoningEffort from modelConfiguration - ResponsesProxy / MessagesProxy: enables thinking - Inline chat, utility requests, LM wrapper: thinking off (default) - Effort level driven by configurationSchema in model picker (no default, user must choose) - BYOK Anthropic provider reads effort from options.modelConfiguration * refactor: Improve reasoningEffort handling across multiple components * Fix tests: add enableThinking: true to Agent location tests, restore maxThinkingBudget cap * Add defaultReasoningEffort, thread enableThinking/reasoningEffort to subagent loops and proxy endpoints - Add defaultReasoningEffort to IChatEndpoint (computed per model family: high for Anthropic/Gemini, medium for OpenAI) - Use defaultReasoningEffort as fallback in responsesApi, messagesApi, and configurationSchema - Delegate supportsReasoningEffort/defaultReasoningEffort in pass-through endpoints - Thread enableThinking/reasoningEffort through execution and search subagent loops - Add enableThinking: true to oaiLanguageModelServer and claudeLanguageModelServer - Restore maxThinkingBudget cap in customizeCapiBody * refactor: Adjust thinking budget calculation to use endpoint's maxThinkingBudget * Address PR feedback: fix comment, validate effort, remove defaultReasoningEffort - Fix misleading comment in 
messagesApi (thinking gated by enableThinking, not reasoningEffort) - Validate reasoningEffort against known values before sending to Messages API - Remove defaultReasoningEffort from IChatEndpoint and ChatEndpoint - Compute picker default locally in buildConfigurationSchema (UI concern only) - Remove effort fallbacks from messagesApi and responsesApi (pure caller control) * Address PR feedback round 2: validate effort, conditional schema default, location-gated thinking in fetch - Validate reasoningEffort against known values in messagesApi before sending - Fix comment to reflect enableThinking gating (not reasoningEffort) - Remove defaultReasoningEffort from endpoint (picker default is UI-only concern) - Compute picker default locally in buildConfigurationSchema - Gate thinking by location in DefaultToolCallingLoop.fetch() (Agent/MessagesProxy only) - Remove enableThinking from IToolCallingLoopOptions (decision made at fetch level) - Validate effort in BYOK anthropicProvider * refactor: Enable effort picker only for Claude and GPT models in configuration schema	2026-03-20 00:46:24 +00:00
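Note on the entry above: it describes thinking as off by default, gated by `enableThinking` rather than by `reasoningEffort`, with the effort value validated against known values and no fallback applied. A minimal sketch of that decision logic, under stated assumptions: the type and function names (`RequestOptionsSketch`, `ThinkingDecision`, `resolveThinking`) are hypothetical, and the list of accepted effort values is assumed, since the log mentions only `high` and `medium` and does not enumerate the full set.

```typescript
// Assumed set of effort levels; the commit log does not enumerate the accepted values.
const KNOWN_EFFORTS = ['low', 'medium', 'high'] as const;
type ReasoningEffort = typeof KNOWN_EFFORTS[number];

// Hypothetical, reduced stand-in for the per-request options described in the commit.
interface RequestOptionsSketch {
	enableThinking?: boolean;
	reasoningEffort?: string;
}

interface ThinkingDecision {
	thinking: boolean;
	effort?: ReasoningEffort;
}

// Type guard: only values from the known list are accepted.
function isKnownEffort(value: string | undefined): value is ReasoningEffort {
	return value !== undefined && (KNOWN_EFFORTS as readonly string[]).indexOf(value) !== -1;
}

function resolveThinking(options: RequestOptionsSketch): ThinkingDecision {
	// Thinking is opt-in: an effort value alone never turns it on.
	if (options.enableThinking !== true) {
		return { thinking: false };
	}
	// Unknown effort values are dropped rather than replaced with a default.
	const effort = options.reasoningEffort;
	return isKnownEffort(effort) ? { thinking: true, effort } : { thinking: true };
}
```

Keeping the gate (`enableThinking`) separate from the tuning knob (`reasoningEffort`) means a caller that forgets to opt in gets no thinking at all, which matches the "pure caller control" wording in the commit message.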
Logan Ramos	8263960c55	Word smithing (#4496)	2026-03-18 19:34:34 +00:00
Logan Ramos	d57d655398	Allow user to retry model rate limits automatically with auto (#4486 ) * Allow user to retry model rate limits automatically with auto * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-03-18 14:06:59 +00:00
Ross Wollman	2527bcd8a6	debugging/evals: allow static overrides of system prompts and tooldescriptions (#3959 ) * Add debug setting to override system prompt and tool descriptions via YAML file Adds a new advanced debug setting `github.copilot.chat.advanced.debug.promptOverrideFile` that allows specifying a YAML file to override the system prompt and/or tool descriptions sent to the model at runtime. This enables prompt engineering and debugging without modifying source code. The override is applied in toolCallingLoop.ts before the debug view event fires, so the chat debug view reflects the actual overridden content. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Initial plan * Address PR review: use IFileSystemService/URI, throttle warnings, avoid mutating buildPromptResult Co-authored-by: rwoll <11915034+rwoll@users.noreply.github.com> * Remove unnecessary migration for new DebugPromptOverrideFile setting This is a brand-new setting with no prior key to migrate from. Use defineSetting directly instead of defineAndMigrateSetting. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Fix promptOverrideFile default to use null instead of undefined Add explicit "default": null in package.json for the debug.promptOverrideFile setting and update the code default to match, fixing the configuration defaults test. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: rwoll <11915034+rwoll@users.noreply.github.com>	2026-03-17 22:29:36 +00:00
Bhavya U	f487a678f3	Track compaction summaries as an array with detailed metrics metadata (#4413 ) * Add Anthropic compaction metadata to IResultMetadata for evals Captures context editing compaction data from Anthropic Messages API and surfaces it through IResultMetadata so the evaluation system can access it. Changes: - Add anthropicCompaction field to IToolCallRound (parallels OpenAI compaction) - Capture Anthropic ContextManagementResponse deltas in tool calling loop - Aggregate compaction metrics (cleared tokens, tool uses, thinking turns) across rounds - Surface compactionMetrics on IResultMetadata via AnthropicCompactionMetadata - Merge into result in defaultIntentRequestHandler's resultWithMetadatas() * Add compaction metrics metadata for evals Surfaces background and foreground compaction (conversation summarization) metrics through IResultMetadata.compactionMetrics so the eval system can track when compaction is triggered and its cost. - Add compactionMetrics to IResultMetadata with type (foreground/background) and token usage - Create CompactionMetadata class on Turn - Set CompactionMetadata in agentIntent for both foreground and background paths - Merge CompactionMetadata into result in defaultIntentRequestHandler * Add durationMs to summarization metadata and update related logic * Refactor compaction metadata handling to use SummarizedConversationHistoryMetadata and remove CompactionMetadata references * Add usage metadata to AgentIntentInvocation and remove compaction metrics documentation * Enhance summary handling by replacing single summary with an array of summaries and updating related logic in conversation normalization * Refactor _persistSummaryOnTurn to use IBackgroundSummarizationResult for improved type safety * Simplify render result handling in AgentIntentInvocation by directly returning the await result * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * 
Refactor SummarizedConversationHistoryMetadata to use options object for improved readability and maintainability --------- Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>	2026-03-15 06:17:40 +00:00
Paul	2472428c0d	Move troubleshoot skill to files-based approach (#4365) * v3 * wip * wip * nit * wip * permissions * update * update * clean * update * clean * update * nit * update	2026-03-12 07:22:28 +00:00
Denny Abraham Cheriyan	82b1b81b99	Add resolved model for events (#4210) * Add resolved model for events * Refactor code * Restore snapshot file * Fix snapshot tests and add resolvedModel integration test	2026-03-11 19:32:15 +00:00
Logan Ramos	b65327e196	Make Copilot GitHub Status aware (#4165) * Make Copilot GitHub Status aware * Update src/platform/chat/common/commonTypes.ts Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Fix tests * Tests pass now I hope thanks --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-03-03 21:28:14 +00:00
Zhichao Li	ddb6f98ce6	feat(otel): Add OpenTelemetry GenAI instrumentation to Copilot Chat (#3917 ) * feat: add OTel GenAI instrumentation foundation Phase 0 complete: - spec.md: Full spec with decisions, GenAI semconv, dual-write, eval signals, lessons from Gemini CLI + Claude Code - plan.md: E2E demo plan (chat ext + eval repo + Azure backend) - src/platform/otel/: IOTelService, config, attributes, metrics, events, message formatters, NodeOTelService, file exporters - package.json: Added @opentelemetry/* dependencies OTel opt-in behind OTEL_EXPORTER_OTLP_ENDPOINT env var. * refactor: reorder OTel type imports for consistency * refactor: reorder OTel type imports for consistency * feat(otel): wire OTel spans into chat extension — Phase 1 core - Register IOTelService in DI (NodeOTelService when enabled, NoopOTelService when disabled) - Add OTelContrib lifecycle contribution for OTel init/shutdown - Add `chat {model}` inference span in ChatMLFetcherImpl._doFetchAndStreamChat() - Add `execute_tool {name}` span in ToolsService.invokeTool() - Add `invoke_agent {participant}` parent span in ToolCallingLoop.run() - Record gen_ai.client.operation.duration, tool call count/duration, agent metrics - Thread IOTelService through all ToolCallingLoop subclasses - Update test files with NoopOTelService - Zero overhead when OTel is disabled (noop providers, no dynamic imports) * feat(otel): add embeddings span, config UI settings, and unit tests - Add `embeddings {model}` span in RemoteEmbeddingsComputer.computeEmbeddings() - Add VS Code settings under github.copilot.chat.otel.* in package.json (enabled, exporterType, otlpEndpoint, captureContent, outfile) - Wire VS Code settings into resolveOTelConfig in services.ts - Add unit tests for: - resolveOTelConfig: env precedence, kill switch, all config paths (16 tests) - NoopOTelService: zero-overhead noop behavior (8 tests) - GenAiMetrics: metric recording with correct attributes (7 tests) * test(otel): add unit tests for 
messageFormatters, genAiEvents, fileExporters - messageFormatters: 18 tests covering toInputMessages, toOutputMessages, toSystemInstructions, toToolDefinitions (edge cases, empty inputs, invalid JSON) - genAiEvents: 9 tests covering all 4 event emitters, content capture on/off - fileExporters: 5 tests covering write/read round-trip for span, log, metric exporters plus aggregation temporality Total OTel test suite: 63 tests across 6 files * feat(otel): record token usage and time-to-first-token metrics Add gen_ai.client.token.usage (input/output) and copilot_chat.time_to_first_token histogram metrics at the fetchMany success path where token counts and TTFT are available from the processSuccessfulResponse result. * docs: finalize sprint plan with completion status * style: apply formatter changes to OTel files * feat(otel): emit gen_ai.client.inference.operation.details event with token usage Wire emitInferenceDetailsEvent into fetchMany success path where full token usage (prompt_tokens, completion_tokens), resolved model, request ID, and finish reasons are available from processSuccessfulResponse. This follows the OTel GenAI spec pattern: - Spans: timing + hierarchy + error tracking - Events: full request/response details including token counts The data mirrors what RequestLogger captures for chat-export-logs.json. * feat(otel): add aggregated token usage to invoke_agent span Per the OTel GenAI agent spans spec, add gen_ai.usage.input_tokens and gen_ai.usage.output_tokens as Recommended attributes on the invoke_agent span. Tokens are accumulated across all LLM turns by listening to onDidReceiveResponse events during the agent loop, then set on the span before it ends. Ref: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/ * feat(otel): add token usage attributes to chat inference span Defer the `chat {model}` span completion from _doFetchAndStreamChat to fetchMany where processSuccessfulResponse has extracted token counts. 
The chat span now carries: - gen_ai.usage.input_tokens (prompt_tokens) - gen_ai.usage.output_tokens (completion_tokens) - gen_ai.response.model (resolved model) The span handle is returned from _doFetchAndStreamChat via the result object so fetchMany can set attributes and end it after tokens are known. This matches the chat-export-logs.json pattern where each request entry carries full usage data alongside the response. * style: apply formatter changes * fix: correct import paths in otelContrib and add IOTelService to test * feat: add diagnostic span exporter to log first successful export and failures * feat: add content capture to OTel spans (messages, responses, tool args/results) - Chat spans: add copilot.debug_name attribute for identifying orphan spans - Chat spans: capture gen_ai.input.messages and gen_ai.output.messages when captureContent enabled - Tool spans: capture gen_ai.tool.call.arguments and gen_ai.tool.call.result when captureContent enabled - Extension chat endpoint: capture input/output messages when captureContent enabled - Add CopilotAttr.DEBUG_NAME constant * fix: register IOTelService in chatLib setupServices for NES test * fix: register OTel ConfigKey settings in Advanced namespace for configurations test * fix: register IOTelService in shared test services (createExtensionUnitTestingServices) * fix: register IOTelService in platform test services * feat(otel): enhance GenAI span attributes per OTel semantic conventions - Change gen_ai.provider.name from 'openai' to 'github' for CAPI models - Rename CopilotAttr to CopilotChatAttr, prefix values with copilot_chat.* - Add GITHUB to GenAiProviderName enum - Replace copilot.debug_name with gen_ai.agent.name on chat spans - Add gen_ai.request.temperature, gen_ai.request.top_p to chat spans - Add gen_ai.response.id, gen_ai.response.finish_reasons on success - Add gen_ai.usage.cache_read.input_tokens from cached_tokens - Add copilot_chat.request.max_prompt_tokens and 
copilot_chat.time_to_first_token - Add gen_ai.tool.description to execute_tool spans - Fix gen_ai.tool.call.id to read chatStreamToolCallId (was reading nonexistent prop) - Fix tool result capture to handle PromptTsxPart and DataPart (not just TextPart) - Add gen_ai.input.messages and gen_ai.output.messages to invoke_agent span (opt-in) - Move gen_ai.tool.definitions from chat spans to invoke_agent span (opt-in) - Add gen_ai.system_instructions to chat spans (opt-in) - Fix error.type raw strings to use StdAttr.ERROR_TYPE constant - Centralize hardcoded copilot.turn_count and copilot.endpoint_type into CopilotChatAttr - Add COPILOT_OTEL_CAPTURE_CONTENT=true to launch.json for testing - Document span hierarchy fixes needed in plan.md * feat(otel): connect subagent spans to parent trace via context propagation - Add TraceContext type and getActiveTraceContext() to IOTelService - Add storeTraceContext/getStoredTraceContext for cross-boundary propagation - Add parentTraceContext option to SpanOptions for explicit parent linking - Implement in NodeOTelService using OTel remote span context - Capture trace context when execute_tool runSubagent fires (keyed by toolCallId) - Restore parent context in subagent invoke_agent span (via subAgentInvocationId) - Auto-cleanup stored contexts after 5 minutes to prevent memory leaks - Update test mocks with new IOTelService methods - Update plan.md with investigation findings * fix(otel): fix subagent trace context key to use parentRequestId The previous implementation stored trace context keyed by chatStreamToolCallId (model-assigned tool call ID), but looked it up by subAgentInvocationId (VS Code internal invocation.callId UUID). These are different IDs that don't match across the IPC boundary. Fix: key by chatRequestId on store side (available on invocation options), and look up by parentRequestId on subagent side (same value, available on ChatRequest). Both reference the parent agent's request ID. 
Verified: 21-span trace with subagent correctly nested under parent agent. * fix(otel): add model attrs to invoke_agent and max_prompt_tokens to BYOK chat - Set gen_ai.request.model on invoke_agent span from endpoint - Track gen_ai.response.model from last LLM response resolvedModel - Add copilot_chat.request.max_prompt_tokens to BYOK chat spans - Document upstream gaps in plan.md (BYOK token usage, programmatic tool IDs) * test(otel): add trace context propagation tests for subagent linkage Tests verify: - storeTraceContext/getStoredTraceContext round-trip and single-use semantics - getActiveTraceContext returns context inside startActiveSpan - parentTraceContext makes child span inherit traceId from parent - Independent spans get different traceIds without parentTraceContext - Full subagent flow: store context in tool call, retrieve in subagent * fix(otel): add finish_reasons and ttft to BYOK chat spans, document orphan spans - Set gen_ai.response.finish_reasons on BYOK chat success - Set copilot_chat.time_to_first_token on BYOK chat success - Document Gap 4: duplicate orphan spans from CopilotLanguageModelWrapper - Identify all orphan span categories (title, progressMessages, promptCategorization, wrapper) * docs(otel): update Gap 4 analysis — wrapper spans have actual token usage data The copilotLanguageModelWrapper orphan spans are the actual CAPI HTTP handlers, not duplicates. They contain real token usage, cache read tokens, resolved model names, and temperature — all missing from the consumer-side extChatEndpoint spans due to VS Code LM API limitations. 
Updated plan.md with: - Side-by-side attribute comparison table - Three fix approaches (context propagation, span suppression, enrichment) - Recommendation: Option 1 (propagate trace context through IPC) * feat(otel): propagate trace context through BYOK IPC to link wrapper spans - Pass _otelTraceContext through modelOptions alongside _capturingTokenCorrelationId - Inject IOTelService into CopilotLanguageModelWrapper - Wrap makeRequest in startActiveSpan with parentTraceContext when available - This creates a byok-provider bridge span that makes chatMLFetcher's chat span a child of the original invoke_agent trace, bringing real token usage data into the agent trace hierarchy * debug(otel): add debug attribute to verify trace context capture in BYOK path * fix(otel): remove debug attribute, BYOK trace context propagation verified working Verified: 63-span trace with Azure BYOK (gpt-5) correctly shows: - byok-provider bridge spans linking wrapper chat spans into agent trace - Real token usage (in:21458 out:1730 cache:19072) visible on wrapper chat spans - hasCtx:true on all extChatEndpoint spans confirming context capture - Two subagent invoke_agent spans correctly nested under main agent - Zero orphan copilotLanguageModelWrapper spans * refactor(otel): replace byok-provider bridge span with invisible context propagation Add runWithTraceContext() to IOTelService — sets parent trace context without creating a visible span. The wrapper's chat spans now appear directly as children of invoke_agent, eliminating the noisy byok-provider intermediary span. Before: invoke_agent → byok-provider → chat (wrapper) After: invoke_agent → chat (wrapper) * refactor(otel): remove duplicate BYOK consumer-side chat span The extChatEndpoint no longer creates its own chat span. The wrapper's chatMLFetcher span (via CopilotLanguageModelWrapper) is the single source of truth with full token usage, cache data, and resolved model. 
Before: invoke_agent → chat (empty, extChatEndpoint) + chat (rich, wrapper) After: invoke_agent → chat (rich, wrapper only) * fix(otel): restore chat span for non-wrapper BYOK providers (Anthropic, Gemini) The previous commit removed the extChatEndpoint chat span, which was correct for Azure/OpenAI BYOK (served by CopilotLanguageModelWrapper via chatMLFetcher). But Anthropic and Gemini BYOK providers call their native SDKs directly, bypassing CopilotLanguageModelWrapper — so they need the consumer-side span. Now: always create a chat span in extChatEndpoint with basic metadata (model, provider, response.id, finish_reasons). For wrapper-based providers, the chatMLFetcher also creates a richer sibling span with token usage. * fix(otel): skip consumer chat span for wrapper-based BYOK providers Only create the extChatEndpoint chat span for non-wrapper providers (Anthropic, Gemini) that need it as their only span. Wrapper-based providers (Azure, OpenAI, OpenRouter, Ollama, xAI) get a single rich span from chatMLFetcher via CopilotLanguageModelWrapper. Result: 1 chat span per LLM call for all provider types. * fix: remove unnecessary 'google' from non-wrapper vendor set * feat(otel): add rich chat span with usage data for Anthropic BYOK provider Move chat span creation into AnthropicLMProvider where actual API response data (token usage, cache reads) is available. The span is linked to the agent trace via runWithTraceContext and enriched with: - gen_ai.usage.input_tokens / output_tokens - gen_ai.usage.cache_read.input_tokens - gen_ai.response.model / response.id / finish_reasons Remove consumer-side extChatEndpoint span for all vendors (nonWrapperVendors now empty) since both wrapper-based and Anthropic providers create their own spans with full data. Next: apply same pattern to Gemini provider. 
* feat(otel): add rich chat span for Gemini BYOK, clean up extChatEndpoint - Add OTel chat span with full usage data to GeminiNativeBYOKLMProvider - Remove all consumer-side span code from extChatEndpoint (dead code) - Each provider now owns its chat span with real API response data: * CAPI: chatMLFetcher * OpenAI-compat BYOK: CopilotLanguageModelWrapper → chatMLFetcher * Anthropic: AnthropicLMProvider * Gemini: GeminiNativeBYOKLMProvider - Fix Gemini test to pass IOTelService * feat(otel): enrich Anthropic/Gemini chat spans with full metadata Add to both providers: - copilot_chat.request.max_prompt_tokens (model.maxInputTokens) - server.address (api.anthropic.com / generativelanguage.googleapis.com) - gen_ai.conversation.id (requestId) - copilot_chat.time_to_first_token (result.ttft) Now matches CAPI chat span attribute parity. * feat(otel): add server.address to CAPI/Azure BYOK chat spans Extract hostname from urlOrRequestMetadata when it's a URL string and set as server.address on the chat span. Works for both CAPI and CopilotLanguageModelWrapper (Azure BYOK) paths. * feat(otel): add max_tokens and output_messages to Anthropic/Gemini chat spans - gen_ai.request.max_tokens from model.maxOutputTokens - gen_ai.output.messages (opt-in) from response text - Closes remaining attribute gaps vs CAPI/Azure BYOK spans * fix(otel): capture tool calls in output_messages for chat spans When model responds with tool calls instead of text, the output_messages attribute was empty. Now captures both text parts and tool call parts in the output_messages, matching the OTel GenAI output messages schema. Also: Azure BYOK invoke_agent zero tokens is a known upstream gap — extChatEndpoint returns hardcoded usage:0 since VS Code LM API doesn't expose actual usage from the provider side. * fix(otel): capture tool calls in output_messages for Anthropic/Gemini BYOK spans Same fix as CAPI — when model responds with tool calls, include them in gen_ai.output.messages alongside text parts. 
All three provider paths (CAPI, Anthropic, Gemini) now consistently capture both text and tool call parts in output messages. * fix(otel): add input_messages and agent_name to Anthropic/Gemini chat spans - gen_ai.input.messages (opt-in) captured from provider messages parameter - gen_ai.agent.name set to AnthropicBYOK / GeminiBYOK for identification Closes the last attribute gaps vs CAPI/Azure BYOK chat spans. * fix(otel): fix input_messages serialization for Anthropic/Gemini BYOK - Map enum role values to names (1→user, 2→assistant, 3→system) - Extract text from LanguageModelTextPart content arrays instead of showing '[complex]' for all messages - Use OTel GenAI input messages schema with role + parts format * docs(otel): add remaining metrics/events work to plan.md Coverage matrix showing: - Anthropic/Gemini BYOK missing: operation.duration, token.usage, time_to_first_token metrics, and inference.details event - CAPI and Azure BYOK (via wrapper) fully covered - Tool/agent/session metrics covered across all providers - 4 tasks (M1-M4) to close the gap * feat(otel): add metrics and inference events to Anthropic/Gemini BYOK providers Both providers now record: - gen_ai.client.operation.duration histogram - gen_ai.client.token.usage histograms (input + output) - copilot_chat.time_to_first_token histogram - gen_ai.client.inference.operation.details log event All metrics/events now have full parity across CAPI, Azure BYOK, Anthropic BYOK, and Gemini BYOK. * fix(otel): fix LoggerProvider constructor — use 'processors' key (SDK v2) The OTel SDK v2 changed the LoggerProvider constructor option from 'logRecordProcessors' to 'processors'. The old key was silently ignored, causing all log records to be dropped. This is why logs never appeared in Loki despite traces working fine. 
* docs: add agent monitoring guide with OTel usage and Claude/Gemini comparison * docs: remove Claude/Gemini comparison from monitoring guide * docs: add OTel comparison with Claude Code and Gemini CLI * docs: reorganize monitoring docs — user guide + dev architecture - agent_monitoring.md: polished user-facing guide (for VS Code website) - agent_monitoring_arch.md: developer-facing architecture & instrumentation guide - Removed internal plan/spec/comparison files from repo (moved to ~/Documents) * fix(otel): restore _doFetchViaHttp body and _fetchWithInstrumentation after rebase * fix(otel): propagate otelSpan through WebSocket/HTTP routing paths The otelSpan was created in _doFetchAndStreamChat but not included in returns from _doFetchViaWebSocket and _doFetchViaHttp, causing the caller (fetchMany) to always receive undefined for otelSpan. Fix: await both routing paths and spread otelSpan into the result. * docs(otel): improve monitoring docs, add collector setup, fix trace context - Expand agent_monitoring.md with detailed span/metric/event attribute tables - Add BYOK provider coverage, subagent trace propagation docs - Add Backend Considerations: Azure App Insights (via collector), Langfuse, Grafana - Add End-to-End Setup & Verification section with KQL examples - Add OTel Collector config + docker-compose for Azure App Insights - Fix: emit inference details event before span.end() in chatMLFetcher (fixes 'No trace ID' log records in App Insights) - Fix: pass active context in emitLogRecord for trace correlation - Update launch.json to point at OTel Collector (localhost:4328) * docs(otel): merge Backend Considerations and E2E sections to remove redundancy * docs(otel): remove internal dev debug reference from user-facing guide * docs(otel): remove Grafana section and Jaeger refs from App Insights section * docs(otel): trim Backend section to factual setup guides, remove claims * docs(otel): final accuracy audit — fix false claims against code - Mark 
copilot_chat.session.start event as 'not yet emitted' (defined but no call site) - Mark copilot_chat.agent.turn event as 'not yet emitted' (defined but no call site) - Mark copilot_chat.session.count metric as 'not yet wired up' - Fix OTEL_EXPORTER_OTLP_PROTOCOL desc: only 'grpc' changes behavior - Fix telemetry kill switch claim: vscodeTelemetryLevel not wired in services.ts - Remove false toolCalling.tsx instrumentation point from arch doc - Fix docker-compose comments: wrong port numbers (16686→16687, 4318→4328) - Add reference to full collector config file from inline snippet * docs(otel): remove telemetry.telemetryLevel references — OTel is independent * feat(otel): wire up session.start event, agent.turn event, and session.count metric - emitSessionStartEvent + incrementSessionCount at invoke_agent start (top-level only) - emitAgentTurnEvent per LLM response in onDidReceiveResponse listener - Remove 'not yet wired' markers from docs * chore: untrack .playwright-mcp/ and add to .gitignore * chore: remove otel spec reference files * chore(otel): remove OpenTelemetry environment variables from launch configurations * fix(otel): add 64KB truncation limit for content capture attributes Prevents OTLP batch export failures when large prompts/responses are captured. Aligned with gemini-cli's limitTotalLength pattern. Applied truncateForOTel() to all JSON.stringify calls feeding span attributes across chatMLFetcher, toolCallingLoop, toolsService, anthropicProvider, geminiNativeProvider, and genAiEvents. * refactor(otel): make GenAiMetrics methods static to avoid per-call allocations Aligned with gemini-cli pattern of module-level metric functions. Eliminates 17+ throwaway GenAiMetrics instances per agent run. 
* fix(otel): fix timer leak, cap buffered ops, rate-limit export logs - storeTraceContext: track timers for clearTimeout on retrieval/shutdown, add 100-entry max with LRU eviction - BufferedSpanHandle: cap _ops at 200 to prevent unbounded growth - DiagnosticSpanExporter: rate-limit failure logs to once per 60s * docs(otel): fix Jaeger UI port to match docker-compose (16687) * chore(otel): update sprint plan — mark P0/P1 tasks done * fix(otel): remove as any casts in BYOK provider content capture Use proper Array.isArray + instanceof checks instead of as any[] casts for LanguageModelChatMessage.content iteration. * refactor(otel): extract OTelModelOptions shared interface Replaces 3 duplicated inline type assertions for _otelTraceContext and _capturingTokenCorrelationId with a single shared interface. * refactor(otel): route OTel logs through ILogService output channel Replace console.info/error/warn in NodeOTelService with a log callback. OTelContrib logs essential status to the Copilot Chat output channel for user troubleshooting (enabled/disabled, exporter config, shutdown). * fix(otel): remove orphaned OTel ConfigKey definitions OTel config is read via workspace.getConfiguration in services.ts, not through IConfigurationService.get(ConfigKey). These constants were unused dead code. 
* test(otel): add comprehensive OTel instrumentation tests - Agent trace hierarchy (invoke_agent → chat → execute_tool, subagent propagation, error states, metrics, events) - BYOK provider span emission (CLIENT kind, token usage, error.type, content capture gating, parentTraceContext linking) - chatMLFetcher two-phase span lifecycle (create → enrich → end, error path, operation duration metric) - Service robustness (runWithTraceContext, startActiveSpan error lifecycle, storeTraceContext overwrite) - CapturingOTelService reusable test mock for all OTel assertions * chore: apply formatter import sorting * chore: remove outdated sprint plan document * feat(otel): add OTel configuration settings for tracing and logging * fix(otel): ensure metric reader is flushed and shutdown properly	2026-03-02 20:46:30 +00:00
Rob Lourens	c5daa20856	Remove edits2 setting/participant (#4099) * Remove edits2 setting/participant * Remove EditTestStrategy.Edits2 from simulation tests (Written by Copilot)	2026-03-02 06:20:19 +00:00
Rob Lourens	37db22a960	Send 'custom' agent name correctly for subagent requests (#4086) * Send 'custom' agent name correctly for subagent requests https://github.com/microsoft/vscode-internalbacklog/issues/6884 * Update vscodeCommit and proposed dts files (Written by Copilot)	2026-02-28 23:09:39 +00:00
Christof Marti	8649964a4d	Support CAPI WebSocket connections (#4068) (#4069)	2026-02-27 16:58:06 +00:00
Paul	b4d241f523	update (#4008)	2026-02-26 00:03:20 +00:00
Connor Peet	035b9995b4	intents: safely handle empty hook return values (#4003) Adds optional chaining to safely access nested properties in hook outputs. This prevents exceptions when hooks like hookify return empty objects that don't contain the expected hookSpecificOutput structure. Changes: - toolCallingLoop.ts: Add optional chaining for SessionStart and SubagentStart hook output handling - defaultIntentRequestHandler.ts: Add optional chaining for UserPromptSubmit hook output handling (Commit message generated by Copilot)	2026-02-25 20:58:12 +00:00
Rob Lourens	c6faea4f7a	Set new request headers for agent requests (#3950) * Add task-id * Add interaction-type * Only set for agent requests	2026-02-23 23:49:58 +00:00
Bhavya U	08a0293bb3	Refactor Anthropic context editing configuration and tool search integration (#3797) * Refactor Anthropic context editing configuration and tool search integration * Remove redundant toolSearchTool configuration and update settings structure	2026-02-17 19:15:20 +00:00
Rob Lourens	5eac4ba0dd	Revert "Better tracking of subagent requests in capi request headers (#3712)" (#3721) This reverts commit `04260412bb`.	2026-02-13 17:35:45 +00:00
Rob Lourens	04260412bb	Better tracking of subagent requests in capi request headers (#3712)	2026-02-13 04:38:38 +00:00
Harald Kirschner	02051a11da	Segment for plan agent (#3565 ) * Segment for plan agent * Update src/extension/prompt/node/defaultIntentRequestHandler.ts Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Harald Kirschner <digitarald@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-02-11 21:34:51 +00:00
Paul	4b34b40ec2	Additional hook fixes (#3652) * fixes * clean	2026-02-11 07:28:19 +00:00
Rob Lourens	1c4434d8c9	Move chat hook execution to the extension (#3622 ) * Refactor hook execution * Fix merge issues: extra braces and incomplete function calls * Sync DTS: inline ChatRequestHooks type to avoid cross-proposal dependency * Fix bugs from code review: stopReason empty string, onError continuation, missing sessionId/cwd in hook input * Move hooks to chatHooks proposal, sync DTS, fix test layering violation, add executeHook tests * Remove stale Hooks output channel reference from error messages * Add dedicated Hooks output channel with structured logging Mirrors the output channel that was in VS Code core: - Dedicated 'Hooks' output channel created lazily on first hook execution - Logs hook command, input (with redaction of sensitive fields), result, and timing - Request counter for correlating multi-hook executions - Error messages reference the Hooks output channel again * Tweak * Tweak * Fix DTS: revert chatParticipantPrivate to extension main version Our branch had VS Code main's DTS changes (sessionResource, editor, Uri types) that the extension main hasn't adopted yet. Revert to extension main's version which already has preToolUseResult. * Address PR review: fix logService.error usage, add hookProgress optional chaining * Rewrite hookExecutor tests: mock child_process.spawn instead of spawning real processes * Format * Fix DTS comments: env and timeoutSec are values, not implementation promises	2026-02-10 23:09:32 +00:00
Paul	b080669f8f	Add support for stopping using UserPromptSubmit hook (#3597 ) * block * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * PR * PR * update tests --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-02-10 03:27:06 +00:00
Rob Lourens	e3889c3cde	Revert "Revert "Add stopReason and warningMessage support for hooks (#3548)" (#3551)" (#3553) This reverts commit `33c760a93f`.	2026-02-07 20:48:10 +00:00
Rob Lourens	33c760a93f	Revert "Add stopReason and warningMessage support for hooks (#3548)" (#3551) This reverts commit `cc8aeb72ca`.	2026-02-07 18:19:06 +00:00
Paul	cc8aeb72ca	Add stopReason and warningMessage support for hooks (#3548 ) * wip * wip * cleanups * nit * fix test * update * update * updates * updates * fix * Update src/extension/intents/node/toolCallingLoop.ts Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-02-07 16:56:27 +00:00
Rob Lourens	84b8b46bd5	Add session transcript service for hooks (#3545 ) * Add session transcript service for hooks Implement JSONL-based session transcript service that records conversation turns, tool executions, and assistant messages for hook consumers. Key changes: - New ISessionTranscriptService interface and SessionTranscriptService implementation with buffered writes and automatic old transcript cleanup - Integrate transcript logging into tool calling loop: user messages, assistant messages, tool execution start/complete, turn boundaries - Auto-flush transcript and inject transcript_path into hook input - Race flush with 500ms timeout to avoid blocking hook execution - Gate transcript creation on ChatRequest.hasHooksEnabled - Include copilotVersion and vscodeVersion in session.start entry - Add timestamp to IToolCallRound for transcript timing - Add hasHooksEnabled to ChatRequest interface and all implementors * Fixes	2026-02-07 02:04:46 +00:00
Paul	e706d58a71	Add start session hook (#3497 ) * fix * wip * update * wip * fix * clean * small clean * clean * nit * clean * PR * update * nit * PR * subagent hooks wip * clean * runSubagent * fix * reverts * clean * add context * cleanup * nit * fixes * nit * fixes * fix * start session * clean * tests	2026-02-06 01:19:08 +00:00
Connor Peet	ba2e206efe	tools: cleanup unused pause logic (#3488 ) We removed the 'pause' feature back in ~June, cleanup logic we still had around it. This removes: - The PauseController class which was no longer being used - onPaused event parameters from chat participant, request, and intent handlers - Pause-related event listening and stream pausing logic - Simplified throwIfCancelled to be synchronous since it no longer needs to wait for pause resumption (Commit message generated by Copilot)	2026-02-05 18:24:07 +00:00
Connor Peet	cc915b0c33	tools: cleanup mirrored loop for virtual tools (#3489 ) Removes temporary virtual tool grouping evaluation logic that is no longer needed. This includes: - Remove _doMirroredCallWithVirtualTools() method and related evaluation logic - Remove _didParallelToolCallLoop tracking flag - Remove ICopilotTokenStore dependency injection - Remove unused imports (IResponseDelta, OptionalChatRequestParams) No longer used since virtual tools stabilized. (Commit message generated by Copilot)	2026-02-05 18:14:05 +00:00
Connor Peet	8ca2807e66	chat: wire up yieldrequested for steering messages (#3473 ) * chat: wire up yieldrequested for steering messages Allows the client to do a 'soft cancel' after a tool call happens before returning back to the model, or before the next turn. * fix compile	2026-02-05 16:05:57 +00:00
Paul	ac6a513f0d	Add support for stop hook and prompt input for UserPromptSubmit (#3461 ) * fix * wip * update * wip * fix * clean * small clean * clean * nit * clean * PR * update * nit * PR	2026-02-05 05:48:14 +00:00
Logan Ramos	23299a88a9	Fix errors to just say switch to auto (#3455)	2026-02-04 22:47:53 +00:00
Rob Lourens	a1fc8f2717	Adopt new hook name (#3457)	2026-02-04 22:36:59 +00:00
Rob Lourens	bf7bcce698	Add parentRequestId for subagent telemetry (#3397) * Add parentRequestId for subagent telemetry * Update test snapshot	2026-02-04 03:36:23 +00:00
Paul	6a523f4677	Add initial hooks support (#3422 ) * Implement service and sample hook * Adapt to new shape * API version * Update src/extension/chat/vscode-node/chatHookService.ts Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update src/extension/prompt/node/defaultIntentRequestHandler.ts Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Rob Lourens <roblourens@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-02-04 01:40:37 +00:00
Copilot	ca80313cee	feat: support Harbor/ATIF trajectory capture and export (#2893 ) * Initial plan * Add trajectory format types and core implementation Co-authored-by: zhichli <57812115+zhichli@users.noreply.github.com> * Add unit tests for trajectory logger and fix implementation Co-authored-by: zhichli <57812115+zhichli@users.noreply.github.com> * Add comprehensive documentation for trajectory format Co-authored-by: zhichli <57812115+zhichli@users.noreply.github.com> * Add implementation status and next steps documentation Co-authored-by: zhichli <57812115+zhichli@users.noreply.github.com> * Add trajectory implementation quick reference summary * Add trajectory export commands and enhance trajectory logging functionality * Refactor trajectory metrics calculation and update schema version to ATIF v1.5 * trajectory scaffolding * Enhance trajectory tracking by adding subAgentName and agentName to tool metadata in SearchSubagentTool and TrajectoryLoggerAdapter * add export trajectory cmd for tree nodes * use sessionId for main trajectory * Update command categories from 'Copilot' to 'Chat' in package.json * Update trajectory schema version to ATIF-v1.5 and enhance error handling in export commands * Remove obsolete trajectory documentation and integration files * Refactor trajectory export commands and update README for clarity * Update src/platform/trajectory/common/trajectoryLogger.ts Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update src/platform/trajectory/node/trajectoryLogger.ts Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update src/platform/trajectory/node/trajectoryLoggerAdapter.ts Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update src/extension/trajectory/vscode-node/trajectoryExportCommands.ts Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Refactor adapter to avoid double trajectory start and add comprehensive tests - Remove redundant second 
startTrajectory call in adapter sync loop - Add 12 comprehensive tests for TrajectoryLoggerAdapter covering: - Basic trajectory creation from request logs - User message deduplication - Tool call correlation (single, parallel, orphan) - Subagent trajectory linking - Metrics tracking - Session ID management - Non-conversation request handling - Update TestRequestLogger to expose toolMetadata property - Add async wait in tests for proper event processing Co-authored-by: zhichli <57812115+zhichli@users.noreply.github.com> * rm readme * enhance trajectory export functionality with session ID mapping and folder selection * add comment * add arch * add bounding for logs and rm tests for now --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: zhichli <57812115+zhichli@users.noreply.github.com> Co-authored-by: Zhichao Li <zhichli@microsoft.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2026-02-03 00:34:07 +00:00
Rob Lourens	22b4bcaaef	Add 'subType: subagent' for request telemetry (#3310) * Add 'subType: subagent' for request telemetry * Fix messageSource on search_subagent requests * Set subtype	2026-01-30 05:40:29 +00:00
Bhavya U	8a230f08da	add toolTokenCount to telemetry measurements and requests (#3132) * add toolTokenCount to telemetry measurements and requests * test: add toolTokenCount to snapshot tests for tool call iterations	2026-01-23 19:48:36 +00:00


91 Commits