Commit Graph

67 Commits

Author SHA1 Message Date
Bhavya U 256a46f76a Re-enable multifile-edit-claude stests with claude-sonnet-4.5 (#316126)
Fixes #315940
2026-05-12 23:07:56 +00:00
Bhavya U 6b5334c5f4 feat: make stream a caller-controlled passthrough in Messages API (#311003)
* feat: make stream a caller-controlled passthrough in Messages API

Allow callers to set stream: false via requestOptions instead of
hardcoding stream: true. Add non-streaming response handler for the
Anthropic Messages API that parses single JSON responses.

- createMessagesRequestBody: stream: true → options.requestOptions?.stream ?? true
- preparePostOptions: stream: true as default before spread (callers can override)
- processResponseFromMessagesEndpoint: auto-detect via Content-Type header
- processNonStreamingResponseFromMessagesEndpoint: new handler for JSON responses
  with tool call support in finishedCb delta, defensive parsing, cache-token
  consistency warning, unknown block type logging
- Remove stale 'stream not respected' comment from fetch.ts
- Remove stream: false from agentIntent.ts inline summarization
- 10 new tests for non-streaming handler
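
The passthrough and auto-detection described in the bullets above could look roughly like the sketch below. This is only an illustration: `RequestOptions`, `MessagesRequestBody`, `createBody`, and `isStreamingResponse` are hypothetical names, and the assumption that streaming responses are identified by a `text/event-stream` Content-Type is not confirmed by this log.

```typescript
// Hypothetical shapes; the real request and response types are not shown in this log.
interface RequestOptions {
	stream?: boolean;
}

interface MessagesRequestBody {
	model: string;
	stream: boolean;
}

// Caller-controlled passthrough: honor an explicit `stream: false`, default to `true`.
function createBody(model: string, requestOptions?: RequestOptions): MessagesRequestBody {
	return { model, stream: requestOptions?.stream ?? true };
}

// Assumed detection rule: server-sent events mean streaming, anything else is a single JSON body.
function isStreamingResponse(contentType: string | null): boolean {
	return (contentType ?? '').toLowerCase().includes('text/event-stream');
}
```

Using `??` rather than `||` matters here: an explicit `false` from the caller must survive, and only `undefined` should fall back to the default.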

* fix: add telemetry parity for non-streaming path and bump cache salt

* regenerate simulation cache for review-inline tests

* Regenerate simulation cache after rebase

* Temporarily disable multifile-edit-claude variant (#315940)

claude-3.5-sonnet returns model_not_supported from the endpoint, breaking
simulation cache regen. Re-enable when the test is updated to use a
currently-supported Claude model.

* Fix terminal strict-mode crash on empty suggestions + update baseline

- terminal.stest.ts: guard strict-mode `ok()` predicate so when the model
  returns no code block, the test fails cleanly with the existing message
  instead of crashing with 'Cannot read properties of undefined (reading match)'.
  Also drop the stale commented-out debug block.
- baseline.json: refresh scores (68.01 -> 68.69) and drop the 14 entries for
  the disabled multifile-edit-claude variant (see #315940).
- Remove now-orphaned multifile-edit-claude-panel.json outcome file.
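
A minimal sketch of the kind of guard described for terminal.stest.ts, assuming the predicate receives the extracted code block or `undefined` when the model returned none. The name `strictOk` and its signature are hypothetical and not taken from the test file.

```typescript
// Hypothetical predicate: returns false instead of throwing when there is no suggestion.
function strictOk(suggestion: string | undefined, expected: RegExp): boolean {
	if (suggestion === undefined) {
		// No code block in the reply: let the caller fail the test with its own message.
		return false;
	}
	return suggestion.match(expected) !== null;
}
```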

* Apply CI-observed score improvements for cpp inline scenarios

CI on Linux scores 4 cpp InlineChatIntent scenarios higher than my local
macOS run does (likely platform-specific line-ending/whitespace normalization
in the cpp grader). Update baseline.json to match the Linux scores:

- edit-InlineChatIntent [inline] [cpp] - edit for cpp:               5 -> 9
- edit-InlineChatIntent [inline] [cpp] - edit for macro:             0 -> 2
- generate-InlineChatIntent [inline] [cpp] - cpp code generation:    3 -> 10
- generate-InlineChatIntent [inline] [cpp] - templated code gen:     0 -> 10

Overall score: 68.69 -> 68.86.

* Populate cpp diagnostic cache via Docker for cross-platform parity

The earlier rebase cache regen produced new LLM responses for the cpp
inline tests but failed to populate the clang diagnostic provider cache
for those new inputs, because clang detection on macOS is broken (Apple
clang prints '-v' output to stderr, but findIfInstalled only checks
stdout) and Docker wasn't running. As a result the cpp diagnostic cache
was missing entries for the new LLM responses, and CI re-ran clang live
on each platform with diverging results:

  - Linux CI:   clang available, scored highest (9, 2, 10, 10)
  - Windows CI: no clang, errored out (5, 0, 10, 10 with worsening)
  - macOS:      Apple clang misdetected as missing, Docker off, errored

This commit:

  1. Bumps CLANG_DIAGNOSTICS_PROVIDER_CACHE_SALT 5 -> 6 to invalidate
     any contaminated entries.
  2. Adds two new cache layers populated by running cpp tests via Docker
     (using the mcr.microsoft.com/devcontainers/cpp image, same Linux
     clang as CI). All 14 cpp scenarios now produce deterministic,
     platform-independent diagnostic results when read from cache.

Verified with --require-cache: all cpp scenarios pass without invoking
clang/docker at runtime.
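
For illustration only, the macOS misdetection described above (version output on stderr, check on stdout) can be modeled as below. This commit works around the problem with cached Docker results and does not claim to change detection. `ProcessOutput` and both functions are hypothetical and are not the actual `findIfInstalled` code.

```typescript
// Hypothetical capture of a finished `clang -v` invocation.
interface ProcessOutput {
	stdout: string;
	stderr: string;
}

// Models the reported failure mode: only stdout is inspected.
function detectFromStdoutOnly(output: ProcessOutput): boolean {
	return /clang version/i.test(output.stdout);
}

// A more tolerant variant: inspect both streams.
function detectFromBothStreams(output: ProcessOutput): boolean {
	return /clang version/i.test(`${output.stdout}\n${output.stderr}`);
}
```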
2026-05-11 21:59:58 -07:00
Johannes d314fbc8c7 stests 2026-04-24 09:34:51 +02:00
Johannes Rieken 9ac60c7956 inline chat: drop EditFile tool when better edit tools are available (#4479)
* inline chat: drop EditFile tool when better edit tools are available

Fixes https://github.com/microsoft/vscode/issues/302062

* stest

* stest
2026-03-18 11:20:28 +00:00
Johannes Rieken 1554132d4b Better inline chat exit (#4361)
* Refactor inline chat exit tool handling and update prompt instructions for clarity

re https://github.com/microsoft/vscode/issues/296601#top

* stests

* stests...
2026-03-11 21:31:24 +00:00
Connor Peet 7e7c1a6cc7 tools: add binary file support with hexdump display (#4331)
* tools: add binary file support with hexdump display

Adds support for reading and displaying binary files in the read file tool
with a hexdump-formatted view. This enables better handling of binary content
in the IDE context without attempting to interpret it as text.

- Adds hexdump utility to format binary data in a readable hex/ASCII view
- Extends readFileTool to detect binary files and provide formatted output
- Adds binaryFileHexdump prompt component for displaying binary content
- Integrates binary file variable support in file variable display
- Updates test fixtures with binary file handling scenarios
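
A hex/ASCII view of the kind mentioned above can be sketched as follows. This is not the utility added by the commit: the function name, the 16-byte default row width, and the exact column layout are assumptions chosen for the example.

```typescript
// Formats bytes as rows of: 8-digit hex offset, hex bytes, printable-ASCII column.
function hexdump(bytes: Uint8Array, bytesPerLine = 16): string {
	const lines: string[] = [];
	for (let offset = 0; offset < bytes.length; offset += bytesPerLine) {
		const chunk = bytes.subarray(offset, offset + bytesPerLine);
		const hex = Array.from(chunk, b => b.toString(16).padStart(2, '0')).join(' ');
		// Non-printable bytes are shown as '.' so the column width stays fixed.
		const ascii = Array.from(chunk, b => (b >= 0x20 && b <= 0x7e ? String.fromCharCode(b) : '.')).join('');
		// Pad short final rows so the ASCII column lines up with full rows.
		const paddedHex = hex.padEnd(bytesPerLine * 3 - 1, ' ');
		lines.push(`${offset.toString(16).padStart(8, '0')}  ${paddedHex}  |${ascii}|`);
	}
	return lines.join('\n');
}
```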

Fixes https://github.com/microsoft/vscode/issues/284178
Fixes https://github.com/microsoft/vscode/issues/299973

(Commit message generated by Copilot)

* pr comments

* baseline update

* baseline update
2026-03-10 21:49:13 +00:00
Logan Ramos 62215a8a23 Remove hard coded gpt 4.1 in favor of copilot base (#4124)
* Remove hard coded gpt 4.1 in favor of copilot base

* Update simulator cache
2026-03-02 22:05:28 +00:00
Johannes Rieken ab69ce9740 inline chat: fix tool call round ordering in prompt and surface edit … (#4102)
* inline chat: fix tool call round ordering in prompt and surface edit failures

* stests
2026-03-02 10:48:47 +00:00
Johannes Rieken 1265160099 Inline chat v2 should use readTool when dealing with large files (#3990)
* Inline chat: support large-file read rounds and prompt read history

* stests
2026-02-25 12:15:07 +00:00
Ulugbek Abdullaev 7d728b836e swb: fix support for external NES stests for windows & "run test once" (#3768)
* swb: fix: make external NES stest running windows-compatible

* swb: support running a single external test

* address ccr
2026-02-16 15:21:10 +00:00
Ulugbek Abdullaev 5e0ca23866 nes: flush copilot-nes-oct cache and run external nes stests (#3605)
* nes: flush copilot-nes-oct cache and run external nes stests

* update nes stests
2026-02-10 13:45:19 +00:00
Rob Lourens e5cd86beeb Rename /summarize to /compact, add optional extra instructions (#3556)
* Rename /summarize to /compact, add optional extra instructions

* Update simulation baseline
2026-02-08 18:11:03 +00:00
Matt Bierner 34b17f117d Remove @workspace chat participant (#3492)
* Remove `@workspace` chat participant

For https://github.com/microsoft/vscode/issues/292972

Removes the `@workspace` chat participant, since it is now an outdated (and confusing) way to use code search. For now we'll keep the commands, but I've moved them under the default agent instead.

* Updating tests too and fixing some references
2026-02-06 00:21:09 +00:00
Johannes Rieken e8e9b03ba7 Tweak prompt selection and also remind model to do all edits in a single tool call (#3125)
* Tweak prompt selection and also remind model to do all edits in a single tool call

* (fix) correct tool name

* stest

* more stest drama
2026-01-26 08:36:17 +00:00
Johannes Rieken 1b8308f1fd Inline chat handles empty selections explicitly (#2535)
* handle empty selection better in inline chat

* stests

* add unit tests

* Update src/extension/prompts/node/inline/inlineChat2Prompt.tsx

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* stests

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-12-11 13:50:17 +00:00
Ulugbek Abdullaev 3417fc6d7c update cache (#2533) 2025-12-11 10:36:12 +00:00
Johannes Rieken 36f0ab1f6f inline chat fixes (#2348)
* make sure the exit-tool is called when nothing else has been called

fixes https://github.com/microsoft/vscode/issues/280775

* tweak inline prompt for better prefix-caching

https://github.com/microsoft/vscode-internalbacklog/issues/6337
2025-12-03 10:14:20 +00:00
Ulugbek Abdullaev 3c78ed81dd nes: support /models on proxy and model picker (#2325) 2025-12-02 14:14:53 +00:00
Johannes Rieken 0231290715 keep intent detection for inline v1 intact (#2264) 2025-11-28 14:29:05 +00:00
Johannes Rieken bfc3fe5285 check edit tool results for errors and try again if editing failed (#2246)
* check edit tool results for errors and try again if editing failed

https://github.com/microsoft/vscode/issues/275056

* Update src/extension/prompts/node/inline/inlineChat2Prompt.tsx

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix prompt

* update-baseline

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-27 17:09:20 +00:00
Johannes Rieken a951ba62d1 fix https://github.com/microsoft/vscode/issues/277850 (#2079) 2025-11-19 15:17:31 +00:00
Johannes Rieken 34393c52c7 Always use github.copilot.editingSessionEditor, enable slash-command based on v2-config (#2060)
* Always use `github.copilot.editingSessionEditor`, enable slash-command based on v2-config

This routes all requests to the `InlineChatIntent`, which decides (based on the v2-config) whether to use the "old world" path for inline chat requests.

* re-run tests
2025-11-18 16:57:12 +00:00
Johannes Rieken a3a3829ee3 remove temporal context experiment (#1957)
fixes https://github.com/microsoft/vscode-copilot/issues/17115
fixes https://github.com/microsoft/vscode-copilot/issues/12756
fixes https://github.com/microsoft/vscode-copilot/issues/11674
2025-11-12 16:52:08 +00:00
Johannes Rieken 72dc56bdf3 Use cacheBreakpoint for inline chat prompt (#1954) 2025-11-12 16:15:07 +00:00
Ulugbek Abdullaev 1bd445ec60 update cache (#1822) 2025-11-07 10:39:59 +00:00
Ulugbek Abdullaev 4e4997c6d8 update cache (#1824) 2025-11-06 14:04:19 +00:00
Ladislau Szomoru 2c92092e40 Git - add repository/branch name to the commit message generation context (#1756) 2025-11-02 19:26:50 +00:00
Ulugbek Abdullaev fe96b6d014 update cache (#1705) 2025-10-30 13:09:03 +00:00
Ulugbek Abdullaev 66fc3dc3fd update cache (#1688) 2025-10-29 11:39:45 +00:00
Johannes Rieken fcbff5831a InlineChatIntent (#1549)
* remove references to old setting `github.copilot.chat.advanced.inlineChat2`

* play with `InlineChatIntent`

* wip

* move things, better/simpler prompt

* cleanup, renames, stuff

* more wip

* done after tool call

* edit and generate stest for new InlineChatIntent

* use codebook for diagnostics

* inline chat fixing stests

* stest run

* remove old Inline2 tests

* remove slash commands for v2, remove the editCodeIntent path for v2

* 💄

* 💄

* Don't use `diagnosticsTimeout` with inline chat, because the new diagnostics will never be read and only slow down the result

* fix compile error

* stest run

* update baseline

* prevent some JSON errors from empty output

* unfresh baseline.json

* use `MockGithubAvailableEmbeddingTypesService` in stests

* back to hamfisted skipping of stests

* send telemetry from inline chat intent

* tweak some stests
2025-10-29 10:44:00 +00:00
Kyle Cutler d316af6ada Support generating PR descriptions based on a template (#1431)
* Support generating PR descriptions based on a template

* Try fix tests

* Comment

* Update cache
2025-10-27 14:07:07 +00:00
Kyle Cutler 7be42d291f Regenerate model metadata cache and fix outdated s-test (#1551)
Co-authored-by: Ulugbek Abdullaev <ulugbekna@gmail.com>
2025-10-24 15:49:26 +00:00
Ulugbek Abdullaev 10ffc78324 update cache (#1396)
* update cache

* update cache
2025-10-17 12:35:42 +00:00
Pierce Boggan 0af1ef6fce Model should self-identify if asked (#1267)
* Update model response guidelines in agent prompts

- Modified the response instructions in `agentPrompt.tsx` to clarify when to state the model name.
- Enhanced the `CopilotIdentityRules` class in `copilotIdentity.tsx` to include model name responses.

| File                             | Changes                                                                 |
|----------------------------------|-------------------------------------------------------------------------|
| `agentPrompt.tsx`               | Updated guidance on stating model name when asked.                     |
| `copilotIdentity.tsx`           | Added model name response instructions in identity rules.               |

* Update guidance on model name disclosure in agent prompts

- Revised instruction to clarify that the model name should not be volunteered unless explicitly asked by the user.

| File                                   | Changes Made                                      |
|----------------------------------------|--------------------------------------------------|
| src/extension/prompts/node/agent/agentPrompt.tsx | Updated reminder text regarding model name disclosure. |

* Update agent prompts to include model disclosure

- Added instruction to state the model being used when asked about it in multiple agent prompt snapshots.

| File Path                                                                 | Changes Made                                                                 |
|---------------------------------------------------------------------------|------------------------------------------------------------------------------|
| src/extension/prompts/node/agent/test/__snapshots__/agentPrompt.spec.tsx.snap | Updated responses to include model disclosure when asked about the model.    |

* Add SQLite cache file for simulation layer

This commit introduces a new SQLite cache file for the simulation layer to enhance data retrieval efficiency.

| File                                                                 | Changes                       |
|----------------------------------------------------------------------|-------------------------------|
| test/simulation/cache/layers/2171978e-88a1-4218-afac-dc1fe7ecc095.sqlite | New file created with versioning info |

* Add SQLite cache file for simulation layer

This commit introduces a new SQLite cache file to enhance the simulation layer's performance.

| File                                                                 | Changes                |
|----------------------------------------------------------------------|------------------------|
| test/simulation/cache/layers/94afd615-5805-4860-a1ba-3f9ebbf7b9a4.sqlite | New file added 📁      |

* remove bad cache layers

* npm run simulate

* update baseline

---------

Co-authored-by: João Moreno <joaomoreno@users.noreply.github.com>
2025-10-16 18:21:10 +00:00
Ladislau Szomoru 063d9a6842 Git - switch commit message generation to gpt5-mini (#1249) 2025-10-07 07:57:48 +00:00
Rob Lourens 5eaa17af31 gpt-5-codex prompt improvements (#1177)
* gpt-5-codex prompt improvements

* update baseline
2025-09-28 22:56:41 +00:00
Rob Lourens 9ce9f6cb2b Reduce INTERNAL_RESTRICTED (#1029)
* Reduce INTERNAL_RESTRICTED
and delete edits linkification setting

* Update
2025-09-12 04:43:26 +00:00
Matt Bierner d7c09e68fe Don't index files outside of the workspace (#957)
For https://github.com/microsoft/vscode/issues/261532

Makes sure callers don't accidentally try indexing files outside of the workspace
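
One way such a guard can be expressed is a containment check on normalized path segments. The sketch below is hypothetical and assumes absolute POSIX-style paths. The actual change may instead rely on workspace URIs or an existing service, which this log does not show.

```typescript
// Splits a POSIX-style path into segments, resolving '.' and '..'.
function normalizeSegments(path: string): string[] {
	const out: string[] = [];
	for (const segment of path.split('/')) {
		if (segment === '' || segment === '.') {
			continue;
		}
		if (segment === '..') {
			out.pop();
			continue;
		}
		out.push(segment);
	}
	return out;
}

// True when `candidate` is the workspace root itself or lies beneath it.
function isInsideWorkspace(workspaceRoot: string, candidate: string): boolean {
	const root = normalizeSegments(workspaceRoot);
	const target = normalizeSegments(candidate);
	return root.length <= target.length && root.every((segment, i) => segment === target[i]);
}
```

Comparing whole segments rather than string prefixes avoids treating `/wsother` as being inside `/ws`.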
2025-09-09 01:35:22 +00:00
Aaron Munger b3868ec970 debug chat replay with replay intent (#693)
* stub out new participant for replay

* add debugger to step through replay file

* parse and debug json replay

* make edits from the replay

* create absolute path

* update for the latest json format

* show tool calls with tool call renderer

* use singleton object for response queue

* cleanup

* formatting

* baseline update

* baseline update again

* just disable tool from normal agent calls

* reverting stest changes

* nes: clean up enforcing minimum response delay and cancellation handling (#746)

* nes: do not enforce minimum response delay for NES in 2 cases  (#748)

* nes: clean up enforcing minimum response delay and cancellation handling

* nes: do not enforce minimum response delay for NES in 2 cases

    1. for NES that's a subsequent edit (i.e., a non-first edit in a set of multiple edits that come from a single model request)
    2. for NES that was cached and is returned again after rebasing on user edits
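
The two exemptions listed above could be modeled as in the sketch below. All names are hypothetical, and how the real code tracks elapsed time or cancellation is not shown in this log.

```typescript
// Hypothetical flags describing where a next-edit suggestion came from.
interface NesResult {
	isSubsequentEdit: boolean;
	isCachedAfterRebase: boolean;
}

// Returns how long to keep waiting before showing the suggestion.
function remainingDelayMs(result: NesResult, elapsedMs: number, minDelayMs: number): number {
	if (result.isSubsequentEdit || result.isCachedAfterRebase) {
		// Exempt cases: no minimum delay is enforced.
		return 0;
	}
	return Math.max(0, minDelayMs - elapsedMs);
}
```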

* Move to latest completion core (#749)

* update baseline

* use existing util

---------

Co-authored-by: Ulugbek Abdullaev <ulugbekna@gmail.com>
Co-authored-by: Dirk Bäumer <dirkb@microsoft.com>
2025-08-25 17:59:45 +00:00
Rob Lourens 912e4ce46f Set max_tokens on intentDetection to avoid runaway responses (#648)
* Set max_tokens on intentDetection to avoid runaway responses

* Update baseline
2025-08-18 20:01:51 +00:00
Connor Peet ad4d063779 prompts: avoid wrapping file attachments in code fences (#569)
* prompts: avoid wrapping file attachments in code fences

We previously included attached files like this:

```
<attachment id="test.js" filePath="/home/jola/test.js">
\```javascript
bar

asd (
\```
</attachment>
```

This, very reasonably, confused the model into thinking that
the file actually contained those backticks. This change removes the
fences so that it's just:

```
<attachment id="test.js" filePath="/home/jola/test.js">
bar

asd (
</attachment>
```
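
As a rough sketch, rendering an attachment in the fence-free form amounts to the following. `renderAttachment` is a hypothetical helper, not the prompt component touched by this change, and it assumes `id` and `filePath` need no attribute escaping.

```typescript
// Wraps raw file content in an <attachment> element without adding a code fence.
function renderAttachment(id: string, filePath: string, content: string): string {
	return `<attachment id="${id}" filePath="${filePath}">\n${content}\n</attachment>`;
}
```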

Refs https://github.com/microsoft/vscode/issues/260772#issuecomment-3176793591

Busted all the caches with associated baseline noise.

* fixup
2025-08-12 19:06:46 +00:00
Daniel Imms e30b068b22 Move terminal selection/command tools to core (#545)
* Move terminal selection/command tools to core

Part of microsoft/vscode#259260

* Update baseline
2025-08-11 19:11:17 +00:00
Don Jayamanne d2713c5e3e Add some simple NES tests for alt format of notebooks (#543)
* Add some simple NES tests for alt format of notebooks

* Updates

* Updates
2025-08-11 10:40:52 +00:00
Rob Lourens bba605a8e0 Remove IntentParams and intent:true (#534)
This took a long time; it started as a PR on the old vscode-copilot repo
2025-08-09 18:43:43 +00:00
Rob Lourens 4cc98c62a4 Add a switch for Instant Apply to use short model for lower context prompts (#497)
* use enum for model

* add small proxy endpoint and use when needed

* extend endpoint to reduce duplication and imply dependence

* use experiment service

* rename small -> short

* fix model name

* fix type

* separate endpoint impls for upcoming header changes

* remove extra spaces

* Update baseline

* Full update

* Update

---------

Co-authored-by: Vritant Bhardwaj <vrtoku@gmail.com>
Co-authored-by: Vritant Bhardwaj <vrbhardw@microsoft.com>
2025-08-07 17:29:28 +00:00
Rob Lourens 3bd01fab1d Use lineNumberStyle for FileVariable (#227)
* Use lineNumberStyle for FileVariable

* vitest

* update test files with no baseline...

* Update baseline scores for toolCalling tests to reflect recent changes

* tests

* Update baseline

---------

Co-authored-by: Ulugbek Abdullaev <ulugbekna@gmail.com>
2025-08-07 06:04:26 +00:00
Rob Lourens 2e245cc182 Fix thinking tool (#474)
* Fix thinking tool
#259934

* Update
2025-08-05 23:52:34 +00:00
Bhavya U 613da9522b remove unused language fields and clean up project setup info (#404)
* remove unused language fields and clean up project setup info

* fix: add missing punctuation in instructions for clarity

* Update cache

* Remove cache layer file that requires signed commit

* Update cache
2025-07-30 23:47:50 +00:00
Megan Rogge 219996961b rm docSearchClient from startDebugging (#379)
* rm docSearchClient from startDebugging

* update baseline
2025-07-28 16:23:06 +00:00
Megan Rogge f99ca51227 rm task tools (#357)
* rm task tools

* rm unused

* update baseline

* update baseline
2025-07-24 21:42:31 +00:00