* feat: make stream a caller-controlled passthrough in Messages API
Allow callers to set stream: false via requestOptions instead of
hardcoding stream: true. Add non-streaming response handler for the
Anthropic Messages API that parses single JSON responses.
- createMessagesRequestBody: stream: true → options.requestOptions?.stream ?? true
- preparePostOptions: stream: true as default before spread (callers can override)
- processResponseFromMessagesEndpoint: auto-detect via Content-Type header
- processNonStreamingResponseFromMessagesEndpoint: new handler for JSON responses
with tool call support in finishedCb delta, defensive parsing, cache-token
consistency warning, unknown block type logging
- Remove stale 'stream not respected' comment from fetch.ts
- Remove stream: false from agentIntent.ts inline summarization
- 10 new tests for non-streaming handler
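  The passthrough and Content-Type routing described above can be sketched like this (names and shapes are illustrative, not the repo's actual code):

  ```typescript
  // Hypothetical sketch of the caller-controlled stream flag.
  interface RequestOptions {
    stream?: boolean;
  }

  function createMessagesRequestBody(prompt: string, requestOptions?: RequestOptions) {
    return {
      messages: [{ role: 'user', content: prompt }],
      // Default to streaming unless the caller explicitly opts out.
      stream: requestOptions?.stream ?? true,
    };
  }

  // Response routing: streaming replies arrive as server-sent events,
  // non-streaming replies as a single JSON body.
  function isStreamingResponse(contentType: string | null): boolean {
    return contentType?.includes('text/event-stream') ?? false;
  }
  ```

  Keeping `stream: true` as the default preserves existing behavior; only callers that pass `stream: false` take the new non-streaming path.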
* fix: add telemetry parity for non-streaming path and bump cache salt
* regenerate simulation cache for review-inline tests
* Regenerate simulation cache after rebase
* Temporarily disable multifile-edit-claude variant (#315940)
claude-3.5-sonnet returns model_not_supported from the endpoint, breaking
simulation cache regen. Re-enable when the test is updated to use a
currently-supported Claude model.
* Fix terminal strict-mode crash on empty suggestions + update baseline
- terminal.stest.ts: guard strict-mode `ok()` predicate so when the model
returns no code block, the test fails cleanly with the existing message
instead of crashing with 'Cannot read properties of undefined (reading match)'.
Also drop the stale commented-out debug block.
- baseline.json: refresh scores (68.01 -> 68.69) and drop the 14 entries for
the disabled multifile-edit-claude variant (see #315940).
- Remove now-orphaned multifile-edit-claude-panel.json outcome file.
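  The guard pattern behind the first bullet can be sketched as follows (helper names are hypothetical; the real predicate lives in terminal.stest.ts):

  ```typescript
  // Hypothetical helpers: extract a fenced code block safely before
  // asserting on it, so a missing block fails with a clear message
  // instead of a TypeError on `undefined.match`.
  function extractCodeBlock(response: string): string | undefined {
    const match = response.match(/```[\s\S]*?```/);
    return match?.[0];
  }

  function assertHasCodeBlock(response: string | undefined): string {
    const block = response ? extractCodeBlock(response) : undefined;
    if (!block) {
      throw new Error('Model response did not contain a code block');
    }
    return block;
  }
  ```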
* Apply CI-observed score improvements for cpp inline scenarios
CI on Linux scores 4 cpp InlineChatIntent scenarios higher than my local
macOS run does (likely platform-specific line-ending/whitespace normalization
in the cpp grader). Update baseline.json to match the Linux scores:
- edit-InlineChatIntent [inline] [cpp] - edit for cpp: 5 -> 9
- edit-InlineChatIntent [inline] [cpp] - edit for macro: 0 -> 2
- generate-InlineChatIntent [inline] [cpp] - cpp code generation: 3 -> 10
- generate-InlineChatIntent [inline] [cpp] - templated code gen: 0 -> 10
Overall score: 68.69 -> 68.86.
* Populate cpp diagnostic cache via Docker for cross-platform parity
The earlier rebase cache regen produced new LLM responses for the cpp
inline tests but failed to populate the clang diagnostic provider cache
for those new inputs, because clang detection on macOS is broken (Apple
clang prints '-v' output to stderr, but findIfInstalled only checks
stdout) and Docker wasn't running. As a result the cpp diagnostic cache
was missing entries for the new LLM responses, and CI re-ran clang live
on each platform with diverging results:
- Linux CI: clang available, scored highest (9, 2, 10, 10)
- Windows CI: no clang, errored out (5, 0, 10, 10 with worsening)
- macOS: Apple clang misdetected as missing, Docker off, errored
This commit:
1. Bumps CLANG_DIAGNOSTICS_PROVIDER_CACHE_SALT 5 -> 6 to invalidate
any contaminated entries.
2. Adds two new cache layers populated by running cpp tests via Docker
(using the mcr.microsoft.com/devcontainers/cpp image, same Linux
clang as CI). All 14 cpp scenarios now produce deterministic,
platform-independent diagnostic results when read from cache.
Verified with --require-cache: all cpp scenarios pass without invoking
clang/docker at runtime.
* remove references to old setting `github.copilot.chat.advanced.inlineChat2`
* play with `InlineChatIntent`
* wip
* move things, better/simpler prompt
* cleanup, renames, stuff
* more wip
* done after tool call
* edit and generate stest for new InlineChatIntent
* use codebook for diagnostics
* inline chat fixing stests
* stest run
* remove old Inline2 tests
* remove slash commands for v2, remove the editCodeIntent path for v2
* 💄
* 💄
* Don't use `diagnosticsTimeout` with inline chat because the new diagnostics will never be read but slow down the result
* fix compile error
* stest run
* update baseline
* prevent some JSON errors from empty output
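  A defensive parse along these lines (hypothetical helper) avoids throwing when the model produces empty or malformed output:

  ```typescript
  // Hypothetical sketch: return undefined instead of throwing on
  // empty or invalid JSON, so callers can fall back gracefully.
  function tryParseJson<T>(text: string): T | undefined {
    const trimmed = text.trim();
    if (!trimmed) {
      return undefined; // empty output: nothing to parse
    }
    try {
      return JSON.parse(trimmed) as T;
    } catch {
      return undefined;
    }
  }
  ```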
* unfresh baseline.json
* use `MockGithubAvailableEmbeddingTypesService` in stests
* back to hamfisted skipping of stests
* send telemetry from inline chat intent
* tweak some stests