* ci: split Electron test jobs into unit/integration and smoke
The Linux, Windows and macOS Electron PR test jobs are the slowest in CI,
dominated by the smoke test run. Split each into two parallel jobs - one
running unit + integration tests, the other running smoke tests - to cut
wall-clock time.
Done via two new parameters on the reusable workflows
(unit_and_integration_tests and smoke_tests, both defaulting to true) so
Browser and Remote jobs are unchanged. Artifact names get a -smoke suffix
on the smoke-only job to avoid upload collisions.
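A minimal sketch of the naming rule above, written in Python for illustration only. The real logic lives in workflow expressions; the helper name and the base artifact name passed to it are hypothetical.

```python
def artifact_name(
    base: str,
    unit_and_integration_tests: bool = True,
    smoke_tests: bool = True,
) -> str:
    """Return the upload name for a job's artifacts.

    Both flags default to True, mirroring the reusable-workflow inputs,
    so callers that pass neither keep the original name. Only the
    smoke-only job gets the "-smoke" suffix.
    """
    smoke_only = smoke_tests and not unit_and_integration_tests
    return f"{base}-smoke" if smoke_only else base
```

Browser and Remote jobs correspond to the default call, which returns the base name unchanged.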
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* ci: gate build and diagnostics to correct Electron test phase
Follow-up to the Electron job split. Ensure each half only does the work
it needs:
- Gate "Build integration tests" on unit_and_integration_tests so the
smoke-only job skips it.
- Scope the before/after diagnostics steps to their phase (combined with
always()) so they don't run in the wrong job.
- Move the Copilot extension build into the smoke phase (gated on
smoke_tests) instead of compiling it unconditionally; align Linux,
Windows and macOS on the same ordering.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* ci: drop space and parens from Electron-Smoke job name
The Windows 1ES runner builds its JobId label from job_name, producing
"windows-test-Electron (Smoke)-...". The space and parentheses prevented
the runner from picking up the job. Rename the smoke job to Electron-Smoke
on all three platforms so the JobId is a plain slug.
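To illustrate the constraint, here is a small check in Python. The exact character set the 1ES runner accepts is an assumption (letters, digits, '.', '_' and '-'), and the helper name is hypothetical.

```python
import re

# Assumption: a "plain slug" contains only letters, digits, '.', '_' and '-'.
_SLUG = re.compile(r"[A-Za-z0-9._-]+")


def is_plain_slug(job_name: str) -> bool:
    """Return True if job_name has no spaces, parentheses or other symbols."""
    return _SLUG.fullmatch(job_name) is not None
```

Under that assumption, "Electron (Smoke)" is rejected and "Electron-Smoke" is accepted.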
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fixes
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
refactor: update restore-node-modules action to support lookup-only functionality
- Replaced 'extract' input with 'lookup-only' to allow cache entry checks without downloading or extracting.
- Updated action logic to conditionally extract node_modules based on the new 'lookup-only' input.
- Adjusted workflow files to utilize 'lookup-only' for cache-warming jobs on Linux, macOS, and Windows.
* CI: speed up node_modules cache with zstd + shared scripts
Switch the Linux/macOS node_modules cache from single-threaded gzip
(tar -czf) to multi-threaded zstd. The "Create node_modules archive"
step was spending ~5min of single-core gzip on a multi-GB tree on every
cache miss; zstd -T0 uses all cores and decompresses much faster, so
cache-hit jobs benefit too. Windows stays on 7-Zip (already threaded).
Extract the archive/extract commands into shared per-platform scripts
under .github/workflows/node_modules_cache/ (cache.sh / cache.ps1, each
dispatching on an archive|extract argument) so the format and flags live
in one place instead of being duplicated across ~8 workflows. Bump
build/.cachesalt to invalidate existing gzip caches.
Also remove the obsolete extensions/copilot CI workflows
(copilot-setup-steps.yml, ensure-node-modules-cache.yml, pr.yml) and the
unused build/listBuildCacheFiles.js, and drop their now-stale entries
(plus lit-html and signals-core) from .eslint-allowed-javascript-files.
* ci: seed copilot node_modules cache on main and rename cache keys
Add copilot-linux and copilot-windows jobs to pr-node-modules.yml so the
copilot node_modules cache is populated on main. Rename the copilot cache
keys to copilot-node_modules-linux / copilot-node_modules-windows in pr.yml.
* ci: extract node_modules cache into composite actions
Factor the repeated node_modules cache plumbing into two local composite
actions, restore-node-modules and save-node-modules, and migrate all
workflows that used the cache.sh/cache.ps1 archive flow (pr, pr-node-modules,
pr-{linux,darwin,win32}-test, copilot-setup-steps, component-fixtures,
css-order-scan).
- restore-node-modules computes the key, restores the cache, optionally
extracts on a hit, and exports the resolved key via $GITHUB_ENV.
- save-node-modules archives node_modules and saves it to the cache, reusing
the key exported by restore so callers don't repeat the prefix.
- Bespoke install steps stay in the workflows, so per-job env/secrets never
cross the action boundary.
- Only seed the cache on branch pushes (component-fixtures skips PRs, whose
caches aren't shared).
* save the node_modules cache for now to test it
* ci: fix node_modules cache save dropping the archive
cache.sh wrote its archive as cache.tzst, but actions/cache reserves that
name for its own tarball and passes --exclude cache.tzst, so our archive was
excluded and an empty (~200 B) cache was saved on Linux/macOS. Rename the
archive to node-modules.tzst and bump build/.cachesalt to invalidate the
broken cache entries.
* empty commit
* Remove saving to the node_modules cache from PR steps again
* ci: restore chat pipeline to windows-latest
* chore: remove node-gyp override
* chore: restore node-gyp override with comment
* refactor: remove dependency on @keyv/sqlite
The module locks the node-gyp dependency to 8.x because of its
transitive sqlite3 native module dependency, which in turn blocks
using newer Windows CI. Refs https://github.com/microsoft/vscode/issues/321267
The module can be replaced with the built-in sqlite support in
Node.js without losing the on-disk cache format that has already
been committed.
* chore: restore minimist
* chore: set sqlite busy timeout
* fix: decode json-buffer values for keyv cache compat
The "chat-lib tests (windows-latest)" job started failing at the
"Extract chat-lib" step (npm ci in extensions/copilot). npm ci builds
the native sqlite3@5.1.7 module — a transitive dependency of the
@keyv/sqlite devDependency — via `prebuild-install -r napi || node-gyp
rebuild`. prebuild-install finds no matching prebuilt, so it falls back
to node-gyp, which fails on the runner because the GitHub-hosted
windows-latest label now resolves to the Windows Server 2025 + Visual
Studio 2026 image, whose VS 18 toolchain the bundled node-gyp cannot
detect ("unknown version undefined ... could not find a version of
Visual Studio 2017 or newer").
This was the only npm ci job exposed to the new image: every other
Windows npm ci job runs on self-hosted pools pinned to windows-2022
(still VS 2022), and all other copilot npm ci jobs run on Linux/macOS.
Pin this matrix entry to windows-2022 to match, as recommended by the
runner-images migration notice (actions/runner-images#14017).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Lots of logging for chat smoke tests
* PR test workflows: build extensions/copilot before smoke tests
* PR test workflows: drop duplicate copilot compile from linux/win32 (was already built before integration tests)
* smoke tests: remove musl Claude binary on Linux glibc runner
The musl variant is probed first by @anthropic-ai/claude-agent-sdk and
fails to exec on glibc (ENOENT from missing ELF interpreter), which
caused the Test Claude session tests to time out.
Follow-up to #313128. The VSCODE_OSS fallback isn't needed for the
api.github.com calls in core-ci — secrets.GITHUB_TOKEN already
authenticates those reads with permissions: contents: read (added in
#304929), so we don't hit the anonymous rate limit on 1ES.
* ci: switch PR workflows back to 1ES self-hosted runners with JobId
Re-applies #311975 (reverted in #312033). Adds per-run+attempt JobId
labels to scope 1ES agents to specific GitHub Actions runs and prevent
intermittent runner cancellations.
Also switches the pr.yml compile job's GITHUB_TOKEN from the
ephemeral repo-scoped runner token to secrets.VSCODE_OSS so cross-repo
GitHub API release fetches (vscode-js-debug, vscode-js-debug-companion,
vscode-js-profile-visualizer, etc.) authenticate properly. On 1ES pools
the shared egress IPs hit the anonymous 60/hr api.github.com rate limit
and produced 403 fan-out across PRs last time.
* ci: fall back to GITHUB_TOKEN for fork PRs
Match the historical pattern from before #255987 — fork PRs can't
access secrets.VSCODE_OSS, so use the conditional to pick GITHUB_TOKEN
for forks.
* Allow cherry-pick bot PRs in engineering system changes check
Add an exception for PRs created by vs-code-engineering[bot] whose title
starts with [cherry-pick] and that carry the cherry-pick-artifact label.
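The exception amounts to three conditions that must all hold. A sketch in Python, for illustration only: the actual check is a workflow expression, and the function and parameter names here are hypothetical.

```python
from typing import Iterable


def is_cherry_pick_exception(author: str, title: str, labels: Iterable[str]) -> bool:
    """Return True only for bot-created cherry-pick PRs carrying the label."""
    return (
        author == "vs-code-engineering[bot]"
        and title.startswith("[cherry-pick]")
        and "cherry-pick-artifact" in set(labels)
    )
```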
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fetch cherry-pick-artifact label via API at runtime
The label is applied ~2s after PR creation, so the webhook payload may
not include it. Fetch current labels from the API instead, gated behind
cheap event-payload checks to avoid extra API calls on unrelated PRs.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add label retry loop and consolidate guard expressions
Retry the cherry-pick-artifact label check up to 3 times (2s apart) to
handle the ~2s delay between PR creation and label application.
Consolidate the repeated exception guards into a single 'allowed' step
with a 'blocked' output, simplifying downstream conditions.
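The retry behaviour can be sketched as follows. This is a Python illustration under stated assumptions, not the workflow's actual script: `fetch_labels` stands in for the API call that returns the PR's current labels, and `sleep` is injectable so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Iterable


def has_label_with_retry(
    fetch_labels: Callable[[], Iterable[str]],
    label: str = "cherry-pick-artifact",
    attempts: int = 3,
    delay_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Check for a label up to `attempts` times, pausing between tries."""
    for attempt in range(attempts):
        if label in set(fetch_labels()):
            return True
        # Do not sleep after the final attempt; there is nothing left to wait for.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False
```

With the defaults, a label that appears about 2s after PR creation is seen on the second or third attempt.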
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>