The GitHub Actions PR test workflows run integration/smoke tests out of
sources, so each test section launches scripts/code.bat, which runs
build/lib/preLaunch.ts. Unlike the Azure Pipelines product builds, the
GitHub workflows did not set VSCODE_SKIP_PRELAUNCH, so preLaunch ran on
every section and getElectron() unconditionally deleted and re-downloaded
.build/electron each time. On Windows this races with file locks held by
the just-exited Electron process and intermittently fails the whole job
with the bare 'The system cannot find the path specified.' error.
- Set VSCODE_SKIP_PRELAUNCH=1 on the unit/integration/remote test steps of
the win32, linux and darwin PR workflows, matching Azure Pipelines (the
workflows already prepare node_modules, out, built-in extensions and
Electron in dedicated steps before the tests run).
- Make getElectron() version-aware: skip the destructive re-download when
the installed Electron already matches the expected version, falling back
to a download on any detection failure.
- Make scripts/code.bat fail fast with a clear message when preLaunch.ts
fails instead of falling through to launch a missing executable.
- Retry rimraf on EBUSY/EPERM (Windows file-lock codes), not just ENOTEMPTY.
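The version check and the widened retry described in the list above can be sketched as follows. This is an illustration only, not the code in build/lib: the names needsElectronDownload, isRetryableFsError and removeWithRetry, the injected readInstalledVersion callback, and the attempt count and backoff values are all assumptions made for the example.

```typescript
// Hypothetical helpers for illustration; not the actual build/lib implementation.

/** Error codes treated as transient. EBUSY and EPERM are typical of Windows file locks. */
const RETRYABLE_CODES = new Set(['ENOTEMPTY', 'EBUSY', 'EPERM']);

/**
 * Returns true when Electron must be downloaded.
 * `readInstalledVersion` is an injected reader (assumption); if it throws,
 * detection has failed and we fall back to downloading.
 */
function needsElectronDownload(readInstalledVersion: () => string, expectedVersion: string): boolean {
	try {
		return readInstalledVersion().trim() !== expectedVersion;
	} catch {
		return true;
	}
}

/** True when the error carries one of the retryable codes. */
function isRetryableFsError(err: unknown): boolean {
	const code = (err as { code?: unknown } | null)?.code;
	return typeof code === 'string' && RETRYABLE_CODES.has(code);
}

/** Runs `remove`, retrying with a linear backoff while it fails with a retryable code. */
async function removeWithRetry(remove: () => Promise<void>, maxAttempts = 5, delayMs = 100): Promise<void> {
	for (let attempt = 1; ; attempt++) {
		try {
			await remove();
			return;
		} catch (err) {
			if (attempt >= maxAttempts || !isRetryableFsError(err)) {
				throw err; // out of attempts, or not a lock-style failure
			}
			await new Promise<void>(resolve => setTimeout(resolve, delayMs * attempt));
		}
	}
}
```

Injecting the version reader and the remove callback keeps the sketch free of file-system access, so the skip/download decision and the retry policy can be exercised without a real .build/electron directory.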
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Symbolizing the intermittent Electron-startup SIGSEGV crash dumps (Ubuntu debug
symbols + frame-pointer walk) shows the fault is an upstream Pango concurrency
bug, not a VS Code bug:
    pango:      fc_thread_func -> init_in_thread -> FcInit() (pangofc-fontmap.c)
    fontconfig: FcInit -> FcConfigParseAndLoadFromMemory -> _FcConfigParse
    libexpat:   XML_ParseBuffer -> libc (NULL deref, SIGSEGV)
Pango >= 1.52's pango_fc_font_map_init() unconditionally spawns a
"[pango] fontconfig" thread that runs FcInit(); that races with the
Electron/Chromium main thread's own fontconfig use during startup and corrupts
fontconfig's global config while it is being parsed. The threaded design is a
known-bad area upstream (pango#784 "single fontconfig thread introduces a hang
... seems to be due to a race condition", pango#872), and there is no env var to
disable it (still present in Pango 1.56).
It only manifests in our CI because the race window is microscopic: it needs a
cold process, two threads hitting first-time FcInit() simultaneously, and a slow
machine. Our smoke job is a near-perfect trigger: fresh contended runners, a
wiped fontconfig cache + custom FONTCONFIG_FILE (so FcInit re-parses cold), and
~25 cold Electron starts per run. (This also explains why the expat version was
irrelevant and why dropping the config DOCTYPE made it worse: it is pure timing,
not parser/content.)
Fix: initialize fontconfig once, single-threaded, from an ELF constructor that
runs before main() (and thus before any thread exists), via a tiny LD_PRELOAD
shim. Pango's later threaded FcInit() then finds fontconfig already initialized
and returns immediately, so the concurrent parse never happens and the race is
eliminated.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Linux smoke-test job works around the expat 2.6.1 fontconfig NULL-deref
CVEs by pointing FONTCONFIG_FILE at a minimal config with <include> removed.
However that config still declared an external DTD:
<!DOCTYPE fontconfig SYSTEM "urn:fontconfig:fonts.dtd">
fontconfig feeds that DTD to expat as an external *parameter* entity, which
still hits the not-yet-backported CVE-2026-32776 / CVE-2026-32778 crash paths
on expat 2.6.1 even with <include> gone. This was observed in CI as a SIGSEGV
inside libexpat (called from libfontconfig) during Chromium browser-process
font initialization, which crashed Electron at startup. Because the smoke-test
launch used no timeout, that crash surfaced only as an opaque 120s Mocha
"before all" hook timeout.
fontconfig does not require the DOCTYPE, so drop it to remove the last
external-entity codepath. The full workaround can be removed once the runner
ships libexpat >= 2.7.5 (the step already auto-disables itself in that case).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* ci: split Electron test jobs into unit/integration and smoke
The Linux, Windows and macOS Electron PR test jobs are the slowest in CI,
dominated by the smoke test run. Split each into two parallel jobs - one
running unit + integration tests, the other running smoke tests - to cut
wall-clock time.
Done via two new parameters on the reusable workflows
(unit_and_integration_tests and smoke_tests, both defaulting to true) so
Browser and Remote jobs are unchanged. Artifact names get a -smoke suffix
on the smoke-only job to avoid upload collisions.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* ci: gate build and diagnostics to correct Electron test phase
Follow-up to the Electron job split. Ensure each half only does the work
it needs:
- Gate "Build integration tests" on unit_and_integration_tests so the
smoke-only job skips it.
- Scope the before/after diagnostics steps to their phase (combined with
always()) so they don't run in the wrong job.
- Move the Copilot extension build into the smoke phase (gated on
smoke_tests) instead of compiling it unconditionally; align Linux,
Windows and macOS on the same ordering.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* ci: drop space and parens from Electron-Smoke job name
The Windows 1ES runner builds its JobId label from job_name, producing
"windows-test-Electron (Smoke)-...". The space and parentheses prevented
the runner from picking up the job. Rename the smoke job to Electron-Smoke
on all three platforms so the JobId is a plain slug.
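A guard along these lines could flag such names before they reach the runner. It is a sketch under an assumed rule (only letters, digits, '-' and '_' are safe); the 1ES runner's real constraints are not stated here, and isPlainSlug and toSlug are hypothetical names.

```typescript
/** Assumed rule: a job name is safe when it contains only letters, digits, '-' or '_'. */
function isPlainSlug(jobName: string): boolean {
	return /^[A-Za-z0-9_-]+$/.test(jobName);
}

/** Replaces each run of other characters with '-' and trims leading/trailing '-'. */
function toSlug(jobName: string): string {
	return jobName.replace(/[^A-Za-z0-9_-]+/g, '-').replace(/^-+|-+$/g, '');
}

console.log(isPlainSlug('Electron (Smoke)')); // false
console.log(toSlug('Electron (Smoke)'));      // Electron-Smoke
```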
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Fixes
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* CI: speed up node_modules cache with zstd + shared scripts
Switch the Linux/macOS node_modules cache from single-threaded gzip
(tar -czf) to multi-threaded zstd. The "Create node_modules archive"
step was spending ~5min of single-core gzip on a multi-GB tree on every
cache miss; zstd -T0 uses all cores and decompresses much faster, so
cache-hit jobs benefit too. Windows stays on 7-Zip (already threaded).
Extract the archive/extract commands into shared per-platform scripts
under .github/workflows/node_modules_cache/ (cache.sh / cache.ps1, each
dispatching on an archive|extract argument) so the format and flags live
in one place instead of being duplicated across ~8 workflows. Bump
build/.cachesalt to invalidate existing gzip caches.
Also remove the obsolete extensions/copilot CI workflows
(copilot-setup-steps.yml, ensure-node-modules-cache.yml, pr.yml) and the
unused build/listBuildCacheFiles.js, and drop their now-stale entries
(plus lit-html and signals-core) from .eslint-allowed-javascript-files.
* ci: seed copilot node_modules cache on main and rename cache keys
Add copilot-linux and copilot-windows jobs to pr-node-modules.yml so the
copilot node_modules cache is populated on main. Rename the copilot cache
keys to copilot-node_modules-linux / copilot-node_modules-windows in pr.yml.
* ci: extract node_modules cache into composite actions
Factor the repeated node_modules cache plumbing into two local composite
actions, restore-node-modules and save-node-modules, and migrate all
workflows that used the cache.sh/cache.ps1 archive flow (pr, pr-node-modules,
pr-{linux,darwin,win32}-test, copilot-setup-steps, component-fixtures,
css-order-scan).
- restore-node-modules computes the key, restores the cache, optionally
extracts on a hit, and exports the resolved key via $GITHUB_ENV.
- save-node-modules archives node_modules and saves it to the cache, reusing
the key exported by restore so callers don't repeat the prefix.
- Bespoke install steps stay in the workflows, so per-job env/secrets never
cross the action boundary.
- Only seed the cache on branch pushes (component-fixtures skips PRs, whose
caches aren't shared).
* save the node_modules cache for now to test it
* ci: fix node_modules cache save dropping the archive
cache.sh wrote its archive as cache.tzst, but actions/cache reserves that
name for its own tarball and passes --exclude cache.tzst, so our archive was
excluded and an empty (~200 B) cache was saved on Linux/macOS. Rename the
archive to node-modules.tzst and bump build/.cachesalt to invalidate the
broken cache entries.
* empty commit
* Remove saving to the node_modules cache from PR steps again
* Lots of logging for chat smoke tests
* PR test workflows: build extensions/copilot before smoke tests
* PR test workflows: drop duplicate copilot compile from linux/win32 (was already built before integration tests)
* smoke tests: remove musl Claude binary on Linux glibc runner
The musl variant is probed first by @anthropic-ai/claude-agent-sdk and
fails to exec on glibc (ENOENT from missing ELF interpreter), which
caused the Test Claude session tests to time out.
The Electron main process intermittently crashes during startup on the
`[pango] FcInit` thread with a NULL pointer dereference in expat's XML
string processing, triggered by fontconfig parsing `<include>` directives
in fonts.conf via `XML_ExternalEntityParserCreate`.
Set FONTCONFIG_FILE to a minimal config based on upstream
fontconfig 2.15.0 fonts.conf.in with `<include>` directives removed and
generic family aliases inlined. This avoids the external entity parser
codepath entirely. A version check will fail the build once the runner
ships expat >= 2.7.5, prompting removal of the workaround.
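The version gate can be sketched as a numeric, component-wise comparison. This is not the workflow's actual check: the function names are hypothetical, and the sketch assumes the installed version is reported as a string starting with major.minor.patch, with any distro suffix ignored.

```typescript
/** Parses the leading "major.minor.patch" of a version string; undefined if absent. */
function parseVersion(text: string): number[] | undefined {
	const match = /^(\d+)\.(\d+)\.(\d+)/.exec(text.trim());
	return match ? [Number(match[1]), Number(match[2]), Number(match[3])] : undefined;
}

/** True when `installed` is at least `minimum`, comparing components numerically. */
function isAtLeast(installed: string, minimum: string): boolean {
	const a = parseVersion(installed);
	const b = parseVersion(minimum);
	if (!a || !b) {
		throw new Error(`Unrecognized version: ${!a ? installed : minimum}`);
	}
	for (let i = 0; i < 3; i++) {
		if (a[i] !== b[i]) {
			return a[i] > b[i];
		}
	}
	return true; // all three components are equal
}

/** The workaround is only needed while expat is older than 2.7.5 (threshold from the text above). */
function workaroundStillNeeded(installedExpat: string): boolean {
	return !isAtLeast(installedExpat, '2.7.5');
}
```

Comparing components as numbers rather than strings matters once a component reaches two digits: a lexical comparison would rank 2.10.0 below 2.7.5.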
* Bump version to 1.117.0
* npm i
* wait to do engine version bump
* Revert "wait to do engine version bump"
This reverts commit 9db1c0feb6.
* Add Copilot extension tests to Linux/Windows Electron integration test runs
* Remove failing step that we moved to the main build
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* ci: debug fontconfig crash during app launch
* chore: update commands
* ci: bump to ubuntu-24.04 to fix fontconfig crash
Fixes an intermittent SIGSEGV on the [pango] FcInit thread during
Electron startup in CI integration tests.
Root cause: Chromium's InitializeGlobalFontConfigAsync() posts FcInit()
to a thread pool worker (crbug.com/404311), while pango's pangoft2
backend independently calls FcInit() from its own thread during GTK
initialization. fontconfig 2.13.1 (shipped in ubuntu-22.04) lacks
thread-safe initialization: concurrent first-time FcInit() calls
race and both enter FcConfigParse(), corrupting shared global state.
This causes expat (called by fontconfig to parse fonts.conf) to
dereference a NULL pointer.
ubuntu-24.04 ships fontconfig 2.15.0, which includes the thread-safe
initialization from 2.14+.
* ci: enable namespace sandbox
- Add bubblewrap and socat to Linux CI apt-get install
- Make sandbox test assertions platform-aware (macFileSystem vs linuxFileSystem)
- Make /etc/shells test accept both macOS and Linux first-line format
- Broaden wrapped prompt fragment regex to handle path chars (ts/testWorkspace$)
- Fix continuation pattern to match user@host:path wrapped lines
- Apply stripCommandEchoAndPrompt to getOutput() in BasicExecuteStrategy
(basic shell integration lacks reliable 133;C markers so getOutput()
can include command echo)
- Keep RichExecuteStrategy getOutput() unstripped (rich integration
has reliable markers)
* Introduce compilation error
* Engineering - limit the tasks that we run
* Limit available memory to simulate an OOM
* Try to update the task
* Remove the use of npm-run-all
* Fix script
* Another try
* Try npm-run-all2
* Restore tasks, keep npm-run-all2
* Switch from npm-run-all to npm-run-all2
* Revert changes that were used for testing
* Run our build scripts directly as TypeScript #277567
Follow up on #276864
For #277526
* Remove a few more ts-node references
* Fix linux and script reference
* Remove `_build-script` ref
* Fix script missing closing quote
* Use type-only import
* Fix export
* Make sure to run copy-policy-dto
* Make sure we run the copy-policy-dto script
* Enable `verbatimModuleSyntax`
* Pipelines fixes
* Try adding explicit ext to path
* Fix bad edit
* Revert extra `--`
---------
Co-authored-by: João Moreno <joaomoreno@users.noreply.github.com>