vscode/extensions/copilot/test
Ulugbek Abdullaev bca6df1d1d Fix xtab-275 edit-window spillover in nes-datagen (#313483)
* fix(nes-datagen): discard xtab-275 oracle edits outside edit window

The NES model can only edit lines inside the prompt's <|code_to_edit|>
window [K, N). formatAsEditWindowOnly was expanding the window to cover
stray oracle edits, so the assistant text spilled out of the window and
duplicated surrounding context when applied.

Instead, keep edits in order and drop the first edit that isn't fully
contained in [K, N) along with every later edit (later offsets assume
earlier edits were applied). Add unit tests covering both the spillover
repro and the all-in-window control case.

* refactor(nes-datagen): extract filterEditsInsideEditWindow helper

Pull the line-containment filter out of formatAsEditWindowOnly into a
reusable helper, plus a small getEditLineRange utility to dedupe the
offset->line conversion. No behavior change.
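
A minimal sketch of the offset->line conversion and the containment check
described above. The OffsetEdit and LineRange shapes, the lineOfOffset helper,
and the isInsideEditWindow name are assumptions for illustration; only
getEditLineRange is named in this change, and its real signature may differ.

```typescript
// Hypothetical edit shape: character offsets into the original document.
interface OffsetEdit {
  start: number;        // inclusive
  endExclusive: number; // exclusive
  newText: string;
}

// 0-based line range, end exclusive, matching the [K, N) window convention.
interface LineRange {
  startLine: number;
  endLineExclusive: number;
}

// Count the '\n' characters before `offset`; that is the 0-based line index.
function lineOfOffset(doc: string, offset: number): number {
  let line = 0;
  for (let i = 0; i < offset && i < doc.length; i++) {
    if (doc.charCodeAt(i) === 10) {
      line++;
    }
  }
  return line;
}

function getEditLineRange(doc: string, edit: OffsetEdit): LineRange {
  const startLine = lineOfOffset(doc, edit.start);
  // For a non-empty range the last touched character sits at endExclusive - 1,
  // so deleting "L1\n" touches only line 1, not line 2.
  const lastOffset = edit.endExclusive > edit.start ? edit.endExclusive - 1 : edit.start;
  return { startLine, endLineExclusive: lineOfOffset(doc, lastOffset) + 1 };
}

// An edit is kept only if its whole line range lies inside [k, n).
function isInsideEditWindow(range: LineRange, k: number, n: number): boolean {
  return range.startLine >= k && range.endLineExclusive <= n;
}
```

With the window [1, 3) over a four-line document, an edit that replaces text on
line 1 passes the check, while one that runs from line 1 into line 3 straddles
the boundary and fails it.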

* refactor(nes-datagen): route xtab-275 dropped-edit warnings through pipeline logger

Replace the console.warn in formatAsEditWindowOnly with a structured
return shape ({ assistant, droppedCount }) and a ResponseLogger threaded
through generateResponse / generateAllResponses. The pipeline now logs
dropped edits via its existing log callback (so warnings are visible to
dataset curators and captured by the e2e test logs), and surfaces the
count on IGeneratedResponse.droppedEditCount.
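
A rough sketch of how the structured return value and the logger could fit
together. The ResponseLogger signature, the GeneratedResponseSketch interface,
the toGeneratedResponse helper, and the message wording are all assumptions;
the source names only the { assistant, droppedCount } shape, ResponseLogger,
and IGeneratedResponse.droppedEditCount.

```typescript
// Assumed signature; the source names ResponseLogger but does not show its type.
type ResponseLogger = (message: string) => void;

// Mirrors the { assistant, droppedCount } shape described above.
interface EditWindowResult {
  assistant: string;
  droppedCount: number;
}

// Hypothetical stand-in for IGeneratedResponse; only droppedEditCount is from the source.
interface GeneratedResponseSketch {
  assistant: string;
  droppedEditCount: number;
}

// The formatter stays free of side effects; the caller decides where warnings go.
function toGeneratedResponse(result: EditWindowResult, log: ResponseLogger): GeneratedResponseSketch {
  if (result.droppedCount > 0) {
    log(`dropped ${result.droppedCount} oracle edit(s) outside the edit window`);
  }
  return { assistant: result.assistant, droppedEditCount: result.droppedCount };
}
```

Returning the count instead of calling console.warn lets a test pass in an
array-backed logger and assert on exactly what was reported.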

* fix(nes-datagen): correct netLineChange math for full-line deletions

splitLines("L6\n") returns ['L6', ''] (length 2), so deleting a single
terminated line was counted as -2 lines instead of -1. Compute the delta
by counting newlines in the old segment vs. the new text, which gives
the correct line-count delta for any combination of insertions,
deletions, or replacements without the trailing-empty quirk.
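
The newline-counting delta can be sketched as follows. The function names and
parameters are illustrative, not the actual nes-datagen code, and the built-in
String.prototype.split stands in for splitLines to show the same
trailing-empty quirk.

```typescript
// Count '\n' characters in a string.
function countNewlines(s: string): number {
  let n = 0;
  for (let i = 0; i < s.length; i++) {
    if (s.charCodeAt(i) === 10) {
      n++;
    }
  }
  return n;
}

// Net change in line count when `oldSegment` is replaced by `newText`.
function netLineChange(oldSegment: string, newText: string): number {
  return countNewlines(newText) - countNewlines(oldSegment);
}

// The quirk: splitting a terminated line yields a trailing empty element,
// so an array length overstates the number of lines by one.
const parts = "L6\n".split("\n"); // ['L6', '']

const deleteOneLine = netLineChange("L6\n", "");    // -1
const insertTwoLines = netLineChange("", "a\nb\n"); // 2
const replaceInPlace = netLineChange("L6\n", "X\n"); // 0
```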

* test(nes-datagen): cover line-changing and boundary-straddling oracle edits

Add cases for in-window inserts that grow the slice, in-window deletes
that shrink it, edits that straddle the window boundary (must be
discarded), and the all-edits-outside fallback (assistant equals the
original window slice with droppedCount equal to all edits).

* fix(nes-datagen): filter oracle edits independently against edit window

Oracle edits come from StringEdit.compose().replacements upstream
(processor.ts), and applyEditsToContent applies them all to the original
doc in descending offset order, so each edit's offsets are independent
of the others. The earlier drop-and-truncate rule (which discarded every
edit after the first out-of-window one) was over-conservative and threw
away in-window edits that were perfectly applicable.

Filter each edit on its own merits and update the test that relied on
the truncate behavior to assert independent filtering instead.
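
A small sketch of why the edits are independent and what per-edit filtering
looks like. The OffsetEdit shape and both function names are hypothetical, and
the window is passed in as a predicate so the example does not depend on how
line ranges are computed; the real applyEditsToContent may differ in detail.

```typescript
// Hypothetical edit shape: offsets refer to the ORIGINAL document.
interface OffsetEdit {
  start: number;        // inclusive
  endExclusive: number; // exclusive
  newText: string;
}

// Applying from the highest offset down never shifts the offsets of edits
// that are still pending, so every edit can be expressed against the original.
function applyEditsDescending(doc: string, edits: readonly OffsetEdit[]): string {
  const sorted = edits.slice().sort((a, b) => b.start - a.start);
  let out = doc;
  for (const e of sorted) {
    out = out.slice(0, e.start) + e.newText + out.slice(e.endExclusive);
  }
  return out;
}

// Judge each edit on its own; an out-of-window edit does not affect later ones.
function filterIndependently(
  edits: readonly OffsetEdit[],
  isInsideWindow: (e: OffsetEdit) => boolean
): { kept: OffsetEdit[]; droppedCount: number } {
  const kept = edits.filter(isInsideWindow);
  return { kept, droppedCount: edits.length - kept.length };
}

// Lines 1 and 2 of this document occupy offsets [4, 12).
const original = "aaa\nbbb\nccc\nddd\n";
const oracleEdits: OffsetEdit[] = [
  { start: 0, endExclusive: 3, newText: "AAA" },   // line 0: outside the window
  { start: 4, endExclusive: 7, newText: "BBB!" },  // line 1: inside
  { start: 8, endExclusive: 11, newText: "C" },    // line 2: inside
];
const filtered = filterIndependently(oracleEdits, e => e.start >= 4 && e.endExclusive <= 12);
const applied = applyEditsDescending(original, filtered.kept);
```

Under the earlier truncate rule the first edit would have caused all three to
be dropped; here only that one is discarded and both in-window edits survive.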

* feat(nes-datagen): surface dropped-edit count in pipeline summary

Sum droppedEditCount across all generated responses and append it to
the [4/5] log line when nonzero, so dataset curators running the
pipeline can see at a glance how many oracle edits were discarded as
out-of-window. Silent omission was the original bug; making it visible
in the summary closes the feedback loop.
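
The summary aggregation might look roughly like this. The interface, the
function name, and the exact wording of the log line are made up for
illustration; only the [4/5] prefix and the droppedEditCount field come from
the description above.

```typescript
// Hypothetical stand-in for IGeneratedResponse; only droppedEditCount is from the source.
interface ResponseWithDropCount {
  droppedEditCount?: number;
}

// Build the step summary, mentioning dropped edits only when there are any.
function formatStepSummary(responses: readonly ResponseWithDropCount[]): string {
  const totalDropped = responses.reduce((sum, r) => sum + (r.droppedEditCount ?? 0), 0);
  const base = `[4/5] Generated ${responses.length} responses`; // wording is an assumption
  return totalDropped > 0
    ? `${base} (${totalDropped} oracle edits dropped as out-of-window)`
    : base;
}
```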
2026-05-02 14:59:18 +05:00