Initial cut of WebContentExtractorService & a fetch tool (#243609)

* Initial cut of WebContentExtractorService & a fetch tool

I will likely move the tool into Copilot so it can take advantage of prompt-tsx and embeddings indexing... but this is the first cut to play around with it.

This leverages Chrome DevTools Protocol's `Accessibility.getFullAXTree` command in order to get a representation of a page while marking what is useful on the page and what is not. We take the output of the command and turn it into a string that the caller can easily consume. This transformer will get more sophisticated over time to make sure we keep content that's important, and ditch content that is not.
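
To make the idea concrete, here is a minimal sketch of flattening `Accessibility.getFullAXTree` output into a string. It is not the transformer from this commit. The `AXNode` and `AXValue` interfaces only mirror the subset of the CDP node shape used here, and `axTreeToText`, `stringValue`, and the choice to keep only headings and static text are assumptions made for illustration.

```typescript
// Subset of the CDP Accessibility.AXNode shape that this sketch relies on.
interface AXValue {
	type: string;
	value?: unknown;
}

interface AXNode {
	nodeId: string;
	ignored: boolean;
	role?: AXValue;
	name?: AXValue;
	parentId?: string;
	childIds?: string[];
}

// Hypothetical helper: read an AXValue as a trimmed string, or '' if it is not one.
function stringValue(v?: AXValue): string {
	const value = v?.value;
	return typeof value === 'string' ? value.trim() : '';
}

// Hypothetical transformer: walk the tree depth-first, skip ignored nodes,
// and keep only headings and static text.
function axTreeToText(nodes: AXNode[]): string {
	const byId = new Map<string, AXNode>();
	for (const node of nodes) {
		byId.set(node.nodeId, node);
	}

	// Assumption: the root is the first node without a parent.
	const root = nodes.find(node => node.parentId === undefined);
	if (!root) {
		return '';
	}

	const lines: string[] = [];
	const visit = (node: AXNode): void => {
		const role = stringValue(node.role);
		const name = stringValue(node.name);
		if (!node.ignored && name) {
			if (role === 'heading') {
				// The heading's name already carries its text, so do not descend.
				lines.push(`# ${name}`);
				return;
			}
			if (role === 'StaticText') {
				lines.push(name);
				return;
			}
		}
		// Ignored or structural nodes contribute nothing themselves,
		// but their children may still be useful.
		for (const id of node.childIds ?? []) {
			const child = byId.get(id);
			if (child) {
				visit(child);
			}
		}
	};

	visit(root);
	return lines.join('\n');
}
```

A caller would pass the `nodes` array from the command's response. Descending through ignored containers matters because the accessibility tree often marks wrappers as ignored while their text children are not.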

On the tool side of things... this implements a confirmation flow that verifies whether each requested URL belongs to a trusted domain. We are _rendering_ these URLs (albeit sandboxed, without JS), so we want to make sure they're safe. If a URL is not trusted, the user is asked to confirm.
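
The trust check behind that confirmation flow could be sketched as below. This is not the code from this commit: `partitionByTrust`, `TrustPartition`, and the rule "exact host or subdomain of a trusted domain" are hypothetical stand-ins, and the real trusted-domain matching in VS Code may differ.

```typescript
// Hypothetical result shape: which URLs can be fetched right away,
// which need user confirmation, and which could not be parsed at all.
interface TrustPartition {
	trusted: string[];
	needsConfirmation: string[];
	invalid: string[];
}

// Hypothetical helper: a host is trusted if it equals a trusted domain
// or is a subdomain of one (assumed rule, for illustration only).
function isTrustedHost(hostname: string, trustedDomains: string[]): boolean {
	const host = hostname.toLowerCase();
	return trustedDomains.some(domain => {
		const d = domain.toLowerCase();
		return host === d || host.endsWith('.' + d);
	});
}

function partitionByTrust(urls: string[], trustedDomains: string[]): TrustPartition {
	const result: TrustPartition = { trusted: [], needsConfirmation: [], invalid: [] };
	for (const raw of urls) {
		let parsed: URL;
		try {
			parsed = new URL(raw);
		} catch {
			// The URL constructor throws on input it cannot parse.
			result.invalid.push(raw);
			continue;
		}
		if (isTrustedHost(parsed.hostname, trustedDomains)) {
			result.trusted.push(raw);
		} else {
			result.needsConfirmation.push(raw);
		}
	}
	return result;
}
```

Only the `needsConfirmation` bucket would be shown to the user. Matching on the parsed hostname rather than on the raw string avoids treating `https://notexample.com` as if it were under `example.com`.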

* fix naming
Tyler James Leonhardt
2025-03-14 16:25:06 -07:00
committed by GitHub
parent e0aa795754
commit f99a4603d6
9 changed files with 497 additions and 1 deletions


@@ -120,6 +120,8 @@ import { normalizeNFC } from '../../base/common/normalization.js';
import { ICSSDevelopmentService, CSSDevelopmentService } from '../../platform/cssDev/node/cssDevService.js';
import { INativeMcpDiscoveryHelperService, NativeMcpDiscoveryHelperChannelName } from '../../platform/mcp/common/nativeMcpDiscoveryHelper.js';
import { NativeMcpDiscoveryHelperService } from '../../platform/mcp/node/nativeMcpDiscoveryHelperService.js';
import { IWebContentExtractorService } from '../../platform/webContentExtractor/common/webContentExtractor.js';
import { NativeWebContentExtractorService } from '../../platform/webContentExtractor/electron-main/webContentExtractorService.js';
/**
* The main VS Code application. There will only ever be one instance,
@@ -1048,6 +1050,9 @@ export class CodeApplication extends Disposable {
// Native Host
services.set(INativeHostMainService, new SyncDescriptor(NativeHostMainService, undefined, false /* proxied to other processes */));
// Web Contents Extractor
services.set(IWebContentExtractorService, new SyncDescriptor(NativeWebContentExtractorService, undefined, false /* proxied to other processes */));
// Webview Manager
services.set(IWebviewManagerService, new SyncDescriptor(WebviewMainService));
@@ -1195,6 +1200,10 @@ export class CodeApplication extends Disposable {
mainProcessElectronServer.registerChannel('nativeHost', nativeHostChannel);
sharedProcessClient.then(client => client.registerChannel('nativeHost', nativeHostChannel));
// Web Content Extractor
const webContentExtractorChannel = ProxyChannel.fromService(accessor.get(IWebContentExtractorService), disposables);
mainProcessElectronServer.registerChannel('webContentExtractor', webContentExtractorChannel);
// Workspaces
const workspacesChannel = ProxyChannel.fromService(accessor.get(IWorkspacesService), disposables);
mainProcessElectronServer.registerChannel('workspaces', workspacesChannel);