Agent CDP Relay

Control your real Chrome browser over HTTP. Navigate pages, click elements, fill forms, take screenshots, run JavaScript — all via simple POST/GET requests to localhost:9223.

Uses your actual Chrome profile with all your cookies, sessions, and logins. No --remote-debugging-port needed.

How it works

Your agent (any language, any framework)
  |
  |  HTTP requests to localhost:9223
  v
browser.py (:9223)       HTTP API gateway — auto-starts everything
  |
  |  WebSocket
  v
relay.py (:9222)          CDP message router
  |
  |  WebSocket
  v
Chrome extension          Translates CDP to chrome.debugger API
  |
  |  chrome.debugger
  v
Chrome                    Your real browser, real profile, real cookies

Since Chrome 136, --remote-debugging-port is ignored when Chrome runs on its default user data directory, which is where your real profile lives. This project works around that with a Chrome extension that calls chrome.debugger internally and is bridged to a standard CDP endpoint through a local relay server. browser.py wraps the whole stack in an HTTP API, so any agent that can make HTTP calls can drive the browser.

Setup

Requirements

Python 3.8+
aiohttp and requests (pip install aiohttp requests)
Chrome (Windows, macOS, or Linux)

Install

git clone https://github.com/autonet-code/web.git
cd web
pip install aiohttp requests

Load the Chrome extension

Open chrome://extensions in Chrome
Enable Developer mode (toggle in top right)
Click Load unpacked and select the extension/ folder
The extension popup should show "Agent CDP Relay"

Start the stack

python browser.py

This auto-starts the relay server and connects to Chrome. Output:

Starting relay server...
Relay server ready on port 9222
Extension connected
Browser API ready on http://127.0.0.1:9223

If Chrome is already running with the extension loaded, it connects directly. If not, it launches Chrome with the extension.
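
Startup takes a moment, so a script that launches browser.py may want to poll /status before sending commands. The following is a minimal sketch, assuming the `connected` field of the /status response turns true once the extension has joined the relay. `http_get_json` and `wait_until_ready` are hypothetical helper names, not part of the project, and the standard library is used here only to keep the sketch dependency-free.

```python
import json
import time
import urllib.request

API = "http://127.0.0.1:9223"


def http_get_json(path):
    """GET a path on the browser API and decode the JSON body."""
    with urllib.request.urlopen(API + path, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_ready(get_status=lambda: http_get_json("/status"),
                     timeout=30.0, interval=0.5):
    """Poll /status until it reports connected, or raise TimeoutError.

    Assumption: "connected" is truthy once the stack is usable.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            status = get_status()
            if status.get("connected"):
                return status
        except OSError:
            # Server not listening yet (e.g. connection refused); keep polling.
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError("browser API did not report connected in time")
        time.sleep(interval)
```

Calling `wait_until_ready()` with no arguments polls the real endpoint; passing a custom `get_status` callable makes the loop easy to exercise without a running browser.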

Quick start

import requests

API = "http://127.0.0.1:9223"

# Open a page (creates a new tab and auto-attaches)
requests.post(f"{API}/navigate", json={"url": "https://example.com"})

# Read the title
requests.get(f"{API}/title").json()["value"]  # "Example Domain"

# Click a link
requests.post(f"{API}/click", json={"selector": "a"})

# Type into an input (works with React, Vue, etc.)
requests.post(f"{API}/type", json={"selector": "input", "text": "hello"})

# Run any JavaScript
requests.post(f"{API}/eval", json={"js": "document.title"})

# Take a screenshot
requests.get(f"{API}/screenshot/file").json()["path"]  # /tmp/screenshot.png

# Detach when done (removes debugger, keeps tab open)
requests.post(f"{API}/detach")

API reference

Connection & tabs

Endpoint	Method	Body	Returns
`/status`	GET	—	`{connected, session_id, target_id}`
`/tabs`	GET	—	`[{id, title, url}, ...]`
`/navigate`	POST	`{url, new_tab?}`	`{target_id, session_id, url}`
`/attach`	POST	`{target_id}`	`{session_id, target_id}`
`/detach`	POST	—	`{detached, session_id}`
`/close`	POST	`{target_id?}`	`{success}`

Page interaction

Endpoint	Method	Body	Returns
`/eval`	POST	`{js}`	`{value, type}`
`/click`	POST	`{selector}`	`{value: "clicked"}`
`/type`	POST	`{selector, text, clear?, submit?}`	`{value: "typed", method}`
`/wait`	POST	`{selector, timeout?, visible?}`	`{found, elapsed_ms, tag?, text?}`
`/text`	GET	—	`{value: "page text..."}`
`/url`	GET	—	`{value: "https://..."}`
`/title`	GET	—	`{value: "Page Title"}`
`/screenshot`	GET	`?format=file`	`{path}` or `{data_length, data}`
`/screenshot/file`	GET	—	`{path}`
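
When /screenshot is called without `?format=file`, the table above says it returns `{data_length, data}` rather than a path. A small sketch for writing that payload to disk, assuming `data` is a base64 string (the encoding CDP's Page.captureScreenshot uses; the README does not confirm it). `save_screenshot` is a hypothetical helper name.

```python
import base64
from pathlib import Path


def save_screenshot(payload, dest):
    """Decode a /screenshot response and write the image bytes to dest.

    Assumption: payload["data"] is base64-encoded image data.
    Returns the number of bytes written.
    """
    raw = base64.b64decode(payload["data"])
    Path(dest).write_bytes(raw)
    return len(raw)
```

With requests this could be used as `save_screenshot(requests.get(f"{API}/screenshot").json(), "page.png")`.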

DOM observation

Endpoint	Method	Returns
`/observe`	POST	`{value: "observer_started"}`
`/events`	GET	`{active, count, events: [...]}`

Injects a MutationObserver that tracks dialogs, modals, toasts, forms, iframes, images, canvas, video/audio, and other dynamic DOM changes.

Raw CDP

Endpoint	Method	Body
`/cdp`	POST	`{method, params, session?}`

Full Chrome DevTools Protocol passthrough for anything the convenience endpoints don't cover.
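
As an illustration of the passthrough, here is a sketch that builds a /cdp request body for `Page.reload`, a standard CDP method. It assumes that omitting `session` targets the currently attached tab (the `?` in the table suggests it is optional); the shape of the response is not documented here, so it is not shown. `cdp_request` is a hypothetical helper name.

```python
def cdp_request(method, params=None, session=None):
    """Build the JSON body for POST /cdp.

    Assumption: leaving out "session" uses the attached session.
    """
    if "." not in method:
        raise ValueError("CDP methods are namespaced, e.g. 'Page.reload'")
    body = {"method": method, "params": params or {}}
    if session is not None:
        body["session"] = session
    return body


# Reload the attached page, bypassing the cache.
reload_body = cdp_request("Page.reload", {"ignoreCache": True})
```

The body can then be sent with `requests.post(f"{API}/cdp", json=reload_body)`.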

Required workflow

You must attach to a tab before interacting with it. Without an attached session, interaction endpoints return {"error": "No session attached"}.

1. GET /tabs                          → find an existing tab
   or POST /navigate {"url": "..."}   → open a new tab (auto-attaches)
2. POST /attach {"target_id": "..."}  → attach to an existing tab (skip after /navigate)
3. POST /eval, /click, /type, etc.    → interact with the page
4. POST /detach                       → release the tab when done

/navigate auto-attaches to the new tab. Existing tabs need an explicit /attach.

Active tabs show a red dot (🔴) in their title so the user knows which tab is agent-controlled. The marker is automatically removed on /detach and cleaned up on startup if a previous session crashed.
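
The workflow above can be wrapped so that the tab is always released, even when an interaction fails. This is a sketch under stated assumptions: `post(path, body)` is any callable you supply that sends a POST and returns the decoded JSON, and error responses carry an `error` key as in the "No session attached" example. `BrowserError`, `check`, and `tab_session` are hypothetical names, not part of the project.

```python
from contextlib import contextmanager


class BrowserError(RuntimeError):
    """Raised when the API answers with an {"error": ...} body."""


def check(result):
    """Return result unchanged, or raise if it carries an error field."""
    if isinstance(result, dict) and "error" in result:
        raise BrowserError(result["error"])
    return result


@contextmanager
def tab_session(post, url):
    """Open url in a new tab (auto-attaches) and always detach afterwards.

    post(path, body) must send a POST request and return the decoded JSON,
    for example: lambda p, b: requests.post(API + p, json=b).json()
    """
    info = check(post("/navigate", {"url": url}))
    try:
        yield info
    finally:
        # Step 4 of the workflow: release the tab even if an error occurred.
        post("/detach", None)
```

Inside `with tab_session(post, "https://example.com") as tab:` the usual calls apply, for instance `check(post("/click", {"selector": "a"}))`.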

`/type` — framework-compatible text input

/type works with standard inputs, React/Vue/Angular controlled components, and contenteditable elements (ChatGPT, Notion, etc.):

Standard inputs/textareas: Uses nativeInputValueSetter to bypass React's internal value tracking, then fires input and change events
Contenteditable: Uses document.execCommand('insertText') which triggers all framework event handlers
clear (default true): Clears existing value before typing
submit (default false): After typing, clicks a submit button or simulates Enter
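
To make the native-setter technique concrete, the sketch below generates JavaScript that could be sent to /eval to set a controlled input by hand. It is illustrative only and is not the project's actual /type implementation, which may differ. `native_set_js` is a hypothetical helper name, and the `'not_found'` and `'typed'` return strings are choices made for this example.

```python
import json


def native_set_js(selector, text):
    """Return JavaScript that sets an input's value via the native setter.

    json.dumps is used to embed the selector and text as safely quoted
    JavaScript string literals.
    """
    return (
        "(function() {"
        f"  var el = document.querySelector({json.dumps(selector)});"
        "  if (!el) return 'not_found';"
        "  var proto = el instanceof HTMLTextAreaElement"
        "    ? HTMLTextAreaElement.prototype : HTMLInputElement.prototype;"
        # The prototype's setter bypasses the per-element override that
        # frameworks such as React install to track the value.
        "  var setter = Object.getOwnPropertyDescriptor(proto, 'value').set;"
        f"  setter.call(el, {json.dumps(text)});"
        "  el.dispatchEvent(new Event('input', {bubbles: true}));"
        "  el.dispatchEvent(new Event('change', {bubbles: true}));"
        "  return 'typed';"
        "})()"
    )
```

It could be used as `requests.post(f"{API}/eval", json={"js": native_set_js("#email", "user@example.com")})`. In normal use /type already covers this case; the sketch is mainly useful when debugging an input that does not react as expected.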

`/wait` — wait for elements

# Wait up to 5 seconds for an element to appear
r = requests.post(f"{API}/wait", json={"selector": ".results", "timeout": 5000})
r.json()  # {"found": True, "elapsed_ms": 1200, "tag": "DIV", "text": "..."}

# Wait for the element to be visible (not just present in the DOM)
requests.post(f"{API}/wait", json={"selector": ".modal", "visible": True, "timeout": 3000})

Tips

Use `textContent` instead of `innerText` for background tabs

Chrome skips layout computation for non-visible tabs, so innerText, which depends on layout, can return an empty string there. Use textContent, which does not depend on layout:

r = requests.post(f"{API}/eval", json={"js": "document.body.textContent.substring(0, 5000)"})

GET /text uses innerText internally — prefer /eval with textContent when the tab might be in the background.

Use existing login sessions

The browser runs your real Chrome profile. If you're logged into Gmail, GitHub, etc., those sessions are available:

requests.post(f"{API}/navigate", json={"url": "https://github.com/notifications"})

Extract structured data

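import json

# The function runs in the page and returns a JSON string; decode it in Python.
r = requests.post(f"{API}/eval", json={"js": """(function() {
    var items = [];
    document.querySelectorAll('.item').forEach(function(el) {
        var heading = el.querySelector('h3');
        var link = el.querySelector('a');
        if (!heading || !link) return;  // skip items missing a title or link
        items.push({
            title: heading.textContent.trim(),
            url: link.href
        });
    });
    return JSON.stringify(items);
})()"""})
data = json.loads(r.json()["value"])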

App models

The apps/ directory contains DOM models for specific websites — CSS selectors, interaction recipes, and data extraction patterns that have been verified against live pages.

apps/
  chatgpt.com.json       Selectors and extraction for ChatGPT
  mail.google.com.json   Selectors and extraction for Gmail

These are living documents. Agents can load them before interacting with a site to avoid re-discovering selectors, and update them when sites change. See apps/README.md for the schema.

File structure

browser.py              HTTP API gateway (port 9223) — the main entry point
relay.py                CDP relay server (port 9222) — started by browser.py
chrome_cdp.py           Standalone relay launcher (for custom integrations)
extension/
  manifest.json         Chrome MV3 extension manifest
  service-worker.js     CDP command handler via chrome.debugger
  keep-alive.js         Prevents MV3 service worker suspension
  popup.html/js         Extension popup showing connection status
apps/
  chatgpt.com.json      ChatGPT DOM model
  mail.google.com.json  Gmail DOM model
  README.md             App model schema and conventions

License

MIT
