Control your real Chrome browser over HTTP — agent-friendly API for the Chrome DevTools Protocol

Agent CDP Relay

Control your real Chrome browser over HTTP. Navigate pages, click elements, fill forms, take screenshots, run JavaScript — all via simple POST/GET requests to localhost:9223.

Uses your actual Chrome profile with all your cookies, sessions, and logins. No --remote-debugging-port needed.

How it works

Your agent (any language, any framework)
  |
  |  HTTP requests to localhost:9223
  v
browser.py (:9223)       HTTP API gateway — auto-starts everything
  |
  |  WebSocket
  v
relay.py (:9222)          CDP message router
  |
  |  WebSocket
  v
Chrome extension          Translates CDP to chrome.debugger API
  |
  |  chrome.debugger
  v
Chrome                    Your real browser, real profile, real cookies

Recent Chrome versions (136 and later) ignore --remote-debugging-port unless it is combined with a non-default --user-data-dir, so the flag cannot expose your everyday profile. This project works around that with a Chrome extension that calls chrome.debugger internally and is bridged to a standard CDP endpoint through a local relay server. browser.py wraps the stack in a plain HTTP API, so any agent that can make HTTP calls can drive the browser.

Setup

Requirements

  • Python 3.8+
  • aiohttp and requests (pip install aiohttp requests)
  • Chrome (Windows, macOS, or Linux)

Install

git clone https://github.com/autonet-code/web.git
cd web
pip install aiohttp requests

Load the Chrome extension

  1. Open chrome://extensions in Chrome
  2. Enable Developer mode (toggle in top right)
  3. Click Load unpacked and select the extension/ folder
  4. The extension popup should show "Agent CDP Relay"

Start the stack

python browser.py

This auto-starts the relay server and connects to Chrome. Output:

Starting relay server...
Relay server ready on port 9222
Extension connected
Browser API ready on http://127.0.0.1:9223

If Chrome is already running with the extension loaded, browser.py connects to it directly. Otherwise, it launches Chrome with the extension.
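A script that starts browser.py itself may want to wait until the API answers before sending commands. The following is a minimal sketch, assuming GET /status returns JSON with a `connected` field as listed in the API reference. The helper names `fetch_status` and `wait_until_ready` are hypothetical and not part of the project.

```python
import json
import time
import urllib.error
import urllib.request

API = "http://127.0.0.1:9223"


def fetch_status(api=API):
    """GET /status and return the parsed JSON, or None if the API is not reachable yet."""
    try:
        with urllib.request.urlopen(f"{api}/status", timeout=2) as resp:
            return json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        return None


def wait_until_ready(fetch=fetch_status, timeout=30.0, interval=0.5, sleep=time.sleep):
    """Poll until the status reports connected. Returns True on success, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch()
        # Assumption: "connected" is truthy once the extension is attached to the relay.
        if status and status.get("connected"):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)
```

Calling `wait_until_ready()` right after spawning `python browser.py` gives the relay and extension time to connect before the first /navigate.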

Quick start

import requests

API = "http://127.0.0.1:9223"

# Open a page (creates a new tab and auto-attaches)
requests.post(f"{API}/navigate", json={"url": "https://example.com"})

# Read the title
requests.get(f"{API}/title").json()["value"]  # "Example Domain"

# Click a link
requests.post(f"{API}/click", json={"selector": "a"})

# Type into an input (works with React, Vue, etc.)
requests.post(f"{API}/type", json={"selector": "input", "text": "hello"})

# Run any JavaScript
requests.post(f"{API}/eval", json={"js": "document.title"})

# Take a screenshot
requests.get(f"{API}/screenshot/file").json()["path"]  # /tmp/screenshot.png

# Detach when done (removes debugger, keeps tab open)
requests.post(f"{API}/detach")

API reference

Connection & tabs

Endpoint   Method  Body             Returns
/status    GET     -                {connected, session_id, target_id}
/tabs      GET     -                [{id, title, url}, ...]
/navigate  POST    {url, new_tab?}  {target_id, session_id, url}
/attach    POST    {target_id}      {session_id, target_id}
/detach    POST    -                {detached, session_id}
/close     POST    {target_id?}     {success}
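To attach to a tab that is already open, an agent first has to pick it out of the /tabs list. A small sketch of that lookup follows. It assumes the `id` field of a /tabs entry is the value /attach expects as `target_id`, and the helper names `find_tab` and `attach_body` are hypothetical.

```python
from urllib.parse import urlparse


def find_tab(tabs, host):
    """Return the first /tabs entry whose URL hostname equals `host`, or None."""
    for tab in tabs:
        if urlparse(tab.get("url", "")).hostname == host:
            return tab
    return None


def attach_body(tab):
    """Build the JSON body for POST /attach from a /tabs entry.

    Assumption: the tab's "id" is the target_id that /attach expects.
    """
    return {"target_id": tab["id"]}
```

With requests, this could be used as `tab = find_tab(requests.get(f"{API}/tabs").json(), "mail.google.com")` followed by `requests.post(f"{API}/attach", json=attach_body(tab))`.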

Page interaction

Endpoint          Method  Body                               Returns
/eval             POST    {js}                               {value, type}
/click            POST    {selector}                         {value: "clicked"}
/type             POST    {selector, text, clear?, submit?}  {value: "typed", method}
/wait             POST    {selector, timeout?, visible?}     {found, elapsed_ms, tag?, text?}
/text             GET     -                                  {value: "page text..."}
/url              GET     -                                  {value: "https://..."}
/title            GET     -                                  {value: "Page Title"}
/screenshot       GET     ?format=file (query string)        {path} or {data_length, data}
/screenshot/file  GET     -                                  {path}

DOM observation

Endpoint  Method  Returns
/observe  POST    {value: "observer_started"}
/events   GET     {active, count, events: [...]}

Injects a MutationObserver that tracks dialogs, modals, toasts, forms, iframes, images, canvas, video/audio, and other dynamic DOM changes.
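A typical use is to start the observer, trigger an action, then poll /events until something shows up. The sketch below is illustrative only. The shape of individual events is not documented here, so they are treated as opaque values, and whether /events clears its buffer between calls is unknown. `wait_for_events` is a hypothetical helper.

```python
import time


def wait_for_events(fetch_events, min_count=1, timeout=10.0, interval=0.5, sleep=time.sleep):
    """Poll a /events fetcher until at least `min_count` events are reported.

    `fetch_events` is any callable returning the parsed /events JSON.
    Returns the events seen on the last poll (possibly an empty list).
    """
    deadline = time.monotonic() + timeout
    while True:
        reply = fetch_events()
        events = reply.get("events", [])
        # Fall back to the list length if "count" is missing from the reply.
        if reply.get("count", len(events)) >= min_count:
            return events
        if time.monotonic() >= deadline:
            return events
        sleep(interval)
```

With requests, the fetcher could be `lambda: requests.get(f"{API}/events").json()`, called after `requests.post(f"{API}/observe")`.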

Raw CDP

Endpoint  Method  Body
/cdp      POST    {method, params, session?}

Full Chrome DevTools Protocol passthrough for anything the convenience endpoints don't cover.
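As an illustration, a /cdp request body can be assembled with a small helper. This is a sketch under two assumptions: that an empty `params` object is acceptable for methods without parameters, and that `session` carries a session id and may be omitted to use the attached session. `cdp_body` is a hypothetical name. Page.reload and its ignoreCache parameter are standard CDP.

```python
def cdp_body(method, params=None, session=None):
    """Build the JSON body for POST /cdp.

    Assumptions: "params" may be an empty object, and "session" is optional.
    """
    body = {"method": method, "params": params or {}}
    if session is not None:
        body["session"] = session
    return body


# Example body: reload the attached page, bypassing the cache.
reload_request = cdp_body("Page.reload", {"ignoreCache": True})
```

The resulting dictionary can be sent with `requests.post(f"{API}/cdp", json=reload_request)`.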

Required workflow

You must attach to a tab before interacting with it. Without an attached session, interaction endpoints return {"error": "No session attached"}.

1. GET /tabs                          → find an existing tab
   or POST /navigate {"url": "..."}   → open a new tab (auto-attaches)
2. POST /attach {"target_id": "..."}  → attach to existing tab
3. POST /eval, /click, /type, etc.    → interact with the page
4. POST /detach                       → release the tab when done

/navigate auto-attaches to the new tab. Existing tabs need an explicit /attach.

Active tabs show a red dot (🔴) in their title so the user knows which tab is agent-controlled. The marker is automatically removed on /detach and cleaned up on startup if a previous session crashed.
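Because the marker and debugger stay on the tab until /detach is called, it can help to guarantee the detach even when the agent code raises. One way to sketch this is a context manager. `attached_tab` and the `post` callable are hypothetical, and the check for an `error` key on /navigate is an assumption modeled on the error shape shown above.

```python
from contextlib import contextmanager


@contextmanager
def attached_tab(post, url):
    """Open `url` via /navigate, yield the reply, and always POST /detach afterwards.

    `post` is any callable (path, body) -> parsed JSON reply.
    """
    info = post("/navigate", {"url": url})
    if "error" in info:  # assumption: failures are reported as {"error": "..."}
        raise RuntimeError(info["error"])
    try:
        yield info
    finally:
        # /detach takes no body, so None is passed through to the transport.
        post("/detach", None)
```

With requests, `post` could be `lambda path, body: requests.post(f"{API}{path}", json=body).json()`, and the agent's work goes inside `with attached_tab(post, "https://example.com") as info:`.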

/type — framework-compatible text input

/type works with standard inputs, React/Vue/Angular controlled components, and contenteditable elements (ChatGPT, Notion, etc.):

  • Standard inputs/textareas: Uses the native value setter (nativeInputValueSetter, read from the element's prototype) to bypass React's internal value tracking, then fires input and change events
  • Contenteditable: Uses document.execCommand('insertText') which triggers all framework event handlers
  • clear (default true): Clears existing value before typing
  • submit (default false): After typing, clicks a submit button or simulates Enter
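For multi-field forms, the clear and submit options combine naturally: clear every field, and submit only after the last one. The sketch below builds the /type bodies for such a sequence. `form_requests` is a hypothetical helper, and the selectors in the usage note are placeholders.

```python
def form_requests(fields, submit_last=True):
    """Turn {selector: text} into a list of /type bodies.

    Every field is cleared first; only the final field sets submit.
    """
    items = list(fields.items())
    bodies = []
    for index, (selector, text) in enumerate(items):
        is_last = index == len(items) - 1
        bodies.append({
            "selector": selector,
            "text": text,
            "clear": True,
            "submit": submit_last and is_last,
        })
    return bodies
```

Each body can then be sent in order, for example `for body in form_requests({"#user": "alice", "#pass": "s3cret"}): requests.post(f"{API}/type", json=body)`.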

/wait — wait for elements

# Wait up to 5 seconds for an element to appear
r = requests.post(f"{API}/wait", json={"selector": ".results", "timeout": 5000})
r.json()  # {"found": True, "elapsed_ms": 1200, "tag": "DIV", "text": "..."}

# Wait for element to be visible (not just in DOM)
requests.post(f"{API}/wait", json={"selector": ".modal", "visible": True, "timeout": 3000})
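When a missing element should stop the script instead of being silently ignored, the /wait reply can be turned into an exception. This is a sketch that assumes a timeout is reported as `found: false` and not as an HTTP error. `require_element` and the `post` callable are hypothetical.

```python
def require_element(post, selector, timeout_ms=5000, visible=False):
    """POST /wait and raise TimeoutError if the element was not found.

    `post` is any callable (path, body) -> parsed JSON reply.
    """
    reply = post("/wait", {"selector": selector, "timeout": timeout_ms, "visible": visible})
    if not reply.get("found"):
        raise TimeoutError(f"{selector!r} not found within {timeout_ms} ms")
    return reply
```

With requests, `post` could be `lambda path, body: requests.post(f"{API}{path}", json=body).json()`.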

Tips

Use textContent instead of innerText for background tabs

Chrome skips layout computation for non-visible tabs, so innerText can come back empty there. Prefer textContent, which does not depend on layout:

r = requests.post(f"{API}/eval", json={"js": "document.body.textContent.substring(0, 5000)"})

GET /text uses innerText internally — prefer /eval with textContent when the tab might be in the background.
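Since textContent also includes unrendered whitespace and the contents of script and style elements, it can help to scope the read to a selector and normalize whitespace on the Python side. A small sketch with hypothetical helper names `text_content_js` and `squeeze`:

```python
import json


def text_content_js(selector="body", limit=5000):
    """Build a JS expression reading textContent of the first match, truncated to `limit` chars."""
    quoted = json.dumps(selector)  # quote the selector as a JS string literal
    return f"(document.querySelector({quoted})?.textContent ?? '').substring(0, {int(limit)})"


def squeeze(text):
    """Collapse runs of whitespace (newlines, tabs, indentation) into single spaces."""
    return " ".join(text.split())
```

A call might look like `squeeze(requests.post(f"{API}/eval", json={"js": text_content_js("main")}).json()["value"])`, where `"main"` is a placeholder selector.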

Use existing login sessions

The browser runs your real Chrome profile. If you're logged into Gmail, GitHub, etc., those sessions are available:

requests.post(f"{API}/navigate", json={"url": "https://github.com/notifications"})

Extract structured data

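import json

# The page-side function returns a JSON string; /eval passes it back as "value".
r = requests.post(f"{API}/eval", json={"js": """(function() {
    var items = [];
    document.querySelectorAll('.item').forEach(function(el) {
        var heading = el.querySelector('h3');
        var link = el.querySelector('a');
        if (!heading || !link) return;  // skip incomplete items instead of throwing
        items.push({
            title: heading.textContent.trim(),
            url: link.href
        });
    });
    return JSON.stringify(items);
})()"""})
data = json.loads(r.json()["value"])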

App models

The apps/ directory contains DOM models for specific websites — CSS selectors, interaction recipes, and data extraction patterns that have been verified against live pages.

apps/
  chatgpt.com.json       Selectors and extraction for ChatGPT
  mail.google.com.json   Selectors and extraction for Gmail

These are living documents. Agents can load them before interacting with a site to avoid re-discovering selectors, and update them when sites change. See apps/README.md for the schema.
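An agent could look up the model for the page it is about to visit by hostname. The sketch below assumes model files are named exactly `<hostname>.json`, as the two listed files suggest. It does not interpret the schema, which is described in apps/README.md. `model_path` and `load_model` are hypothetical helpers.

```python
import json
from pathlib import Path
from urllib.parse import urlparse


def model_path(url, apps_dir="apps"):
    """Map a page URL to the expected model file, e.g. apps/chatgpt.com.json."""
    host = urlparse(url).hostname
    if not host:
        raise ValueError(f"no hostname in {url!r}")
    return Path(apps_dir) / f"{host}.json"


def load_model(url, apps_dir="apps"):
    """Return the parsed model for `url`, or None if no model file exists."""
    path = model_path(url, apps_dir)
    if not path.is_file():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

Returning None for unknown sites lets the agent fall back to discovering selectors itself.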

File structure

browser.py              HTTP API gateway (port 9223) — the main entry point
relay.py                CDP relay server (port 9222) — started by browser.py
chrome_cdp.py           Standalone relay launcher (for custom integrations)
extension/
  manifest.json         Chrome MV3 extension manifest
  service-worker.js     CDP command handler via chrome.debugger
  keep-alive.js         Prevents MV3 service worker suspension
  popup.html/js         Extension popup showing connection status
apps/
  chatgpt.com.json      ChatGPT DOM model
  mail.google.com.json  Gmail DOM model
  README.md             App model schema and conventions

License

MIT
