A lightweight approach to Chromium automation using basic CDP commands.
NOTE: Breaking changes may occur until the API is finalized.
- Overview
- Why Minimalistic?
- Key Features
- Installation
- Quick Start
- Examples
- Configuration
- Advanced Usage
- Project Status & Roadmap
- Contributions
gopilot is my attempt to provide a simple, minimalistic API for automating Chromium browsers. It's not meant to be another Puppeteer. Instead, it's focused on the essential features most users need for straightforward browser tasks—no fluff, just what you need.
Under the hood, gopilot uses github.com/mafredri/cdp for communication with Chrome. That library is inspired by gRPC and provides a really nice and easy API.
I wanted to simplify browser automation by sticking to the core functionalities that most of us use:
- Navigation to web pages
- Clicking on elements
- Typing text
- Taking screenshots
- Extracting HTML content
I’ve also added some features for intercepting requests, which is handy if you want to cancel or grab AJAX info. Overall, gopilot aims to be a lightweight tool that doesn’t bog you down with unnecessary complexity.
- Headful mode support: Designed to run headful, and compatible with Docker using Xvfb for display.
- Headless mode: Easily switch to headless operation when needed.
- Navigate to a specified URL
- Element search: find and/or wait for elements
- Click on elements
- Get and set HTML content
- Intercept network requests and responses, for those who want to dig deeper
- Set, get, and clear cookies and local storage
- Screenshots: capture the current page's viewport, the full page, or a single element within its bounding box
- Text typing: just provide the text to be written; a fixed delay or a func can be supplied to control per-keystroke delays
- Go 1.24.0 or later
- Chrome or Chromium browser installed on your system
To install gopilot, use the standard Go package installation command:
go get github.com/falmar/gopilot
Import it in your Go code:
import "github.com/falmar/gopilot"
Here's a very basic example of how to use gopilot to open a URL:
package main

import (
	"context"
	"log/slog"
	"os"
	"os/signal"
	"time"

	"github.com/falmar/gopilot"
)

func main() {
	// Cancel the context on Ctrl+C. os.Kill (SIGKILL) cannot be trapped,
	// so only os.Interrupt is registered here.
	ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt)
	defer cancel()

	logger := slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{
		Level: slog.LevelDebug,
	}))

	// Launch a browser with the default configuration.
	cfg := gopilot.NewBrowserConfig()
	b := gopilot.NewBrowser(cfg, logger)
	err := b.Open(ctx, &gopilot.BrowserOpenInput{})
	if err != nil {
		logger.Error("unable to open browser", "error", err)
		return
	}
	defer b.Close(ctx)

	// Open a new page in the browser.
	pOut, err := b.NewPage(ctx, &gopilot.BrowserNewPageInput{})
	if err != nil {
		logger.Error("unable to open page", "error", err)
		return
	}
	page := pOut.Page
	defer page.Close(ctx)

	// Navigate and wait for the DOMContentLoaded event.
	_, err = page.Navigate(ctx, &gopilot.PageNavigateInput{
		URL:                "https://www.google.com",
		WaitDomContentLoad: true,
	})
	if err != nil {
		logger.Error("unable to navigate", "error", err)
		return
	}

	time.Sleep(2 * time.Second)
	// do some magic ...
}
For more practical examples of how to use gopilot, check out the examples provided:
- Click Element - Demonstrates how to find and click on elements in a web page
- Cookies - Shows how to set, get, and clear cookies
- Evaluate JS - Examples of executing JavaScript in the browser context
- External Browser - Shows how to connect to an existing Chrome instance instead of launching a new one
- Local Storage - Shows how to interact with browser local storage
- Open Chrome - Basic example of launching a Chrome browser
- Open URL - Simple example of navigating to a URL
- Screenshots - Shows how to capture screenshots of pages or elements
- Search - Demonstrates how to search for elements on a page
- Typing - Examples of typing text into input fields
- Request Modifier - Demonstrates how to modify outgoing requests and provide custom responses
- Listen XHR - Demonstrates how to intercept and monitor XHR requests
By default, gopilot runs in headful mode, which may require a display server when running in a Docker container. To
switch to headless mode, simply call the EnableHeadless method on the BrowserConfig object. You can start the
browser in headless mode as follows:
// EnableHeadless will make the browser start as headless
cfg := gopilot.NewBrowserConfig()
cfg.EnableHeadless()
gopilot can connect to existing Chrome/Chromium instances instead of launching a new process. This is useful for debugging, reusing browsers across multiple runs, or working with browsers that have specific profiles or extensions loaded.
Start Chrome with remote debugging:
chromium --remote-debugging-port=9222
Connect gopilot to the external browser:
cfg := gopilot.NewBrowserConfig()
cfg.ConnectionURL = "http://127.0.0.1:9222"
b := gopilot.NewBrowser(cfg, logger)
err := b.Open(ctx, &gopilot.BrowserOpenInput{})
if err != nil {
	// Connection failed - browser may not be running
	return
}
defer b.Close(ctx) // Closes pages but does NOT kill the browser
Session-Based Page Tracking:
gopilot only manages pages it creates (via NewTab: true). This means:
- Close() only closes pages created by this gopilot instance (session pages)
- User tabs and pages from other gopilot instances are preserved
- Multiple gopilot instances can safely share the same browser without conflicts
GetPages() vs GetAllPages():
- GetPages() - Returns only pages created by this instance (closeable)
- GetAllPages() - Returns ALL pages in the browser for inspection (calling Close() on these is a no-op)
See the External Browser example for a complete demonstration.
gopilot provides several configuration options to customize browser behavior:
The BrowserConfig struct allows you to configure how the browser is launched:
type BrowserConfig struct {
	// Path specifies the path to the browser executable
	Path string
	// DebugPort specifies the port for debugging connections
	DebugPort string
	// Args contains additional command-line arguments
	Args []string
	// Envs holds environment variables for the browser process
	Envs []string
	// OpenTimeout defines how long to wait for Chrome to print the
	// "DevTools listening on" message during startup. If nil, defaults to 5s.
	OpenTimeout *time.Duration
}
When you call gopilot.NewBrowserConfig(), it creates a configuration with these defaults:
- Browser Path: Uses the Chrome executable specified by the GOPILOT_CHROME_EXECUTABLE environment variable, or defaults to "chromium"
- Debug Port: "9222"
- Default Arguments: Several arguments for optimal browser operation:
  - --remote-allow-origins=*
  - --no-first-run
  - --no-service-autorun
  - --no-default-browser-check
  - --homepage=about:blank
  - And several others for stability and performance
- GOPILOT_CHROME_EXECUTABLE: Set this to specify the path to your Chrome or Chromium executable. For example:
export GOPILOT_CHROME_EXECUTABLE="/usr/bin/google-chrome"
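The fallback described above (environment variable first, then "chromium") can be reproduced in a few lines, which is handy for checking which executable would be picked up before launching anything. This is a sketch under assumptions, not gopilot's own code: resolveChromePath is a hypothetical helper, and treating an empty variable as unset is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// resolveChromePath is a hypothetical helper mirroring the documented
// default: use GOPILOT_CHROME_EXECUTABLE when set, else "chromium".
// Assumption: an empty value is treated the same as an unset variable.
func resolveChromePath(getenv func(string) string) string {
	if p := getenv("GOPILOT_CHROME_EXECUTABLE"); p != "" {
		return p
	}
	return "chromium"
}

func main() {
	path := resolveChromePath(os.Getenv)

	// LookPath checks that the executable can actually be found.
	if _, err := exec.LookPath(path); err != nil {
		fmt.Println("browser executable not found:", err)
		return
	}
	fmt.Println("using browser executable:", path)
}
```

Passing the lookup function as a parameter keeps the helper easy to test without touching the real environment.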
You can add custom command-line arguments to the browser:
cfg := gopilot.NewBrowserConfig()
cfg.AddArgument("--disable-gpu")
cfg.AddArgument("--window-size=1280,720")
gopilot is currently in active development ("WIP" - Work In Progress). While the core functionality is stable enough for many use cases, the API may change as we refine and improve the library.
- Core browser automation features are implemented and working
- API is functional but may undergo refinements
- Documentation and examples are being expanded
- Listen for page/target events to change local data
- Integration tests
- Performance optimizations
- Additional helper methods for common tasks
- API stabilization
- Improved error handling and recovery
- Enhanced documentation
- Performance improvements
Contributions are welcome! If you've got a feature request or an idea to share, reach out.
