gopilot is a minimal Go library for automating Chromium browsers with essential features like navigation, clicking, and HTML extraction.

gopilot

A lightweight approach to Chromium automation using basic CDP commands.

NOTE: Breaking changes may occur until the API is finalized.

Overview

gopilot is my attempt to provide a simple, minimalistic API for automating Chromium browsers. It's not meant to be another Puppeteer. Instead, it's focused on the essential features most users need for straightforward browser tasks—no fluff, just what you need.

Under the hood, gopilot uses github.com/mafredri/cdp for Chrome communication. That library's gRPC-inspired design provides a really nice and easy API.

Why Minimalistic?

I wanted to simplify browser automation by sticking to the core functionalities that most of us use:

  • Navigation to web pages
  • Clicking on elements
  • Typing text
  • Taking screenshots
  • Extracting HTML content

I’ve also added some features for intercepting requests, which is handy if you want to cancel or grab AJAX info. Overall, gopilot aims to be a lightweight tool that doesn’t bog you down with unnecessary complexity.

Key Features

  • Headful mode support: Designed to run headful, and compatible with Docker using Xvfb for display.
  • Headless mode: Easily switch to headless operation when needed.
  • Navigation: Navigate to a specified URL
  • Element search: Find and/or wait for elements
  • Clicking: Click on elements
  • HTML content: Get and set HTML content
  • Request/response interception: Intercept network requests and responses, for those who want to dig deeper
  • Cookies and local storage: Set, get, and clear cookies and local storage
  • Screenshots: Capture the current page's viewport, the full page, or an element within its bounding box
  • Text typing: Just provide the text to be written; a fixed delay or a func can be supplied to control per-keystroke delays

Installation

Prerequisites

  • Go 1.24.0 or later
  • Chrome or Chromium browser installed on your system

Installing gopilot

To install gopilot, use the standard Go package installation command:

go get github.com/falmar/gopilot

Import it in your Go code:

import "github.com/falmar/gopilot"

Quick Start

Here's a very basic example of how to use gopilot to open a URL:

package main

import (
	"context"
	"log/slog"
	"os"
	"os/signal"
	"time"

	"github.com/falmar/gopilot"
)

func main() {
	// os.Kill (SIGKILL) cannot be caught, so only os.Interrupt is registered here.
	ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt)
	defer cancel()

	logger := slog.New(slog.NewTextHandler(os.Stderr, &slog.HandlerOptions{
		Level: slog.LevelDebug,
	}))

	cfg := gopilot.NewBrowserConfig()
	b := gopilot.NewBrowser(cfg, logger)

	err := b.Open(ctx, &gopilot.BrowserOpenInput{})
	if err != nil {
		logger.Error("unable to open browser", "error", err)
		return
	}
	defer b.Close(ctx)

	pOut, err := b.NewPage(ctx, &gopilot.BrowserNewPageInput{})
	if err != nil {
		logger.Error("unable to open page", "error", err)
		return
	}
	page := pOut.Page
	defer page.Close(ctx)

	_, err = page.Navigate(ctx, &gopilot.PageNavigateInput{
		URL:                "https://www.google.com",
		WaitDomContentLoad: true,
	})
	if err != nil {
		logger.Error("unable to navigate", "error", err)
		return
	}

	time.Sleep(2 * time.Second)

	// do some magic ...
}

Examples

For more practical examples of how to use gopilot, check out the examples provided:

  • Click Element - Demonstrates how to find and click on elements in a web page
  • Cookies - Shows how to set, get, and clear cookies
  • Evaluate JS - Examples of executing JavaScript in the browser context
  • External Browser - Shows how to connect to an existing Chrome instance instead of launching a new one
  • Local Storage - Shows how to interact with browser local storage
  • Open Chrome - Basic example of launching a Chrome browser
  • Open URL - Simple example of navigating to a URL
  • Screenshots - Shows how to capture screenshots of pages or elements
  • Search - Demonstrates how to search for elements on a page
  • Typing - Examples of typing text into input fields
  • Request Modifier - Demonstrates how to modify outgoing requests and provide custom responses
  • Listen XHR - Demonstrates how to intercept and monitor XHR requests

Advanced Usage

Headless Mode

By default, gopilot runs in headful mode, which may require a display server (such as Xvfb) when running in a Docker container. To run headless instead, call the EnableHeadless method on the BrowserConfig object:

// EnableHeadless will make the browser start as headless
cfg := gopilot.NewBrowserConfig()
cfg.EnableHeadless()

Connecting to External Browsers

gopilot can connect to existing Chrome/Chromium instances instead of launching a new process. This is useful for debugging, reusing browsers across multiple runs, or working with browsers that have specific profiles or extensions loaded.

Start Chrome with remote debugging:

chromium --remote-debugging-port=9222

Connect gopilot to the external browser:

cfg := gopilot.NewBrowserConfig()
cfg.ConnectionURL = "http://127.0.0.1:9222"

b := gopilot.NewBrowser(cfg, logger)
err := b.Open(ctx, &gopilot.BrowserOpenInput{})
if err != nil {
    // Connection failed - browser may not be running
    return
}
defer b.Close(ctx) // Closes pages but does NOT kill the browser

Session-Based Page Tracking: gopilot only manages pages it creates (via NewTab: true). This means:

  • Close() only closes pages created by this gopilot instance (session pages)
  • User tabs and pages from other gopilot instances are preserved
  • Multiple gopilot instances can safely share the same browser without conflicts

GetPages() vs GetAllPages():

  • GetPages() - Returns only pages created by this instance (closeable)
  • GetAllPages() - Returns ALL pages in the browser for inspection (calling Close() on these is a no-op)

See the External Browser example for a complete demonstration.

Configuration Options

gopilot provides several configuration options to customize browser behavior:

Browser Configuration

The BrowserConfig struct allows you to configure how the browser is launched:

type BrowserConfig struct {
    // Path specifies the path to the browser executable
    Path string

    // DebugPort specifies the port for debugging connections
    DebugPort string

    // Args contains additional command-line arguments
    Args []string

    // Envs holds environment variables for the browser process
    Envs []string

    // OpenTimeout defines how long to wait for Chrome to print the
    // "DevTools listening on" message during startup. If nil, defaults to 5s.
    OpenTimeout *time.Duration
}

Default Configuration

When you call gopilot.NewBrowserConfig(), it creates a configuration with these defaults:

  • Browser Path: Uses the Chrome executable specified by the GOPILOT_CHROME_EXECUTABLE environment variable, or defaults to "chromium"
  • Debug Port: "9222"
  • Default Arguments: Several arguments for optimal browser operation:
    • --remote-allow-origins=*
    • --no-first-run
    • --no-service-autorun
    • --no-default-browser-check
    • --homepage=about:blank
    • And several others for stability and performance

Environment Variables

  • GOPILOT_CHROME_EXECUTABLE: Set this to specify the path to your Chrome or Chromium executable. For example:
    export GOPILOT_CHROME_EXECUTABLE="/usr/bin/google-chrome"

Adding Custom Arguments

You can add custom command-line arguments to the browser:

cfg := gopilot.NewBrowserConfig()
cfg.AddArgument("--disable-gpu")
cfg.AddArgument("--window-size=1280,720")
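
When argument values come from program settings rather than literals, building the flag string in one place avoids formatting mistakes. The following is a small sketch. windowSizeArg is a hypothetical helper, not part of gopilot, and its result would be passed to cfg.AddArgument.

```go
package main

import "fmt"

// windowSizeArg is a hypothetical helper that formats Chrome's
// --window-size flag from integer dimensions and rejects non-positive values.
func windowSizeArg(width, height int) (string, error) {
	if width <= 0 || height <= 0 {
		return "", fmt.Errorf("invalid window size %dx%d", width, height)
	}
	return fmt.Sprintf("--window-size=%d,%d", width, height), nil
}

func main() {
	arg, err := windowSizeArg(1280, 720)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(arg) // --window-size=1280,720
}
```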

Project Status & Roadmap

gopilot is currently in active development ("WIP" - Work In Progress). While the core functionality is stable enough for many use cases, the API may change as we refine and improve the library.

Current Status

  • Core browser automation features are implemented and working
  • API is functional but may undergo refinements
  • Documentation and examples are being expanded

Planned Features

  • Listen for page/target events to change local data
  • Integration tests
  • Performance optimizations
  • Additional helper methods for common tasks

Development Priorities

  1. API stabilization
  2. Improved error handling and recovery
  3. Enhanced documentation
  4. Performance improvements

Contributions

Contributions are welcome! If you've got a feature request or an idea to share, reach out.
