GTK3-based screenshot OCR clipboard utility with Tesseract

mreinrt/ocrcap

OCR Capture (ocrcap)

ocrcap is a standalone GTK3-based screenshot utility for Linux that allows you to select a region of your screen and automatically copy any text in that region to the clipboard using Tesseract OCR.

It is designed to work in desktop environments like XFCE, providing a quick way to extract text from images without saving files manually.


Features

  • Select a region on screen, with an optional crosshair cursor
  • Perform OCR on the selected region using Tesseract
  • Automatically copy recognized text to the system clipboard
  • Full-screen overlay with configurable semi-transparent selection rectangle
  • Configurable colors and crosshair via preferences dialog
  • Lightweight and standalone; does not interfere with other screenshot tools
  • Works with all paste methods: Ctrl+V, middle-click, context menu, triple-click/tap

Working Architecture

    OCRCap Process
    ├── Captures screen region
    ├── Runs OCR via Tesseract
    ├── Sets GTK clipboards (backup)
    ├── Saves text to /tmp/ocrcap_XXXX.txt
    ├── Forks PRIMARY xclip process (middle-click/context menu)
    ├── Forks CLIPBOARD xclip process (Ctrl+V)
    └── Exits after 2 seconds
    Background xclip Processes
    ├── PRIMARY xclip ─── provides text for middle-click/context menu/triple-tap
    └── CLIPBOARD xclip ── provides text for Ctrl+V

Requirements

  • Linux (tested on Gentoo/XFCE4, should work on other distros with GTK3/X11)
  • GTK3 development libraries
  • Leptonica
  • Tesseract OCR
  • xclip command-line tool
  • GCC (or any C compiler compatible with GTK3/Tesseract)

Gentoo example:

sudo emerge app-text/tesseract media-libs/leptonica x11-misc/xclip x11-libs/gtk+:3

Debian/Ubuntu example:

sudo apt install libgtk-3-dev libleptonica-dev libtesseract-dev tesseract-ocr xclip build-essential

Building

Clone or download the repository, then enter its directory:

git clone <repository-url>
cd ocrcap

Compile the main application:

gcc main.c -o ocrcap $(pkg-config --cflags --libs gtk+-3.0 lept tesseract) -lm

Compile the preferences dialog (optional):

gcc ocrcap_prefs.c -o ocrcap_prefs $(pkg-config --cflags --libs gtk+-3.0 lept tesseract) -lm

Alternatively, use the provided Makefile:

make

Installation

(Optional) Copy both binaries to a directory in your $PATH and make sure they are executable:

sudo cp ocrcap ocrcap_prefs /usr/local/bin/
sudo chmod +x /usr/local/bin/ocrcap /usr/local/bin/ocrcap_prefs

XFCE4 Integration

Binding ocrcap Keyboard Shortcut

  1. Open Settings Manager → Keyboard → Application Shortcuts
  2. Click Add, then enter /usr/local/bin/ocrcap as the command
  3. Press your desired key combination (e.g., Ctrl+PrtScr)

Add to the Applications Menu

  1. Ensure /usr/local/bin/ocrcap_prefs exists
  2. Navigate to ~/.local/share/applications/
  3. Create a desktop entry file: nano ~/.local/share/applications/ocrcap-preferences.desktop
  4. Paste the following:
[Desktop Entry]
Version=1.0
Type=Application
Name=OCRCap Settings
Comment=Configure OCRCAP options
Exec=/usr/local/bin/ocrcap_prefs
Icon=gtk-preferences
Terminal=false
Categories=AudioVideo;Settings;
StartupNotify=true

Now you can invoke OCR capture anywhere in XFCE with a single shortcut.

Usage

Complete the Installation and XFCE4 Integration steps first, then press the hotkey you configured in Keyboard Settings (xfce4-keyboard-settings).

Main OCR Tool

ocrcap

A full-screen overlay will appear:

  1. Click and drag to select the region containing text
  2. Release the mouse button to capture the region
  3. Recognized text will be automatically copied to your clipboard
  4. The overlay disappears, and the app exits after 2 seconds
  5. Paste anywhere: Ctrl+V, middle-click, context menu, or triple-click/tap
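
Steps 1 and 2 amount to turning a press point and a release point into a capture rectangle, whichever direction the user drags. A minimal sketch, assuming integer pixel coordinates and hypothetical names (`Rect`, `selection_from_drag`, `selection_is_empty`); ocrcap's own handling may differ:

```c
#include <assert.h>

/* Hypothetical type: top-left corner plus size, in screen pixels. */
typedef struct {
    int x, y, width, height;
} Rect;

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Build a rectangle from the press point (x0, y0) and the release point
 * (x1, y1). Both points are clamped to the screen first, so a drag that
 * ends outside the overlay still yields a valid region. */
Rect selection_from_drag(int x0, int y0, int x1, int y1,
                         int screen_w, int screen_h)
{
    assert(screen_w > 0 && screen_h > 0);

    x0 = clamp_int(x0, 0, screen_w);
    x1 = clamp_int(x1, 0, screen_w);
    y0 = clamp_int(y0, 0, screen_h);
    y1 = clamp_int(y1, 0, screen_h);

    Rect r;
    r.x = x0 < x1 ? x0 : x1;                 /* left edge */
    r.y = y0 < y1 ? y0 : y1;                 /* top edge */
    r.width = x0 < x1 ? x1 - x0 : x0 - x1;   /* always non-negative */
    r.height = y0 < y1 ? y1 - y0 : y0 - y1;
    return r;
}

/* A click without a drag produces an empty region with nothing to OCR. */
int selection_is_empty(Rect r)
{
    return r.width == 0 || r.height == 0;
}
```

Normalizing the corners up front means the capture and drawing code never has to deal with negative widths when the drag goes up or to the left.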

Preferences Tool

ocrcap_prefs

Configure:

  • Selection rectangle color and transparency
  • Crosshair enable/disable and color
  • Settings are saved to ~/.config/ocrcap/ocrcap.conf

Debugging

Recognized text is printed to stdout:

=== OCR RESULT ===
Detected text here
✅ Text ready! Use Ctrl+V or middle-click to paste

Ensure Tesseract is installed and working:

tesseract ~/Pictures/test.png stdout

Configuration File

Settings are stored in ~/.config/ocrcap/ocrcap.conf:

region_r = 0.980000
region_g = 0.380000
region_b = 0.008000
region_alpha = 0.200000
crosshair_enable = 0
crosshair_r = 1.000000
crosshair_g = 0.000000
crosshair_b = 0.000000
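
A file in this `key = value` form can be read with a few lines of standard C. The sketch below is illustrative only: the `OcrcapConfig` struct and `ocrcap_config_read` function are hypothetical names, and the parsing strategy (skip anything that does not look like `key = number`) is an assumption rather than ocrcap's documented behavior.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical struct mirroring the keys shown in ocrcap.conf. */
typedef struct {
    double region_r, region_g, region_b, region_alpha;
    int crosshair_enable;
    double crosshair_r, crosshair_g, crosshair_b;
} OcrcapConfig;

/* Read "key = value" lines from `in` into `cfg`. Unknown keys and malformed
 * lines are skipped, so fields keep whatever the caller initialized them to.
 * Returns the number of recognized keys that were applied. */
int ocrcap_config_read(FILE *in, OcrcapConfig *cfg)
{
    assert(in != NULL && cfg != NULL);

    struct { const char *name; double *dst; } fields[] = {
        {"region_r", &cfg->region_r},
        {"region_g", &cfg->region_g},
        {"region_b", &cfg->region_b},
        {"region_alpha", &cfg->region_alpha},
        {"crosshair_r", &cfg->crosshair_r},
        {"crosshair_g", &cfg->crosshair_g},
        {"crosshair_b", &cfg->crosshair_b},
    };
    const size_t nfields = sizeof fields / sizeof fields[0];

    char line[256];
    int applied = 0;
    while (fgets(line, sizeof line, in) != NULL) {
        char key[64];
        double value;
        /* Key: up to 63 characters that are not '=', space, or tab. */
        if (sscanf(line, " %63[^= \t] = %lf", key, &value) != 2)
            continue;

        if (strcmp(key, "crosshair_enable") == 0) {
            cfg->crosshair_enable = (value != 0.0);
            applied++;
            continue;
        }
        for (size_t i = 0; i < nfields; i++) {
            if (strcmp(key, fields[i].name) == 0) {
                *fields[i].dst = value;
                applied++;
                break;
            }
        }
    }
    return applied;
}
```

One caveat with this approach: `sscanf` parses decimals according to the current locale, and `gtk_init()` sets the locale from the environment, so a GTK program may prefer a locale-independent parser such as GLib's `g_ascii_strtod()`.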

License

This project is licensed under the GNU General Public License v2 or later. See LICENSE for details.


OCRCAP was created by BigSlimThic, a hopelessly broke digital low-life who somehow grew up somewhere between Philadelphia and probably South East Asia, surviving on instant noodles and bad Wi-Fi. Rumor has it he has a smoking hot girlfriend, unless she left him for a guy with a real job. Against all odds, he somehow managed to survive the apocalypse of homelessness, poverty, and questionable life choices to create this AI.

Donate to BigSlimThic: Help fund his lifelong quest to buy an ergonomic chair, a better Wi-Fi router, and possibly a vacation somewhere that isn't just his imagination.

BTC: 3GtCgHhMP7NTxsdNjcDs7TUNSBK6EXoAzz
ETH: 0x5f1ed610a96c648478a775644c9244bf4e78631e


Changelog

v1.1 (GTK Clipboard Version)

  • Dual xclip processes: one for PRIMARY, one for CLIPBOARD
  • Uses setsid() to properly detach xclip processes
  • Redirects xclip output to /dev/null with freopen() to silence it
  • 2-second delay before exit (increased from 300ms)
  • GTK clipboards set as backup (belt and suspenders)
  • Selection monitoring with owner-change signal
  • Automatic cleanup of xclip processes when pasted
  • Added configuration file support (~/.config/ocrcap/ocrcap.conf)
  • Added crosshair feature (configurable colors)
  • Switched to GTK clipboard functions instead of xclip
  • Added configurable region colors and transparency
  • Exited after 300ms delay
  • Fixed all three paste methods: Ctrl+V, context-menu paste, and triple-tap/middle-click

v1.0 (Original Script)

  • Simple fullscreen overlay with selection rectangle
  • No configuration file
  • No crosshair option
  • Used xclip as primary copy method
  • Exited 300ms after OCR completed
  • Text copied to both CLIPBOARD and PRIMARY via xclip
  • No configurable colors
  • No preferences dialog
  • Worked reliably but no customization
