Image Highlight Extractor Script

This script processes a directory of images (JPG, PNG, WEBP, etc.), uses the Google Gemini API to extract text segments highlighted in yellow or green, sorts the extracted text by approximate reading order, and saves the results as a formatted Markdown bullet list. It also prints the results for each image to the console as it runs.
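
As a rough illustration of the first and last steps described above (finding the images and writing the bullet list), here is a minimal sketch. It is not the script's actual code: the names `IMAGE_SUFFIXES`, `find_images`, and `to_markdown`, the exact set of extensions, and the per-image `##` heading are all assumptions made for this example.

```python
from pathlib import Path

# Assumed set of extensions; the real script may accept more or fewer.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def find_images(input_dir):
    """Return image files in input_dir, sorted by path (hypothetical helper)."""
    return sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def to_markdown(results):
    """Format {image name: [highlight, ...]} as Markdown bullet lists.

    The heading per image is an assumption about the output layout.
    """
    lines = []
    for name, highlights in results.items():
        lines.append(f"## {name}")
        lines.extend(f"- {text}" for text in highlights)
        lines.append("")  # blank line between images
    return "\n".join(lines)
```

For example, `to_markdown({"page1.png": ["first highlight"]})` produces a heading for `page1.png` followed by one bullet.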

Prerequisites

Python 3.7 or higher installed.
Access to the Google Gemini API and a valid API Key.

Setup

Clone/Download:
- Download the script (extract_highlights.py or your chosen name) and the requirements.txt file into a project directory.
Create Virtual Environment (Highly Recommended):
- Open your terminal or command prompt and navigate to your project directory.
- Run:
```
python -m venv venv
```
- Activate the environment:
  - macOS / Linux: source venv/bin/activate
  - Windows (CMD): venv\Scripts\activate
  - Windows (PowerShell): venv\Scripts\Activate.ps1
  - (You should see (venv) at the beginning of your terminal prompt)
Install Dependencies:
- With your virtual environment activated, run:
```
pip install -r requirements.txt
```
  (This installs google-generativeai, Pillow, and optionally python-dotenv)
Set API Key:
- Option A (Recommended - .env file):
  - Create a file named .env (the filename starts with a dot) in your project directory.
  - Add your API key to this file on a single line:
```
GEMINI_API_KEY=YOUR_API_KEY_HERE
```
  - Replace YOUR_API_KEY_HERE with your actual key.
  - Important: If using Git, add .env to your .gitignore file to avoid committing your key.
- Option B (Environment Variable):
  - Set the GEMINI_API_KEY environment variable in your terminal session before running the script. How you do this depends on your operating system:
    - macOS / Linux:
      export GEMINI_API_KEY="YOUR_API_KEY_HERE"
    - Windows (CMD):
      set GEMINI_API_KEY=YOUR_API_KEY_HERE
    - Windows (PowerShell):
      $env:GEMINI_API_KEY="YOUR_API_KEY_HERE"
  - Note: A variable set this way lasts only for the current terminal session; set it again in each new session.
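
Both options end with the key available under the name GEMINI_API_KEY. The sketch below shows one way a script could read it, assuming python-dotenv is used when installed and that a missing key should stop the run with a clear message. The function name `get_api_key` and its `environ` parameter are hypothetical and are not taken from extract_highlights.py.

```python
import os

try:
    # python-dotenv is optional; without it only real environment variables are read.
    from dotenv import load_dotenv
except ImportError:
    load_dotenv = None


def get_api_key(environ=None):
    """Return the Gemini API key, or raise a clear error if it is missing.

    `environ` exists only so the lookup can be exercised with a plain dict;
    by default the process environment is used.
    """
    if environ is None:
        if load_dotenv is not None:
            # Loads a .env file if one is found; variables that are
            # already set in the environment are not overridden.
            load_dotenv()
        environ = os.environ
    key = environ.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; use a .env file or an environment variable."
        )
    return key
```

Because existing variables are not overridden, a key exported in the terminal (Option B) would take priority over one in the .env file (Option A) in this sketch.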

Usage

Run the script from your terminal (make sure your virtual environment is activated first).

python extract_highlights.py -i <path_to_images> -o <output_markdown_file> [options]

Command-Line Arguments

This section details the arguments you can pass to the script:

-i, --input-dir (Required):
- Path to the directory containing the images you want to process.
- Example: -i ./my_scans
-o, --output-file (Required):
- Path where the output Markdown file will be saved.
- Example: -o report.md
-t, --tolerance (Optional):
- The vertical pixel tolerance used when grouping text lines for sorting. A larger value lets text at slightly different heights count as the same line, which helps when the image is slightly skewed.
- Default: 10
- Example: -t 15
-s, --sleep (Optional):
- The number of seconds to pause between processing each image. This helps manage API rate limits.
- Default: 5
- Example: -s 7
-m, --model (Optional):
- The specific Gemini model name to use for the API calls.
- Default: gemini-1.5-flash-latest (if the DEFAULT_MODEL constant in the script differs, that value applies)
- Example: -m gemini-1.5-pro-latest
- Find available model names here: ai.google.dev/models/gemini
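
The arguments listed above map naturally onto Python's standard argparse module. The following is a minimal sketch of such a parser, using the flags and defaults documented here. The value types (integer tolerance, floating-point sleep), the help texts, and the function name `build_parser` are assumptions; the actual script may define them differently.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Extract highlighted text from images into a Markdown list."
    )
    parser.add_argument("-i", "--input-dir", required=True,
                        help="Directory containing the images to process.")
    parser.add_argument("-o", "--output-file", required=True,
                        help="Path of the Markdown file to write.")
    # Assumed to be an integer number of pixels.
    parser.add_argument("-t", "--tolerance", type=int, default=10,
                        help="Vertical pixel tolerance for grouping lines.")
    # Assumed to accept fractional seconds.
    parser.add_argument("-s", "--sleep", type=float, default=5.0,
                        help="Seconds to pause between images.")
    parser.add_argument("-m", "--model", default="gemini-1.5-flash-latest",
                        help="Gemini model name to use.")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

argparse converts the dashes in long option names to underscores, so `--input-dir` is read back as `args.input_dir`.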

Example Command

Here is an example of how to run the script with some options:

python extract_highlights.py -i ./path/to/my/images -o ./output/highlights_report.md -t 12 -s 5

This command will:

- Process images in the ./path/to/my/images directory.
- Save the formatted Markdown output to ./output/highlights_report.md.
- Use a sorting tolerance of 12 pixels.
- Wait 5 seconds between each image processing step.
- Use the default Gemini model specified in the script.
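
To make the sorting tolerance more concrete, here is one way a tolerance-based reading-order sort could work. This is a sketch under stated assumptions, not the script's actual algorithm: it assumes each extracted segment carries approximate `x` and `y` pixel coordinates, and the function name `sort_reading_order` and the dictionary shape are hypothetical.

```python
def sort_reading_order(segments, tolerance=10):
    """Order segments top-to-bottom, then left-to-right within a row.

    segments: list of dicts with 'text', 'x', 'y' keys (assumed shape).
    A segment joins the current row when its y is within `tolerance`
    pixels of the first segment in that row.
    """
    rows = []
    for seg in sorted(segments, key=lambda s: s["y"]):
        if rows and abs(seg["y"] - rows[-1][0]["y"]) <= tolerance:
            rows[-1].append(seg)   # same visual line
        else:
            rows.append([seg])     # start a new line
    ordered = []
    for row in rows:
        ordered.extend(sorted(row, key=lambda s: s["x"]))
    return ordered
```

With a tolerance of 12, two segments at y=98 and y=104 fall in the same row and are ordered by x. With a tolerance of 2 they are treated as separate lines and ordered by y alone.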
