nitinnat/cursor-tracker

CursorTracker

Unsupervised mouse cursor detection and tracking in instructional videos using tracking-by-detection.

Quick Start

# Install dependencies
poetry install && poetry shell

# From YouTube URL - single command for everything
python cursor_tracker.py \
  --url "https://youtube.com/watch?v=VIDEO_ID" \
  --output-dir ./data/my_video

# View results (open is macOS-only; use xdg-open on Linux)
open data/my_video/our_results_1/tracked_video_our_results_1.mp4

Features

  • Fully Unsupervised: Automatically discovers cursor templates, no manual annotation needed
  • End-to-End Pipeline: YouTube URL → Download → Extract → Track → Visualize
  • Robust Tracking: Handles fast motion (over 200px per frame) and instant appearance changes
  • Visual Output: Generates annotated videos with bounding boxes around detected cursors

How It Works

  1. Unsupervised Template Discovery: Uses background subtraction + blob detection to identify cursor templates
  2. Multi-Scale Template Matching: Generates cursor proposals for each frame
  3. Spatiotemporal Path Optimization: Finds optimal tracking trajectory through entire video
  4. Visualization: Draws bounding boxes on frames and creates annotated video

Installation

# Install Poetry
curl -sSL https://install.python-poetry.org | python3 -

# Clone the repository and install dependencies
git clone https://github.com/nitinnat/cursor-tracker.git
cd cursor-tracker
poetry install
poetry shell

# Create directories
mkdir -p data templates saved_models

Usage

YouTube Videos (Recommended)

Basic usage:

python cursor_tracker.py \
  --url "https://youtube.com/watch?v=VIDEO_ID" \
  --output-dir ./data/my_video

Options:

# Custom quality
--quality 1080p  # Options: 144p, 360p, 480p, 720p, 1080p, 1440p, 2160p

# Process specific frames
--start-frame 100 --end-frame 500

# Skip tracking (preprocessing only)
--skip-tracking

# Custom configuration
--config my_config.yaml

Local Video Files

# Step 1: Preprocess video
python preprocess_video.py \
  --video_path /path/to/video.mp4 \
  --output_dir ./data/my_video \
  --extract_templates

# Step 2: Track cursor
python cursor_tracker_dp.py \
  --video_name my_video \
  --base_dir ./data

# Step 3: Visualize (optional - automatic with YouTube pipeline)
python visualize_results.py \
  --video_name my_video \
  --base_dir ./data

Output Structure

data/my_video/
├── original_video.mp4              # Downloaded video
├── images/                         # Extracted frames
├── background/                     # Background masks
├── estimated_templates/            # Auto-discovered cursor templates
└── our_results_1/
    ├── our_results.txt             # Tracking results (CSV)
    ├── visualizations/             # Annotated frames
    └── tracked_video_our_results_1.mp4  # Annotated video

Visualization

Automatic (YouTube Pipeline)

Visualizations are generated automatically when using cursor_tracker.py.

Manual (Standalone)

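# --bbox_color takes B,G,R values, so "0,255,0" is green.
# (A comment placed after a trailing backslash would break the command.)
python visualize_results.py \
  --video_name my_video \
  --base_dir ./data \
  --bbox_color "0,255,0" \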
  --bbox_thickness 2 \
  --fps 30 \
  --quality 9
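
Because --bbox_color is given in BGR order, a color written in the usual RGB order needs its first and last channels swapped. The helper below is a hypothetical convenience, not part of the repository; it only builds the string to pass to the flag.

```python
def rgb_to_bgr_flag(r: int, g: int, b: int) -> str:
    """Return a "B,G,R" string for a color given in RGB order.

    Hypothetical helper for building the --bbox_color value.
    """
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError(f"channel value out of range: {channel}")
    return f"{b},{g},{r}"


# Pure red in RGB order becomes "0,0,255" in BGR order.
print(rgb_to_bgr_flag(255, 0, 0))
```

Green is unaffected by the swap, which is why "0,255,0" reads the same in both orders.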

Configuration

Edit config/config.yaml to customize:

template_matching:
  score_threshold: 0.5          # Min template match score
  use_laplacian: true           # Edge detection
  template_vicinity: 300        # Temporal window for templates
  max_scale: 2                  # Max template scale factor
  nms_overlap_threshold: 0.3    # IoU threshold for NMS

tracking:
  enabled: true                 # Enable path optimization
  dist_threshold: 150           # Max pixel distance between frames
  scale_threshold: 1.3          # Max scale change ratio
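
To give a feel for how score_threshold and nms_overlap_threshold interact, here is a minimal sketch of score filtering followed by greedy non-maximum suppression. It is an illustration under assumptions, not the repository's code: the function names, the (score, (x, y, w, h)) proposal layout, and the exact comparison operators are all hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0


def nms(proposals, score_threshold=0.5, overlap_threshold=0.3):
    """Drop low-scoring proposals, then greedily suppress overlapping ones.

    proposals: list of (score, (x, y, w, h)); this layout is an assumption.
    """
    strong = [p for p in proposals if p[0] >= score_threshold]
    kept = []
    # Visit proposals from best to worst; keep one only if it does not
    # overlap too much with any proposal that was already kept.
    for score, box in sorted(strong, key=lambda p: p[0], reverse=True):
        if all(iou(box, k[1]) <= overlap_threshold for k in kept):
            kept.append((score, box))
    return kept


proposals = [
    (0.9, (10, 10, 20, 20)),
    (0.8, (12, 12, 20, 20)),    # overlaps the first box heavily
    (0.6, (100, 50, 20, 20)),   # far away, survives
    (0.4, (200, 200, 20, 20)),  # below score_threshold
]
print(nms(proposals))
```

In this example the second box is suppressed (its IoU with the first is about 0.68) and the fourth is dropped by the score filter. Lowering score_threshold admits more proposals; lowering nms_overlap_threshold suppresses more of them.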

Performance

Tested on 8 Adobe Photoshop instructional videos (3595 frames):

| Method | VIOU | Success Rate |
|---|---|---|
| CursorTracker (Ours) | 0.365 | ~87% |
| Faster-RCNN | 0.05 | ~25% |
| Online Trackers (TLD/MIL) | 0.03 | ~15% |

  • Speed: ~0.5 seconds/frame (1280×720)
  • Robustness: Handles 200+ pixel movements and instant appearance changes

Key Scripts

| Script | Purpose |
|---|---|
| cursor_tracker.py | Main pipeline: YouTube → Track → Visualize |
| preprocess_video.py | Extract frames + background masks from local video |
| extract_templates.py | Discover cursor templates from preprocessed data |
| cursor_tracker_dp.py | Run cursor tracking with DP path optimization |
| visualize_results.py | Generate annotated frames and video |

Dependencies

Core:

  • Python >=3.10,<3.13
  • OpenCV, NumPy, scikit-image
  • PyYAML, tqdm, imageio
  • youtube-downloader (git dependency)

Optional (install with poetry install --with ml):

  • TensorFlow, Keras (for CNN filtering)

Citation

@inproceedings{cursortracker2020,
  title={Mouse Cursor Detection and Tracking in Instructional Videos},
  booktitle={IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year={2020}
}

Troubleshooting

Few templates discovered?

  • Check background subtraction quality in background/ folder
  • Adjust --consecutive_frames parameter in template extraction

Poor tracking results?

  • Tune dist_threshold and scale_threshold in config
  • Try adjusting score_threshold (lower = more proposals)

Out of memory?

  • Reduce template_vicinity parameter
  • Process in segments with --start-frame / --end-frame
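
Processing in segments can be scripted. The sketch below only builds the command lines; it uses the --url, --output-dir, --start-frame and --end-frame flags shown under Usage, and it assumes 0-based frame indices, an inclusive --end-frame, and one output directory per segment. None of those assumptions is confirmed by this README, so adjust them to match the tool's actual behavior.

```python
import shlex


def segment_commands(url, output_dir, total_frames, segment_size):
    """Build one cursor_tracker.py command per frame segment.

    Assumes 0-based frames and an inclusive --end-frame (both unverified).
    """
    commands = []
    for start in range(0, total_frames, segment_size):
        end = min(start + segment_size, total_frames) - 1
        args = [
            "python", "cursor_tracker.py",
            "--url", url,
            # Separate directory per segment so results are not overwritten.
            "--output-dir", f"{output_dir}_{start}_{end}",
            "--start-frame", str(start),
            "--end-frame", str(end),
        ]
        # shlex.join quotes arguments such as URLs containing "?".
        commands.append(shlex.join(args))
    return commands


for command in segment_commands(
    "https://youtube.com/watch?v=VIDEO_ID", "./data/my_video", 2500, 1000
):
    print(command)
```

Each segment is tracked independently, so the trajectory is not optimized across segment boundaries.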

Algorithm Overview

Phase 1: Unsupervised Template Discovery

  • Apply MOG background subtraction
  • Detect blobs (moving objects) in difference images
  • Track sequences where exactly 1 blob appears for N consecutive frames
  • Extract and save cursor templates from these sequences
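
The "exactly 1 blob for N consecutive frames" rule can be illustrated without any image processing. The sketch below assumes the per-frame blob counts have already been computed by the background subtraction and blob detection steps; the function name and return format are hypothetical, and min_length plays the role of N (presumably what --consecutive_frames controls).

```python
def single_blob_runs(blob_counts, min_length):
    """Return inclusive (start, end) frame index pairs for every run of at
    least min_length consecutive frames that contain exactly one blob."""
    runs = []
    start = None  # first frame of the current single-blob run, if any
    for i, count in enumerate(blob_counts):
        if count == 1:
            if start is None:
                start = i
        else:
            # The run (if any) ended at frame i - 1.
            if start is not None and i - start >= min_length:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last frame.
    if start is not None and len(blob_counts) - start >= min_length:
        runs.append((start, len(blob_counts) - 1))
    return runs


counts = [0, 1, 1, 1, 2, 1, 1, 0, 1, 1, 1, 1]
print(single_blob_runs(counts, min_length=3))  # [(1, 3), (8, 11)]
```

Frames 5 and 6 are skipped because that run is shorter than min_length. A larger value yields fewer but more reliable template sequences.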

Phase 2: Multi-Scale Template Matching

  • Select templates from temporal vicinity of current frame
  • Generate multi-scale template versions
  • Perform normalized cross-correlation matching
  • Apply non-maximum suppression to proposals
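
As a rough illustration of the matching step, the following pure-Python sketch slides a template over a small grayscale image and scores each position with zero-mean normalized cross-correlation. Which NCC variant the project uses is not stated here, and a real pipeline would use an optimized library routine and repeat the search for each template scale; the function names are hypothetical.

```python
import math


def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-size 2-D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p, mean_t = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(
        sum((a - mean_p) ** 2 for a in p) * sum((b - mean_t) ** 2 for b in t)
    )
    # A constant patch has zero variance; treat it as "no match".
    return num / den if den else 0.0


def best_match(image, template):
    """Return (score, (x, y)) of the best-scoring template position."""
    th, tw = len(template), len(template[0])
    best_score, best_pos = float("-inf"), (0, 0)
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            patch = [row[x:x + tw] for row in image[y:y + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (x, y)
    return best_score, best_pos


# Embed the template in an otherwise empty 6x5 image at x=2, y=1.
template = [[1, 2], [3, 4]]
image = [[0] * 6 for _ in range(5)]
image[1][2:4] = [1, 2]
image[2][2:4] = [3, 4]
print(best_match(image, template))
```

The best match lands at x=2, y=1 with a score of 1.0. In the full pipeline, every position above score_threshold would become a proposal rather than only the single best one.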

Phase 3: Optimal Path Search

  • Model as graph optimization problem
  • Find highest-scoring spatiotemporal path through video
  • Enforce distance and scale constraints between consecutive frames
  • Output optimal cursor trajectory
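
The path search can be sketched as a Viterbi-style dynamic program over per-frame proposals. This is a simplified illustration under stated assumptions, not the implementation in cursor_tracker_dp.py: proposals are hypothetical (x, y, scale, score) tuples, every frame is assumed to have at least one proposal with a positive scale, and the path score is simply the sum of proposal scores.

```python
import math


def allowed(prev, cur, dist_threshold=150, scale_threshold=1.3):
    """Check the distance and scale constraints between two proposals."""
    dist = math.hypot(cur[0] - prev[0], cur[1] - prev[1])
    ratio = max(cur[2], prev[2]) / min(cur[2], prev[2])
    return dist <= dist_threshold and ratio <= scale_threshold


def best_path(frames, dist_threshold=150, scale_threshold=1.3):
    """Return (total_score, chosen proposal index per frame), or None if
    no path satisfies the constraints in every transition."""
    # best[j]: best cumulative score of a path ending at proposal j.
    best = [p[3] for p in frames[0]]
    back = []  # back[t - 1][j]: predecessor of proposal j in frame t
    for t in range(1, len(frames)):
        new_best, pointers = [], []
        for cur in frames[t]:
            candidates = [
                (best[i], i)
                for i, prev in enumerate(frames[t - 1])
                if best[i] > -math.inf
                and allowed(prev, cur, dist_threshold, scale_threshold)
            ]
            if candidates:
                score, i = max(candidates)
                new_best.append(score + cur[3])
                pointers.append(i)
            else:
                new_best.append(-math.inf)  # unreachable proposal
                pointers.append(-1)
        best = new_best
        back.append(pointers)
    end = max(range(len(best)), key=best.__getitem__)
    if best[end] == -math.inf:
        return None
    # Follow the back-pointers from the last frame to the first.
    path = [end]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[end], path


frames = [
    [(100, 100, 1.0, 0.9)],
    [(120, 110, 1.0, 0.6), (600, 400, 1.0, 0.95)],
    [(140, 115, 1.0, 0.8)],
]
print(best_path(frames))
```

In this example the second frame's highest-scoring proposal (0.95) is rejected because it lies more than dist_threshold pixels from its neighbors, so the chosen path is proposal 0 in every frame. This is the advantage of optimizing over the whole video instead of picking the best match per frame.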

Key Insight: Cursors in screencasts exhibit unique motion signatures (movement while background stays static), enabling unsupervised discovery without labeled training data.

License

MIT

