MotoReID (johnamit/motoreid)

A deep learning pipeline for MotoGP team detection, tracking, and re-identification from race broadcast footage. This system combines YOLOv8 for robust object detection with DINOv3 (Vision Transformer) embeddings for semantic team classification. It targets a core problem in high-speed sports computer vision: keeping identities persistent through extreme occlusions, rapid camera cuts, and motion blur.

Development Status

This project is currently in active development. Upcoming changes to this repository include:

Finalising the Re-ID memory bank strategy for long-term occlusion handling.
Improving "Hard Negative" performance on rear-view angles.
Implementing Kalman Filter integration for trajectory smoothing.
Adding a demonstration video.

Overview

This project implements a multi-stage perception pipeline designed for high-velocity agents:

Detection — YOLOv8 (fine-tuned) localizes motorcycles in each frame.
Feature Extraction — DINOv3 ViT-S/16 extracts dense semantic features from detected regions, leveraging its self-supervised understanding of object geometry.
Classification — A lightweight Logistic Regression head predicts team identity from the high-dimensional DINO embeddings.
Re-ID & Tracking — ByteTrack handles short-term association, while a Cosine Similarity Memory Bank enables long-term re-identification after occlusions.
State Estimation — EMA embedding smoothing and velocity-based position prediction reduce ID switching during glare or blur.
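
The state estimation stage can be illustrated with a minimal sketch. It assumes a per-track record holding an L2-normalised embedding, the last box centre, and a per-frame velocity; the `TrackState` class, the `alpha` weight, and the constant-velocity model are illustrative assumptions, not the code in src/main.py.

```python
import math
from dataclasses import dataclass


@dataclass
class TrackState:
    """Per-track appearance and motion state (hypothetical structure)."""
    embedding: list[float]
    position: tuple[float, float]
    velocity: tuple[float, float] = (0.0, 0.0)


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def update_track(state, new_embedding, new_position, alpha=0.9):
    """Fold a new observation into the track.

    alpha is an assumed EMA weight: higher values trust the history more,
    which damps single-frame noise from glare or motion blur.
    """
    blended = [alpha * old + (1.0 - alpha) * new
               for old, new in zip(state.embedding, new_embedding)]
    state.embedding = l2_normalize(blended)
    # Velocity in pixels per frame, taken from the last two positions.
    state.velocity = (new_position[0] - state.position[0],
                      new_position[1] - state.position[1])
    state.position = new_position
    return state


def predict_position(state, frames_ahead=1):
    """Constant-velocity extrapolation of the box centre."""
    return (state.position[0] + state.velocity[0] * frames_ahead,
            state.position[1] + state.velocity[1] * frames_ahead)
```

During an occlusion, `predict_position` would supply the expected location at which to look for the bike when it reappears.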

Key Features:

Label locking after high-confidence agreement
Re-ID matching via combined visual + spatial similarity
Position trajectory tracking with velocity prediction
Global constraint enforcement (max 2 bikes per team)
Visual debug overlay (trajectories, Re-ID events)
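
Label locking and the per-team cap could be sketched as follows. The 0.85 agreement ratio mirrors the `--lock_threshold` default listed under Usage; the `LabelVoter` class, the `min_votes` warm-up, and the `team_has_capacity` helper are hypothetical names introduced for illustration only.

```python
from collections import Counter


class LabelVoter:
    """Accumulates per-frame team predictions for one track and locks the
    label once a single team dominates the votes (illustrative sketch)."""

    def __init__(self, lock_threshold=0.85, min_votes=10):
        self.lock_threshold = lock_threshold
        self.min_votes = min_votes  # assumed warm-up before locking is allowed
        self.votes = Counter()
        self.locked_label = None

    def update(self, label):
        """Register one prediction and return the current best label."""
        if self.locked_label is not None:
            return self.locked_label  # locked labels ignore later votes
        self.votes[label] += 1
        total = sum(self.votes.values())
        top_label, top_count = self.votes.most_common(1)[0]
        if total >= self.min_votes and top_count / total >= self.lock_threshold:
            self.locked_label = top_label
        return top_label


def team_has_capacity(active_labels, team, max_per_team=2):
    """Global constraint: at most two bikes per team may be labelled at once."""
    return sum(1 for label in active_labels if label == team) < max_per_team
```

A caller would check `team_has_capacity` before assigning a label to a new track and fall back to the next most likely team when the cap is reached.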

Dataset Construction & Training

Rather than relying on a general-purpose detection dataset, this project uses a specialized, manually curated dataset built to handle the livery variations of the 2025 MotoGP grid.

1. Data Ingestion (YouTube)

Raw footage was sourced from high-definition broadcast highlights (1080p/60fps) on YouTube. We developed custom ffmpeg scripts (src/extract_frames.py) to extract frames at specific intervals, ensuring a diverse range of lighting conditions (sunny, overcast) and camera angles (onboard, trackside, helicopter).

2. Annotation (Roboflow)

To train the YOLOv8 detector, we created a custom bounding box dataset:

Aspect	Details
Platform	Roboflow for annotation management and augmentation
Labeling	Manually annotated 501 instances (404 motorcycles, 97 null), specifically filtering out "soft" targets (e.g., pit lane scooters, pedestrians) to focus on the riders and their racing prototypes
Augmentation	Applied random rotations, exposure adjustments, and noise injection to simulate broadcast compression artifacts
Export	YOLOv8-compatible format with train/valid/test splits

3. Identity Dataset

For the DINOv3 classifier, we built a reference library of ~700 high-quality crops:

Stage	Tool	Description
Harvesting	`src/harvest_bikes.py`	Auto-crop bikes from video stream using trained YOLO model
Cleaning	Manual curation	Remove motion blur, ensure balanced class distribution across all 11 teams
Hard Negatives	Targeted mining	Specifically captured "difficult" angles (direct rear view, extreme lean angles) to force the model to learn geometric features beyond just side-fairing logos
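
A quick way to check the class balance mentioned in the cleaning stage is to count crops per team folder. The sketch below assumes the data/teams layout shown under Project Structure and a set of common image suffixes; `count_crops_per_team` and `imbalance_ratio` are hypothetical helpers, not part of the repository.

```python
from collections import Counter
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed file types


def count_crops_per_team(teams_dir):
    """Return a Counter mapping each team folder name to its image count."""
    counts = Counter()
    for team_dir in sorted(Path(teams_dir).iterdir()):
        if team_dir.is_dir():
            counts[team_dir.name] = sum(
                1 for f in team_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def imbalance_ratio(counts):
    """Largest class size divided by the smallest; 1.0 is perfectly balanced."""
    sizes = list(counts.values())
    if not sizes or min(sizes) == 0:
        return float("inf")
    return max(sizes) / min(sizes)
```

For example, `imbalance_ratio(count_crops_per_team("data/teams"))` gives a single number to watch while curating; a large value suggests some teams need more crops.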

4. Training Pipeline

YouTube Highlights (1080p/60fps)
        │
        ▼
┌───────────────────┐
│  Frame Extraction │ ──► extract_frames.py
│  (ffmpeg @ 5fps)  │
└───────────────────┘
        │
        ├──────────────────────────────┐
        ▼                              ▼
┌───────────────────┐        ┌───────────────────┐
│  Roboflow         │        │  harvest_bikes.py │
│  (BBox Annotation)│        │  (Auto-crop)      │
└───────────────────┘        └───────────────────┘
        │                              │
        ▼                              ▼
┌───────────────────┐        ┌───────────────────┐
│  YOLOv8           │        │  Manual Sorting   │
│  Fine-tuning      │        │  (11 Team Folders)│
└───────────────────┘        └───────────────────┘
        │                              │
        ▼                              ▼
┌───────────────────┐        ┌───────────────────┐
│  Motorcycle       │        │  DINOv3 + LogReg  │
│  Detector         │        │  Team Classifier  │
└───────────────────┘        └───────────────────┘

Project Structure

MotoGP-Team-Detection/
├── assets/                     # Banner images and visual assets
├── data/
│   ├── input/
│   │   ├── race_highlights/    # Source video files
│   │   ├── race_frames/        # Extracted frames for training
│   │   │   ├── 2025_spanish_gp/
│   │   │   ├── 2025_italian_gp/
│   │   │   ├── 2025_qatar_gp/
│   │   │   └── 2025_german_gp_sprint/
│   │   └── null_samples/       # Background/non-bike samples
│   ├── output/                 # Annotated video results
│   │   ├── annotated_races_yolo/
│   │   └── annotated_races_yolodino/
│   └── teams/                  # The Identity Dataset (Sorted by Team)
│       ├── aprilia_factory/
│       ├── aprilia_trackhouse/
│       ├── ducati_lenovo/
│       ├── ducati_gresini/
│       ├── ducati_vr46/
│       ├── honda_hrc/
│       ├── honda_lcr/
│       ├── ktm_factory/
│       ├── ktm_tech3/
│       ├── yamaha_monster/
│       └── yamaha_pramac/
├── models/
│   ├── DINO/                   # DINOv3 backbone weights
│   └── YOLO/                   # Fine-tuned YOLOv8 weights
├── notebooks/
│   └── teams_dist.ipynb        # Team distribution analysis
├── runs/
│   ├── classifier/             # Trained Scikit-Learn team classifiers
│   └── detect/                 # YOLO training logs
├── src/
│   ├── main.py                 # CORE PIPELINE (Detection + Re-ID Logic)
│   ├── detector.py             # YOLO-only inference check
│   ├── train_detector.py       # YOLO fine-tuning script
│   ├── train_identity_model.py # DINOv3 feature extraction & classifier training
│   ├── extract_frames.py       # Frame extraction from videos
│   ├── harvest_bikes.py        # Dataset creation tool (auto-crop)
│   └── collect_nulls.py        # Collect background/null samples
├── mgp_env.yaml                # Conda environment definition
└── requirements.txt            # Python dependencies

Installation

Prerequisites

Python 3.10+
PyTorch 2.4+ (Required for DINOv3 compatibility)
CUDA 12.x (Tested on CUDA 12.8)
Operating System: Linux (tested on Ubuntu 22.04)

Tested on: NVIDIA RTX 3090 • AMD Ryzen 7 • 32GB RAM

1. Clone the Repository

git clone https://github.com/johnamit/motoreid.git
cd motoreid

2. Create Environment

Using Conda (Recommended):

conda env create -f mgp_env.yaml
conda activate mgp_env

3. Setup Dependencies

This project uses DINOv3, which requires a local clone of the official repository:

# Clone DINOv3 into the project root
git clone https://github.com/facebookresearch/dinov3.git

# Download Pre-trained Weights (ViT-Small)
mkdir -p models/DINO
wget -O models/DINO/dinov3_vits16_pretrain_lvd1689m.pth \
    https://huggingface.co/facebook/dinov3-vits16-pretrain-lvd1689m/resolve/main/dinov3_vits16_pretrain_lvd1689m.pth

Usage

Full Pipeline (Detection + Re-ID)

Run the complete pipeline. This script initializes the models, processes the video with temporal smoothing, and outputs annotated results.

python src/main.py \
    --source data/input/race_highlights/spanish_gp_2025.mp4 \
    --output data/output/annotated_races_yolodino/spanish_gp.mp4 \
    --yolo_weights models/YOLO/best.pt \
    --classifier_path runs/classifier/dinov3_identity_model.pkl \
    --dino_weights models/DINO/dinov3_vits16_pretrain_lvd1689m.pth \
    --conf_thresh 0.25 \
    --draw_trajectory \
    --draw_reid_events

Key Arguments:

Argument	Type	Default	Description
`--source`	str	Required	Path to input video
`--output`	str	`data/output/...`	Path for annotated output
`--conf_thresh`	float	`0.25`	YOLO confidence threshold
`--lock_threshold`	float	`0.85`	Agreement ratio to lock team label
`--reid_visual_thresh`	float	`0.80`	Cosine similarity for Re-ID matching
`--draw_trajectory`	flag	-	Draw position history trails
`--draw_reid_events`	flag	-	Flash Re-ID match events
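
For reference, the arguments in the table map naturally onto an `argparse` parser. The sketch below is an assumption about how such a parser might look, not a copy of src/main.py; the default for `--output` is elided in the table, so it is left unset here.

```python
import argparse


def build_parser():
    """Parser covering only the arguments listed in the table (sketch)."""
    parser = argparse.ArgumentParser(
        description="MotoGP team detection and Re-ID (illustrative parser)")
    parser.add_argument("--source", type=str, required=True,
                        help="Path to input video")
    # The real default is elided in the table, so none is assumed here.
    parser.add_argument("--output", type=str, default=None,
                        help="Path for annotated output")
    parser.add_argument("--conf_thresh", type=float, default=0.25,
                        help="YOLO confidence threshold")
    parser.add_argument("--lock_threshold", type=float, default=0.85,
                        help="Agreement ratio to lock team label")
    parser.add_argument("--reid_visual_thresh", type=float, default=0.80,
                        help="Cosine similarity for Re-ID matching")
    parser.add_argument("--draw_trajectory", action="store_true",
                        help="Draw position history trails")
    parser.add_argument("--draw_reid_events", action="store_true",
                        help="Flash Re-ID match events")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```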

YOLO-Only Detection

For simple detection without team classification:

python src/detector.py \
    --video data/input/race_highlights/race.mp4 \
    --output data/output/annotated_races_yolo/race_annotated.mp4

Train Team Classifier

If you add new images to data/teams/, retrain the identity head:

python src/train_identity_model.py \
    --data_dir data/teams \
    --model_dir runs/classifier \
    --model_weights models/DINO/dinov3_vits16_pretrain_lvd1689m.pth \
    --batch_size 64

Extract Frames from Video

python src/extract_frames.py \
    --video data/input/race_highlights/race.mp4 \
    --output data/input/race_frames/race_name \
    --fps 5
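
Under the hood, fixed-rate frame extraction can be done with a single ffmpeg call using the `fps` video filter. The following is a minimal sketch of that idea, assuming ffmpeg is on the PATH; the function names, the `frame_%06d.jpg` naming pattern, and the JPEG quality setting are assumptions rather than the behaviour of src/extract_frames.py.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video, output_dir, fps=5):
    """Build the ffmpeg argument list that samples `fps` frames per second."""
    pattern = str(Path(output_dir) / "frame_%06d.jpg")  # assumed naming scheme
    return [
        "ffmpeg",
        "-i", str(video),      # input video
        "-vf", f"fps={fps}",   # resample to the requested frame rate
        "-q:v", "2",           # high JPEG quality (assumed setting)
        pattern,
    ]


def extract_frames(video, output_dir, fps=5):
    """Create the output folder and run ffmpeg, raising if it fails."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(video, output_dir, fps), check=True)
```

Building the command as a list rather than a shell string avoids quoting problems with paths that contain spaces.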

Harvest Bike Crops

Auto-crop detected bike regions for building the identity dataset:

python src/harvest_bikes.py \
    --frames_dir data/input/race_frames/race_name \
    --output_dir data/teams/unlabeled \
    --yolo_weights models/YOLO/yolov8m.pt

Pipeline

Architecture Diagram

┌─────────────────────────────────────────────────────────────────────┐
│                         INPUT VIDEO FRAME                           │
└─────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    YOLOv8 DETECTION                                 │
│                    (Motorcycle Localization)                        │
└─────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    DINOv3 ViT-S/16                                  │
│                    (Feature Extraction)                             │
└─────────────────────────────────────────────────────────────────────┘
                                  │
                    ┌─────────────┴─────────────┐
                    ▼                           ▼
┌───────────────────────────┐     ┌───────────────────────────────────┐
│   TEAM CLASSIFIER         │     │   BYTETRACK + MEMORY BANK         │
│   (Logistic Regression)   │     │   (Short + Long Term Tracking)    │
└───────────────────────────┘     └───────────────────────────────────┘
                    │                           │
                    └─────────────┬─────────────┘
                                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    LABEL VOTING + LOCKING                           │
│                    (Confidence-based agreement)                     │
└─────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│                    ANNOTATED OUTPUT                                 │
│                    (Team labels, trajectories, Re-ID events)        │
└─────────────────────────────────────────────────────────────────────┘

Re-ID Strategy

The system maintains identity through:

EMA Embedding Smoothing — Running average of appearance features reduces noise from motion blur
Velocity Prediction — Extrapolate position when occluded based on historical trajectory
Memory Bank — Store embeddings of lost tracks for later matching (up to 300 frames)
Combined Similarity — Visual (cosine) + Spatial (L2 distance) matching with configurable weights

Teams

The system identifies all 11 teams from the 2025 MotoGP grid:

Manufacturer	Teams	Color Code
Ducati	Lenovo, Gresini, VR46	Red / Orange-Red / Yellow
Aprilia	Factory, Trackhouse	Green / Teal
KTM	Factory, Tech3	Orange / Dark Orange
Yamaha	Monster, Pramac	Blue / Light Blue
Honda	HRC, LCR	Black / Silver

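# Values are in BGR order (the channel order OpenCV drawing calls expect),
# so (0, 0, 255) is red and (255, 0, 0) is blue.
TEAM_COLORS = {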
    'aprilia_factory':    (0, 255, 0),      # Green
    'aprilia_trackhouse': (0, 200, 100),    # Teal
    'ducati_lenovo':      (0, 0, 255),      # Red
    'ducati_gresini':     (0, 100, 255),    # Orange-Red
    'ducati_vr46':        (0, 255, 255),    # Yellow
    'honda_hrc':          (0, 0, 0),        # Black
    'honda_lcr':          (200, 200, 200),  # Light Gray
    'ktm_factory':        (0, 165, 255),    # Orange
    'ktm_tech3':          (0, 140, 255),    # Dark Orange
    'yamaha_monster':     (255, 0, 0),      # Blue
    'yamaha_pramac':      (255, 100, 100),  # Light Blue
}
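
The colour comments only match the tuples if they are read as BGR, so reusing them in an RGB context (for example a plotting library or an HTML report) needs a channel swap. A small sketch, with hypothetical helper names and a three-entry excerpt of the table:

```python
# Excerpt of the mapping above, read as BGR (assumption based on the comments).
TEAM_COLORS_BGR = {
    "ducati_lenovo": (0, 0, 255),    # Red
    "ktm_factory": (0, 165, 255),    # Orange
    "yamaha_monster": (255, 0, 0),   # Blue
}


def bgr_to_rgb(color):
    """Swap the first and last channels of a BGR tuple."""
    b, g, r = color
    return (r, g, b)


def bgr_to_hex(color):
    """Format a BGR tuple as an HTML-style #rrggbb string."""
    b, g, r = color
    return f"#{r:02x}{g:02x}{b:02x}"


if __name__ == "__main__":
    for team, bgr in TEAM_COLORS_BGR.items():
        print(team, bgr_to_rgb(bgr), bgr_to_hex(bgr))
```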

Citation

If you use this code or methodology in your research, please cite:

@misc{motoreid,
  author = {Amit John},
  title = {MotoReID},
  year = {2025},
  url = {https://github.com/johnamit/motoreid}
}

DINOv3:

@article{simeoni2025dinov3,
  title={{DINOv3}},
  author={Sim{\'e}oni, Oriane and Vo, Huy V and Seitzer, Maximilian and Baldassarre, Federico and Oquab, Maxime and Jose, Cijo and Khalidov, Vasil and Szafraniec, Marc and Yi, Seungeun and Ramamonjisoa, Micha{\"e}l and others},
  journal={arXiv preprint arXiv:2508.10104},
  year={2025}
}

YOLOv8:

@software{yolov8_ultralytics,
  author = {Glenn Jocher and Ayush Chaurasia and Jing Qiu},
  title = {Ultralytics YOLOv8},
  version = {8.0.0},
  year = {2023},
  url = {https://github.com/ultralytics/ultralytics},
  orcid = {0000-0001-5950-6979, 0000-0002-7603-6750, 0000-0003-3783-7069},
  license = {AGPL-3.0}
}

License

This project is released under the MIT License.

Note: MotoGP broadcast footage is copyrighted material. This project is intended for educational and research purposes only. All models were trained on fair-use excerpts for non-commercial analysis.
