A high-performance Computer Vision framework that converts real-time hand gestures into precise system-level commands.
Kinetic-OS is a modular hand-tracking API built on MediaPipe that powers a smooth, low-latency virtual mouse and diagnostic telemetry system.
Designed with scalability, performance, and ML integration in mind.
- 🖱️ Smooth virtual mouse with zero-jitter interpolation
- ✌️ Gesture-based clicking & scrolling
- 🔴 Safety kill-switch gesture
- 📊 Built-in landmark data logger (CSV export)
- ⚙️ JSON-based persistent configuration system
- 🧠 ML-ready architecture
Kinetic-OS follows a decoupled layered architecture to ensure maintainability and modularity.
```
Camera Input
     ↓
HandTrackingModule (Core Engine)
     ↓
Gesture Logic (AIMouse)
     ↓
OS Cursor / Scroll Events
```
- MediaPipe landmark extraction
- Skeletal rendering
- Binary finger-state detection
- Clean, reusable API
- Linear Interpolation (Lerp) smoothing
- Asymmetrical spatial mapping
- Gesture-to-command translation
- Cursor stabilization
- CustomTkinter dashboard
- Persistent `settings.json` configuration
- Adjustable margins & sensitivity
- Real-time landmark telemetry
- Debug visualization
- CSV dataset generation (`hand_data_log.csv`)
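The zero-jitter cursor described above relies on linear interpolation (Lerp) smoothing. A minimal sketch of that idea, not the project's exact implementation (the `alpha` value here is an assumed example):

```python
def lerp(prev, target, alpha):
    # Move a fraction `alpha` of the way from the previous cursor
    # position toward the newly detected fingertip each frame.
    return prev + alpha * (target - prev)

# Simulated frames: noisy fingertip readings hovering around x = 100
x = 0.0
for raw_x in (100, 97, 103, 100, 100):
    x = lerp(x, raw_x, 0.3)  # lower alpha = smoother but laggier cursor
```

A small `alpha` damps jitter at the cost of latency; tuning it trades responsiveness against stability.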
| Gesture | Action |
|---|---|
| ☝️ Index Finger | Cursor movement |
| ✌️ Index + Middle Join | Left Click |
| 🖐️ Four Fingers | Vertical Scroll |
| 🤏 Pinky Extension | Kill-Switch |
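The gesture table can be expressed as a small dispatcher over the binary finger-state vector. This is only a sketch under assumptions: the actual "join" click also checks fingertip distance, which a pure up/down vector cannot capture, and the exact encodings are illustrative:

```python
# fingers = [thumb, index, middle, ring, pinky]; 1 = extended, 0 = folded
def classify_gesture(fingers):
    if fingers == [0, 1, 0, 0, 0]:
        return "move"        # index only: cursor movement
    if fingers[1] and fingers[2] and not fingers[3] and not fingers[4]:
        return "left_click"  # index + middle up (join distance checked separately)
    if fingers[1:] == [1, 1, 1, 1]:
        return "scroll"      # four fingers: vertical scroll
    if fingers == [0, 0, 0, 0, 1]:
        return "kill"        # pinky extension: kill-switch
    return "idle"
```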
The tracking zone is biased upward to compensate for the webcam's limited field of view (FOV), giving full bottom-screen access without losing landmark visibility.
Optimized for:
- Intel i7-6600U (dual-core)
- Windows 10/11
- Minimal per-frame calculations
- Efficient NumPy vector operations
To bridge the resolution gap between the camera (640x480) and the screen (1920x1080), finger coordinates are remapped with weighted interpolation:

```
x_screen = interp(x_finger, [margin, W - margin], [0, ScreenWidth])
```

This keeps the mapping stable and resolution-aware.
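A runnable sketch of this mapping with NumPy. The margin values are assumed examples; the upward bias described above appears as a smaller top margin than bottom margin:

```python
import numpy as np

CAM_W, CAM_H = 640, 480              # camera resolution
SCREEN_W, SCREEN_H = 1920, 1080      # display resolution
MARGIN_X = 80                        # assumed side margin (camera pixels)
MARGIN_TOP, MARGIN_BOTTOM = 40, 140  # asymmetric: tracking zone biased upward

def map_to_screen(x_finger, y_finger):
    # np.interp clamps values outside the input range,
    # so the cursor can never leave the screen bounds.
    x = np.interp(x_finger, [MARGIN_X, CAM_W - MARGIN_X], [0, SCREEN_W])
    y = np.interp(y_finger, [MARGIN_TOP, CAM_H - MARGIN_BOTTOM], [0, SCREEN_H])
    return x, y
```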
Landmarks are exported to `hand_data_log.csv`.
This enables:
- Classical ML integration (Scikit-Learn)
- Deep Learning models (TensorFlow)
- Gesture classifier training
- Future LSTM-based temporal recognition
Hardcoded gesture logic can be replaced with trained models.
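A sketch of that replacement path with scikit-learn. The column layout (21 landmarks x three coordinates = 63 features) and the class count are assumptions, and synthetic data stands in for `hand_data_log.csv`:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Stand-in for rows loaded from hand_data_log.csv:
# 63 landmark features per frame, plus a gesture label.
rng = np.random.default_rng(42)
X = rng.random((200, 63))
y = rng.integers(0, 4, size=200)  # e.g. move / click / scroll / kill

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)

# The hand-written rules would then be replaced by:
predicted = clf.predict(X[:1])
```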
Install dependencies:

```
pip install opencv-python mediapipe pyautogui customtkinter Pillow numpy
```

Launch the settings dashboard:

```
python MouseSettingsGUI.py
```

Run the virtual mouse:

```
python AIMouse.py
```

Project structure:

```
Kinetic-OS/
│
├── hand_landmarker.task   # MediaPipe model file
├── HandTrackingModule.py  # Core API
├── AIMouse.py             # Virtual Mouse Logic
├── MouseSettingsGUI.py    # Settings Dashboard
├── HTM_GUI.py             # Telemetry & Logger
├── settings.json          # Persistent Configuration
└── hand_data_log.csv      # Generated Dataset
```
Developer: github/i-hmzakhan
Institution: University of Engineering & Applied Sciences (UEAS), Swat
Environment Target: Windows 10/11 on mobile i7 hardware
- ML-based gesture classification
- Multi-hand support
- Plugin gesture architecture
- Cross-platform abstraction
- Latency benchmarking study
- User evaluation & accuracy metrics
Kinetic-OS is not just a gesture demo.
It is:
- A reusable hand-tracking engine
- A lightweight HCI middleware prototype
- A foundation for adaptive AI-based gesture systems
MIT License
If you find this project interesting, consider giving it a ⭐