A voice-activated object detection application designed to help visually impaired users understand their surroundings using artificial intelligence and speech feedback.
- Voice Activation: Say "start", "detect", "see", "look", or "scan" to trigger object detection
- Offline Operation: Uses Vosk for offline speech recognition and pyttsx3 for text-to-speech
- Real-time Detection: YOLOv8 for fast, accurate object detection
- Natural Descriptions: Converts detections into natural language descriptions
- Accessible UI: Clean KivyMD interface designed for accessibility
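The natural-language step can be sketched as a small helper that counts detected class labels and joins them into a sentence. This is an illustrative sketch, not the project's actual code; the function name and the list-of-labels input format are assumptions:

```python
from collections import Counter

def describe(labels):
    """Turn detected class labels into a spoken sentence.

    `labels` is assumed to be the YOLO class names for one frame,
    e.g. ["person", "laptop", "cup", "cup"].
    """
    if not labels:
        return "I don't see anything recognizable."
    counts = Counter(labels)  # preserves first-seen order (Python 3.7+)
    parts = []
    for name, n in counts.items():
        # Naive pluralization: "a person" for singletons, "2 cups" for repeats
        parts.append(f"a {name}" if n == 1 else f"{n} {name}s")
    if len(parts) == 1:
        listing = parts[0]
    elif len(parts) == 2:
        listing = " and ".join(parts)
    else:
        listing = ", ".join(parts[:-1]) + ", and " + parts[-1]
    return f"I see {listing}"
```

With the frame from the example interaction below, this produces "I see a person, a laptop, and 2 cups".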
```
abserny/
├─ app/                      # UI entry point
│  └─ main.py                # KivyMD application
├─ orchestrator/             # Main coordination logic
│  └─ supervisor.py          # Orchestrates all services
├─ domain/                   # Business logic
│  ├─ entities.py            # Data models
│  └─ usecases.py            # Use cases
├─ services/                 # External service wrappers
│  ├─ camera.py              # Camera interface
│  ├─ vision.py              # YOLO wrapper
│  ├─ audio.py               # Text-to-speech
│  └─ speech_recognition.py  # Vosk wrapper
├─ infra/                    # Infrastructure
│  └─ config.py              # Configuration
└─ requirements.txt
```
```bash
pip install -r requirements.txt
```

Download a Vosk model for your language:
- English: vosk-model-small-en-us-0.15 (official model list at https://alphacephei.com/vosk/models)
- Extract it to a `model` folder in the project root
```bash
wget https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip vosk-model-small-en-us-0.15.zip
mv vosk-model-small-en-us-0.15 model
```

The YOLOv8 nano model is downloaded automatically on first run.
```bash
python app/main.py
```

- The app starts listening for voice commands
- Say "start" or another trigger word
- The camera captures a frame
- YOLO detects objects in the frame
- The app speaks a natural description of what it sees
- Returns to listening mode
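The cycle above can be sketched as a supervisor function that wires the services together. The callable names (`listen`, `capture`, `detect`, `speak`) are illustrative stand-ins for the real service wrappers in `services/`, not the project's actual interfaces:

```python
def run_once(listen, capture, detect, speak,
             triggers=("start", "detect", "see", "look", "scan")):
    """One voice-triggered detection cycle.

    listen()      -> recognized text (e.g. from the Vosk wrapper)
    capture()     -> a single camera frame
    detect(frame) -> list of detected class labels (e.g. from YOLO)
    speak(text)   -> text-to-speech output (e.g. via pyttsx3)
    Returns True if a detection was performed, False if no trigger word
    was heard (the caller then simply loops back to listening).
    """
    text = listen()
    # Ignore anything that is not a trigger word
    if not any(word in text.lower().split() for word in triggers):
        return False
    frame = capture()
    labels = detect(frame)
    if labels:
        speak("Detecting... I see " + ", ".join(labels))
    else:
        speak("Detecting... I don't see anything recognizable")
    return True
```

Passing the services in as callables keeps the orchestration logic testable with stubs, without a real camera or microphone.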
- "start"
- "detect"
- "see"
- "look"
- "scan"
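Matching a recognized phrase against these trigger words can be as simple as a token-level set check (a sketch; the app's actual matching logic may differ):

```python
TRIGGERS = {"start", "detect", "see", "look", "scan"}

def is_trigger(phrase):
    """True if any trigger word appears as a whole word in the phrase.

    Splitting on whitespace avoids false positives such as
    "season" matching "see".
    """
    return not TRIGGERS.isdisjoint(phrase.lower().split())
```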
Edit infra/config.py to customize:
- Camera settings
- YOLO model and confidence threshold
- Trigger words
- Detection cooldown period
- TTS settings
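A minimal `infra/config.py` along these lines might look like the following; the constant names and default values are assumptions based on the settings listed above, not the project's actual identifiers:

```python
# infra/config.py (illustrative sketch; names and values are assumed)

# Camera settings
CAMERA_ID = 0                 # which webcam to open (try 1, 2, ... if 0 fails)
FRAME_WIDTH = 640
FRAME_HEIGHT = 480

# YOLO model and confidence threshold
YOLO_MODEL = "yolov8n.pt"     # nano model, auto-downloaded on first run
CONFIDENCE_THRESHOLD = 0.5    # drop detections below this score

# Voice triggers and pacing
TRIGGER_WORDS = ["start", "detect", "see", "look", "scan"]
DETECTION_COOLDOWN = 3.0      # seconds between consecutive detections

# Text-to-speech settings (pyttsx3)
TTS_RATE = 150                # words per minute
TTS_VOLUME = 1.0
```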
```
User says: "start"
App responds: "Detecting... I see a person, a laptop, and 2 cups"
```
- Python 3.8+
- Webcam
- Microphone
- Speakers/headphones
- Check that pyttsx3 is properly installed
- Try: `python -c "import pyttsx3; pyttsx3.init()"`
- Verify that the Vosk model is in the `model` folder
- Check microphone permissions
- Test the microphone: `python -m sounddevice`
- Check camera is connected and not in use
- Try changing `CAMERA_ID` in the config (0, 1, 2, etc.)
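Finding a working camera index can also be automated. This sketch takes a `probe` callable (e.g. a thin wrapper around OpenCV's `cv2.VideoCapture(i).isOpened()`) so the search logic stays testable; the helper name is an assumption, not part of the project:

```python
def first_working_camera(probe, max_ids=4):
    """Return the first camera index for which probe(i) succeeds, else None.

    `probe` would typically wrap OpenCV, e.g.:
        def probe(i):
            cap = cv2.VideoCapture(i)
            ok = cap.isOpened()
            cap.release()
            return ok
    """
    for i in range(max_ids):
        if probe(i):
            return i
    return None
```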
- Ensure an internet connection for the first download
- Alternatively, download the YOLOv8 weights manually from the Ultralytics GitHub releases
- Distance estimation
- Directional information (left/right/center)
- Hazard detection (stairs, obstacles)
- Object tracking across frames
- Custom object classes for indoor/outdoor environments
- Multi-language support
- Gesture control
This project is intended for educational purposes as a graduation project.
- YOLOv8: Ultralytics
- Vosk: Alpha Cephei
- KivyMD: KivyMD Team