Sign2Text is a web-based application that converts American Sign Language (ASL) gestures into text in real time. The application uses computer vision and machine learning to detect hand gestures and translate them into corresponding letters and words, making communication more accessible for sign language users.
Demonstration video: `Sign2Text.Demonstration.Video.mp4`
More than 70 million deaf people around the world use sign languages to communicate. Sign language allows them to learn, work, access services, and be included in their communities. However, it is unrealistic to expect everyone to learn sign language, so deaf people are often unable to enjoy their rights on an equal basis with others.
This project aims to develop a user-friendly human-computer interface (HCI) where the computer understands American Sign Language, helping deaf and mute people communicate more easily with the wider community.
- Real-time Sign Language Detection: Converts ASL hand gestures to text in real time
- Word Suggestions: Provides word suggestions based on detected characters
- Text-to-Speech: Speaks the detected sentence aloud
- ASL Reference Chart: Includes a reference chart for ASL alphabet
- User-friendly Interface: Clean, responsive web interface
- Special Gestures: Support for "space" and "next" gestures to build sentences
- Background-Independent: Works in various lighting conditions and backgrounds
The application uses a four-stage pipeline to detect hand gestures:
- Hand Landmark Detection: MediaPipe detects 21 key points on the hand
- White Background Rendering: The landmarks are drawn on a plain white background, which eliminates issues with varying lighting conditions and backgrounds
- CNN Classification: The rendered skeleton image is classified by a Convolutional Neural Network
- Post-processing: Additional logic determines the exact letter
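The white-canvas step is what makes the pipeline background-independent: the classifier only ever sees a skeleton drawn on a blank image. Below is a minimal sketch of this loop using MediaPipe and Keras; the model filename, canvas size, and input shape are assumptions rather than the project's actual values.

```python
import cv2
import numpy as np
import mediapipe as mp
from tensorflow.keras.models import load_model

mp_hands = mp.solutions.hands
mp_draw = mp.solutions.drawing_utils

hands = mp_hands.Hands(static_image_mode=False, max_num_hands=1)
model = load_model("cnn_model.h5")  # hypothetical filename and input shape

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        # Draw the 21 landmarks on a plain white canvas, so the classifier
        # never sees the real background or the lighting conditions.
        white = np.full((400, 400, 3), 255, dtype=np.uint8)
        mp_draw.draw_landmarks(white, results.multi_hand_landmarks[0],
                               mp_hands.HAND_CONNECTIONS)
        # Classify the skeleton image; post-processing (not shown here)
        # then narrows the predicted group down to an exact letter.
        x = white.astype("float32").reshape(1, 400, 400, 3) / 255.0
        group = int(np.argmax(model.predict(x, verbose=0)))
    cv2.imshow("Sign2Text", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```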
The application recognizes three special gestures:
- Space Gesture: Used to add a space between words
  - Make this gesture when you want to start a new word
- Next Gesture: Used after making a letter gesture to add it to the sentence
  - This is essential for building words letter by letter
- Backspace Gesture: Used to delete the last character
  - Useful for correcting mistakes
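Conceptually, these gestures drive a small state machine that distinguishes a letter being *detected* from a letter being *committed*. The class below is a hypothetical sketch of that logic; the gesture labels and method names are illustrative, not taken from the project's code.

```python
class SentenceBuilder:
    """Hypothetical sketch of how the special gestures build a sentence."""

    def __init__(self):
        self.sentence = ""
        self.current = ""  # last classified letter, not yet committed

    def on_letter(self, letter: str):
        self.current = letter  # detection alone does not modify the sentence

    def on_gesture(self, gesture: str):
        if gesture == "next" and self.current:
            self.sentence += self.current  # commit the pending letter
            self.current = ""
        elif gesture == "space":
            self.sentence += " "  # start a new word
        elif gesture == "backspace":
            self.sentence = self.sentence[:-1]  # delete the last character
```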
The CNN model was trained on 180 skeleton images of the ASL alphabet. To improve accuracy, the 26 letters were divided into 8 classes of visually similar gestures:
- [A, E, M, N, S, T]
- [B, D, F, I, U, V, K, R, W]
- [C, O]
- [G, H]
- [L]
- [P, Q, Z]
- [X]
- [Y, J]
This approach achieved an 88-95% accuracy rate, even with varying backgrounds and lighting conditions.
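The grouping amounts to a lookup table, with the post-processing step described above selecting one letter inside the predicted group. The function below is only an illustration of that two-stage idea; the real disambiguation rules compare hand-landmark geometry, and `resolve_letter` is a hypothetical name.

```python
# The 8 gesture groups used by the classifier, as listed above.
GROUPS = [
    ["A", "E", "M", "N", "S", "T"],
    ["B", "D", "F", "I", "U", "V", "K", "R", "W"],
    ["C", "O"],
    ["G", "H"],
    ["L"],
    ["P", "Q", "Z"],
    ["X"],
    ["Y", "J"],
]

def resolve_letter(group_id: int, landmarks) -> str:
    """Disambiguate a letter within the CNN's predicted group.

    Hypothetical sketch: the real post-processing compares landmark
    distances and angles (e.g., which fingertips are extended).
    """
    candidates = GROUPS[group_id]
    if len(candidates) == 1:
        return candidates[0]
    # Placeholder decision; actual rules inspect the 21 landmarks.
    return candidates[0]
```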
- Position your hand in the camera view
- Make ASL gestures for letters
- Use the "next" gesture to add the detected character to the sentence
- Alternatively, use the "Add to Sentence" button
- Click on word suggestions to complete words
- Use the "Speak" button to hear the sentence
- Use the "Clear" button to start over
- Flask: Web framework for the application
- OpenCV: For webcam access and image processing
- TensorFlow/Keras: For loading and running the CNN model
- MediaPipe/cvzone: For hand landmark detection
- pyttsx3: For text-to-speech functionality
- Enchant: For word suggestions
- HTML/CSS/JavaScript: For the web interface
- Fetch API: For communication with the backend
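To make the pyttsx3 and Enchant roles concrete, here is a minimal sketch of two Flask endpoints that the frontend could call through the Fetch API. The route names and payload shapes are assumptions, not the project's actual API.

```python
import enchant
import pyttsx3
from flask import Flask, jsonify, request

app = Flask(__name__)
dictionary = enchant.Dict("en_US")  # English dictionary for suggestions

@app.route("/suggestions")
def suggestions():
    # e.g. GET /suggestions?word=HELO -> ["HELLO", ...]
    word = request.args.get("word", "")
    return jsonify(dictionary.suggest(word) if word else [])

@app.route("/speak", methods=["POST"])
def speak():
    # Speak the detected sentence aloud on the server machine.
    sentence = (request.get_json(silent=True) or {}).get("sentence", "")
    engine = pyttsx3.init()
    engine.say(sentence)
    engine.runAndWait()
    return jsonify(ok=True)
```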
- Background Independence: Works in any environment, not just clean backgrounds
- Lighting Robustness: Functions in various lighting conditions
- High Accuracy: 88-95% accuracy in recognizing ASL alphabets
- Real-time Performance: Fast enough for practical use
- User-friendly Interface: Easy to use with minimal training
- Develop a mobile application version
- Expand to include more sign language gestures beyond the alphabet
- Implement continuous sign language recognition for common phrases
- Add support for other sign languages beyond ASL
This project is based on the work of Devansh Raval (19IT470) and Kelvin Parmar (19IT473) under the guidance of Dr. Nilesh B. Prajapati at Birla Vishvakarma Mahavidyalaya Engineering College, Information Technology Department.
The original research implemented:
- A CNN-based approach for sign language recognition
- MediaPipe for hand landmark detection
- An innovative technique of drawing landmarks on a white background to overcome lighting and background issues
Their approach achieved 97% accuracy in recognizing ASL alphabets, even in varying conditions.
Created by Tejashya Mehta