VoicePrint: Real-Time Speaker Verification

A simplified biometric authentication system built with Node.js, TypeScript, and Digital Signal Processing. This project transforms raw human speech into a unique 13-dimensional mathematical "fingerprint" to distinguish between a target user and an imposter.

Project Overview

VoicePrint operates on the principle of Vocal Tract Resonance. By analyzing the frequency characteristics of speech (MFCCs), the system captures the unique physical qualities of a user's voice, rather than just the words spoken.

Technical Architecture & Phases

Phase 1: Signal Cleanup & VAD (The Filter)

The first challenge was ensuring the "Brain" only receives high-quality speech data, not silence or air conditioning hum.

Hardware Interface: Interfacing with system microphones using @picovoice/pvrecorder-node.
Voice Activity Detection (VAD): Using Picovoice Cobra to gate the pipeline, passing only frames whose voice probability exceeds 0.75.
Audio Normalization: Converting 16-bit PCM integers to a normalized Float32 range [-1.0, 1.0].
Hamming Window: Applying a Hamming window to each 512-sample frame (32 ms at 16 kHz) to reduce "spectral leakage" during frequency analysis.
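
The normalization, windowing, and gating steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the helper names (`pcmToFloat32`, `hammingWindow`, `applyWindow`, `isSpeech`) are hypothetical, dividing by 32768 is one common convention for 16-bit PCM, and the voice probability is assumed to be supplied by the caller (for example, from the VAD engine).

```typescript
// Hypothetical helpers for illustration; names are not taken from the project.

const VAD_THRESHOLD = 0.75; // minimum voice probability to accept a frame

// Convert 16-bit PCM integers to floats in the range [-1.0, 1.0].
function pcmToFloat32(pcm: Int16Array): Float32Array {
  const out = new Float32Array(pcm.length);
  for (let i = 0; i < pcm.length; i++) {
    out[i] = pcm[i] / 32768;
  }
  return out;
}

// Hamming window: w[n] = 0.54 - 0.46 * cos(2 * pi * n / (N - 1)).
function hammingWindow(length: number): Float32Array {
  const w = new Float32Array(length);
  for (let n = 0; n < length; n++) {
    w[n] = 0.54 - 0.46 * Math.cos((2 * Math.PI * n) / (length - 1));
  }
  return w;
}

// Multiply a frame by the window, sample by sample.
function applyWindow(frame: Float32Array, win: Float32Array): Float32Array {
  if (frame.length !== win.length) {
    throw new Error("frame and window must have the same length");
  }
  return frame.map((sample, i) => sample * win[i]);
}

// Gate: only frames above the VAD threshold continue down the pipeline.
function isSpeech(voiceProbability: number): boolean {
  return voiceProbability > VAD_THRESHOLD;
}
```

In a sketch like this, the window would be computed once for the 512-sample frame length and reused for every frame, since its coefficients do not depend on the audio.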

Phase 2: Feature Extraction (The Brain)

Turning time-domain waves into frequency-domain features.

MFCC Analysis: Using the Meyda library to calculate Mel-Frequency Cepstral Coefficients.
Feature Selection: Extracting 13 coefficients (indices 0-12), with the focus on indices 1-12, since coefficient 0 mostly reflects overall volume (energy) rather than vocal timbre.
Centroid Averaging: Collecting a matrix of 150 valid speech frames and collapsing them into a single 13-dimensional vector (the "Voice Signature").
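
The centroid step can be illustrated with a small averaging function. This is a sketch only: `averageFrames` is a hypothetical name, and it assumes each frame has already been reduced to a plain array of MFCC values by the feature extractor.

```typescript
// Hypothetical helper: collapse a matrix of per-frame feature vectors
// into one vector by taking the mean of each coefficient across frames.
function averageFrames(frames: number[][]): number[] {
  if (frames.length === 0) {
    throw new Error("no frames to average");
  }
  const dim = frames[0].length;
  const sum = new Array<number>(dim).fill(0);
  for (const frame of frames) {
    if (frame.length !== dim) {
      throw new Error("all frames must have the same number of coefficients");
    }
    for (let i = 0; i < dim; i++) {
      sum[i] += frame[i];
    }
  }
  return sum.map((total) => total / frames.length);
}
```

With 150 frames of 32 ms each, the signature covers about 4.8 seconds of voiced audio, which is in line with the 5-8 seconds of speaking suggested for enrollment once gated silence is accounted for.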

Phase 3 & 4: The Judge (Scoring & Verification)

The final logic that determines "Access Granted" or "Denied."

Cosine Similarity: Implementing a vector math engine that measures the cosine of the angle between a live voiceprint and the stored identity.
Threshold Calibration: Establishing a statistical baseline (0.85) to separate "Target" speakers from "Imposters."
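
A minimal sketch of the scoring and decision logic is shown below. The function names (`cosineSimilarity`, `verify`) and the returned object shape are hypothetical, and treating a score equal to the threshold as a match is an assumption; only the 0.85 threshold comes from this document.

```typescript
const SIMILARITY_THRESHOLD = 0.85; // calibrated baseline from this README

// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // A zero-length vector has no direction; treat it as no match.
  return denom === 0 ? 0 : dot / denom;
}

// Compare a live voiceprint against a stored one and decide access.
// Assumption: a score equal to the threshold counts as a match.
function verify(
  live: number[],
  stored: number[],
  threshold: number = SIMILARITY_THRESHOLD
): { score: number; granted: boolean } {
  const score = cosineSimilarity(live, stored);
  return { score, granted: score >= threshold };
}
```

Because cosine similarity depends only on the direction of the vectors, scaling a voiceprint by a constant does not change the score.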

Installation & Setup

1. Prerequisites

Node.js (v18 or higher)
TypeScript (npm install -g typescript)
Picovoice AccessKey: Required for the Cobra VAD engine. Get one at console.picovoice.ai.

2. Environment Setup

Clone the repository and install dependencies:

npm install

Build the project:

npm run build

Usage Guide

The system uses a command-line argument to switch between Enrollment and Verification modes.

Step 1: Voice Enrollment

Record your "Standard" voice signature. Speak clearly for about 5-8 seconds until you see the log message "Voiceprint calculated and saved":

npm run enroll <optional enrollment file name here>

If no file name is provided, a voiceprint.print.json file is generated in the /prints directory. If a file name is provided, the saved print uses that name instead.
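
The file naming rule can be sketched as below. This is illustrative only: `resolvePrintPath` is a hypothetical helper, and using the argument verbatim as the file name (with no extension added) is an assumption about how the project handles it.

```typescript
const PRINTS_DIR = "prints";
const DEFAULT_PRINT_FILE = "voiceprint.print.json";

// Hypothetical helper: fall back to the default name when no
// (or an empty) file name argument is given.
function resolvePrintPath(fileName?: string): string {
  const trimmed = fileName ? fileName.trim() : "";
  const name = trimmed.length > 0 ? trimmed : DEFAULT_PRINT_FILE;
  return `${PRINTS_DIR}/${name}`;
}
```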

Step 2: Voice Verification

Test your live voice against the stored signature.

npm run verify <stored voiceprint label>

The system outputs your Similarity Score (e.g., 95.43%) and judges it against the calibrated threshold.

Metric	Result
Sample Rate	16 kHz
Frame Length	512 Samples (32ms)
Target Match Accuracy	87% - 98% Similarity
Imposter Rejection	Typically < 80% Similarity
