2022 · Machine Learning
Voice Gender Recognition
Machine Learning Audio Processing Python
BITS Pilani | Jul 2022
Python Librosa MFCC SVM Audio Processing
Project Overview
Built an audio classification system that recognizes speaker gender from voice samples, with applications in voice assistants, call center analytics, and speaker identification systems.
Key Contributions
- Feature Extraction: Extracted acoustic features (MFCCs, spectral centroid, zero-crossing rate) from 3,168 voice samples using the Librosa library
- Model Development: Trained SVM and Random Forest classifiers, reaching 96% accuracy on the gender classification task under cross-validation
- Feature Analysis: Analyzed feature importance and identified fundamental frequency (F0) and formant frequencies as the strongest gender discriminators
Technologies Used
- Languages: Python
- Libraries: Librosa, Scikit-Learn
- Features: MFCC, Spectral Centroid, ZCR
- Models: SVM (RBF kernel), Random Forest
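A minimal sketch of how the two model families listed above might be compared with cross-validation in Scikit-Learn. The data here is synthetic (two Gaussian clusters with 20 features) and only stands in for the real feature matrix; the hyperparameters, the 5-fold split, and the feature scaling step are assumptions, not settings taken from the project.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic placeholder data: 200 samples per class, 20 features each.
rng = np.random.default_rng(0)
n_per_class, n_features = 200, 20
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n_per_class, n_features)),
    rng.normal(1.0, 1.0, size=(n_per_class, n_features)),
])
y = np.array([0] * n_per_class + [1] * n_per_class)

models = {
    # RBF-kernel SVMs are scale-sensitive, so features are standardized first.
    "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale")),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Stratified folds keep the class balance the same in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    for name, model in models.items()
}

for name, fold_scores in scores.items():
    print(f"{name}: {fold_scores.mean():.3f} +/- {fold_scores.std():.3f}")
```

Placing the scaler inside the pipeline means it is refit on each training fold, so no statistics from the held-out fold leak into training.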
Key Results
| Metric | Value |
|---|---|
| Accuracy | 96% |
| Voice Samples | 3,168 |
| Features Extracted | 20+ |
| Best Model | SVM (RBF) |
Audio Features Used
| Feature | Description |
|---|---|
| MFCC | Mel-frequency cepstral coefficients (13 coefficients) |
| Spectral Centroid | Magnitude-weighted mean frequency (center of mass) of the spectrum |
| Zero-Crossing Rate | Rate of sign changes in the signal |
| Spectral Rolloff | Frequency below which 85% of the spectral energy lies |
| Chroma Features | Energy distribution across the 12 pitch classes |
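To make three of the definitions in the table concrete, here is a small NumPy-only sketch that computes them for a single frame. The function names are hypothetical, and the details (Hann window, power-based rolloff) are one reasonable reading of the definitions. Some libraries accumulate magnitude rather than power for rolloff, so values can differ slightly from theirs.

```python
import numpy as np


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    negative = np.signbit(frame)
    return np.mean(negative[1:] != negative[:-1])


def spectral_centroid_and_rolloff(frame, sr, roll_percent=0.85):
    """Centroid (Hz) and rolloff frequency (Hz) of one windowed frame."""
    window = np.hanning(len(frame))
    magnitude = np.abs(np.fft.rfft(frame * window))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)

    # Centroid: frequencies averaged with spectral magnitude as the weight.
    centroid = np.sum(freqs * magnitude) / np.sum(magnitude)

    # Rolloff: first bin where cumulative power reaches roll_percent of the total.
    cumulative_power = np.cumsum(magnitude ** 2)
    rolloff_bin = np.searchsorted(cumulative_power, roll_percent * cumulative_power[-1])
    return centroid, freqs[rolloff_bin]


# Example frame: a 1 kHz tone sampled at 16 kHz.
sr = 16000
n = 2048
frame = np.sin(2 * np.pi * 1000.0 * np.arange(n) / sr)

zcr = zero_crossing_rate(frame)
centroid, rolloff = spectral_centroid_and_rolloff(frame, sr)
print(zcr, centroid, rolloff)
```

For a pure tone, both the centroid and the rolloff should sit close to the tone's frequency, and the zero-crossing rate should be near twice the frequency divided by the sample rate.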
Applications
- Voice assistant personalization
- Call center analytics and routing
- Speaker identification systems
- Audio content tagging
