2022 · Machine Learning

Voice Gender Recognition

Machine Learning · Audio Processing · Python

BITS Pilani | Jul 2022

Python · Librosa · MFCC · SVM · Audio Processing

Project Overview

Built an audio classification system that automatically recognizes speaker gender from voice samples, with applications in voice assistants, call center analytics, and speaker identification systems.

Key Contributions

  • Feature Extraction: Extracted acoustic features (MFCCs, spectral centroid, zero-crossing rate) from 3,168 voice samples using the Librosa library
  • Model Development: Trained SVM and Random Forest classifiers, reaching 96% cross-validated accuracy on the gender classification task
  • Feature Analysis: Analyzed feature importance, identifying fundamental frequency (F0) and formant frequencies as the strongest gender discriminators

Technologies Used

  • Languages: Python
  • Libraries: Librosa, Scikit-Learn
  • Features: MFCC, Spectral Centroid, ZCR
  • Models: SVM (RBF kernel), Random Forest
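As a rough sketch of how the two model families listed above could be compared with cross-validation in Scikit-Learn, consider the example below. The data is a synthetic placeholder generated with make_classification, and the hyperparameters (C, gamma, number of trees, 5 folds) are illustrative assumptions rather than the project's settings, so the printed scores say nothing about the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder data standing in for the real per-recording feature matrix.
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=8, random_state=0
)

models = {
    # RBF kernels are sensitive to feature scale, so standardize first.
    "svm_rbf": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale")),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# 5-fold cross-validated accuracy for each model.
cv_scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy")
    for name, model in models.items()
}
for name, scores in cv_scores.items():
    print(f"{name}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")

# Impurity-based importances from the forest give one view of which
# feature columns matter most (indices sorted from most to least important).
forest = models["random_forest"].fit(X, y)
ranking = forest.feature_importances_.argsort()[::-1]
print("Most important feature indices:", ranking[:5])
```

Wrapping the scaler and the SVM in one pipeline keeps the scaling statistics inside each training fold, which avoids leaking information from the held-out fold.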

Key Results

Metric              Value
Accuracy            96%
Voice Samples       3,168
Features Extracted  20+
Best Model          SVM (RBF)

Audio Features Used

Feature             Description
MFCC                Mel-frequency cepstral coefficients (13 coefficients)
Spectral Centroid   Center of mass of the spectrum
Zero-Crossing Rate  Rate of sign changes in the signal
Spectral Rolloff    Frequency below which 85% of the spectral energy lies
Chroma Features     Pitch class profile
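To make three of the definitions above concrete, here is a small NumPy-only sketch that computes zero-crossing rate, spectral centroid, and spectral rolloff for a single frame. It is for illustration only: the function names are hypothetical, the Hann window and frame size are assumptions, and the rolloff is accumulated over spectral magnitudes (squaring them would give a power-based variant). Librosa's own implementations work frame by frame over a whole signal and may differ in detail.

```python
import numpy as np


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.signbit(frame)
    return np.mean(signs[1:] != signs[:-1])


def magnitude_spectrum(frame, sr):
    """Hann-windowed magnitude spectrum and its bin frequencies in Hz."""
    window = np.hanning(len(frame))
    mags = np.abs(np.fft.rfft(frame * window))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    return freqs, mags


def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency (center of mass of the spectrum)."""
    return np.sum(freqs * mags) / np.sum(mags)


def spectral_rolloff(freqs, mags, roll_percent=0.85):
    """Lowest frequency at which the cumulative magnitude reaches roll_percent of the total."""
    cumulative = np.cumsum(mags)
    idx = np.searchsorted(cumulative, roll_percent * cumulative[-1])
    return freqs[idx]


# One frame of a 200 Hz tone sampled at 8 kHz (synthetic test signal).
sr = 8000
t = np.arange(2048) / sr
frame = np.sin(2 * np.pi * 200.0 * t + 0.3)

freqs, mags = magnitude_spectrum(frame, sr)
zcr = zero_crossing_rate(frame)
centroid = spectral_centroid(freqs, mags)
rolloff = spectral_rolloff(freqs, mags)
print(f"ZCR={zcr:.4f}  centroid={centroid:.1f} Hz  rolloff={rolloff:.1f} Hz")
```

For a pure tone, both the centroid and the rolloff land close to the tone's frequency, and the zero-crossing rate is close to twice the frequency divided by the sample rate. Voiced speech spreads energy across harmonics, so these values move with pitch and vocal tract shape.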

Applications

  • Voice assistant personalization
  • Call center analytics and routing
  • Speaker identification systems
  • Audio content tagging