Gender of Voice Prediction

William Wei
liyewei2019@u.northwestern.edu
EECS 349, Spring 2018, Northwestern University
Motivation
Determining whether a speaker is male or female from a sample of their voice seems, at first, to be an easy task. The human ear can often tell a male voice from a female voice within the first few spoken words. Designing a computer program to do the same, however, turns out to be trickier.
My task was to predict the gender of speakers from the voice characteristics provided in the dataset. While prior work has investigated male/female classification based on language use, this project focuses on specific acoustic attributes and their relative importance. In this way, it contributes to our understanding of the relationship between gender and voice.
Methodology
The dataset was created to identify a voice as male or female based on acoustic properties of the voice and speech. It consists of 3,168 recorded voice samples collected from male and female speakers. 80% of the data was used as the training set and the remaining 20% as the test set. There are 28 attributes overall. Of these, the standard deviation of frequency, first quartile, interquartile range, spectral flatness, mode frequency, and average fundamental frequency measured across the acoustic signal were chosen as the initial attributes for testing. The voice samples were pre-processed by acoustic analysis in R using the seewave and tuneR packages, with an analyzed frequency range of 0 Hz to 280 Hz.
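The 80/20 split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the project (which ran in Weka): the function name `train_test_split`, the fixed seed, and the placeholder rows are all hypothetical.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle rows and return (train, test) lists.

    Hypothetical helper for illustration; the seed is fixed only so
    that the split is reproducible.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder rows standing in for the 3,168 voice samples (not real data).
rows = [{"id": i, "label": "male" if i % 2 == 0 else "female"}
        for i in range(3168)]

train, test = train_test_split(rows)
print(len(train), len(test))  # 2534 634
```

With 3,168 samples, a 20% test fraction rounds to 634 test samples and leaves 2,534 for training.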
Result
To determine which acoustic properties indicate a male or female voice, we might initially guess that the statistically significant features are responsible, but the actual decision breakdown is hidden inside the model. To gain an understanding of a trained model, we applied a classification-via-regression tree model to the dataset in Weka to see how these properties correspond to a classification of male or female.
The classification-via-regression model achieves an accuracy of 81% on the training set, with the mode frequency attribute serving as the root node for classifying a voice as male or female. From there, the model checks the minimum fundamental frequency, followed by more specific properties.
The random forest model was then applied. It achieves an accuracy of 100% on the training set, a further improvement over the classification-via-regression model.
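To make the idea of a root node concrete, the sketch below searches for the single threshold on one attribute that best separates the two classes. It is a simplified decision stump, not the Weka classification-via-regression model itself. The function name `best_stump`, the toy mode-frequency values, and the assumed rule "lower value means male" are all hypothetical and chosen only for illustration.

```python
def best_stump(values, labels):
    """Return (threshold, accuracy) for the best rule
    'value < threshold -> male, otherwise female'.

    Candidate thresholds are midpoints between consecutive distinct values.
    """
    candidates = sorted(set(values))
    best_threshold, best_accuracy = None, 0.0
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2
        # Count samples where the rule's prediction matches the label.
        correct = sum(
            (v < threshold) == (y == "male") for v, y in zip(values, labels)
        )
        accuracy = correct / len(values)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy


# Toy mode-frequency values in kHz (made up, not taken from the dataset).
mode_freq = [0.10, 0.11, 0.12, 0.13, 0.16, 0.17, 0.18, 0.19]
labels = ["male", "male", "male", "female",
          "male", "female", "female", "female"]

threshold, accuracy = best_stump(mode_freq, labels)
print(round(threshold, 3), accuracy)  # 0.125 0.875
```

A full tree repeats this search within each resulting branch, which is how further attributes come to be checked after the root split.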