For this project, I developed an end-to-end multilabel classifier that uses a convolutional neural network (CNN) to extract features from audio mel-spectrograms. The model produces two classification outputs from the same extracted features:
A language classifier capable of identifying English, Russian, German, French, Spanish, and Arabic.
A sex classifier distinguishing between male and female voices.
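
The two-output design can be illustrated with a minimal pure-Python sketch. It is not the project's actual implementation: the feature vector is a random stand-in for the CNN's output, the head weights are untrained placeholders, and the names `make_head`, `classify`, and `N_FEATURES` are hypothetical. It only shows how two separate heads could read one shared feature vector and each produce its own probability distribution.

```python
import math
import random

LANGUAGES = ["English", "Russian", "German", "French", "Spanish", "Arabic"]
SEXES = ["male", "female"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(features, weights, biases):
    """Dense layer: one output score per row of `weights`."""
    return [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]


def make_head(n_classes, n_features, rng):
    """Hypothetical head with random placeholder weights (not trained)."""
    weights = [
        [rng.uniform(-1, 1) for _ in range(n_features)]
        for _ in range(n_classes)
    ]
    biases = [0.0] * n_classes
    return weights, biases


def classify(features, language_head, sex_head):
    """Run both heads on the same shared feature vector."""
    lang_probs = softmax(linear(features, *language_head))
    sex_probs = softmax(linear(features, *sex_head))
    return {
        "language": LANGUAGES[lang_probs.index(max(lang_probs))],
        "language_probs": lang_probs,
        "sex": SEXES[sex_probs.index(max(sex_probs))],
        "sex_probs": sex_probs,
    }


rng = random.Random(0)
N_FEATURES = 8  # assumed size of the CNN feature vector
features = [rng.uniform(0, 1) for _ in range(N_FEATURES)]  # stand-in for CNN output
language_head = make_head(len(LANGUAGES), N_FEATURES, rng)
sex_head = make_head(len(SEXES), N_FEATURES, rng)

result = classify(features, language_head, sex_head)
print(result["language"], result["sex"])
```

In a trained model, the shared features would come from the convolutional layers applied to a mel-spectrogram, and each head would typically be trained with its own classification loss, with the two losses combined into one objective.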