Together with my peers, I published a Python package called emotion_detective that detects emotions in video and audio files. The project was carried out in collaboration with Banijay Benelux, who provided us with TV show data for the training and evaluation process.
I’m excited to share Emotion Detective, a project I recently completed with my peers.
It is our first published Python package, which analyses emotions from video and audio content.
This was a collaborative effort with Amy Suneeth, Martin Vladimirov, Andrea Tosheva, and Kacper Janczyk. It challenged us to develop and deploy a fully functional machine learning pipeline that combines natural language processing (NLP) and audio processing.
Our journey didn’t stop there; we took things further by deploying the entire pipeline on Azure to create an automated solution for emotion analysis. We collaborated with Banijay, a global media company, to ingest their multimedia data, train models, and analyze the emotional content of their video files.
Here’s an overview of what we achieved and the new skills we gained during this journey!
The Emotion Detective package is designed to extract, process, and analyse emotions in multimedia files (such as videos or audio files). Whether it's analysing emotions in movie dialogues, podcasts, or any other content, our package offers a comprehensive solution. It can perform emotion classification at the sentence level using various NLP models, such as RoBERTa and recurrent neural networks (RNNs).
One of the most exciting achievements was deploying the project on Azure, creating a fully automated pipeline to handle data ingestion, model training, and inference. We worked closely with Banijay, using their video data as input to automatically train emotion detection models.
Here's how the Azure integration works:
By deploying the pipeline in Azure, we were able to automate the entire workflow, from ingesting raw video data to producing emotion analysis reports, making it a valuable tool for Banijay and other content creators.
One of the biggest achievements for our team was successfully publishing our first Python package. It is available for installation via pip install emotion_detective.
The package includes everything from data ingestion and preprocessing functions to the final emotion detection pipelines for training and inference.
Publishing a package involved not only coding but also documentation, version control, dependency management, and debugging, all skills that I’m glad we had the chance to develop.
We also built Sphinx documentation to ensure that others can easily understand and use the package.
Throughout the project, I gained several valuable skills:
The core functionality of the package lies in its two key pipelines: Training Pipeline and Inference Pipeline.
Training Pipeline
Our training pipeline allows users to train their own NLP models using custom datasets. Here’s a quick breakdown of its functionality:
Example Usage:
from emotion_detective.training import training_pipeline

training_pipeline(
    train_data_path='path/to/train.csv',
    test_data_path='path/to/test.csv',
    text_column='text',
    emotion_column='emotion',
    num_epochs=5,
    model_type='roberta',
    model_dir='./models/',
    model_name='emotion_model'
)
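If you don't have train and test files yet, here is a minimal sketch of how they could be prepared. It assumes the pipeline reads plain CSV files with a header row whose column names match the text_column and emotion_column arguments above. The helper write_split and the sample rows are hypothetical and are not part of the package.

```python
import csv
import random

# Hypothetical labelled sentences, for illustration only.
SAMPLE_ROWS = [
    ("I can't believe we won!", "happiness"),
    ("Why would you say that to me?", "anger"),
    ("I miss the way things used to be.", "sadness"),
    ("Did you hear that noise outside?", "fear"),
    ("Well, that was unexpected.", "surprise"),
]


def write_split(rows, train_path, test_path, test_fraction=0.2, seed=42):
    """Shuffle (text, emotion) rows and write train/test CSVs with a header.

    Returns the number of rows written to each file as (n_train, n_test).
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # seeded so the split is reproducible
    n_test = max(1, int(len(rows) * test_fraction))
    splits = {test_path: rows[:n_test], train_path: rows[n_test:]}
    for path, subset in splits.items():
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            # Assumed header: must match text_column / emotion_column.
            writer.writerow(["text", "emotion"])
            writer.writerows(subset)
    return len(rows) - n_test, n_test
```

Calling write_split(SAMPLE_ROWS, 'path/to/train.csv', 'path/to/test.csv') would produce two files that can be passed as train_data_path and test_data_path.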
Inference Pipeline
Once a model is trained, our inference pipeline processes audio/video files, transcribes the speech, and analyzes the emotion in each sentence. This pipeline can handle both mp4 video files and mp3 audio files.
Example Usage:
from emotion_detective.inference import main

results = main(
    input_media_path='path/to/video.mp4',
    model_path='path/to/model.pth',
    model_type='roberta',
    emotion_mapping_path='path/to/emotion_mapping.csv'
)
print(results)
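Because the pipeline works at the sentence level, a natural next step is to summarise which emotions dominate a file. The sketch below assumes the predictions have already been reduced to a list of (sentence, emotion) pairs. The exact structure returned by main is not described here, so the conversion step and the summarize_emotions helper are hypothetical.

```python
from collections import Counter


def summarize_emotions(predictions):
    """Return (emotion, share) pairs, most frequent emotion first.

    `predictions` is assumed to be an iterable of (sentence, emotion)
    pairs; adapt this to whatever the inference pipeline returns.
    """
    counts = Counter(emotion for _, emotion in predictions)
    total = sum(counts.values())
    # An empty input gives an empty list, so no division by zero occurs.
    return [(emotion, count / total) for emotion, count in counts.most_common()]


# Hypothetical per-sentence predictions, for illustration only.
sample_predictions = [
    ("I can't believe we won!", "happiness"),
    ("This is the best day ever.", "happiness"),
    ("Why would you say that to me?", "anger"),
    ("I miss the way things used to be.", "sadness"),
]

for emotion, share in summarize_emotions(sample_predictions):
    print(f"{emotion}: {share:.0%}")
```

A summary like this makes it easier to compare episodes or scenes than reading the per-sentence output line by line.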
The package is organized into several modules:
Each file has clearly defined functions to make the package easy to extend or modify for different use cases.
One of the biggest challenges was implementing the emotion classification model in a way that could handle real-world noisy data, especially during audio transcription. Additionally, deploying the entire system on Azure required careful orchestration of the pipelines to ensure smooth operation. Learning how to manage cloud resources, configure virtual machines, and utilize Azure’s machine learning services was crucial to our success.
I’m really proud of the work we accomplished as a team and the collaboration with Banijay, which provided real-world data to test and validate our system. With the Azure deployment, Emotion Detective has the potential to be a valuable tool for researchers, content creators, and media companies alike.
Feel free to check out our documentation or install the package to try it out yourself!