This is a project for the Multimodal course of my Master's degree.
To run the application, follow these steps:
- Place the data folder in the same directory as the project files (the 90 most popular songs on Spotify).
- Install the libraries listed in requirements.txt.
- To set up the image data for the ResNet-50 model, run frames_set_up.ipynb to create the frames for each music video clip.
- To set up the text data for the BERT uncased model, run lyrics_set_up.ipynb to create the text files containing the lyrics and the CSV files containing key information about the content of the music video clips.
- Run resnet_model.ipynb, which creates the trained PyTorch model.
- Run bert_base_model to get the results of the text model.
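
Before running the notebooks, it may help to confirm that everything the steps above rely on is in place. The following is a minimal sketch, not part of the project itself: the helper `missing_prerequisites` is hypothetical, the folder name `data` is an assumption, and because the file extension of bert_base_model is not stated, the check matches any file whose name starts with it.

```python
from pathlib import Path

# Names taken from the steps above. "data" as the folder name is an assumption.
REQUIRED = [
    "data",
    "requirements.txt",
    "frames_set_up.ipynb",
    "lyrics_set_up.ipynb",
    "resnet_model.ipynb",
]

# The extension of bert_base_model is not stated, so match any suffix.
REQUIRED_PATTERNS = ["bert_base_model*"]


def missing_prerequisites(root="."):
    """Return the required names and patterns that are not found under root."""
    root = Path(root)
    missing = [name for name in REQUIRED if not (root / name).exists()]
    missing += [pat for pat in REQUIRED_PATTERNS if not any(root.glob(pat))]
    return missing


if __name__ == "__main__":
    missing = missing_prerequisites()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

Run it from the project directory; if it reports nothing missing, install the dependencies with `pip install -r requirements.txt` and run the notebooks in the order listed above.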