The goal of this project is to recommend songs for a given playlist from Spotify. This project starts from data collection all the way to model deployment to ensure you have a working model to showcase.

rodolfolermacontreras/PracticalDataScience-ENCA

 
 

Enhance your Playlists with Machine Learning: Spotify Automatic Playlist Continuation

This is the repository of the group project Enhance your Playlists with Machine Learning: Spotify Automatic Playlist Continuation. The four articles in the series are listed below:

Part I: Extracting song data from Spotify’s API in Python

Part II: EDA and Clustering

Part III: Building a Song Recommendation System with Spotify

Part IV: Deploying a Spotify Recommendation Model with Flask

The code for all four articles is in this repository.

Introduction

The goal of this project is to recommend songs for a given playlist. This project starts from data collection all the way to model deployment to ensure you have a working model to showcase.

How to use

To clone the repository:

git clone https://github.com/enjuichang/PracticalDataScience-ENCA.git

Process

The following image is the flow chart of the project:

[Image: flow chart of the project]

Data extraction

Here are a couple of things you should know before starting the project.

Spotify API Acquisition

If you haven’t used an API before, the authentication keys and the sending of requests can be a bit daunting. The first step is getting the keys. For this, we need a Spotify for Developers (https://developer.spotify.com/) account. This is the same as a regular Spotify account and doesn’t require Spotify Premium. From there, go to the dashboard and “create an app”. The app gives us a public key (the client ID) and a private key (the client secret), both of which are needed to use the API.

Spotify Credentials Storage and Access

Now that we have an app, we can get its client ID and client secret. Both are required to authenticate our application with the Spotify Web API, and can be thought of as a kind of username and password for the application. It is best practice not to share either of these, but especially don’t share the client secret. To avoid leaking them, we can keep them in a separate file, which, if you’re using Git for version control, should be listed in .gitignore.

Spotify credentials should be stored in a secret.txt file, with the client ID on the first line and the client secret on the second line:

[Image: screenshot of an example secret.txt file]

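To access these credentials, please use the following code:

# Read the client ID (line 1) and the client secret (line 2) from secret.txt
with open("secret.txt") as f:
    secret_ls = f.readlines()
    # strip() removes the trailing newline regardless of line-ending style
    cid = secret_ls[0].strip()
    secret = secret_ls[1].strip()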
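
Once cid and secret are loaded, they can be exchanged for an access token. The sketch below assumes the Client Credentials flow of the Spotify Web API, in which the application posts grant_type=client_credentials to https://accounts.spotify.com/api/token with a Basic authorization header. The helper name build_token_request is hypothetical and not part of this repository, and the notebooks may use a client library instead. The code only builds the request; sending it needs valid credentials and network access.

```python
import base64
import urllib.parse
import urllib.request

TOKEN_URL = "https://accounts.spotify.com/api/token"


def build_token_request(cid, secret):
    """Build (but do not send) a Client Credentials token request.

    Hypothetical helper for illustration; not part of the repository.
    """
    # Basic auth header value: base64("client_id:client_secret")
    basic = base64.b64encode(f"{cid}:{secret}".encode("utf-8")).decode("ascii")
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


# Placeholder values for illustration; use the ones read from secret.txt.
request = build_token_request("my-client-id", "my-client-secret")

# To actually request a token (needs real credentials and network access):
# import json
# with urllib.request.urlopen(request) as response:
#     access_token = json.load(response)["access_token"]
```

Keeping the request construction separate from the network call makes it easy to check the headers without touching the real credentials.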

EDA and clustering

Recommendation Model

The recommendation model is summarized in the content_based_recsys.ipynb notebook. In this section, we will go through the process of building a content-based filtering recommendation system. The following parts will be covered:

  1. Package Setup
  2. Preprocessing
  3. Feature Generation
  4. Content-based Filtering Recommendation

Please follow the instructions in the notebook to reproduce the results.
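
To make the last step concrete, here is a minimal sketch of content-based filtering, assuming each song is already represented by a numeric feature vector. It averages the vectors of the songs in a playlist into a single profile and ranks the remaining songs by cosine similarity to that profile. The function names, feature values, and song IDs are made up for illustration, and the notebook's actual features and weighting may differ.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def playlist_profile(features, playlist_ids):
    """Average the feature vectors of the songs in the playlist."""
    vectors = [features[song_id] for song_id in playlist_ids]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def recommend(features, playlist_ids, top_n=2):
    """Rank songs outside the playlist by similarity to its profile."""
    profile = playlist_profile(features, playlist_ids)
    candidates = [s for s in features if s not in playlist_ids]
    candidates.sort(
        key=lambda s: cosine_similarity(features[s], profile),
        reverse=True,
    )
    return candidates[:top_n]


# Toy feature vectors: [danceability, energy, acousticness] (made-up values).
song_features = {
    "song_a": [0.9, 0.8, 0.1],
    "song_b": [0.8, 0.9, 0.2],
    "song_c": [0.2, 0.1, 0.9],
    "song_d": [0.85, 0.75, 0.15],
    "song_e": [0.1, 0.2, 0.8],
}

print(recommend(song_features, ["song_a", "song_b"], top_n=2))
```

In this toy data the playlist contains two energetic, danceable songs, so the similarly energetic song_d ranks ahead of the acoustic ones.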

Deployment

To access the final version of the app, please visit nazaryaremko1.pythonanywhere.com, where a demo version of the website can be tested. Because PythonAnywhere limits the size of uploaded files, the model there is trained on only a subset of the data. To test the full functionality of the model, please download the repository data, cd into the repository folder, and run the following commands:

cd recommendation_app
python wsgi.py

Then visit localhost in your browser and try out the model using any playlist!

To create a virtual environment, you can run the following commands:

python3 -m venv venv
source venv/bin/activate   # on Windows: venv\Scripts\activate

To install the dependencies in the virtual environment:

pip3 install -r requirements.txt

Repo Structure

│
├── README.md          <- The top-level README for developers using this project.
├── data
│   ├── raw            <- The original, immutable data dump.
│   ├── processed      <- The preprocessed data sets for training.
│   ├── test           <- The test data sets for testing.
│   └── final          <- The final data sets for modeling.
│
├── models             <- Trained models, model predictions, or model summaries.
│
├── notebooks          <- Serialized Jupyter notebooks created in the project.
│   ├── script         <- Script for data extraction and loading data
│   ├── Extraction     <- Data extraction using Spotify API
│   ├── EDA            <- Exploratory data analysis process.
│   └── Recsys         <- The training of traditional statistical models.
│
├── recommendation_app <- Model deployment folder
│   ├── application    <- Code for model deployment and website design
│   ├── data1          <- Pretrained data for model
│   └── venv           <- Environment
│
└── requirements.txt   <- The requirements file for reproducing the analysis environment.
