Spotify Project Literature Review
The following resources helped us in our initial research, data scraping, data infrastructure, model building, and final analysis:
Data Scraping
- 1 million playlist data set: Used to understand our initial playlist data
- Spotify API documentation: Used when we built the script that extracts audio features from the Spotify API
- Spotify OAuth code flow: We initialized a connection to the Spotify API using the OAuth authorization code flow documented here
- Spotipy documentation: Spotipy is a Python library that makes it easy to call the Spotify API.
- Will Soares’ Spotify-Genius integration: We found this useful when learning to use the Spotipy library.
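
For readers unfamiliar with the authorization code flow mentioned above, the sketch below shows its first two steps using only the Python standard library: building the URL that sends a user to Spotify's authorization page, and reading the `code` parameter from the redirect that comes back. This is an illustration, not the project's script. The helper names, the client ID, the redirect URI, and the scope are placeholder assumptions, and exchanging the code for an access token is left out.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Spotify's authorization endpoint for the OAuth authorization code flow.
AUTHORIZE_URL = "https://accounts.spotify.com/authorize"


def build_authorize_url(client_id, redirect_uri, scope="", state=None):
    """Return the URL a user visits to grant access (hypothetical helper)."""
    params = {
        "client_id": client_id,
        "response_type": "code",  # "code" selects the authorization code flow
        "redirect_uri": redirect_uri,
    }
    if scope:
        params["scope"] = scope
    if state:
        params["state"] = state
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


def extract_code(redirect_url, expected_state=None):
    """Pull the authorization code out of the redirect (hypothetical helper)."""
    query = parse_qs(urlparse(redirect_url).query)
    if "error" in query:
        raise ValueError(f"authorization failed: {query['error'][0]}")
    if expected_state is not None and query.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch")
    return query["code"][0]


# Placeholder values; a real client ID comes from the Spotify developer dashboard.
url = build_authorize_url(
    "my-client-id",
    "http://localhost:8888/callback",
    scope="playlist-read-private",
    state="abc123",
)
```

In practice Spotipy wraps these steps, so a sketch like this is mainly useful for understanding what the library does on your behalf.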
Data Infrastructure
- GCP BigQuery documentation: Used to learn how to load data into Google Cloud Platform’s BigQuery
- GCP service accounts: Documentation for service accounts in GCP
- GCP Python authentication documentation: Documentation for authorization handling in Python
Model Building
K-Means:
- University of Washington’s k-means course on Coursera: We learned how to implement k-means from this course.
- Towards Data Science: K-Means: A useful high-level summary of the k-means algorithm
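
As a quick reminder of what these resources cover, here is a minimal pure-Python sketch of the standard k-means loop (Lloyd's algorithm): assign each point to its nearest centroid, then move each centroid to the mean of its assigned points, and repeat until the assignments stop changing. It is not the project's model code. The function names and the toy points are illustrative assumptions; in a setting like ours, each point might be a vector of audio features.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Cluster `points` into `k` groups; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Initialize centroids by picking k distinct data points at random.
    centroids = rng.sample(points, k)
    assignments = None
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for each point.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: no point changed cluster
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Toy data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, assignments = kmeans(points, k=2)
```

Random initialization from the data is the simplest choice; k-means++ seeding or multiple restarts are common refinements when clusters are less clearly separated.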