Recommendation System: Data Collection

Collecting metadata of my favorite songs from Spotify

  • Bijon Setyawan Raya

  • June 14, 2023

    2 mins

    Music Recommendation System (4 Parts)

    In the first part of this series, I will collect approximately 2000 random songs, because I would like to find out which songs I might like based on my playlists. Once I have the data, I will use it to build a recommendation system with Non-negative Matrix Factorization (NMF).

    My Spotify homepage

    Since I am a Spotify subscriber, I can use their API to get songs' metadata. Unfortunately, I had already finished writing my own functions before stumbling upon a library called Spotipy, which can be used to interact with the Spotify API.

    Click here to see the script on GitHub.

    I used the following endpoints to collect the metadata of songs in my own playlists and of songs in playlists that I have never listened to:

    1. to get the access token
    2. to get the refresh token
    3. to get all of my playlists
    4. {category_id}/playlists to get all of the playlists in a category
    5. {playlist['id']}/tracks to get all of the tracks in a playlist
    6. {track_ids} to get the tracks' metadata in bulk
    7. {track_ids} to get the tracks' audio features in bulk
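The two bulk endpoints take a comma-separated list of track IDs, so roughly 2000 tracks have to be split into batches first. Below is a minimal sketch of that batching step, not the script linked above. The helper names `chunked` and `build_id_params` are made up for illustration, and the batch sizes (50 IDs for track metadata, 100 for audio features) are my understanding of Spotify's limits and should be checked against the current API reference.

```python
from typing import Iterator, List


def chunked(ids: List[str], size: int) -> Iterator[List[str]]:
    """Yield successive slices of `ids` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def build_id_params(ids: List[str], size: int) -> List[str]:
    """Return one comma-separated `ids` value per bulk request."""
    return [",".join(batch) for batch in chunked(ids, size)]


# Hypothetical IDs standing in for the collected tracks.
track_ids = [f"track{i}" for i in range(120)]

# Assumed limits: 50 IDs per metadata request, 100 per audio-features request.
track_batches = build_id_params(track_ids, 50)
feature_batches = build_id_params(track_ids, 100)
```

Each string in `track_batches` or `feature_batches` would then be passed as the `{track_ids}` part of the corresponding request.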

    Here is an example of the metadata that the Spotify API returns for a single track:

    {
      "id": "2i2gDpKKWjvnRTOZRhaPh2",
      "title": "Moonlight",
      "artist(s)": "Kali Uchis",
      "popularity": 88,
      "danceability": 0.639,
      "energy": 0.723,
      "key": 7,
      "loudness": -6.462,
      "mode": 0,
      "speechiness": 0.0532,
      "acousticness": 0.511,
      "instrumentalness": 0.0,
      "liveness": 0.167,
      "valence": 0.878,
      "tempo": 136.872,
      "type": "audio_features",
      "uri": "spotify:track:2i2gDpKKWjvnRTOZRhaPh2",
      "track_href": "",
      "analysis_url": "",
      "duration_ms": 187558,
      "time_signature": 4
    }
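NMF only accepts non-negative input, and some of these fields (loudness, for example) are negative, so the records need some preprocessing before factorization. The sketch below shows one possible approach, min-max scaling of a few numeric fields to the range [0, 1]. It is an assumption on my part and not necessarily what the later parts of this series do. The second record and the `min_max_scale` helper are hypothetical and exist only to make the example runnable.

```python
from typing import Dict, List

# Subset of numeric fields chosen for illustration only.
FEATURES = ["danceability", "energy", "loudness", "valence", "tempo"]


def min_max_scale(rows: List[Dict], features: List[str]) -> List[List[float]]:
    """Rescale each feature column to [0, 1] so every value is non-negative."""
    scaled = [[0.0] * len(features) for _ in rows]
    for j, name in enumerate(features):
        column = [float(row[name]) for row in rows]
        low, high = min(column), max(column)
        span = high - low
        for i, value in enumerate(column):
            # A constant column carries no information; map it to 0.0.
            scaled[i][j] = (value - low) / span if span else 0.0
    return scaled


tracks = [
    # Values taken from the example record above.
    {"danceability": 0.639, "energy": 0.723, "loudness": -6.462,
     "valence": 0.878, "tempo": 136.872},
    # Hypothetical second track, invented for this sketch.
    {"danceability": 0.450, "energy": 0.300, "loudness": -10.0,
     "valence": 0.200, "tempo": 100.0},
]

matrix = min_max_scale(tracks, FEATURES)
```

With only two rows, every column collapses to 0.0 and 1.0. On the full dataset, each row of `matrix` would be a non-negative feature vector for one track.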