- http://apassant.net/2014/10/27/500000-deezer-playlists-google-big-query
- This repo contains the various scripts used to run the experiment
- Dataset is split into 9 JSON files (About 1Go total) at http://storage.googleapis.com/deezer-playlists/[1-9].json.gz
- Data comes from the Deezer API. By using it, you agree to the Deezer API TOS at http://developers.deezer.com/api
- I'm making it available publicly for experimental purposes. If you're a Deezer representative and want it to be removed, please contact me at http://apassant.net
- Create a Google Cloud project
- Set-up your authorization details using https://cloud.google.com/bigquery/authorization#clientsecrets
- Setup BigQuery for your Google Cloud project
- Copy
config.py.dist
intoconfig.py
and edit your project details
- Run
load.py
and accept the oAuth screen (this will create abigquery_credentials.dat
file) - Wait until it loads all JSON files from Google Cloud Storage into your Big Query project
- Run query.py to query the dataset
- Add additional queries by creating new files in
/queries