- use Last.fm https (ssl) API endpoint to download scrobbles
- `FacetsTransformer` and `facet` (unique artists, albums, tracks) option in `transform/2` for facets archiving
- refactor: utils modules, dataframe and transformer config macros
- Livebook:
  - Creating a file archive: heatmaps visualisation containing daily playcounts, tooltips and stats
  - Facets archiving: new guide about facets archiving with usage demo for `Explorer.DataFrame` analytics and Vega-Lite visualisation
- fix: Lastfm client bug that fetched page 1 instead of page 2 (for days with more than 200 scrobbles)
- test: use `ExMachina` test data factories, update tests and replace static test fixtures
- `year`, `date`, `overwrite` options for `LastfmArchive.sync/2`
- refactor: `.cache` hidden directory hosting cache files
- refactor: split `LastfmArchive.Cache` into behaviour, client, server modules
- refactor: extract date time logic from `LastfmArchive.Utils` into a separate module
- faceted archive: deriving multiple facets (archives) from the same file archive in support of analytics use cases
- transforms data on a year-by-year basis by default, instead of loading the entire dataset
- refactor: `.metadata` and `derived` directories hosting various metadata files and derived archives
- analytics features migrated to coda
- most-frequent, play-once samples analytics for on-this-day Livebook page
- fine-grained `Transformer` behaviour with default implementation and base transformer
- `DateTime`, `Date` typed columns (instead of string) in transformed columnar archive
- refactor: rename `name` column (of songs) in columnar storage to `track`
- new `Analytics` (generated callbacks) and `LivebookAnalytics` behaviours, with default implementations to separate the concerns of data frame analytics and Livebook rendering
- analytics improvement:
  - includes various artists albums
  - more overall stats
  - stats per facet, e.g. number of albums, tracks per artist
  - resolve on-this-day analytics ranking issue
- fix issue that caused subsequent columnar data transforms to get stuck at the previous (first) date of transform
- new Livebook to present analytics of all music played "on this day"
- additional data storage formats (Apache Arrow IPC) via `Explorer.DataFrame` I/O functions
- `read/2` loads all scrobbles, i.e. the entire dataset, into a lazy data frame by default
- update and create new Livebook guides for archiving and columnar data transformation
- `overwrite` option for file archive transformer and `transform/2`
- `read/2` CSV and Parquet data (deprecate `read_csv`, `read_parquet`)
- `columns` option to load only required CSV, Parquet columns into data frames
- use Explorer DataFrame I/O functions to compress Parquet data
- refactor: DataFrame I/O macro, generate tests for transform formats
- `read/2` callback and implementation to return data in `Explorer` data frame
- `after_archive` callback and implementation to transform data into TSV and Apache Parquet files
- refactor `Archive`, `FileArchive` and new `Scrobble` and `Metadata` structs
- first interactive notebook - Livebook support
- provide an archiving Livebook to initiate archiving jobs
- playcounts heatmap and table for visualising archiving progress
- refactor: FileArchive - raw Lastfm scrobbles JSON format and better logging
- default runtime configuration to better support Livebook usage
- `info/0` returns total play count and first date of scrobble for default user
- update Lastfm client to use Livebook system env vars (`LB_LFM_USER` and `LB_LFM_API_KEY`) along with config values
- rename modules and tests in idiomatic namespaces
- bump dependencies, Elixir / Erlang versions
- GitHub Actions CI
- `sync/2` handles request errors from Last.fm API
- caches errors during sync so that subsequent syncs or API requests can be made to archive the missing scrobbles
- do not cache sync results of today's scrobbles, as it is always a partial sync
- Create an `Archive` behaviour with `FileArchive` implementation
- Extract and store scrobbles in daily chunks
- Extensive refactoring: simpler LastfmArchive functions core, deprecate `archive` functions
- Rewrite tests with mocks, eliminate file writes during tests
- Implement a GenServer simple tick-based (auto) cache memoization to prevent repetitive API calls to Last.fm
- Update Elixir/Erlang versions and dependencies
- Replace HTTP client (HTTPoison) with Erlang's built-in httpc client
- Replace Poison with Jason for JSON decoding/encoding
- Apply mix formatting to the entire codebase
- Refactor and abstract LastfmArchive.Extract functions into a new module based on `Lastfm.Client` behaviour
- Implement explicit behaviour and contract-based unit testing (Hammox/Mox)
- Refactor and enable concurrent testing
- Fix the issue of Last.fm changing the date/count data type in JSON back and forth (string <-> integer).
- Patches as per Last.fm API JSON data format changes: uts timestamp, play counts info are now returned as integers instead of strings.
- `sync/0`, `sync/1`: sync and keep track of scrobbles for a default user and other Last.fm users, via delta archiving (download latest scrobbles)
- Support for Solr: load all transformed (TSV) data from the archive into Solr, `load_archive/2`
- Underpinning functions to read, parse, load data into Solr
- `transform_archive/2`: transform downloaded raw Last.fm archive and create a TSV file archive
- Underpinning functions to read, parse and transform raw Lastfm JSON data into TSV files
- fix single year archiving (bug): `daily: true` option
- `archive/3`: archiving data subset based on date ranges: single day/year, past week/month, arbitrary date range using `Date`, `Date.Range` structs
- `daily: true` option for finer-grained batch archiving cf. the default year-level granularity
- `overwrite` archiving option to also re-fetch any existing downloaded data, for refreshing file archive
- Keyword list archiving options (`per_page`, `interval`) for `archive/2`, which can also be configured
- `archive` latest tracks (current year) on a daily basis to better ensure data immutability and updatability (new scrobbles)
- `archive` older tracks on a yearly basis
- `archive/0`: downloads scrobbled tracks, creates a file archive for a default user according to configuration settings
- `archive/2`: downloads scrobbled tracks, creates a file archive for any Lastfm user
- `write/3`: outputs data for multiple Lastfm users (no longer hardwired to the default user)
- Download scrobbled tracks raw data, create an archive on local filesystem for a default user in configuration - `archive/1`
- `extract/5` and `write/2` functions for Lastfm API requests and file output