MemGator 1.0-rc8

Pre-release
@ibnesayeed ibnesayeed released this 06 Mar 05:28
· 34 commits to master since this release
7ea353f

Notable Changes

This release was long overdue, but it is finally here.

API

  • A new /about endpoint is added to the server to report the runtime configuration and the state of the aggregated upstream archives; the same report is served at the root of the service when no public folder of static files is specified
  • CDXJ metadata lines now use ! prefix instead of @
  • Updated !context URI in CDXJ response
  • X-Generator header is replaced with the standard Server header
  • Default user-agent now separates the name and version number of MemGator with a / instead of a :, which is a more common style
  • Default contact information in the user-agent is changed from @WebSciDL to https://git.io/MemGator
  • Default archives list file URL now uses HTTPS to avoid a round trip
  • API endpoints now correctly report base URL behind a reverse proxy
  • CLI help now correctly exits with 0 status
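
To illustrate the CDXJ prefix change listed above, here is a minimal sketch of how a client might tell metadata lines from data lines. It assumes only what the notes state (metadata lines start with !) plus the general CDXJ layout of a key followed by a space and a JSON payload. The function names and the sample lines are hypothetical and are not taken from MemGator's code or its actual output.

```go
package main

import (
	"fmt"
	"strings"
)

// isMetadataLine reports whether a CDXJ line is a metadata line,
// identified by the "!" prefix (earlier releases used "@").
func isMetadataLine(line string) bool {
	return strings.HasPrefix(line, "!")
}

// splitCDXJ splits a CDXJ line at the first space into its key and
// its JSON payload. A line without a space yields an empty payload.
func splitCDXJ(line string) (key, payload string) {
	parts := strings.SplitN(line, " ", 2)
	if len(parts) < 2 {
		return parts[0], ""
	}
	return parts[0], parts[1]
}

func main() {
	// Hypothetical sample lines, made up for illustration only.
	lines := []string{
		`!context ["https://example.org/context"]`,
		`20200306052800 {"uri": "https://archive.example/20200306052800/http://example.com/"}`,
	}
	for _, line := range lines {
		key, payload := splitCDXJ(line)
		kind := "data"
		if isMetadataLine(line) {
			kind = "metadata"
		}
		fmt.Printf("%s key=%s payload=%s\n", kind, key, payload)
	}
}
```

A client that parsed the older @-prefixed metadata lines would likely need only the prefix check changed.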

Archives

  • Archive IDs now use their respective domain names instead of manually chosen short names
  • List of archives is now sorted by ID
  • Randomly assigned probabilities, which were added only to test the top-K functionality, are removed
  • New archives added: National Records of Scotland, Library and Archives Canada, Australian Web Archive, and BAnQ
  • Alexandrina Web Archive, which was ignored for a long time due to quality of service issues, is now enabled
  • Ignored: PRONI (moved to Archive-It) and PastPages (defunct)
  • Updated endpoints: Perma.cc and Arquivo.pt
  • Updated various archive endpoints to use HTTPS (when supported) to minimize redirects

Build

  • Preferred Go version is changed to 1.14+
  • Local packages are moved from pkg to vendor folder
  • Adopted the new Go Modules feature to simplify dependency management, so the code no longer needs to reside in the Go workspace
  • Refactored crossbuild.sh script and updated to exit on failures
  • Leverages GitHub Workflow Actions to cross-build binaries for various platforms on each push/PR automatically and make these artifacts available
  • Dockerfile now utilizes multi-stage builds to create a small image and allows test builds against specified versions of base images using build-args
  • Final Docker image is now based on alpine instead of scratch, which makes the image slightly bigger but makes it possible to run a shell to inspect a running container
  • Leverages .dockerignore to minimize the build context size
  • Docker images now leverage standard opencontainers annotation labels
  • gh-pages branch-based static asset (primarily, the default archives list) serving is retired in favor of using docs folder of the master branch to avoid maintaining common files in two branches

Distribution

  • Changed Docker image location from ibnesayeed to oduwsdl DockerHub account, which is configured to auto-build
  • Leverages GitHub Workflow Actions to publish Docker images to GitHub Package Registry as an alternate image registry
  • Both the image registries host images with master tag on each push to the master branch and two tags latest and {version} for each release, ensuring that the default latest tag always points to the latest released version, not the bleeding edge master branch
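
The tagging rule described above can be sketched as a small function. This is only an illustration of the rule as stated in these notes, not the actual workflow configuration; the function name imageTags and the convention of passing an empty string for a non-release push are assumptions.

```go
package main

import "fmt"

// imageTags returns the image tags to publish for a build.
// An empty releaseVersion stands for a plain push to the master
// branch (hypothetical convention); otherwise the build is a release.
func imageTags(releaseVersion string) []string {
	if releaseVersion == "" {
		// Bleeding-edge builds only move the master tag.
		return []string{"master"}
	}
	// Releases move latest and add a version-specific tag.
	return []string{"latest", releaseVersion}
}

func main() {
	fmt.Println(imageTags(""))        // prints [master]
	fmt.Println(imageTags("1.0-rc8")) // prints [latest 1.0-rc8]
}
```

Because latest is moved only on releases, pulling an image without an explicit tag never yields an unreleased master build.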

Documentation

  • Added project citation information with JCDL 2016 poster publication and a link to the paper
  • Updated README to reflect the recent API changes
  • Updated Docker-related section in the README with steps to build images locally from the source, pulling from the two registries, and running containers in one-off and server modes
  • Documented automated archival exclusion feature for archives that misbehave at run time
  • Updated CLI usage to reflect new changes
  • Updated instructions for running and building from source (for developers) and for building cross-platform binaries

Please test it and provide feedback to push it towards the final release.