MemGator 1.0-rc8
Pre-release
Notable Changes
A new release was long overdue, but it is finally here.
API
- A new /about endpoint is added to the server to report runtime configurations and the state of aggregated upstream archives; the same report is served at the root of the service if a public folder of static files is not specified
- CDXJ metadata lines now use the ! prefix instead of @
- Updated !context URI in CDXJ response
- X-Generator header is replaced with the standard Server header
- Default user-agent now separates name and version number of MemGator with a / instead of a :, which is a more common style
- Default contact information in the user-agent is changed from @WebSciDL to https://git.io/MemGator
- Default archives list file URL now uses HTTPS to avoid a round trip
- API endpoints now correctly report base URL behind a reverse proxy
- CLI help now correctly exits with 0 status
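For clients that consume the CDXJ output, the prefix change means metadata lines are now identified by a leading ! rather than @. The following is a minimal sketch in Go of how a client might separate the two kinds of lines. The function name splitCDXJ and the sample lines (including the placeholder context URI) are illustrative assumptions, not part of MemGator itself.

```go
package main

import (
	"fmt"
	"strings"
)

// splitCDXJ separates metadata lines (those starting with "!") from
// data lines in a CDXJ document. Blank lines are skipped.
// This is a hypothetical helper, not a MemGator API.
func splitCDXJ(doc string) (meta, data []string) {
	for _, line := range strings.Split(doc, "\n") {
		line = strings.TrimSpace(line)
		switch {
		case line == "":
			continue
		case strings.HasPrefix(line, "!"):
			meta = append(meta, line)
		default:
			data = append(data, line)
		}
	}
	return meta, data
}

func main() {
	// Made-up sample; the context URI is a placeholder and real
	// responses carry more fields.
	sample := `!context ["https://example.org/placeholder-context"]
!id {"uri": "http://localhost:1208/timemap/cdxj/http://example.com/"}
20020120142510 {"uri": "https://web.archive.org/web/20020120142510/http://example.com/", "rel": "first memento"}
`
	meta, data := splitCDXJ(sample)
	fmt.Println(len(meta), "metadata lines,", len(data), "data lines")
}
```

A client that must also read output from older releases could accept both prefixes during a transition period.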
Archives
- Archive IDs now use their respective domain names instead of manually chosen short names
- List of archives is now sorted by ID
- Randomly assigned probabilities, which were added to test the top-K functionality, are removed
- New archives added: National Records of Scotland, Library and Archives Canada, Australian Web Archive, and BAnQ
- Alexandrina Web Archive is now enabled, which was ignored for a long time due to quality of service issues
- Ignored: PRONI (moved to Archive-It) and PastPages (defunct)
- Updated endpoints: Perma.cc and Arquivo.pt
- Updated various archive endpoints to use HTTPS (when supported) to minimize redirects
Build
- Preferred Go version is changed to 1.14+
- Local packages are moved from the pkg folder to the vendor folder
- Utilized the new Go Modules feature to simplify dependency management, so the code no longer needs to reside in the Go Workspace
- Refactored crossbuild.sh script and updated it to exit on failures
- Leverages GitHub Workflow Actions to cross-build binaries for various platforms on each push/PR automatically and make these artifacts available
- Dockerfile now utilizes multi-stage builds to create a small image and allows test builds against specified versions of base images using build-args
- Final Docker image is now based on alpine instead of scratch, which makes the image slightly bigger, but enables the ability to execute a shell to inspect a running container
- Leverages .dockerignore to minimize the build context size
- Docker images now leverage standard opencontainers annotation labels
- gh-pages branch-based static asset (primarily, the default archives list) serving is retired in favor of using the docs folder of the master branch to avoid maintaining common files in two branches
Distribution
- Changed Docker image location from the ibnesayeed to the oduwsdl DockerHub account, which is configured to auto-build
- Leverages GitHub Workflow Actions to publish Docker images to GitHub Package Registry as an alternate image registry
- Both image registries host images with the master tag on each push to the master branch and two tags, latest and {version}, for each release, ensuring that the default latest tag always points to the latest released version, not the bleeding edge master branch
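The tagging policy above can be summarized as a small mapping from a Git ref to the image tags that get published. The sketch below is only an illustration in Go; it assumes that releases correspond to Git tags and that the Git tag name is used verbatim as the {version} image tag. The function name imageTags is hypothetical and does not reflect how the actual workflows are written.

```go
package main

import (
	"fmt"
	"strings"
)

// imageTags returns the Docker image tags to publish for a Git ref:
// pushes to master get the "master" tag, release tags get "latest"
// plus the version, and anything else publishes nothing.
// Assumption: the Git tag name doubles as the image version tag.
func imageTags(ref string) []string {
	switch {
	case ref == "refs/heads/master":
		return []string{"master"}
	case strings.HasPrefix(ref, "refs/tags/"):
		version := strings.TrimPrefix(ref, "refs/tags/")
		return []string{"latest", version}
	default:
		return nil
	}
}

func main() {
	refs := []string{"refs/heads/master", "refs/tags/1.0-rc8", "refs/heads/feature"}
	for _, ref := range refs {
		fmt.Printf("%-20s -> %v\n", ref, imageTags(ref))
	}
}
```

Under this scheme, pulling the image without an explicit tag resolves to the most recent release rather than to the tip of master.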
Documentation
- Added project citation information with JCDL 2016 poster publication and a link to the paper
- Updated README to reflect the recent API changes
- Updated Docker-related section in the README with steps to build images locally from the source, pulling from the two registries, and running containers in one-off and server modes
- Documented automated archival exclusion feature for archives that misbehave at run time
- Updated CLI usage to reflect new changes
- Updated instructions for running and building from source (for developers) and for cross-platform binary building
Please test it and provide feedback to push it towards the final release.