Archelon is the Web front-end for a Fedora 4 repository-based set of applications known collectively as "umd-fcrepo". The umd-fcrepo system consists of the following parts:
- umd-fcrepo-docker - a set of Docker images for running the Fedora repository platform
- Plastron - a utility application for performing batch operations on the Fedora repository
- Archelon - a web GUI providing an administrative interface for Fedora
While Archelon is technically able to run without access to any other application, its functionality is extremely limited without Plastron or the applications provided by umd-fcrepo-docker.
Archelon consists of the following components when run in a production environment:
- A Rails application providing an administrative interface to the Fedora repository. It uses the Blacklight gem for providing Solr search functionality.
- A STOMP listener application for communicating with Plastron using the STOMP messaging protocol via ActiveMQ
- An SFTP server, used to upload files for inclusion in import jobs
Archelon interacts directly with the following umd-fcrepo components:
- ActiveMQ - Archelon communicates to Plastron using STOMP messaging mediated by ActiveMQ queues.
- Solr - Archelon communicates directly with the Solr instance in the "umd-fcrepo-docker" stack for metadata search and retrieval.
- Plastron - Archelon uses the HTTP REST interface provided by Plastron to retrieve information about export and import jobs (some export/import status information is also provided via STOMP messaging).
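To make the STOMP exchange a little more concrete, the following is a minimal sketch of how a STOMP frame is laid out on the wire. The `build_stomp_frame` helper, the destination name, and the message body are hypothetical. Archelon's listener presumably uses a STOMP client library rather than hand-built frames, and header escaping is not handled here.

```ruby
# Build a STOMP frame: command line, header lines, a blank line,
# the body, and a terminating NUL byte.
# Illustration only; the destination and headers used below are assumed names.
def build_stomp_frame(command, headers, body = '')
  header_lines = headers.map { |name, value| "#{name}:#{value}" }
  ([command] + header_lines).join("\n") + "\n\n" + body + "\0"
end

frame = build_stomp_frame(
  'SEND',
  { 'destination' => '/queue/example.jobs', 'content-type' => 'application/json' },
  '{"job":"export"}'
)
puts frame.inspect
```

ActiveMQ sits between the two applications as the broker, so Archelon and Plastron each only need to agree on the queue names and message format.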
See Installing Prerequisites for information on prerequisites on a local workstation.
There are several ways to set up the umd-fcrepo system -- see umd-lib/umd-fcrepo/README.md for information about setting up a local development environment for Archelon.
The following are the basic steps to run the Archelon Rails application. Archelon requires other components of the umd-fcrepo system to enable most functionality.
- Check out the code and install the dependencies:
git clone [email protected]:umd-lib/archelon.git
cd archelon
yarn
bundle install
- Create a .env file from the env_example file and fill in appropriate values for the environment variables.
- Set up the database:
rails db:migrate
- (Optional) Load sample "Download URL" data:
rails db:reset_with_sample_data
- In three separate terminals:
- Start the STOMP listener:
rails stomp:listen
- Start the Delayed Jobs worker:
rails jobs:work
- Run the web application:
rails server
Archelon will be available at http://archelon-local:3000/
Archelon requires other components of the umd-fcrepo system to enable most functionality.
- Check out the repo, and open the codebase in VSCode:
git clone [email protected]:umd-lib/archelon.git
cd archelon
code .
- When opening the codebase, there will be a notification to reopen the directory in a dev container. Select "Reopen in Container".
ℹ️ Note: If there isn't a notification, you can also open the command palette (cmd+shift+p) and type “Dev Containers: Rebuild and Reopen in Container”.
The dev container will take a moment to build the Docker image and install the JavaScript and Ruby dependencies.
- Create a .env file from the env_example file, and add these environment variables:
  - LDAP_BIND_PASSWORD (obtained from LastPass)
  - FCREPO_AUTH_TOKEN (obtained by generating a JWT token from the local fcrepo stack)
- Run yarn install:
yarn install
- Open three separate terminals in VSCode and run these respectively in each:
- Start the STOMP listener:
rails stomp:listen
- Start the Delayed Jobs worker:
rails jobs:work
- Run the web application:
rails server
Archelon will be available at http://archelon-local:3000/
Note: If a 403 Not Authorized error occurs when visiting the site, try opening the page in a private browser window.
By default, the development environment for Archelon will log at the DEBUG level, while all other environments will log at the INFO level. To change this, set the RAILS_LOG_LEVEL environment variable in your .env file.
In the development environment, the log will be sent to standard output and the log/development.log file, as is standard in Rails applications.
In production, set the RAILS_LOG_TO_STDOUT environment variable to true to send the log to standard output.
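The logging rules above can be sketched in plain Ruby as follows. This is an illustration of the described behavior, not Archelon's actual configuration code: the `resolve_log_level` and `log_to_stdout?` helpers are hypothetical, and comparing RAILS_LOG_TO_STDOUT against the literal string "true" is an assumption.

```ruby
require 'logger'

# Resolve the log level: an explicit RAILS_LOG_LEVEL wins; otherwise
# development defaults to DEBUG and every other environment to INFO.
# An unrecognized level name raises NameError in this sketch.
def resolve_log_level(rails_env, env = ENV)
  name = env['RAILS_LOG_LEVEL'].to_s.strip
  name = (rails_env == 'development' ? 'debug' : 'info') if name.empty?
  Logger.const_get(name.upcase)
end

# Decide whether log output should also go to standard output.
# Assumption: only the exact value "true" enables it outside development.
def log_to_stdout?(rails_env, env = ENV)
  rails_env == 'development' || env['RAILS_LOG_TO_STDOUT'] == 'true'
end

logger = Logger.new($stdout)
logger.level = resolve_log_level('production', { 'RAILS_LOG_LEVEL' => 'warn' })
logger.warn('shown at WARN level')
logger.info('suppressed at WARN level')
```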
In general, Archelon requires a CAS login to access the application, and the user must have been added to the system by an administrator.
Two notable exceptions are the "ping" endpoint and "public keys" endpoint (there are also some other minor endpoints, such as import/export status updates).
The "ping" endpoint is unrestricted, and is suitable for monitoring the health of the application.
The "public keys" endpoint returns a JSON list of the public keys allowed to SFTP to the Archelon server. While these are public keys, and hence not technically a security issue, current SSDR policy is to limit access to this endpoint to "localhost", or nodes in the Kubernetes cluster.
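How the "public keys" restriction is enforced is not described here (it may well be handled outside the Rails application). As a rough sketch of the policy itself, the check below accepts loopback addresses and addresses inside an allowed network. The `public_keys_access_allowed?` helper and the 10.42.0.0/16 range are hypothetical placeholders, not real cluster values.

```ruby
require 'ipaddr'

# True when the address is loopback ("localhost") or falls inside one of
# the allowed networks. Unparseable addresses are rejected.
def public_keys_access_allowed?(remote_ip, networks)
  address = IPAddr.new(remote_ip)
  address.loopback? || networks.any? { |network| network.include?(address) }
rescue IPAddr::InvalidAddressError
  false
end

# Placeholder CIDR standing in for the Kubernetes cluster's node network.
cluster_networks = [IPAddr.new('10.42.0.0/16')]

puts public_keys_access_allowed?('127.0.0.1', cluster_networks)
puts public_keys_access_allowed?('192.0.2.10', cluster_networks)
```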
Archelon uses the boathook gem to provide Rake tasks for building and pushing Docker images, as described in the following Dockerfiles:
Dockerfile | Image Name | Application |
---|---|---|
Dockerfile | docker.lib.umd.edu/archelon | main Rails application |
Dockerfile.sftp | docker.lib.umd.edu/archelon-sftp | SFTP server for import/export |
Usage:
# list the images that would be built, and the metadata for them
rails docker:tags
# builds the images
rails docker:build
# pushes to docker.lib.umd.edu hub
rails docker:push
See umd-lib/umd-fcrepo/README.md for information about setting up a local development environment for Archelon using Docker.
When running locally in Docker, the Archelon database can be accessed using:
# Archelon database backing the Archelon Rails app
psql -U archelon -h localhost -p 5434 archelon
It is possible to build a multi-platform Docker image using the docker buildx command and targeting both the linux/amd64 and linux/arm64 platforms. As long as there is a builder named "local" configured for buildx, the following will build and push a multi-platform image:
docker buildx build \
--builder local \
--platform linux/amd64,linux/arm64 \
--tag docker.lib.umd.edu/archelon:latest \
--push .
See Rake Tasks
Archelon has the ability to create one-time use URLs, which allow a Fedora binary file to be downloaded. The random token used for the URLs, and other information, is stored in the DownloadUrl model.
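The one-time-use behavior can be illustrated with a small in-memory sketch. The real DownloadUrl model is database-backed and stores more information. The `issue_token` and `redeem_token` helpers, the token length, and the example binary URL are all assumptions made for illustration.

```ruby
require 'securerandom'

# Register a binary's URL in the store and return a random token for it.
# The store is a plain Hash standing in for the DownloadUrl table.
def issue_token(store, binary_url)
  token = SecureRandom.hex(16)
  store[token] = binary_url
  token
end

# Return the binary URL the first time a token is presented and nil on
# any later attempt, since the entry is removed when it is redeemed.
def redeem_token(store, token)
  store.delete(token)
end

store = {}
token = issue_token(store, 'http://fcrepo.example/rest/example-binary')
puts redeem_token(store, token).inspect # first use succeeds
puts redeem_token(store, token).inspect # second use yields nil
```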
In production, the URL that patrons use to retrieve the files does not reference the Archelon server directly, relying instead on a virtual host, which proxies back to Archelon.
The base URL of the virtual host (i.e., the entire URL except for the random token, but including a trailing slash) should be set in the RETRIEVE_BASE_URL environment variable. This base URL should be proxied to the <ARCHELON_SERVER_URL>/retrieve/ path.
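As a sketch of how the two URLs relate, the helpers below build the patron-facing URL from RETRIEVE_BASE_URL and the corresponding path on the Archelon server. The `retrieve_url` and `proxied_url` names and both hostnames are hypothetical; only the trailing-slash rule and the /retrieve/ path come from the description above.

```ruby
# Build the patron-facing URL. The base URL must already end in "/",
# so the token is appended directly.
def retrieve_url(base_url, token)
  raise ArgumentError, 'base URL must end with a trailing slash' unless base_url.end_with?('/')
  "#{base_url}#{token}"
end

# Build the URL the virtual host would proxy the request to on Archelon.
def proxied_url(archelon_server_url, token)
  "#{archelon_server_url.chomp('/')}/retrieve/#{token}"
end

token = 'abc123' # placeholder; real tokens are random
puts retrieve_url('https://downloads.example.edu/', token)
puts proxied_url('https://archelon.example.edu', token)
```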
Archelon has the ability to create one-time use URLs for downloading files from Fedora. Since downloading files may take considerable time, it is necessary that the production Archelon server support concurrent operations.
File downloads are sent as a "streaming" response, so file downloads should start almost immediately, regardless of the size of the file. If large file downloads take a long time to start, the response may be buffered by the Rails server.
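The general idea behind a streaming response is that the body is produced in chunks rather than read into memory first. The sketch below shows that idea with a hypothetical `each_chunk` helper and an arbitrary chunk size. It is not Archelon's download code.

```ruby
require 'stringio'

# Yield the contents of an IO object in fixed-size chunks, so a consumer
# can start sending data before the whole source has been read.
# Returns an Enumerator when no block is given.
def each_chunk(io, chunk_size = 8192)
  return enum_for(:each_chunk, io, chunk_size) unless block_given?

  while (chunk = io.read(chunk_size))
    yield chunk
  end
end

source = StringIO.new('abcdefghij')
each_chunk(source, 4) { |chunk| puts chunk }
```

If a server in front of this enumerator collects every chunk before sending anything, the download only starts after the full file has been read, which matches the slow-start symptom described above.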
Rails disables concurrent operation when using the development environment. To enable it, edit the "config/environments/development.rb" file, and add the following line inside the Rails.application.configure block:
config.allow_concurrency = true
The batch export functionality relies on a running Plastron instance.
See BatchImport.
For information about the controlled vocabularies used when editing metadata, see Vocabulary.
By default, Archelon determines the user type for a user ("admin", "user" or "unauthorized") using the list of Grouper groups in the memberOf attribute returned from an LDAP server for that user.
The local development environment (or Docker container) can be run without connecting to an LDAP server by setting the LDAP_OVERRIDE environment variable. The LDAP_OVERRIDE environment variable should contain a space-separated list of Grouper group DNs that any authenticated user should receive.
The LDAP_OVERRIDE environment variable only works in the development Rails environment.
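A minimal sketch of the override logic, under stated assumptions: the `override_groups` and `user_type` helpers are hypothetical, the group DNs are invented placeholders (the real Grouper DNs are not listed here), and the rule that admin membership takes precedence is an assumption.

```ruby
# Split the space-separated LDAP_OVERRIDE value into a list of group DNs.
# Returns nil (meaning "ask LDAP instead") outside the development
# environment or when the variable is unset or blank.
def override_groups(rails_env, override)
  return nil unless rails_env == 'development'

  groups = override.to_s.split
  groups.empty? ? nil : groups
end

# Map a list of group DNs to one of the three user types.
def user_type(groups, admin_group:, user_group:)
  return 'admin' if groups.include?(admin_group)
  return 'user' if groups.include?(user_group)

  'unauthorized'
end

# Placeholder DNs for illustration only.
admin = 'cn=Example-Admins,ou=grouper,dc=example,dc=edu'
user = 'cn=Example-Users,ou=grouper,dc=example,dc=edu'

groups = override_groups('development', "#{user} #{admin}")
puts user_type(groups, admin_group: admin, user_group: user)
```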
GitHub (or a vulnerability scanner such as "bundler-audit") may report that this application is vulnerable to CVE-2015-9284, due to its use of the "omniauth" gem. More information about this vulnerability can be found at:
https://github.com/omniauth/omniauth/wiki/Resolving-CVE-2015-9284
As configured, this application uses CAS for authentication. Because the application does not use OAuth, it is not vulnerable to CVE-2015-9284.
The Rails "Action Cable" functionality is used to provide dynamic updates to the GUI.
See ActionCable for more information.
Archelon is configured to use the Delayed::Job queue adapter, with the delayed_job_active_record gem storing jobs in the database.
The delayed_cron_job gem is used to schedule jobs to run on a cron-like schedule.
The "CronJob" class (app/cron_jobs/cron_job.rb) should be used as the superclass, and all implementations should be placed in the "app/cron_jobs" directory.
CronJob implementations in the "app/cron_jobs" directory are automatically scheduled when the "db:migrate" Rails task is run, via the "db:schedule_jobs" Rake task (see lib/tasks/jobs.rake).
The "Changing the schedule" section of the "delayed_cron_job" README.md file indicates that when the "cron_expression" of a CronJob is changed, any previously scheduled instances will need to be manually removed.
In this implementation, the "db:schedule_jobs" task removes existing CronJob implementations from the database before adding them back in. Therefore, it should not be necessary to manually delete existing CronJobs from the database after modifying the "cron_expression" for a CronJob (as long as "db:schedule_jobs" or "db:migrate" is run after the modification).
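The remove-then-add approach can be sketched with plain hashes standing in for rows in the jobs table. The `reschedule` helper, the row layout, and the job class names are hypothetical; the actual task lives in lib/tasks/jobs.rake and works against the database.

```ruby
# Replace any previously scheduled rows for the known CronJob classes
# with fresh rows that carry the current cron expressions.
# Rows belonging to other jobs are left untouched.
def reschedule(scheduled_rows, cron_jobs)
  # Drop every existing row for a known CronJob class...
  kept = scheduled_rows.reject { |row| cron_jobs.key?(row[:job_class]) }
  # ...then add one new row per class with its current expression.
  kept + cron_jobs.map { |job_class, cron| { job_class: job_class, cron: cron } }
end

rows = [
  { job_class: 'ExampleCronJob', cron: '0 * * * *' },
  { job_class: 'OneOffJob', cron: nil }
]
# The expression for ExampleCronJob was edited from hourly to every 5 minutes.
puts reschedule(rows, { 'ExampleCronJob' => '*/5 * * * *' }).inspect
```

Because stale rows are removed before the new ones are added, running the step repeatedly does not accumulate duplicate schedules.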
An interactive demo displaying the React components provided by the application is available at:
http://archelon-local:3000/react_components
React components are documented using "React Styleguidist" (https://react-styleguidist.js.org/).
In the development environment, web-based interactive documentation can be accessed by running:
> yarn styleguidist server
and then accessing the documentation at http://localhost:6060/
See the "Documenting Components" page on the "React Styleguidist" website (https://react-styleguidist.js.org/docs/documenting) for information about writing documentation for the React components.
See the LICENSE file for license rights and limitations (Apache 2.0).