Problem: packages don't have a status #1081

Open
sallain opened this issue Nov 25, 2024 · 10 comments
Assignees: jraddaoui
Labels: analysis (The action on this issue is to provide analysis), Client: SFA

Comments

@sallain
Collaborator

sallain commented Nov 25, 2024

Is your feature request related to a problem? Please describe.

Enduro provides a status for workflows but not for packages. This hasn't really been a problem, but as we complete the analysis for an AIP deletion workflow (#1076) it seems desirable to include a DELETED status for an AIP. As it is, if a package is deleted then the package's workflow status would be updated to DONE, but a user would have to click into the package detail page and review the deletion workflow to see that it has in fact been deleted.

In order to be a one-stop source of information about an institution's preserved holdings, Enduro should maintain a record of all SIPs that have been processed, including those that are deleted. By glancing at the packages table, a user should be able to see if a package is stored, in progress/processing, or deleted.


Describe the solution you'd like

Add a package attribute for status, with the following values (a rough sketch follows the list):

  • PROCESSING - the package is being used in an active workflow
  • STORED - the package has been successfully processed and sent to the final storage location
  • DELETED - the package has been removed from the system
  • FAILED - the workflow errored out and the package could not be stored/deleted <-- this status needs opinions
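For concreteness, here is a minimal Go sketch of what such an attribute could look like, assuming a string-backed enum similar to the existing workflow status type; the type and constant names are illustrative only, not existing Enduro code:

```go
package enums

// PackageStatus is an illustrative sketch of the proposed package attribute.
// The real type name, package, and serialization would follow Enduro's
// existing enum conventions.
type PackageStatus string

const (
	// The package is being used in an active workflow.
	PackageStatusProcessing PackageStatus = "PROCESSING"
	// The package has been successfully processed and sent to the final
	// storage location.
	PackageStatusStored PackageStatus = "STORED"
	// The package has been removed from the system.
	PackageStatusDeleted PackageStatus = "DELETED"
	// The workflow errored out and the package could not be stored/deleted
	// (this one still needs opinions).
	PackageStatusFailed PackageStatus = "FAILED"
)
```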

Describe alternatives you've considered

We considered whether we could repurpose the locations column to indicate that a package has been deleted, since a deleted package's location would now be blank. However, this seems a little messy.

sallain added the Client: SFA and analysis labels Nov 25, 2024
sallain added this to Enduro Nov 25, 2024
@sallain
Collaborator Author

sallain commented Nov 25, 2024

@fiver-watson can you start thinking about how to incorporate this status in the UI?

sallain moved this to ⏳ In Progress in Enduro Nov 25, 2024
@fiver-watson
Contributor

fiver-watson commented Nov 26, 2024

Additional thoughts on this issue

Existing enums and entities

While we agree that the current statuses used on package tables are in fact statuses of the running workflow and not of the package itself, the developers have pointed out that until recently we did not have an entity for "Preservation Actions", and as such the current enum in the code is labeled as package status - see:

Currently these two enums mostly duplicate each other, minus the unused statuses. I am not sure where each is used, but I am hoping that with this work we can better distinguish the two.

Another possible package status needing opinions:

Depending on when in a workflow we create database entries etc, we may want to have a QUEUED status for packages as well. For example, if a "Create AIP" workflow is initiated with a new package entering the system, but no workers are available, we may want to create the package entity and capture whatever initial information is known about it.
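If QUEUED were added, the overall package lifecycle might look roughly like the transition map below. This is a hypothetical discussion aid, not implemented behaviour, and it assumes the statuses proposed above:

```go
package enums

// Hypothetical package lifecycle, assuming the proposed statuses plus QUEUED.
// A discussion aid only, not an implemented state machine.
var packageTransitions = map[string][]string{
	"QUEUED":     {"PROCESSING"},            // a worker becomes available and picks up the package
	"PROCESSING": {"STORED", "FAILED"},      // the workflow completes or errors out
	"STORED":     {"PROCESSING", "DELETED"}, // e.g. a later move or delete workflow runs
	"FAILED":     {},                        // terminal; retention is the open question below
	"DELETED":    {},                        // terminal
}
```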

On the proposed FAILED status

I can see how this will be useful, much like my proposed QUEUED status but at the other end of the workflow. I do think it would be good for us to consider how to avoid maintaining a list of failed package ingest attempts indefinitely, however.

Example: let's just imagine that some stubborn archivist is trying to learn the validation rules through trial and error. Every time the package fails, they make changes based on the first validation error reported, and no more. In the end, they attempt to ingest the same package 10 times, before it finally succeeds on the 11th.

Each package attempt will have an identical name, with nearly identical details about its contents (depending on how far it gets), though I think each attempt will be assigned a different UUID by Enduro (as a separate unique attempt). If we maintain that information indefinitely, then every time this user wants to search for their package, they will have to wade through 10 other incorrect results to find the desired "STORED" package.

Now multiply this by thousands of packages over the course of a year. Over time, I suspect that the user experience of the system will degrade quickly when there is so much noise.

How can we keep useful information about failed packages and workflows for long enough to be useful, but not so long that it simply becomes extra noise impeding the use of the system?

UI Design proposal

Ideally, I would like to tie this into the work proposed in Enduro issue #955, when we reorganize the package details page.

I think that at a glance, package status will be less important to determine than workflow status, as operators will be mainly using Enduro as an ingest engine and not a digital repository for long-term storage management, etc. Consequently, I don't want to add another "badge" that will draw the eye in the same way as the current workflow statuses, and potentially lead to momentary confusion as users try to determine which status is which.

My suggestion is that we use bolding and color (with a high enough contrast ratio to ensure accessibility - if needed, I can propose hex codes for some of the statuses), but no badge for the status, and simply add it to the package details.

We can also take this opportunity in the current UI to remove the duplicate workflow status badge being shown in the package details area, and replace it with a package status instead, like so:

[Image: pkg-details-w-status_current]

Once issue #955 is implemented, the redesigned package details could look like so:

[Image: pkg-details-w-status_redesign]

Thoughts and feedback welcome!

@fiver-watson
Contributor

fiver-watson commented Nov 26, 2024

An additional thought for consideration:

I don't know how much of a headache this might be for the devs, and I'm happy to think of alternatives, but even with a proposed new status I kind of liked Sara's original idea of reusing the location column on the package browse page to just show "DELETED" (or "REMOVED" or similar) instead of a location.

I do think it's helpful to be able to tell in SOME way from the package browse page (without having to click through to the package details) that the package is deleted, and the current table already feels overloaded - I don't personally think adding an additional column is a good idea. Meaning: until we decide to redesign the package browse page, this might be a decent workaround, if it doesn't make the devs 🤯 😡 (╯°□°)╯︵ ┻━┻

Super simple mockup done via browser:

[Image: browser mockup]

sallain assigned jraddaoui and unassigned fiver-watson Dec 2, 2024
@jraddaoui
Collaborator

Thanks for putting this together and sharing your thoughts @sallain and @fiver-watson.

This raises again the discussion we had about Enduro being more of an orchestrator than an ingest application, as it is again mixing in concepts from the storage domain. Depending on the decisions we make, we may go even further in that direction, which may be okay, but it should be clear to everybody.

Ingest vs orchestrator

I already consider what we call ingest to be more of an orchestrator. For example, this is what the ingest/processing workflow does (a rough sketch follows the list):

  • Executes a preprocessing child workflow (if enabled).
  • Calls and waits for the preservation system (a3m or Archivematica).
  • Stores the AIP:
    • With a3m, uses Enduro Storage Service API and workflow.
    • With Archivematica, only keeps track of the AMSS location through configuration.
  • Triggers poststorage child workflows.
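As a rough sketch of that orchestration shape, using the Temporal Go SDK, it could look like the following; the workflow and activity names here are placeholders, not the real Enduro ones:

```go
package ingest

import (
	"time"

	"go.temporal.io/sdk/workflow"
)

// ProcessingWorkflow sketches the ingest/processing orchestration described
// above. Activity and child workflow names are placeholders.
func ProcessingWorkflow(ctx workflow.Context, pkgID string, preprocessingEnabled bool) error {
	ctx = workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
		StartToCloseTimeout: 2 * time.Hour,
	})

	// 1. Execute a preprocessing child workflow, if enabled.
	if preprocessingEnabled {
		if err := workflow.ExecuteChildWorkflow(ctx, "preprocessing", pkgID).Get(ctx, nil); err != nil {
			return err
		}
	}

	// 2. Call and wait for the preservation system (a3m or Archivematica).
	if err := workflow.ExecuteActivity(ctx, "run-preservation-system", pkgID).Get(ctx, nil); err != nil {
		return err
	}

	// 3. Store the AIP (Enduro SS workflow with a3m; AMSS location via
	// configuration with Archivematica).
	if err := workflow.ExecuteActivity(ctx, "store-aip", pkgID).Get(ctx, nil); err != nil {
		return err
	}

	// 4. Trigger poststorage child workflows.
	return workflow.ExecuteChildWorkflow(ctx, "poststorage", pkgID).Get(ctx, nil)
}
```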

I think it's okay to orchestrate all that in an ingest workflow/application but ...

Move preservation task/workflows

A better example is the move operation. Right now it is not clear that we already have two services, because we serve both APIs together and have a single user interface. However, internally these services are called the "Package service" (what we call ingest, which I think we should consider an orchestrator) and the "Storage service". In the API spec they are separated by the package and storage paths, and there is some separation in the UI, "Packages" versus "Locations". Both services have their own workflows, and the communication between them is done through the API (some "package" workflows call the "storage" API).

If you consider the "package" service an ingest application, I think the current statuses are okay: once processing/ingest is done, the package should be stored, because it didn't fail. However, when we added the move operation, instead of doing it only in the storage domain, we wanted to keep track of the location where the AIP is stored and a preservation action for the move in the ingest domain. I think that's where the separation between services starts to blur and the "package" service becomes more of an orchestrator than an ingest application.

If you separate both services, you could show where the AIP was sent in the ingest application and "relate" the package in both services by the UUID (which we already do). But everything after that should only happen in the storage domain, and because we share the UI and API it should be easy to redirect to a storage/aip/:uuid* page, where the move and delete operations can be triggered without worrying about the ingest domain anymore. This approach could solve the issue about removing ingest data after a while that Dan mentioned above, and I think this is similar to the AM/AMSS approach.

However, we wanted to keep track of the move operation and the location UUID in the "package" service, so we added the operation to the ingest/package/:uuid* page, and the current process is as follows (sketched after the list):

  • Click move from the package page.
  • Request is sent to the package service API.
  • "package-move"** workflow is started in the package service.
  • Request is sent to the storage service API from that workflow.
  • "storage-move"** workflow is started in the storage service.
  • "package-move" workflow polls for status using the storage service API.
  • "storage-move" workflow ends.
  • Package status is updated and a preservation action/task is created.
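Roughly, the "package-move" side of that sequence could be sketched like this with the Temporal Go SDK; the activity names, result type, and polling interval are assumptions for illustration, not the real implementation:

```go
package ingest

import (
	"time"

	"go.temporal.io/sdk/workflow"
)

// PackageMoveWorkflow sketches the cross-service move flow described above.
// Activity names, types, and the polling interval are illustrative.
func PackageMoveWorkflow(ctx workflow.Context, aipID, locationID string) error {
	ctx = workflow.WithActivityOptions(ctx, workflow.ActivityOptions{
		StartToCloseTimeout: time.Minute,
	})

	// Ask the storage service API to start its "storage-move" workflow.
	if err := workflow.ExecuteActivity(ctx, "request-storage-move", aipID, locationID).Get(ctx, nil); err != nil {
		return err
	}

	// Poll the storage service API until the move is reported as finished.
	for {
		var done bool
		if err := workflow.ExecuteActivity(ctx, "poll-storage-move", aipID).Get(ctx, &done); err != nil {
			return err
		}
		if done {
			break
		}
		if err := workflow.Sleep(ctx, 10*time.Second); err != nil {
			return err
		}
	}

	// Update the package status and record the preservation action/task.
	return workflow.ExecuteActivity(ctx, "record-move-outcome", aipID, locationID).Get(ctx, nil)
}
```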

Ingest separation

If we want to separate the concerns a bit better, we could still use the statuses we have in the ingest domain, adding new statuses only in the storage domain and moving the move and delete operations there, without worrying about them on ingest. We could add package list and package view pages in that domain, giving us two different package pages, one for the ingest domain and another for the storage domain, keeping the information for each service there, with different search filters and so on. It would also help with clearing old data from the ingest domain when needed.
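As a sketch of that separation, each domain would own its own enum; the type names and values below are examples only, not the existing Enduro code:

```go
package enums

// Illustrative only: one status enum per domain. Ingest keeps statuses like
// the ones it has today, while the new AIP statuses live in the storage
// domain alongside the move and delete operations.
type IngestStatus string

const (
	IngestStatusInProgress IngestStatus = "IN PROGRESS"
	IngestStatusDone       IngestStatus = "DONE"
	IngestStatusError      IngestStatus = "ERROR"
)

type AIPStatus string

const (
	AIPStatusStored  AIPStatus = "STORED"
	AIPStatusDeleted AIPStatus = "DELETED"
)
```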

Orchestrator approach

Keeping the current functionality as it is will require adding new statuses as suggested above. It makes the package service kind of the source of truth, where you know all the operations performed on a package over time. Like the move operation, deletion should also happen on the package service to keep track of that operation, adding an extra layer on top of the actual operation in the storage service.

UI mix

Similar to what we have now, we could have a better separation in the backend but keep a single package page in the UI, where information from the package and storage services is mixed to provide different actions or to calculate an overall status. But this complicates searching and will make it harder to split the applications later if needed.
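For example, "calculate an overall status" in that mixed page could boil down to something like this; purely illustrative, and the status values are assumed:

```go
package ui

// overallStatus derives a single display status from data the mixed UI would
// fetch from the ingest ("package") and storage services. Purely illustrative;
// the status values are assumptions.
func overallStatus(ingestStatus, aipStatus string) string {
	switch {
	case aipStatus == "DELETED":
		return "DELETED"
	case ingestStatus == "IN PROGRESS":
		return "PROCESSING"
	case aipStatus == "STORED":
		return "STORED"
	default:
		return ingestStatus
	}
}
```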

Conclusion

In my opinion, we should already have a better separation of concerns in these services. As I commented above, the move and delete operations should only be part of the storage service; instead of mixing the information from both services in the same page, we could have two different package list and view pages based on the domain, allowing us to show only the information, actions, filters, etc. related to that domain. That would make it easier to clear old ingest packages while keeping the storage information, and also easier to separate the applications in the future.

Please, let me know your thoughts!


* URLs are the ones I'd use to make the separation clear between domains, not the existing ones.
** Workflow names are prefixed to make the internal service separation clear.

@fiver-watson
Contributor

I think you have outlined the issue wonderfully, @jraddaoui - thanks. And I agree that we need to make some decisions around this, as it will become increasingly challenging to manage if we are not clear on how we plan to extend the functionality of the application into other bounded contexts.

Please forgive me in advance if this is a massive misunderstanding of the underlying technical issues...

I wonder if there are not some other options that compromise between these? For example...


Common source of truth, choreographed services

  • Each service is responsible for its own operations, and for saving any important information from those operations
  • Each service shares a common underlying persistence layer - ideally this is optimized for both current state and historical state of operations.
  • Services can request any data needed from the persistence layer, but do not talk directly to each other
  • Services can listen for triggering events from other services and trigger their own workflows or events in response, but ultimately only manage information in their context.
  • The frontend is modular and composed of elements from across services - operators only interact with one UI, and that UI can get any data it needs from the common source of truth for current states:

Something like this-ish?

modular-architecture-enduro


It is possible that sharing a common db might make data modeling difficult and inefficient. Perhaps another option would be:

Choreographed services with a new front-end service

  • Similar separation of concerns for the services
  • Still the possibility of adding a shared Event Store for history, that listens to all services
  • Each service is responsible for persisting its own mission-critical data
  • A new front-end service listens for events and copies any relevant service data to a common index to serve the UI's search functionality and things we want to load fast (see the sketch after this list)
  • Any rarely used data in the front end can be requested on-demand via api from the responsible service
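A loose sketch of that front-end indexer idea, assuming a generic event type and index interface; nothing here corresponds to existing Enduro code:

```go
package frontend

import "context"

// Event is an assumed, generic cross-service event.
type Event struct {
	Type    string            // e.g. "package.stored", "aip.deleted"
	AIPID   string            // shared identifier across services
	Payload map[string]string // whatever the emitting service chooses to share
}

// Index is the common search index that serves the UI.
type Index interface {
	Upsert(ctx context.Context, aipID string, fields map[string]string) error
}

// Indexer listens to events from all services and keeps a denormalized read
// model for search and fast-loading views; rarely used data would still be
// fetched on demand from the owning service's API.
type Indexer struct {
	idx Index
}

func (i *Indexer) Handle(ctx context.Context, e Event) error {
	switch e.Type {
	case "package.stored":
		return i.idx.Upsert(ctx, e.AIPID, map[string]string{
			"status":   "STORED",
			"location": e.Payload["location"],
		})
	case "aip.deleted":
		return i.idx.Upsert(ctx, e.AIPID, map[string]string{"status": "DELETED"})
	default:
		return nil // events this read model doesn't care about are ignored
	}
}
```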

Or, finally, the third consideration I have is that if we want an orchestrator approach, that's fine - but perhaps it should be a separate new thing from the ingest service?

I guess I am trying to avoid the old pre-processing approach that happened, where it felt like one application that should have been a bounded context ended up holding all the domain logic and state of truth for the others.

New orchestrator and API gateway?

Perhaps it is time to consider some variation of what we originally tried, i.e.

  • An API gateway sits between the common front-end and the backend services
  • The gateway is managed by a new cross-service orchestrator
  • Ingest is just one more service / bounded context communicating with the orchestrator
  • etc etc

I suspect there are likely many things I have suggested that aren't practical, don't make sense, or even just repeat what you are already suggesting and I just didn't understand - apologies in advance. Just wanted to make sure we are considering all options!

@jraddaoui
Collaborator

Thanks for looking into this @fiver-watson, and for going even further.

I think there are some good points there, but most of those ideas change the principle we discussed about using public APIs to communicate between bounded contexts. I think sharing the persistence layer is not a good idea. I may not be understanding your suggestion: would each service need to know about the schema of the others to access their data? Who takes care of versioning/migrations?

Having an event store was something we also discussed; it could be a good option for communicating between services. For example, if a package is deleted in storage, it can emit an event and the systems that care about that can listen and act. Instead of doing that, we decided to trigger the operation in ingest and use workflows to keep track of the operation and public APIs to communicate between services. My suggestion now is that the ingest application doesn't need to know about the move/deletion operations done in storage. But it could be a good solution for other communications in the future.

An API gateway is something that I always have a hard time seeing. In the end it is similar to the UI mix, but doing the mix in an intermediary service instead of in the UI. I think it depends on how much we want a single UI for all services versus individual ones. My idea, if we separate the ingest application, is to use the same UI but still have the separation of concerns we kind of have right now. I'll follow up about this in another comment.

@mcantelon
Contributor

mcantelon commented Dec 5, 2024

Having multiple package list and view pages based on the domain (orchestration vs ingest/"service") makes sense to me as well (a unified interface would be nice, but maybe harder to keep coherent long-term). The orchestration package view page could potentially show data relating to ingest/"service" when appropriate.

Maybe the pkg_status.go filename could end up being used for the orchestration status, with a new pkg_ingest_status.go used for the ingest status.

Creating a generalized framework for "services" - making them akin to "plugins" (and possibly something that third parties could contribute in the future) - definitely seems appealing, but could be more time consuming to get right versus conventional application subsections.

@fiver-watson
Contributor

I personally think that we should definitely try to aim for a single common UI as much as possible, regardless of how we choose to separate contexts / tech stacks / responsibilities in the back.

From a marketing and usability perspective, it will be a lot harder to convince potential clients / users that they need 4 or more different applications in the future (e.g. Enduro Ingest, Preserve, Store, some kind of metadata manager, some kind of public access system, some kind of reporting tool, etc...) if we can't abstract that away from their actual daily experience. If each application has its own UI, there is also more likely to be drift in terms of user experience across the UIs, adding cognitive load to the end user (e.g. Right, the edit button is up here in this view, but if I go to this view, it's down here now, and it's buried in the context menu on this app, etc...). If I use Spotify, I don't know I am installing 10 different apps with different bounded contexts and managed by different teams - I get the experience of one seamless app. Given the common "Enduro" branding, I personally think that is a goal we should keep in mind. That said, perhaps PLT might have thoughts on this as a high-level long term goal.

And yeah, I snuck an event store into my diagram, because I do really think there is a lot of good overlap for us to explore, in terms of the archival domain's focus on chain of custody, authenticity, and capturing EVENTS for every step along the way, and the way that event-based systems work. It might be too soon for us to go deep on that, but when we start thinking about versioning, reingest, etc. I think that they could be invaluable. In terms of recovery too, being able to replay events to return to a previous state could possibly help us in many ways.

That said.... if we are going this far in thinking more broadly about the long term aims of the project and its architecture, I would also love us to talk with @jhsimpson more about some of the ideas he has been exploring, like some of the patterns in this miro board, etc...

I think sharing the persistence layer is not a good idea. I may not be understanding your suggestion: would each service need to know about the schema of the others to access their data?

I trust your thoughts on this - I was mainly trying to avoid the search issues that a common UI might introduce. That said, I think that a shared event store with each service persisting its own data could still work fine - in the UI design, we put search boxes on specific pages (much like AtoM has a dedicated actors search on the actor browse page, etc.) rather than a global one. In cases where one service needs data from the others (e.g. being able to show the package location in the package browse or view pages), listening to the events and just persisting what is needed in the local context could help with that. There is some minor data duplication, but in cases of conflicts it would be pretty easy to determine which service is the source of truth for which datum.

Mostly I just wanted to make sure that we weren't considering only two options - I think more variations are possible, and I hope that smarter folks than me can help to consider them!

@fiver-watson
Contributor

Also, just to bring all of this back to the original issue....

Regardless of how we choose to separate the concerns, adding a package status shouldn't necessarily touch on this boundary. Sure, one of the proposed statuses is "STORED", but that is still a status of the package - we don't need to know from that status where it is stored, just that it was successfully stored.

The original issue here was separating the status of the workflow (where there will eventually be many different workflows - e.g. delete AIP, move AIP, reingest package, generate DIP, etc.) from the status of the package.

Just so we don't lose sight of the immediate work under discussion here!

@djjuhasz
Collaborator

djjuhasz commented Dec 9, 2024

I think one possible future architecture we should consider is eliminating the Enduro Storage Service altogether in favour of using the Archivematica Storage Service. This is a radical change to Enduro, but I think there is a way to get there from here using a facade pattern approach. I also think this has the possibility of allowing Enduro to use the AMSS or Enduro SS.

Here's my very simplified (and probably slightly incorrect) attempt at an architecture diagram of Enduro + AM right now:
[Image: Enduro + AM Architecture proposal - Current Architecture - Simplified]

I hope @jraddaoui will correct any errors I've made when he's back.

And here's a proposal for an Enduro + AM software architecture with no Enduro SS:
[Image: Enduro + AM Architecture proposal - Proposed Architecture - No Enduro SS]

In the proposed architecture I think you could potentially use an adapter to replace the AMSS with an Enduro SS, or another storage service. My thinking is to treat the Storage Service as a fully separate app and service, which makes it much easier to replace independently of the rest of Enduro. I think working against the AMSS has the advantage that the AMSS is already a completely separate app and supports multiple backend storage systems. Of course, using the AMSS also comes with all the disadvantages of the AMSS: an unusual and often unpredictable API, possible performance limitations, and the AMSS's technical debt.

I think that getting to the proposed architecture would first involve the same steps that @jraddaoui suggested for Ingest separation:

  1. Separating the Enduro SS into its own API
  2. Separating the Enduro Dashboard into more distinct "Ingest" and "Storage" pages

Then we would have to disentangle the Enduro SS and AMSS by removing the way we proxy AIP downloads through the Enduro SS. After that we could develop an adapter or "shim" to convert Enduro SS API calls to AMSS API calls. We would also probably need to do some development in the AMSS API to make it more predictable and to add any missing functionality.
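The adapter/"shim" step could take roughly this shape, where the interface and method names are placeholders rather than the real Enduro SS or AMSS APIs:

```go
package storage

import "context"

// StorageService is what the rest of Enduro would depend on. Placeholder
// methods for illustration only.
type StorageService interface {
	StoreAIP(ctx context.Context, aipID, path string) error
	DeleteAIP(ctx context.Context, aipID, reason string) error
	AIPLocation(ctx context.Context, aipID string) (string, error)
}

// amssClient stands in for a hypothetical client of the AMSS API.
type amssClient interface {
	CreatePackage(ctx context.Context, aipID, path string) error
	RequestDeletion(ctx context.Context, aipID, reason string) error
	PackageLocation(ctx context.Context, aipID string) (string, error)
}

// amssAdapter implements StorageService by translating calls into AMSS API
// requests, so the AMSS (or another backend) can sit behind the same facade.
type amssAdapter struct {
	client amssClient
}

func (a *amssAdapter) StoreAIP(ctx context.Context, aipID, path string) error {
	return a.client.CreatePackage(ctx, aipID, path)
}

func (a *amssAdapter) DeleteAIP(ctx context.Context, aipID, reason string) error {
	return a.client.RequestDeletion(ctx, aipID, reason)
}

func (a *amssAdapter) AIPLocation(ctx context.Context, aipID string) (string, error) {
	return a.client.PackageLocation(ctx, aipID)
}
```

The point being that the rest of Enduro would only depend on the storage interface, so an Enduro SS-backed implementation could later be swapped in behind the same facade.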

Here's a link to the original Miro board where I drew the architecture diagrams:
https://miro.com/app/board/uXjVL56lagQ=/?share_link_id=775609888575

[Edited the diagrams several times to correct errors]

fiver-watson moved this from ⏳ In Progress to ❗ Blocked / Needs Review in Enduro Dec 12, 2024