Stopping postprocessing service #9096
Comments
This is not correct. Postprocessing stores its data in a configurable store (nats by default). This store is persistent across restarts. That means postprocessing can be killed and restarted without losing data. It only needs to finish working on the events it is currently processing. Note: Almost all services are connected to the event stream. If this is a problem for graceful shutdown, all services have it.
Why? The order of the events does not matter to the postprocessing service.
The docs mention an in-memory store used by default, which seems to be accessible. Maybe we should either deprecate that option or at least remove it from the docs.
I assume the events are persisted in nats, so it depends on whether the metadata is also persisted or not. If the metadata is persisted, then we could stop the postprocessing service at any time. After restarting the service, we receive the next event, we get the metadata and we continue the postprocessing. The behavior would be similar to the service having a big delay. Since the in-memory store is a valid option (not yet deprecated / removed), we can't rely on the metadata being persisted. If it isn't persisted, we have to process any event relying on metadata before stopping the service; otherwise, by the time we need to process the event, its metadata will be gone, which will cause issues. A possible scenario is that we start the postprocessing of upload 10 and we need to stop the service at that point. If the metadata isn't persisted, we'd need to keep processing events until we're done with upload 10. However, there could be more events between the start and the end of upload 10. These events should be skipped because we don't want to start processing new uploads once we've received a stop signal. If we can guarantee that the metadata is persisted, it would simplify the handling because we could stop processing the events at any time.
Yes true, postprocessing docs are outdated.
Imo not needed. We can add a section to postprocessing docs that inmem should only be used for testing or in small installations. If files get stuck because of the postprocessing service dying, one needs to manually restart the postprocessing of these uploads.
I don't understand. What do you mean by
Let's keep it simple. If you restart your inmem postprocessing service you need to restart your uploads too. I think that is quite fair. We even have a simple ocis command to do so.
Initial solution included in #9048. There is a small problem though: it seems reva spawns a goroutine in order to deliver the events to the postprocessing service through a channel. However, this goroutine won't finish and will get stuck waiting for the event to be read from the channel. It's also unclear what happens further down, especially when using natsjs, because we aren't unregistering the connection.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 10 days if no further activity occurs. Thank you for your contributions.
This should be fixed in the meantime. Events are delivered concurrently in newer versions.
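To illustrate the stuck goroutine described above, here is a generic Go sketch. It is not reva's actual code: `deliver` and `exitsOnCancel` are hypothetical helpers. The idea is that a goroutine forwarding events through a channel also selects on a context, so it can exit even when nobody reads from the channel anymore.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// deliver forwards events from in to out until ctx is cancelled or in is closed.
// Selecting on ctx.Done() around the send lets the goroutine exit even if
// nobody is reading from out.
func deliver(ctx context.Context, in <-chan string, out chan<- string, done chan<- struct{}) {
	defer close(done)
	for {
		select {
		case <-ctx.Done():
			return
		case ev, ok := <-in:
			if !ok {
				return
			}
			select {
			case out <- ev:
			case <-ctx.Done():
				return // the reader is gone, give up on this event
			}
		}
	}
}

// exitsOnCancel reports whether the delivery goroutine terminates after
// cancellation although the consumer never reads the pending event.
func exitsOnCancel() bool {
	ctx, cancel := context.WithCancel(context.Background())
	in := make(chan string, 1)
	out := make(chan string) // unbuffered and never read: a plain send would block forever
	done := make(chan struct{})

	go deliver(ctx, in, out, done)
	in <- "upload-10"
	cancel()

	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("delivery goroutine exited:", exitsOnCancel())
}
```

A plain `out <- ev` without the surrounding `select` would leave the goroutine blocked once the consumer stops reading, which matches the behavior reported here.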
Is your feature request related to a problem? Please describe.
There isn't a clean way to gracefully stop the postprocessing service.
Describe the solution you'd like
The postprocessing service should be able to be stopped in a graceful way on demand. Right now, the service is killed, which could lead to problems due to inconsistent states that could happen if the service is abruptly stopped.
In order to provide a graceful shutdown of the services there are some things to consider:
This should follow the standard procedure of a server shutdown: no new events will be processed, and we'll let any ongoing upload fully finish. However, this could mean that shutting down the service takes a while (the timeout of the runner is 10 seconds by default, so we might need to adjust the value for this specific service).
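A minimal sketch of that shutdown procedure, assuming a hypothetical `Service` type with an event channel, a `process` callback and a `grace` period (none of these are the real postprocessing service API, and `runDemo` is only a demo helper). After cancellation no new events are accepted, and the method returns once the in-flight uploads finish or the grace period runs out.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// ErrGraceExceeded is returned when in-flight work outlives the grace period.
var ErrGraceExceeded = errors.New("in-flight uploads did not finish within the grace period")

// Service is a hypothetical stand-in for the postprocessing service.
type Service struct {
	events  <-chan string
	process func(upload string)
	grace   time.Duration
}

// Run consumes events until ctx is cancelled, then waits up to grace
// for uploads that are already being processed.
func (s *Service) Run(ctx context.Context) error {
	var wg sync.WaitGroup
loop:
	for {
		select {
		case <-ctx.Done():
			break loop // stop accepting new events
		case up, ok := <-s.events:
			if !ok {
				break loop
			}
			wg.Add(1)
			go func() {
				defer wg.Done()
				s.process(up)
			}()
		}
	}

	finished := make(chan struct{})
	go func() {
		wg.Wait()
		close(finished)
	}()

	select {
	case <-finished:
		return nil
	case <-time.After(s.grace):
		return ErrGraceExceeded
	}
}

// runDemo starts one upload taking workMs milliseconds, requests a stop while
// it is in flight, and returns what Run reports with a grace period of graceMs.
func runDemo(workMs, graceMs int) error {
	events := make(chan string, 1)
	started := make(chan struct{})
	svc := &Service{
		events: events,
		process: func(string) {
			close(started)
			time.Sleep(time.Duration(workMs) * time.Millisecond)
		},
		grace: time.Duration(graceMs) * time.Millisecond,
	}
	ctx, cancel := context.WithCancel(context.Background())
	errc := make(chan error, 1)
	go func() { errc <- svc.Run(ctx) }()

	events <- "upload-10"
	<-started // the upload is now in flight
	cancel()  // request a graceful stop
	return <-errc
}

func main() {
	fmt.Println(runDemo(20, 1000)) // upload finishes within the grace period
	fmt.Println(runDemo(500, 20))  // grace period too short for the upload
}
```

The second call shows why a fixed 10 second timeout may be too short for this service: a long-running upload would be cut off unless the grace period is adjusted.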
Some technical considerations:
- Besides the Run method to start the postprocessing service, we should have a Stop method in order to stop it. This will hide all the complexity described above.
- The Run method should return only when everything is done and the service has completely finished. This is what the runners expect. Alternatively, the responsibility could fall to the Stop method, which should return when the service has completely finished. Go-micro seems to use this other approach.
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.