Inconsistent error handling during listener recovery #35

Closed
romank0 opened this issue Apr 20, 2023 · 3 comments
@romank0
Contributor

romank0 commented Apr 20, 2023

While the listener process is running, if it encounters an unprocessable notification (that is, one whose processing raises an exception), that notification is skipped and processing continues with the next one.

But the same error during recovery causes the listener process to exit immediately. Any further stored notifications that could potentially be processed are never attempted, and there is no way to recover automatically (in the sense of processing the existing "good" notifications and listening for new ones) other than deleting the unprocessable notification.
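The two behaviours can be sketched as follows. The function names here are hypothetical, chosen to illustrate the inconsistency; they are not django-pgpubsub's actual code:

```python
def listen_loop(get_next_notification, process):
    # Normal listening: an exception from one notification is
    # logged and skipped, and the loop moves on to the next one.
    while True:
        notification = get_next_notification()
        if notification is None:
            break
        try:
            process(notification)
        except Exception as exc:
            print(f"Skipping unprocessable notification: {exc!r}")


def recover(stored_notifications, process):
    # Recovery: the first exception propagates, the process exits,
    # and any remaining stored notifications are never attempted.
    for notification in stored_notifications:
        process(notification)
```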

Minimal Fix

I believe the behaviour should be consistent: if we skip errors while listening, we should skip them during recovery as well. That is what the minimal fix might be.

I think "recovery" is a misnomer here, as this is just the processing of notifications that were received while the listener was down. That can happen even during normal operation in a cloud environment, where processes may go down without any internal issue.
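Under the minimal fix, the recovery path would apply the same skip-on-error policy as the listening path. A sketch with a hypothetical function name, not the library's actual implementation:

```python
def recover_with_skip(stored_notifications, process):
    # Hypothetical recovery loop using the same skip-on-error policy
    # the listening loop already uses: a failing notification is
    # logged and skipped, and the remaining ones still get processed.
    skipped = []
    for notification in stored_notifications:
        try:
            process(notification)
        except Exception as exc:
            print(f"Skipping unprocessable notification: {exc!r}")
            skipped.append(notification)
    return skipped
```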

Configurable error handling strategy

Something that may make sense, and that I would love to see, is a configuration option defining a strategy for handling errors during processing.

While skipping a faulty notification is a good default, in other scenarios it might be beneficial to stop processing completely when an error happens. For example, if a system that notification processing depends on has a temporary issue, I would want processing not to skip notifications but instead to continuously retry the oldest unprocessed one.
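One possible shape for such an option, sketched with hypothetical names (the library defines no such interface today): an enum chosen per channel that tells the processing loop whether to skip, stop, or keep retrying on error.

```python
import enum
import time


class ErrorStrategy(enum.Enum):
    SKIP = "skip"    # current listening behaviour: drop and continue
    STOP = "stop"    # current recovery behaviour: exit on first error
    RETRY = "retry"  # keep retrying the same notification before giving up


def process_all(notifications, process, strategy, max_retries=3, delay=0.0):
    # Hypothetical processing loop parameterised by an error strategy.
    for notification in notifications:
        attempts = 0
        while True:
            try:
                process(notification)
                break
            except Exception:
                if strategy is ErrorStrategy.SKIP:
                    break              # drop this notification, move on
                if strategy is ErrorStrategy.STOP:
                    raise              # propagate and stop processing
                attempts += 1          # RETRY: try the same one again
                if attempts >= max_retries:
                    raise
                time.sleep(delay)
```

Where the strategy lives is an open question; as discussed below in the thread, the channel dataclasses are one natural place to declare it.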

@PaulGilmartin
Owner

Thanks for raising this; this inconsistency is something I'd never noticed before. I can try to make the minimal fix as soon as I have time.

For the long term changes in this direction, error handling/retry strategies sound like the way to go. This is also a feature I wanted to add when first creating the library, but never got round to defining the interface. I think at least the strategy would be defined on the channel dataclasses, but outside that I haven't thought about it. This and logging are definitely at the top of the priority list in terms of new features.

@PaulGilmartin
Owner

Here's a PR for the minimal fix: #42

If you'd like to take a look before I merge to check if it's doing the right thing, that would be nice. Otherwise I'll just merge into master in a day or so.

@PaulGilmartin
Owner

Fixed in https://pypi.org/project/django-pgpubsub/1.0.6/
An error/retry strategy can be implemented in the next larger feature release.
