Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch all error and exception logger calls to warning if they do not lead to a crash. #7861

Closed
drew2a opened this issue Jan 26, 2024 · 5 comments

Comments

@drew2a
Copy link
Contributor

drew2a commented Jan 26, 2024

Currently, we use error, exception, and warning logger calls in Tribler without any established system.

I propose the following approach:

  1. Use warnings for all errors that do not lead to a Tribler crash.
  2. Use error and exception logs for all critical errors that may lead to a Tribler crash.

The lack of a systematic approach to using these log levels complicates the investigation of bugs and leads to false error detection for CoreCrashedError. When Sentry parses the core output, it identifies the closest logged ERROR, assuming it's the root cause of the CoreCrashedError (see example #7855). However, CoreCrashedError itself often doesn't provide much insight into the cause.

If we agree on this approach, I can perform the switch (estimated time: 2 hours).

@kozlovsky
Copy link
Contributor

kozlovsky commented Jan 26, 2024

To me, a warning is something that can be ignored for a while, and we should not ignore serious Core errors (at least in development) even if those errors do not lead to an immediate Core crash.

Also, parsing the last Core output to extract the reason for the last crash looks like a hack. Even if we change the level of all errors that do not lead to the Core crash to warnings, it still looks possible that a critical error causes a cascade of other critical errors, and the last error in the Core output will still not be the initial error that caused the crash.

My concern is that by switching all non-crashing errors to warnings, we start ignoring them more in development.

Instead, we can extend #7699 in the follow-up PR and provide a robust way to send Core errors to GUI not via the stderr output but via files. Then, in case of the Core crash, we can be sure that the actual error leading to the last Core crash is stored in the last error file, and there should no longer be ambiguity.

Saying this, I'm not against changing some specific non-critical errors to warnings.

@xoriole
Copy link
Contributor

xoriole commented Jan 26, 2024

In general, if there is an exception or an error, I tend to use the same log level. The reason being, at the point of logging, the error or exception may or may not be clear whether it crashes (or should crash) the application. The exception may be re-raised again after logging as well.

Switch all error and exception logger calls to warning if they do not lead to a crash

Instead of all error or exception logger calls, I would go for case-by-case basis wherever it makes sense.

Saying this, I'm not against changing some specific non-critical errors to warnings.

I agree with @kozlovsky here.

@drew2a
Copy link
Contributor Author

drew2a commented Jan 26, 2024

Here are my first three results from searching through the codebase:

async def _on_remote_select_basic(self, peer, request_payload, force_eva_response=False):
try:
sanitized_parameters = self.parse_parameters(request_payload.json)
# Drop selects with deprecated queries
if any(param in sanitized_parameters for param in self.composition.deprecated_parameters):
self.logger.warning(f"Remote select with deprecated parameters: {sanitized_parameters}")
self.ez_send(peer, SelectResponsePayload(request_payload.id, LZ4_EMPTY_ARCHIVE))
return
db_results = await self.process_rpc_query_rate_limited(sanitized_parameters)
self.send_db_results(peer, request_payload.id, db_results, force_eva_response)
except (OperationalError, TypeError, ValueError) as error:
self.logger.error(f"Remote select. The error occurred: {error}")

async def _poll_download_manager(self):
# This must run in all circumstances, so catch all exceptions
try:
dl_states = self.download_manager.get_last_download_states()
self.monitor_downloads(dl_states)
except Exception as e: # pylint: disable=broad-except
self.logger.error("Error on polling Download Manager: %s", e)

def on_e2e_finished(self, address, info_hash):
dl = self.get_download(info_hash)
if dl:
dl.add_peer(address)
else:
self.logger.error('Could not find download for adding hidden services peer %s:%d!', *address)

Are these serious errors, or just warnings?

@kozlovsky
Copy link
Contributor

I can't comment on the last one, but the first two look like errors. We should investigate these errors and fix the reason or improve the error handling by making it less broad and generic.

@drew2a
Copy link
Contributor Author

drew2a commented Jan 26, 2024

@kozlovsky @xoriole, thank you for sharing your opinions. Since it is now clear that there is no support for the suggested improvement, I am closing the issue.

@drew2a drew2a closed this as completed Jan 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

3 participants