    async def start_http_site(self):
        api_port = ...
        self.site = web.TCPSite(self.runner, self.http_host, api_port, shutdown_timeout=self.shutdown_timeout)
        self._logger.info(...)
        try:
            # The self.site.start() is expected to start immediately. It looks like on some machines, it hangs.
            # The timeout is added to prevent the hypothetical hanging.
            await asyncio.wait_for(self.site.start(), timeout=SITE_START_TIMEOUT)
        except ...
The call to self.site.start() should be fast, so why do we get this TimeoutError? The newly added log output reveals the cause (right before the TimeoutError):
That means that when wait_for was about to start its inner coroutine self.site.start(), another unrelated coroutine, DownloadManager.start_handle, occupied the asyncio loop for more than seven seconds. As a result, the five-second timeout of wait_for(self.site.start(), ...) was triggered before the server had any chance to start.
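The effect can be reproduced in isolation. The following is a minimal sketch, not Tribler code: fast_start and blocking_neighbour are hypothetical stand-ins for self.site.start() and DownloadManager.start_handle, and the durations are scaled down from seconds to fractions of a second.

```python
import asyncio
import time


async def fast_start():
    # Hypothetical stand-in for self.site.start(): finishes almost immediately.
    await asyncio.sleep(0.01)
    return "started"


async def blocking_neighbour():
    # Hypothetical stand-in for a coroutine that makes a slow synchronous call.
    # time.sleep() never yields to the loop, so every other coroutine is stalled.
    time.sleep(0.5)


async def main(block_loop: bool) -> str:
    # Optionally schedule the unrelated blocking coroutine next to wait_for.
    neighbour = asyncio.create_task(blocking_neighbour()) if block_loop else None
    try:
        result = await asyncio.wait_for(fast_start(), timeout=0.2)
    except asyncio.TimeoutError:
        result = "timeout"
    if neighbour is not None:
        await neighbour
    return result


if __name__ == "__main__":
    print(asyncio.run(main(block_loop=False)))
    print(asyncio.run(main(block_loop=True)))
```

With the blocking neighbour scheduled, the timeout is expected to expire while the loop is stalled, even though fast_start itself needs only about ten milliseconds.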
You can see from the log that DownloadManager:start_handle spends most of its time on line 634. This line is:
On my machine, a typical call of DownloadManager.create_session takes about 20-30 milliseconds to complete. But it looks like on some machines, for some downloads, it takes more than seven seconds to create and initialize a libtorrent session.
Conclusion
The immediate fix should be to remove the wait_for wrapper from the self.site.start() call. The timeout is triggered by the execution of completely unrelated coroutines, so it does not help us to manage the self.site.start() call. Or at least we can significantly increase the timeout, say, to one minute or more.
Then we should move the slow synchronous call DownloadManager.create_session(...) out of the asyncio loop. It is hard to say why this method is sometimes surprisingly slow, but the Sentry error reports show that in every case where the TimeoutError was triggered, a DownloadManager.create_session(...) call was responsible.
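One common way to move a blocking call off the loop is loop.run_in_executor. The sketch below only illustrates the idea under stated assumptions: create_session_slow, start_handle and start_site are hypothetical stand-ins, and whether libtorrent session creation is safe to run in a worker thread is an assumption that would need to be checked.

```python
import asyncio
import time


def create_session_slow():
    # Hypothetical stand-in for a slow synchronous create_session(...) call.
    time.sleep(0.5)
    return "session"


async def start_handle():
    # Run the blocking call in the default thread pool so the loop stays free.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, create_session_slow)


async def start_site():
    # Hypothetical stand-in for self.site.start().
    await asyncio.sleep(0.01)
    return "started"


async def main():
    handle_task = asyncio.create_task(start_handle())
    # The loop is no longer stalled, so this short timeout is not hit.
    site_result = await asyncio.wait_for(start_site(), timeout=0.2)
    session = await handle_task
    return site_result, session


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Here the slow call still takes half a second, but it runs in a worker thread, so the unrelated wait_for can complete on time.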
To better understand why the DownloadManager.create_session(...) call is so slow sometimes, we can apply a recently added @profile decorator to it.
Also, we can add a way to attach profiling results to the Sentry report to be able to analyze them.
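As an illustration of the last two points, a profiling decorator can keep its report as text so that it could later be attached to an error report. This is a generic sketch built on the standard cProfile module; profile_call and its last_report attribute are hypothetical names, and the existing @profile decorator may work differently.

```python
import cProfile
import functools
import io
import pstats


def profile_call(func):
    """Profile each call of func and keep the latest report as text."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        profiler = cProfile.Profile()
        try:
            # runcall enables the profiler, calls func, then disables it.
            return profiler.runcall(func, *args, **kwargs)
        finally:
            buffer = io.StringIO()
            stats = pstats.Stats(profiler, stream=buffer)
            # Show the ten entries with the highest cumulative time.
            stats.sort_stats("cumulative").print_stats(10)
            wrapper.last_report = buffer.getvalue()

    wrapper.last_report = None
    return wrapper


@profile_call
def create_session_example(count):
    # Hypothetical stand-in for the method being investigated.
    return sum(range(count))


if __name__ == "__main__":
    create_session_example(100_000)
    print(create_session_example.last_report)
```

Storing the report on the wrapper keeps the sketch free of any reporting API; how the text reaches Sentry is left open.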
According to Sentry, this is the most frequent error in the latest release.
Note that ALL recorded occurrences of the problem in 7.13.0 are on Win32 (Windows 7, 8, 10) and never on Win64.
This might not be a coincidence.
Now, after extended logging was added in 7.13, it is possible to decipher what is happening.
Analysis
The error happens at Tribler start, when the GUI tries to connect to the Core and the Core is starting the REST HTTP API.
The traceback says that the Core unexpectedly stopped with some error. We can see the Core traceback in the last core stdout/stderr output fragment:
The corresponding code is the start_http_site method shown above.
Partially related to #7065