Fix the detection of running processes by including process UID to the check #7637
…ld process. Stop relying on PIDs when detecting running processes
I've been added as a reviewer for this PR. I noticed another PR addresses the same issue. Have you (@kozlovsky, @xoriole) reached a consensus on which approach we should take?
Just a quick note: from my perspective, this PR is unlikely to resolve #7603, as it addresses a very specific flatpak problem, which is just a small part of the original issue. The platform distribution for the original issue is as follows:
First, I agree that this is not a complete fix for #7603. Essentially, this Sentry issue combines two unrelated bugs that look similar to Sentry.
Second, I disagree that the Flatpak issue is only a small part of the original issue. I manually traversed through all currently available reports of this sentry issue and counted how many events are related to Flatpak: https://sentry.tribler.org/organizations/tribler/issues/2003/events/latest/?project=7&referrer=latest-event In the available reports, I counted the following:
So, the Flatpak-related issue corresponds to 41.7% of all recorded reports. It is a pretty significant number. By addressing the Flatpak problem in 7.13.1 with this PR, we can demystify the reports, making the second bug easier to isolate and investigate. Regarding another PR #7621 my opinion is:
For that reason, I suggest the following plan:
I haven't done a deep dive into the PR yet, but my initial impression is one of concern due to its complexity. I believe it was already intricate to the point where, I suspect, no one besides @kozlovsky fully grasped its logic. Now, it appears even more complicated with the addition of a new thread and the When I approved the PR, I assumed that despite its complexity, it would address the "two instance" problem, which remains unresolved. I'd suggest considering a rollback of the entire SQL lock implementation in favor of a slightly tweaked previous approach. Below is a simplified example of a potential file lock solution. It's incredibly straightforward and would protect us from race conditions, which was the original intention behind implementing #7212.

import asyncio

from filelock import FileLock, Timeout

lock = FileLock("gui.lock", timeout=0)


async def run_tribler_gui():
    try:
        with lock:
            print('Tribler is running')
            await asyncio.sleep(10)  # simulation of a normal tribler run
    except Timeout:
        print("Tribler is already running. Closing the Tribler...")
        exit(1)


asyncio.run(run_tribler_gui())

Ref: https://py-filelock.readthedocs.io/en/latest/index.html
The simplified core part could be like:

import asyncio

from filelock import FileLock, Timeout

lock = FileLock("core.lock", timeout=0)
port_file_name = 'core.port'


async def run_tribler_core():
    try:
        with lock:
            print('Tribler core is running')
            print('Creating REST server..')
            # assume it has been created on port 1234
            port = 1234
            with open(port_file_name, "w+") as f:
                f.write(str(port))
            await asyncio.sleep(10)  # simulation of a normal tribler run
    except Timeout:
        print("Tribler core is already running. Closing the Tribler core...")
        exit(1)


asyncio.run(run_tribler_core())
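For illustration only, the GUI side of this scheme could read the port back from the file the core writes. The sketch below is an assumption, not part of the proposal: the helper name `read_core_port` is hypothetical, and it uses only the standard library. It treats a missing or malformed file as "no core port known".

```python
import tempfile
from pathlib import Path
from typing import Optional


def read_core_port(port_file: Path) -> Optional[int]:
    """Return the port stored in port_file, or None if it is missing or malformed.

    Hypothetical helper: the name and behaviour are assumptions for illustration.
    """
    try:
        text = port_file.read_text().strip()
    except FileNotFoundError:
        return None
    # Accept only a plain decimal number, as written by f.write(str(port)).
    return int(text) if text.isdigit() else None


# Demonstration in a temporary directory, so no real 'core.port' file is touched.
with tempfile.TemporaryDirectory() as tmp:
    port_file = Path(tmp) / "core.port"
    missing_result = read_core_port(port_file)  # file does not exist yet
    port_file.write_text(str(1234))             # what the core example writes
    found_result = read_core_port(port_file)

print(missing_result, found_result)
```

A port file can outlive a crashed core, so a reader like this would presumably still need to confirm that the core lock is held before trusting the value.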
Yes, I agree that it is possible for us to use the filelock library. I considered it before implementing the SQLite-based lock, but at that moment, the With this, I doubt that the full solution will be as simple as in the suggested example:
As discussed at length: let's try to keep it simple; use a file lock and a new dependency. It is always a good idea to see if we can keep it simple, as long as we are 90% certain it will work. So a 10% chance of introducing new errors is fine when it lets us keep things simple 🤔 Let's add this idea to the Application Tester: downloading something is the primary usage scenario of Tribler.
Closed in favor of #7660
This PR should fix #7603 by associating a unique UID value with each process. It fixes the problem of non-unique PID values.
In addition, each process periodically (once per second) updates its last_alive_at time in the database. When the last_alive_at time has not been updated for long enough (5 seconds), the process is considered not running.

Now, the following situations should be handled correctly:
last_alive_at was recently updated and that the primary process is running in a separate virtual environment.
api_port value, it understands that the correct Core process should specify the correct UID of the current GUI process. That prevents connecting with the wrong Core process.
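The UID plus heartbeat check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: the table layout, the function names (`register`, `heartbeat`, `is_running`) and the use of `uuid4` for the UID are all hypothetical. Only the `last_alive_at` field and the 5-second threshold come from the description. Timestamps are passed in explicitly so the example is deterministic.

```python
import sqlite3
import uuid

ALIVE_TIMEOUT = 5.0  # seconds without an update before a process counts as not running


def create_table(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per process, keyed by UID rather than PID.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processes ("
        "uid TEXT PRIMARY KEY, pid INTEGER NOT NULL, last_alive_at REAL NOT NULL)"
    )


def register(conn: sqlite3.Connection, pid: int, now: float) -> str:
    """Insert a new process row and return its freshly generated UID."""
    uid = uuid.uuid4().hex
    conn.execute(
        "INSERT INTO processes (uid, pid, last_alive_at) VALUES (?, ?, ?)",
        (uid, pid, now),
    )
    return uid


def heartbeat(conn: sqlite3.Connection, uid: str, now: float) -> None:
    """Record that the process with this UID is still alive (called about once per second)."""
    conn.execute("UPDATE processes SET last_alive_at = ? WHERE uid = ?", (now, uid))


def is_running(conn: sqlite3.Connection, uid: str, now: float) -> bool:
    """A process is considered running only if its heartbeat is newer than the timeout."""
    row = conn.execute(
        "SELECT last_alive_at FROM processes WHERE uid = ?", (uid,)
    ).fetchone()
    return row is not None and now - row[0] < ALIVE_TIMEOUT


conn = sqlite3.connect(":memory:")
create_table(conn)

# Two processes that happen to share the same PID (e.g. in separate sandboxes).
old_uid = register(conn, pid=1234, now=100.0)
new_uid = register(conn, pid=1234, now=100.0)

heartbeat(conn, new_uid, now=106.0)  # only the second process keeps updating

old_alive = is_running(conn, old_uid, now=107.0)
new_alive = is_running(conn, new_uid, now=107.0)
print(old_alive, new_alive)  # False True
```

Keying the rows by UID means that two processes with the same PID are never confused, and a stale heartbeat marks a process as not running regardless of whether its PID is still in use.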