Workflow scheduling race condition #1450

Closed
jmchilton opened this issue Jan 8, 2016 · 1 comment


This piece of bioblend test code:

        self.gi.workflows.cancel_invocation(workflow_id, invocation_id)
        invocation = self.gi.workflows.show_invocation(workflow_id, invocation_id)
        self.assertEqual(invocation['state'], 'cancelled')

Causes this error transiently:

======================================================================
FAIL: test_cancelling_workflow_scheduling (TestGalaxyWorkflows.TestGalaxyWorkflows)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/galaxyproject/bioblend/tests/test_util.py", line 56, in wrapped_method
    return method(has_gi, *args, **kwargs)
  File "/home/travis/build/galaxyproject/bioblend/tests/test_util.py", line 56, in wrapped_method
    return method(has_gi, *args, **kwargs)
  File "/home/travis/build/galaxyproject/bioblend/tests/TestGalaxyWorkflows.py", line 98, in test_cancelling_workflow_scheduling
    self.assertEqual(invocation['state'], 'cancelled')
AssertionError: u'ready' != 'cancelled'

I believe this demonstrates the race condition described in the following chat log:

17:17 < jmchilto1> this looks like a bug in galaxy - a race condition
17:17 < jmchilto1> one thread marks it as canceled and another as ready
17:17 < jmchilto1> nsoranzo: if you rerun this test does it usually pass?
17:18 < nsoranzo> Yes
17:18 < jmchilto1> that is a bug - we have a test that found a bug 
17:18 < jmchilto1> project test stuff is a success
17:19 < dave_b> awesome, the tests have served their purpose. We can delete them now.
17:19 < jmchilto1> nsoranzo: how about we just throw a skip if that is ready instead of 
                   cancelled
17:20 < nsoranzo> I thought it was a delay in getting the state to the database before the 
                  new call to show_invocation()
17:21 < jmchilto1> I think it is two threads writing the state in galaxy. I don't think a 
                   delay would help
17:21 < jmchilto1> when that cancel call is complete - the db is almost certainly flushed
17:21 < jmchilto1> so show should be callable right away
17:22 < jmchilto1> on the other hand - one can easily imagine two galaxy threads - one 
                   responding to the cancel and the workflow scheduler - each with a copy 
                   of the invocation
17:23 < jmchilto1> they each get it in state 'new' - cancel changes it to 'cancelled' and 
                   the scheduler changes it to 'ready' - two seconds later it could be 
                   flushed as either
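
The interleaving described in the log can be illustrated with a small, deterministic sketch. It assumes each thread holds its own in-memory copy of the invocation and writes the whole object back when it flushes. `Session`, `run`, and the dict-backed "database" are hypothetical stand-ins for illustration only, not Galaxy or bioblend code.

```python
import copy


class Session:
    """Toy unit of work (hypothetical): loads a private copy of the
    invocation and writes that copy back to the shared store on flush()."""

    def __init__(self, db):
        self.db = db
        self.obj = None

    def load(self):
        # Each session works on its own copy, not on the shared row.
        self.obj = copy.deepcopy(self.db["invocation"])
        return self.obj

    def flush(self):
        # Blind write-back: whatever was flushed earlier is overwritten.
        self.db["invocation"] = copy.deepcopy(self.obj)


def run(flush_order):
    """Simulate the cancel request and the scheduler racing on one invocation."""
    db = {"invocation": {"state": "new"}}
    sessions = {"cancel": Session(db), "scheduler": Session(db)}

    # Both sides read the invocation while it is still in state 'new'.
    for session in sessions.values():
        session.load()

    # Each side updates only its private copy.
    sessions["cancel"].obj["state"] = "cancelled"
    sessions["scheduler"].obj["state"] = "ready"

    # The final state depends only on who flushes last.
    for name in flush_order:
        sessions[name].flush()
    return db["invocation"]["state"]


print(run(["scheduler", "cancel"]))  # cancelled
print(run(["cancel", "scheduler"]))  # ready
```

The second ordering matches the failing test: the cancel call has already flushed `'cancelled'`, and the scheduler then overwrites it with `'ready'` from its stale copy. Under this assumption a delay before `show_invocation()` would not help, because the later write is the wrong one.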

mvdbeek commented Nov 14, 2023

If this was still a problem I'd assume 2a11420 is going to fix it

mvdbeek closed this as completed Nov 14, 2023