Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce lock to prevent parallel task execution #9858

Merged
merged 4 commits into from
Aug 24, 2021

Conversation

tsmaeder
Copy link
Contributor

@tsmaeder tsmaeder commented Aug 6, 2021

Signed-off-by: Thomas Mäder [email protected]

What it does

Fixes #9806

The idea is to have a lock that prevents interleaved execution of the check whether a task is already running with the actual starting of the task.
The approach is to delay later execution of tasks instead of ignoring them, because runTask() is used in various places that treat failure to start a task as an error.

How to test

to reproduce first:

You should be able to start multiple instances of the same task by clicking on the tasks view.

to verify it's fixed, keep the above changes, but check out the PR branch

You should not be able to create multiple instances of the same task.

Review checklist

Reminder for reviewers

@tsmaeder
Copy link
Contributor Author

tsmaeder commented Aug 6, 2021

Grrrr...changed code to use injection and didn't rerun the tests. Stand by...

Copy link
Member

@msujew msujew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can confirm that the issue exists on master and is resolved by these changes. Starting a task multiple times is no longer possible.

*
* const stringValue = await myPromise.then(delay(600)).then(value => value.toString());
*
* @param ms the number of millisecond os dealy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @param ms the number of millisecond os dealy
* @param ms the number of milliseconds to delay

Comment on lines 64 to 66
* A function to allow a promise resolution to be delayed by a number of milliseconds. Usage is like so:
*
* const stringValue = await myPromise.then(delay(600)).then(value => value.toString());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* A function to allow a promise resolution to be delayed by a number of milliseconds. Usage is like so:
*
* const stringValue = await myPromise.then(delay(600)).then(value => value.toString());
* A function to allow a promise resolution to be delayed by a number of milliseconds. Usage is as follows:
*
* `const stringValue = await myPromise.then(delay(600)).then(value => value.toString());`

packages/task/src/browser/task-service.ts Outdated Show resolved Hide resolved
packages/core/src/common/lock.ts Outdated Show resolved Hide resolved
Comment on lines 23 to 25
beforeEach(() => {

});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not useful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦

packages/task/src/browser/task-service.ts Outdated Show resolved Hide resolved
packages/task/src/browser/task-service.ts Outdated Show resolved Hide resolved
@vince-fugnitto vince-fugnitto added the tasks issues related to the task system label Aug 9, 2021
@colin-grant-work
Copy link
Contributor

Although in general I'm not a fan of the question 'why not use an existing library?' I do think it deserves to be posed. There seem to be a few libraries that implement lock-like functionality with enough downloads to make it plausible that they're useful:
async-mutex
async-lock

@paul-marechal
Copy link
Member

@colin-grant-work async-mutex looks great.

@paul-marechal
Copy link
Member

@tsmaeder what do you think about async-mutex? In your case you might be interested by the Lock class. I am interested by the Semaphore class for one of my PRs (in order to limit concurrency).

@tsmaeder
Copy link
Contributor Author

@paul-marechal I'm not sure: I'm relying on the fact that I can multi-release the same Lock multiple times without it causing an error. The doc of async-mutex does not mention that. Out of curiosity, how does a semaphore (aka n parallel tasks) make sense in the single-threaded environment?

@paul-marechal
Copy link
Member

paul-marechal commented Aug 17, 2021

I'm relying on the fact that I can multi-release the same Lock multiple times without it causing an error.

I don't see that being relied upon within the code? Perhaps I am missing it? But from what I understand calling async-lock's acquired release function multiple times only releases once. We can always open a PR to add that missing documentation upstream.

[...] how does a semaphore (aka n parallel tasks) make sense in the single-threaded environment?

Semaphores apply to any concurrent system. Note that concurrency can be achieved without parallelism. Node's async tasks are very much concurrent, that's the whole selling point of Node :) See this or that.

In my case I would spawn X amount of upload promises, and the semaphore would limit the amount uploaded by those concurrent tasks so that only a fewer amount Y is actually uploading at any given time. The semaphore would guard the "upload budget count" shared resource.

@tsmaeder
Copy link
Contributor Author

In my case I would spawn X amount of upload promises, and the semaphore would limit the amount uploaded by those concurrent tasks so that only a fewer amount Y is actually uploading at any given time.

But why: doesn't the I/O of the upload block the promises from proceeding just as effectively as waiting on the semaphore?

@paul-marechal
Copy link
Member

paul-marechal commented Aug 18, 2021

The I/O is happening in parallel in my case, so the semaphore would help throttle the logic. Right now I also had to implement my own class to handle only X tasks at a given time, but semaphores from async-mutex might be more adequate than re-implementing something equivalent in Theia.

A few lines of code might be worth a thousand words:

I invite you to open a new empty tab, open the dev tools, open the network tab and/or the console and paste the following snippets. They each do 16 requests to some random API, but one is doing the requests in parallel and the other not.

// parallel
(async function() {
    console.log('start');
    const start = Date.now();
    const promises = [];
    for (let i = 0; i < 16; i++) {
        promises.push((async () => {
            const response = await fetch('https://v2.jokeapi.dev/joke/Any?type=single');
            const { joke = 'something went wrong' } = await response.json();
            return joke;
        })());
    }
    const jokes = await Promise.all(promises);
    const end = Date.now();
    console.log('jokes:', jokes);
    console.log(`end (took: ${end - start}ms)`);
})();
// sequential
(async function() {
    console.log('start');
    const start = Date.now();
    const jokes = [];
    for (let i = 0; i < 16; i++) {
        const response = await fetch('https://v2.jokeapi.dev/joke/Any?type=single');
        const { joke = 'something went wrong' } = await response.json();
        jokes.push(joke);
    }
    const end = Date.now();
    console.log('jokes:', jokes);
    console.log(`end (took: ${end - start}ms)`);
})();

Promises are handles to some underlying operation. Said operation happens "outside of JS" meaning it can happen in parallel. JS code attached to a promise won't run in parallel of another JS handler code, but we still benefit from having the underlying process doing things in the background in parallel.

@tsmaeder
Copy link
Contributor Author

@paul-marechal but why throttle uploads? And if we are trottling, what is the resource we're trying to conserve? I'm not questioning Semaphores in general, I'm asking about your specific case.

@tsmaeder
Copy link
Contributor Author

@colin-grant-work I'm kinda split down the middle about using a library: async-lock is a no-go for me since it's a not being maintained. Async mutex seems weird: it has two interface classes that basically do the same thing (a mutex is really just a semaphore with n=1). I'm just no sure there is enough meat there to not write the utility ourselves.

@colin-grant-work
Copy link
Contributor

@tsmaeder, I'm certainly not wedded to either of those, and this is a simple-enough utility that I don't think we need to worry too much about missing subtleties. It's one I've been tempted to write in the past to handle synchronous calls to update preferences, and I think your implementation handles that case, as well, and with the added feature of safe multiple-release.

@paul-marechal
Copy link
Member

paul-marechal commented Aug 20, 2021

@tsmaeder in my case I noticed that not throttling uploads on my environment takes more time than when throttled. But from what I've seen it's a common thing to do to not hammer servers down with too many concurrent requests.

@tsmaeder tsmaeder force-pushed the 9806_parallel_task_runs branch from 27bd572 to d7fe10d Compare August 23, 2021 12:41
@tsmaeder
Copy link
Contributor Author

I've changed the code to use async-mutex.

Copy link
Member

@paul-marechal paul-marechal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirm that I am unable to start the same task multiple times anymore, following the "How to test" instructions.

Code LGTM.

packages/core/src/common/promise-util.ts Show resolved Hide resolved
@tsmaeder tsmaeder merged commit ff9e050 into eclipse-theia:master Aug 24, 2021
dna2github pushed a commit to dna2fork/theia that referenced this pull request Aug 25, 2021
RomanNikitenko pushed a commit that referenced this pull request Sep 16, 2021
RomanNikitenko pushed a commit that referenced this pull request Sep 16, 2021
azatsarynnyy pushed a commit to redhat-developer/eclipse-theia that referenced this pull request Sep 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tasks issues related to the task system
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TaskService.runTask(...) not safe for parallel Execution
5 participants