[CELEBORN-1725][FOLLOWUP] Optimize isAllMapTasksEnd performance #2959

Closed
wants to merge 4 commits into from

Member

@turboFei turboFei commented Nov 27, 2024

What changes were proposed in this pull request?

Follow-up to #2905, using the same logic to optimize the `isAllMapTasksEnd` method.

Why are the changes needed?

Address comments: #2905 (review)

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Same logic as #2905.

@turboFei turboFei requested a review from cfmcgrady November 27, 2024 18:51
Contributor

@onebox-li onebox-li left a comment

@FMX
Contributor

FMX commented Nov 28, 2024

You can open a new PR to fix all the similar code snippets. There are still more `exists` call sites; for example, there is similar code in ReducePartitionCommitHandler.

@turboFei turboFei force-pushed the celeborn_1725_follow branch from 8c79f5b to b065492 Compare November 28, 2024 04:18
@turboFei turboFei marked this pull request as draft November 28, 2024 04:24
@turboFei turboFei marked this pull request as ready for review November 28, 2024 04:25
@turboFei turboFei force-pushed the celeborn_1725_follow branch from 2c50c20 to e03ae05 Compare November 28, 2024 04:57
@turboFei
Member Author

> You can open a new PR to fix all the similar code snippets. There are still more `exists` call sites; for example, there is similar code in ReducePartitionCommitHandler.

Ok, thx

utils
@turboFei turboFei force-pushed the celeborn_1725_follow branch from b065492 to d9012d1 Compare November 28, 2024 05:00
@turboFei
Member Author

turboFei commented Nov 28, 2024

> Thanks! I think this code fragment could be polished too: https://github.com/apache/celeborn/blob/main/worker/src/main/scala/org/apache/celeborn/service/deploy/worker/Controller.scala#L455-L466

Thanks for the suggestion. I will address it in CELEBORN-1753, which covers `exists` and `find`.

@turboFei turboFei force-pushed the celeborn_1725_follow branch from e5b3c59 to c1acead Compare November 28, 2024 05:55
@turboFei turboFei requested a review from pan3793 November 28, 2024 06:18
@RexXiong RexXiong closed this in c84733f Nov 29, 2024
RexXiong pushed a commit that referenced this pull request Nov 29, 2024
### What changes were proposed in this pull request?

Follow-up to #2905, using the same logic to optimize the `isAllMapTasksEnd` method.

### Why are the changes needed?
Address comments: #2905 (review)

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
Same logic as #2905.

Closes #2959 from turboFei/celeborn_1725_follow.

Authored-by: Wang, Fei <[email protected]>
Signed-off-by: Shuang <[email protected]>
(cherry picked from commit c84733f)
Signed-off-by: Shuang <[email protected]>
RexXiong pushed a commit that referenced this pull request Nov 29, 2024
@RexXiong
Contributor

Thanks, merged to main (v0.6.0), branch-0.5 (v0.5.3), and branch-0.4 (v0.4.3).

RexXiong pushed a commit that referenced this pull request Dec 23, 2024
### What changes were proposed in this pull request?

Optimize the code for `exists` and `find`.

1. Speed up looking up a `WorkerInfo` by its worker unique ID by using a keyed map lookup instead of looping over the collection:
 https://github.com/apache/celeborn/blob/74c1ec0a7fcc4d9efb26d9b96901234eb76e22cd/client/src/main/scala/org/apache/celeborn/client/LifecycleManager.scala#L65-L66

Change the type to:
```
 type ShuffleAllocatedWorkers =
    ConcurrentHashMap[Int, ConcurrentHashMap[String, ShufflePartitionLocationInfo]]
```
And save the `WorkerInfo` into `ShufflePartitionLocationInfo`.
```
class ShufflePartitionLocationInfo(val workerInfo: WorkerInfo) {
...
}
```

This way, the `WorkerInfo` can be fetched directly by the worker's unique ID.

2. Reduce the loop cost of the code below: https://github.com/apache/celeborn/blob/33ba0e02f56bfa032c02d1e41c52573c79661b1b/worker/src/main/scala/org/apache/celeborn/service/deploy/worker/Controller.scala#L455-L466
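
As an illustration of item 1, here is a minimal sketch of the difference between scanning a collection for a worker and fetching it by key. The `WorkerInfo` and `ShufflePartitionLocationInfo` classes below are simplified, hypothetical stand-ins rather than Celeborn's real classes, and the "before" map shape (keyed by `WorkerInfo`) is an assumption; only the "after" shape (keyed by the unique ID string, with `workerInfo` stored in the value) is taken from the description above.

```scala
import java.util.concurrent.ConcurrentHashMap
// On Scala 2.13+ this import is deprecated in favor of scala.jdk.CollectionConverters.
import scala.collection.JavaConverters._

// Hypothetical stand-ins: only the fields needed for the illustration.
final case class WorkerInfo(host: String, rpcPort: Int) {
  def toUniqueId: String = s"$host:$rpcPort"
}
final class ShufflePartitionLocationInfo(val workerInfo: WorkerInfo)

object WorkerLookupSketch {
  // Assumed "before" shape: keyed by WorkerInfo, so a lookup by unique ID
  // has to walk the key set with `find`, which is O(n) in the number of workers.
  def findByScan(
      workers: ConcurrentHashMap[WorkerInfo, ShufflePartitionLocationInfo],
      uniqueId: String): Option[WorkerInfo] =
    workers.keySet().asScala.find(_.toUniqueId == uniqueId)

  // "After" shape: keyed by the unique ID, with the WorkerInfo kept in the value,
  // so the lookup is a single hash-map get. `Option(...)` turns a null miss into None.
  def findByKey(
      workers: ConcurrentHashMap[String, ShufflePartitionLocationInfo],
      uniqueId: String): Option[WorkerInfo] =
    Option(workers.get(uniqueId)).map(_.workerInfo)
}
```

Both helpers return the same result for a known ID; the keyed version avoids the per-call scan, which matters when the lookup sits on a hot path.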

### Why are the changes needed?

Improve performance.
Address review comments:
#2959 (review)
#2959 (comment)

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?

GA

Closes #2962 from turboFei/CELEBORN_1753_exists.

Lead-authored-by: Wang, Fei <[email protected]>
Co-authored-by: Fei Wang <[email protected]>
Signed-off-by: Shuang <[email protected]>
5 participants