Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lop.PartitionBy does not guarantee the same order #510

Open
nicewook opened this issue Aug 9, 2024 · 3 comments · May be fixed by #572
Open

lop.PartitionBy does not guarantee the same order #510

nicewook opened this issue Aug 9, 2024 · 3 comments · May be fixed by #572

Comments

@nicewook
Copy link

nicewook commented Aug 9, 2024

I have tested the example of lop.PartitionBy on README.md
It keep changed the order

@dengyunsheng250
Copy link

well i have tested in bash file and it always return the same result

@nicewook
Copy link
Author

sorry for confusion. I mean the parallel version. belows are my test and the result

func ExamplePartitionByParallel3() {
	partitions := lop.PartitionBy([]int{-2, -1, 0, 1, 2, 3, 4, 5}, func(x int) string {
		if x < 0 {
			return "negative"
		} else if x%2 == 0 {
			return "even"
		}
		return "odd"
	})
	fmt.Println(partitions)
	// Output:
	// [[-2 -1], [0 2 4], [1 3 5]]
}

result

=== RUN   ExamplePartitionByParallel3
--- FAIL: ExamplePartitionByParallel3 (0.00s)
got:
[[5 1 3] [2 4 0] [-2 -1]]
want:
[[-2 -1], [0 2 4], [1 3 5]]

@liyishuai
Copy link

liyishuai commented Jan 7, 2025

lo/parallel/slice.go

Lines 95 to 125 in 5044a1d

// PartitionBy returns an array of elements split into groups. The order of grouped values is
// determined by the order they occur in collection. The grouping is generated from the results
// of running each element of collection through iteratee.
// `iteratee` is call in parallel.
func PartitionBy[T any, K comparable, Slice ~[]T](collection Slice, iteratee func(item T) K) []Slice {
result := []Slice{}
seen := map[K]int{}
var mu sync.Mutex
var wg sync.WaitGroup
wg.Add(len(collection))
for _, item := range collection {
go func(_item T) {
key := iteratee(_item)
mu.Lock()
resultIndex, ok := seen[key]
if !ok {
resultIndex = len(result)
seen[key] = resultIndex
result = append(result, []T{})
}
result[resultIndex] = append(result[resultIndex], _item)
mu.Unlock()
wg.Done()
}(item)
}

Not only is the iteratee called in parallel, but also the append operation as well.

The correct implementation should be:

  1. Map each item to a key, using lop.Map.
  2. Loop over the items sequentially, read its computed key, and append the item to the corresponding bucket in the result.

There should be no lock in here.

@liyishuai liyishuai linked a pull request Jan 7, 2025 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants