Remove parallelization of get hybridscores #779

VijayanB · 2024-06-06T20:02:08Z

Description

Benchmarks run with and without this parallelization show that it adds no measurable value, so this change removes it.
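For readers unfamiliar with the trade-off, here is a minimal sketch of the two shapes being compared: collecting one score per sub-query in a plain loop versus fanning the same work out in parallel. The removed code is not shown in this conversation, so everything below is an assumption made for illustration: `HybridScoreSketch`, `SubQueryScorer`, `sequentialScores`, and `parallelScores` are hypothetical names, and the parallel stream is only one possible way to parallelize, not necessarily what the plugin did.

```java
import java.util.List;
import java.util.stream.IntStream;

// Illustrative only: hypothetical names, not the plugin's actual classes.
final class HybridScoreSketch {

    // Hypothetical stand-in for a sub-query scorer: maps a document ID to a score.
    interface SubQueryScorer {
        float score(int docId);
    }

    // Sequential variant: ask each sub-query scorer in turn.
    static float[] sequentialScores(List<SubQueryScorer> scorers, int docId) {
        float[] scores = new float[scorers.size()];
        for (int i = 0; i < scores.length; i++) {
            scores[i] = scorers.get(i).score(docId);
        }
        return scores;
    }

    // Parallel variant (assumed shape): each index is written by exactly one task,
    // so the tasks never write to the same array slot.
    static float[] parallelScores(List<SubQueryScorer> scorers, int docId) {
        float[] scores = new float[scorers.size()];
        IntStream.range(0, scores.length)
            .parallel()
            .forEach(i -> scores[i] = scorers.get(i).score(docId));
        return scores;
    }

    public static void main(String[] args) {
        List<SubQueryScorer> scorers = List.<SubQueryScorer>of(
            docId -> docId * 1.0f,   // toy scorer 1
            docId -> docId + 0.5f);  // toy scorer 2
        float[] seq = sequentialScores(scorers, 3);
        float[] par = parallelScores(scorers, 3);
        System.out.println(java.util.Arrays.equals(seq, par));
    }
}
```

Both variants return the same scores; they differ only in how the work is scheduled, which is why a latency benchmark is the way to decide between them.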

Issues Resolved

Check List

New functionality includes testing.
- All tests pass
New functionality has been documented.
- New functionality has javadoc added
Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.


martin-gaievski · 2024-06-06T20:07:44Z

Can you share rough benchmark results? Is the latency the same or worse with parallel get scores than without?

VijayanB · 2024-06-06T20:20:14Z

Can you share rough benchmark results? Is the latency the same or worse with parallel get scores than without?

| Query count | Vector count | Vector search queries | Term queries | Sub-queries | Parallel enabled | K | Size | P50 client time (ms) | P90 client time (ms) | P99 client time (ms) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1000 | 50000 | 1 | 1 | 2 | no | 100 | 100 | 101 | 106 | 121 |
| 1000 | 50000 | 1 | 1 | 2 | yes | 100 | 100 | 101 | 107 | 122 |
| 1000 | 50000 | 2 | 1 | 3 | no | 100 | 100 | 100 | 108 | 149 |
| 1000 | 50000 | 2 | 1 | 3 | yes | 100 | 100 | 100 | 108 | 152 |
| 1000 | 50000 | 1 | 2 | 3 | no | 100 | 100 | 106 | 120 | 200 |
| 1000 | 50000 | 1 | 2 | 3 | yes | 100 | 100 | 106 | 122 | 205 |
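
The differences between the parallel and non-parallel rows above can be tabulated with a short sketch. The numbers are copied from the table; the class and method names (`LatencyDelta`, `delta`) are made up for this illustration and are not part of the plugin or the benchmark tooling.

```java
// Illustrative only: computes (parallel minus non-parallel) latency per query shape.
final class LatencyDelta {

    // Columns: vector queries, term queries, parallel (0 = no, 1 = yes), P50, P90, P99 (ms).
    static final int[][] ROWS = {
        {1, 1, 0, 101, 106, 121},
        {1, 1, 1, 101, 107, 122},
        {2, 1, 0, 100, 108, 149},
        {2, 1, 1, 100, 108, 152},
        {1, 2, 0, 106, 120, 200},
        {1, 2, 1, 106, 122, 205},
    };

    // Returns {P50, P90, P99} deltas in ms for one combination of sub-queries.
    static int[] delta(int vectorQueries, int termQueries) {
        int[] seq = find(vectorQueries, termQueries, 0);
        int[] par = find(vectorQueries, termQueries, 1);
        return new int[] {par[3] - seq[3], par[4] - seq[4], par[5] - seq[5]};
    }

    private static int[] find(int vectorQueries, int termQueries, int parallel) {
        for (int[] row : ROWS) {
            if (row[0] == vectorQueries && row[1] == termQueries && row[2] == parallel) {
                return row;
            }
        }
        throw new IllegalArgumentException("no such row");
    }

    public static void main(String[] args) {
        for (int[] shape : new int[][] {{1, 1}, {2, 1}, {1, 2}}) {
            int[] d = delta(shape[0], shape[1]);
            System.out.printf("vector=%d term=%d: P50 %+d ms, P90 %+d ms, P99 %+d ms%n",
                shape[0], shape[1], d[0], d[1], d[2]);
        }
    }
}
```

In these rows the parallel variant is never faster: P50 is identical in every case, and the largest gap is 5 ms at P99.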

VijayanB · 2024-06-06T20:22:46Z

Can you share rough benchmark results? Is the latency the same or worse with parallel get scores than without?

I ran 10 experiments with and without this change. P50 looks the same, whereas I see slight degradation in P90 and P99 (a few milliseconds) in 8 out of 10 experiments. This is also consistent with the took time. However, I can take a follow-up task to look at how to tune this and improve it further.

VijayanB · 2024-06-06T21:19:07Z

I added the skip-changelog label; when merging to main, I will group it into one entry.


Remove parallelization while collecting hybrid scores

1b48281

This parallelization is not adding any value after comparing the benchmarks with and without this optimization. Hence removing it. Signed-off-by: Vijayan Balasubramanian <[email protected]>

VijayanB requested review from heemin32, navneet1v, vamshin, jmazanec15, naveentatikonda, junqiu-lei, martin-gaievski, sean-zheng-amazon, model-collapse, zane-neo, ylwu-amzn, jngz-es, vibrantvarun and zhichao-aws as code owners June 6, 2024 20:02

VijayanB added the skip-changelog label Jun 6, 2024

martin-gaievski approved these changes Jun 6, 2024


naveentatikonda approved these changes Jun 6, 2024


VijayanB merged commit ed3b974 into opensearch-project:feature/parallelize-hybrid-search Jun 6, 2024
86 of 102 checks passed
