Add hybrid search blog #2182

Merged on Oct 4, 2023 (18 commits)

Conversation

kolchfa-aws
Collaborator

Description

Adds hybrid search blog

Issues Resolved

Fixes #1872

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.

Signed-off-by: Fanit Kolchina <[email protected]>
---
layout: post
title: Hybrid search is generally available in OpenSearch 2.10
authors:
Member

I think we need to add Vamshi as well. @navneet1v, what do you think?

Contributor

Yes, let's add him too.

Signed-off-by: Fanit Kolchina <[email protected]>
Contributor

This image has a comment on it; let's fix the image.

Comment on lines 363 to 368
* Executing individual queries in parallel.
* Adding more configuration options and parameters to the normalization processor to allow more control over combined results. For instance, we can add the ability to specify a minimal score for documents to be returned in the results, which will avoid returning non-competitive hits.
* Supporting results pagination.
* Supporting filters in the hybrid query clause. It’s possible to define a filter for each inner query individually, but it’s not optimal if a filter condition is the same for all inner queries.
* Adding more benchmark results for larger datasets so we can provide recommendations for using hybrid search in various configurations.

Contributor

We have GitHub issues for all of these. @martin-gaievski, please link those issues here.
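
For readers following the second bullet in the quoted list, the idea of a minimum score on combined results can be sketched in plain Python. This is an illustration only, not the normalization processor's code: min-max normalization and an arithmetic mean are assumed as the techniques, and `min_max_normalize`, `combine`, and the `min_score` parameter are hypothetical names chosen for this sketch.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one sub-query's scores to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every hit as a top hit.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def combine(
    sub_query_scores: List[Dict[str, float]],
    min_score: float = 0.0,  # hypothetical threshold, not an existing option
) -> List[Tuple[str, float]]:
    """Normalize each sub-query, average per document, drop weak hits."""
    normalized = [min_max_normalize(s) for s in sub_query_scores]
    doc_ids = set().union(*normalized)
    combined = {
        # A document missing from a sub-query contributes 0 for that query.
        doc_id: sum(n.get(doc_id, 0.0) for n in normalized) / len(normalized)
        for doc_id in doc_ids
    }
    kept = [(d, s) for d, s in combined.items() if s >= min_score]
    # Highest combined score first; ties broken by document ID.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


keyword_hits = {"a": 10.0, "b": 5.0, "c": 0.0}  # made-up keyword scores
vector_hits = {"a": 0.2, "b": 0.9, "d": 0.5}    # made-up vector scores
results = combine([keyword_hits, vector_hits], min_score=0.3)
```

With the made-up scores above, documents `c` and `d` fall below the threshold and are dropped, which is the "non-competitive hits" behavior the bullet describes.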

Contributor

@navneet1v navneet1v left a comment

Added small comments. Overall looks good.

Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Collaborator

@natebower natebower left a comment

@kolchfa-aws Please see my comments and changes and push to @pajuric once addressed. Thanks!


## References

1. The ABCs of semantic search in OpenSearch: Architectures, benchmarks, and combination strategies, https://opensearch.org/blog/semantic-science-benchmarks.
Collaborator

Suggested change
1. The ABCs of semantic search in OpenSearch: Architectures, benchmarks, and combination strategies, https://opensearch.org/blog/semantic-science-benchmarks.
1. _The ABCs of semantic search in OpenSearch: Architectures, benchmarks, and combination strategies_. https://opensearch.org/blog/semantic-science-benchmarks.


## References

1. The ABCs of semantic search in OpenSearch: Architectures, benchmarks, and combination strategies, https://opensearch.org/blog/semantic-science-benchmarks.
Collaborator

As I've done here, let's put the titles in italics and separate the titles from the links with a period.

kolchfa-aws and others added 2 commits September 26, 2023 12:26
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
@kolchfa-aws
Collaborator Author

@pajuric This blog has been through editorial review and I've addressed all comments. The only thing missing is the meta/keywords; otherwise it's ready to publish. Thanks!

date: 2023-09-21
categories:
- technical-posts
meta_keywords:

@pajuric pajuric Oct 3, 2023

Please update with the following meta:

meta_keywords: Improve search relevance, hybrid search in OpenSearch 2.10, semantic and keyword search
meta_description: Improve search relevance with OpenSearch 2.10 when you tune search relevance by using hybrid search to combine and normalize query relevance scores.

Collaborator Author

@pajuric Done, thank you!

Signed-off-by: Fanit Kolchina <[email protected]>
@pajuric

pajuric commented Oct 4, 2023

@krisfreedain @dtaivpp - We are GTG on pushing this blog live.

minor edits to date and hyperlinks
Member

@krisfreedain krisfreedain left a comment

LGTM

@krisfreedain krisfreedain merged commit a639e19 into opensearch-project:main Oct 4, 2023
3 of 4 checks passed
@navneet1v
Contributor

When is the blog going live?

@krisfreedain krisfreedain mentioned this pull request Oct 4, 2023
1 task

The following table provides further details of the test datasets used for benchmarking.

|Dataset |Average query length |Average query length |Average query length |Average query length |Average query length |Average query length |
Contributor

The headings are all the same. :( This needs to be fixed.

Successfully merging this pull request may close these issues.

[Blog] Improved Hybrid Search relevancy with Normalized score combination
6 participants