DYN-5795 Lucene Search Weights #14062

Merged
merged 1 commit into from
Jun 13, 2023

@RobertGlobant20 (Contributor) commented Jun 7, 2023

Purpose

Minor changes in the Lucene Search functionality
The hard-coded field names were moved to the Configurations class, and every place that used those names now references it. In the SearchViewModel.Search() method I also made minor changes to handle the wildcard expression *keyword*.
The following fields were removed because they are used in neither the Legacy Search nor the Lucene Search: "InputParameters", "OutputParameters", "PackageName", "PackageVersion".

TODO - There is still one piece of functionality that I think should be implemented (but I am not sure about it):
In the Legacy Search, each keyword (SearchKeywords) is assigned a specific weight between 0.0 and 1.0 (SearchKeywordsWeight) at indexing time, and those weights are used to sort the results when the query runs. In the Lucene Search, by contrast, we assign a fixed weight to all the tags (see image attached). To fix this, we need to convert the SearchKeywords weights from the 0.0-1.0 scale to a 1-10 scale and set the right Boost value for each word. I think this should be implemented in the string CreateSearchQuery(string[] fields, string searchKey) method.
image
image
@reddyashish
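
The weight conversion described in the TODO could be sketched roughly as follows. This is only an illustration: the PR does not say how the 0.0-1.0 weights should map to the 1-10 scale, so a linear mapping is assumed here, and `KeywordBoost` and `ToBoost` are hypothetical names that do not exist in the codebase.

```csharp
using System;

public static class KeywordBoost
{
    // Bounds taken from the PR description: legacy weights are 0.0-1.0,
    // Lucene boosts should land on a 1-10 scale.
    private const double MinBoost = 1.0;
    private const double MaxBoost = 10.0;

    // Hypothetical helper: linearly rescales a legacy keyword weight to a boost.
    // The linear mapping is an assumption, not something the PR specifies.
    public static float ToBoost(double legacyWeight)
    {
        // Treat NaN as 0 and clamp, so bad input cannot yield a boost outside 1-10.
        double w = double.IsNaN(legacyWeight)
            ? 0.0
            : Math.Max(0.0, Math.Min(1.0, legacyWeight));
        return (float)(MinBoost + w * (MaxBoost - MinBoost));
    }
}
```

With this mapping a weight of 0.0 becomes a boost of 1, a weight of 0.5 becomes 5.5, and a weight of 1.0 becomes 10. The result could then be assigned to the Boost of the query built for that keyword.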

Declarations

Check these if you believe they are true

  • The codebase is in a better state after this PR
  • Is documented according to the standards
  • The level of testing this PR includes is appropriate
  • User facing strings, if any, are extracted into *.resx files
  • All tests pass using the self-service CI.
  • Snapshot of UI changes, if any.
  • Changes to the API follow Semantic Versioning and are documented in the API Changes document.
  • This PR modifies some build requirements and the readme is updated
  • This PR contains no files larger than 50 MB

Release Notes

Minor changes in the Lucene Search functionality

Reviewers

@QilongTang @reddyashish

FYIs

The hard-coded field names were moved to the Configurations class, and every place that used those names now references it.
In the SearchViewModel.Search() method I also made minor changes to handle the wildcard expression *keyword*.
@RobertGlobant20 (Contributor, Author)

GIF showing behavior of Lucene Search
LuceneSearchWildcards

@QilongTang QilongTang added this to the 2.19.0 milestone Jun 8, 2023
/// <summary>
/// This represent the fields that will be indexed when initializing Lucene Search
/// </summary>
public enum IndexFieldsEnum

I think this is a fine place for now. I may move these to a dedicated Lucene config file later in my PR.
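
For readers following along, here is a minimal sketch of how an enum like this can replace string literals at the call sites. The two members shown are an assumed, illustrative subset (the full member list is not visible in this conversation), and `FieldBoosts.BoostFor` is a hypothetical helper. The boost values 5 and 2 are the ones that appear in the diff snippet in this PR.

```csharp
using System;

// Illustrative subset only; the real enum in the PR may list different members.
public enum IndexFieldsEnum
{
    Name,
    SearchKeywords
}

public static class FieldBoosts
{
    // Hypothetical helper. nameof keeps the comparison in sync with the enum
    // if a member is ever renamed, unlike a hard-coded "Name" literal.
    public static float BoostFor(string field)
    {
        return field.Equals(nameof(IndexFieldsEnum.Name), StringComparison.Ordinal)
            ? 5f
            : 2f;
    }
}
```

Here `FieldBoosts.BoostFor(nameof(IndexFieldsEnum.Name))` yields the higher boost and any other field name yields the lower one.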

wildcardQuery = new WildcardQuery(new Term(f, s + "*"));
if (f.Equals("Name")) { wildcardQuery.Boost = 5; }
else { wildcardQuery.Boost = 2; }
wildcardQuery = new WildcardQuery(new Term(f, "*" + s + "*"));

This could probably benefit from some comments.
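
As a possible starting point for those comments, the difference between the two wildcard shapes in the snippet can be illustrated with plain string checks. This is a sketch that stands in for Lucene's term matching, not the Lucene.NET API itself; `WildcardShapes` and its methods are hypothetical names.

```csharp
using System;

public static class WildcardShapes
{
    // Mirrors the pattern s + "*": the indexed term must start with the keyword.
    public static bool MatchesPrefix(string term, string keyword)
    {
        return term.StartsWith(keyword, StringComparison.Ordinal);
    }

    // Mirrors the pattern "*" + s + "*": the keyword may appear anywhere in the term.
    public static bool MatchesContains(string term, string keyword)
    {
        return term.IndexOf(keyword, StringComparison.Ordinal) >= 0;
    }
}
```

For the term "boundingbox" and the keyword "box", `MatchesPrefix` is false while `MatchesContains` is true, which is the kind of result the *keyword* form adds. One trade-off worth noting in a comment: a leading wildcard generally makes Lucene examine more terms in the index than a prefix-only pattern does.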

@QilongTang QilongTang merged commit 30fd0a1 into DynamoDS:master Jun 13, 2023

3 participants