-
Notifications
You must be signed in to change notification settings - Fork 636
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DYN-5806 lucene node autocomplete #14169
DYN-5806 lucene node autocomplete #14169
Conversation
Added functionality for enabling the use of Lucene Search in node autocomplete, also I've modified the LuceneSearchUtility class for supporting in-memory indexing.
Changes for indexing (in memory RAMDirectory) a subset of nodes every time that the node autocomplete search functionality is executed so the search will be executed using only the subset of nodes indexed (and not over all the 3000 nodes).
Adding brackets in else statement and removing the Distinct() due that all the nodes should be different then there is no need to use this function
src/DynamoCoreWpf/ViewModels/Search/NodeAutoCompleteSearchViewModel.cs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with one more comment
Rolling back the code for SearchViewModel.Search() due that tests are using this method (Lucene indexing is not done when tests are executed - IsTestMode = true). Also I've modified the part in which Node Autocomplete is initialized so it will calling the Lucene search to initialize the results.
Removing unnecessary Distinct()
@RobertGlobant20 @zeusongit Any idea why the BinDiff job would fail? |
The job fails when only a part of it is re-run, because it depends on the attempt ID, re running all the the jobs in that workflow fixed it. btw @QilongTang is that you? ;) |
hi @RobertGlobant20 I think there are a lot regressions from this PR so you need to take another look of the edge cases causing a crash I think |
Trying to fix regressions generated when trying to initialize NodeAutocomplete with Lucene Search.
I've added an extra validation for checking if the StorageType is RAMDirectory or FSDirectory otherwise it won't do things like create the indexWriter or get a new Lucene Document (due that previously were skipped if IsTestMode is True).
Fixing comment
it was failing due that when IsTestMode = True several things are not initialized but because for Node Autocomplete we are using RAMDirectory for indexing then doesn't matter the isTestMode flag. |
Please consider that all this errors were happening due that I was using the NodeAutocomplete Search implementation for pre-populating the candidates (we are only indexing a small subset of node in Memory) then the fix was using the normal Lucene Search which has all the nodes. |
// Initialize Lucene index writer, unless in test mode. | ||
if (!DynamoModel.IsTestMode) | ||
// Initialize Lucene index writer, unless in test mode or we are using RAMDirectory for indexing info. | ||
if (!DynamoModel.IsTestMode || currentStorageType == LuceneStorage.RAM) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this condition currentStorageType == LuceneStorage.RAM
here, when we are not using any files to index for this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, we need the condition otherwise a specific test is failing.
Basically I've added this validation because in the test SearchNodeAutocompletionSuggestions() is doing th next actions:
- Populate Node Autocomplete with candidates
- Search the text "ar" using Node Autocomplete Search (over a small set of node and using Lucene Search with RAMDirectory).
- Check that we are getting 5 results.
So if you check this piece of code is creating the writer then if IsTestMode is True (due that is executing the text) then the writer won't be created and when searching it will return null or crash (meaning that no results were found), the same case for the validation added in the method InitializeIndexDocumentForNodes(), if we don't check the currentStorageType variable then is always returning null and is crashing.
Just to clarify things: remember that with NodeAutocomplete indexing process we are using RAMDirectory (just for the subset of nodes) so the IsTestMode flag doesn't matter any more (there is no problem with concurrency and will be executing the search over the subset of nodes indexed), that's why we have to check for the currentStorageType variable.
At the end we will be removing all those conditions related to IsTestMode and currentStorageType but we need to refactor the code so that when tests are being executed then the indexing is done using RAMDirectory otherwise we will be using FSDirectory.
Please let me know if this explanation is clear or we need to check the implementation in detail
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks I think that matches my expectation, we should refactor the code soon for unit testing
@@ -248,7 +269,7 @@ internal string CreateSearchQuery(string[] fields, string SearchTerm) | |||
internal void DisposeWriter() | |||
{ | |||
//We need to check if we are not running Dynamo tests because otherwise parallel test start to fail when trying to write in the same Lucene directory location | |||
if (!DynamoModel.IsTestMode) | |||
if (!DynamoModel.IsTestMode || currentStorageType == LuceneStorage.RAM) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The answer is similar than above, with AutoComplete we are indexing and searching for both cases:
- Dynamo execution
- Tests execution
so for Node Autocomplete the writer needs to be disposed (doesn't matter if we are in test mode or not)
re-triggering pullRequestValidation job
@QilongTang seem that those regressions were NOT real, I sent an execution manually and I didn't get errors. |
Purpose
Implementing the Lucene Search in Node Autocomplete Search funcionality
Added functionality for enabling the use of Lucene Search in node autocomplete, I've modified the LuceneSearchUtility class for supporting in-memory indexing.
Also I did changes for indexing (in memory RAMDirectory) a subset of nodes every time that the node autocomplete search functionality is executed so the search will be executed using only the subset of nodes indexed (and not over all the 3000 nodes).
Declarations
Check these if you believe they are true
*.resx
filesRelease Notes
Implementing the Lucene Search in Node Autocomplete Search funcionality
Reviewers
@QilongTang @reddyashish
FYIs