Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lucene index amend improvements #14513

Merged
merged 16 commits into from
Oct 27, 2023
Merged

Lucene index amend improvements #14513

merged 16 commits into from
Oct 27, 2023

Conversation

QilongTang
Copy link
Contributor

@QilongTang QilongTang commented Oct 23, 2023

Purpose

Lucene index improvements to make sure index locks can always be released and amended when needed by multiple Dynamo sessions. Index lifetime diagrams as below:

Dynamo 2 19Index Process drawio

Dynamo 3 0Index Process drawio

What this PR includes:

  • Add getter for luceneSearchUtility.DirReader so we can remove the setting in each Dynamo class
  • User logger from DynamoModel to avoid threading issues when index lock gain timeout exception happens
  • Dispose indexwriter after package install and package path updates so node index are unlocked right away
  • Re-create indexwriter when asked if it is not available after DynamoModel initialization

Here is what it looks like:
SideBySideDynamoSearch

Declarations

Check these if you believe they are true

  • The codebase is in a better state after this PR
  • Is documented according to the standards
  • The level of testing this PR includes is appropriate
  • User facing strings, if any, are extracted into *.resx files
  • All tests pass using the self-service CI.
  • Snapshot of UI changes, if any.
  • Changes to the API follow Semantic Versioning and are documented in the API Changes document.
  • This PR modifies some build requirements and the readme is updated
  • This PR contains no files larger than 50 MB

Release Notes

Multiple Dynamo sessions can use search independently

Reviewers

(FILL ME IN) Reviewer 1 (If possible, assign the Reviewer for the PR)

(FILL ME IN, optional) Any additional notes to reviewers or testers.

FYIs

(FILL ME IN, Optional) Names of anyone else you wish to be notified of

@QilongTang QilongTang marked this pull request as ready for review October 23, 2023 23:28
@QilongTang QilongTang added the DNM Do not merge. For PRs. label Oct 23, 2023
@QilongTang
Copy link
Contributor Author

Opened for running regressions, still WIP

@QilongTang QilongTang added this to the 3.0 milestone Oct 23, 2023
@@ -147,12 +153,12 @@ internal void CreateLuceneIndexWriter()
{

DisposeWriter();
(ExecutionEvents.ActiveSession.GetParameterValue(ParameterKeys.Logger) as DynamoLogger).LogError($"LuceneNET LockObtainFailedException {ex}");
dynamoModel.Logger.LogError($"LuceneNET LockObtainFailedException {ex}");
Copy link
Contributor Author

@QilongTang QilongTang Oct 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to revert this because I found in edge case if the index is ever locked (although would rarely happen after all the improvements in this PR), the previous code will make Dynamo hang because of threading issues

Copy link
Contributor

@reddyashish reddyashish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a comment

@QilongTang QilongTang removed the DNM Do not merge. For PRs. label Oct 25, 2023
@@ -366,7 +367,7 @@ private void RefreshBackupFileList(IEnumerable<string> filePaths)
IEnumerable<string> filePaths)
{
files.Clear();
foreach (var filePath in filePaths)
foreach (var filePath in filePaths.Where(x => x != null))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is throwing a bunch of exceptions so filtering out the nulls from the list

@@ -37,6 +37,9 @@ public static Exception CreateFolderIfNotExist(string folderPath)
{
try
{
// Do not even try when folder path is null or empty.
// This usually happens when system folder dialog is initialized with empty path
if (string.IsNullOrEmpty(folderPath)) return null;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was throwing exception when folder path is null

@QilongTang
Copy link
Contributor Author

I am considering adding getter for dirReader and Searcher because they are always the same code so we do not have to leak this Lucence specific code everywhere in the Dynamo code base

@QilongTang
Copy link
Contributor Author

Hi @reddyashish This should be ready for a final review

@QilongTang
Copy link
Contributor Author

There are some new regressions that I am taking a look

Copy link
Contributor

@reddyashish reddyashish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the regressions are fixed.

@QilongTang
Copy link
Contributor Author

LGTM once the regressions are fixed.

Looks like it is my last commit about refactoring cause the regressions, looking at it

@@ -395,8 +403,8 @@ internal void DisposeWriter()
/// </summary>
internal void DisposeAll()
{
writer?.Dispose();
dirReader?.Dispose();
DisposeWriter();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this was causing the regressions, because writer was disposed first so we need to set it to null, otherwise in the DirReader getter, we will be trying to access an already disposed object

@QilongTang
Copy link
Contributor Author

Regressions fixed, waiting for PR validation task to pass

@QilongTang
Copy link
Contributor Author

All tests passing, merging

@QilongTang QilongTang merged commit 4f279a9 into master Oct 27, 2023
25 checks passed
@QilongTang QilongTang deleted the LuceneIndexAmend branch October 27, 2023 18:27
QilongTang added a commit that referenced this pull request Mar 4, 2024
* Use index writer to amend index after Dynamo Launch

* Update sorting

* Update

* Update

* Update

* Make sure package loading end will already release index lock

* Update Comments

* Clean Up

* Code clean up

* clean up

* Code Clean Up

* regressions

* revert code clean up because it affects running tests in parallel

* update
@QilongTang QilongTang mentioned this pull request Mar 4, 2024
9 tasks
QilongTang added a commit that referenced this pull request Mar 5, 2024
* DYN-6038 Lucene search IMP (#14428)

* improvements

* Simply initialization logic

* Skip indexing as part of DynamoModel if the index files already exist

* Remove DynamoModel reference and use singleton

* clean up

* improvements and package manager search

* Use single constructor and other code clean up

* Update...

* Update

* Dispose all Lucene objects in the correct order

* Adding DynamoModel back to test regressions

* Null check

* Update

* IndexWriter dispose sequence only for non-RAM mode

* Update

* Lucene index amend improvements (#14513)

* Use index writer to amend index after Dynamo Launch

* Update sorting

* Update

* Update

* Update

* Make sure package loading end will already release index lock

* Update Comments

* Clean Up

* Code clean up

* clean up

* Code Clean Up

* regressions

* revert code clean up because it affects running tests in parallel

* update

* update

* Update AssemblySharedInfo.cs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants