Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add additional logs to show progress of reindexing operation. #3746

Merged
merged 2 commits into from
Aug 16, 2019

Conversation

martinmr
Copy link
Contributor

@martinmr martinmr commented Aug 2, 2019

The logs are V(1) like the rest of the reindexing logs.

Fixes #3741


This change is Reviewable

The logs are V(1) like the rest of the reindexing logs.
@martinmr martinmr requested a review from manishrjain as a code owner August 2, 2019 19:50
Copy link

@gitlw gitlw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 0 of 1 files reviewed, all discussions resolved (waiting on @manishrjain)

Copy link
Contributor

@danielmai danielmai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logs are a bit too noisy. I tried doing the following:

  1. Load up 21-million movie data set.

  2. Reindex the name predicate by adding the exact index:

    curl localhost:8180/alter -d 'name: string @index(hash, term, exact, fulltext, trigram) @lang .'
    

I saw over 2000 lines of log output about the reindexing progress in the alpha logs over the timespan of 1m23s.

Maybe decrease how often we print out the logs (say, every 100k keys) or output a log every so often e.g., every 5 seconds.

Reviewable status: 0 of 1 files reviewed, all discussions resolved (waiting on @manishrjain)

@danielmai
Copy link
Contributor

danielmai commented Aug 3, 2019

See attached alpha1.log for the progress logs when reindexing name in 21-million movie data set.

Copy link
Contributor

@danielmai danielmai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also already see logs like these that mention the index rebuilding process and time elapsed. This comes from posting/index.go:

Rebuilding index for predicate name Time elapsed: 37s, bytes sent: 0 B, speed: 0 B/sec

Reviewable status: 0 of 1 files reviewed, all discussions resolved (waiting on @manishrjain)

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not merge this as it is. See my comments. Also, @danielmai 's comments.

Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion (waiting on @manishrjain and @martinmr)


posting/index.go, line 531 at r1 (raw file):

		}

		counterChan <- struct{}{}

Pushing on a channel for every key is very contentious. This would single-handedly kill performance.

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the logs during the stream phase since the stream framework is already providing logs and there's no way to do this without having to use some kind of contention to accurately count the number of keys processed.

I added logs for the phase that writes the deltas to disk. I also updated the logs to all have the same prefix ("Rebuilding index for predicate ...").

Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion (waiting on @manishrjain)


posting/index.go, line 531 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Pushing on a channel for every key is very contentious. This would single-handedly kill performance.

Removed this code.

@martinmr martinmr requested a review from manishrjain August 6, 2019 17:11
@martinmr martinmr dismissed manishrjain’s stale review August 13, 2019 23:53

addressed comments

Copy link

@gitlw gitlw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: 0 of 1 files reviewed, 1 unresolved discussion (waiting on @manishrjain)

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r2.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@martinmr martinmr merged commit 38b8ccd into master Aug 16, 2019
@martinmr martinmr deleted the martinmr/reindex-progress-log branch August 16, 2019 17:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Add progress log for reindexing process.
4 participants