Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize search (8x speedup) #2415

Merged
merged 17 commits into from
Feb 5, 2024

Conversation

LilithHafner
Copy link
Contributor

@LilithHafner LilithHafner commented Jan 20, 2024

  • Move main search workload to a web worker to keep the main thread responsive
  • Move some of search result display to web worker for further offloading and to reduce data transfer
  • Given this offloading, the debouncing is no longer necessary, even on very slow machines. Remove it.
  • Remove slow .forEach loop
  • Limit result display to 200 results
  • Store unfiltered results so that adjusting filters does not require re-running the main search workload
  • fewer lines of code, more lines of docs

Drops end to end runtime from 1872ms to 232ms with a 6x cpu throttle.

See also
Supersedes & closes #2412 (this PR's refactor and offloading from main should address @Hetarth02's blocking objection to #2412. "I did observe that characters typed did not appear instantly when debounce was disabled" is no longer the case with this PR)
Supersedes & closes #2407 (though this PR is very simple and could be merged anyway)
Fixes #2411

- Move main search workload to a web worker to keep the main thread responsive
- Move some of search result display to web worker for further offloading and to reduce data transfer
- Given this offloading, the debouncing is no longer necessary, even on very slow machines. Remove it.
- Remove slow .forEach loop
- Limit result display to 200 results
- Store unfiltered results so that adjusting filters does not require re-running the main search workload

Drops end to end runtime from 1872ms to 232ms with a 6x cpu throttle.
@LilithHafner LilithHafner changed the title Optimize search Optimize search (8x speedup) Jan 20, 2024
@Hetarth02
Copy link
Contributor

I think there was a conversation in regards to not limit the search results to a certain number. (CC @mortenpi )

@LilithHafner
Copy link
Contributor Author

Some factors that went into the choice in this PR to set a limit of 200 results

  • The index of docs.julialang.org has about 4000 unique entries and it's pretty common to return more than half of them in any given search, which indicates that the existing threshold is too permissive at times (though not always)
  • I prefer to supplement the "search value" threshold with an absolute maximum rather than increase the existing threshold because, independent of search query or package, a user is unlikely to want to scroll through more than 200 results, and if there are only a few results, it makes sense to display them all even if they have a fairly low value.
  • No matter how we handle the search backend, it is impossible to guarantee smooth performance with a naive infinite scroll with an unbounded number of elements, and I didn't want to embark on making a more sophisticated invisibly paged infinite scroll.

For comparison, Bing and Google return about 7–10 results at a cursory inspection, though each of their results are a little larger and contain more content than each of our results. They also provide the option to continue paging through results indefinitely while this PR does not provide any way to access search results past #2OO. I personally have never scrolled past the first few pages of about 10 results, but other users may have a different experience. I also haven't used the new search very much.

@LilithHafner LilithHafner marked this pull request as ready for review January 22, 2024 13:40
@LilithHafner
Copy link
Contributor Author

The CI failure says " [warn] Code style issues found in the above file. Forgot to run Prettier?" but does not state what "Prettier" is, provide instructions for how to run it, or state what style changes it requests. "Prettier" returns no results when searching the Documenter.jl docs.

@Hetarth02
Copy link
Contributor

Hetarth02 commented Jan 22, 2024

If you use vscode, there should be a plugin for prettier. Install it and then save the file it should automatically detect .prettierrc file and format accordingly.

If you use editors other than vscode then also there might be plugins available for prettier.

Documenter uses prettier to ensure a consistent style/code formatting.

@LilithHafner
Copy link
Contributor Author

Bump; this is ready to go from my end.

Copy link

@cpfiffer cpfiffer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing here stands out to me as being obviously slow or wrong, and I really appreciate the idea here. The execution is solid -- I'd merge this in.

@mortenpi mortenpi added Type: Enhancement Format: HTML Related to the default HTML output labels Feb 3, 2024
Copy link
Member

@mortenpi mortenpi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this looks pretty neat. Goes from visible lag to effectively instantaneous response.

Limiting to 200 entries seems very reasonable to me. I don't think anyone will scroll through 4000 results, so I agree we don't need the complexity of infinite scrolling or something like that.

Pretty much happy with this as is, but just a few things:

  1. Needs a note in CHANGELOG.md.
  2. The filter badges don't seem to color up anymore when you press them. I presume some class application might have gotten lost.
screen-2024-02-03_22.16.26.mp4
  1. It also seems like the filter badges are now "exclusive", i.e. you can't toggle multiple on at the same time? I'm okay with that change, if that's intentional, but we should note that in the CHANGELOG as well.
  2. The Minisearch dependency should also be excised from the HTMLWriter code if we're going to hard-code it in. I.e.
    const minisearch = RemoteLibrary("minisearch", "https://cdn.jsdelivr.net/npm/[email protected]/dist/umd/index.min.js")
    and the relevant references in the Julia code.

assets/html/js/search.js Show resolved Hide resolved
@LilithHafner
Copy link
Contributor Author

LilithHafner commented Feb 3, 2024

  1. The filter badges don't seem to color up anymore when you press them. I presume some class application might have gotten lost.
  2. It also seems like the filter badges are now "exclusive", i.e. you can't toggle multiple on at the same time? I'm okay with that change, if that's intentional, but we should note that in the CHANGELOG as well.

I accidentally made it so that reloading search results automatically clears all filters. So when you click a filter button, it lights up and then is immediately (i.e. before the frame renders) blanked out when the new results render.

In retrospect the exclusive filter semantics, while accidentally introduced by this bug, seem like a good idea and worth keeping. I've added them to the changelog.

I also noticed that there is a

VM1536:1 Uncaught SyntaxError: Unexpected identifier 'into'error that occurs when searching in 901fdba but not in either of it's parent commits.

EDIT: that error is #2441

Copy link
Member

@mortenpi mortenpi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now, thank you @LilithHafner. Took the liberty of slightly editing the CHANGELOG. I think it's time to merge this.

@mortenpi mortenpi enabled auto-merge (squash) February 5, 2024 00:33
@mortenpi mortenpi merged commit c3c8754 into JuliaDocs:master Feb 5, 2024
18 of 21 checks passed
@LilithHafner LilithHafner deleted the lh/search-optimization branch February 5, 2024 19:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Format: HTML Related to the default HTML output Type: Enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

De-selecting a search category is sometimes ignored. [bug] [low priority]
4 participants