[api-minor] Add support, in PDFFindController, for mixing phrase/word searches (issue 7442) #16247

Merged: 1 commit merged into mozilla:master from the issue-7442 branch on Apr 16, 2023

Conversation

Snuffleupagus
Collaborator

Please note: This patch only extends the PDFFindController implementation itself to support this functionality; it's purposely not exposed in the default viewer.

This replaces the previous phraseSearch parameter; a query string will now always be interpreted as a phrase search.
To enable searching for individual words, the query parameter must instead be an Array of strings. This also makes it possible to combine phrase and word searches: a query parameter such as ["Lorem ipsum", "foo", "bar"] will search for the phrase "Lorem ipsum" and for the individual words "foo" and "bar".
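
To make the string-versus-Array semantics concrete, here is a minimal, self-contained sketch. It is not the code from this patch: `findMatches` and `escapeRegExp` are hypothetical helpers, and the real PDFFindController does considerably more (text normalization, diacritics, whole-word matching). The sketch only assumes that a string is treated as one phrase and that an Array is treated as a set of alternatives.

```javascript
// Toy illustration only; names and behavior are assumptions, not pdf.js API.

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// A string query is one phrase; an Array query is a list of phrases/words.
function findMatches(text, query) {
  const terms = Array.isArray(query) ? query : [query];
  const source = terms
    .filter(term => term.length > 0)
    // Longest first, so a phrase wins over a shorter term that it contains.
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join("|");
  if (!source) {
    return [];
  }
  // Case-insensitive matching is an assumption made for this sketch.
  const re = new RegExp(source, "gi");
  return Array.from(text.matchAll(re), m => ({ index: m.index, match: m[0] }));
}

const text = "Lorem ipsum dolor foo, bar ipsum";
// Mixed search: the phrase "Lorem ipsum" plus the words "foo" and "bar".
const mixed = findMatches(text, ["Lorem ipsum", "foo", "bar"]);
// Phrase-only search: a plain string.
const phraseOnly = findMatches(text, "Lorem ipsum");
```

With this input, `mixed` holds three matches (the phrase at index 0, "foo" at index 18, "bar" at index 23), and the trailing standalone "ipsum" is not matched because "ipsum" alone is not one of the terms.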

  this.eventBus.dispatch("findfromurlhash", {
    source: this,
-   query: params.get("search").replaceAll('"', ""),
-   phraseSearch: params.get("phrase") === "true",
+   query: phrase ? query : query.match(/\S+/g),

Check failure

Code scanning / CodeQL

SQL database query built from user-controlled sources (experimental)

(Experimental) This may be a database query that depends on [a user-provided value](1). Identified using machine learning.
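
For context on the flagged line, the `findfromurlhash` change quoted above can be sketched as a standalone function. This is a hedged reconstruction: `queryFromHash` is a hypothetical name, and it assumes that the `query` and `phrase` locals are derived from the `search` and `phrase` hash parameters in the same way the old object properties were.

```javascript
// Hypothetical helper, not part of pdf.js: turns a URL hash into the
// value that would be passed as the `query` of a find request.
function queryFromHash(hash) {
  const params = new URLSearchParams(hash.replace(/^#/, ""));
  if (!params.has("search")) {
    return null; // No search requested in the hash.
  }
  // Assumption: same derivation as the old properties in the diff excerpt.
  const query = params.get("search").replaceAll('"', "");
  const phrase = params.get("phrase") === "true";

  // A string means "phrase search"; an Array means "search for each word".
  // Note that String.prototype.match returns null if `query` contains
  // no non-whitespace characters.
  return phrase ? query : query.match(/\S+/g);
}

const asPhrase = queryFromHash("#search=Lorem ipsum&phrase=true"); // "Lorem ipsum"
const asWords = queryFromHash("#search=Lorem ipsum"); // ["Lorem", "ipsum"]
```

The design point is that the old boolean flag disappears from the event payload; the type of `query` alone now tells the find controller which kind of search is wanted.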
@Snuffleupagus Snuffleupagus force-pushed the issue-7442 branch 4 times, most recently from 3f4c0d7 to f8017d6 Compare April 4, 2023 13:11
@Snuffleupagus
Collaborator Author

@timvandermeij How do you feel about this PR? It fixes an old feature request and essentially only extends an existing feature (i.e. phraseSearch = false), but it does add some additional complexity to the find implementation.

@timvandermeij
Contributor

timvandermeij commented Apr 15, 2023

I didn't check the code in detail, but in general the approach looks good to me. Given that this is apparently quite a wanted feature, I'm OK with including it as long as it's not exposed in the default viewer. I'm much happier with this now that there is test coverage. That was missing from the previous patches I've seen for this functionality, and without it I feared regressions, since the find controller is not the easiest code.

@Snuffleupagus Snuffleupagus marked this pull request as ready for review April 15, 2023 11:33
@Snuffleupagus
Collaborator Author

/botio unittest

@pdfjsbot

From: Bot.io (Linux m4)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/55c9c3c329a947b/output.txt

@pdfjsbot

From: Bot.io (Windows)


Received

Command cmd_unittest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/0f447136c43cd3a/output.txt

@pdfjsbot

From: Bot.io (Linux m4)


Success

Full output at http://54.241.84.105:8877/55c9c3c329a947b/output.txt

Total script time: 2.51 mins

  • Unit Tests: Passed

@pdfjsbot

From: Bot.io (Windows)


Success

Full output at http://54.193.163.58:8877/0f447136c43cd3a/output.txt

Total script time: 10.92 mins

  • Unit Tests: Passed

@Snuffleupagus
Collaborator Author

/botio integrationtest

@pdfjsbot

From: Bot.io (Linux m4)


Received

Command cmd_integrationtest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.241.84.105:8877/a54f12369e9b8dd/output.txt

@pdfjsbot

From: Bot.io (Windows)


Received

Command cmd_integrationtest from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.193.163.58:8877/e9b61e69037227a/output.txt

@pdfjsbot

From: Bot.io (Linux m4)


Failed

Full output at http://54.241.84.105:8877/a54f12369e9b8dd/output.txt

Total script time: 4.22 mins

  • Integration Tests: FAILED

@pdfjsbot

From: Bot.io (Windows)


Failed

Full output at http://54.193.163.58:8877/e9b61e69037227a/output.txt

Total script time: 13.29 mins

  • Integration Tests: FAILED

@timvandermeij timvandermeij merged commit f46ed43 into mozilla:master Apr 16, 2023
@timvandermeij
Contributor

Thank you for implementing this!

Development

Successfully merging this pull request may close these issues.

Search PDF for phrases and terms simultaneously
3 participants