Specialize DisiPriorityQueue for the 2-clauses case. #14070

Merged
jpountz merged 5 commits into apache:main
Jan 28, 2025

Contributor

@jpountz jpountz commented Dec 16, 2024

Disjunctions with 2 clauses are rather common. Specializing this case enables some shortcuts.
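
To illustrate the kind of shortcut a 2-clause specialization allows, here is a minimal sketch. It is not the code from this PR: `TwoSlotQueue` and `Entry` are hypothetical names, and the sketch only assumes that entries are ordered by their current doc ID. With at most two entries, restoring order after the top entry advances takes one comparison and possibly a swap, instead of a general heap sift-down.

```java
/** Hypothetical sketch, not Lucene's actual code: a min-queue limited to 2 entries. */
final class TwoSlotQueue {
  /** Minimal stand-in for an iterator wrapper; only the current doc ID matters here. */
  static final class Entry {
    int doc;

    Entry(int doc) {
      this.doc = doc;
    }
  }

  private Entry top;   // entry with the smaller doc ID, or null if the queue is empty
  private Entry other; // entry with the larger doc ID, or null if fewer than 2 entries

  void add(Entry e) {
    if (top == null) {
      top = e;
    } else if (other == null) {
      other = e;
      updateTop(); // keep the smaller doc ID in `top`
    } else {
      throw new IllegalStateException("queue is limited to 2 entries");
    }
  }

  Entry top() {
    return top;
  }

  /** Call after changing top.doc: a single comparison replaces a heap sift-down. */
  Entry updateTop() {
    if (other != null && other.doc < top.doc) {
      Entry tmp = top;
      top = other;
      other = tmp;
    }
    return top;
  }
}
```

A caller would advance the top entry, then call `updateTop()` to find which of the two entries is now positioned on the smaller doc ID.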

@jpountz jpountz added this to the 10.2.0 milestone Dec 16, 2024
@jpountz
Contributor Author

jpountz commented Dec 16, 2024

Here are results on wikibigall. Several queries show a speedup with a very low p-value, notably the Combined* tasks and AndHighOrMedMed.

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                      OrHighRare      283.53      (4.2%)      277.97      (6.2%)   -2.0% ( -11% -    8%) 0.241
                        Wildcard       79.54      (3.8%)       78.68      (3.6%)   -1.1% (  -8% -    6%) 0.358
                      DismaxTerm      615.37      (2.9%)      608.82      (2.8%)   -1.1% (  -6% -    4%) 0.238
                   TermMonthSort     3963.49      (1.3%)     3926.24      (1.7%)   -0.9% (  -3% -    2%) 0.046
                DismaxOrHighHigh      122.08      (2.0%)      121.26      (3.1%)   -0.7% (  -5% -    4%) 0.418
                        PKLookup      283.63      (1.6%)      281.85      (2.5%)   -0.6% (  -4% -    3%) 0.351
                 FilteredPrefix3      132.74      (6.3%)      132.18      (6.6%)   -0.4% ( -12% -   13%) 0.836
                         Prefix3      138.60      (6.5%)      138.02      (6.9%)   -0.4% ( -12% -   13%) 0.843
             CountFilteredOrMany        8.60      (3.3%)        8.57      (2.7%)   -0.4% (  -6% -    5%) 0.668
                      TermDTSort      294.87      (4.6%)      294.24      (9.6%)   -0.2% ( -13% -   14%) 0.928
                      OrHighHigh       56.01      (4.8%)       55.94      (4.4%)   -0.1% (  -8% -    9%) 0.931
                          Fuzzy1       83.18      (2.5%)       83.10      (2.3%)   -0.1% (  -4% -    4%) 0.902
                            Term      494.29      (3.0%)      493.95      (3.7%)   -0.1% (  -6% -    6%) 0.947
                          Fuzzy2       78.47      (2.2%)       78.42      (2.0%)   -0.1% (  -4% -    4%) 0.923
                 DismaxOrHighMed      174.72      (1.9%)      174.71      (2.1%)   -0.0% (  -3% -    4%) 0.991
                       OrHighMed      202.12      (3.8%)      202.13      (3.5%)    0.0% (  -7% -    7%) 0.997
                     AndHighHigh       45.24      (1.7%)       45.24      (1.1%)    0.0% (  -2% -    2%) 0.988
                    FilteredTerm      156.36      (1.4%)      156.43      (1.4%)    0.0% (  -2% -    2%) 0.923
                   TermTitleSort      157.39      (2.3%)      157.47      (2.6%)    0.0% (  -4% -    5%) 0.954
      FilteredOr2Terms2StopWords      149.22      (1.2%)      149.33      (1.1%)    0.1% (  -2% -    2%) 0.825
             FilteredOrStopWords       43.36      (1.6%)       43.41      (1.7%)    0.1% (  -3% -    3%) 0.824
                     CountPhrase        4.26      (2.1%)        4.26      (1.5%)    0.2% (  -3% -    3%) 0.748
               TermDayOfYearSort      650.96      (2.2%)      652.28      (3.7%)    0.2% (  -5% -    6%) 0.832
              Or2Terms2StopWords      168.70      (3.5%)      169.20      (2.3%)    0.3% (  -5% -    6%) 0.750
                  FilteredPhrase       29.99      (1.5%)       30.11      (1.4%)    0.4% (  -2% -    3%) 0.394
             CountFilteredPhrase       25.30      (1.4%)       25.42      (1.5%)    0.5% (  -2% -    3%) 0.305
                FilteredOr3Terms      166.34      (1.4%)      167.18      (0.9%)    0.5% (  -1% -    2%) 0.165
               FilteredAnd3Terms      195.32      (1.7%)      196.32      (1.7%)    0.5% (  -2% -    4%) 0.345
                      AndHighMed      129.96      (1.8%)      130.71      (1.4%)    0.6% (  -2% -    3%) 0.258
     FilteredAnd2Terms2StopWords      197.65      (1.3%)      198.84      (1.4%)    0.6% (  -2% -    3%) 0.156
                AndMedOrHighHigh       60.53      (1.9%)       60.91      (1.6%)    0.6% (  -2% -    4%) 0.274
             And2Terms2StopWords      165.55      (2.3%)      166.57      (1.2%)    0.6% (  -2% -    4%) 0.287
                     CountOrMany        7.29      (8.8%)        7.34      (8.9%)    0.6% ( -15% -   20%) 0.817
               FilteredOrHighMed      154.91      (1.2%)      155.96      (0.7%)    0.7% (  -1% -    2%) 0.028
                       CountTerm    11039.26      (2.2%)    11114.73      (3.1%)    0.7% (  -4% -    6%) 0.418
         CountFilteredOrHighHigh       63.08      (2.2%)       63.53      (2.2%)    0.7% (  -3% -    5%) 0.295
              FilteredOrHighHigh       64.41      (1.5%)       64.91      (1.5%)    0.8% (  -2% -    3%) 0.098
                        Or3Terms      176.90      (3.7%)      178.38      (2.4%)    0.8% (  -5% -    7%) 0.399
                     OrStopWords       35.83      (5.8%)       36.16      (3.9%)    0.9% (  -8% -   11%) 0.547
              FilteredAndHighMed      129.99      (2.1%)      131.39      (1.6%)    1.1% (  -2% -    4%) 0.072
                          OrMany       19.85      (2.9%)       20.07      (2.2%)    1.1% (  -3% -    6%) 0.159
                    AndStopWords       32.29      (3.5%)       32.65      (1.4%)    1.1% (  -3% -    6%) 0.179
                CountAndHighHigh       54.45      (2.1%)       55.08      (2.2%)    1.2% (  -3% -    5%) 0.083
                       And3Terms      177.22      (2.8%)      179.33      (1.5%)    1.2% (  -3% -    5%) 0.095
            FilteredAndStopWords       47.32      (2.0%)       47.89      (2.4%)    1.2% (  -3% -    5%) 0.081
             FilteredAndHighHigh       62.16      (1.8%)       62.95      (2.0%)    1.3% (  -2% -    5%) 0.034
                 CountOrHighHigh       72.80      (8.9%)       73.83      (9.2%)    1.4% ( -15% -   21%) 0.621
                 CountAndHighMed      157.11      (2.6%)      159.49      (3.0%)    1.5% (  -4% -    7%) 0.092
                  CountOrHighMed      134.72      (6.7%)      136.79      (6.9%)    1.5% ( -11% -   16%) 0.476
                          Phrase       14.89      (4.5%)       15.14      (5.7%)    1.7% (  -8% -   12%) 0.303
                  FilteredOrMany       16.85      (3.5%)       17.17      (4.4%)    1.9% (  -5% -   10%) 0.132
              CombinedAndHighMed       55.18      (2.0%)       56.75      (2.4%)    2.8% (  -1% -    7%) 0.000
             CombinedAndHighHigh       15.23      (2.2%)       15.67      (2.5%)    2.9% (  -1% -    7%) 0.000
          CountFilteredOrHighMed       68.46      (1.9%)       70.53      (1.8%)    3.0% (   0% -    6%) 0.000
                    CombinedTerm       31.53      (2.6%)       32.52      (3.4%)    3.2% (  -2% -    9%) 0.001
               CombinedOrHighMed       71.83      (1.8%)       74.29      (2.0%)    3.4% (   0% -    7%) 0.000
              CombinedOrHighHigh       18.89      (1.8%)       19.55      (2.2%)    3.5% (   0% -    7%) 0.000
                  FilteredIntNRQ      112.01     (12.5%)      115.98     (16.1%)    3.5% ( -22% -   36%) 0.436
                          IntNRQ      112.95     (12.6%)      117.08     (16.2%)    3.7% ( -22% -   37%) 0.426
                 AndHighOrMedMed       45.28      (0.8%)       48.56      (1.0%)    7.3% (   5% -    9%) 0.000


This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Dec 31, 2024
Contributor

@gsmiller gsmiller left a comment

Looks good. Left one minor comment, but I think it's on pre-existing code so feel free to ignore if you like :)

if (w.doc == list.doc) {
list = prepend(w, list);
final int left = leftNode(i);
final int right = left + 1;

nit: should we use rightNode(left) since we have it defined?
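
For context, a sketch of what such index helpers usually look like for a binary heap stored in a 0-based array. This is an assumption about the layout, not a copy of the class under review: `HeapIndex` is a hypothetical holder, and `rightNode` is assumed to take the left child's index, as the suggested call `rightNode(left)` implies.

```java
/** Hypothetical sketch of child-index helpers for a 0-based array-backed binary heap. */
final class HeapIndex {
  private HeapIndex() {}

  /** Index of the left child of {@code node}. */
  static int leftNode(int node) {
    return 2 * node + 1;
  }

  /** Index of the right child, given the index of its left sibling. */
  static int rightNode(int leftNode) {
    return leftNode + 1;
  }
}
```

Under these assumptions `rightNode(left)` and `left + 1` compute the same index, so the suggestion is about readability rather than behavior.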

@github-actions github-actions bot removed the Stale label Jan 7, 2025
@mikemccand
Member

Hmm is this PR accidentally dying on the vine @jpountz?

@jpountz
Contributor Author

jpountz commented Jan 15, 2025

Not completely accidentally: I wanted to merge the more impactful changes I had on my plate before taking another look at the impact of this PR. I'll get back to it shortly.

@jpountz
Contributor Author

jpountz commented Jan 27, 2025

Here are updated benchmark results; there is still a speedup. I'll merge soon.

                            Task    QPS baseline      StdDev    QPS my_modified_version      StdDev                Pct diff p-value
                          IntNRQ      110.67     (13.1%)      108.35      (9.1%)   -2.1% ( -21% -   23%) 0.557
                  FilteredIntNRQ      109.23     (12.1%)      107.32      (8.9%)   -1.7% ( -20% -   21%) 0.605
                     OrStopWords       34.39      (6.7%)       34.06      (7.5%)   -1.0% ( -14% -   14%) 0.671
                            Term      473.90      (2.7%)      469.65      (3.5%)   -0.9% (  -6% -    5%) 0.364
              Or2Terms2StopWords      165.33      (3.6%)      164.35      (4.2%)   -0.6% (  -8% -    7%) 0.634
                      OrHighHigh       54.11      (3.6%)       53.80      (4.4%)   -0.6% (  -8% -    7%) 0.649
                      DismaxTerm      574.29      (2.6%)      571.25      (2.8%)   -0.5% (  -5% -    4%) 0.534
                 FilteredPrefix3      132.35      (3.4%)      131.73      (3.0%)   -0.5% (  -6% -    6%) 0.638
                        Or3Terms      171.96      (3.9%)      171.16      (4.2%)   -0.5% (  -8% -    8%) 0.722
                         Prefix3      138.18      (3.5%)      137.55      (3.1%)   -0.5% (  -6% -    6%) 0.666
             FilteredOrStopWords       47.79      (1.3%)       47.65      (1.8%)   -0.3% (  -3% -    2%) 0.536
                       OrHighMed      203.16      (2.8%)      202.71      (3.7%)   -0.2% (  -6% -    6%) 0.829
                    AndStopWords       31.57      (5.1%)       31.50      (5.1%)   -0.2% (  -9% -   10%) 0.895
                     AndHighHigh       43.82      (2.4%)       43.74      (1.7%)   -0.2% (  -4% -    4%) 0.802
     FilteredAnd2Terms2StopWords      202.63      (1.5%)      202.32      (1.5%)   -0.2% (  -3% -    2%) 0.743
               TermDayOfYearSort      657.20      (2.4%)      656.22      (3.0%)   -0.1% (  -5% -    5%) 0.861
             And2Terms2StopWords      163.59      (2.9%)      163.37      (3.1%)   -0.1% (  -5% -    5%) 0.890
                   TermTitleSort      148.27      (1.4%)      148.08      (1.5%)   -0.1% (  -2% -    2%) 0.778
                        PKLookup      282.85      (2.0%)      282.55      (2.4%)   -0.1% (  -4% -    4%) 0.879
                      AndHighMed      127.81      (2.2%)      127.67      (1.8%)   -0.1% (  -4% -    4%) 0.870
                  FilteredPhrase       33.23      (1.2%)       33.20      (1.7%)   -0.1% (  -2% -    2%) 0.841
                   TermMonthSort     3380.61      (2.1%)     3378.69      (2.0%)   -0.1% (  -4% -    4%) 0.930
                        Wildcard       78.45      (3.6%)       78.40      (2.9%)   -0.1% (  -6% -    6%) 0.958
             CountFilteredPhrase       26.48      (1.8%)       26.47      (2.0%)   -0.0% (  -3% -    3%) 0.994
                    FilteredTerm      160.77      (1.9%)      160.78      (1.7%)    0.0% (  -3% -    3%) 0.989
                       And3Terms      173.27      (3.4%)      173.35      (3.3%)    0.0% (  -6% -    7%) 0.967
      FilteredOr2Terms2StopWords      150.99      (1.0%)      151.11      (0.9%)    0.1% (  -1% -    1%) 0.782
            FilteredAndStopWords       55.49      (1.6%)       55.55      (2.4%)    0.1% (  -3% -    4%) 0.875
             FilteredAndHighHigh       69.32      (1.5%)       69.40      (2.1%)    0.1% (  -3% -    3%) 0.854
                     CountPhrase        4.29      (1.7%)        4.30      (1.3%)    0.1% (  -2% -    3%) 0.818
         CountFilteredOrHighHigh      108.11      (0.8%)      108.27      (0.7%)    0.1% (  -1% -    1%) 0.537
          CountFilteredOrHighMed      119.25      (0.8%)      119.45      (0.5%)    0.2% (  -1% -    1%) 0.410
                 DismaxOrHighMed      175.31      (1.4%)      175.62      (2.3%)    0.2% (  -3% -    3%) 0.770
              FilteredAndHighMed      131.05      (2.8%)      131.28      (2.8%)    0.2% (  -5% -    5%) 0.839
             CountFilteredOrMany       26.16      (1.6%)       26.21      (1.5%)    0.2% (  -2% -    3%) 0.678
                     CountOrMany       30.18      (1.7%)       30.26      (1.7%)    0.3% (  -3% -    3%) 0.622
                AndMedOrHighHigh       66.13      (2.2%)       66.32      (1.4%)    0.3% (  -3% -    3%) 0.605
                DismaxOrHighHigh      120.04      (1.4%)      120.43      (2.5%)    0.3% (  -3% -    4%) 0.612
                FilteredOr3Terms      165.88      (1.2%)      166.42      (1.2%)    0.3% (  -2% -    2%) 0.399
                          Phrase       15.28      (4.5%)       15.34      (5.0%)    0.3% (  -8% -   10%) 0.820
               FilteredOrHighMed      157.15      (0.9%)      157.78      (1.0%)    0.4% (  -1% -    2%) 0.193
                 CountAndHighMed      303.28      (2.3%)      304.53      (2.2%)    0.4% (  -3% -    5%) 0.560
                      TermDTSort      277.52      (6.2%)      278.78      (6.3%)    0.5% ( -11% -   13%) 0.819
               FilteredAnd3Terms      193.81      (2.0%)      194.72      (1.9%)    0.5% (  -3% -    4%) 0.456
                 CountOrHighHigh      299.53      (2.1%)      300.95      (1.9%)    0.5% (  -3% -    4%) 0.451
                          Fuzzy2       77.14      (2.4%)       77.53      (1.9%)    0.5% (  -3% -    4%) 0.459
                CountAndHighHigh      316.68      (2.2%)      318.42      (1.9%)    0.5% (  -3% -    4%) 0.395
              FilteredOrHighHigh       68.70      (0.9%)       69.11      (1.5%)    0.6% (  -1% -    3%) 0.129
                          Fuzzy1       81.90      (2.5%)       82.47      (2.2%)    0.7% (  -3% -    5%) 0.350
                  CountOrHighMed      366.87      (2.0%)      370.71      (1.6%)    1.0% (  -2% -    4%) 0.066
                       CountTerm     9735.91      (3.8%)     9843.29      (3.9%)    1.1% (  -6% -    9%) 0.364
                          OrMany       19.46      (3.8%)       19.70      (4.3%)    1.3% (  -6% -    9%) 0.328
                  FilteredOrMany       16.74      (3.1%)       16.96      (3.1%)    1.3% (  -4% -    7%) 0.191
                    CombinedTerm       31.11      (2.6%)       31.80      (2.3%)    2.2% (  -2% -    7%) 0.004
                      OrHighRare      276.02      (2.9%)      282.98      (3.2%)    2.5% (  -3% -    8%) 0.009
             CombinedAndHighHigh       15.09      (1.8%)       15.53      (1.9%)    2.9% (   0% -    6%) 0.000
              CombinedAndHighMed       55.60      (1.6%)       57.24      (1.6%)    3.0% (   0% -    6%) 0.000
              CombinedOrHighHigh       18.66      (1.7%)       19.25      (1.9%)    3.2% (   0% -    6%) 0.000
               CombinedOrHighMed       71.67      (1.7%)       74.01      (1.9%)    3.3% (   0% -    7%) 0.000
                 AndHighOrMedMed       43.72      (1.2%)       46.97      (0.6%)    7.4% (   5% -    9%) 0.000

@jpountz jpountz merged commit 71256cc into apache:main Jan 28, 2025
5 checks passed
@jpountz jpountz deleted the specialize_disi_pq_2_clauses branch January 28, 2025 14:16
jpountz added a commit that referenced this pull request Feb 27, 2025
Disjunctions with 2 clauses are rather common. Specializing this case enables
some shortcuts.