Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Wildcard ("Contains") filter #16357

Closed
wants to merge 1 commit into from

Conversation

shaharmor
Copy link
Contributor

@shaharmor shaharmor commented Jan 29, 2018

Hi,

This PR adds a new filter to the filter bar, for the Wildcard query.

The way it works is as follow:

  1. If the input contains a * char, it will just use the input as is.
  2. If it doesn't contain a * char, it will wrap the input string with * to get a true "contains" filter of *input*.

Update:
New logic only wraps input with * from both sides. See #16357 (comment)

@elasticmachine
Copy link
Contributor

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@Bargs
Copy link
Contributor

Bargs commented Jan 29, 2018

Hi @shaharmor, thanks for submitting this. I'll try to give it a look as soon as I get some time.

One thing we'll have to consider before merging any sort of wildcard filter is how we can prevent it from taking down anyone's cluster. Leading wildcards can be extremely heavy and dangerous, which is why the query_string query allows admins to turn them off. It might be as simple as making wildcard queries an opt-in feature in the advanced settings. We might also consider offering prefix queries by default so users can get some of the benefit of a wildcard query without opting in to everything.

@elasticmachine
Copy link
Contributor

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@matthew-hickok
Copy link

This would be great. Would love to see it happen.

@rashmivkulkarni
Copy link
Contributor

jenkins, test this

@elasticmachine
Copy link
Contributor

💔 Build Failed

@matthew-hickok
Copy link

So close!

@rayafratkina rayafratkina added Feature:Discover Discover Application Team:Visualizations Visualization editors, elastic-charts and infrastructure and removed Feature:Discovery labels Jun 14, 2019
@lukasolson lukasolson removed their request for review June 28, 2019 17:46
@timroes
Copy link
Contributor

timroes commented Jul 5, 2019

Hi @shaharmor,

I am sorry this PR was forgotten about and I apologize for that. If you still want to continue working on this, please merge master into your PR, and I'll make sure we're reviewing and continuing merging this time. Otherwise please feel free to close this.

Cheers,
Tim

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

@shaharmor
Copy link
Contributor Author

@timroes I will update it for master in the next couple of days

@shaharmor
Copy link
Contributor Author

Hey @timroes

The "name" of the filter is "contains", which implies that any value entered will be wrapped with * from both sides, converting a filter term of value to *value*.

  • Should this same filter also support "starts with" & "ends with"?
    This means that if a user enters the filter term of value* or *value, should we still make sure the value is wrapped with * from both sides, or just leave the term as is?

  • If we decide to always wrap the value from both sides, should I also create two additional filters, one for "starts with" and one for "ends with"?

Note that when the new filter is active, its "title" becomes field contains value.
If we'll decide to support starts with and ends with, it will have to show the * characters like this:

  • `"field" contains "*value*"
  • `"field" contains "value*"
  • `"field" contains "*value"

Which might not be what you want.

Please let me know how you'd like me to proceed.

@shaharmor shaharmor force-pushed the add-wildcard-filter branch from 26ad703 to 27ebb03 Compare July 22, 2019 11:55
@shaharmor shaharmor force-pushed the add-wildcard-filter branch from 27ebb03 to ae1d953 Compare July 22, 2019 12:04
Copy link
Contributor

@Bargs Bargs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @shaharmor, sorry again for how inconsistent we've been with feedback on this PR. We'd really like to push it forward if you have the time. I've taken a deeper look at the current code and left some feedback. In addition to the inline comments, I have a couple other comments:

  • To answer your previous question, I think it would be best to create separate filter types for "Starts with" and "Ends with".
  • We should add an advanced setting similar to the existing query:allowLeadingWildcards to prevent the creation and execution of "contains" and "ends with" filters since they could be dangerous to a cluster.

const filter = {
meta: { index, type, key, value },
};
filter.query = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing we learned when we were working on adding wildcard support to KQL is that text fields don't work well with simple wildcard queries. It's actually best to use a query_string query for wildcard queries on text fields because it does some special handling that makes it work how most users would expect. I'd suggest updating the query building logic to look like this:

  // Text fields should be treated in a special manner because their values are analyzed.
  // For example, a field configured with the standard analyzer with a value of "Foo Bar" would not
  // match with a simple wildcard query on the value "Foo Ba*" because the analyzer tokenized and
  // lowercased the text at index time. The actual values stored in the inverted index are "foo" and
  // "bar" but the wildcard query is literally searching for a token that starts with "Foo Ba".
  // The query_string query attempts to make wildcards more useful with text fields. It will do
  // tokenization and some simple normalization on the term that contains the wildcard, so that the
  // query "Foo Ba*" would actually match. This functionality is not exposed in any other query type
  // so we have to sort of abuse the query_string query here if we're using a text field
  if (field.esTypes && field.esTypes.includes('text')) {
    filter.query = {
      query_string: {
        fields: [field.name],
        query: `*${escapeQueryString(value)}*`,
      }
    };
  }
  else {
    filter.query = {
      wildcard: {
        [field.name]: {
          value: `*${value}*`,
        },
      },
    };
  }

const index = indexPattern.id;
const type = 'contains';
const key = field.name;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to take into account scripted fields, similar to how other filters builders do.

*/

// Creates an filter where the given field contains the given value
export function buildContainsFilter(field, value, indexPattern) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you also add some unit tests for this function?

@spalger spalger added the test-matrix Use this label to ensure PRs are tested with matrix jobs label Sep 19, 2019
@timroes timroes added the stalled label Dec 4, 2019
@timroes
Copy link
Contributor

timroes commented Dec 4, 2019

I will close this PR for now as stalled, since there hasn't been any activity for quiet some time. Please feel free to reopen this if we want to continue with it. Thanks a lot!

@timroes timroes closed this Dec 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Discover Discover Application review stalled Team:Visualizations Visualization editors, elastic-charts and infrastructure test-matrix Use this label to ensure PRs are tested with matrix jobs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants