[UnifiedFieldList] Fields are not marked as Empty in Discover sidebar #147124

Closed
Tracked by #172341
jughosta opened this issue Dec 6, 2022 · 40 comments · Fixed by #174063
Labels: bug, Feature:UnifiedFieldList, impact:high, loe:needs-research, Team:DataDiscovery

Comments

@jughosta
Contributor

jughosta commented Dec 6, 2022

Some fields are expected to be under the "Empty fields" section in the sidebar, but they appear under the "Available fields" section. It might be a bug in the fields existence API https://github.com/elastic/kibana/blob/main/src/plugins/unified_field_list/public/hooks/use_existing_fields.ts#L122

Example: "ip_range", "phpmemory" in Logs sample data

Screenshot 2022-12-06 at 17 20 18

@jughosta added the bug, Team:DataDiscovery, and Feature:UnifiedFieldList labels Dec 6, 2022
@elasticmachine
Contributor

Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)

@davismcphee added the loe:needs-research and impact:high labels Dec 6, 2022
@metalshanked

Hi @jughosta - I see this issue on 8.7.0 as well. Would this be fixed in the next update?

@kertal
Member

kertal commented Apr 11, 2023

@metalshanked So far it has not been fixed. You can track this issue to get notified when it is fixed. Since it has an impact:high label, providing a fix for it is a short-term priority.

@metalshanked

metalshanked commented Apr 11, 2023

Thanks @kertal. I had a quick question. I see this issue was opened in Dec 2022; however, I don't see it occur in 8.6.2 (which has a hide empty fields toggle).

Is it because the new UnifiedFieldList feature was implemented starting 8.7.0 and so this issue is first seen by users in the 8.7.0 release?

@kertal
Member

kertal commented Apr 12, 2023

Correct! We changed the underlying implementation for getting the available fields in 8.7.0.

@kertal
Member

kertal commented Apr 17, 2023

We aligned Empty fields in Discover with the way it works in Lens. In addition to the request for all fields, we send a request for fields via the data views API, passing the actual query. However, the field capabilities API does not return only the fields that have values; it only filters out indices that return no results for the query. This is done by the index_filter property of the Field capabilities API https://www.elastic.co/guide/en/elasticsearch/reference/current/search-field-caps.html#search-field-caps-api-request-body

This is why, in the given case, ip_range for example is part of the available fields: it is part of the mapping of an index that returns results for the given query.
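To illustrate the point about index_filter, here is a minimal sketch of how such a field capabilities request could be assembled. The helper name, the index pattern, and the example query are assumptions for illustration only; the `fields` parameter and the `index_filter` body property are the parts taken from the Elasticsearch documentation linked above.

```python
import json
from urllib.parse import quote


def build_field_caps_request(index_pattern, query):
    """Return (method, path, body) for a field capabilities call.

    index_filter only drops whole indices that cannot match `query`;
    every mapped field of the remaining indices is still returned,
    whether or not any document holds a value for it.
    """
    path = f"/{quote(index_pattern, safe='*,-_.')}/_field_caps?fields=*"
    body = {"index_filter": query}
    return "POST", path, body


method, path, body = build_field_caps_request(
    "kibana_sample_data_*",  # hypothetical index pattern
    {"range": {"@timestamp": {"gte": "now-15m"}}},  # hypothetical query
)
print(method, path)
print(json.dumps(body, indent=2))
```

Because the filter works per index rather than per field, a mapped but never populated field such as ip_range still comes back as long as its index matches.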

Before this change, ip_range wouldn't have been part of the available fields, because we used the returned sample of 500 documents to get the list of available fields (this approach also meant that if a field only had values in the 501st document, it would not be part of the available fields).
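A rough sketch of the sample-based approach described above, and of why it can miss fields. The function name, the documents, and the field name rare_field are made up for illustration; this is not the actual Discover code.

```python
SAMPLE_SIZE = 500  # sample size mentioned in the comment above


def fields_with_values(docs):
    """Collect the names of fields that hold a non-null value in any of docs."""
    found = set()
    for doc in docs:
        found.update(name for name, value in doc.items() if value is not None)
    return found


# Hypothetical hits: 500 documents without rare_field, then one that has it.
hits = [{"message": f"log line {i}"} for i in range(SAMPLE_SIZE)]
hits.append({"message": "log line 500", "rare_field": "some value"})

sampled = fields_with_values(hits[:SAMPLE_SIZE])
complete = fields_with_values(hits)

print("rare_field" in sampled)   # the 501st document is outside the sample
print("rare_field" in complete)
```

The sample only sees the first 500 hits, so rare_field is reported as having no values even though a matching document contains one.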

Apart from the setup described in this issue, this can easily be reproduced with our demo data: add a Data View for kibana_* and filter by a property that's only available in the logs indices; all other fields (ecommerce, flights) then become part of the empty fields:
Menubar_und_Discover_-_Elastic

@kertal
Member

kertal commented Apr 17, 2023

In our demo data this isn't a big issue, but when users have a large mapping that only partially contains actual values, the alignment with Lens leads to many more Available fields than before:

Example from #154632

@ninoslavmiskovic
Contributor

Related ER: https://github.com/elastic/enhancements/issues/16766

A request to be able to see only fields that have values, and not empty fields, in a dropdown.

@kertal added the blocked label Apr 24, 2023
@metalshanked

metalshanked commented May 2, 2023

@kertal @ninoslavmiskovic - Sorry, had a related question on mapping which I hope you can help clarify.
I understand there can be a mapping explosion in Elastic with many mapped fields, even when the actual indices do not contain data for most of these mapped fields (i.e. the fields themselves don't exist in the indices).

If so, how exactly does a mapping explosion happen in this case, since I understand only the metadata for the mapping is synced to the indices/shards?
For example: if I use the ECS templates (master/composable template), which have many fields that might not actually be present in the data, will this cause performance issues if we have 100 such indices?

Thanks in advance!

@kertal
Member

kertal commented May 2, 2023

@metalshanked I'm interested in where you think this could cause performance issues. On ingestion? On search? In the UI?
thx!

@metalshanked

Thanks @kertal - Yes, in any of those places: ingestion, search, UI.
Example scenario:

  1. Using same set of ECS component templates for multiple indices
  2. The actual data will ALWAYS be less than the fields listed in the component template mappings
  3. Is there any performance impact to Ingestion, UI search etc?

@ninoslavmiskovic
Contributor

hi all,

A couple of comments and perhaps a path forward.

Comments:

  • There is some inconsistency currently because we are showing empty fields under "Available Fields", which is not intuitive for the user now that we have an "Empty fields" section. This also happens when creating a runtime field with no value: it then appears under "Available Fields". The reason we added the "Empty fields" section is to give the user insight into how many fields are actually empty.

I suggest we do the following as a path forward:

  • Investigate whether this is a bug in the API, treat "Available fields" as fields with values, and place fields that do not have values under "Empty fields".

Improving our templates for ingesting should be driven by the team responsible for the ECS templates.

@georgivalentinov

georgivalentinov commented Jul 17, 2023

Hey @ninoslavmiskovic and all,

Congrats on the Unified Field List feature 🎉

We're observing what we classify as a regression, with similar behaviour, after upgrading from 8.6.2 to 8.8.2:

After narrowing down a search to the aws elb integration logs, for example, we still see all (or most) of the fields from other logs as well under Available fields (7918 in this case):
Screenshot 2023-07-17 at 20 08 51

Before the upgrade we only saw fields that are available after search filters got applied.
We kind of relied on this feature for better UX. You set your (search) context by applying a couple of filters, then you only get a small set of fields under Available fields, that are actually available.

Should I open a new issue, or is this the right place?

@davismcphee
Contributor

@georgivalentinov Thanks for sharing, and I can confirm this is the right issue for what you're encountering. While we don't have a fix for this issue currently, we're in communication with the Elasticsearch team to figure out if/how an API can be provided which properly supports this functionality.

Regarding the old behaviour, unfortunately it didn't actually function as intended either. Here's an excerpt from a comment I just left on another issue related to this bug (#162239 (comment)) explaining why:

The change to remove the "Hide empty fields" toggle was intentional as part of an effort to unify the field lists between Discover and Lens as detailed in #135678 (with the actual work completed in #144412).

This was done not only to improve consistency between the apps, but also because the "Hide empty fields" toggle never really functioned as the label would imply. Rather than hiding fields that would be empty based on the query and time range, the "Hide empty fields" feature would simply loop through the 500 document sample returned to Discover and hide any fields for which it couldn't find values within the sample documents. This misled users into believing these fields were empty for all documents returned by their query, when in reality the 501st document could contain values for all "empty" fields, but the fields would still be hidden in the field list.

To account for this we adopted the pattern Lens uses of including an "Empty fields" section in our sidebar, which should include all fields that don't contain values based on the current query and time range. Unfortunately this feature too does not work as intended due to limitations in the upstream Elasticsearch API we rely on -- that's where #147124 comes into play.

For your specific use case, are the fields you expect to see prefixed in a particular way? If so, using the field name filter in the field list with a wildcard could be a potential workaround in the meantime (e.g. my_prefix.*). If not, unfortunately I can't think of a good workaround until the underlying issue is fixed.
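As a small illustration of the wildcard workaround suggested above, the following sketch filters a list of field names by a prefix pattern. It uses Python's fnmatch module rather than Kibana's own matching logic, and the field names are hypothetical.

```python
from fnmatch import fnmatchcase


def filter_field_names(field_names, pattern):
    """Keep only the field names that match a shell-style wildcard pattern."""
    return [name for name in field_names if fnmatchcase(name, pattern)]


# Hypothetical field names from a large, shared mapping.
fields = [
    "my_prefix.status",
    "my_prefix.request.bytes",
    "other.field",
    "message",
]

matching = filter_field_names(fields, "my_prefix.*")
print(matching)
```

This only narrows the list by name; it does not tell you which of the matching fields actually contain values.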

@georgivalentinov

👋

... the "Hide empty fields" feature would simply loop through the 500 document sample returned to Discover and hide any fields for which it couldn't find values within the sample documents ...

Yep, we've noticed it works this way, but it was good enough for us, and we weren't really hit by the drawbacks of this method.

To account for this we adopted the pattern Lens uses of including an "Empty fields" section in our sidebar, which should include all fields that don't contain values based on the current query and time range. Unfortunately this feature too does not work as intended due to limitations in the upstream Elasticsearch API we rely on

Oh, so we/you are on the right track with intentions around a proper fix 👏 Thank you.

For your specific use case, are the fields you expect to see prefixed in a particular way? If so, using the field name filter in the field list with a wildcard could be a potential workaround in the meantime (e.g. my_prefix.*). If not, unfortunately I can't think of a good workaround until the underlying issue is fixed.

Some are (e.g. aws.elb.*), and some aren't. Thing is we're kind of used to the previous behaviour.

We'll definitely watch this space and wait for the proper fix.
Thanks for the effort and work done 🙇

@mwtyang

mwtyang commented Sep 11, 2023

@artificial-aidan

Are there any updates on this? I recently upgraded from 8.5 and it has rendered the Discover view almost useless. With default filebeat indices there are over 7000 available fields, when before I had 135 to choose from. Picking the right fields has become borderline impossible.

image

@mwtyang

mwtyang commented Oct 4, 2023

cc @ruflin

@kertal
Member

kertal commented Dec 12, 2023

Hi @artificial-aidan @georgivalentinov @metalshanked
We are currently having a closer look at this. I'd be interested in your input: most of the screenshots in this issue seem to originate from filebeat-based indices. Using DataVisualizer (with the random sampler disabled), you can see how many fields actually have values. In my research I used a filebeat-* pattern, returning 2 million records. There were 7309 fields in the Available fields section. Using DataVisualizer, I figured out that 164 fields had values. This is of course only an example; different data will lead to different results.

Would it resolve your issue if we filtered out fields that have no values in the indices that contain documents matching a query? Narrowing down to fewer fields would then work by using the filter of the UnifiedFieldList; we improved this area recently. Now you can just enter "kub awe" to get fields like "kubernetes.fieldlength.is.awesome". I'm interested in your opinion.
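A minimal sketch of how a fragment search like "kub awe" could match a field name. It assumes that every space-separated fragment must occur somewhere in the name, case-insensitively; that rule and the function name are assumptions, and the actual UnifiedFieldList matching may differ.

```python
def matches_fragments(field_name, search):
    """Return True if every space-separated fragment occurs in field_name."""
    name = field_name.lower()
    return all(fragment in name for fragment in search.lower().split())


# Hypothetical field names, including the example from the comment above.
fields = [
    "kubernetes.fieldlength.is.awesome",
    "kubernetes.pod.name",
    "aws.elb.backend.ip",
]

hits = [name for name in fields if matches_fragments(name, "kub awe")]
print(hits)
```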

@metalshanked

metalshanked commented Dec 13, 2023

Thanks @kertal. Can the filter be applied automatically in the background, i.e. automatically filtering out fields that have no values in the indices?

@artificial-aidan

The issue for our team is field discoverability. If I know what field I'm looking for I can filter for it easily. But when exploring and debugging, not knowing which fields have data makes it very difficult.

@kertal
Member

kertal commented Dec 13, 2023

@metalshanked Yes, this would work automatically. It would work like it does today, with the difference that the background request would return only fields that have values at the index level. This would significantly reduce the number of Available fields, e.g. in filebeat-based scenarios.

@artificial-aidan So I'd assume the suggested change would be valuable for your use case. You could also use it beyond Kibana, at the ES field_caps API level.

@artificial-aidan

Maybe I'm misunderstanding this statement.

Would it resolve your issue if we filtered out fields that have no values in the indices that contain documents matching a query? Narrowing down to fewer fields would then work by using the filter of the UnifiedFieldList; we improved this area recently. Now you can just enter "kub awe" to get fields like "kubernetes.fieldlength.is.awesome". I'm interested in your opinion.

What I understood is that the field filter would allow you to find fields you knew about. How would the filtering of fields without data happen?

@kertal
Member

kertal commented Dec 13, 2023

Maybe I'm misunderstanding this statement.

Would it resolve your issue if we filtered out fields that have no values in the indices that contain documents matching a query? Narrowing down to fewer fields would then work by using the filter of the UnifiedFieldList; we improved this area recently. Now you can just enter "kub awe" to get fields like "kubernetes.fieldlength.is.awesome". I'm interested in your opinion.

What I understood is that the field filter would allow you to find fields you knew about. How would the filtering of fields without data happen?

We are evaluating a new way to filter out, at the index level, fields that never had any value. Currently, all fields in the mappings of indices that contain documents matching a query are displayed in the Available fields section. We would reduce that to just the fields that have values.

@artificial-aidan

That makes sense. Yes that would work for us.

@georgivalentinov

georgivalentinov commented Jan 22, 2024

For us, as we use a common data view and "shared" data streams/indices (and I imagine others do too, as that's probably quite common when the system is used for infra observability), it'd be best if the field list were automatically reduced to only what's available for the current query/filter.

This won't force people to use distinct Indices for different log types, which frequently leads to oversharding (too small, too many shards), which in turn is a known anti-pattern (we've been there, it's painful).

@baileygm

baileygm commented Feb 28, 2024

Is this issue supposed to have been fixed by #174063 ?

I just deployed the latest snapshot versions of elasticsearch and kibana 8.13 in a new development environment and still see nearly all (798) fields marked as "Available Fields"

image

@baileygm

baileygm commented Feb 28, 2024

I can answer my own question: I tested again with Kibana/Elasticsearch 8.14.0-SNAPSHOT-amd64 and can see that the filtering of empty fields works correctly now. When is this version intended to be formally released?

image

@davismcphee
Contributor

@baileygm I can't speak to why the snapshots were outdated, but I can confirm that this functionality should be included in the released version of 8.13. I also can't give exact details on the release date, but it will be available soon.

@chris-stytch

Noticing in 8.14.1 that empty fields are once again being returned as available fields, raised #190175.

@baileygm

baileygm commented Aug 8, 2024

Just pushed out Kibana 8.14.3 and it’s working perfectly

@chris-stytch

Just pushed out Kibana 8.14.3 and it’s working perfectly

@baileygm is that with default configuration? Still seeing empty fields as shown in the screenshot in #190175.

@baileygm

baileygm commented Aug 9, 2024

Just pushed out Kibana 8.14.3 and it’s working perfectly

@baileygm is that with default configuration? Still seeing empty fields as shown in the screenshot in #190175.

I've tested with a standard config where Kibana is connected directly to Elasticsearch, and with one using a cross-cluster search node.
In both cases it's working perfectly as long as all the Elasticsearch nodes are using 8.14.3, as well as Kibana.
I don't see any empty fields under "available fields".
image

@baileygm

baileygm commented Aug 9, 2024

I now notice that I have the opposite issue to @chris-stytch

I'm seeing fields categorised as empty fields where they are not empty

In this view, all 3 fields listed as empty have values

image

@chris-stytch

I now notice that I have the opposite issue to @chris-stytch

I'm seeing fields categorised as empty fields where they are not empty

In this view, all 3 fields listed as empty have values

Interesting, that's odd, I have the opposite issue:
image

@kertal
Member

kertal commented Aug 13, 2024

thx for reporting, could you help us triage this issue by running a field_caps request like this?

GET {indexPattern}/_field_caps?fields={theFieldToCheck}&include_empty_fields=false

We use include_empty_fields set to false to distinguish between empty and available fields. If a field you consider to be empty is returned by this request, then we need to check why ES is returning it. One reason could be that the index had a value for the field once and the containing document was deleted. If this can be ruled out, we need to dig deeper, most likely at the Elasticsearch level.
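A minimal sketch of how two field_caps responses could be used to separate available from empty fields: one response requested with include_empty_fields=false and one with the default. The function name and the abbreviated responses are hypothetical, and this is not Kibana's actual implementation.

```python
def split_fields(all_caps, non_empty_caps):
    """Split mapped fields into (available, empty) lists.

    all_caps:       field_caps response that includes empty fields
    non_empty_caps: field_caps response with include_empty_fields=false
    """
    all_fields = set(all_caps["fields"])
    available = set(non_empty_caps["fields"])
    return sorted(available), sorted(all_fields - available)


# Hypothetical, heavily abbreviated responses.
all_caps = {
    "indices": ["logs-1"],
    "fields": {
        "message": {"text": {"type": "text"}},
        "ip_range": {"ip_range": {"type": "ip_range"}},
    },
}
non_empty_caps = {
    "indices": ["logs-1"],
    "fields": {"message": {"text": {"type": "text"}}},
}

available, empty = split_fields(all_caps, non_empty_caps)
print(available, empty)
```

If a field you expect to be empty shows up in the second response, it lands in the available list here as well, which is the case the request above is meant to help triage.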

@baileygm

GET {indexPattern}/_field_caps?fields={theFieldToCheck}&include_empty_fields=false

Looking into this further I can see that when Kibana is accessing the elasticsearch cluster directly then it is correctly showing available fields and empty fields. It is when we are accessing the cluster via a cross cluster search node that I am seeing fields containing data categorised as empty fields

I have run the field_caps request for both scenarios, though, and get identical results:

  1. Direct access to cluster
    {
      "indices": ["log-application-8.14.3-2024.08.09"],
      "fields": {
        "log.file": { "object": { "type": "object", "metadata_field": false, "searchable": false, "aggregatable": false } },
        "log": { "object": { "type": "object", "metadata_field": false, "searchable": false, "aggregatable": false } },
        "log.file.device_id": { "keyword": { "type": "keyword", "metadata_field": false, "searchable": true, "aggregatable": true } }
      }
    }

  2. Querying via cross cluster search node
    {
      "indices": ["ovh:log-application-8.14.3-2024.08.09"],
      "fields": {
        "log.file": { "object": { "type": "object", "metadata_field": false, "searchable": false, "aggregatable": false } },
        "log": { "object": { "type": "object", "metadata_field": false, "searchable": false, "aggregatable": false } },
        "log.file.device_id": { "keyword": { "type": "keyword", "metadata_field": false, "searchable": true, "aggregatable": true } }
      }
    }
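One way to check that two such responses really are identical apart from the cluster alias is sketched below. The helper names are hypothetical, and the responses are abbreviated versions of the two shown above.

```python
def strip_cluster_alias(index_name):
    """Drop a leading 'cluster:' alias, as used for cross-cluster indices."""
    return index_name.split(":", 1)[-1]


def same_field_caps(direct, remote):
    """Compare two field_caps responses, ignoring the cluster alias."""
    direct_indices = sorted(strip_cluster_alias(i) for i in direct["indices"])
    remote_indices = sorted(strip_cluster_alias(i) for i in remote["indices"])
    return direct_indices == remote_indices and direct["fields"] == remote["fields"]


# Abbreviated to a single field for brevity.
keyword_caps = {
    "keyword": {
        "type": "keyword",
        "metadata_field": False,
        "searchable": True,
        "aggregatable": True,
    }
}
direct = {
    "indices": ["log-application-8.14.3-2024.08.09"],
    "fields": {"log.file.device_id": keyword_caps},
}
remote = {
    "indices": ["ovh:log-application-8.14.3-2024.08.09"],
    "fields": {"log.file.device_id": keyword_caps},
}

print(same_field_caps(direct, remote))
```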

@kertal
Member

kertal commented Aug 14, 2024

@baileygm thx, this sounds like an issue with CCS. In this case, are all search nodes on the same version? FYI @piergm

@baileygm

Yes, all nodes and kibana are using 8.14.3

@kertal
Member

kertal commented Aug 20, 2024

Then it sounds like a bug on the ES side. Could you open an issue at https://github.com/elastic/elasticsearch/issues/new/choose
This issue is closed, and your problem needs some triage on the ES side. Many thx!
