[Logs Explorer] Indicate data stream activity in dataset selector #171394

weltenwort · 2023-11-16T11:56:00Z

📓 Summary

Upon installation, integrations install all their datasets, even if only a subset of them will be populated by shippers. In the dataset selector we want to indicate to the user whether the underlying data stream has any recent data, so they are less likely to visit empty datasets.

✔️ Acceptance criteria

The dataset selector annotates each dataset entry with an indication of whether the corresponding data stream has data. This might even take the form of a "recency indicator" like "never" or "3 minutes ago".
The performance of the dataset selector is not significantly worse than before. This might also be achieved by loading the information asynchronously.
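
A minimal sketch of how such a recency indicator could be derived, assuming the last activity is available as an epoch-millisecond timestamp. The function name `formatRecency` and the "just now" threshold are hypothetical and not part of the issue:

```typescript
// Hypothetical helper: turns a "last activity" timestamp into a short label.
// `lastActivityMs` is epoch milliseconds, or undefined when the data stream
// has never received data.
function formatRecency(lastActivityMs: number | undefined, nowMs: number): string {
  if (lastActivityMs === undefined) {
    return 'never';
  }
  // Clamp to zero so clock skew does not produce negative durations.
  const elapsedSeconds = Math.max(0, Math.floor((nowMs - lastActivityMs) / 1000));
  const units: Array<[string, number]> = [
    ['day', 86400],
    ['hour', 3600],
    ['minute', 60],
  ];
  // Pick the largest unit that fits at least once.
  for (const [name, seconds] of units) {
    const count = Math.floor(elapsedSeconds / seconds);
    if (count >= 1) {
      return `${count} ${name}${count === 1 ? '' : 's'} ago`;
    }
  }
  // Assumption: anything under a minute is shown as "just now".
  return 'just now';
}
```

Passing `nowMs` explicitly keeps the helper deterministic, which makes it easy to unit test.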

🎨 Mock-ups

🚧 TODO

💡 Implementation hints

The data stream stats API provides storage size and recent timestamp information. The performance characteristics of this are unknown, though.
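
As a rough illustration, the response of `GET /_data_stream/_stats` lists each data stream with `store_size_bytes` and `maximum_timestamp`. The sketch below turns such a response into a per-data-stream activity map. The names `DatasetActivity` and `toActivityByDataStream` are hypothetical, and treating a `maximum_timestamp` of `0` as "no data" is an assumption that should be verified against a real cluster:

```typescript
// Subset of the data stream stats response used by this sketch.
interface DataStreamStatsEntry {
  data_stream: string;
  store_size_bytes: number;
  maximum_timestamp: number; // epoch milliseconds
}

interface DataStreamsStatsResponse {
  data_streams: DataStreamStatsEntry[];
}

// Hypothetical shape for the annotation shown next to each dataset.
type DatasetActivity = { hasData: false } | { hasData: true; lastActivityMs: number };

function toActivityByDataStream(
  response: DataStreamsStatsResponse
): Map<string, DatasetActivity> {
  const activity = new Map<string, DatasetActivity>();
  for (const entry of response.data_streams) {
    // Assumption: a timestamp of 0 means the stream has no documents yet.
    if (entry.maximum_timestamp > 0) {
      activity.set(entry.data_stream, {
        hasData: true,
        lastActivityMs: entry.maximum_timestamp,
      });
    } else {
      activity.set(entry.data_stream, { hasData: false });
    }
  }
  return activity;
}
```

Datasets that are installed but missing from the map have no data stream at all, so the selector could treat them the same way as empty ones.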

elasticmachine · 2023-11-16T11:56:25Z

Pinging @elastic/obs-ux-logs-team (Team:obs-ux-logs)

ruflin · 2023-11-24T09:15:56Z

Currently the dataset selection under integrations also shows datasets for which a data stream does not even exist. We should grey these out or remove them completely.

tonyghiani · 2023-11-24T10:27:05Z

> The data stream stats API provides storage size and recent timestamp information. The performance characteristics of this are unknown, though.

@weltenwort Do you think we could get better performance by running the stats query in parallel inside the installed packages endpoint and attaching the status to each dataset directly there? The main goal is to avoid additional back and forth between server and client, and to avoid flashing UI changes when the user opens the selector before the dataset status has resolved.

Just thinking out loud, I didn't check its performance, and it might be worse than I expected.

weltenwort · 2023-11-27T10:52:16Z

Let's think about it this way: We can't do much about the performance of the stats query itself, but we can influence whether it's on the critical path of the page load or not:

If the performance is not a problem, then putting it into the "installed packages" API is nice for simplicity.
If the performance is poor we should probably load the stats asynchronously so the dataset selector is already usable even if the activity annotation is not shown yet.
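
The second option could look roughly like the sketch below: render the dataset list first, then merge in the activity once the stats arrive. All names (`DatasetEntry`, `loadSelector`, `fetchDatasets`, `fetchActivity`, `render`) are hypothetical placeholders, not existing Kibana APIs:

```typescript
interface DatasetEntry {
  name: string;
  // False until the stats request has resolved for this list.
  activityLoaded: boolean;
  lastActivityMs: number | undefined;
}

// Initial state: datasets are selectable, annotations are still pending.
function withPendingActivity(names: string[]): DatasetEntry[] {
  return names.map((name) => ({ name, activityLoaded: false, lastActivityMs: undefined }));
}

// Merge the stats into the list without mutating the existing entries.
function applyActivity(entries: DatasetEntry[], activity: Map<string, number>): DatasetEntry[] {
  return entries.map((entry) => ({
    ...entry,
    activityLoaded: true,
    lastActivityMs: activity.get(entry.name),
  }));
}

async function loadSelector(
  fetchDatasets: () => Promise<string[]>,
  fetchActivity: () => Promise<Map<string, number>>,
  render: (entries: DatasetEntry[]) => void
): Promise<void> {
  const entries = withPendingActivity(await fetchDatasets());
  render(entries); // the selector is usable from this point on
  try {
    render(applyActivity(entries, await fetchActivity()));
  } catch {
    // A failed stats request should not break the selector itself.
  }
}
```

Keeping the stats request off the critical path means a slow or failing stats query only delays the annotation, not the list.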

tonyghiani · 2023-11-27T11:07:06Z

> Let's think about it this way: We can't do much about the performance of the stats query itself, but we can influence whether it's on the critical path of the page load or not:
>
> If the performance is not a problem, then putting it into the "installed packages" API is nice for simplicity.
>
> If the performance is poor we should probably load the stats asynchronously so the dataset selector is already usable even if the activity annotation is not shown yet.

Agree with everything. As additional context for measuring the performance impact: the installed packages (the first page of 15 integrations) are fetched in the background even if the DatasetSelector is not opened 👌

isaclfreire · 2024-01-12T11:59:15Z

Here in this Figma file you can find some initial explorations and questions I have regarding this issue.

achyutjhunjhunwala · 2024-01-13T21:55:03Z

As an SRE, what benefit do I get from knowing the count of log documents? If it's only to let the user know that some data is present, we could use another visual indicator, as the number of documents could be overwhelming.

The idea to load the Last Activity in a popover could solve the performance issue, as it can be loaded on demand in a popover for each dataset.

isaclfreire · 2024-01-15T10:55:15Z

Hi @achyutjhunjhunwala thanks for the input! That's a good question. We still have many assumptions to look into and a user testing session will definitely help with that.

For the docs count, I referred to this issue's requirements. It says:

> The number of docs is also an indicator to the user, how "active" a dataset is.

Regarding the number of docs, I wonder if users remember the last count and are therefore able to tell whether it increased or decreased. That's why I ask if we can track whether new data has come in since the user's last session, but ultimately that feels very unreliable to me. I added a green dot as a marker of new activity, but I'm not sure it will be enough.

(I still haven't addressed with the designs what happens when the count is 0, so bear in mind this is all up for discussion 👍)

> The idea to load the Last Activity on Popover could solve performance issue as this can be loaded on demand in a popover for each dataset.

Yep, that's what I had in mind to avoid slowing things down. The question that remains is how often this will be updated. Every x minutes, every x seconds...?

ruflin · 2024-01-15T11:49:51Z

Another idea triggered by the comment from @achyutjhunjhunwala: What if we could show a trendline? It would solve multiple problems at once. It would show whether there is any activity at all, whether there is recent activity (and how much), and whether the activity differs from other datasets / integrations. Unfortunately the trendline could be expensive to compute ...

achyutjhunjhunwala · 2024-01-15T11:52:59Z

A date_histogram can give us this trendline, but yes, this could get expensive for each dataset. We can limit this to the last 24 hours and will have to run a test against a beefy cluster to see how it behaves.

{
  "aggs": {
    "Group By Hour": {
      "date_histogram": {
        "field": "@timestamp",
        "calendar_interval": "hour",
        "format": "k"
      }
    }
  }
}
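
A sketch of how the "last 24 hours" limit could be combined with the histogram, and how the resulting buckets could be reduced to a series for a trendline. It uses `calendar_interval`, which is what current Elasticsearch versions expect in a `date_histogram`. The function names and the aggregation name `activity_per_hour` are hypothetical, and the cost on a large cluster is untested:

```typescript
// Builds a search body that counts documents per hour over the last 24 hours.
function buildTrendlineRequest() {
  return {
    size: 0, // only the aggregation is needed, not the documents
    query: {
      range: { '@timestamp': { gte: 'now-24h' } },
    },
    aggs: {
      activity_per_hour: {
        date_histogram: {
          field: '@timestamp',
          calendar_interval: 'hour',
          min_doc_count: 0, // keep empty hours between non-empty buckets
        },
      },
    },
  };
}

// Subset of the search response used by this sketch.
interface TrendlineResponse {
  aggregations: {
    activity_per_hour: {
      buckets: Array<{ key: number; doc_count: number }>;
    };
  };
}

// Reduces the buckets to a plain series of counts, oldest first.
function toHourlyCounts(response: TrendlineResponse): number[] {
  return response.aggregations.activity_per_hour.buckets.map((bucket) => bucket.doc_count);
}
```

An empty series would then mean "no activity in the last 24 hours", which also covers the has-data indicator for recent data.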

> Yep, that's what I thought to try not to slow the performance. The question that remains is in what time frame this will be updated. Every x minutes, every x seconds...?

@isaclfreire My idea was real time: when the user clicks on the 3 dots, we fire the query for that dataset.
If we are firing only 1 (stats) query for a single dataset, this should not be expensive.
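
One possible shape for this on-demand loading is sketched below, with a short-lived cache so that reopening the same menu does not fire the query again. `createOnDemandActivityLoader`, `fetchLastActivity` and `maxAgeMs` are hypothetical names, and the right cache lifetime is an open question:

```typescript
// Resolves to the last activity timestamp (epoch ms), or undefined if there is no data.
type LastActivityFetcher = (dataset: string) => Promise<number | undefined>;

function createOnDemandActivityLoader(
  fetchLastActivity: LastActivityFetcher,
  maxAgeMs: number,
  now: () => number = Date.now
) {
  const cache = new Map<string, { requestedAt: number; result: Promise<number | undefined> }>();

  // Called when the user opens the menu for a single dataset.
  return (dataset: string): Promise<number | undefined> => {
    const cached = cache.get(dataset);
    if (cached && now() - cached.requestedAt < maxAgeMs) {
      return cached.result; // reuse the in-flight or recent request
    }
    const result = fetchLastActivity(dataset);
    // Note: a rejected promise stays cached until maxAgeMs has passed.
    cache.set(dataset, { requestedAt: now(), result });
    return result;
  };
}
```

With `maxAgeMs` set to `0` every click fires a fresh query, which matches the "real time" behaviour described above.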

weltenwort · 2024-01-15T12:49:19Z

The "last activity" is an appealing option because it's part of the data stream stats API and therefore cheap to fetch. IMHO in order for it to be really useful it would have to be visible right in the dataset list. If I have to click around to see it in a context menu I might just as well select it.

achyutjhunjhunwala · 2024-01-15T13:18:18Z

In that case we can replace the document count with the last activity.

isaclfreire · 2024-01-22T10:34:28Z

I have started some UX explorations in this Figma file, feel free to comment.

gbamparop · 2024-03-19T07:56:05Z

There's another implementation issue, can this one be closed?

weltenwort changed the title ~~Indicate data stream activity in dataset selector~~ [Logs Explorer] Indicate data stream activity in dataset selector Nov 16, 2023

weltenwort added the Team:obs-ux-logs Observability Logs User Experience Team label Nov 16, 2023

weltenwort added the needs design label Nov 16, 2023

ruflin mentioned this issue Dec 8, 2023

[Logs Explorer][Meta] Data selector improvements #172908

Open

isaclfreire self-assigned this Jan 11, 2024

isaclfreire mentioned this issue Jan 12, 2024

[Logs Explorer] Notify the user about new log entries #164074

Open

isaclfreire removed their assignment Feb 2, 2024

gbamparop closed this as not planned Mar 20, 2024
