[Investigation app] add entities route and investigation Contextual Insight #194432
Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)
  ): Promise<{ responses: Array<InferSearchResponseOf<TDocument, TSearchRequest>> }>;
}

export function createEntitiesESClient({
This is a client specifically for searching through entities indices, but I should be using the observability es client as a dependency. Will update when I can.
.map((params) => {
  const searchParams: [MsearchMultisearchHeader, MsearchMultisearchBody] = [
    {
      index: [SERVICE_ENTITIES_LATEST_ALIAS],
This is copypasta. I'd like to remove the reference to the service alias in particular.
…/investigation-entities
I just did a quick first pass, will continue
@@ -28,7 +28,7 @@
    "kibanaReact",
    "kibanaUtils",
  ],
  "optionalPlugins": [],
It's already in the requiredPlugins
const alertOriginInvestigation = alertOriginSchema.safeParse(investigation?.origin);
const alertId = alertOriginInvestigation.success ? alertOriginInvestigation.data.id : undefined;
const { data: alert } = useFetchAlert({ id: alertId });
🍰 nit: this logic is required every time we use useFetchAlert(). Maybe we can refactor the hook to encapsulate it: the hook itself could use useInvestigation to retrieve the investigation, so we wouldn't need to expose the originating alert in this context. I'm already worried about this context becoming bloated with too many things.
const { data: alert } = useFetchAlertOrigin()
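To make the suggestion above concrete, here is a minimal sketch of the derivation such a hook would hide from its callers. Everything in it is hypothetical: the `Investigation` and `AlertOrigin` shapes are simplified guesses, `getAlertOriginId` is an illustrative name, and the hand-written type guard only stands in for `alertOriginSchema.safeParse()` so the snippet runs without dependencies.

```typescript
// Hypothetical, simplified shapes; the real types live in the investigation plugin.
interface AlertOrigin {
  type: 'alert';
  id: string;
}

interface Investigation {
  id: string;
  origin?: unknown;
}

// Hand-written guard standing in for alertOriginSchema.safeParse() (assumption).
function isAlertOrigin(origin: unknown): origin is AlertOrigin {
  if (typeof origin !== 'object' || origin === null) return false;
  const candidate = origin as Record<string, unknown>;
  return candidate.type === 'alert' && typeof candidate.id === 'string';
}

// Returns the originating alert id, or undefined when the investigation is
// missing or did not start from an alert.
function getAlertOriginId(investigation?: Investigation): string | undefined {
  const origin = investigation?.origin;
  return isAlertOrigin(origin) ? origin.id : undefined;
}
```

A `useFetchAlertOrigin()` hook could then call `useInvestigation()`, pass the result through a helper like this, and forward the id to `useFetchAlert`, so that callers never touch the origin parsing themselves.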
…ooks/use_fetch_entities.ts
Some questions and nits, but otherwise looks good to me.
I guess I need to set up a GenAI connector to test this. Do you have a guide for that?
{investigation?.id && (
  <EuiFlexItem grow={false}>
    <AssistantHypothesis investigationId={investigation.id} />
  </EuiFlexItem>
)}
🍰 nit: use the context hook useInvestigation() directly from AssistantHypothesis:
- {investigation?.id && (
-   <EuiFlexItem grow={false}>
-     <AssistantHypothesis investigationId={investigation.id} />
-   </EuiFlexItem>
- )}
+ <EuiFlexItem grow={false}>
+   <AssistantHypothesis />
+ </EuiFlexItem>
I actually had this originally, but it meant the investigation was sometimes undefined, and I hated having to handle that all the time. Would you prefer that trade-off?
});
export const SERVICE_ENTITIES_HISTORY_ALIAS = entitiesAliasPattern({
  type: 'service',
  dataset: ENTITY_HISTORY,
I thought EEM had removed the history?
They did. I'll remove this for now.
  hostName,
  entitiesEsClient,
}: {
  context: InvestigateAppRequestHandlerContext;
If possible let's try to not leak route/request details into the services. Here we can replace the whole request handler context with the esClient, and do the wiring in the route handler.
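A rough sketch of the wiring described above, assuming nothing about the real Kibana types: `MinimalEsClient`, `HandlerContext`, `getEntitiesByHostName`, `entitiesRouteHandler`, and the `entities-placeholder` index name are all hypothetical stand-ins. The point is only that the service receives a client, while the route handler is the one place that knows about the request context.

```typescript
// Minimal stand-in for the part of the Elasticsearch client the service uses (hypothetical).
interface MinimalEsClient {
  search(params: { index: string; query: Record<string, unknown> }): Promise<{ hits: string[] }>;
}

// Hypothetical request handler context; only the route handler sees it.
interface HandlerContext {
  getEsClient(): Promise<MinimalEsClient>;
}

// The service depends on the client alone, so it can be tested with a fake.
async function getEntitiesByHostName({
  esClient,
  hostName,
}: {
  esClient: MinimalEsClient;
  hostName: string;
}): Promise<string[]> {
  const response = await esClient.search({
    index: 'entities-placeholder', // hypothetical index name
    query: { term: { 'host.name': hostName } },
  });
  return response.hits;
}

// The route handler resolves the client from the context and passes it down.
async function entitiesRouteHandler(
  context: HandlerContext,
  params: { hostName: string }
): Promise<string[]> {
  const esClient = await context.getEsClient();
  return getEntitiesByHostName({ esClient, hostName: params.hostName });
}
```

With this split, a unit test for the service only needs an object with a `search` method, not a full request handler context.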
  );
}

const getEntitySource = async ({ index }: { index: IndicesIndexState }) => {
Does it need to be async?
return await Promise.all(
  Object.values(indices).map(async (index) => {
    return await getEntitySource({ index });
  })
);
Do we need the Promise.all and await here?
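For reference, this is what the synchronous version could look like if getEntitySource does no I/O. That is an assumption: the function body is not visible in this thread, so the `data_stream` lookup below is a guess, and `IndexState` and `EntitySource` are trimmed-down hypothetical types rather than the real `IndicesIndexState`.

```typescript
// Trimmed-down, hypothetical stand-in for IndicesIndexState.
interface IndexState {
  data_stream?: string;
}

// Hypothetical result shape.
interface EntitySource {
  dataStream?: string;
}

// No awaiting happens inside, so the function does not need to be async.
function getEntitySource({ index }: { index: IndexState }): EntitySource {
  return { dataStream: index.data_stream };
}

// With a synchronous mapper, a plain map() replaces Promise.all plus await.
function getEntitySources(indices: Record<string, IndexState>): EntitySource[] {
  return Object.values(indices).map((index) => getEntitySource({ index }));
}
```

If getEntitySource later needs to query Elasticsearch, the async signature and Promise.all would become necessary again.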
const sourceIndex = entity?.sourceIndex;
if (!sourceIndex) return null;

const indices = await esClient.indices.get({ index: sourceIndex });
🍰 nit: it's probably too early to optimize, but this call is made in a double for-loop. Is there a way to call esClient.indices.get for all sourceIndex values at once?
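One possible shape for the batching asked about above, sketched under assumptions: `sourceIndex` is treated as an array of index names, `MinimalIndicesClient` is a hypothetical stand-in for the real client, and `getIndicesForEntities` is an illustrative name. The get index API accepts a list of index names, so the names can be collected and de-duplicated first and then fetched in a single request.

```typescript
// Trimmed-down, hypothetical stand-in for IndicesIndexState.
interface IndexState {
  data_stream?: string;
}

// Hypothetical minimal client surface; the real client accepts more options.
interface MinimalIndicesClient {
  indices: {
    get(params: { index: string[] }): Promise<Record<string, IndexState>>;
  };
}

// Assumption: each entity lists the indices it was sourced from.
interface Entity {
  sourceIndex?: string[];
}

async function getIndicesForEntities(
  esClient: MinimalIndicesClient,
  entities: Entity[]
): Promise<Record<string, IndexState>> {
  // Collect every source index once; these loops run in memory only.
  const uniqueIndices = new Set<string>();
  for (const entity of entities) {
    for (const name of entity.sourceIndex ?? []) {
      uniqueIndices.add(name);
    }
  }
  if (uniqueIndices.size === 0) return {};
  // A single request covers all entities.
  return esClient.indices.get({ index: Array.from(uniqueIndices) });
}
```

One trade-off to keep in mind: with a single request, one missing index can affect the whole call unless an option such as `ignore_unavailable` is set, whereas per-entity requests fail independently.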
34235a6 to 53655e5
Other than "It's already in the requiredPlugins", kibana.jsonc lgtm.
The guide for setting up the connector can be found here: https://github.com/elastic/kibana/blob/main/x-pack/plugins/observability_solution/observability_ai_assistant/README.md
You'll also need to start your knowledge base. The easiest way to do that is, after setting up your connector, to open the Assistant flyout via the Assistant button at the top right and click the start knowledge base button.
…clarke/kibana into feature/investigation-entities
b9de0ca to f95017d
💚 Build Succeeded
Starting backport for target branches: 8.x https://github.com/elastic/kibana/actions/runs/11184673144
💔 All backports failed
💚 All backports created successfully
Note: Successful backport PRs will be merged automatically after passing CI.
…nsight (elastic#194432)

## Summary

Adds a route that can be used to fetch entities related to an investigation.

The route fetches associated entities by service name, host name, or container ID. It then identifies the associated indices and data streams.

The discovered entities are passed to the contextual insight to inform the LLM.

![image](https://github.com/user-attachments/assets/855a8d68-b039-4557-ba23-5661cd961021)

This PR represents the first step in developing an AI-informed hypothesis at the beginning of the investigation. Over time, further insights will be provided to the LLM to deepen its investigative analysis and propose a more helpful root cause hypothesis.

### Testing

1. Create some APM data. I'm using the otel demo and triggering a failure via the flagd service. Since this is in flux, you can reach out to me about this workflow. However, you can also create APM data via `synth-trace`.
2. Create a custom threshold rule that you expect to trigger an alert. I created mine using `http.response.status_code: 500 / http.response.status_code : *` and set a low threshold based on the number of failures in my current test data. Be sure to also group the alert by `service.name`.
3. Wait for the alert to fire, then visit the alert details page and start an investigation.
4. Notice the contextual insight. Expand it to see more information.

---------

Co-authored-by: kibanamachine <[email protected]>

(cherry picked from commit e4bb435)
…tual Insight (#194432) (#195158)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Investigation app] add entities route and investigation Contextual Insight (#194432)](#194432)

### Questions ?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

Co-authored-by: Rickyanto Ang <[email protected]>
Summary
Adds a route that can be used to fetch entities related to an investigation.
The route fetches associated entities by service name, host name, or container ID. It then identifies the associated indices and data streams.
The discovered entities are passed to the contextual insight to inform the LLM.
This PR represents the first step in developing an AI-informed hypothesis at the beginning of the investigation. Over time, further insights will be provided to the LLM to deepen its investigative analysis and propose a more helpful root cause hypothesis.
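As an illustration of the lookup described in the summary (by service name, host name, or container ID), here is a small sketch of how those identifiers could be turned into Elasticsearch term filters. It is not the PR's implementation: the diff snippets in this conversation suggest the real code builds msearch requests, while this sketch uses a single bool query for brevity. `EntityLookup` and `buildEntityQuery` are hypothetical names, and the `service.name`, `host.name`, and `container.id` field names are assumed to match what the entity indices store.

```typescript
// Hypothetical input: whichever identifiers the investigation provides.
interface EntityLookup {
  serviceName?: string;
  hostName?: string;
  containerId?: string;
}

type TermFilter = { term: Record<string, string> };

interface EntityQuery {
  bool: { should: TermFilter[]; minimum_should_match: number };
}

// Builds an OR query over the provided identifiers.
// Returns null when no identifier is given, so the caller can skip the search.
function buildEntityQuery(lookup: EntityLookup): EntityQuery | null {
  const should: TermFilter[] = [];
  if (lookup.serviceName) should.push({ term: { 'service.name': lookup.serviceName } });
  if (lookup.hostName) should.push({ term: { 'host.name': lookup.hostName } });
  if (lookup.containerId) should.push({ term: { 'container.id': lookup.containerId } });
  if (should.length === 0) return null;
  return { bool: { should, minimum_should_match: 1 } };
}
```

For example, `buildEntityQuery({ serviceName: 'checkout', containerId: 'abc' })` yields a bool query with two `should` clauses, one per identifier.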
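Testing
1. Create some APM data. I'm using the otel demo and triggering a failure via the flagd service. Since this is in flux, you can reach out to me about this workflow. However, you can also create APM data via `synth-trace`.
2. Create a custom threshold rule that you expect to trigger an alert. I created mine using `http.response.status_code: 500 / http.response.status_code : *` and set a low threshold based on the number of failures in my current test data. Be sure to also group the alert by `service.name`.
3. Wait for the alert to fire, then visit the alert details page and start an investigation.
4. Notice the contextual insight. Expand it to see more information.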