# [8.x] [Security Solution] [Attack discovery] Output chunking / refinement, LangGraph migration, and evaluation improvements (#195669) (#196334)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Security Solution] [Attack discovery] Output chunking / refinement, LangGraph migration, and evaluation improvements (#195669)](https://github.com/elastic/kibana/pull/195669)

<!--- Backport version: 9.4.3 -->

### Questions?

Please refer to the [Backport tool documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Andrew
Macri","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-10-15T14:39:48Z","message":"[Security
Solution] [Attack discovery] Output chunking / refinement, LangGraph
migration, and evaluation improvements (#195669)\n\n## [Security
Solution] [Attack discovery] Output chunking / refinement, LangGraph
migration, and evaluation improvements\r\n\r\n### Summary\r\n\r\nThis PR
improves the Attack discovery user and developer experience with output
chunking / refinement, migration to LangGraph, and improvements to
evaluations.\r\n\r\nThe improvements were realized by transitioning from
directly using lower-level LangChain apis to LangGraph in this PR, and a
deeper integration with the evaluation features of
LangSmith.\r\n\r\n#### Output chunking\r\n\r\n_Output chunking_
#### Output chunking

_Output chunking_ increases the maximum and default number of alerts sent as context, working around the output token limitations of popular large language models (LLMs):

|                | Old   | New   |
|----------------|-------|-------|
| max alerts     | `100` | `500` |
| default alerts | `20`  | `200` |

See _Output chunking details_ below for more information.
#### Settings

A new settings modal makes it possible to configure the number of alerts sent as context directly from the Attack discovery page:

![settings](https://github.com/user-attachments/assets/3f5ab4e9-5eae-4f99-8490-e392c758fa6e)

- Previously, users configured this value for Attack discovery via the security assistant Knowledge base settings, as documented [here](https://www.elastic.co/guide/en/security/8.15/attack-discovery.html#attack-discovery-generate-discoveries)
- The new settings modal uses local storage (instead of the previously-shared assistant Knowledge base setting, which is stored in Elasticsearch), as sketched below
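A minimal sketch of that persistence model follows; the storage key, default, and helper names are illustrative assumptions, not the actual Kibana implementation:

```typescript
// Hypothetical sketch: persisting the "max alerts" setting in local storage.
const MAX_ALERTS_STORAGE_KEY = 'securitySolution.attackDiscovery.maxAlerts'; // assumed key
const DEFAULT_MAX_ALERTS = 200;

export const getMaxAlerts = (): number => {
  const stored = localStorage.getItem(MAX_ALERTS_STORAGE_KEY);
  const parsed = stored != null ? Number(stored) : NaN;
  return Number.isFinite(parsed) ? parsed : DEFAULT_MAX_ALERTS;
};

export const setMaxAlerts = (value: number): void => {
  localStorage.setItem(MAX_ALERTS_STORAGE_KEY, String(value));
};
```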
#### Output refinement

_Output refinement_ automatically combines related discoveries (that were previously represented as two or more discoveries):

![default_attack_discovery_graph](https://github.com/user-attachments/assets/c092bb42-a41e-4fba-85c2-a4b2c1ef3053)

- The `refine` step in the graph diagram above may, for example, combine three discoveries from the `generate` step into two discoveries when they are related, as sketched below
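The control flow in the diagram can be summarized with a framework-free sketch. The real implementation wires these steps together with LangGraph; the state shape and helper names below are illustrative only:

```typescript
// Framework-free sketch of the graph in the diagram above (names are illustrative).
interface Discovery {
  title: string;
  detailsMarkdown: string;
}

interface GraphState {
  anonymizedAlerts: string[];
  generations: string[]; // accumulated partial output from `generate`
  refinements: string[]; // accumulated partial output from `refine`
  attackDiscoveries: Discovery[] | null;
}

// Assumed step implementations; each returns an updated copy of the state.
declare function retrieveAnonymizedAlerts(state: GraphState): Promise<GraphState>;
declare function generate(state: GraphState): Promise<GraphState>;
declare function refine(state: GraphState): Promise<GraphState>;

async function runAttackDiscoveryGraph(state: GraphState): Promise<GraphState> {
  state = await retrieveAnonymizedAlerts(state); // skipped when alerts are replayed
  state = await generate(state); // loops internally; see "Output chunking details"
  state = await refine(state); // also loops, sharing the same attempt budget
  return state;
}
```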
#### Hallucination detection

New _hallucination detection_ displays an error in lieu of showing hallucinated output:

![hallucination_detection](https://github.com/user-attachments/assets/1d849908-3f10-4fe8-8741-c0cf418b1524)

- A new tour step was added to the Attack discovery page to share the improvements:

![tour_step](https://github.com/user-attachments/assets/0cedf770-baba-41b1-8ec6-b12b14c0c57a)
### Summary of improvements for developers

The following features improve the developer experience when running evaluations for Attack discovery:

#### Replay alerts in evaluations

This evaluation feature eliminates the need to populate a local environment with alerts to (re)run evaluations:

![alerts_as_input](https://github.com/user-attachments/assets/b29dc847-3d53-4b17-8757-ed59852c1623)

Alert replay skips the `retrieve_anonymized_alerts` step in the graph, because it uses the `anonymizedAlerts` and `replacements` provided as `Input` in a dataset example. See _Replay alerts in evaluations details_ below for more information.
#### Override graph state

Override graph state via dataset examples to test prompt improvements and edge cases in evaluations:

![override_graph_input](https://github.com/user-attachments/assets/a685177b-1e07-4f49-9b8d-c0b652975237)

To use this feature, add an `overrides` key to the `Input` of a dataset example. See _Override graph state details_ below for more information.
#### New custom evaluator

Prior to this PR, an evaluator had to be manually added to each dataset in LangSmith to use an LLM as the judge for correctness.

This PR introduces a custom, programmatic evaluator that handles anonymization automatically, and eliminates the need to manually create evaluators in LangSmith. To use it, simply run evaluations from the `Evaluation` tab in settings.
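The rough shape of such an evaluator is sketched below. All names are assumptions; this is a conceptual outline of "de-anonymize both sides, then ask a judge model to score correctness", not the evaluator shipped in this PR:

```typescript
// Conceptual sketch of a programmatic correctness evaluator (names are assumed).
interface EvaluatorResult {
  key: string;
  score: number; // 0..1, as judged by the evaluator model
}

async function correctnessEvaluator(
  judge: (prompt: string) => Promise<string>, // the evaluator (or connector) model
  prediction: string, // the run's output
  reference: string, // the example's expected `Output`
  replacements: Record<string, string> // anonymized value -> original value
): Promise<EvaluatorResult> {
  // Apply the same replacements to both sides so the judge compares like with like.
  const deanonymize = (text: string) =>
    Object.entries(replacements).reduce((acc, [anon, orig]) => acc.replaceAll(anon, orig), text);

  const verdict = await judge(
    `On a scale of 0 to 1, score how well the prediction matches the reference.\n\n` +
      `Reference:\n${deanonymize(reference)}\n\nPrediction:\n${deanonymize(prediction)}\n\n` +
      `Respond with only the numeric score.`
  );

  return { key: 'correctness', score: Number(verdict) || 0 };
}
```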
new settings in the `Evaluation`
tab:\r\n\r\n![new_evaluation_settings](https://github.com/user-attachments/assets/ca72aa2a-b0dc-4bec-9409-386d77d6a2f4)\r\n\r\nNew
evaluation settings:\r\n\r\n- `Evaluator model (optional)` - Judge the
quality of predictions using a single model. (Default: use the same
model as the connector)\r\n\r\nThis new setting is useful when you want
to use the same model, e.g. `GPT-4o` to judge the quality of all the
models evaluated in an experiment.\r\n\r\n- `Default max alerts` - The
default maximum number of alerts to send as context, which may be
overridden by the example input\r\n\r\nThis new setting is useful when
using the alerts in the local environment to run evaluations. Examples
that use the Alerts replay feature will ignore this value, because the
alerts in the example `Input` will be used instead.\r\n\r\n####
#### Directory structure refactoring

- The server-side directory structure was refactored to consolidate the location of Attack discovery related files

### Details

This section describes some of the improvements above in detail.
#### Output chunking details

The new output chunking feature increases the maximum and default number of alerts that may be sent as context. It achieves this improvement by working around output token limitations.

LLMs have different limits for the number of tokens accepted as _input_ for requests, and the number of tokens available for _output_ when generating responses.

Today, the output token limits of most popular models are significantly smaller than their input token limits.

For example, at the time of this writing, the Gemini 1.5 Pro model's limits are ([source](https://ai.google.dev/gemini-api/docs/models/gemini)):

- Input token limit: `2,097,152`
- Output token limit: `8,192`

As a result of this relatively smaller output token limit, previous versions of Attack discovery would simply fail when an LLM ran out of output tokens while generating a response. This often happened mid-sentence, and resulted in errors or hallucinations being displayed to users.

The new output chunking feature detects incomplete responses from the LLM in the `generate` step of the graph. When an incomplete response is detected, the `generate` step will run again with:

- The original prompt
- The alerts provided as context
- The partially generated response
- Instructions to "continue where you left off"

The `generate` step in the graph will run until one of the following conditions is met (see the sketch after this list):

- The incomplete response can be successfully parsed
- The maximum number of generation attempts (default: `10`) is reached
- The maximum number of hallucinations detected (default: `5`) is reached
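Put together, the loop behaves roughly like the sketch below; the prompt wording, parser, and helper names are assumptions rather than Kibana's actual code:

```typescript
// Illustrative sketch of the `generate` continuation loop described above.
const MAX_GENERATION_ATTEMPTS = 10;
const MAX_HALLUCINATION_FAILURES = 5;

// Assumed helper: real hallucination detection is more involved (see below).
declare function looksHallucinated(accumulated: string): boolean;

async function generateUntilParsedOrLimit(
  llm: (prompt: string) => Promise<string>,
  basePrompt: string // the original prompt, including the alerts provided as context
): Promise<unknown> {
  let partial = '';
  let attempts = 0;
  let hallucinations = 0;

  while (attempts < MAX_GENERATION_ATTEMPTS && hallucinations < MAX_HALLUCINATION_FAILURES) {
    attempts++;

    const prompt =
      partial === ''
        ? basePrompt
        : `${basePrompt}\n\nPartial response:\n${partial}\n\nContinue exactly where you left off.`;

    const chunk = await llm(prompt);

    if (looksHallucinated(partial + chunk)) {
      hallucinations++;
      partial = ''; // discard the accumulated generations and restart
      continue;
    }

    partial += chunk;

    try {
      return JSON.parse(partial); // success: the accumulated response now parses
    } catch {
      // still incomplete (likely hit the output token limit); loop and continue
    }
  }

  throw new Error('Maximum generation attempts or hallucination limit reached');
}
```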
#### Output refinement details

The new output refinement feature automatically combines related discoveries (that were previously represented as two or more discoveries).

The new `refine` step in the graph re-submits the discoveries from the `generate` step with a `refinePrompt` to combine related attack discoveries.

The `refine` step is subject to the model's output token limits, just like the `generate` step. That means a response to the refine prompt from the LLM may be cut off mid-sentence. To that end:

- The `refine` step will re-run until the same (shared) `maxGenerationAttempts` and `maxHallucinationFailures` limits as the `generate` step are reached
- The maximum number of attempts (default: `10`) is _shared_ with the `generate` step. For example, if it took `7` tries (`generationAttempts`) to complete the `generate` step, the `refine` step will only run up to `3` times, as illustrated below

The `refine` step will return _unrefined_ results from the `generate` step when:

- The `generate` step uses all `10` generation attempts. When this happens, the `refine` step will be skipped, and the unrefined output of the `generate` step will be returned to the user
- The `refine` step uses all remaining attempts, but fails to produce a refined response, due to output token limitations or hallucinations in the refined response
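The shared budget is simple arithmetic, as this sketch (with assumed names) shows:

```typescript
// Sketch of the shared attempt budget described above (names are assumed).
const maxGenerationAttempts = 10; // shared by `generate` and `refine`
const generationAttempts = 7; // attempts used by the `generate` step

// The `refine` step may use only what remains of the shared budget: 3 attempts.
const remainingRefineAttempts = maxGenerationAttempts - generationAttempts;
```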
#### Hallucination detection details

Before this PR, Attack discovery directly used lower-level LangChain APIs to parse responses from the LLM. After this PR, Attack discovery uses LangGraph.

In the previous implementation, when Attack discovery received an incomplete response because the output token limits of a model were hit, the LangChain APIs automatically re-submitted the incomplete response in an attempt to "repair" it. However, the re-submitted results didn't include all of the original context (i.e. the alerts that generated them). The repair process often resulted in hallucinated results being presented to users, especially with some models, e.g. `Claude 3.5 Haiku`.

In this PR, the `generate` and `refine` steps detect (some) hallucinations. When hallucinations are detected:

- The current accumulated `generations` or `refinements` are (respectively) discarded, effectively restarting the `generate` or `refine` process
- The `generate` and `refine` steps will be retried until the maximum generation attempts (default: `10`) or hallucinations detected (default: `5`) limits are reached

Hitting the hallucination limit during the `generate` step will result in an error being displayed to the user.

Hitting the hallucination limit during the `refine` step will result in the unrefined discoveries being displayed to the user.
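This summary does not spell out the detection heuristics, but one plausible shape for such a check, purely illustrative and not necessarily what the PR implements, is to reject output that references anonymized identifiers absent from the input:

```typescript
// Purely illustrative: one way a hallucination check *could* work.
// Anonymized entities are assumed to appear as UUID placeholders in both the
// input alerts and the generated output.
function referencesUnknownEntities(output: string, knownReplacements: Set<string>): boolean {
  const uuidPattern = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/gi;
  const referenced = output.match(uuidPattern) ?? [];

  // A placeholder that never appeared in the input is a strong hallucination signal.
  // (Assumes `knownReplacements` stores lowercase placeholder IDs.)
  return referenced.some((id) => !knownReplacements.has(id.toLowerCase()));
}
```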
#### Replay alerts in evaluations details

Alert replay makes it possible to re-run evaluations, even when your local deployment has zero alerts.

This feature eliminates the chore of populating your local instance with specific alerts for each example.

Every example in a dataset may (optionally) specify a different set of alerts.

Alert replay skips the `retrieve_anonymized_alerts` step in the graph, because it uses the `anonymizedAlerts` and `replacements` provided as `Input` in a dataset example.

The following instructions document the process of creating a new LangSmith dataset example that uses the Alert replay feature:

1) In Kibana, navigate to Security > Attack discovery

2) Click `Generate` to generate Attack discoveries

3) In LangSmith, navigate to Projects > _Your project_

4) In the `Runs` tab of the LangSmith project, click on the latest `Attack discovery` entry to open the trace

5) **IMPORTANT**: In the trace, select the **LAST** `ChannelWriteChannelWrite<attackDiscoveries,attackDisc...` entry. The last entry will appear inside the **LAST** `refine` step in the trace, as illustrated by the screenshot below:

![last_channel_write](https://github.com/user-attachments/assets/c57fc803-3bbb-4603-b99f-d2b130428201)

6) With the last `ChannelWriteChannelWrite<attackDiscoveries,attackDisc...` entry selected, click `Add to` > `Add to Dataset`

7) Copy-paste the `Input` to the `Output`, because evaluation Experiments always compare the current run with the `Output` in an example.

- This step is _always_ required to create a dataset.
- If you don't want to use the Alert replay feature, replace `Input` with an empty object:

```json
{}
```

8) Choose an existing dataset, or create a new one

9) Click the `Submit` button to add the example to the dataset.

After completing the steps above, the dataset is ready to be run in evaluations.
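For reference, an example that uses Alert replay has an `Input` shaped roughly like the following. The field values are invented for illustration; a real example contains the anonymized alert documents and replacements captured from the trace:

```json
{
  "anonymizedAlerts": [
    {
      "pageContent": "kibana.alert.risk_score,99\nhost.name,2f5b7c0e-hypothetical-host-id\nuser.name,9a41d6b8-hypothetical-user-id",
      "metadata": {}
    }
  ],
  "replacements": {
    "2f5b7c0e-hypothetical-host-id": "my-host",
    "9a41d6b8-hypothetical-user-id": "my-user"
  }
}
```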
#### Override graph state details

When a dataset is run in an evaluation (to create Experiments):

- The (optional) `anonymizedAlerts` and `replacements` provided as `Input` in the example will be replayed, bypassing the `retrieve_anonymized_alerts` step in the graph
- The rest of the properties in `Input` will not be used as inputs to the graph
- In contrast, an empty object `{}` in `Input` means the latest and riskiest alerts in the last 24 hours in the local environment will be queried

In addition to the above, you may add an optional `overrides` key in the `Input` of a dataset example to test changes or edge cases. This is useful for evaluating changes without updating the code directly.

The `overrides` key sets the initial state of the graph before it's run in an evaluation.

The example `Input` below overrides the prompts used in the `generate` and `refine` steps:

```json
{
  "overrides": {
    "refinePrompt": "This overrides the refine prompt",
    "attackDiscoveryPrompt": "This overrides the attack discovery prompt"
  }
}
```

To use the `overrides` feature in evaluations to set the initial state of the graph:

1) Create a dataset example, as documented in the _Replay alerts in evaluations details_ section above

2) In LangSmith, navigate to Datasets & Testing > _Your Dataset_

3) In the dataset, click the Examples tab

4) Click an example to open it in the flyout

5) Click the `Edit` button to edit the example

6) Add the `overrides` key shown below to the `Input`, e.g.:

```json
{
  "overrides": {
    "refinePrompt": "This overrides the refine prompt",
    "attackDiscoveryPrompt": "This overrides the attack discovery prompt"
  }
}
```

7) Edit the `overrides` in the example `Input` above to add (or remove) entries that will determine the initial state of the graph.

All of the `overrides` shown in step 6 are optional. The `refinePrompt` and `attackDiscoveryPrompt` could be removed from the `overrides` example above, and replaced with `maxGenerationAttempts` to test a higher limit.

All valid graph state may be specified in `overrides`.
SecuritySolution","ci:cloud-deploy","ci:cloud-persist-deployment","Team:Security
Generative AI","v8.16.0","backport:version"],"title":"[Security
Solution] [Attack discovery] Output chunking / refinement, LangGraph
migration, and evaluation
improvements","number":195669,"url":"https://github.com/elastic/kibana/pull/195669","mergeCommit":{"message":"[Security
Solution] [Attack discovery] Output chunking / refinement, LangGraph
migration, and evaluation improvements (#195669)\n\n## [Security
Solution] [Attack discovery] Output chunking / refinement, LangGraph
migration, and evaluation improvements\r\n\r\n### Summary\r\n\r\nThis PR
improves the Attack discovery user and developer experience with output
chunking / refinement, migration to LangGraph, and improvements to
evaluations.\r\n\r\nThe improvements were realized by transitioning from
directly using lower-level LangChain apis to LangGraph in this PR, and a
deeper integration with the evaluation features of
LangSmith.\r\n\r\n#### Output chunking\r\n\r\n_Output chunking_
increases the maximum and default number of alerts sent as context,
working around the output token limitations of popular large language
models (LLMs):\r\n\r\n| | Old | New
|\r\n|----------------|-------|-------|\r\n| max alerts | `100` | `500`
|\r\n| default alerts | `20` | `200` |\r\n\r\nSee _Output chunking
details_ below for more information.\r\n\r\n#### Settings\r\n\r\nA new
settings modal makes it possible to configure the number of alerts sent
as context directly from the Attack discovery
page:\r\n\r\n![settings](https://github.com/user-attachments/assets/3f5ab4e9-5eae-4f99-8490-e392c758fa6e)\r\n\r\n-
Previously, users configured this value for Attack discovery via the
security assistant Knowledge base settings, as documented
[here](https://www.elastic.co/guide/en/security/8.15/attack-discovery.html#attack-discovery-generate-discoveries)\r\n-
The new settings modal uses local storage (instead of the
previously-shared assistant Knowledge base setting, which is stored in
Elasticsearch)\r\n\r\n#### Output refinement\r\n\r\n_Output refinement_
automatically combines related discoveries (that were previously
represented as two or more discoveries):\r\n\r\n
![default_attack_discovery_graph](https://github.com/user-attachments/assets/c092bb42-a41e-4fba-85c2-a4b2c1ef3053)\r\n\r\n-
The `refine` step in the graph diagram above may (for example), combine
three discoveries from the `generate` step into two discoveries when
they are related\r\n\r\n### Hallucination detection\r\n\r\nNew
_hallucination detection_ displays an error in lieu of showing
hallucinated
output:\r\n\r\n![hallucination_detection](https://github.com/user-attachments/assets/1d849908-3f10-4fe8-8741-c0cf418b1524)\r\n\r\n-
A new tour step was added to the Attack discovery page to share the
improvements:\r\n\r\n![tour_step](https://github.com/user-attachments/assets/0cedf770-baba-41b1-8ec6-b12b14c0c57a)\r\n\r\n###
Summary of improvements for developers\r\n\r\nThe following features
improve the developer experience when running evaluations for Attack
discovery:\r\n\r\n#### Replay alerts in evaluations\r\n\r\nThis
evaluation feature eliminates the need to populate a local environment
with alerts to (re)run evaluations:\r\n\r\n
![alerts_as_input](https://github.com/user-attachments/assets/b29dc847-3d53-4b17-8757-ed59852c1623)\r\n\r\nAlert
replay skips the `retrieve_anonymized_alerts` step in the graph, because
it uses the `anonymizedAlerts` and `replacements` provided as `Input` in
a dataset example. See _Replay alerts in evaluations details_ below for
more information.\r\n\r\n#### Override graph state\r\n\r\nOverride graph
state via datatset examples to test prompt improvements and edge cases
via evaluations:\r\n\r\n
![override_graph_input](https://github.com/user-attachments/assets/a685177b-1e07-4f49-9b8d-c0b652975237)\r\n\r\nTo
use this feature, add an `overrides` key to the `Input` of a dataset
example. See _Override graph state details_ below for more
information.\r\n\r\n#### New custom evaluator\r\n\r\nPrior to this PR,
an evaluator had to be manually added to each dataset in LangSmith to
use an LLM as the judge for correctness.\r\n\r\nThis PR introduces a
custom, programmatic evaluator that handles anonymization automatically,
and eliminates the need to manually create evaluators in LangSmith. To
use it, simply run evaluations from the `Evaluation` tab in
settings.\r\n\r\n#### New evaluation settings\r\n\r\nThis PR introduces
new settings in the `Evaluation`
tab:\r\n\r\n![new_evaluation_settings](https://github.com/user-attachments/assets/ca72aa2a-b0dc-4bec-9409-386d77d6a2f4)\r\n\r\nNew
evaluation settings:\r\n\r\n- `Evaluator model (optional)` - Judge the
quality of predictions using a single model. (Default: use the same
model as the connector)\r\n\r\nThis new setting is useful when you want
to use the same model, e.g. `GPT-4o` to judge the quality of all the
models evaluated in an experiment.\r\n\r\n- `Default max alerts` - The
default maximum number of alerts to send as context, which may be
overridden by the example input\r\n\r\nThis new setting is useful when
using the alerts in the local environment to run evaluations. Examples
that use the Alerts replay feature will ignore this value, because the
alerts in the example `Input` will be used instead.\r\n\r\n####
Directory structure refactoring\r\n\r\n- The server-side directory
structure was refactored to consolidate the location of Attack discovery
related files\r\n\r\n### Details\r\n\r\nThis section describes some of
the improvements above in detail.\r\n\r\n#### Output chunking
details\r\n\r\nThe new output chunking feature increases the maximum and
default number of alerts that may be sent as context. It achieves this
improvement by working around output token limitations.\r\n\r\nLLMs have
different limits for the number of tokens accepted as _input_ for
requests, and the number of tokens available for _output_ when
generating responses.\r\n\r\nToday, the output token limits of most
popular models are significantly smaller than the input token
limits.\r\n\r\nFor example, at the time of this writing, the Gemini 1.5
Pro model's limits are
([source](https://ai.google.dev/gemini-api/docs/models/gemini)):\r\n\r\n-
Input token limit: `2,097,152`\r\n- Output token limit:
`8,192`\r\n\r\nAs a result of this relatively smaller output token
limit, previous versions of Attack discovery would simply fail when an
LLM ran out of output tokens when generating a response. This often
happened \"mid sentence\", and resulted in errors or hallucinations
being displayed to users.\r\n\r\nThe new output chunking feature detects
incomplete responses from the LLM in the `generate` step of the Graph.
When an incomplete response is detected, the `generate` step will run
again with:\r\n\r\n- The original prompt\r\n- The Alerts provided as
context\r\n- The partially generated response\r\n- Instructions to
\"continue where you left off\"\r\n\r\nThe `generate` step in the graph
will run until one of the following conditions is met:\r\n\r\n- The
incomplete response can be successfully parsed\r\n- The maximum number
of generation attempts (default: `10`) is reached\r\n- The maximum
number of hallucinations detected (default: `5`) is reached\r\n\r\n####
Output refinement details\r\n\r\nThe new output refinement feature
automatically combines related discoveries (that were previously
represented as two or more discoveries).\r\n\r\nThe new `refine` step in
the graph re-submits the discoveries from the `generate` step with a
`refinePrompt` to combine related attack discoveries.\r\n\r\nThe
`refine` step is subject to the model's output token limits, just like
the `generate` step. That means a response to the refine prompt from the
LLM may be cut off \"mid\" sentence. To that end:\r\n\r\n- The refine
step will re-run until the (same, shared) `maxGenerationAttempts` and
`maxHallucinationFailures` limits as the `generate` step are
reached\r\n- The maximum number of attempts (default: `10`) is _shared_
with the `generate` step. For example, if it took `7` tries
(`generationAttempts`) to complete the `generate` step, the refine
`step` will only run up to `3` times.\r\n\r\nThe `refine` step will
return _unrefined_ results from the `generate` step when:\r\n\r\n- The
`generate` step uses all `10` generation attempts. When this happens,
the `refine` step will be skipped, and the unrefined output of the
`generate` step will be returned to the user\r\n- If the `refine` step
uses all remaining attempts, but fails to produce a refined response,
due to output token limitations, or hallucinations in the refined
response\r\n\r\n#### Hallucination detection details\r\n\r\nBefore this
PR, Attack discovery directly used lower level LangChain APIs to parse
responses from the LLM. After this PR, Attack discovery uses
LangGraph.\r\n\r\nIn the previous implementation, when Attack discovery
received an incomplete response because the output token limits of a
model were hit, the LangChain APIs automatically re-submitted the
incomplete response in an attempt to \"repair\" it. However, the
re-submitted results didn't include all of the original context (i.e.
alerts that generated them). The repair process often resulted in
hallucinated results being presented to users, especially with some
models i.e. `Claude 3.5 Haiku`.\r\n\r\nIn this PR, the `generate` and
`refine` steps detect (some) hallucinations. When hallucinations are
detected:\r\n\r\n- The current accumulated `generations` or
`refinements` are (respectively) discarded, effectively restarting the
`generate` or `refine` process\r\n- The `generate` and `refine` steps
will be retried until the maximum generation attempts (default: `10`) or
hallucinations detected (default: `5`) limits are reached\r\n\r\nHitting
the hallucination limit during the `generate` step will result in an
error being displayed to the user.\r\n\r\nHitting the hallucination
limit during the `refine` step will result in the unrefined discoveries
being displayed to the user.\r\n\r\n#### Replay alerts in evaluations
details\r\n\r\nAlerts replay makes it possible to re-run evaluations,
even when your local deployment has zero alerts.\r\n\r\nThis feature
eliminates the chore of populating your local instance with specific
alerts for each example.\r\n\r\nEvery example in a dataset may
(optionally) specify a different set of alerts.\r\n\r\nAlert replay
skips the `retrieve_anonymized_alerts` step in the graph, because it
uses the `anonymizedAlerts` and `replacements` provided as `Input` in a
dataset example.\r\n\r\nThe following instructions document the process
of creating a new LangSmith dataset example that uses the Alerts replay
feature:\r\n\r\n1) In Kibana, navigate to Security > Attack
discovery\r\n\r\n2) Click `Generate` to generate Attack
discoveries\r\n\r\n3) In LangSmith, navigate to Projects > _Your
project_\r\n\r\n4) In the `Runs` tab of the LangSmith project, click on
the latest `Attack discovery` entry to open the trace\r\n\r\n5)
**IMPORTANT**: In the trace, select the **LAST**
`ChannelWriteChannelWrite<attackDiscoveries,attackDisc...` entry. The
last entry will appear inside the **LAST** `refine` step in the trace,
as illustrated by the screenshot
below:\r\n\r\n![last_channel_write](https://github.com/user-attachments/assets/c57fc803-3bbb-4603-b99f-d2b130428201)\r\n\r\n6)
With the last `ChannelWriteChannelWrite<attackDiscoveries,attackDisc...`
entry selected, click `Add to` > `Add to Dataset`\r\n\r\n7) Copy-paste
the `Input` to the `Output`, because evaluation Experiments always
compare the current run with the `Output` in an example.\r\n\r\n- This
step is _always_ required to create a dataset.\r\n- If you don't want to
use the Alert replay feature, replace `Input` with an empty
object:\r\n\r\n```json\r\n{}\r\n```\r\n\r\n8) Choose an existing
dataset, or create a new one\r\n\r\n9) Click the `Submit` button to add
the example to the dataset.\r\n\r\nAfter completing the steps above, the
dataset is ready to be run in evaluations.\r\n\r\n#### Override graph
state details\r\n\r\nWhen a dataset is run in an evaluation (to create
Experiments):\r\n\r\n- The (optional) `anonymizedAlerts` and
`replacements` provided as `Input` in the example will be replayed,
bypassing the `retrieve_anonymized_alerts` step in the graph\r\n- The
rest of the properties in `Input` will not be used as inputs to the
graph\r\n- In contrast, an empty object `{}` in `Input` means the latest
and riskiest alerts in the last 24 hours in the local environment will
be queried\r\n\r\nIn addition to the above, you may add an optional
`overrides` key in the `Input` of a dataset example to test changes or
edge cases. This is useful for evaluating changes without updating the
code directly.\r\n\r\nThe `overrides` set the initial state of the graph
before it's run in an evaluation.\r\n\r\nThe example `Input` below
overrides the prompts used in the `generate` and `refine`
steps:\r\n\r\n```json\r\n{\r\n \"overrides\": {\r\n \"refinePrompt\":
\"This overrides the refine prompt\",\r\n \"attackDiscoveryPrompt\":
\"This overrides the attack discovery prompt\"\r\n
}\r\n}\r\n```\r\n\r\nTo use the `overrides` feature in evaluations to
set the initial state of the graph:\r\n\r\n1) Create a dataset example,
as documented in the _Replay alerts in evaluations details_ section
above\r\n\r\n2) In LangSmith, navigate to Datasets & Testing > _Your
Dataset_\r\n\r\n3) In the dataset, click the Examples tab\r\n\r\n4)
Click an example to open it in the flyout\r\n\r\n5) Click the `Edit`
button to edit the example\r\n\r\n6) Add the `overrides` key shown below
to the `Input` e.g.:\r\n\r\n```json\r\n{\r\n \"overrides\": {\r\n
\"refinePrompt\": \"This overrides the refine prompt\",\r\n
\"attackDiscoveryPrompt\": \"This overrides the attack discovery
prompt\"\r\n }\r\n}\r\n```\r\n\r\n7) Edit the `overrides` in the example
`Input` above to add (or remove) entries that will determine the initial
state of the graph.\r\n\r\nAll of the `overides` shown in step 6 are
optional. The `refinePrompt` and `attackDiscoveryPrompt` could be
removed from the `overrides` example above, and replaced with
`maxGenerationAttempts` to test a higher limit.\r\n\r\nAll valid graph
state may be specified in
`overrides`.","sha":"2c21adb8faafc0016ad7a6591837118f6bdf0907"}},"sourceBranch":"main","suggestedTargetBranches":["8.x"],"targetPullRequestStates":[{"branch":"main","label":"v9.0.0","branchLabelMappingKey":"^v9.0.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/195669","number":195669,"mergeCommit":{"message":"[Security
Solution] [Attack discovery] Output chunking / refinement, LangGraph
migration, and evaluation improvements (#195669)\n\n## [Security
Solution] [Attack discovery] Output chunking / refinement, LangGraph
migration, and evaluation improvements\r\n\r\n### Summary\r\n\r\nThis PR
improves the Attack discovery user and developer experience with output
chunking / refinement, migration to LangGraph, and improvements to
evaluations.\r\n\r\nThe improvements were realized by transitioning from
directly using lower-level LangChain apis to LangGraph in this PR, and a
deeper integration with the evaluation features of
LangSmith.\r\n\r\n#### Output chunking\r\n\r\n_Output chunking_
increases the maximum and default number of alerts sent as context,
working around the output token limitations of popular large language
models (LLMs):\r\n\r\n| | Old | New
|\r\n|----------------|-------|-------|\r\n| max alerts | `100` | `500`
|\r\n| default alerts | `20` | `200` |\r\n\r\nSee _Output chunking
details_ below for more information.\r\n\r\n#### Settings\r\n\r\nA new
settings modal makes it possible to configure the number of alerts sent
as context directly from the Attack discovery
page:\r\n\r\n![settings](https://github.com/user-attachments/assets/3f5ab4e9-5eae-4f99-8490-e392c758fa6e)\r\n\r\n-
Previously, users configured this value for Attack discovery via the
security assistant Knowledge base settings, as documented
[here](https://www.elastic.co/guide/en/security/8.15/attack-discovery.html#attack-discovery-generate-discoveries)\r\n-
The new settings modal uses local storage (instead of the
previously-shared assistant Knowledge base setting, which is stored in
Elasticsearch)\r\n\r\n#### Output refinement\r\n\r\n_Output refinement_
automatically combines related discoveries (that were previously
represented as two or more discoveries):\r\n\r\n
![default_attack_discovery_graph](https://github.com/user-attachments/assets/c092bb42-a41e-4fba-85c2-a4b2c1ef3053)\r\n\r\n-
The `refine` step in the graph diagram above may (for example), combine
three discoveries from the `generate` step into two discoveries when
they are related\r\n\r\n### Hallucination detection\r\n\r\nNew
_hallucination detection_ displays an error in lieu of showing
hallucinated
output:\r\n\r\n![hallucination_detection](https://github.com/user-attachments/assets/1d849908-3f10-4fe8-8741-c0cf418b1524)\r\n\r\n-
A new tour step was added to the Attack discovery page to share the
improvements:\r\n\r\n![tour_step](https://github.com/user-attachments/assets/0cedf770-baba-41b1-8ec6-b12b14c0c57a)\r\n\r\n###
Summary of improvements for developers\r\n\r\nThe following features
improve the developer experience when running evaluations for Attack
discovery:\r\n\r\n#### Replay alerts in evaluations\r\n\r\nThis
evaluation feature eliminates the need to populate a local environment
with alerts to (re)run evaluations:\r\n\r\n
![alerts_as_input](https://github.com/user-attachments/assets/b29dc847-3d53-4b17-8757-ed59852c1623)\r\n\r\nAlert
replay skips the `retrieve_anonymized_alerts` step in the graph, because
it uses the `anonymizedAlerts` and `replacements` provided as `Input` in
a dataset example. See _Replay alerts in evaluations details_ below for
more information.\r\n\r\n#### Override graph state\r\n\r\nOverride graph
state via datatset examples to test prompt improvements and edge cases
via evaluations:\r\n\r\n
![override_graph_input](https://github.com/user-attachments/assets/a685177b-1e07-4f49-9b8d-c0b652975237)\r\n\r\nTo
use this feature, add an `overrides` key to the `Input` of a dataset
example. See _Override graph state details_ below for more
information.\r\n\r\n#### New custom evaluator\r\n\r\nPrior to this PR,
an evaluator had to be manually added to each dataset in LangSmith to
use an LLM as the judge for correctness.\r\n\r\nThis PR introduces a
custom, programmatic evaluator that handles anonymization automatically,
and eliminates the need to manually create evaluators in LangSmith. To
use it, simply run evaluations from the `Evaluation` tab in
settings.\r\n\r\n#### New evaluation settings\r\n\r\nThis PR introduces
new settings in the `Evaluation`
tab:\r\n\r\n![new_evaluation_settings](https://github.com/user-attachments/assets/ca72aa2a-b0dc-4bec-9409-386d77d6a2f4)\r\n\r\nNew
evaluation settings:\r\n\r\n- `Evaluator model (optional)` - Judge the
quality of predictions using a single model. (Default: use the same
model as the connector)\r\n\r\nThis new setting is useful when you want
to use the same model, e.g. `GPT-4o` to judge the quality of all the
models evaluated in an experiment.\r\n\r\n- `Default max alerts` - The
default maximum number of alerts to send as context, which may be
overridden by the example input\r\n\r\nThis new setting is useful when
using the alerts in the local environment to run evaluations. Examples
that use the Alerts replay feature will ignore this value, because the
alerts in the example `Input` will be used instead.\r\n\r\n####
Directory structure refactoring\r\n\r\n- The server-side directory
structure was refactored to consolidate the location of Attack discovery
related files\r\n\r\n### Details\r\n\r\nThis section describes some of
the improvements above in detail.\r\n\r\n#### Output chunking
details\r\n\r\nThe new output chunking feature increases the maximum and
default number of alerts that may be sent as context. It achieves this
improvement by working around output token limitations.\r\n\r\nLLMs have
different limits for the number of tokens accepted as _input_ for
requests, and the number of tokens available for _output_ when
generating responses.\r\n\r\nToday, the output token limits of most
popular models are significantly smaller than the input token
limits.\r\n\r\nFor example, at the time of this writing, the Gemini 1.5
Pro model's limits are
([source](https://ai.google.dev/gemini-api/docs/models/gemini)):\r\n\r\n-
Input token limit: `2,097,152`\r\n- Output token limit:
`8,192`\r\n\r\nAs a result of this relatively smaller output token
limit, previous versions of Attack discovery would simply fail when an
LLM ran out of output tokens when generating a response. This often
happened \"mid sentence\", and resulted in errors or hallucinations
being displayed to users.\r\n\r\nThe new output chunking feature detects
incomplete responses from the LLM in the `generate` step of the Graph.
When an incomplete response is detected, the `generate` step will run
again with:\r\n\r\n- The original prompt\r\n- The Alerts provided as
context\r\n- The partially generated response\r\n- Instructions to
\"continue where you left off\"\r\n\r\nThe `generate` step in the graph
will run until one of the following conditions is met:\r\n\r\n- The
incomplete response can be successfully parsed\r\n- The maximum number
of generation attempts (default: `10`) is reached\r\n- The maximum
number of hallucinations detected (default: `5`) is reached\r\n\r\n####
Output refinement details\r\n\r\nThe new output refinement feature
automatically combines related discoveries (that were previously
represented as two or more discoveries).\r\n\r\nThe new `refine` step in
the graph re-submits the discoveries from the `generate` step with a
`refinePrompt` to combine related attack discoveries.\r\n\r\nThe
`refine` step is subject to the model's output token limits, just like
the `generate` step. That means a response to the refine prompt from the
LLM may be cut off \"mid\" sentence. To that end:\r\n\r\n- The refine
step will re-run until the (same, shared) `maxGenerationAttempts` and
`maxHallucinationFailures` limits as the `generate` step are
reached\r\n- The maximum number of attempts (default: `10`) is _shared_
with the `generate` step. For example, if it took `7` tries
(`generationAttempts`) to complete the `generate` step, the refine
`step` will only run up to `3` times.\r\n\r\nThe `refine` step will
return _unrefined_ results from the `generate` step when:\r\n\r\n- The
`generate` step uses all `10` generation attempts. When this happens,
the `refine` step will be skipped, and the unrefined output of the
`generate` step will be returned to the user\r\n- If the `refine` step
uses all remaining attempts, but fails to produce a refined response,
due to output token limitations, or hallucinations in the refined
response\r\n\r\n#### Hallucination detection details\r\n\r\nBefore this
PR, Attack discovery directly used lower level LangChain APIs to parse
responses from the LLM. After this PR, Attack discovery uses
LangGraph.\r\n\r\nIn the previous implementation, when Attack discovery
received an incomplete response because the output token limits of a
model were hit, the LangChain APIs automatically re-submitted the
incomplete response in an attempt to \"repair\" it. However, the
re-submitted results didn't include all of the original context (i.e.
alerts that generated them). The repair process often resulted in
hallucinated results being presented to users, especially with some
models i.e. `Claude 3.5 Haiku`.\r\n\r\nIn this PR, the `generate` and
`refine` steps detect (some) hallucinations. When hallucinations are
detected:\r\n\r\n- The current accumulated `generations` or
`refinements` are (respectively) discarded, effectively restarting the
`generate` or `refine` process\r\n- The `generate` and `refine` steps
will be retried until the maximum generation attempts (default: `10`) or
hallucinations detected (default: `5`) limits are reached\r\n\r\nHitting
the hallucination limit during the `generate` step will result in an
error being displayed to the user.\r\n\r\nHitting the hallucination
limit during the `refine` step will result in the unrefined discoveries
being displayed to the user.\r\n\r\n#### Replay alerts in evaluations
details\r\n\r\nAlerts replay makes it possible to re-run evaluations,
even when your local deployment has zero alerts.\r\n\r\nThis feature
eliminates the chore of populating your local instance with specific
alerts for each example.\r\n\r\nEvery example in a dataset may
(optionally) specify a different set of alerts.\r\n\r\nAlert replay
skips the `retrieve_anonymized_alerts` step in the graph, because it
uses the `anonymizedAlerts` and `replacements` provided as `Input` in a
dataset example.\r\n\r\nThe following instructions document the process
of creating a new LangSmith dataset example that uses the Alerts replay
feature:\r\n\r\n1) In Kibana, navigate to Security > Attack
discovery\r\n\r\n2) Click `Generate` to generate Attack
discoveries\r\n\r\n3) In LangSmith, navigate to Projects > _Your
project_\r\n\r\n4) In the `Runs` tab of the LangSmith project, click on
the latest `Attack discovery` entry to open the trace\r\n\r\n5)
**IMPORTANT**: In the trace, select the **LAST**
`ChannelWriteChannelWrite<attackDiscoveries,attackDisc...` entry. The
last entry will appear inside the **LAST** `refine` step in the trace,
as illustrated by the screenshot
below:\r\n\r\n![last_channel_write](https://github.com/user-attachments/assets/c57fc803-3bbb-4603-b99f-d2b130428201)\r\n\r\n6)
With the last `ChannelWriteChannelWrite<attackDiscoveries,attackDisc...`
entry selected, click `Add to` > `Add to Dataset`\r\n\r\n7) Copy-paste
the `Input` to the `Output`, because evaluation Experiments always
compare the current run with the `Output` in an example.\r\n\r\n- This
step is _always_ required to create a dataset.\r\n- If you don't want to
use the Alert replay feature, replace `Input` with an empty
object:\r\n\r\n```json\r\n{}\r\n```\r\n\r\n8) Choose an existing
dataset, or create a new one\r\n\r\n9) Click the `Submit` button to add
the example to the dataset.\r\n\r\nAfter completing the steps above, the
dataset is ready to be run in evaluations.\r\n\r\n#### Override graph
state details\r\n\r\nWhen a dataset is run in an evaluation (to create
Experiments):\r\n\r\n- The (optional) `anonymizedAlerts` and
`replacements` provided as `Input` in the example will be replayed,
bypassing the `retrieve_anonymized_alerts` step in the graph\r\n- The
rest of the properties in `Input` will not be used as inputs to the
graph\r\n- In contrast, an empty object `{}` in `Input` means the latest
and riskiest alerts in the last 24 hours in the local environment will
be queried\r\n\r\nIn addition to the above, you may add an optional
`overrides` key in the `Input` of a dataset example to test changes or
edge cases. This is useful for evaluating changes without updating the
code directly.\r\n\r\nThe `overrides` set the initial state of the graph
before it's run in an evaluation.\r\n\r\nThe example `Input` below
overrides the prompts used in the `generate` and `refine`
steps:\r\n\r\n```json\r\n{\r\n \"overrides\": {\r\n \"refinePrompt\":
\"This overrides the refine prompt\",\r\n \"attackDiscoveryPrompt\":
\"This overrides the attack discovery prompt\"\r\n
}\r\n}\r\n```\r\n\r\nTo use the `overrides` feature in evaluations to
set the initial state of the graph:\r\n\r\n1) Create a dataset example,
as documented in the _Replay alerts in evaluations details_ section
above\r\n\r\n2) In LangSmith, navigate to Datasets & Testing > _Your
Dataset_\r\n\r\n3) In the dataset, click the Examples tab\r\n\r\n4)
Click an example to open it in the flyout\r\n\r\n5) Click the `Edit`
button to edit the example\r\n\r\n6) Add the `overrides` key shown below
to the `Input` e.g.:\r\n\r\n```json\r\n{\r\n \"overrides\": {\r\n
\"refinePrompt\": \"This overrides the refine prompt\",\r\n
\"attackDiscoveryPrompt\": \"This overrides the attack discovery
prompt\"\r\n }\r\n}\r\n```\r\n\r\n7) Edit the `overrides` in the example
`Input` above to add (or remove) entries that will determine the initial
state of the graph.\r\n\r\nAll of the `overides` shown in step 6 are
optional. The `refinePrompt` and `attackDiscoveryPrompt` could be
removed from the `overrides` example above, and replaced with
`maxGenerationAttempts` to test a higher limit.\r\n\r\nAll valid graph
state may be specified in
`overrides`.","sha":"2c21adb8faafc0016ad7a6591837118f6bdf0907"}},{"branch":"8.x","label":"v8.16.0","branchLabelMappingKey":"^v8.16.0$","isSourceBranch":false,"state":"NOT_CREATED"}]}]
BACKPORT-->

Co-authored-by: Andrew Macri <[email protected]>
kibanamachine and andrew-goldstein authored Oct 15, 2024
1 parent 760021b commit e3996ca
Showing 190 changed files with 8,378 additions and 2,148 deletions.
@@ -5,7 +5,7 @@
* 2.0.
*/

-import { getOpenAndAcknowledgedAlertsQuery } from './get_open_and_acknowledged_alerts_query';
+import { getOpenAndAcknowledgedAlertsQuery } from '.';

describe('getOpenAndAcknowledgedAlertsQuery', () => {
it('returns the expected query', () => {
@@ -5,8 +5,13 @@
* 2.0.
*/

-import type { AnonymizationFieldResponse } from '@kbn/elastic-assistant-common/impl/schemas/anonymization_fields/bulk_crud_anonymization_fields_route.gen';
+import type { AnonymizationFieldResponse } from '../../schemas/anonymization_fields/bulk_crud_anonymization_fields_route.gen';

+/**
+ * This query returns open and acknowledged (non-building block) alerts in the last 24 hours.
+ *
+ * The alerts are ordered by risk score, and then from the most recent to the oldest.
+ */
export const getOpenAndAcknowledgedAlertsQuery = ({
alertsIndexPattern,
anonymizationFields,
@@ -0,0 +1,28 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

import { getRawDataOrDefault } from '.';

describe('getRawDataOrDefault', () => {
it('returns the raw data when it is valid', () => {
const rawData = {
field1: [1, 2, 3],
field2: ['a', 'b', 'c'],
};

expect(getRawDataOrDefault(rawData)).toEqual(rawData);
});

it('returns an empty object when the raw data is invalid', () => {
const rawData = {
field1: [1, 2, 3],
field2: 'invalid',
};

expect(getRawDataOrDefault(rawData)).toEqual({});
});
});
@@ -0,0 +1,13 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

import { isRawDataValid } from '../is_raw_data_valid';
import type { MaybeRawData } from '../types';

/** Returns the raw data if it's valid, or a default if it's not */
export const getRawDataOrDefault = (rawData: MaybeRawData): Record<string, unknown[]> =>
isRawDataValid(rawData) ? rawData : {};
@@ -0,0 +1,51 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

import { isRawDataValid } from '.';

describe('isRawDataValid', () => {
it('returns true for valid raw data', () => {
const rawData = {
field1: [1, 2, 3], // the Fields API may return a number array
field2: ['a', 'b', 'c'], // the Fields API may return a string array
};

expect(isRawDataValid(rawData)).toBe(true);
});

it('returns true when a field array is empty', () => {
const rawData = {
field1: [1, 2, 3], // the Fields API may return a number array
field2: ['a', 'b', 'c'], // the Fields API may return a string array
field3: [], // the Fields API may return an empty array
};

expect(isRawDataValid(rawData)).toBe(true);
});

it('returns false when a field does not have an array of values', () => {
const rawData = {
field1: [1, 2, 3],
field2: 'invalid',
};

expect(isRawDataValid(rawData)).toBe(false);
});

it('returns true for empty raw data', () => {
const rawData = {};

expect(isRawDataValid(rawData)).toBe(true);
});

it('returns false when raw data is an unexpected type', () => {
const rawData = 1234;

// @ts-expect-error
expect(isRawDataValid(rawData)).toBe(false);
});
});
@@ -0,0 +1,11 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

import { MaybeRawData } from '../types';

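/**
 * A type guard: returns true when every top-level field in the raw Fields API
 * response maps to an array of values.
 */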
export const isRawDataValid = (rawData: MaybeRawData): rawData is Record<string, unknown[]> =>
typeof rawData === 'object' && Object.keys(rawData).every((x) => Array.isArray(rawData[x]));
@@ -0,0 +1,47 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

import { sizeIsOutOfRange } from '.';
import { MAX_SIZE, MIN_SIZE } from '../types';

describe('sizeIsOutOfRange', () => {
it('returns true when size is undefined', () => {
const size = undefined;

expect(sizeIsOutOfRange(size)).toBe(true);
});

it('returns true when size is less than MIN_SIZE', () => {
const size = MIN_SIZE - 1;

expect(sizeIsOutOfRange(size)).toBe(true);
});

it('returns true when size is greater than MAX_SIZE', () => {
const size = MAX_SIZE + 1;

expect(sizeIsOutOfRange(size)).toBe(true);
});

it('returns false when size is exactly MIN_SIZE', () => {
const size = MIN_SIZE;

expect(sizeIsOutOfRange(size)).toBe(false);
});

it('returns false when size is exactly MAX_SIZE', () => {
const size = MAX_SIZE;

expect(sizeIsOutOfRange(size)).toBe(false);
});

it('returns false when size is within the valid range', () => {
const size = MIN_SIZE + 1;

expect(sizeIsOutOfRange(size)).toBe(false);
});
});
@@ -0,0 +1,12 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

import { MAX_SIZE, MIN_SIZE } from '../types';

/** Return true if the provided size is out of range */
export const sizeIsOutOfRange = (size?: number): boolean =>
size == null || size < MIN_SIZE || size > MAX_SIZE;
@@ -0,0 +1,14 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0; you may not use this file except in compliance with the Elastic License
* 2.0.
*/

import type { SearchResponse } from '@elastic/elasticsearch/lib/api/types';

export const MIN_SIZE = 10;
export const MAX_SIZE = 10000;

/** currently the same shape as "fields" property in the ES response */
export type MaybeRawData = SearchResponse['fields'] | undefined;
@@ -39,7 +39,7 @@ export const AttackDiscovery = z.object({
/**
* A short (no more than a sentence) summary of the attack discovery featuring only the host.name and user.name fields (when they are applicable), using the same syntax
*/
-entitySummaryMarkdown: z.string(),
+entitySummaryMarkdown: z.string().optional(),
/**
* An array of MITRE ATT&CK tactic for the attack discovery
*/
@@ -55,7 +55,7 @@
/**
* The time the attack discovery was generated
*/
-timestamp: NonEmptyString,
+timestamp: NonEmptyString.optional(),
});

/**
@@ -12,9 +12,7 @@
required:
  - 'alertIds'
  - 'detailsMarkdown'
-  - 'entitySummaryMarkdown'
  - 'summaryMarkdown'
-  - 'timestamp'
  - 'title'
properties:
alertIds:
@@ -22,10 +22,12 @@
export const PostEvaluateBody = z.object({
graphs: z.array(z.string()),
datasetName: z.string(),
+evaluatorConnectorId: z.string().optional(),
connectorIds: z.array(z.string()),
runName: z.string().optional(),
alertsIndexPattern: z.string().optional().default('.alerts-security.alerts-default'),
langSmithApiKey: z.string().optional(),
+langSmithProject: z.string().optional(),
replacements: Replacements.optional().default({}),
size: z.number().optional().default(20),
});
@@ -61,6 +61,8 @@
type: string
datasetName:
type: string
+evaluatorConnectorId:
+  type: string
connectorIds:
type: array
items:
Expand All @@ -72,6 +74,8 @@ components:
default: ".alerts-security.alerts-default"
langSmithApiKey:
type: string
+langSmithProject:
+  type: string
replacements:
$ref: "../conversations/common_attributes.schema.yaml#/components/schemas/Replacements"
default: {}
16 changes: 16 additions & 0 deletions x-pack/packages/kbn-elastic-assistant-common/index.ts
@@ -25,3 +25,19 @@
export { transformRawData } from './impl/data_anonymization/transform_raw_data';
export { parseBedrockBuffer, handleBedrockChunk } from './impl/utils/bedrock';
export * from './constants';
+
+/** currently the same shape as "fields" property in the ES response */
+export { type MaybeRawData } from './impl/alerts/helpers/types';
+
+/**
+ * This query returns open and acknowledged (non-building block) alerts in the last 24 hours.
+ *
+ * The alerts are ordered by risk score, and then from the most recent to the oldest.
+ */
+export { getOpenAndAcknowledgedAlertsQuery } from './impl/alerts/get_open_and_acknowledged_alerts_query';
+
+/** Returns the raw data if it's valid, or a default if it's not */
+export { getRawDataOrDefault } from './impl/alerts/helpers/get_raw_data_or_default';
+
+/** Return true if the provided size is out of range */
+export { sizeIsOutOfRange } from './impl/alerts/helpers/size_is_out_of_range';
@@ -16,7 +16,7 @@
export const MIN_LATEST_ALERTS = 10;
export const MAX_LATEST_ALERTS = 100;
export const TICK_INTERVAL = 10;
-export const RANGE_CONTAINER_WIDTH = 300; // px
+export const RANGE_CONTAINER_WIDTH = 600; // px
const LABEL_WRAPPER_MIN_WIDTH = 95; // px

interface Props {
@@ -52,6 +52,7 @@ const AlertsSettingsComponent = ({ knowledgeBase, setUpdatedKnowledgeBaseSetting
<AlertsRange
knowledgeBase={knowledgeBase}
setUpdatedKnowledgeBaseSettings={setUpdatedKnowledgeBaseSettings}
+value={knowledgeBase.latestAlerts}
/>
<EuiSpacer size="s" />
</EuiFlexItem>
@@ -40,6 +40,7 @@
knowledgeBase={knowledgeBase}
setUpdatedKnowledgeBaseSettings={setUpdatedKnowledgeBaseSettings}
compressed={false}
+value={knowledgeBase.latestAlerts}
/>
</EuiPanel>
);
_(The remaining changed files are not shown.)_
