
[Security AI] Bedrock prompt tuning and inference corrections #209011

Merged: 8 commits merged into elastic:main on Jan 31, 2025

Conversation


@stephmilovic (Contributor) commented on Jan 30, 2025:

Summary

@jamesspi noticed some issues when invoking inference with the following ES|QL query prompt:

Adversaries leverage persistence to keep their foothold within a network. A common way to establish persistence is to add Windows Registry keys that autostart applications after reboot.

Generate an ES|QL query for me to hunt for the addition of registry keys to autorun applications made by suspicious processes. I need to use the information found to name the post-exploitation rootkit used by the bad actor.

Windows Registry Keys are stored in the winlog.event_data.TargetObject ECS field. Use logs-* as the index pattern.

1. Inference with the Bedrock provider needs a Respond step

We noticed the model starts streaming its conversation with the tool back to the user; that is, we saw this "Certainly..." line coming into the stream:

[Screenshot, 2025-01-30: the streamed response begins with the tool-call preamble "Certainly..."]

To address this, I added for inference the same "Respond" step we already have for Bedrock. Bedrock makes no distinction in the stream between communicating with the tool and producing the final answer, which is why the "Respond" step is needed. Since inference currently uses different providers behind the scenes, I added a check for the provider: if it's Bedrock, we add the "Respond" step.
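As a rough illustration of that check, here is a minimal TypeScript sketch; the identifiers (`needsRespondStep`, `RouteParams`) and the exact parameter shape are assumptions for this sketch, not Kibana's actual code:

```ts
// Hypothetical sketch of the routing decision described above.
// All identifiers here are illustrative; only the intent mirrors the PR.
interface RouteParams {
  llmType?: string; // e.g. 'bedrock', 'openai', 'inference'
  provider?: string; // provider backing the inference connector
}

// Bedrock streams tool chatter and the final answer on the same channel,
// so both the Bedrock connector and an inference connector backed by
// Bedrock are routed through an extra "respond" node that produces a
// clean, user-facing answer instead of streaming the tool conversation.
const needsRespondStep = ({ llmType, provider }: RouteParams): boolean =>
  llmType === 'bedrock' || (llmType === 'inference' && provider === 'bedrock');
```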

2. The ESQL query was not in the response

Despite the system prompt including "Always return value from NaturalLanguageESQLTool as is.", this was not the behavior we saw with either Bedrock or Inference. Making the prompt stricter and more specific fixed the issue for both. This is the new portion of the prompt that enforces the results from the ESQL tool: "ALWAYS return the exact response from NaturalLanguageESQLTool verbatim in the final response, without adding further description."
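For illustration only, this is roughly how the instruction text changed; the constant names below are hypothetical (the real prompt lives in the `@kbn/security-ai-prompts` package), and only the string contents come from the PR description:

```ts
// Hypothetical constant names; only the string contents come from the PR description.
const ESQL_TOOL_INSTRUCTION_BEFORE =
  'Always return value from NaturalLanguageESQLTool as is.';

const ESQL_TOOL_INSTRUCTION_AFTER =
  'ALWAYS return the exact response from NaturalLanguageESQLTool verbatim in the ' +
  'final response, without adding further description.';
```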

3. Evals not configured for inference

I tried to run evals to confirm this change and realized that the post_evaluate route was not properly configured for inference. To correct this, I updated the agent from the deprecated createOpenAIFunctionsAgent to createOpenAIToolsAgent and added a condition so that llmType === 'inference' uses the tool calling agent. This update had already been made in the other place we have agent selection logic, callAssistantGraph.
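A hedged sketch of that agent selection, using LangChain JS's `createOpenAIToolsAgent` and `createToolCallingAgent`; the branching, function name, and parameter shape below are assumptions for illustration, not the actual post_evaluate code:

```ts
import { createOpenAIToolsAgent, createToolCallingAgent } from 'langchain/agents';
import type { ChatPromptTemplate } from '@langchain/core/prompts';
import type { StructuredToolInterface } from '@langchain/core/tools';

interface AgentParams {
  // A chat model that supports tool binding (e.g. ChatOpenAI, ChatBedrockConverse);
  // typed loosely to keep this sketch short.
  llm: any;
  llmType?: string;
  tools: StructuredToolInterface[];
  prompt: ChatPromptTemplate; // must contain an "agent_scratchpad" placeholder
}

// Route inference (like bedrock/gemini) through the generic tool-calling agent;
// OpenAI-style models use createOpenAIToolsAgent, which replaces the
// deprecated createOpenAIFunctionsAgent.
const createEvalAgent = async ({ llm, llmType, tools, prompt }: AgentParams) =>
  llmType === 'inference' || llmType === 'bedrock' || llmType === 'gemini'
    ? createToolCallingAgent({ llm, tools, prompt })
    : createOpenAIToolsAgent({ llm, tools, prompt });
```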

Running the evals after the prompt change improved accuracy from 94% to 97%.

Testing

Testing prompt:

Adversaries leverage persistence to keep their foothold within a network. A common way to establish persistence is to add Windows Registry keys that autostart applications after reboot.

Generate an ES|QL query for me to hunt for the addition of registry keys to autorun applications made by suspicious processes. I need to use the information found to name the post-exploitation rootkit used by the bad actor.

Windows Registry Keys are stored in the winlog.event_data.TargetObject ECS field. Use logs-* as the index pattern.
  1. Have a Bedrock connector and an Inference connector configured with EIS (ping Steph if you need help setting that up)
  2. With Bedrock selected, send the testing prompt. Ensure the response contains an ESQL query.
  3. With Inference selected, send the testing prompt. Ensure the response does not start streaming early with the tool invocation (Certainly...) and contains an ESQL query.
  4. To confirm the eval fix, run the ESQL eval with Inference connector selected. Previously, an error showed on the securityAiAssistantManagement?tab=evaluation view after hitting "Perform evaluation..."

@stephmilovic added labels: release_note:skip, v9.0.0, Team: SecuritySolution, backport:prev-minor, Team:Security Generative AI, v8.18.0 (Jan 30, 2025)
@@ -155,8 +159,6 @@ export const streamGraph = async ({
      const chunk = data?.chunk;
      const msg = chunk.message;
      if (msg?.tool_call_chunks && msg?.tool_call_chunks.length > 0) {
        // I don't think we hit this anymore because of our check for AGENT_NODE_TAG
@stephmilovic (Contributor, Author) commented on this diff:

this is actually an important param for OpenAI, removing my comment

@stephmilovic marked this pull request as ready for review on January 30, 2025 19:08
@stephmilovic requested a review from a team as a code owner on January 30, 2025 19:08
@elasticmachine (Contributor):

Pinging @elastic/security-solution (Team: SecuritySolution)

@KDKHD (Member) left a comment:

Looks great! I was able to see the issues before this PR and confirm they are now resolved. The only thing I was unable to test was this section of the test guide:

To confirm the eval fix, run the ESQL eval with Inference connector selected. Previously, an error showed on the securityAiAssistantManagement?tab=evaluation view after hitting "Perform evaluation..."

because the inference connector did not appear on the evaluation page.

@stephmilovic (Contributor, Author) replied:

> the inference connector did not appear on the evaluation page.

Commit 2e87e12 ensures inference shows on the eval page.

@elasticmachine (Contributor):

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #118 / console app console autocomplete feature Autocomplete behavior JSON autocompletion with placeholder fields

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run `node scripts/build_api_docs --plugin [yourplugin] --stats comments` for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/security-ai-prompts | 31 | 38 | +7 |

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 21.4MB | 21.4MB | +40.0B |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/security-ai-prompts | 33 | 40 | +7 |

History

@stephmilovic merged commit 0d415a6 into elastic:main on Jan 31, 2025
9 checks passed
@kibanamachine (Contributor):

Starting backport for target branches: 8.18, 8.x, 9.0

https://github.com/elastic/kibana/actions/runs/13082347057

@kibanamachine (Contributor):

💔 Some backports could not be created

| Branch | Result |
| --- | --- |
| 8.18 | Backport failed because of merge conflicts |
| 8.x | |
| 9.0 | |

Note: Successful backport PRs will be merged automatically after passing CI.

Manual backport

To create the backport manually run:

node scripts/backport --pr 209011

Questions?

Please refer to the Backport tool documentation

stephmilovic added a commit to stephmilovic/kibana that referenced this pull request Jan 31, 2025
…c#209011)

(cherry picked from commit 0d415a6)

# Conflicts:
#	x-pack/solutions/search/plugins/enterprise_search/public/applications/app_search/utils/encode_path_params/index.ts
#	x-pack/solutions/security/packages/security-ai-prompts/src/get_prompt.ts
@stephmilovic (Contributor, Author):

💚 All backports created successfully

| Branch | Result |
| --- | --- |
| 8.18 | |

Note: Successful backport PRs will be merged automatically after passing CI.

Questions?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Feb 1, 2025
…209011) (#209191)

# Backport

This will backport the following commits from `main` to `9.0`:
- [[Security AI] Bedrock prompt tuning and inference corrections
(#209011)](#209011)


### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)


Co-authored-by: Steph Milovic <[email protected]>
kibanamachine added a commit that referenced this pull request Feb 1, 2025
…209011) (#209189)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Security AI] Bedrock prompt tuning and inference corrections
(#209011)](#209011)


### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)


Co-authored-by: Steph Milovic <[email protected]>
@kibanamachine added labels: v8.19.0, backport missing (Feb 1, 2025)
@kibanamachine (Contributor):

Looks like this PR has backport PRs but they still haven't been merged. Please merge them ASAP to keep the branches relatively in sync.

Labels: backport missing, backport:prev-minor, release_note:skip, Team:Security Generative AI, Team: SecuritySolution, v8.18.0, v8.19.0, v9.0.0, v9.1.0