
[Security AI] Bedrock prompt tuning and inference corrections #209011

Merged: 8 commits merged into elastic:main on Jan 31, 2025

Conversation


@stephmilovic (Contributor) commented on Jan 30, 2025:

Summary

@jamesspi noticed some issues when invoking inference with the following ES|QL query prompt:

Adversaries leverage persistence to keep their foothold within a network. A common way to establish persistence is to add Windows Registry keys that autostart applications after reboot.

Generate an ES|QL query for me to hunt for the addition of registry keys to autorun applications made by suspicious processes. I need to use the information found to name the post-exploitation rootkit used by the bad actor.

Windows Registry Keys are stored in the winlog.event_data.TargetObject ECS field. Use logs-* as the index pattern.

1. Inference with the Bedrock provider needs a Respond step

We noticed the model starts streaming its conversation with the tool back to the user; that is, we saw this "Certainly..." line coming into the stream:

[Screenshot, 2025-01-30: the streamed response begins with the tool-call preamble "Certainly..."]

To address this, I added for inference the same "Respond" step we already have for Bedrock. Bedrock makes no distinction in the stream between communicating with the tool and producing the final answer, which is why the "Respond" step is needed. Since inference currently uses different providers behind the scenes, I added a check for the provider: if it's Bedrock, we add the "Respond" step.
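As a rough illustration of that check, here is a minimal TypeScript sketch; the identifiers (`needsRespondStep`, `RouteParams`) and the exact parameter shape are assumptions for this sketch, not Kibana's actual code:

```ts
// Hypothetical sketch of the routing decision described above.
// All identifiers here are illustrative; only the intent mirrors the PR.
interface RouteParams {
  llmType?: string; // e.g. 'bedrock', 'openai', 'inference'
  provider?: string; // provider backing the inference connector
}

// Bedrock streams tool chatter and the final answer on the same channel,
// so both the Bedrock connector and an inference connector backed by
// Bedrock are routed through an extra "respond" node that produces a
// clean, user-facing answer instead of streaming the tool conversation.
const needsRespondStep = ({ llmType, provider }: RouteParams): boolean =>
  llmType === 'bedrock' || (llmType === 'inference' && provider === 'bedrock');
```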

2. The ESQL query was not in the response

Despite the system prompt including "Always return value from NaturalLanguageESQLTool as is.", this was not the behavior we saw with either Bedrock or Inference. Making the prompt stricter and more specific fixed the issue for both. This is the new portion of the prompt that enforces the results from the ESQL tool: "ALWAYS return the exact response from NaturalLanguageESQLTool verbatim in the final response, without adding further description."
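For illustration only, this is roughly how the instruction text changed; the constant names below are hypothetical (the real prompt lives in the `@kbn/security-ai-prompts` package), and only the string contents come from the PR description:

```ts
// Hypothetical constant names; only the string contents come from the PR description.
const ESQL_TOOL_INSTRUCTION_BEFORE =
  'Always return value from NaturalLanguageESQLTool as is.';

const ESQL_TOOL_INSTRUCTION_AFTER =
  'ALWAYS return the exact response from NaturalLanguageESQLTool verbatim in the ' +
  'final response, without adding further description.';
```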

3. Evals not configured for inference

I tried to run evals to confirm this change and realized that the post_evaluate route was not properly configured for inference. To correct this, I updated the agent from the deprecated createOpenAIFunctionsAgent to createOpenAIToolsAgent and added a condition so that llmType === 'inference' uses the tool calling agent. This update had already been made in the other place we have agent selection logic, callAssistantGraph.
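A hedged sketch of that agent selection, using LangChain JS's `createOpenAIToolsAgent` and `createToolCallingAgent`; the branching, function name, and parameter shape below are assumptions for illustration, not the actual post_evaluate code:

```ts
import { createOpenAIToolsAgent, createToolCallingAgent } from 'langchain/agents';
import type { ChatPromptTemplate } from '@langchain/core/prompts';
import type { StructuredToolInterface } from '@langchain/core/tools';

interface AgentParams {
  // A chat model that supports tool binding (e.g. ChatOpenAI, ChatBedrockConverse);
  // typed loosely to keep this sketch short.
  llm: any;
  llmType?: string;
  tools: StructuredToolInterface[];
  prompt: ChatPromptTemplate; // must contain an "agent_scratchpad" placeholder
}

// Route inference (like bedrock/gemini) through the generic tool-calling agent;
// OpenAI-style models use createOpenAIToolsAgent, which replaces the
// deprecated createOpenAIFunctionsAgent.
const createEvalAgent = async ({ llm, llmType, tools, prompt }: AgentParams) =>
  llmType === 'inference' || llmType === 'bedrock' || llmType === 'gemini'
    ? createToolCallingAgent({ llm, tools, prompt })
    : createOpenAIToolsAgent({ llm, tools, prompt });
```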

Running the evals after the prompt change improved accuracy from 94% to 97%.

Testing

Testing prompt:

Adversaries leverage persistence to keep their foothold within a network. A common way to establish persistence is to add Windows Registry keys that autostart applications after reboot.

Generate an ES|QL query for me to hunt for the addition of registry keys to autorun applications made by suspicious processes. I need to use the information found to name the post-exploitation rootkit used by the bad actor.

Windows Registry Keys are stored in the winlog.event_data.TargetObject ECS field. Use logs-* as the index pattern.
  1. Have a Bedrock connector and an Inference connector configured with EIS (ping Steph if you need help setting that up)
  2. With Bedrock selected, send the testing prompt. Ensure the response contains an ESQL query.
  3. With Inference selected, send the testing prompt. Ensure the response does not start streaming early with the tool invocation (Certainly...) and contains an ESQL query.
  4. To confirm the eval fix, run the ESQL eval with Inference connector selected. Previously, an error showed on the securityAiAssistantManagement?tab=evaluation view after hitting "Perform evaluation..."

@stephmilovic added labels: release_note:skip, v9.0.0, Team: SecuritySolution, backport:prev-minor, Team:Security Generative AI, v8.18.0 (Jan 30, 2025)
@@ -155,8 +159,6 @@ export const streamGraph = async ({
      const chunk = data?.chunk;
      const msg = chunk.message;
      if (msg?.tool_call_chunks && msg?.tool_call_chunks.length > 0) {
        // I don't think we hit this anymore because of our check for AGENT_NODE_TAG
@stephmilovic (Contributor, Author) commented on this diff:

this is actually an important param for OpenAI, removing my comment

@stephmilovic marked this pull request as ready for review on January 30, 2025 19:08
@stephmilovic requested a review from a team as a code owner on January 30, 2025 19:08
@elasticmachine (Contributor):

Pinging @elastic/security-solution (Team: SecuritySolution)

@KDKHD (Member) left a comment:

Looks great! I was able to see the issues before this PR and confirm they are now resolved. The only thing I was unable to test was this section of the test guide:

To confirm the eval fix, run the ESQL eval with Inference connector selected. Previously, an error showed on the securityAiAssistantManagement?tab=evaluation view after hitting "Perform evaluation..."

because the inference connector did not appear on the evaluation page.

@stephmilovic (Contributor, Author) replied:

> the inference connector did not appear on the evaluation page.

Commit 2e87e12 ensures inference shows on the eval page.

@elasticmachine (Contributor):

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #118 / console app console autocomplete feature Autocomplete behavior JSON autocompletion with placeholder fields

Metrics [docs]

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run `node scripts/build_api_docs --plugin [yourplugin] --stats comments` for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/security-ai-prompts | 31 | 38 | +7 |

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 21.4MB | 21.4MB | +40.0B |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/security-ai-prompts | 33 | 40 | +7 |

History

@stephmilovic merged commit 0d415a6 into elastic:main on Jan 31, 2025
9 checks passed
@kibanamachine (Contributor):

Starting backport for target branches: 8.18, 8.x, 9.0

https://github.com/elastic/kibana/actions/runs/13082347057

@kibanamachine (Contributor):

💔 Some backports could not be created

| Branch | Result |
| --- | --- |
| 8.18 | Backport failed because of merge conflicts |
| 8.x | |
| 9.0 | |

Note: Successful backport PRs will be merged automatically after passing CI.

Manual backport

To create the backport manually run:

node scripts/backport --pr 209011

Questions?

Please refer to the Backport tool documentation

stephmilovic added a commit to stephmilovic/kibana that referenced this pull request Jan 31, 2025
…c#209011)

(cherry picked from commit 0d415a6)

# Conflicts:
#	x-pack/solutions/search/plugins/enterprise_search/public/applications/app_search/utils/encode_path_params/index.ts
#	x-pack/solutions/security/packages/security-ai-prompts/src/get_prompt.ts
@stephmilovic (Contributor, Author):

💚 All backports created successfully

| Branch | Result |
| --- | --- |
| 8.18 | |

Note: Successful backport PRs will be merged automatically after passing CI.

Questions?

Please refer to the Backport tool documentation

kibanamachine added a commit that referenced this pull request Feb 1, 2025
…209011) (#209191)

# Backport

This will backport the following commits from `main` to `9.0`:
- [[Security AI] Bedrock prompt tuning and inference corrections
(#209011)](#209011)


### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)


Co-authored-by: Steph Milovic <[email protected]>
kibanamachine added a commit that referenced this pull request Feb 1, 2025
…209011) (#209189)

# Backport

This will backport the following commits from `main` to `8.x`:
- [[Security AI] Bedrock prompt tuning and inference corrections
(#209011)](#209011)


### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)


Co-authored-by: Steph Milovic <[email protected]>
@kibanamachine added labels: v8.19.0, backport missing (Feb 1, 2025)
@kibanamachine (Contributor):

Looks like this PR has backport PRs but they still haven't been merged. Please merge them ASAP to keep the branches relatively in sync.

Labels: backport missing, backport:prev-minor, release_note:skip, Team:Security Generative AI, Team: SecuritySolution, v8.18.0, v8.19.0, v9.0.0, v9.1.0