Add support for variables in outputs and default provider #6602

blakerouse · 2025-01-24T20:26:04Z

What does this PR do?

Adds support for context variables (ONLY) in outputs. Adds support for a default provider prefix to be defined for variables (the default is env). This keeps the original ${ES_PASSWORD} form working in outputs: it is automatically mapped to ${env.ES_PASSWORD} and resolved through the context variables, the same as go-ucfg resolved it before.

This includes an improvement to how variables are observed in the composable controller and how that is used by the coordinator. Now when a set of observables is passed to the composable controller, it returns the current set of variables after the debounce time. This ensures that the latest set of variables is used when they are substituted. Without this change, running would always show an error at first saying that ${env.ES_PASSWORD} is an unknown variable, and then within a few milliseconds it would find it. This change removes that behavior, and the variable is found on the initial render.

Why is it important?

This allows all context providers to be used in the outputs configuration. This is useful for, say, using the kubernetes_secrets provider in the policy for credentials, or the new filesource provider to read a value from a file into the outputs. This is done without breaking compatibility with ${ES_PASSWORD}, which could be used in outputs when they were rendered by go-ucfg, by adding a default provider setting (which defaults to env).

Checklist

I have read and understood the pull request guidelines of this project.
My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
~~[ ] I have made corresponding changes to the documentation~~
~~[ ] I have made corresponding change to the default configuration files~~
I have added tests that prove my fix is effective or that my feature works
I have added an entry in ./changelog/fragments using the changelog tool
~~[ ] I have added an integration test or an E2E test~~ (great coverage in unit tests)

Disruptive User Impact

None.

How to test this PR locally

Replace a field in elastic-agent.yml with a variable, for example one from the environment provider.

outputs:
  default:
    type: elasticsearch
    hosts: [127.0.0.1:9200]
    username: "${ES_USER}"
    password: "${ES_PASSWORD}"
    preset: balanced

Run with the variables.

ES_USER=elastic ES_PASSWORD=password ./elastic-agent run -e

Observe that the ES_USER and ES_PASSWORD are substituted.

Related issues

Relates to: Support context variables in the Elastic Agent policy output section #6376

mergify · 2025-01-24T20:26:41Z

This pull request does not have a backport label. Could you fix it @blakerouse? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

backport-./d./d is the label to automatically backport to the 8./d branch. /d is the digit

mergify · 2025-01-24T20:26:41Z

backport-v8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label and remove the backport-8.x label.

elasticmachine · 2025-01-25T14:50:47Z

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

pchila

Didn't manage to get past halfway through the PR so this is not a full review but I already have a couple of comments that can be shared

internal/pkg/agent/transpiler/outputs.go

pchila · 2025-01-30T12:45:09Z

internal/pkg/agent/transpiler/outputs_test.go

+		"basic no provider var": {
+			input: NewKey("outputs", NewDict([]Node{
+				NewKey("default", NewDict([]Node{
+					NewKey("key", NewStrVal("${name}")),
+				})),
+			})),
+			expected: NewDict([]Node{
+				NewKey("default", NewDict([]Node{
+					NewKey("key", NewStrVal("value1")),
+				})),
+			}),
+			vars: mustMakeVarsWithDefault(map[string]interface{}{
+				"var1": map[string]interface{}{
+					"name": "value1",
+				},
+			}, "var1"),


If I understand the introduction of the default provider correctly, shouldn't this case test that ${var1} is resolved as ${default_provider.var1} if default_provider is passed to mustMakeVarsWithDefault which then contains the variable ?
I understand that technically it's not different since those are all map[string]any but it's closer to the actual usage for env vars (using var1 as default provider here is a bit confusing since that is already used as variable name elsewhere)

I don't understand the confusion. It is testing that name is resolved to var1.name.

It's just that var1 looks to me like a variable name rather than a provider name. Here it is used as a default provider and I find the name a bit misleading. It may be just my own bias though; since it's all strings, it doesn't matter for testing the functionality.

I agree that this is a misleading name. provider1 would be better.

It is because most of these tests are testing different providers and each provider is named var1 and the other var2. Just keeping the same naming.

internal/pkg/agent/transpiler/vars.go

internal/pkg/agent/transpiler/vars_test.go

pchila · 2025-01-31T11:13:30Z

internal/pkg/agent/transpiler/vars_test.go

 			if test.Error {
 				assert.Error(t, err)
 			} else if test.NoMatch {
-				assert.Error(t, ErrNoMatch, err)
+				assert.ErrorIs(t, err, ErrNoMatch)
 			} else {
 				require.NoError(t, err)
 				assert.Equal(t, test.Result, res)


Nit: this whole block would probably be easier to read as an ErrorAssertionFunc member in the testcase struct, so input and error assertions can be kept together instead of signaling with a boolean
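For context on the diff above: testify's assert.Error takes the error as its second argument and treats anything after it as message arguments, so assert.Error(t, ErrNoMatch, err) only checked that the sentinel itself was non-nil and always passed; assert.ErrorIs(t, err, ErrNoMatch) checks the returned error. The suggestion to carry the assertion in the test case can be sketched without any dependency as follows. This is a stdlib-only analogue of testify's assert.ErrorAssertionFunc, not the project's test code; lookup, errCheck, testCase, runCases, and the ErrNoMatch message are all hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoMatch mirrors the sentinel name seen in the diff; the message is made up.
var ErrNoMatch = errors.New("no match")

// lookup is a hypothetical function under test: it returns the value for a
// fully qualified variable name, or an error wrapping ErrNoMatch.
func lookup(vars map[string]string, name string) (string, error) {
	v, ok := vars[name]
	if !ok {
		return "", fmt.Errorf("resolving %q: %w", name, ErrNoMatch)
	}
	return v, nil
}

// errCheck plays the role of an error assertion stored in the test case,
// replacing boolean flags such as Error or NoMatch.
type errCheck func(err error) bool

type testCase struct {
	name     string
	input    string
	want     string
	checkErr errCheck
}

func noError(err error) bool   { return err == nil }
func isNoMatch(err error) bool { return errors.Is(err, ErrNoMatch) }

// runCases evaluates every case and returns a description of each failure.
func runCases(vars map[string]string, cases []testCase) []string {
	var failures []string
	for _, tc := range cases {
		got, err := lookup(vars, tc.input)
		if !tc.checkErr(err) {
			failures = append(failures, tc.name+": unexpected error result")
			continue
		}
		if err == nil && got != tc.want {
			failures = append(failures, tc.name+": got "+got)
		}
	}
	return failures
}

func main() {
	vars := map[string]string{"env.ES_USER": "elastic"}
	failures := runCases(vars, []testCase{
		{name: "match", input: "env.ES_USER", want: "elastic", checkErr: noError},
		{name: "no match", input: "env.MISSING", checkErr: isNoMatch},
	})
	fmt.Println("failures:", len(failures))
}
```

With this shape, each case states its input and its expected error behavior side by side, and the loop body has a single code path.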

pchila · 2025-01-31T11:21:08Z

internal/pkg/agent/transpiler/vars.go

@@ -276,3 +294,15 @@ func varPrefixMatched(val string, key string) bool {
 	s := strings.SplitN(val, ".", 2)
 	return s[0] == key
 }
+
+func maybeAddDefaultProvider(val string, defaultProvider string) string {


From a general solution standpoint, I assumed that the defaultProvider would come into play when resolving variables rather than when adding values: I was expecting that the resolver would first try resolving an expression like ${somevar.somefield.somestuff} "as-is" and only in case of failure it would try to solve again by adding the default provider and resolving ${default_provider.somevar.somefield.somestuff}.

What is being done here is modifying variable names that do not contain "." by prepending defaultProvider when adding/replacing a variable. It only works for expressions that do not already contain ".", which is a strong limitation.

Are there specific concerns/constraints that would make the current change the preferred solution?

It cannot be done at resolve time because we need it to be known at Observe time. Without it at Observe time, the code that was added to run only the providers that are referenced will not work. The issue is also present when the variable references a fetch provider, as we cannot know whether that variable is truly resolvable until the time comes to resolve it.

If an expression contains . the text before the first . is always the provider, always! There is no other way to do this, because of the issues I placed above.

> If an expression contains . the text before the first . is always the provider, always! There is no other way to do this, because of the issues I placed above.

Is this documented somewhere? I didn't see it in the PR changes but I didn't look at the whole ast package so I may have missed it. The reason I asked the question is that I suspected that there was a constraint somewhere, it wasn't obvious which and probably other people reading this snippet may have the same doubt.
It could be worth adding a comment on the function itself explaining that it will prepend the defaultProvider only on flat, single-token variable expression like ${somevar}

Maybe we should change the name to addDefaultProviderIfNotPresent? That's clearer to me at first glance.

I added a doc string to the function to explain this better so it is known.
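To make the rule from this thread concrete, here is a minimal sketch of the behavior as described: a reference without a "." gets the default provider prepended, and a reference that already contains a "." is left untouched because the text before the first "." is treated as the provider. The name addDefaultProvider and its handling of an empty default are assumptions for illustration; this is not the code from vars.go.

```go
package main

import (
	"fmt"
	"strings"
)

// addDefaultProvider is a hypothetical stand-in for the helper discussed in
// this thread. It prepends defaultProvider only to flat, single-token
// references such as "ES_PASSWORD". A reference containing "." is assumed to
// already name its provider and is returned unchanged.
func addDefaultProvider(val string, defaultProvider string) string {
	if defaultProvider == "" || strings.Contains(val, ".") {
		return val
	}
	return defaultProvider + "." + val
}

func main() {
	refs := []string{
		"ES_PASSWORD",     // flat reference, gets the default provider
		"env.ES_PASSWORD", // already qualified, unchanged
		"kubernetes_secrets.default.mysecret.value", // illustrative value, unchanged
	}
	for _, ref := range refs {
		fmt.Printf("${%s} -> ${%s}\n", ref, addDefaultProvider(ref, "env"))
	}
}
```

Because the decision depends only on the text of the reference, it can be made when references are collected, which matches the point above that the provider must be known at Observe time.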

pchila

Finished the first pass of review. I added some more questions about the concurrent part of the change, specifically about the handleObserved(), observedResult in internal/pkg/composable/controller.go and surrounding code.

A couple of questions about providers that used to exit almost immediately and now block on ctx.Done: is this necessary to implement the functionality or is it something that was fixed "while we're in there" ?

internal/pkg/composable/controller.go

pchila · 2025-01-31T13:01:17Z

internal/pkg/composable/controller.go

+		// this new set of vars replaces that set if that current
+		// value has not been read then it will result in vars state being incorrect
+		select {
+		case <-c.ch:


Why is this draining for c.ch added here when it was already present in previous code (line 297 after the change) ?
Is it necessary to drain this before pushing vars in observedResult channel?
Is it still necessary to drain again in line 297 ?

The comment right above this says exactly why it is required. It has to be here and is different than the one on line 297, because this drains the channel and then sends it over the observedResult where the other drains the channel and then sends it over that same channel.

I can see what the code is doing, what is not obvious is why it's doing it: who is supposed to read from the observedResult ?
The drain on line 297 is supposed to remove a stale value before pushing the latest but again we don't know who reads from that and why it is a different channel from the observedResult

I added a lot more comments in this flow, that should provide a good explanation on what it is doing.
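The drain-then-send step discussed in this thread is a common Go idiom for a "latest value wins" channel. A minimal sketch follows, assuming a channel with a buffer of one and a single sender; publishLatest is a hypothetical name and this is not the controller's code.

```go
package main

import "fmt"

// publishLatest drops any value that has not been read yet and then stores
// the new one. It assumes ch has capacity 1 and that only one goroutine
// sends on it, so the send after the drain cannot block.
func publishLatest(ch chan []string, vars []string) {
	select {
	case <-ch:
		// A stale, unread value was pending; discard it.
	default:
		// Nothing pending.
	}
	ch <- vars
}

func main() {
	ch := make(chan []string, 1)
	publishLatest(ch, []string{"env.ES_USER"})
	publishLatest(ch, []string{"env.ES_USER", "env.ES_PASSWORD"})
	fmt.Println(<-ch) // prints [env.ES_USER env.ES_PASSWORD]
}
```

A reader of such a channel never sees an outdated set of variables after a newer one has been published, at the cost of possibly skipping intermediate values.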

pchila · 2025-01-31T13:03:12Z

internal/pkg/composable/controller.go

+//
+// This is a blocking call until the observation is handled and the most recent
+// set of variables are returned the caller in the case a change occurred. If no change occurred then
+// it will return with a nil array. If changed the current observed state of variables


How is "change" calculated? Is it intended as a difference from the last set of variable values that have been provided or just retrieved and observed through previous calls to handleObserved() ?

Only in the case that the observed state changes, not the values in each provider that will come over the Watch channel. But if the change happens to occur during the debounce window then it will have that change in the Observed state.

pchila · 2025-01-31T13:03:40Z

internal/pkg/composable/controller.go

-// Observe sends the observed variables from the AST to the controller.
+// Observe instructs the controller to enable the observed providers.
+//
+// This is a blocking call until the observation is handled and the most recent


Is there an upper limit to the time Observe() will take?

It either resolves or the context is closed.

Is this on the critical path of agent startup?
My main worry here is that we could end up with an agent stuck in the initialization phase similar to when the coordinator is waiting for variables before processing configuration changes.

Sorry, there is a maximum here, which is 500ms. I added that to the docstring. It will never block indefinitely. It is in the path of variable resolution, so all previous logic in that area applies.

pchila · 2025-01-31T14:43:34Z

internal/pkg/composable/controller_test.go

+					default:
+						if !observed {
+							vars, err := c.Observe(timeoutCtx, tt.observed)
+							require.NoError(t, err)
+							if vars != nil {
+								setVars = vars
+							}
+							observed = true
+						}


This default block is meant to run at every iteration of the for loop where <-timeoutCtx.Done() and vars := <-c.Watch() would block?
If this has to happen only once could it be called outside of the loop as it was done previously ?

It is only called once, but cannot be done outside of the loop because <-c.Watch() must have a reader. To ensure that it has a reader and is not blocked it does it this way. This removes the need to add locks around setVars.

internal/pkg/composable/providers/agent/agent.go

swiatekm

Did a partial review, gotta say mixing all three changes in a single PR makes it harder to review.

The variables in outputs part looks fine.
The default provider part looks mostly fine, but like @pchila I'm curious why the substitution doesn't happen at resolution.
The change to composable manager logic I'm not sure about. Could we get this one in a separate PR?

internal/pkg/agent/transpiler/vars.go

blakerouse · 2025-01-31T18:36:30Z

> The change to composable manager logic I'm not sure about. Could we get this one in a separate PR?

It is required for variable substitution to work correctly in the outputs. The reason I keep it inside this PR is that it is directly related to the behavior, and I think it provides more context on why it is done.

jlind23 · 2025-02-04T16:45:07Z

@pchila @swiatekm Could you please review this again now that Blake added some changes.

pchila

@blakerouse thanks for integrating code review feedback: I feel the added comments help with understanding the reason for parts of this change.
I have a couple of follow-up questions about the channel usage that I asked under the original discussions.
Could you please have a look at those?

internal/pkg/agent/transpiler/vars.go

internal/pkg/agent/transpiler/vars_test.go

blakerouse · 2025-02-05T18:50:50Z

@pchila I have updated the PR with your latest revisions and have answered all your remaining questions. Thanks!

mergify · 2025-02-05T20:19:11Z

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b variables-outputs upstream/variables-outputs
git merge upstream/main
git push upstream variables-outputs

elastic-sonarqube · 2025-02-06T21:16:42Z

Quality Gate passed

Issues
6 New issues
2 Fixed issues
0 Accepted issues

Measures
0 Security Hotspots
82.5% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube

pchila

LGTM

swiatekm

🚀

blakerouse added 3 commits January 24, 2025 14:41

Add support for variables in outputs and default provider.

2c31f82

Remove variable expansion in outputs from go-ucfg parsing.

89ea910

Fix some issues.

4ef799f

blakerouse added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label Jan 24, 2025

blakerouse self-assigned this Jan 24, 2025

mergify bot added the backport-8.x Automated backport to the 8.x branch with mergify label Jan 24, 2025

blakerouse added 2 commits January 25, 2025 08:46

Fix tests.

2554740

Fix incorrect tests.

9b0dc70

blakerouse changed the title ~~Add support for variables in outputs and default provider.~~ Add support for variables in outputs and default provider Jan 25, 2025

blakerouse added 2 commits January 25, 2025 09:47

Fix out of order vars publish.

f6fd32c

Add changelog.

f45d091

blakerouse marked this pull request as ready for review January 25, 2025 14:50

blakerouse requested a review from a team as a code owner January 25, 2025 14:50

blakerouse requested review from andrzej-stencel and pchila January 25, 2025 14:50

swiatekm self-requested a review January 30, 2025 16:15

pchila reviewed Jan 31, 2025

swiatekm reviewed Jan 31, 2025

Fixes from code review.

a162e83

blakerouse added 2 commits January 31, 2025 13:38

Add const.

950f5c8

Merge branch 'main' into variables-outputs

effe33d

pchila reviewed Feb 5, 2025

pierrehilbert added the backport-9.0 Automated backport to the 9.0 branch label Feb 5, 2025

Fixes from code review.

1ac5474

dependabot bot and others added 4 commits February 5, 2025 13:49

updatecli: set force to false, it won't try to update the base branch (…

3885b01

…elastic#6705)

github-actions: filter by current active branches in the repository (e…

ed6a475

…lastic#6326)

mergify: support backport-active-8, backport-active-9 and backport-ac…

bf59784

…tive-all (elastic#6708)

blakerouse requested a review from a team as a code owner February 5, 2025 18:49

blakerouse added 2 commits February 6, 2025 11:32

Merge branch 'main' into variables-outputs

e888429

Merge branch 'main' into variables-outputs

d65ad44

blakerouse removed request for a team and andrzej-stencel February 7, 2025 02:22

pchila approved these changes Feb 7, 2025

swiatekm approved these changes Feb 7, 2025

blakerouse merged commit b59d51a into elastic:main Feb 7, 2025
14 checks passed

blakerouse deleted the variables-outputs branch February 7, 2025 15:15

mergify bot mentioned this pull request Feb 7, 2025

[8.x](backport #6602) Add support for variables in outputs and default provider #6753

Merged

5 tasks

mergify bot mentioned this pull request Feb 7, 2025

[9.0](backport #6602) Add support for variables in outputs and default provider #6754

Merged

5 tasks
