Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support format expressions for routing in the opensearch sink #3863

Merged
merged 5 commits into from
Jan 11, 2024

Conversation

kkondaka
Copy link
Collaborator

Description

Added support to allow format expressions for routing field in open search sink.

Resolves #3833

Issues Resolved

Resolves #3833

Check List

  • [X ] New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • [X ] Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.


final PluginSetting pluginSetting = generatePluginSetting(null, testIndexAlias, null);
pluginSetting.getSettings().put(IndexConfiguration.ROUTING_FIELD, "${/"+testRoutingField+"}");
when(expressionEvaluator.isValidFormatExpression(any(String.class))).thenReturn(true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we mock this, can you just run one test manually to make sure that

routing_field: "normalField"

does not have any issues just to make sure there is no breaking change. I think it is preferable to deprecate routing_field like we did for document_id_field, and to have a new routing_id that supports the expressions. What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous test is with routing_field set to normal value. There is no need to test it manually. We can support the routing_field to support both normal field and expressions. Isn't that better than introducing routing_id and deprecating the current one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is but we are mocking the expression evaluator so it's really just to make sure that this gets hit (

) and we don't return it as valid. I just noticed we don't have a test for normal keys here to return false (https://github.com/opensearch-project/data-prepper/blame/f19de03d5418925935e019837fc4824fb250820c/data-prepper-expression/src/test/java/org/opensearch/dataprepper/expression/GenericExpressionEvaluatorTest.java#L134). Can you add a parameter to this test for a regular key

Also the only reason to change would be naming, since routing_field implies just a single field to be used, but I guess it isn't that big of a difference

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to prefer making a new property with this configuration - routing. It will be consistent with document_id and clarify that you are not supply the field only.

The OpenSearch API uses the term routing, so I recommend that over routing_id.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My only thought about routing is that it could collide with routes, but I suppose routing_id could collide just as much

@sharraj
Copy link

sharraj commented Dec 31, 2023

can we add new yaml needed for this functionality in the description ?

Signed-off-by: Krishna Kondaka <[email protected]>
@kkondaka
Copy link
Collaborator Author

kkondaka commented Jan 3, 2024

can we add new yaml needed for this functionality in the description ?

@sharraj I have added the examples in the README.md instead of adding to the PR.

if (routingField != null) {
LOG.warn("routing_field is deprecated in favor of routing, and support for document_field will be removed in a future major version release.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy error in document_field

routing = event.formatString(routingField, expressionEvaluator);
routingValue = event.get(routingField, String.class);
} else if (routing != null) {
if (expressionEvaluator.isValidFormatExpression(routing)) {
Copy link
Member

@graytaylor0 graytaylor0 Jan 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function will return true for static values such as routing: "my_value". We could change this function to be an invalid expression when there is no ${} at all, but we really have been proposing users do

routing: "${my_key}" 

to access fields, and then static values are static values (think index field which can be static or format expression)

Signed-off-by: Krishna Kondaka <[email protected]>
@@ -473,11 +475,20 @@ private SerializedJson getDocument(final Event event) {
}
}

String routing = (routingField != null) ? event.get(routingField, String.class) : null;
String routingValue = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have some unit tests on this code.

try {
routingValue = event.formatString(routing, expressionEvaluator);
} catch (final ExpressionEvaluationException | EventKeyNotFoundException e) {
LOG.error("Unable to construct routing with format {}, the document_id will be generated by OpenSearch", routing, e);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy past error for document_id here. However, the routing is different from document id a bit, as it defaults to being the same value as document_id (https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-routing-field.html). So we should change this to

 LOG.error("Unable to construct routing with format {}, the routing will default to being the same as the document_id", routing, e);

Krishna Kondaka added 2 commits January 5, 2024 04:29
Signed-off-by: Krishna Kondaka <[email protected]>
Signed-off-by: Krishna Kondaka <[email protected]>
@kkondaka kkondaka merged commit f60c469 into opensearch-project:main Jan 11, 2024
63 of 68 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Jan 11, 2024
* Support format expressions for routing in the opensearch sink

Signed-off-by: Krishna Kondaka <[email protected]>

* Addressed review comments

Signed-off-by: Krishna Kondaka <[email protected]>

* Addressed review comments

Signed-off-by: Krishna Kondaka <[email protected]>

* Addressed review comments

Signed-off-by: Krishna Kondaka <[email protected]>

* Addressed review comments

Signed-off-by: Krishna Kondaka <[email protected]>

---------

Signed-off-by: Krishna Kondaka <[email protected]>
Co-authored-by: Krishna Kondaka <[email protected]>
(cherry picked from commit f60c469)
@kkondaka kkondaka deleted the expr-routing-os branch May 13, 2024 05:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support format expressions for routing in the opensearch sink
4 participants