-
Notifications
You must be signed in to change notification settings - Fork 213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create decrompress processor to decompress gzipped keys #4118
Create decrompress processor to decompress gzipped keys #4118
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! I have a few requests.
public class DecompressProcessor extends AbstractProcessor<Record<Event>, Record<Event>> { | ||
|
||
private static final Logger LOG = LoggerFactory.getLogger(DecompressProcessor.class); | ||
static final String DECOMPRESSION_PROCESSING_ERRORS = "decompressionProcessingErrors"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you add decompression
in this name? It will already be scoped to the processor, so this will yield:
my-pipeline.decompress.decompressionProcessingErrors
Why not just make it processingErrors
?
@DataPrepperPluginConstructor | ||
public DecompressProcessor(final PluginMetrics pluginMetrics, | ||
final DecompressProcessorConfig decompressProcessorConfig, | ||
final ExpressionEvaluator expressionEvaluator) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please validate the expression in the constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it better to do the validation in the Config.java? I thought that's the convention we follow. I am ok with this too. Just checking.
|
||
import org.opensearch.dataprepper.model.codec.DecompressionEngine; | ||
|
||
public interface IDecompressionType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IInterfaceName
convention is a .NET convention and not used often in Java. Additionally, we can improve our interfaces by describing what they are capable of rather than name them based on a type. In this case, this could be HasDecompressionEngine
, or DecompressionEngineFactory
. Both of those names align with existing conventions in this project and in Java.
|
||
package org.opensearch.dataprepper.plugins.processor.decompress.encoding; | ||
|
||
public interface IEncodingType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment about the I
in the name.
import static org.opensearch.dataprepper.plugins.processor.decompress.DecompressProcessorTest.buildRecordWithEvent; | ||
|
||
@ExtendWith(MockitoExtension.class) | ||
public class ITDecompressProcessorTest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use the name DecompressProcessorIT
. This works with Gradle and is how other integration tests in the project are named.
|
||
} | ||
|
||
private String getDecompressedString(final BufferedReader bufferedReader) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can probably use commons-io to replace this code. The class org.apache.commons.io.output.IOUtils
has a method String toString(InputStream input, Charset charset)
. It should work for this.
* SPDX-License-Identifier: Apache-2.0 | ||
*/ | ||
|
||
plugins { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need this block. The base Gradle project adds it to all sub-projects. You can remove it.
import java.util.Map; | ||
import java.util.stream.Collectors; | ||
|
||
public enum DecompressionType implements IDecompressionType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some unit tests for the string to enum parsing. See ForwardingAuthenticationTest
which does something similar for another enum. Also see that some tests use @EnumSource
to help with future additions.
Line 24 in 7e2331f
class ForwardingAuthenticationTest { |
import java.util.Map; | ||
import java.util.stream.Collectors; | ||
|
||
public enum EncodingType implements IEncodingType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment about adding tests for the name parsing.
import java.io.InputStreamReader; | ||
import java.util.Collection; | ||
|
||
@DataPrepperPlugin(name = "decompress", pluginType = Processor.class, pluginConfigurationType = DecompressProcessorConfig.class) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please open a documentation issue so that we track the need to document this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Create this issue (opensearch-project/documentation-website#6407)
Signed-off-by: Taylor Gray <[email protected]>
Signed-off-by: Taylor Gray <[email protected]>
80058c4
to
3e06739
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great. Thanks!
Description
Adds a
decompress
processor to decompress specific fields in Events.Base configuration is
This processor is extendible to more
types
, but it currently only supportsgzip
.This processor assumes that all fields to be decompressed are base64 encoded.
Issues Resolved
Resolves #4016
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.