Is your feature request related to a problem? Please describe.
We need to differentiate requests made by the instance of Data Prepper that our solution uses from those made by the rest of a cluster's clients.
To migrate data, our solution does a bulk move of data from a source cluster to a target cluster. Independently, individual requests are recorded from the source cluster and replayed to the target, both to keep the target cluster in sync and to compare the behavior of the two clusters.
When we capture traffic, depending on the order in which a customer chooses to perform each step, there may be overlap with the Data Prepper requests to the source. We'd like to be able to mask out those requests from our replay. At the very least, they would create noisier data for users and could cause confusion, as users would see updates replayed on already existing data that was migrated with Data Prepper. Allowing the customer or us to set a unique value that we can easily filter on at the capture side would eliminate this problem and be more efficient (much lower costs).
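To illustrate the capture-side filtering described above, here is a minimal sketch in Python. It assumes captured requests are available as dictionaries with a `headers` mapping, which is an assumption for illustration and not the actual capture format. The marker value `data-prepper-migration/example` and the function names are hypothetical as well.

```python
from typing import Iterable, Iterator, Mapping

# Hypothetical marker value; the real value would be whatever the
# customer (or we) configure Data Prepper to send.
MIGRATION_USER_AGENT = "data-prepper-migration/example"


def get_header(headers: Mapping[str, str], name: str) -> str:
    """Look up a header by name, ignoring case (HTTP header names are case-insensitive)."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return ""


def drop_migration_requests(
    requests: Iterable[dict], marker: str = MIGRATION_USER_AGENT
) -> Iterator[dict]:
    """Yield only the captured requests whose User-Agent does not contain the marker."""
    for request in requests:
        user_agent = get_header(request.get("headers", {}), "User-Agent")
        if marker not in user_agent:
            yield request


# Assumed shape of captured traffic, for illustration only.
captured = [
    {"method": "POST", "path": "/_bulk",
     "headers": {"user-agent": "data-prepper-migration/example"}},
    {"method": "PUT", "path": "/logs/_doc/1",
     "headers": {"User-Agent": "example-client/1.0"}},
    {"method": "GET", "path": "/logs/_search", "headers": {}},
]

# Only the two non-migration requests remain eligible for replay.
replayable = list(drop_migration_requests(captured))
```

Because the check is a single header comparison, it could run before a request is stored, which is where the cost savings would come from.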
Describe the solution you'd like
I'd like to have a command line flag to set the user-agent HTTP header for all requests that Data Prepper sends. A default value that differs from the ES/OS user-agent may be beneficial too.
Describe alternatives you've considered (Optional)
Other HTTP header values could work too, but user-agent seems like the most natural one and the easiest to explain. For our greater solution, handling the duplicate data better is possible, but 1) it would take considerable effort to mitigate and 2) it would still be expensive, as we aren't able to remove the data passively.
Additional context
N/A
A command line argument or a setting in the pipeline file would work (so would an environment variable, but that seems like it wouldn't be the best experience for users in general). We'll want the same user-agent for all requests, so a static value loaded once is fine.
Our needs at this time are just for the source - so you can use one user-agent configuration for both or separate ones. We don't have an opinion on that detail.
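As a rough sketch of the "static value loaded once" idea, the following Python snippet resolves a user-agent from the three places mentioned above, in a fixed order of precedence. Everything named here is hypothetical and chosen only for illustration: the `--user-agent` flag, the `user_agent` pipeline key, the `DATA_PREPPER_USER_AGENT` environment variable, the default value, and the precedence order. None of it reflects an existing Data Prepper option.

```python
import argparse
from typing import Mapping, Sequence

# Hypothetical default; the issue only suggests it should differ
# from the ES/OS client user-agent.
DEFAULT_USER_AGENT = "data-prepper"


def resolve_user_agent(
    argv: Sequence[str],
    pipeline_config: Mapping[str, object],
    env: Mapping[str, str],
) -> str:
    """Resolve the user-agent once at startup.

    Assumed precedence: command line flag, then pipeline setting,
    then environment variable, then the default.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--user-agent", default=None)  # hypothetical flag
    args, _unknown = parser.parse_known_args(list(argv))

    candidates = (
        args.user_agent,
        pipeline_config.get("user_agent"),   # hypothetical pipeline key
        env.get("DATA_PREPPER_USER_AGENT"),  # hypothetical variable name
    )
    for candidate in candidates:
        if candidate:
            return str(candidate)
    return DEFAULT_USER_AGENT


# Resolved once; every outgoing request would reuse the same headers.
USER_AGENT = resolve_user_agent(
    ["--user-agent", "data-prepper-migration/example"], {}, {}
)
REQUEST_HEADERS = {"User-Agent": USER_AGENT}
```

Passing the arguments, pipeline settings, and environment in as parameters keeps the function easy to test and makes the single load point explicit.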