Add guidance for using transformation during migration #9063
Signed-off-by: Peter Nied <[email protected]>
Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged. Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer. When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review. |
@AndreKurait I know you are focusing on the transformation scenarios; this PR documents what I had locally. Let me know if you had other documentation in mind that would better replace this PR, and I'll close this one out.
During a migration there is an opportunity to change the names and paths of index and data structures that cannot be modified after they are declared. This page provides a cookbook of scenarios and templates that can be used to make these adjustments.

### Rename an index
Do we suspect that this will be a common requirement/ask from customers? I suspect that most customers will either NOT want to rename any of their indices, or if they do, will have very specific reasons and requirements around doing that.
Is this guidance general enough that more than one customer (ideally, more than 10%) would benefit from it? If not, I'd rather keep this in a set of internal runbooks for more targeted communications with users.
I think rename is a common ask, and it's a simple and useful example case. The only request that is more popular is how to change the shard count, which requires more thoughtful messaging about the calculations, and I would prefer not to start with that.
Do you think there is a different scenario we should document instead of this one?
See https://opensearch.atlassian.net/browse/MIGRATIONS-2359 for us to write a blog post covering this
```json | ||
[ | ||
{ | ||
"JsonConditionalTransformerProvider": [ |
@AndreKurait - do you think we'll keep conditional transformers in place once we support writing transformations in scripting languages (e.g. JavaScript)?
This code is functional - when JavaScript becomes available we can always edit this example to be cleaner.
We got approval on the UPL license yesterday, so I'd expect that JavaScript will be available imminently (and Jolt transforms will probably be removed at the 2.2 release). Please replace these instructions with JavaScript ones so that this description isn't DOA and confusing to customers.
2. Add/Update the key `reindexFromSnapshotExtraArgs` to include `--doc-transformer-config-file /shared-logs-output/rfs-transform.json`
3. Redeploy the Migration Assistant
4. Navigate to the Migration Assistant console
5. Create a file with `vim /shared-logs-output/rfs-transform.json`
We shouldn't recommend to users that they WRITE a transform in a shared log. It's arguable if that volume should be writable by the migration console. Same argument for the other file below.
Thanks - what is an alternative way to accomplish this that a customer can use today?
If you have a file, here's how you can convert it into JSON and then base64:
`jq -n --rawfile script1 ~/Downloads/es-load-test.py '{"fileContent": $script1}' | base64`
That could then be pushed into the extra args for the applications (`--transformer-config-base64 ...`). Note that the JSON will need to match what the JsonJSTransformerProvider expects in the PR (see here).
See https://opensearch.atlassian.net/browse/MIGRATIONS-2273 for a better way for users to package this
``` | ||
10. Replace both `{{INDEX_ORIGINAL_NAME}}` and `{{INDEX_NEW_NAME}}` with the original and new index names
11. Run metadata migration with the additional parameter `console metadata migrate --doc-transformer-config-file /shared-logs-output/rfs-transform.json`
12. Run backfill as normal
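
As a rough illustration of step 10 combined with the packaging approach suggested in review (wrapping the file in a `fileContent` JSON object and base64-encoding it), here is a minimal Python sketch. The placeholder names and the `fileContent` key come from this thread. The template string, the helper names `fill_placeholders` and `package_as_base64`, and the example index names are hypothetical, and the template is a stand-in rather than the real transformer config. The JSON shape that the transformer provider actually expects is not confirmed here.

```python
import base64
import json


def fill_placeholders(template: str, original_name: str, new_name: str) -> str:
    """Replace both placeholders from step 10 with concrete index names."""
    filled = template.replace("{{INDEX_ORIGINAL_NAME}}", original_name)
    filled = filled.replace("{{INDEX_NEW_NAME}}", new_name)
    if "{{" in filled:
        # Guard against a placeholder that was misspelled or left unreplaced.
        raise ValueError("unreplaced placeholder remains in template")
    return filled


def package_as_base64(file_content: str) -> str:
    """Wrap text in {"fileContent": ...} and base64-encode the JSON.

    This mirrors the intent of the jq | base64 command in the review
    thread, but emits compact JSON, so the bytes are not identical to
    jq's pretty-printed output.
    """
    payload = json.dumps({"fileContent": file_content})
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


# Hypothetical stand-in template; NOT the real transformer config.
TEMPLATE = '{"rename": {"from": "{{INDEX_ORIGINAL_NAME}}", "to": "{{INDEX_NEW_NAME}}"}}'

if __name__ == "__main__":
    filled = fill_placeholders(TEMPLATE, "logs-2023", "logs-archive")
    print(package_as_base64(filled))
```

The printed string is what the review comment proposes passing through the extra args (`--transformer-config-base64 ...`) instead of writing a file to the shared volume. Whether that flag accepts this exact payload depends on the provider format referenced in the linked PR.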
This doesn't include directions for the replayer. Is this documentation repo the best place for this?
Maybe this should be an RFC-style GitHub issue. From there, we can show what we do now and what we'd like to do...
Here's a 'renameIndex' transform that's configured like so (1 config, automatically works in all 3 transform-capable applications).
I don't know how to write one. Would you like to add one so we can have both at the same time?
It would be complicated and would possibly need to evolve over time. Index names are used in a lot of places! That's even more reason to have a complete story here - so that we don't confuse users.
If this is one of the most popular transform requests, we should have a direct transformation to provide this functionality. We should start with the interface for the TypeMappingsSanitization transform and refactor some of the common logic between renaming indices. Would you be comfortable if we extended this jira to track this work?
These instructions are great for showing a customer who is using the system today (or yesterday) how they could do this for part of the system, but documentation can persist permanently once crawlers pick it up. Showing such an incomplete and in-flux setup isn't likely to put our customers in a better spot or to put us in a better spot to support them. If we know that there's a better way to do something, we shouldn't put more light on deprecated solutions.
added some AC to https://opensearch.atlassian.net/browse/MIGRATIONS-2356
Thanks for the reviews. After talking this over, we don't like the user experience and are going to come back with documentation after we've cleaned up the experience in Migration Assistant.
Description
Add guidance for migrations that need to use the transformation systems.
Version
The most recent version of the Migration Assistant.
Checklist
For more information on following Developer Certificate of Origin and signing off your commits, please check here.