[Blogpost] Configurable Automation for OpenSearch ML Use Cases #2698
5853a50 to e270e65
Suggestion: let's add all the team members to the blog post.
Signed-off-by: owaiskazi19 <[email protected]>
e270e65 to 43ddfae
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: owaiskazi19 <[email protected]>
44d2993 to 8c9810a
Signed-off-by: owaiskazi19 <[email protected]>
@owaiskazi19 @kolchfa-aws Editorial review complete. Please see my comments and changes and let me know if you have any questions. Thanks!
}
```

With Flow Framework, we've simplified this complex setup process, enabling you to focus on your tasks without the burden of navigating complex APIs. Our goal is for you to use OpenSearch seamlessly, unlocking new possibilities in your projects.
"realizing" or "uncovering" instead of "unlocking"?

## Additional default use cases

You can explore more default use cases by viewing [substitution templates](https://github.com/opensearch-project/flow-framework/tree/2.13/src/main/resources/substitutionTemplates) with their corresponding [defaults](https://github.com/opensearch-project/flow-framework/tree/2.13/src/main/resources/defaults).
"and" their corresponding defaults?
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
@natebower Thank you for the review. I addressed your comments and accepted your suggestions.
LGTM, a few suggestions.
1. Create a connector for a remote model, specifying pre- and post-processing functions.
1. Register an embedding model using the connector ID obtained in the previous step.
1. Configure an ingest pipeline to generate vector embeddings using the model ID of the registered model.
1. Create a k-NN index and add the pipeline created in the previous step.
It's not clear to the reader how complex this is, particularly since the same steps are essentially repeated on lines 38-40. We need to highlight in the sentence above (line 25) that these require 4 separate API calls, perhaps adding the words "copy and paste" when referring to "using the X ID".
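To make the "four separate calls, with IDs copied between them" point concrete, here is a rough sketch that lists the calls as plain data without sending anything. It is only an illustration and not text proposed for the post: the endpoint paths follow the ML Commons, ingest pipeline, and index APIs as I understand them, and the connector body, pipeline name, field names, and embedding dimension are all placeholder assumptions.

```python
from typing import Any

# Placeholder IDs (assumptions): in practice each one is copied by hand
# from the response of the previous call.
CONNECTOR_ID = "<connector_id from step 1>"
MODEL_ID = "<model_id from step 2>"


def semantic_search_setup_calls(index: str = "my-nlp-index",
                                pipeline: str = "nlp-ingest-pipeline",
                                dimension: int = 1024) -> list[dict[str, Any]]:
    """Return the four setup requests, in order, as plain data (nothing is sent)."""
    return [
        # Step 1: create a connector (real connector details are elided here).
        {"method": "POST", "path": "/_plugins/_ml/connectors/_create",
         "body": {"name": "example-connector"}},
        # Step 2: register a remote model using the connector ID from step 1.
        {"method": "POST", "path": "/_plugins/_ml/models/_register",
         "body": {"name": "example-embedding-model",
                  "function_name": "remote",
                  "connector_id": CONNECTOR_ID}},
        # Step 3: ingest pipeline that embeds text using the model ID from step 2.
        {"method": "PUT", "path": f"/_ingest/pipeline/{pipeline}",
         "body": {"processors": [{"text_embedding": {
             "model_id": MODEL_ID,
             "field_map": {"passage_text": "passage_embedding"}}}]}},
        # Step 4: k-NN index that uses the pipeline from step 3 by default.
        {"method": "PUT", "path": f"/{index}",
         "body": {"settings": {"index.knn": True, "default_pipeline": pipeline},
                  "mappings": {"properties": {
                      "passage_embedding": {"type": "knn_vector",
                                            "dimension": dimension},
                      "passage_text": {"type": "text"}}}}},
    ]


for call in semantic_search_setup_calls():
    print(call["method"], call["path"])
```

Steps 2 and 3 cannot be written until the previous response is in hand, which is the manual hand-off the sentence on line 25 could call out.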
1. Configure an ingest pipeline to generate vector embeddings using the model ID of the registered model.
1. Create a k-NN index and add the pipeline created in the previous step.

This complex setup typically required you to be familiar with the OpenSearch ML Commons APIs. However, we are simplifying this experience through the Flow Framework plugin. Let's demonstrate how the plugin simplifies this process using the preceding semantic search example.
Don't think we need "typically"

```json
{
"create_index.name": "my-nlp-index"
This is confusing having this separated from the line 47 API, particularly with the response JSON in between. I think this goes in the block under line 49, but even I'm not sure. Make it clear, perhaps including it as a line 50 after describing that it's optional, or perhaps repeating the whole REST call with both lines.
I will remove the JSON completely.
Once the workflow is provisioned, you can ingest documents into the index created by the workflow:

```json
PUT /my-nlp-index/_doc/1
We should probably use the path without the document ID included. Having the `1` here and the `s1` in the "id" field is confusing. (We can leave out the "id" field as well since it's not the same as the "_id".)
"neural": {
"passage_embedding": {
"query_text": "Hi world",
"k": 100
Maybe a smaller k?
It's a lowercase k only
I meant a number less than 100, heh. Like "k": 10
Done
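For anyone skimming this thread later, a minimal sketch of the query body with the smaller `k` might look like the following. The helper name is hypothetical, and the optional `model_id` argument is an assumption on my part: the snippet under review shows only `query_text` and `k`, which suggests a default model is configured elsewhere.

```python
from typing import Any, Optional


def neural_query(query_text: str, k: int = 10,
                 model_id: Optional[str] = None,
                 field: str = "passage_embedding") -> dict[str, Any]:
    """Build a neural query body; k is the number of nearest neighbors to return."""
    clause: dict[str, Any] = {"query_text": query_text, "k": k}
    if model_id is not None:
        # Only needed when no default model is configured (assumption).
        clause["model_id"] = model_id
    return {"query": {"neural": {field: clause}}}


print(neural_query("Hi world"))
```

A small `k` keeps the example response short enough to read in a blog post, which is the main reason to prefer 10 over 100 here.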
Signed-off-by: owaiskazi19 <[email protected]>
Looks really good. Some minor suggestions.
meta_description: Explore the simplicity of integrating Machine Learning capabilities within OpenSearch through an innovative and groundbreaking framework designed to simplify complex setup tasks.
---

In OpenSearch, to use machine learning (ML) offerings, such as semantic, hybrid, and multimodal search, you often have to grapple with complex setup and preprocessing tasks. Additionally, you must write verbose queries, which can be a time-consuming and error-prone process.
This sentence has a lot of commas. I tried to rewrite it to make it better but couldn't really do much better. So I guess it's fine as is! :|
@kolchfa-aws any inputs here?

In OpenSearch, to use machine learning (ML) offerings, such as semantic, hybrid, and multimodal search, you often have to grapple with complex setup and preprocessing tasks. Additionally, you must write verbose queries, which can be a time-consuming and error-prone process.

In this blog post, we introduce the OpenSearch Flow Framework plugin, [released in version 2.13](https://opensearch.org/blog/2.13-is-ready-for-download/) and designed to streamline this cumbersome process. By using this plugin, you can simplify complex setups with just one click. We've provided automated templates, enabling you to create connectors, register models, deploy them, and register agents and tools through a single API call. This eliminates the complexity of calling multiple APIs and orchestrating setups based on the responses.
Nit: "click" is kind of GUI centric and we're still API. Can we maybe say "one simple API call?"
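As a rough sketch of what "one simple API call" could look like in practice, the request can be built as follows. Treat the specifics as assumptions rather than confirmed by this thread: the `use_case` value and the `create_connector.credential.key` field name are from my reading of the Flow Framework defaults, the helper name is hypothetical, and only `create_index.name` appears in the diff under review.

```python
from typing import Any, Optional
from urllib.parse import urlencode


def flow_framework_request(use_case: str, api_key: str,
                           index_name: Optional[str] = None
                           ) -> tuple[str, str, dict[str, Any]]:
    """Build (method, path, body) for a create-and-provision workflow request."""
    params = urlencode({"use_case": use_case, "provision": "true"})
    # Assumed credential field name; the key itself is supplied by the user.
    body: dict[str, Any] = {"create_connector.credential.key": api_key}
    if index_name is not None:
        # Optional override of the default index name.
        body["create_index.name"] = index_name
    return "POST", f"/_plugins/_flow_framework/workflow?{params}", body


method, path, body = flow_framework_request(
    "semantic_search_with_cohere_embedding", "<YOUR API KEY>")
print(method, path)
```

Nothing here involves a GUI, which supports wording the sentence around an API call rather than a click.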

## Before the Flow Framework plugin

Previously, setting up semantic search involves *4 separate API* calls outlined in the [semantic search documentation](https://opensearch.org/docs/latest/search-plugins/semantic-search/):
Put the closing `*` after "calls".
Signed-off-by: owaiskazi19 <[email protected]>
@kolchfa-aws Final edits
@@ -22,14 +22,14 @@ In this blog post, we introduce the OpenSearch Flow Framework plugin, [released

 ## Before the Flow Framework plugin

-Previously, setting up semantic search involved the steps outlined in the [semantic search documentation](https://opensearch.org/docs/latest/search-plugins/semantic-search/):
+Previously, setting up semantic search involves *4 separate API* calls outlined in the [semantic search documentation](https://opensearch.org/docs/latest/search-plugins/semantic-search/):
"involved four". Comma after "calls".
"create_index.name": "my-nlp-index"
}
```
Note: The workflow in the previous step creates a default k-NN index. The default index name is `my-nlp-index`
Add terminating period.
Signed-off-by: Fanit Kolchina <[email protected]>
Thank you for reviewing the edits, @natebower! I've addressed your final comments.
Signed-off-by: Fanit Kolchina <[email protected]>
…ct-website into flow-framework
@nateynateynate @krisfreedain - Blog is ready to publish today.
Let's push this live.
@nateynateynate Thanks for pushing it live. I don't see @dbwiddis and @jackiehanyang listed as authors, though I added them in the PR. Any idea why?
Description
Configurable Automation for OpenSearch ML Use Cases for 2.13
Issues Resolved
#2697
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.