Add vector search param source #425

Merged
merged 1 commit into from
Dec 22, 2023

VijayanB
Member

@VijayanB VijayanB commented Dec 20, 2023

Description

Added a new param source that partitions the vector data set and its
neighbors. The partitions are passed to the runner, which performs the
search and compares the response with the neighbors to calculate
recall.
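
For readers unfamiliar with the idea, here is a minimal sketch of the two steps described above: splitting a data set into per-client partitions and scoring a search response against the expected neighbors. The function names `partition_bounds` and `recall_at_k` are hypothetical and are not taken from this PR; the actual param source and runner may split and score differently.

```python
from typing import Sequence, Tuple


def partition_bounds(total: int, partition_index: int, total_partitions: int) -> Tuple[int, int]:
    """Return the [start, end) range of items owned by one partition.

    Hypothetical helper: items are split into contiguous, near-equal ranges.
    """
    base, extra = divmod(total, total_partitions)
    # The first `extra` partitions each take one additional item.
    start = partition_index * base + min(partition_index, extra)
    end = start + base + (1 if partition_index < extra else 0)
    return start, end


def recall_at_k(returned_ids: Sequence[str], true_neighbors: Sequence[str], k: int) -> float:
    """Fraction of the first k expected neighbors found in the first k returned ids."""
    truth = set(true_neighbors[:k])
    if not truth:
        return 0.0
    return len(set(returned_ids[:k]) & truth) / len(truth)


if __name__ == "__main__":
    # Ten query vectors shared by three clients.
    print([partition_bounds(10, i, 3) for i in range(3)])
    # Two of the three expected neighbors were returned.
    print(recall_at_k(["1", "2", "9"], ["1", "2", "3"], k=3))
```

The same bounds would be applied to the query vectors and to the neighbors so that each query stays aligned with its expected result.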

This param source extends Search ParamSource to inherit search's
other query parameters.
The vector param source adds the additional parameters that are required
for the vector search operation type.
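
A rough sketch of the inheritance described above, under stated assumptions: the class names, the base query fields, and the required parameter names (`k`, `field`, `data_set_format`, `data_set_path`) are illustrative stand-ins and do not reproduce the classes in osbenchmark/workload/params.py.

```python
class SearchParamSourceSketch:
    """Hypothetical stand-in for a search param source holding common query params."""

    def __init__(self, params: dict):
        self.query_params = {
            "index": params.get("index"),
            "request-timeout": params.get("request-timeout"),
        }


class VectorSearchParamSourceSketch(SearchParamSourceSketch):
    """Hypothetical subclass that adds the vector-specific parameters."""

    # Assumed parameter names, chosen for illustration only.
    REQUIRED = ("k", "field", "data_set_format", "data_set_path")

    def __init__(self, params: dict):
        # Reuse the parent's handling of the common search parameters.
        super().__init__(params)
        missing = [name for name in self.REQUIRED if name not in params]
        if missing:
            raise ValueError("Missing vector search params: " + ", ".join(missing))
        self.query_params.update({name: params[name] for name in self.REQUIRED})


if __name__ == "__main__":
    source = VectorSearchParamSourceSketch({
        "index": "vectors",
        "k": 10,
        "field": "embedding",
        "data_set_format": "hdf5",
        "data_set_path": "/tmp/queries.hdf5",
    })
    print(source.query_params)
```

Validating the required parameters in the constructor means a misconfigured operation fails before any search is issued.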

Issues Resolved

Part of #103

Testing

  • New functionality includes testing

make test

tests/workload/params_test.py::VectorSearchParamSourceTests::test_invalid_data_set_format PASSED [ 96%]
tests/workload/params_test.py::VectorSearchParamSourceTests::test_invalid_data_set_path PASSED [ 96%]
tests/workload/params_test.py::VectorSearchParamSourceTests::test_missing_params PASSED [ 96%]
tests/workload/params_test.py::VectorSearchParamSourceTests::test_partition_bigann PASSED [ 96%]
tests/workload/params_test.py::VectorSearchParamSourceTests::test_partition_hdf5 PASSED [ 96%]
tests/workload/params_test.py::VectorSearchPartitionPartitionParamSourceTestCase::test_params PASSED [ 96%]


================= 1206 passed, 5 skipped, 3 warnings in 15.59s =================

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@VijayanB VijayanB marked this pull request as draft December 20, 2023 00:57
@VijayanB
Member Author

This depends on #424

@VijayanB VijayanB force-pushed the add-params-source branch 2 times, most recently from 2acd13f to b9e453e Compare December 20, 2023 08:09
@VijayanB
Member Author

Tested this PR with the 1m-128d-l2-hdf5 vector dataset.

@VijayanB VijayanB force-pushed the add-params-source branch 4 times, most recently from 84a9651 to 61a71bd Compare December 21, 2023 17:12
@VijayanB VijayanB marked this pull request as ready for review December 21, 2023 17:13
        return self.delegate_param_source.partition(partition_index, total_partitions)

    def params(self):
        raise exceptions.BenchmarkError("Do not use a VectorSearchParamSource without partitioning")
Collaborator

Might be more appropriate to use exceptions.WorkloadConfigError as opposed to the catch-all exception exceptions.BenchmarkError

Member Author

Updated.
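
To illustrate the review suggestion above, here is a self-contained sketch of a param source that only works once partitioned and raises a configuration-specific error otherwise. The two exception classes are local stand-ins for `exceptions.BenchmarkError` and `exceptions.WorkloadConfigError`; treating the latter as a subclass of the former is an assumption, and `PartitionOnlyParamSource` and its round-robin split are hypothetical.

```python
class BenchmarkError(Exception):
    """Local stand-in for the catch-all benchmark error."""


class WorkloadConfigError(BenchmarkError):
    """Local stand-in for a workload configuration error (subclassing is assumed)."""


class _Partition:
    """One client's share of the queries."""

    def __init__(self, queries):
        self._queries = iter(queries)

    def params(self):
        # next() raises StopIteration once this partition is exhausted.
        return {"body": next(self._queries)}


class PartitionOnlyParamSource:
    """Hypothetical param source that must be partitioned before use."""

    def __init__(self, queries):
        self._queries = list(queries)

    def partition(self, partition_index, total_partitions):
        # Round-robin split, used here only to keep the sketch short.
        return _Partition(self._queries[partition_index::total_partitions])

    def params(self):
        # A specific error type tells the user the workload is misconfigured.
        raise WorkloadConfigError("Do not use a VectorSearchParamSource without partitioning")


if __name__ == "__main__":
    source = PartitionOnlyParamSource(["q0", "q1", "q2", "q3"])
    print(source.partition(1, 2).params())
    try:
        source.params()
    except WorkloadConfigError as error:
        print("config error:", error)
```

Under the subclassing assumption, callers that already catch the broader error type keep working, while callers that care about configuration mistakes can catch the narrower one.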

Added new param source to partition vector dataset and
neighbors. This will be passed to runner to perform
search and compare response with neighbors for recall
calculation.

This param source extends Search ParamSource to inherit search's
other query parameters.
The vector param source adds the additional parameters that are required
for the vector search operation type.

Signed-off-by: Vijayan Balasubramanian <[email protected]>
@IanHoang IanHoang merged commit 1dc9de5 into opensearch-project:main Dec 22, 2023
8 checks passed
4 participants