Handle exceptions gracefully when delete non-existent resources during integ test resource clean up #1154

weijia-aws · 2025-01-30T22:16:38Z

Description

When an integ test runs, it creates a model, index, pipeline, etc., performs some assertions, and finally cleans up the resources.

public void processor_integrationTest() {
    try {
        // load model
        // create pipeline/index
        // do tests/asserts
    } finally {
        // cleanup pipeline, index, model
    }
}

If resource creation fails with an exception, the test enters the finally block and tries to delete the resources. Since these resources don't exist, another exception is thrown when the test tries to delete them; see issue #1091. I was able to reproduce the exception by following #1093 (comment). In this case, we should gracefully handle such NOT FOUND exceptions.
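A minimal sketch of the idea, not the code in this PR: the cleanup helper treats "not found" as a no-op instead of letting it escape the finally block. `NotFoundException`, `RESOURCES`, `delete`, and `deleteIfExists` are hypothetical names used only for illustration; a real test would issue REST calls against the cluster.

```java
import java.util.HashSet;
import java.util.Set;

public class GracefulCleanup {
    /** Hypothetical stand-in for a "resource not found" (HTTP 404) failure. */
    static class NotFoundException extends RuntimeException {
        NotFoundException(String message) {
            super(message);
        }
    }

    /** Hypothetical in-memory stand-in for the resources that exist on the cluster. */
    static final Set<String> RESOURCES = new HashSet<>();

    /** Deletes a resource and throws NotFoundException if it does not exist. */
    static void delete(String name) {
        if (!RESOURCES.remove(name)) {
            throw new NotFoundException("resource not found: " + name);
        }
    }

    /** Deletes a resource; returns false instead of throwing when it is already absent. */
    static boolean deleteIfExists(String name) {
        try {
            delete(name);
            return true;
        } catch (NotFoundException e) {
            // Nothing to clean up: creation probably failed before this resource existed.
            return false;
        }
    }
}
```

Only the not-found case is swallowed here; any other failure still propagates, so genuine cleanup problems remain visible.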

Related Issues

Resolves #1098

Related issue: #1091 and #1093

Check List

New functionality includes testing.
New functionality has been documented.
API changes companion pull request created.
Commits are signed per the DCO using --signoff.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

heemin32 · 2025-01-31T17:40:23Z

Can we expand the scope of this PR to shift the responsibility for resource cleanup from individual tests to the framework? This way, each test wouldn’t need to handle resource cleanup individually.

vibrantvarun · 2025-01-31T17:46:37Z

Can we expand the scope of this PR to shift the responsibility for resource cleanup from individual tests to the framework? This way, each test wouldn’t need to handle resource cleanup individually.

@heemin32 I would suggest not doing that, as integ tests are meant to run in isolation. With resource cleanup at the framework level, tests could start sharing resources with each other, and critical issues could be missed at the integ test level because a resource might be created by one method and used by another.

vibrantvarun · 2025-01-31T17:49:23Z

LGTM

minalsha · 2025-01-31T17:49:46Z

#1154 (comment)

+1 to Varun's comment here.

heemin32 · 2025-01-31T17:50:24Z

@heemin32 I would suggest not doing that, as integ tests are meant to run in isolation. With resource cleanup at the framework level, tests could start sharing resources with each other, and critical issues could be missed at the integ test level because a resource might be created by one method and used by another.

Have you seen any such issue in the k-NN repo, where resource cleanup is handled at the framework level? The resource sharing issue can happen when cleanup is done at the individual test level and a developer forgets to clean up a resource in a test.
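To make the framework-level option concrete, here is a rough sketch in which a base class owns cleanup and each test only registers what it creates. `BaseIT`, `registerCleanup`, `tearDown`, and `ExampleIT` are hypothetical names; this is an illustration of the pattern under discussion, not the implementation in this PR or in k-NN.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class FrameworkCleanupSketch {
    /** Hypothetical base class: subclasses register what they create, tearDown deletes it. */
    static abstract class BaseIT {
        private final Deque<Runnable> cleanups = new ArrayDeque<>();

        /** Records a cleanup action for a resource the test just created. */
        protected void registerCleanup(Runnable cleanup) {
            cleanups.push(cleanup);
        }

        /** A test framework would call this after every test, for example from an @After method. */
        public void tearDown() {
            // Run in reverse creation order so dependents are removed before their dependencies.
            while (!cleanups.isEmpty()) {
                cleanups.pop().run();
            }
        }
    }

    /** Hypothetical test class that never writes its own finally block. */
    static class ExampleIT extends BaseIT {
        final List<String> log = new ArrayList<>();

        void createResources() {
            log.add("create model");
            registerCleanup(() -> log.add("delete model"));
            log.add("create pipeline");
            registerCleanup(() -> log.add("delete pipeline"));
        }
    }
}
```

Because the cleanup queue lives on the test instance, each test still cleans up only what it registered, which is one way to address the isolation concern raised above.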


q-andy · 2025-02-05T19:56:51Z

src/testFixtures/java/org/opensearch/neuralsearch/BaseNeuralSearchIT.java

+     * @return get pipeline response as a map object
+     */
+    @SneakyThrows(ParseException.class)
+    protected Map<String, Object> retrievePipelines(final String pipelineType, final String pipelineName) throws IOException {


Great work. I'm curious whether adding extra GET requests per test increases latency for the integ tests. Is the total runtime of the integ tests impacted?

The latency should be minimal and it is generally okay to have some slowness for testing.

So I ran the integ tests with and without this commit. Total runtime is:

Previous: 8m3.59s
Now: 8m2.34s

The latency is really minimal


codecov · 2025-02-05T20:34:23Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.69%. Comparing base (e8ed3a4) to head (2f601fb).

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #1154      +/-   ##
============================================
- Coverage     81.72%   81.69%   -0.03%     
  Complexity     2494     2494              
============================================
  Files           186      186              
  Lines          8426     8426              
  Branches       1428     1428              
============================================
- Hits           6886     6884       -2     
- Misses         1000     1002       +2     
  Partials        540      540



heemin32

LGTM. Thanks!

heemin32 · 2025-02-06T01:14:26Z

Test is failing. Please take a look.

weijia-aws · 2025-02-06T20:25:08Z

Test is failing. Please take a look.

The qa tests are failing because testAgainstNewCluster depends on resources that are created in testAgainstOldCluster. However, when testAgainstOldCluster finishes, all resources are cleaned up, so the testAgainstNewCluster tests fail with java.lang.NullPointerException or a resource-not-found exception. In order to fix this, we will need to re-create resources in testAgainstNewCluster tasks.

weijia-aws · 2025-02-06T20:32:25Z

In order to fix this, we will need to re-create resources in testAgainstNewCluster tasks

We can also modify the cleanUp method to not delete resources if running against old cluster?

weijia-aws · 2025-02-06T20:37:53Z

We can also modify the cleanUp method to not delete resources if running against old cluster?

I think we should do this, as it matches the current behavior and requires fewer code changes than re-creating the resources.

heemin32 · 2025-02-06T21:19:11Z

We can also modify the cleanUp method to not delete resources if running against old cluster?

I think we should do this, as it matches the current behavior and requires fewer code changes than re-creating the resources.

Re-creating resources defeats the purpose of the bwc tests, where we validate that resources created in the old cluster still work in the new cluster.

How about exposing a method that tells whether a subclass wants resources cleaned up? Then, for the bwc tests, we can skip automatic deletion and delete the resources manually as we do today.
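One possible shape for such a hook, sketched under assumptions: `shouldCleanUpResources`, `BaseIT`, `RegularIT`, and `BwcIT` are hypothetical names, and the in-memory list stands in for resources on the cluster. It is not the method that was actually merged.

```java
import java.util.ArrayList;
import java.util.List;

public class CleanupHookSketch {
    /** Hypothetical base class whose automatic cleanup can be switched off by subclasses. */
    static abstract class BaseIT {
        /** In-memory stand-in for resources that exist on the cluster. */
        final List<String> resources = new ArrayList<>();

        /** Hypothetical hook: override and return false to opt out of automatic cleanup. */
        protected boolean shouldCleanUpResources() {
            return true;
        }

        /** Would run after each test; deletes everything unless the subclass opted out. */
        public void tearDown() {
            if (shouldCleanUpResources()) {
                resources.clear();
            }
        }
    }

    /** Regular integ test: keeps the default and gets automatic cleanup. */
    static class RegularIT extends BaseIT {
    }

    /** BWC test: keeps resources so the run against the new cluster can still find them. */
    static class BwcIT extends BaseIT {
        @Override
        protected boolean shouldCleanUpResources() {
            return false;
        }
    }
}
```

With this shape, the bwc tests keep their existing manual deletion, and every other test gets cleanup by default.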


heemin32 · 2025-02-08T02:12:40Z

qa/restart-upgrade/src/test/java/org/opensearch/neuralsearch/bwc/restart/HybridSearchIT.java

-            String modelId = uploadTextEmbeddingModel();
-            loadModel(modelId);
-            createPipelineProcessor(modelId, pipelineName);
+            super.modelId = uploadTextEmbeddingModel();


I suggest not making any changes to the BWC tests, as this approach may not work. For instance, there are two tests: testNormalizationProcessor_whenIndexWithMultipleShards_E2EFlow and testNormalizationProcessor_whenIndexWithSingleShard_E2EFlow. Each test uploads a model, but the model ID from the previous test is overwritten by the next, so not all resources get deleted.
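The overwrite problem can be illustrated with a small sketch, assuming both uploads are recorded through the same shared field. `SingleSlotTracker` and `ListTracker` are hypothetical classes, not code from the repository.

```java
import java.util.ArrayList;
import java.util.List;

public class ModelIdTrackingSketch {
    /** Hypothetical tracker with a single slot: a second upload overwrites the first ID. */
    static class SingleSlotTracker {
        String modelId;

        void recordUpload(String id) {
            modelId = id;
        }

        /** Only the most recently recorded ID is known, so only that one can be deleted. */
        List<String> idsToDelete() {
            List<String> ids = new ArrayList<>();
            if (modelId != null) {
                ids.add(modelId);
            }
            return ids;
        }
    }

    /** Hypothetical tracker that remembers every uploaded ID. */
    static class ListTracker {
        final List<String> modelIds = new ArrayList<>();

        void recordUpload(String id) {
            modelIds.add(id);
        }

        List<String> idsToDelete() {
            return new ArrayList<>(modelIds);
        }
    }
}
```

After two uploads, the single-slot tracker can only report the second ID for deletion, while the list-based tracker reports both.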

weijia-aws requested review from heemin32, navneet1v, VijayanB, vamshin, jmazanec15, naveentatikonda, junqiu-lei, martin-gaievski, sean-zheng-amazon, model-collapse, zane-neo, vibrantvarun, zhichao-aws, yuye-aws and minalsha as code owners January 30, 2025 22:16

github-actions bot added the good first issue and Infrastructure labels Jan 30, 2025

weijia-aws force-pushed the main branch from f040e3a to 3ab98bf January 30, 2025 22:32

vibrantvarun added the skip-changelog label Jan 31, 2025

vibrantvarun approved these changes Jan 31, 2025

heemin32 reviewed Jan 31, 2025

q-andy reviewed Jan 31, 2025

vibrantvarun self-requested a review January 31, 2025 22:59

weijia-aws added 2 commits February 4, 2025 14:31

Handle exceptions gracefully when delete non-existent resources during integ test resource clean up · e83ad5d · Signed-off-by: Weijia Zhao

Clean up test resource dynamically after each test case · b9dd581 · Signed-off-by: Weijia Zhao

weijia-aws force-pushed the main branch from e824609 to b9dd581 February 4, 2025 22:31

q-andy reviewed Feb 5, 2025

weijia-aws commented Feb 5, 2025

heemin32 reviewed Feb 5, 2025

heemin32 reviewed Feb 5, 2025

junqiu-lei reviewed Feb 5, 2025

heemin32 reviewed Feb 6, 2025

heemin32 reviewed Feb 6, 2025

heemin32 reviewed Feb 6, 2025

Code clean up · 3780933 · Signed-off-by: Weijia Zhao

weijia-aws force-pushed the main branch from 805fa36 to 3780933 February 6, 2025 01:01

heemin32 reviewed Feb 6, 2025

heemin32 approved these changes Feb 6, 2025

heemin32 added the backport 2.x label Feb 6, 2025

Fixing failing bwc tests · 2f601fb · Signed-off-by: Weijia Zhao

heemin32 reviewed Feb 8, 2025
