CI: Initial additions for fp32 and windows GPU test support #1778
Conversation
@ethanglaser PR #1674 should solve some of the outstanding issues with IncrementalEmpiricalCovariance for fp32 sklearnex testing, but it needs #1795 to be pulled in first. I will set up a meeting for #1674, but could you review #1795? It's a relatively small change and already passes CI.
@ethanglaser please rebase your branch
Latest CI with just GPU tests: http://intel-ci.intel.com/ef0e2f47-e2c3-f1f1-aee4-a4bf010d0e2e
Co-authored-by: Samir Nasibli <[email protected]>
Just some small things, otherwise looks fine.
@@ -66,7 +66,8 @@ def test_dense_self_rbf_kernel(queue):
     result = rbf_kernel(X, queue=queue)
     expected = sklearn_rbf_kernel(X)

-    assert_allclose(result, expected, rtol=1e-14)
+    tol = 1e-5 if result.dtype == np.float32 else 1e-14
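The dtype-dependent tolerance pattern in this diff can be sketched as a standalone helper (`check_close` is a hypothetical name for illustration, not the PR's actual code):

```python
import numpy as np
from numpy.testing import assert_allclose

def check_close(result, expected):
    # fp32 carries only ~7 significant decimal digits, so it needs a much
    # looser relative tolerance than fp64 (hypothetical helper)
    tol = 1e-5 if result.dtype == np.float32 else 1e-14
    assert_allclose(result, expected, rtol=tol)

# passes for fp32: the relative error (~1e-6) is within the 1e-5 tolerance
check_close(np.array([1.0, 2.0], dtype=np.float32),
            np.array([1.000001, 2.0], dtype=np.float32))
```

With the original fixed rtol=1e-14, the same fp32 comparison would fail, since that tolerance is far below what single precision can represent.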
Yikes, a 10^9 loosening of the tolerance. Would this also warrant an investigation?
I am not sure why 1e-14 was used here specifically; generally we don't go stricter than 1e-7. For rtol, 1e-5 should be acceptable, as this is a typical threshold used in other fp32 testing.
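For context on why ~1e-7 is a practical floor for fp32 comparisons, the machine epsilons can be checked directly (a small illustrative snippet, not from the PR):

```python
import numpy as np

# machine epsilon: the relative spacing between adjacent representable values
eps32 = np.finfo(np.float32).eps  # ~1.19e-07
eps64 = np.finfo(np.float64).eps  # ~2.22e-16

# an rtol below eps32 can rarely be met by fp32 arithmetic, which is why
# fp32 tests typically use rtol somewhere between 1e-5 and 1e-7
print(f"float32 eps: {eps32:.2e}, float64 eps: {eps64:.2e}")
```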
Should be ready for final review
/intelci: run
CI with infra branch (includes fp32 + windows GPU validation): http://intel-ci.intel.com/ef2436e8-ef5e-f182-b1e2-a4bf010d0e2e
@@ -85,6 +85,8 @@ def test_generated_dataset(queue, dtype, n_dim, n_cluster):
     d, i = nn.fit(rs_centroids).kneighbors(cs)
     # We have applied 2 sigma rule once
     desired_accuracy = int(0.9973 * n_cluster)
+    if d.dtype == np.float64:
+        desired_accuracy = desired_accuracy - 1
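The threshold logic in the diff above can be sketched as a standalone helper (`expected_matches` is a hypothetical name; the intent is inferred from the test):

```python
import numpy as np

def expected_matches(n_cluster, dtype):
    # 0.9973 is the normal-distribution coverage fraction used by the test
    desired = int(0.9973 * n_cluster)
    if dtype == np.float64:
        # the diff relaxes the fp64 threshold by one cluster
        desired -= 1
    return desired

print(expected_matches(1000, np.float32))  # 997
print(expected_matches(1000, np.float64))  # 996
```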
What was the logic behind subtracting 1 from the desired accuracy?
It matches a minor threshold change added to test_kmeans (https://github.com/intel/scikit-learn-intelex/blob/main/onedal/cluster/tests/test_kmeans.py#L87). I'm not sure exactly how the desired_accuracy was set, but it seems the threshold doesn't necessarily match up with actual performance.
I see, the kmeans++ init issue may get fixed with the changes proposed in uxlfoundation/oneDAL#2796. The onedal code uses 1 trial at the moment.
Looks good to me, but worth waiting for @icfaust on all the force_all_finite business.
Thanks for the reviews :)
Description
Changes for green CI on GPU validation on devices without fp64 support (includes Windows GPU sklearnex validation).
To be merged with infra PR 713
Changes include: