Ignore cudf's __dataframe__ deprecation. #6229

bdice · 2025-01-16T17:46:49Z

Currently CI is failing due to rapidsai/cudf#17736.

The __dataframe__ protocol appears to be used internally by scikit-learn: https://github.com/scikit-learn/scikit-learn/blob/311bf6badd74bb69081eb90e2643f15706d3473c/sklearn/utils/validation.py#L389

Errors look like:

FAILED test_metrics.py::test_sklearn_search - FutureWarning: Using `__dataframe__` is deprecated

This PR ignores the FutureWarning to allow CI to pass.

copy-pr-bot · 2025-01-16T20:44:10Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

bdice · 2025-01-16T22:51:31Z

/ok to test

bdice · 2025-01-16T22:52:09Z

Help on this PR is welcome! Please feel free to push if you can fix any of the remaining test failures.

betatim · 2025-01-17T09:34:59Z

python/cuml/cuml/tests/test_kneighbors_classifier.py

@@ -218,6 +218,8 @@ def test_predict_large_n_classes(datatype):
    assert array_equal(y_hat.astype(np.int32), y_test.astype(np.int32))


+# Ignore FutureWarning: Using `__dataframe__` is deprecated
+@pytest.mark.filterwarnings("ignore::FutureWarning")


This makes sure we only ignore the dataframe warning and not all FutureWarnings. It also removes the need for the comment, I'd say.

Suggested change

-@pytest.mark.filterwarnings("ignore::FutureWarning")
+@pytest.mark.filterwarnings("ignore:Support for loading dataframes via the `__dataframe__` interchange protocol is deprecated")

(same for the other occurrence)
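To illustrate the difference between the broad and the specific filter, here is a minimal sketch that uses only the standard-library `warnings` module rather than pytest itself. The message text is copied from the suggestion above; the helper `collect_warnings` and the second warning's text are hypothetical names made up for this example. In `warnings.filterwarnings`, the message is a regular expression matched against the start of the warning text.

```python
import warnings

# Message text copied from the suggested filter in this thread.
DATAFRAME_MSG = (
    "Support for loading dataframes via the `__dataframe__` "
    "interchange protocol is deprecated"
)


def collect_warnings(filter_kwargs):
    """Emit two FutureWarnings under one ignore filter and return the survivors."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record everything by default
        # Inserted in front of the "always" rule, so it takes precedence.
        warnings.filterwarnings("ignore", **filter_kwargs)
        warnings.warn(DATAFRAME_MSG, FutureWarning)
        # Hypothetical unrelated deprecation that we would still want to see.
        warnings.warn("some_other_api is deprecated", FutureWarning)
    return [str(w.message) for w in caught]


# Broad filter: every FutureWarning is silenced.
broad = collect_warnings({"category": FutureWarning})

# Specific filter: only the __dataframe__ deprecation is silenced.
specific = collect_warnings(
    {"message": DATAFRAME_MSG, "category": FutureWarning}
)

print(broad)
print(specific)
```

With the broad filter both warnings disappear, while the specific filter still lets the unrelated FutureWarning through, which is the behavior the suggestion is after.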

I'm unable to reproduce getting a warning in this test (and don't see how one could be generated). I think this one can just be dropped.

@jcrist What version of cudf are you using? Only recent 25.02 nightlies will show this.

25.02.00a273. I see the warning at the other location, but not here.

I cannot reproduce this one either.

I filed #6239 as a follow-up with this filter removed. My hope is that this PR passes CI and we can merge it as-is, then follow up with that PR.

betatim · 2025-01-17T09:47:41Z

Looking at the failed CI jobs I see a lot of:

ConftestImportFailure: FutureWarning: The `rmm._cuda.stream` module is deprecated in 25.02 and will be removed in a future release. Use `rmm.pylibrmm.stream` instead. (from /__w/cuml/cuml/python/cuml/cuml/tests/conftest.py)
For more information see https://pluggy.readthedocs.io/en/stable/api_reference.html#pluggy.PluggyTeardownRaisedWarning
  config = pluginmanager.hook.pytest_cmdline_parse(
ImportError while loading conftest '/__w/cuml/cuml/python/cuml/cuml/tests/conftest.py'.
conftest.py:17: in <module>
    from cuml.testing.utils import create_synthetic_dataset
/opt/conda/envs/test/lib/python3.12/site-packages/cuml/__init__.py:17: in <module>
    from cuml.internals.base import Base, UniversalBase
/opt/conda/envs/test/lib/python3.12/site-packages/cuml/internals/__init__.py:18: in <module>
    from cuml.internals.base_helpers import BaseMetaClass, _tags_class_and_instance
/opt/conda/envs/test/lib/python3.12/site-packages/cuml/internals/base_helpers.py:20: in <module>
    from cuml.internals.api_decorators import (
/opt/conda/envs/test/lib/python3.12/site-packages/cuml/internals/api_decorators.py:24: in <module>
    from cuml.internals import input_utils as iu
/opt/conda/envs/test/lib/python3.12/site-packages/cuml/internals/input_utils.py:20: in <module>
    from cuml.internals.array import CumlArray
/opt/conda/envs/test/lib/python3.12/site-packages/cuml/internals/array.py:21: in <module>
    from cuml.internals.global_settings import GlobalSettings
/opt/conda/envs/test/lib/python3.12/site-packages/cuml/internals/global_settings.py:20: in <module>
    from cuml.internals.device_type import DeviceType
/opt/conda/envs/test/lib/python3.12/site-packages/cuml/internals/device_type.py:19: in <module>
    from cuml.internals.mem_type import MemoryType
/opt/conda/envs/test/lib/python3.12/site-packages/cuml/internals/mem_type.py:22: in <module>
    cudf = gpu_only_import("cudf")
/opt/conda/envs/test/lib/python3.12/site-packages/cuml/internals/safe_imports.py:362: in gpu_only_import
    return importlib.import_module(module)
/opt/conda/envs/test/lib/python3.12/site-packages/cudf/__init__.py:19: in <module>
    _setup_numba()
/opt/conda/envs/test/lib/python3.12/site-packages/cudf/utils/_numba.py:124: in _setup_numba
    shim_ptx_cuda_version = _get_cuda_build_version()
/opt/conda/envs/test/lib/python3.12/site-packages/cudf/utils/_numba.py:19: in _get_cuda_build_version
    from cudf._lib import strings_udf
/opt/conda/envs/test/lib/python3.12/site-packages/cudf/_lib/__init__.py:2: in <module>
    from . import strings_udf
strings_udf.pyx:1: in init cudf._lib.strings_udf
    ???
/opt/conda/envs/test/lib/python3.12/site-packages/rmm/_cuda/stream.py:31: in <module>
    warnings.warn(
E   FutureWarning: The `rmm._cuda.stream` module is deprecated in 25.02 and will be removed in a future release. Use `rmm.pylibrmm.stream` instead.

This makes me think there is an old import somewhere in strings_udf.pyx. However, looking at https://github.com/rapidsai/cudf/blob/a4bbd0930a0e4922f69586560b064a0bd9e6aedc/python/cudf/cudf/_lib/strings_udf.pyx, I can't immediately see it, and the last edit was a few days ago. Maybe compiling locally with more debugging turned on, so we can see which line in strings_udf.pyx is causing this, would shed light on it.

bdice · 2025-01-17T14:29:38Z

rapidsai/rmm#1775 would cause this warning, but we searched the RAPIDS code base extensively to make sure there were no internal uses of this that would trigger deprecations... I am looking now to see what we might have missed.

bdice · 2025-01-17T14:41:33Z

I still don't see anything and can't reproduce locally. I am trying to rerun.

Matt711 · 2025-01-17T15:15:26Z

I'm taking a look too.

bdice · 2025-01-17T16:13:32Z

Seems like rerunning CI has fixed the problem. I suspect there was some intermediate state where the RMM PR had been merged but not all artifacts / dependencies agreed on how it was supposed to be used until the dependency tree (notably cudf) was rebuilt? Not sure.

jcrist · 2025-01-21T18:13:24Z

python/cuml/cuml/tests/test_kneighbors_classifier.py

@@ -218,6 +218,8 @@ def test_predict_large_n_classes(datatype):
    assert array_equal(y_hat.astype(np.int32), y_test.astype(np.int32))


+# Ignore FutureWarning: Using `__dataframe__` is deprecated
+@pytest.mark.filterwarnings("ignore::FutureWarning")


I'm unable to reproduce getting a warning in this test (and don't see how one could be generated). I think this one can just be dropped.

jcrist · 2025-01-21T18:15:28Z

python/cuml/cuml/tests/test_metrics.py

@@ -163,6 +163,8 @@ def test_r2_score(datatype, use_handle):
    np.testing.assert_almost_equal(score, 0.98, decimal=7)


+# Ignore FutureWarning: Using `__dataframe__` is deprecated
+@pytest.mark.filterwarnings("ignore::FutureWarning")


AFAICT this test checks that GridSearchCV works with cudf, which is no longer true after __dataframe__ was deprecated. IMO we should delete the test (or ask the cudf team to reconsider). Filtering the warning only puts things off until __dataframe__ is removed when it'll just break again.

Are you able to reproduce this warning locally? If so, can you please try commenting out the __dataframe__ implementation in dataframe.py in cudf and try again? Is cuml using __dataframe__ explicitly or is it an implicit path such that a different path would be taken if this implementation doesn't exist?

That's a good point! From reading the sklearn code I think sklearn has a path that would be taken once the __dataframe__ path is fully removed, so maybe filtering out the warning for now is fine. ~~I'll need to get a local cudf build setup to try, will do later today.~~

Just realized dataframe.py isn't in Cython, so this was an easy, quick check. I can confirm that once the __dataframe__ code is removed from cudf, things work again (though I can't say how efficiently). Using filterwarnings here seems fine (though with the more specific filter that Tim recommended above).
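The implicit path discussed here can be sketched roughly as follows. This is not scikit-learn's or cudf's actual code: `FrameWithProtocol`, `FrameWithoutProtocol`, `column_names`, and `names_or_error` are hypothetical stand-ins, and the sketch assumes the consumer only checks for the method with `hasattr` and otherwise falls back to reading attributes directly. It also assumes FutureWarning is promoted to an error, as the CI output above suggests.

```python
import warnings


class FrameWithProtocol:
    """Hypothetical dataframe that still exposes the deprecated protocol."""

    def __init__(self, columns):
        self.columns = columns

    def __dataframe__(self, allow_copy=True):
        # Mirrors the deprecation message quoted in the PR description.
        warnings.warn("Using `__dataframe__` is deprecated", FutureWarning)
        return self


class FrameWithoutProtocol:
    """Hypothetical version of the same dataframe once the method is removed."""

    def __init__(self, columns):
        self.columns = columns


def column_names(frame):
    """Hypothetical consumer: prefer the protocol, else read attributes."""
    if hasattr(frame, "__dataframe__"):
        return list(frame.__dataframe__().columns)
    return list(frame.columns)


def names_or_error(frame):
    """Run the consumer with FutureWarning promoted to an error."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", FutureWarning)
        try:
            return column_names(frame)
        except FutureWarning as exc:
            return f"FutureWarning: {exc}"


with_protocol = names_or_error(FrameWithProtocol(["a", "b"]))
without_protocol = names_or_error(FrameWithoutProtocol(["a", "b"]))

print(with_protocol)
print(without_protocol)
```

Under these assumptions, the object that still defines `__dataframe__` trips the deprecation, while the one without it silently takes the fallback branch, which matches the observation that things work again once the method is gone.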

The old __dataframe__ code path wasn't actually "efficient" in any meaningful way. There is no device data transport happening, only metadata. That's actually the core problem with this protocol: it doesn't specify who is responsible for actually transferring data across device boundaries, leading to consumers having to make per-library distinctions. That's discussed a bit more in rapidsai/cudf#17403.

Thanks for testing so quickly, I was just spinning up my own dev environment to be able to verify this claim myself!

Would prefer a more specific filter as well.

I filed #6239 as a follow-up with the proposal above to make the filter more specific. My hope is that this PR passes CI and we can merge it as-is, then follow up with that PR.

bdice · 2025-01-21T22:00:20Z

Thanks for all the reviews. If CI passes, I think we should merge this as-is so that CI is unblocked.

~~I am happy to file a follow-up PR~~ I filed #6239 to make the warning filter more specific and attempt to remove the one case where it may not be necessary.

csadorf

Approved with follow-ups pushed to #6239.

bdice · 2025-01-21T23:12:36Z

CUDA 11.8 ARM wheel tests are failing with a message that is similar to some test failures we've seen popping up in cuVS.

FAILED test_dask_serialization.py::test_serialize_before_training - RuntimeError: 1 of 1 worker jobs failed: cuBLAS error encountered at: file=/tmp/pip-build-env-yckqnvi0/normal/lib/python3.12/site-packages/libraft/include/raft/linalg/detail/cublaslt_wrappers.hpp line=261: call='cublasLtMatmul(resource::get_cublaslt_handle(res), mm_desc->desc, alpha, a_ptr, mm_desc->a, b_ptr, mm_desc->b, beta, c_ptr, mm_desc->c, c_ptr, mm_desc->c, &(mm_desc->heuristics.algo), nullptr, 0, stream)', Reason=13:CUBLAS_STATUS_EXECUTION_FAILED

I do not know the root cause for this. Perhaps we can request an admin-merge on this PR or #6239 and handle the CUDA 11.8 ARM wheel tests separately.

Ignore cudf's __dataframe__ deprecation.

5e20789

bdice requested a review from a team as a code owner January 16, 2025 17:46

bdice requested review from cjnolet and csadorf January 16, 2025 17:46

github-actions bot added the "Cython / Python" label (Cython or Python issue) Jan 16, 2025

jakirkham added the "bug" (Something isn't working) and "non-breaking" (Non-breaking change) labels Jan 16, 2025

bdice marked this pull request as draft January 16, 2025 20:44

Ignore another FutureWarning.

8b72717

bdice mentioned this pull request Jan 16, 2025

Use GCC 13 in CUDA 12 conda builds. #6221

Merged

bdice mentioned this pull request Jan 16, 2025

Use GCC 13 in CUDA 12+ builds rapidsai/build-planning#129

Closed

betatim reviewed Jan 17, 2025

jameslamb mentioned this pull request Jan 17, 2025

introduce libcugraph wheels rapidsai/cugraph#4804

Merged

bdice mentioned this pull request Jan 17, 2025

Make the stream module a part of the public API rapidsai/rmm#1775

Merged

FIX skip test for change in behavior of nulls of cudf.pandas

75dd2da

jcrist reviewed Jan 21, 2025

dantegd marked this pull request as ready for review January 21, 2025 19:19

divyegala approved these changes Jan 21, 2025

bdice mentioned this pull request Jan 21, 2025

Ignore cudf's __dataframe__ deprecation with simpler filters. #6239

Open

bdice added and removed the "bug" (Something isn't working) label Jan 21, 2025

bdice self-assigned this Jan 21, 2025

jakirkham approved these changes Jan 21, 2025

csadorf approved these changes Jan 21, 2025

raydouglass merged commit 01e19bb into rapidsai:branch-25.02 Jan 21, 2025
61 of 63 checks passed
