AutoTabPFNRegressor: ValueError: Buffer dtype mismatch, expected 'const float32_t' but got 'double' #173

Open
Mccti078 opened this issue Feb 5, 2025 · 3 comments
Labels
bug Something isn't working

Comments

Mccti078 commented Feb 5, 2025

Describe the bug

Hi Tabpfn,

I'm trying to use AutoTabPFNRegressor to train a tabular model, but I keep getting the error below and can't seem to fix it. For context, TabPFNRegressor works without problems on the same dataset.

My apologies if this is user error, but I am at a bit of a dead end.

Thank you

2025-02-05 20:44:50 INFO Using default preset for Post Hoc Ensemble.
2025-02-05 20:44:50 INFO Using categorical_feature_indices: [2]
2025-02-05 20:44:50 INFO Using task type: regression
2025-02-05 20:44:50 INFO Obtaining TabPFN models from a random portfolio.
2025-02-05 20:44:52 INFO Using 100 base models: ['default_tabpfn_model_0', 'random_tabpfn_model_1', 'random_rf_pfn_model_2', 'random_tabpfn_model_3', 'random_rf_pfn_model_4', 'random_rf_pfn_model_5', 'random_rf_pfn_model_6', 'random_rf_pfn_model_7', 'random_tabpfn_model_8', 'random_tabpfn_model_9', 'random_rf_pfn_model_10', 'random_rf_pfn_model_11', 'random_tabpfn_model_12', 'random_rf_pfn_model_13', 'random_tabpfn_model_14', 'random_tabpfn_model_15', 'random_rf_pfn_model_16', 'random_tabpfn_model_17', 'random_tabpfn_model_18', 'random_rf_pfn_model_19', 'random_rf_pfn_model_20', 'random_rf_pfn_model_21', 'random_tabpfn_model_22', 'random_rf_pfn_model_23', 'random_tabpfn_model_24', 'random_tabpfn_model_25', 'random_rf_pfn_model_26', 'random_tabpfn_model_27', 'random_tabpfn_model_28', 'random_rf_pfn_model_29', 'random_tabpfn_model_30', 'random_rf_pfn_model_31', 'random_tabpfn_model_32', 'random_tabpfn_model_33', 'random_rf_pfn_model_34', 'random_tabpfn_model_35', 'random_rf_pfn_model_36', 'random_rf_pfn_model_37', 'random_tabpfn_model_38', 'random_rf_pfn_model_39', 'random_tabpfn_model_40', 'random_tabpfn_model_41', 'random_tabpfn_model_42', 'random_rf_pfn_model_43', 'random_tabpfn_model_44', 'random_tabpfn_model_45', 'random_rf_pfn_model_46', 'random_tabpfn_model_47', 'random_rf_pfn_model_48', 'random_tabpfn_model_49', 'random_tabpfn_model_50', 'random_tabpfn_model_51', 'random_tabpfn_model_52', 'random_rf_pfn_model_53', 'random_tabpfn_model_54', 'random_tabpfn_model_55', 'random_rf_pfn_model_56', 'random_rf_pfn_model_57', 'random_rf_pfn_model_58', 'random_tabpfn_model_59', 'random_rf_pfn_model_60', 'random_tabpfn_model_61', 'random_rf_pfn_model_62', 'random_rf_pfn_model_63', 'random_tabpfn_model_64', 'random_rf_pfn_model_65', 'random_rf_pfn_model_66', 'random_rf_pfn_model_67', 'random_tabpfn_model_68', 'random_tabpfn_model_69', 'random_tabpfn_model_70', 'random_rf_pfn_model_71', 'random_tabpfn_model_72', 'random_rf_pfn_model_73', 'random_rf_pfn_model_74', 
'random_rf_pfn_model_75', 'random_rf_pfn_model_76', 'random_rf_pfn_model_77', 'random_rf_pfn_model_78', 'random_tabpfn_model_79', 'random_rf_pfn_model_80', 'random_tabpfn_model_81', 'random_tabpfn_model_82', 'random_tabpfn_model_83', 'random_tabpfn_model_84', 'random_tabpfn_model_85', 'random_tabpfn_model_86', 'random_rf_pfn_model_87', 'random_rf_pfn_model_88', 'random_tabpfn_model_89', 'random_tabpfn_model_90', 'random_rf_pfn_model_91', 'random_rf_pfn_model_92', 'random_rf_pfn_model_93', 'random_tabpfn_model_94', 'random_tabpfn_model_95', 'random_tabpfn_model_96', 'random_tabpfn_model_97', 'random_rf_pfn_model_98', 'random_rf_pfn_model_99']
2025-02-05 20:44:52 INFO Starting 80-repeated holdout validation with holdout_frac=0.33.
2025-02-05 20:44:52 INFO Set time limit to 2500 seconds. We will early stop validation if needed.
2025-02-05 20:44:52 INFO Yield data for model default_tabpfn_model_0 and split 0 (repeat=1).
2025-02-05 20:45:43 INFO Yield data for model random_tabpfn_model_1 and split 0 (repeat=1).
2025-02-05 20:46:00 INFO Yield data for model random_rf_pfn_model_2 and split 0 (repeat=1).
2025-02-05 20:46:00 INFO Using default preset for Post Hoc Ensemble.
2025-02-05 20:46:00 INFO Using categorical_feature_indices: [2]
2025-02-05 20:46:00 INFO Using task type: regression
2025-02-05 20:46:00 INFO Obtaining TabPFN models from a random portfolio.

An error occurred: Buffer dtype mismatch, expected 'const float32_t' but got 'double'

Full traceback:
Traceback (most recent call last):
  File "C:\Users\User\AppData\Local\Temp\ipykernel_6384\507384520.py", line 75, in run_tabpfn_tuned
    model.fit(X_train_np, y_train_np)
  File "C:\Users\User\tabpfn-extensions\src\tabpfn_extensions\post_hoc_ensembles\sklearn_interface.py", line 222, in fit
    self.predictor_.fit(
  File "C:\Users\User\tabpfn-extensions\src\tabpfn_extensions\post_hoc_ensembles\pfn_phe.py", line 333, in fit
    self._ens_model.fit(X, y)
  File "C:\Users\User\tabpfn-extensions\src\tabpfn_extensions\post_hoc_ensembles\greedy_weighted_ensemble.py", line 234, in fit
    weights = self.get_weights(X, y)
  File "C:\Users\User\tabpfn-extensions\src\tabpfn_extensions\post_hoc_ensembles\greedy_weighted_ensemble.py", line 173, in get_weights
    oof_proba = self.get_oof_per_estimator(X, y)
  File "C:\Users\User\tabpfn-extensions\src\tabpfn_extensions\post_hoc_ensembles\abstract_validation_utils.py", line 372, in get_oof_per_estimator
    self._fill_predictions_in_place(
  File "C:\Users\User\tabpfn-extensions\src\tabpfn_extensions\post_hoc_ensembles\abstract_validation_utils.py", line 127, in _fill_predictions_in_place
    base_model.fit(fold_X_train, fold_y_train)
  File "C:\Users\User\tabpfn-extensions\src\tabpfn_extensions\rf_pfn\SklearnBasedRandomForestTabPFN.py", line 98, in fit
    super().fit(X, y)
  File "C:\Users\User\anaconda3\envs\ml_testing\lib\site-packages\sklearn\base.py", line 1351, in wrapper
    return fit_method(estimator, *args, **kwargs)
  File "C:\Users\User\anaconda3\envs\ml_testing\lib\site-packages\sklearn\ensemble\_forest.py", line 377, in fit
    estimator._compute_missing_values_in_feature_mask(
  File "C:\Users\User\anaconda3\envs\ml_testing\lib\site-packages\sklearn\tree\_classes.py", line 228, in _compute_missing_values_in_feature_mask
    missing_values_in_feature_mask = _any_isnan_axis0(X)
  File "sklearn\tree\_utils.pyx", line 450, in sklearn.tree._utils._any_isnan_axis0
ValueError: Buffer dtype mismatch, expected 'const float32_t' but got 'double'

Steps/Code to Reproduce

No response

Expected Results

No response

Actual Results

No response

Versions

Mccti078 added the bug label Feb 5, 2025
LeoGrin (Collaborator) commented Feb 6, 2025

Hi @Mccti078!
Thanks for the report :) Would you be able to share a minimal reproducible example?

Mccti078 (Author) commented Feb 8, 2025

Hello @LeoGrin, yes, absolutely. I have a Jupyter notebook and a dataset I can share with you. Perhaps I could email them to you?

LeoGrin (Collaborator) commented Feb 10, 2025

Hey @Mccti078, thanks a lot! This should actually be fixed by PriorLabs/tabpfn-community#23. Could you upgrade your tabpfn-extensions package and try again?
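
For anyone reading the traceback above: the message says a scikit-learn tree helper expected a float32 buffer but received a float64 ("double") array. The sketch below only illustrates that dtype difference with NumPy. The helper name `as_float32` and the sample array are hypothetical, and this is not the fix from the linked PR. Whether casting your own input avoids the error is an assumption that depends on how the extension handles the data internally, so upgrading as suggested above remains the first thing to try.

```python
import numpy as np


def as_float32(X):
    """Hypothetical helper: return X as a C-contiguous float32 array."""
    return np.ascontiguousarray(X, dtype=np.float32)


# Illustrative data only; NumPy creates float64 ("double") arrays by default.
X_train_np = np.array([[1.0, 2.0, np.nan],
                       [4.0, 5.0, 6.0]])
print(X_train_np.dtype)  # float64

# Cast before handing the array to tree-based code that expects float32.
X_train_32 = as_float32(X_train_np)
print(X_train_32.dtype)  # float32

# Missing values survive the cast, so NaN checks still see them.
print(bool(np.isnan(X_train_32).any()))  # True
```

Printing `X_train_np.dtype` right before `model.fit(...)` is a quick way to confirm which dtype is actually being passed in.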
