Hi guys, I'm trying out MLflow Recipes for the first time in an Azure Databricks environment. Until yesterday, everything worked from ingestion to prediction. Today, however, I'm running into an error saying that MLflow can't find what looks to me like a temporary file while training my model. I'm not doing anything fancy in any of these steps; I just want to use an LGBMClassifier for training. The MLflow version is 2.7.0, but it doesn't work on any other version I tried either.
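import mlflow
from mlflow.recipes import Recipe

experiment_name = "experiment_name"
# Create the experiment if it does not exist yet, otherwise make it the active one.
if not mlflow.get_experiment_by_name(experiment_name):
    mlflow.create_experiment(name=experiment_name)
else:
    mlflow.set_experiment(experiment_name)
experiment = mlflow.get_experiment_by_name(experiment_name)

# Run the recipe steps one by one with the Databricks profile.
r = Recipe(profile="databricks")
r.clean()
r.inspect()
r.run("ingest")
r.run("split")
r.run("transform")
r.run("train")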
Here's what my estimator function looks like in train.py. estimator_params are defined in recipe.yaml.
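from typing import Any, Dict

def estimator_fn(estimator_params: Dict[str, Any] = None):
    from lightgbm import LGBMClassifier
    # Fall back to LightGBM defaults when recipe.yaml supplies no parameters.
    if estimator_params is None:
        estimator_params = {}
    return LGBMClassifier(**estimator_params)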
As I said, the same code worked fine for me yesterday, but today I'm running into this error:
Run MLFlow Recipe step: train
2023/09/13 11:09:36 INFO mlflow.recipes.step: Running step train...
2023/09/13 11:09:38 INFO mlflow.recipes.steps.train: Class imbalance of 0.50 is better than 0.3, no need to rebalance
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-f235e133-8940-41eb-b389-d9cf570c187a/lib/python3.10/site-packages/mlflow/recipes/step.py", line 132, in run
    self.step_card = self._run(output_directory=output_directory)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-f235e133-8940-41eb-b389-d9cf570c187a/lib/python3.10/site-packages/mlflow/recipes/steps/train.py", line 373, in _run
    logged_estimator = self._log_estimator_to_mlflow(fitted_estimator, X_train)
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-f235e133-8940-41eb-b389-d9cf570c187a/lib/python3.10/site-packages/mlflow/recipes/steps/train.py", line 1270, in _log_estimator_to_mlflow
    return mlflow.sklearn.log_model(
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-f235e133-8940-41eb-b389-d9cf570c187a/lib/python3.10/site-packages/mlflow/sklearn/__init__.py", line 408, in log_model
    return Model.log(
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-f235e133-8940-41eb-b389-d9cf570c187a/lib/python3.10/site-packages/mlflow/models/model.py", line 568, in log
    with TempDir() as tmp:
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-f235e133-8940-41eb-b389-d9cf570c187a/lib/python3.10/site-packages/mlflow/utils/file_utils.py", line 383, in __enter__
    self._path = os.path.abspath(create_tmp_dir())
  File "/local_disk0/.ephemeral_nfs/envs/pythonEnv-f235e133-8940-41eb-b389-d9cf570c187a/lib/python3.10/site-packages/mlflow/utils/file_utils.py", line 830, in create_tmp_dir
    return tempfile.mkdtemp(dir=repl_local_tmp_dir)
  File "/usr/lib/python3.10/tempfile.py", line 507, in mkdtemp
    _os.mkdir(file, 0o700)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/repl_tmp_data/ReplId-68395-9c373-e0490-3/tmpuyeyu8co'
make: *** [Makefile:40: steps/train/outputs/model] Error 1
I really don't know what to do, since the stack trace seems to point to an MLflow-internal error. Any help would be appreciated.
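For reference, the last frames of the traceback (`tempfile.mkdtemp(dir=...)` ending in `os.mkdir`) can be imitated outside MLflow. The sketch below is only an illustration under assumptions: `repl_tmp_dir` is a hypothetical stand-in for the `/tmp/repl_tmp_data/ReplId-...` directory, and `try_mkdtemp` is a helper made up for this example. It suggests that the error would appear whenever the parent directory passed to `mkdtemp` is missing, but it does not explain why that directory would be gone on the cluster.

```python
import os
import shutil
import tempfile


def try_mkdtemp(parent_dir):
    """Return the new temp dir path, or the FileNotFoundError raised by mkdtemp."""
    try:
        return tempfile.mkdtemp(dir=parent_dir)
    except FileNotFoundError as exc:
        return exc


# Hypothetical stand-in for the REPL-local directory seen in the traceback.
base = tempfile.mkdtemp()
repl_tmp_dir = os.path.join(base, "repl_tmp_data", "ReplId-example")

# The parent directory does not exist yet, so mkdtemp cannot create a child in it.
missing_result = try_mkdtemp(repl_tmp_dir)

# Once the parent directory exists, the same call succeeds.
os.makedirs(repl_tmp_dir, exist_ok=True)
ok_result = try_mkdtemp(repl_tmp_dir)

print(type(missing_result).__name__)
print(type(ok_result).__name__)

# Clean up everything created by this sketch.
shutil.rmtree(base)
```

If this matches what happens on the cluster, checking `os.path.isdir(...)` on the directory named in the error message right before the train step might be a quick way to confirm it.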