Issue with importing sklearn (missing) and creating python env through Basilisk #9

Open
laurensvdwiel opened this issue Nov 30, 2024 · 0 comments

Dear Dev team,

I am following the guide at https://dzhang32.github.io/dasper/articles/dasper.html

When I reach the code block at https://dzhang32.github.io/dasper/articles/dasper.html#running-dasper, I receive the following error on my HPC setup:

Error: BiocParallel errors
  2 remote errors, element index: 1, 2
  0 unevaluated and other errors
  first remote error:
Error in py_get_attr(x, name, FALSE): AttributeError: module 'sklearn' has no attribute 'ensemble'
Run `reticulate::py_last_error()` for details.

The issue appears to be twofold: I cannot supply a user-provided Python environment, and basilisk is not working as intended at:

dasper/R/utils.R

Lines 264 to 289 in ec5f82a

.outlier_score <- function(features, ...) {
    cl <- basilisk::basiliskStart(env_sklearn)
    outlier_scores <- basilisk::basiliskRun(cl, function() {
        sklearn <- reticulate::import("sklearn")
        od_model <- sklearn$ensemble$IsolationForest()
        od_model <- od_model$set_params(...)
        od_model_params <- od_model$get_params()
        od_model <- od_model$fit(features)
        outlier_scores <- od_model$decision_function(features)
        suppressWarnings(
            print(stringr::str_c(
                Sys.time(), " - fitting outlier detection model with parameters: ",
                stringr::str_c(names(od_model_params), "=", unname(od_model_params)) %>%
                    stringr::str_c(collapse = ", ")
            ))
        )
        return(outlier_scores)
    })
    basilisk::basiliskStop(cl)
    return(outlier_scores)
}

I have created a local, user-specific (and yes, horrible) fix, but at least I am now able to run the code. For the local fix, I changed the function to:

.outlier_score <- function(features, ...) {
    reticulate::use_python("/home/bin/python3")
    sklearn <- reticulate::import("sklearn")
    od_model <- sklearn$ensemble$IsolationForest()
    od_model <- od_model$set_params(...)
    od_model_params <- od_model$get_params()
    od_model <- od_model$fit(features)
    outlier_scores <- od_model$decision_function(features)

    suppressWarnings(
        print(stringr::str_c(
            Sys.time(), " - fitting outlier detection model with parameters: ",
            stringr::str_c(names(od_model_params), "=", unname(od_model_params)) %>%
                stringr::str_c(collapse = ", ")
        ))
    )

    return(outlier_scores)
}

Note that my Python installation already has pandas and sklearn installed, but in newer versions. Forcing reticulate to use that interpreter and removing all of the cl/basilisk code provides a temporary fix.
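A less invasive variant might keep basilisk and import the submodule explicitly. The `AttributeError` is consistent with a scikit-learn version in which `import sklearn` alone does not load `sklearn.ensemble`, although that is only a guess at the cause. The sketch below is untested against dasper: the name `.outlier_score_explicit` is hypothetical, `env_sklearn` is assumed to be the basilisk environment object defined inside the package, and passing `features` and `...` through `basiliskRun()` is a design choice that differs from the original closure-based code.

```r
# Hypothetical variant of .outlier_score(); a sketch, not the maintainers' fix.
# Assumes `env_sklearn` is the basilisk environment defined in dasper.
.outlier_score_explicit <- function(features, ...) {
    cl <- basilisk::basiliskStart(env_sklearn)
    # Stop the basilisk process even if the Python call below errors.
    on.exit(basilisk::basiliskStop(cl), add = TRUE)

    basilisk::basiliskRun(cl, function(features, ...) {
        # Import the submodule directly instead of relying on
        # `sklearn$ensemble` being attached to the top-level module.
        ensemble <- reticulate::import("sklearn.ensemble")
        od_model <- ensemble$IsolationForest()
        od_model <- od_model$set_params(...)
        od_model <- od_model$fit(features)
        # Value of the last expression is returned to the calling R session.
        od_model$decision_function(features)
    }, features = features, ...)
}
```

If the same error persists with the explicit import, the basilisk environment itself is more likely at fault than the import style, and `reticulate::py_config()` run inside the `basiliskRun()` call would show which interpreter is actually in use.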
