Plots for categorical splits don't show named categories #9927

Open
david-cortes opened this issue Dec 26, 2023 · 3 comments

@david-cortes
Contributor

When plotting a tree that has categorical splits, the plot labels each split with the numeric category codes rather than the category names:

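import numpy as np
import xgboost as xgb

# Three integer-coded categorical features with 4 levels each (codes 0-3)
rng = np.random.default_rng(seed=123)
X = rng.integers(4, size=(100, 3))
y = rng.standard_normal(size=100)

# "c" marks every column as categorical
dm = xgb.DMatrix(
    data=X,
    label=y,
    feature_types=["c"] * 3
)
model = xgb.train(
    dtrain=dm,
    params={
        "tree_method": "hist",
        "max_depth": 2
    },
    num_boost_round=3
)
# Plots the first tree; the split labels show category codes, not names
xgb.plot_tree(model)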

[image: output of xgb.plot_tree(model)]

Categorical features typically have named categories. It would be quite helpful to show those names on the plots instead of the numbers, which are not easy to map mentally to a given category.

For this, I guess a potential solution could be to add an additional DMatrix/Booster string attribute called "categorical_names" or similar, like the existing "feature_names".
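As a rough sketch of what such an attribute could hold: the per-feature category names could be serialized to a single string and parsed back into a code-to-name lookup. The name `categorical_names`, the JSON layout, the helper functions and the example category names are all assumptions made for illustration; nothing in XGBoost defines them.

```python
import json


def encode_categorical_names(names_by_feature):
    """Serialize {feature name: [category names ordered by code]} to a string."""
    return json.dumps(names_by_feature, sort_keys=True)


def decode_categorical_names(attr):
    """Parse the string back into {feature name: {code: category name}}."""
    raw = json.loads(attr)
    return {feat: dict(enumerate(names)) for feat, names in raw.items()}


# Hypothetical names for the three features of the example above
# (f0, f1, f2 are the default feature names when none are given).
names = {
    "f0": ["red", "green", "blue", "yellow"],
    "f1": ["low", "mid", "high", "max"],
    "f2": ["a", "b", "c", "d"],
}
attr = encode_categorical_names(names)    # a plain string
lookup = decode_categorical_names(attr)   # code -> name, per feature
print(lookup["f0"][2])
```

A string like this could probably already be attached to a trained model with `Booster.set_attr`, but the plotting code would not know to read it, so the plots would stay the same until the core library supports it.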

@trivialfis
Member

XGBoost takes encoded categories instead of raw data; as a result, it has no names for them. We need to think of a way to pass the information from the encoder to XGBoost.
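To illustrate the gap: the names exist only on the encoder side, so any relabeling has to go back through the encoder's category list. Below is a minimal sketch with a hand-rolled ordinal encoder and a made-up split label. The `f0:{1,3}` form is an assumption about how a categorical condition is printed, not a confirmed format, and all helper names are hypothetical.

```python
import re


def fit_ordinal_encoder(values):
    """Return the sorted category names; the position in the list is the code."""
    return sorted(set(values))


def encode(values, categories):
    """Map raw category names to their integer codes."""
    code_of = {name: code for code, name in enumerate(categories)}
    return [code_of[v] for v in values]


def relabel_split(label, categories_by_feature):
    """Replace integer codes in a label like 'f0:{1,3}' with category names.

    Labels that do not match the assumed form, or that refer to a feature
    without known categories, are returned unchanged.
    """
    match = re.fullmatch(r"(\w+):\{(\d+(?:,\d+)*)\}", label)
    if match is None:
        return label
    feature = match.group(1)
    categories = categories_by_feature.get(feature)
    if categories is None:
        return label
    names = [categories[int(code)] for code in match.group(2).split(",")]
    return feature + ":{" + ",".join(names) + "}"


raw = ["blue", "red", "green", "red", "yellow"]
categories = fit_ordinal_encoder(raw)   # only the encoder knows these names
codes = encode(raw, categories)         # this is all the model ever sees
print(relabel_split("f0:{1,3}", {"f0": categories}))
```

The point of the sketch is that `categories` has to travel alongside the model somehow; once only `codes` are passed in, the names cannot be recovered.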

@Tomtomgo

@david-cortes did you end up finding a solution?

@david-cortes
Contributor Author

> @david-cortes did you end up finding a solution?

No. See #11088
Any potential solution on the R side will have to wait until such features are added to the core library.
