[python-package][PySpark] Expose Training and Validation Metrics #11132

Open
ayoub317 opened this issue Dec 29, 2024 · 2 comments · May be fixed by #11133

Comments

@ayoub317
Contributor

In the JVM binding, after training an XGBoost Spark model we can retrieve a summary of the metrics for the training and evaluation sets that were passed in. This functionality is not currently available in the PySpark XGBoost binding.

In the JVM package, the summary is defined by this class. It is obtained while training the model: training first passes through this method and eventually reaches this line. After that, the Spark[Classification|Regression|Ranker]Model is created through a constructor like this one.

I followed a similar approach to expose and retrieve these metrics in the PySpark XGBoost binding, and I will be submitting a PR for review soon. Any feedback is welcome.
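To make the idea concrete, here is a minimal sketch of what a Python-side summary object could look like. It is only an illustration, not the code from the PR: the class name `TrainingSummarySketch`, its fields, and the `"training"`/`"validation"` keys are assumptions. The only thing taken from XGBoost itself is the shape of the `evals_result` dictionary that training fills in, which maps each evaluation set name to a dictionary of metric name to per-iteration values.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Shape of the dict XGBoost fills in during training:
# {eval_set_name: {metric_name: [value_per_iteration, ...]}}
EvalsResult = Dict[str, Dict[str, List[float]]]


@dataclass
class TrainingSummarySketch:
    """Hypothetical container for per-iteration metrics (not the real API)."""

    train_history: Dict[str, List[float]] = field(default_factory=dict)
    validation_history: Dict[str, List[float]] = field(default_factory=dict)

    @classmethod
    def from_evals_result(
        cls,
        evals_result: EvalsResult,
        train_key: str = "training",        # assumed eval set name
        validation_key: str = "validation",  # assumed eval set name
    ) -> "TrainingSummarySketch":
        # Copy the per-metric histories so the summary does not alias
        # the dictionary owned by the training routine.
        return cls(
            train_history=dict(evals_result.get(train_key, {})),
            validation_history=dict(evals_result.get(validation_key, {})),
        )


# Made-up values, only to show the expected input shape.
evals_result = {
    "training": {"logloss": [0.61, 0.52, 0.47]},
    "validation": {"logloss": [0.63, 0.56, 0.53]},
}
summary = TrainingSummarySketch.from_evals_result(evals_result)
```

If no validation set was passed, `validation_history` is simply left empty in this sketch; a real implementation might prefer to signal that case differently.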

@trivialfis
Member

Thank you for the PR. Do you plan to work on the summary class? I think the JVM package is mimicking the Spark ML summary structure; there are similar classes in PySpark ML.

@ayoub317
Contributor Author

ayoub317 commented Dec 29, 2024

Hello Jiaming, thanks for your quick response. I implemented the summary class and pushed it for review.

Indeed! And in PySpark, they implement a wrapper around the JVM one. Here, I implemented everything from scratch because the PySpark XGBoost binding is not a wrapper around the XGBoost JVM package.
