---
title: RAPIDS + XGBoost
description: Learn How to Deploy XGBoost on GPUs
tagline: Deploy XGBoost on GPUs
button_text: XGBOOST.IO
button_link:
layout: default
---
![xgboost]({{ site.baseurl }}{% link /assets/images/xgboost_logo.png %}){: .projects-logo}
{: .section-title-full}
{% capture intro_content %}
XGBoost is a well-known gradient boosted decision trees (GBDT) machine learning package used to tackle regression, classification, and ranking problems. It’s written in C++ and NVIDIA CUDA® with wrappers for Python, R, Java, Julia, and several other popular languages. XGBoost now includes seamless, drop-in GPU acceleration, which significantly speeds up model training and improves accuracy for better predictions. {: .subtitle}
The RAPIDS team works closely with the Distributed Machine Learning Community (DMLC) XGBoost organization to upstream code and ensure that all components of the GPU-accelerated analytics ecosystem work smoothly together. {: .subtitle}
{% endcapture %}
{% include section-single.html background="background-white" padding-top="0em" padding-bottom="1em" content-single=intro_content %}
{% capture start_left %}
{: .section-title-halfs} The project is well supported and documented by many tutorials, quick-start guides, and papers.
Try out XGBoost now, with cuDF and other RAPIDS libraries.
Try with Colaboratory {: target="_blank"}
To see how XGBoost integrates with cuDF, Dask, and the entire RAPIDS ecosystem, check out these RAPIDS notebooks{: target="_blank"} which walk through classification and regression examples.
{% endcapture %} {% capture start_right %}
{: .section-subtitle-top-1} Access current installation instructions, guides, FAQs, and more in the latest documentation{: target="_blank"}.
Take a deep dive into XGBoost’s algorithms with Tianqi Chen and Carlos Guestrin in their XGBoost Paper{: target="_blank"}.
Learn about the XGBoost algorithms used on GPUs in these blogs from Rory Mitchell, a RAPIDS team member and core XGBoost contributor.
Gradient Boosting, Decision Trees and XGBoost with CUDA{: target="_blank"}
Updates to the XGBoost GPU algorithms{: target="_blank"}
Bias Variance Decompositions using XGBoost{: target="_blank"}
{% endcapture %} {% include section-halfs.html background="background-white" padding-top="3em" padding-bottom="10em" content-left-half=start_left content-right-half=start_right %}
{% capture metric_title%}
{% endcapture %} {% capture metric_left %} ![xgboost]({{ site.baseurl }}{% link /assets/images/XGboost-benchmark.png %}){: .full-image-center} {% endcapture %} {% capture metric_right %}
XGBoost has integrated support to run across multiple GPUs, which can deliver even more significant performance improvements. For the 113-million-row airline dataset used in the gradient boosting machines (GBM) benchmarks suite, eight NVIDIA® Tesla® V100 GPUs completed training in 42.6 seconds, compared to over 39 minutes on eight CPUs—a 54.9X speedup.
You can run GBM benchmarking scripts from this GitHub repository{: target="_blank"} to measure performance on your own system and compare it to various GBM/GBDT implementations.
{% endcapture %} {% include slopecap.html background="background-gray" position="top" slope="down" %} {% include section-single.html background="background-gray" padding-top="5em" padding-bottom="1em" content-single=metric_title %} {% include section-halfs.html background="background-gray" padding-top="0em" padding-bottom="10em" content-left-half=metric_left content-right-half=metric_right %}
{% capture deploy_single %}
It’s easy to work across multiple GPUs and multiple nodes with distributed Dask and Apache Spark. {: .subtitle}
{% endcapture %} {% capture deploy_left %}
To take advantage of multiple GPU-accelerated nodes, you can use XGBoost’s native Dask integration. This distributes data, builds DMatrix objects, and sets up cross-node communication to run XGBoost training on a cluster; a short sketch is shown below. This blog post covers the XGBoost Dask API in more detail, including usage and performance. The official XGBoost repository{: target="_blank"} includes simple examples with distributed Dask and also more detailed API documentation{: target="_blank"}.
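As a rough illustration (not taken from this page), here is a minimal sketch of training with the xgboost.dask API, assuming a Dask cluster is already running; the scheduler address and data are placeholders.
import xgboost as xgb
import dask.array as da
from dask.distributed import Client
client = Client('scheduler-address:8786')  # placeholder: point this at your own Dask scheduler
X = da.random.random((100_000, 20), chunks=(10_000, 20))  # stand-in for your distributed feature data
y = da.random.random(100_000, chunks=10_000)
dtrain = xgb.dask.DaskDMatrix(client, X, y)  # data stays partitioned across the workers
output = xgb.dask.train(client, {'tree_method': 'gpu_hist'}, dtrain, num_boost_round=100)
booster = output['booster']  # trained model; per-round metrics are in output['history']
{% endcapture %} {% capture deploy_mid %}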
XGBoost supports a Java API, called XGBoost4J{: target="_blank"}. As of release 1.2, the XGBoost4J JARs include GPU support in the pre-built xgboost4j-spark-gpu JARs.
The team is continuing to work on deeper integration with the Spark ecosystem. Learn more in this [devblog post](https://news.developer.nvidia.com/gpu-accelerated-spark-xgboost/){: target="_blank"}.
{% endcapture %} {% capture deploy_right %}
With Dask-CUDA{: target="_blank"}, running across multiple GPUs on a single machine is easy. Two lines of code can spin up a LocalCUDACluster and parallelize ETL as well as training. See the Dask-CUDA docs{: target="_blank"} for more details.
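As a rough sketch of those two lines (illustrative only, with imports added for completeness):
from dask_cuda import LocalCUDACluster
from dask.distributed import Client
cluster = LocalCUDACluster()  # starts one Dask worker per GPU visible on this machine
client = Client(cluster)  # pass this client to xgb.dask for training and to Dask-based ETL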
NOTE: Older versions of XGBoost supported a thread-based "single-node, multi-GPU" pattern with the n_gpus parameter. This parameter is now deprecated, and we encourage all users to shift to Dask or Spark for more scalable and maintainable multi-GPU training.
{% endcapture %} {% include slopecap.html background="background-purple" position="top" slope="up" %} {% include section-single.html background="background-purple" padding-top="5em" padding-bottom="0em" content-single=deploy_single %} {% include section-thirds.html background="background-purple" padding-top="0em" padding-bottom="10em" content-left-third=deploy_left content-middle-third=deploy_mid content-right-third=deploy_right %} {% include slopecap.html background="background-purple" position="bottom" slope="down" %}
{% capture download_single %}
The RAPIDS team is developing GPU enhancements to open-source XGBoost, working closely with the DMLC XGBoost organization to improve the larger ecosystem. Since RAPIDS iterates ahead of upstream XGBoost releases, some enhancements will be available earlier from the RAPIDS branch{: target="_blank"} or from RAPIDS-provided installers. {: .subtitle}
For the latest prerequisites and supported versions, check out our Getting Started page{: target="_blank"}.
{% endcapture %} {% capture download_left %}
The RAPIDS conda metapackage includes a recent snapshot of XGBoost by default. This package is released on the same schedule as other RAPIDS packages and is tested for full compatibility. You can find the latest install options with our RAPIDS Release Selector{: target="_blank"}.
{: .section-subtitle-top-2}
Install using Docker (the latest RAPIDS release). RAPIDS provides Docker images that include a recent version of GPU-accelerated XGBoost. Just follow the Docker installation instructions on our RAPIDS Release Selector page, and you can start using XGBoost right away from a notebook or the command line.
{% endcapture %} {% capture download_right %}
Install using pip or other methods (the default upstream version). The default open-source XGBoost packages already include GPU support, Dask integration, and the ability to load data from a cuDF DataFrame. Follow the XGBoost instructions to install from source or use:
pip install xgboost
NOTE: Full RAPIDS integration first appeared in release 1.0 of XGBoost. Older pip packages will not include cuDF support.
{% endcapture %} {% include section-single.html background="background-white" padding-top="5em" padding-bottom="0em" content-single=download_single %} {% include section-halfs.html background="background-white" padding-top="3em" padding-bottom="2em" content-left-half=download_left content-right-half=download_right %}
{% capture config_single %}
With only a few minor code changes, you’ll be training models on a supercharged XGBoost. {: .subtitle}
{% endcapture %} {% capture config_left %}
If you haven’t developed your model yet, the best place to start is XGBoost's Getting Started documentation{: target="_blank"}. If you have existing code that trains models on the CPU, converting it to run on GPUs is simple.
{: .section-subtitle-top-2}
Similar configuration options apply to R, Java, and Julia wrappers. The XGBoost Documentation{: target="_blank"} and XGBoost GPU Support{: target="_blank"} pages contain much more information on configuring and running models and on GPU-specific options and algorithms.
{% endcapture %}
{% capture config_right %}
When training a model with XGBoost, you specify a dictionary of training parameters. If you set the tree_method parameter to gpu_hist, XGBoost will run on your GPU.
For example, if your old code in Python looks like:
params = {'max_depth': 3, 'learning_rate': 0.1}
dtrain = xgb.DMatrix(X, y)
xgb.train(params, dtrain)
Change it to:
params = {'tree_method': 'gpu_hist', 'max_depth': 3, 'learning_rate': 0.1}
dtrain = xgb.DMatrix(X, y)
xgb.train(params, dtrain)
{% endcapture %}
{% include section-single.html background="background-white" padding-top="2em" padding-bottom="0em" content-single=config_single %} {% include section-halfs.html background="background-white" padding-top="0em" padding-bottom="5em" content-left-half=config_left content-right-half=config_right %}
{% capture df_single %}
The RAPIDS team is contributing to the XGBoost project and integrating new features to better optimize GPU performance. {: .subtitle}
{% endcapture %} {% capture df_left %}
The RAPIDS project has developed a seamless bridge between cuDF DataFrames, the primary data structure in RAPIDS, and DMatrix, XGBoost’s data structure. The DMatrix is built directly from the GPU DataFrame with no need to copy data through host memory. Starting with XGBoost 1.0, GPU data from CuPy and any other GPU array library that supports the __cuda_array_interface__ API can also be used directly to build a DMatrix.
{% endcapture %} {% capture df_right %}
To create a DMatrix from a cuDF DataFrame, just pass the data frames to the constructor:
import xgboost as xgb
import cudf
train_X_cudf = cudf.DataFrame(...)
train_y_cudf = cudf.Series(...)
dmatrix = xgb.DMatrix(train_X_cudf, label=train_y_cudf)
The package will automatically convert from cuDF’s format to XGBoost’s DMatrix format, keeping the data in GPU memory.
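As an illustrative sketch (not from the original page), a CuPy array can be passed to the constructor in the same way, since CuPy implements the __cuda_array_interface__ API:
import xgboost as xgb
import cupy as cp
X_gpu = cp.random.rand(1000, 10)  # example feature matrix already resident in GPU memory
y_gpu = cp.random.rand(1000)
dmatrix = xgb.DMatrix(X_gpu, label=y_gpu)  # built directly from GPU data, no copy through host memory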
{% endcapture %} {% include section-single.html background="background-white" padding-top="2em" padding-bottom="0em" content-single=df_single %} {% include section-halfs.html background="background-white" padding-top="0em" padding-bottom="10em" content-left-half=df_left content-right-half=df_right %}
{% capture end_bottom %}
{: .section-title-full .text-white}
{% endcapture %} {% include slopecap.html background="background-darkpurple" position="top" slope="up" %} {% include section-single.html background="background-darkpurple" padding-top="1em" padding-bottom="0em" content-single=end_bottom %}
{% include cta-footer-xgboost.html background="background-darkpurple" %}