---
title: Getting Started
description: Get started with RAPIDS using conda or docker.
tagline: Get RAPIDS Now
button_text: SELECT RELEASE
button_link: "#get-rapids"
layout: default
---

Getting Started

{: .section-title-full}

{% capture intro %} The RAPIDS data science framework is a collection of libraries for executing end-to-end data science pipelines entirely on the GPU. {: .subtitle}

{% endcapture %} {% include section-single.html background="background-white" padding-top="0em" padding-bottom="0em" content-single=intro %}

{% capture start_left %}

GPU Accelerated Data Science

RAPIDS uses optimized NVIDIA CUDA®{: target="_blank"} primitives and high-bandwidth GPU memory to accelerate data preparation and machine learning. The goal of RAPIDS is not only to accelerate the individual parts of the typical data science workflow, but to accelerate the complete end-to-end workflow.

RAPIDS is designed to have a familiar look and feel to data scientists working in Python. Here's a code snippet that reads a CSV file and prints the mean of each column:

import cudf
df = cudf.read_csv('path/to/file.csv')
for column in df.columns:
    print(df[column].mean())

{% endcapture %} {% capture start_right %}

Test Drive RAPIDS Now

Jump right into a GPU powered RAPIDS notebook, online, with SageMaker Studio Lab (free account required):

Studio Lab

{% endcapture %} {% capture gs_overview %}

Installation Overview

{: .section-title-full}

In four steps, install RAPIDS with either Conda or Docker on a local system or cloud instance that has a CUDA-enabled GPU, then explore our user guides and examples. Experimental pip packages are also available!

{% endcapture %} {% capture gs_left %}

  • Check system requirements
  • Choose a cloud or local system

{% endcapture %} {% capture gs_right%}

  • Select and install RAPIDS libraries
  • Check out examples and user guides

{% endcapture %} {% include section-halfs.html background="background-white" padding-top="1em" padding-bottom="3em" content-left-half=start_left content-right-half=start_right %} {% include section-single.html background="background-white" padding-top="3em" padding-bottom="0em" content-single=gs_overview %} {% include section-halfs.html background="background-white" padding-top="1em" padding-bottom="10em" content-left-half=gs_left content-right-half=gs_right %}

{% capture prov %} # Step 1: Provision A System {: .section-title-full}

{: .section-title-halfs} {% endcapture %} {% capture req_left%}

System Requirements

All provisioned systems need to be RAPIDS capable. Here's what is required:

GPU: NVIDIA Pascal™ or better with compute capability{: target="_blank"} 6.0+ More details {: target="_blank"}

OS: One of the following OS versions:

  • Ubuntu 18.04/20.04 or CentOS 7 / Rocky Linux 8 with gcc/g++ 9.0+ {: .no-tb-margins }
  • Windows 11 using WSL2. See separate install guide {: target="_blank"} {: .no-tb-margins }
  • RHEL 7/8 support is provided through CentOS 7 / Rocky Linux 8 builds/installs

CUDA & NVIDIA Drivers: One of the following supported versions: {: .no-tb-margins }

  • 11.2{: target="_blank"} & v460.27.03+
  • 11.4{: target="_blank"} & v470.42.01+
  • 11.5{: target="_blank"} & v495.29.05+
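The requirements above can be expressed as a small check. This is a minimal sketch, not an official RAPIDS tool: the function names are hypothetical, and it assumes you have already read the CUDA version, driver version, and compute capability yourself (for example from `nvidia-smi` and the compute capability page linked above). It does not query the GPU.

```python
# Minimum NVIDIA driver per supported CUDA version, copied from the list above.
MIN_DRIVER_FOR_CUDA = {
    "11.2": "460.27.03",
    "11.4": "470.42.01",
    "11.5": "495.29.05",
}
MIN_COMPUTE_CAPABILITY = (6, 0)  # Pascal or better


def parse_version(text):
    """Turn a dotted version such as '470.42.01' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_requirements(cuda_version, driver_version, compute_capability):
    """Return True if the CUDA/driver pair and GPU satisfy the listed minimums.

    All three arguments are strings supplied by the caller (hypothetical
    inputs for illustration).
    """
    minimum = MIN_DRIVER_FOR_CUDA.get(cuda_version)
    if minimum is None:
        return False  # CUDA version is not in the supported list
    driver_ok = parse_version(driver_version) >= parse_version(minimum)
    gpu_ok = parse_version(compute_capability) >= MIN_COMPUTE_CAPABILITY
    return driver_ok and gpu_ok


print(meets_requirements("11.5", "495.29.05", "7.0"))  # True
print(meets_requirements("11.2", "450.80.02", "7.0"))  # False: driver too old
```

Versions are compared as integer tuples rather than strings so that, for example, `495.29.05` sorts correctly against `1000.0.0`.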

{% endcapture %} {% capture req_mid %}

RAPIDS Cloud Systems

Learn how to deploy RAPIDS on
Cloud Service Providers {: target="_blank"}

AWS

Azure ML

GCP

Paperspace

{% endcapture %} {% capture req_right %}

RAPIDS Local Systems

Aside from the system requirements, other considerations for best performance include: {: .no-tb-margins }

  • SSD drive (NVMe preferred)
  • Approximately 2:1 ratio of host RAM to total GPU Memory (especially useful for Dask)
  • NVLink when using 2 or more GPUs
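As a rough illustration of the 2:1 guideline above, the arithmetic can be sketched as follows. The function name and the example hardware are hypothetical; the ratio is the approximate figure from the list, not a hard requirement.

```python
def recommended_host_ram_gb(gpu_memory_gb, ratio=2.0):
    """Suggested host RAM (GB) for a list of per-GPU memory sizes (GB)."""
    return ratio * sum(gpu_memory_gb)


# Hypothetical workstation with two 16 GB GPUs.
print(recommended_host_ram_gb([16, 16]))  # 64.0
```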

We suggest taking a look at the sample workflow in our Docker container, which shows how straightforward it is to run a basic XGBoost model training and testing workflow with RAPIDS. {% endcapture %} {% include slopecap.html background="background-gray" position="top" slope="down" %} {% include section-single.html background="background-gray" padding-top="3em" padding-bottom="0em" content-single=prov %} {% include section-thirds.html background="background-gray" padding-top="0em" padding-bottom="10em" content-left-third=req_left content-middle-third=req_mid content-right-third=req_right %}

{% capture env_overview %} # Step 2: Install Environment {: .section-title-full }

For most installations, you will need a Conda or Docker environment installed for RAPIDS. Note that these examples are written for Ubuntu; modify them as appropriate for CentOS / Rocky Linux. Windows 11 has a separate installation guide.

{% endcapture %} {% include slopecap.html background="background-white" position="top" slope="up" %} {% include section-single.html background="background-white" padding-top="3em" padding-bottom="1em" content-single=env_overview %}

{% capture env_right %}

Docker

{: .section-title-halfs}

RAPIDS requires both Docker CE v19.03+ and nvidia-container-toolkit{: target="_blank"} installed. {: .no-tb-margins }

  • Legacy Support: Docker CE v17-18 and nvidia-docker2{: target="_blank"}

1. Download and Install. Copy the command below to download and install the latest Docker CE:

curl https://get.docker.com | sh

{: .margin-bottom-3em}

2. Install the Latest NVIDIA Docker. For example, on Ubuntu:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
curl -s -L https://nvidia.github.io/libnvidia-container/experimental/$distribution/libnvidia-container-experimental.list | sudo tee /etc/apt/sources.list.d/libnvidia-container-experimental.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2

{: .margin-bottom-3em}

3. Start Docker. In a new terminal window, run:

sudo service docker stop
sudo service docker start

{: .margin-bottom-3em}

4a. Test NVIDIA Docker:

docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark

{: .margin-bottom-3em}

4b. Legacy Docker Users. Docker CE v18 & nvidia-docker2{: target="_blank"} users will need to replace the following for compatibility:

'docker run --gpus all' with 'docker run --runtime=nvidia'

{: .margin-bottom-3em}
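The substitution in steps 4a and 4b can be sketched in Python. This is a minimal illustration with hypothetical function names; it assumes you pass in the Docker CE version string yourself (for example, as reported by `docker --version`) and only builds the argument list without running anything.

```python
def gpu_flags(docker_version):
    """Return the `docker run` GPU arguments for a Docker CE version string.

    19.03 and later use `--gpus all`; older releases paired with
    nvidia-docker2 use `--runtime=nvidia`, as described in step 4b.
    """
    major, minor = (int(part) for part in docker_version.split(".")[:2])
    if (major, minor) >= (19, 3):
        return ["--gpus", "all"]
    return ["--runtime=nvidia"]


def nbody_test_command(docker_version):
    """Build the step 4a test command for the given Docker version."""
    return (
        ["docker", "run"]
        + gpu_flags(docker_version)
        + ["nvcr.io/nvidia/k8s/cuda-sample:nbody", "nbody", "-gpu", "-benchmark"]
    )


# A legacy Docker CE v18 host gets the --runtime=nvidia form.
print(" ".join(nbody_test_command("18.09.9")))
```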

{% endcapture %}

{% capture env_left %}

Conda

{: .section-title-halfs}

RAPIDS can use either a minimal conda installation with Miniconda{: target="_blank"} or a full installation of Anaconda{: target="_blank"}. Below is a quick installation guide using Miniconda.

1. Download and Run Install Script. Copy the command below to download and run the Miniconda install script:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

{: .margin-bottom-3em}

2. Customize Conda and Run the Install. Use the terminal window to finish installation. Note, we recommend enabling conda-init.

3. Start Conda. Open a new terminal window, which should now show Conda initialized. {: .padding-bottom-3em }

Build From Source

{: .section-title-halfs}

To build RAPIDS from source, check each library's README. For example, the cuDF README{: target="_blank"} has details for source environment setup and build instructions. Further links are provided in the selector tool. If additional help is needed, reach out on our Slack Channel. {: .padding-bottom-3em }

Where is PIP?

{: .section-title-halfs}

Pip installation of RAPIDS is back! You can try our experimental pip packages here

{% endcapture %} {% include section-halfs.html background="background-white" padding-top="1em" padding-bottom="10em" content-left-half=env_left content-right-half=env_right %}

{% capture selector_header %} # Step 3: Install RAPIDS {: .section-title-full}

RAPIDS is available in conda packages, docker images, and from source builds. Use the tool below to select your preferred method, packages, and environment to install RAPIDS. Certain combinations may not be possible and are dimmed automatically. Be sure you've met the System Requirements above and see the Next Steps below. {: .padding-bottom-3em }

Release Selector

{: .section-title-full}

{% endcapture %} {% include slopecap.html background="background-purple" position="top" slope="up" %} {% include section-single.html background="background-purple" padding-top="3em" padding-bottom="0em" content-single=selector_header %} {% include selector.html background="background-purple" padding-top="0em" padding-bottom="10em" %}

{% capture next_steps %} # Step 4: Learn More {: .section-title-full}

Once installation has been successful, explore the capabilities of RAPIDS with the provided notebooks, tutorials, and guides below.

{% endcapture %} {% include slopecap.html background="background-gray" position="top" slope="up" %} {% include section-single.html background="background-gray" padding-top="5em" padding-bottom="0em" content-single=next_steps %}

{% capture use_left %}

RAPIDS on Conda

{: .section-title-halfs}

Get Example Notebooks

1. Install JupyterLab if neither it nor Jupyter Notebook is already installed.

2. Get Notebooks. See links to the RAPIDS Notebooks and Community Notebooks below.

3. Run RAPIDS. Use Python directly or start JupyterLab as below:

jupyter-lab --allow-root --ip='0.0.0.0' --NotebookApp.token='**your token**'

{: .margin-bottom-3em}

4. Check out the RAPIDS tutorials and workflow examples.

5. Explore. See our integrations or install other favorite Data Science or Machine Learning libraries. {: .padding-bottom-3em }
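For step 3 above, the `**your token**` placeholder has to be replaced with a value of your choosing. The following is a minimal sketch, with a hypothetical helper name, that generates a random token using Python's standard library and assembles the same command with the flags shown in step 3. It only prints the command; it does not start JupyterLab.

```python
import secrets
import shlex


def jupyter_lab_command(token=None, ip="0.0.0.0"):
    """Build the JupyterLab launch command from step 3 as an argument list.

    If no token is given, a random 48-character hex token is generated.
    """
    if token is None:
        token = secrets.token_hex(24)
    return [
        "jupyter-lab",
        "--allow-root",
        f"--ip={ip}",
        f"--NotebookApp.token={token}",
    ]


# Print a copy-and-paste shell command; the token differs on every run.
print(shlex.join(jupyter_lab_command()))
```

Keep the printed token somewhere safe, since you will need it to log in to the notebook server.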

RAPIDS User Guide Repositories

Go to RAPIDS Notebooks{: target="_blank"} or clone directly:

git clone https://github.com/rapidsai/notebooks.git
cd notebooks
git submodule update --init --remote --no-single-branch --depth 1

{: .margin-bottom-3em}

Go to RAPIDS Community Notebooks{: target="_blank"} or clone directly:

git clone https://github.com/rapidsai-community/notebooks-contrib.git

{: .margin-bottom-3em}

Go to Cloud ML Notebooks{: target="_blank"} or clone directly:

git clone https://github.com/rapidsai/cloud-ml-examples.git

{: .margin-bottom-3em}

{% endcapture %} {% capture use_right %}

RAPIDS on Docker

{: .section-title-halfs}

Running Multi-Node / Multi-GPU (MNMG) Environment

To start the container in an MNMG environment:

docker run -t -d --gpus all --shm-size=1g --ulimit memlock=-1 -v $PWD:/ws <container label>

{: .margin-bottom-3em}

The standard docker command may be sufficient, but the additional arguments improve stability. See the NCCL docs{: target="_blank"} and UCX docs{: target="_blank"} for more details on MNMG usage. {: .padding-bottom-3em }
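To make the role of each argument in the MNMG command explicit, here is a minimal Python sketch that assembles the same command as an argument list. The helper name and the image tag `my-rapids-image:latest` are placeholders, not real RAPIDS names; substitute your own container label. Nothing is executed.

```python
import os
import shlex


def mnmg_run_command(image, workdir=None, shm_size="1g"):
    """Build the MNMG `docker run` command shown above as an argument list."""
    workdir = workdir or os.getcwd()
    return [
        "docker", "run",
        "-t",                       # allocate a pseudo-terminal
        "-d",                       # run detached, in the background
        "--gpus", "all",            # expose every GPU to the container
        f"--shm-size={shm_size}",   # size of /dev/shm inside the container
        "--ulimit", "memlock=-1",   # no limit on locked (pinned) memory
        "-v", f"{workdir}:/ws",     # mount the working directory at /ws
        image,
    ]


# "my-rapids-image:latest" is a placeholder, not a real tag.
print(shlex.join(mnmg_run_command("my-rapids-image:latest", workdir="/home/user/project")))
```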

Start / Stop Jupyter Lab Notebooks

Either the standard single GPU or the modified MNMG Docker command above should auto-run a Jupyter Lab Notebook server. If it does not, or a restart is needed, run the following command within the Docker container to launch the notebook server:

bash /rapids/utils/start-jupyter.sh

{: .margin-bottom-3em}

If, for whatever reason, you need to shut down the Jupyter Lab server, use:

bash /rapids/utils/stop-jupyter.sh

{: .margin-bottom-3em}

NOTE: By default, JupyterLab{: target="_blank"} runs on your host machine at port 8888. {: .padding-bottom-3em }
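One way to confirm that the notebook server is reachable is to test whether the port accepts TCP connections. This is a minimal sketch with a hypothetical helper name. It assumes the container's port 8888 is published to the host at `127.0.0.1`, and it only checks that something is listening, not that it is JupyterLab.

```python
import socket


def is_port_open(host="127.0.0.1", port=8888, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print("Port 8888 reachable:", is_port_open())
```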

Explore RAPIDS Demo Notebooks

RAPIDS demo notebooks can be found in the notebooks directory:

/rapids/notebooks/cuml (Machine Learning Algorithms) {: .no-tb-margins }

/rapids/notebooks/cugraph (Graph Analytics) {: .no-tb-margins }

/rapids/notebooks/cuspatial (Spatial Analytics) {: .no-tb-margins }

/rapids/notebooks/cusignal (Signal Analytics) {: .no-tb-margins }

/rapids/notebooks/clx (Cyber Security Log Analytics) {: .no-tb-margins }

/rapids/notebooks/xgboost (XGBoost) {: .no-tb-margins }

You can get more RAPIDS tutorials and workflow examples by git cloning the RAPIDS Community Notebooks{: target="_blank"}. {: .padding-bottom-3em }

Advanced Usage

See the RAPIDS Container README{: target="_blank"} for more information about using custom datasets. Docker Hub{: target="_blank"} and NVIDIA GPU Cloud{: target="_blank"} host RAPIDS containers with a full list of available tags{: target="_blank"}.

{% endcapture %}

{% include section-halfs.html background="background-gray" padding-top="1em" padding-bottom="10em" content-left-half=use_left content-right-half=use_right %}

{% include slopecap.html background="background-darkpurple" position="top" slope="down" %}

{% include cta-footer.html name="INSTALL RAPIDS NOW" button="SELECT RELEASE" link="start.html#get-rapids" %}