---
title: RAPIDS + NVIDIA Merlin
description: Learn How to Use RAPIDS with Merlin
tagline: RAPIDS + NVIDIA Merlin
button_text: NVTabular
button_link:
layout: default
---

![NVIDIA]({{ site.baseurl }}{% link /assets/images/NVLogo_2D_H.png %}){: .projects-logo}

# Scale and Accelerate Recommender Systems on GPUs
{: .section-title-full}

{% capture intro_content %}

NVIDIA Merlin is an open source library designed to accelerate recommender systems on NVIDIA GPUs. It enables data scientists, machine learning engineers, and researchers to build high-performing recommenders at scale. Merlin includes tools that address common ETL, training, and inference challenges. Each stage of the Merlin pipeline is optimized to support hundreds of terabytes of data, all accessible through easy-to-use APIs. With Merlin, better predictions and increased click-through rates are within reach.
{: .subtitle}

{% endcapture %}

{% include section-single.html background="background-white" padding-top="0em" padding-bottom="0em" content-single=intro_content %}

{% capture yd_left %}

## Accelerated ETL

As the ETL component of the Merlin ecosystem, NVTabular is a feature engineering and preprocessing library for tabular data. It is designed to quickly and easily manipulate the terabyte-scale datasets used to train deep learning-based recommender systems. NVTabular uses RAPIDS' dask_cudf to perform GPU-accelerated transformations.

Read more about NVTabular’s features {: target="_blank"}
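To illustrate the kind of transformation NVTabular performs, here is a plain-Python, CPU-only stand-in for the idea behind its `Categorify` op, which maps raw categorical values to contiguous integer ids. The real op runs on the GPU via dask_cudf and scales past host memory; the column values below are made up.

```python
# CPU stand-in for the idea behind NVTabular's Categorify op: map each
# categorical value to a contiguous integer id, reserving 0 for missing
# values. (The real op runs on the GPU and scales past host memory.)
def categorify(column):
    mapping = {}
    encoded = []
    for value in column:
        if value is None:
            encoded.append(0)                  # 0 is reserved for missing
            continue
        if value not in mapping:
            mapping[value] = len(mapping) + 1  # ids start at 1
        encoded.append(mapping[value])
    return encoded, mapping

ids, vocab = categorify(["shoes", "hat", "shoes", None, "bag"])
print(ids)    # [1, 2, 1, 0, 3]
print(vocab)  # {'shoes': 1, 'hat': 2, 'bag': 3}
```

Encodings like this are what let embedding tables index categorical features directly during training.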

{% endcapture %} {% capture yd_mid %}

## Accelerated Training

When training deep learning recommender system models, data loading is often the bottleneck. Merlin accelerates training by using RAPIDS' cuDF and Dask-cuDF to read Parquet files asynchronously. This speeds up existing TensorFlow and PyTorch training pipelines, or can be combined with HugeCTR to train deep learning recommender systems written in CUDA C++.

Read more about accelerated training {: target="_blank"}
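The overlap that asynchronous reading buys can be sketched with a small CPU-only producer/consumer loop. This is not Merlin's loader (which stages batches on the GPU with cuDF); it only illustrates why preparing the next batch in the background keeps the training loop from stalling.

```python
import queue
import threading

# Minimal CPU sketch of the prefetching idea behind Merlin's data loaders:
# a background thread prepares upcoming batches while the training loop
# consumes the current one. (Merlin's real loaders do this on the GPU with
# cuDF; this stand-in only shows the producer/consumer overlap.)
def prefetch(batches, buffer_size=2):
    q = queue.Queue(maxsize=buffer_size)
    sentinel = object()

    def producer():
        for batch in batches:
            q.put(batch)        # blocks while the buffer is full
        q.put(sentinel)         # signal that the source is exhausted

    threading.Thread(target=producer, daemon=True).start()
    while (item := q.get()) is not sentinel:
        yield item

# Three toy "batches" stand in for Parquet row groups read from disk.
batches = ([i, i + 1] for i in range(0, 6, 2))
print(list(prefetch(batches)))  # [[0, 1], [2, 3], [4, 5]]
```

The bounded queue is the key design choice: it caps memory use while still letting I/O and compute run concurrently.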

{% endcapture %} {% capture yd_right %}

## Accelerated Inference

NVTabular and HugeCTR both support the Triton Inference Server for GPU-accelerated inference. Triton is open source inference serving software that simplifies deploying trained AI models from any framework to production at scale. An NVTabular ETL workflow and a trained deep learning model can be deployed to production together in only a few steps.

Read more about inference from examples {: target="_blank"}
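To give a sense of what deployment involves, a Triton model configuration (`config.pbtxt`) for a trained model might look like the following. The model name, platform, and tensor names here are illustrative assumptions, not taken from the Merlin examples.

```
# Hypothetical Triton config.pbtxt for a deployed recommender model.
# Names, shapes, and batch size are illustrative only.
name: "recsys_model"
platform: "tensorflow_savedmodel"
max_batch_size: 8192
input [
  { name: "user_id", data_type: TYPE_INT64, dims: [ 1 ] },
  { name: "item_id", data_type: TYPE_INT64, dims: [ 1 ] }
]
output [
  { name: "click_probability", data_type: TYPE_FP32, dims: [ 1 ] }
]
```

Triton discovers such configurations in its model repository and serves each model over HTTP/gRPC.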

{% endcapture %}

{% include section-thirds.html background="background-white" padding-top="0em" padding-bottom="10em" content-left-third=yd_left content-middle-third=yd_mid content-right-third=yd_right %}

{% capture start_left %}

# Getting Started
{: .section-title-halfs}

It is easy to get started with Merlin. There are many examples and blog posts to reference.

## Try Now Online

Try on Kaggle with:

- GPU-accelerated ETL with NVTabular {: target="_blank"}
- Accelerated training pipelines in PyTorch and FastAI {: target="_blank"}

{% endcapture %}

{% capture start_right %}

## Try Our Notebook Examples
{: .section-subtitle-top-1}

NVTabular and HugeCTR both provide a collection of examples based on a variety of publicly available recommender system datasets. Check out the NVTabular notebooks{: target="_blank"} and HugeCTR notebooks{: target="_blank"}.

## Pull Our Docker Container

Merlin publishes Docker containers with the latest release pre-installed on NVIDIA's NGC repository{: target="_blank"}. Pull a container and try out Merlin yourself.

## See The Latest Docs

Access installation guides and tutorials in the latest documentation for NVTabular{: target="_blank"} and HugeCTR{: target="_blank"}.

## Read Our Blogs

Learn more about recommender systems and Merlin on our Blog{: target="_blank"}.

{% endcapture %} {% include slopecap.html background="background-gray" position="top" slope="up" %} {% include section-halfs.html background="background-gray" padding-top="10em" padding-bottom="5em" content-left-half=start_left content-right-half=start_right %}

{% capture nv_l %}

## Accelerate ETL with NVTabular

![NVIDIA]({{ site.baseurl }}{% link /assets/images/merlin_tab_chart.png%}){: .full-image-center}

NVTabular scales ETL across multiple GPUs and nodes. It can process the Criteo 1TB Click Logs dataset, the largest publicly available recommendation dataset, with 1.3TB of uncompressed click logs covering roughly 4 billion users, in 13.8 minutes on a single GPU and 1.9 minutes on eight GPUs. By comparison, the original NumPy script requires 5 days (7,200 minutes) and an optimized Spark cluster requires 3 hours (180 minutes), making NVTabular 13x to 95x faster than the Spark cluster.
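The quoted speedups follow directly from the timings:

```python
# Speedups implied by the timings quoted above (all in minutes).
numpy_script  = 5 * 24 * 60  # original NumPy script: 5 days = 7200 min
spark_cluster = 3 * 60       # optimized Spark cluster: 3 hours = 180 min
one_gpu       = 13.8         # NVTabular on a single GPU
eight_gpus    = 1.9          # NVTabular on eight GPUs

print(round(spark_cluster / one_gpu))     # 13x over Spark on one GPU
print(round(spark_cluster / eight_gpus))  # 95x over Spark on eight GPUs
```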

Read more in our blog post {: target="_blank"} {% endcapture %}

{% capture nv_r %}

## Accelerate DL Training with HugeCTR

![NVIDIA]({{ site.baseurl }}{% link /assets/images/merlin_huge_chart.png %}){: .full-image-center}

MLPerf is a consortium of AI leaders from academia, research labs, and industry whose mission is to "build fair and useful benchmarks" that provide unbiased evaluations of training and inference. HugeCTR on DGX A100 is the fastest commercial solution available for training Facebook's Deep Learning Recommendation Model (DLRM) on 4TB of data: it finishes training in 3.33 minutes, 13.5x faster than the best CPU-only solution.

Read more in our blog post {: target="_blank"} {% endcapture %}

{% include section-halfs.html background="background-gray" padding-top="3em" padding-bottom="10em" content-left-half=nv_l content-right-half=nv_r %}

{% capture end_bottom %}

# Get Started with NVIDIA Merlin

{: .section-title-full .text-white}

{% endcapture %} {% include slopecap.html background="background-darkpurple" position="top" slope="down" %} {% include section-single.html background="background-darkpurple" padding-top="3em" padding-bottom="0em" content-single=end_bottom %}

{% include cta-footer-merlin.html background="background-darkpurple" %}