diff --git a/src/SUMMARY.md b/src/SUMMARY.md
index 7498a2a8..0b0c70ee 100644
--- a/src/SUMMARY.md
+++ b/src/SUMMARY.md
@@ -253,7 +253,8 @@
  - [Redis](documentation/system_administrators/advanced/grid3_redis.md)
  - [IPFS](documentation/system_administrators/advanced/ipfs/ipfs_toc.md)
  - [IPFS on a Full VM](documentation/system_administrators/advanced/ipfs/ipfs_fullvm.md)
- - [IPFS on a Micro VM](documentation/system_administrators/advanced/ipfs/ipfs_microvm.md)
+ - [IPFS on a Micro VM](documentation/system_administrators/advanced/ipfs/ipfs_microvm.md)
+ - [AI & ML Workloads](documentation/system_administrators/advanced/ai_ml_workloads.md)
  - [ThreeFold Token](documentation/threefold_token/threefold_token.md)
  - [TFT Bridges](documentation/threefold_token/tft_bridges/tft_bridges.md)
  - [TFChain-Stellar Bridge](documentation/threefold_token/tft_bridges/tfchain_stellar_bridge.md)
diff --git a/src/documentation/system_administrators/advanced/advanced.md b/src/documentation/system_administrators/advanced/advanced.md
index 92a95c31..460ad918 100644
--- a/src/documentation/system_administrators/advanced/advanced.md
+++ b/src/documentation/system_administrators/advanced/advanced.md
@@ -12,3 +12,4 @@ In this section, we delve into sophisticated topics and powerful functionalities
 - [IPFS](./ipfs/ipfs_toc.md)
 - [IPFS on a Full VM](./ipfs/ipfs_fullvm.md)
 - [IPFS on a Micro VM](./ipfs/ipfs_microvm.md)
+- [AI & ML Workloads](./ai_ml_workloads.md)
\ No newline at end of file
diff --git a/src/documentation/system_administrators/advanced/ai_ml_workloads.md b/src/documentation/system_administrators/advanced/ai_ml_workloads.md
new file mode 100644
index 00000000..fdd3445e
--- /dev/null
+++ b/src/documentation/system_administrators/advanced/ai_ml_workloads.md
@@ -0,0 +1,125 @@
+AI & ML Workloads
+
+Table of Contents
+
+- [Introduction](#introduction)
+- [Prerequisites](#prerequisites)
+- [Prepare the System](#prepare-the-system)
+- [Install the GPU Driver](#install-the-gpu-driver)
+- [Set a Python Virtual Environment](#set-a-python-virtual-environment)
+- [Install PyTorch and Test CUDA](#install-pytorch-and-test-cuda)
+- [Set and Access Jupyter Notebook](#set-and-access-jupyter-notebook)
+- [Run AI/ML Workloads](#run-aiml-workloads)
+
+***
+
+## Introduction
+
+We present a basic method to deploy artificial intelligence (AI) and machine learning (ML) workloads on the TFGrid. For this, we make use of dedicated nodes and GPU support.
+
+In the first part, we show the steps to install the Nvidia driver for a GPU card on an Ubuntu 22.04 full VM running on the TFGrid.
+
+In the second part, we show how to use PyTorch to run AI/ML tasks.
+
+## Prerequisites
+
+You need to reserve a [dedicated GPU node](../../dashboard/deploy/dedicated_machines.md) on the ThreeFold Grid.
+
+## Prepare the System
+
+- Update the system
+  ```
+  dpkg --add-architecture i386
+  apt-get update
+  apt-get dist-upgrade
+  reboot
+  ```
+- Check the GPU info
+  ```
+  lspci | grep VGA
+  lshw -c video
+  ```
+
+## Install the GPU Driver
+
+- Download the latest Nvidia driver
+  - Check which driver is recommended
+    ```
+    apt install ubuntu-drivers-common
+    ubuntu-drivers devices
+    ```
+  - Install the recommended driver (e.g. version 535)
+    ```
+    apt install nvidia-driver-535
+    ```
+  - Reboot and reconnect to the VM
+- Check the GPU status
+  ```
+  nvidia-smi
+  ```
+
+Now that the GPU node is set up, let's set up PyTorch to run AI/ML workloads.
+
+## Set a Python Virtual Environment
+
+Before installing Python packages with pip, you should create a virtual environment.
+
+- Install the prerequisites
+  ```
+  apt update
+  apt install python3-pip python3-dev
+  pip3 install --upgrade pip
+  pip3 install virtualenv
+  ```
+- Create a virtual environment
+  ```
+  mkdir ~/python_project
+  cd ~/python_project
+  virtualenv python_project_env
+  source python_project_env/bin/activate
+  ```
+
+## Install PyTorch and Test CUDA
+
+Once you've created and activated a virtual environment for Python, you can install different Python packages.
+
+- Install PyTorch and upgrade NumPy
+  ```
+  pip3 install torch
+  pip3 install numpy --upgrade
+  ```
+
+Before going further, you can check that CUDA is properly installed on your machine.
+
+- Check that CUDA is available in Python with PyTorch by running the following lines:
+  ```
+  import torch
+  torch.cuda.is_available()
+  torch.cuda.device_count() # the output should be 1
+  torch.cuda.current_device() # the output should be 0
+  torch.cuda.device(0)
+  torch.cuda.get_device_name(0)
+  ```
+
+## Set and Access Jupyter Notebook
+
+You can run Jupyter Notebook on the remote VM and access it in your local browser.
+
+- Install Jupyter Notebook
+  ```
+  pip3 install notebook
+  ```
+- Run Jupyter Notebook in no-browser mode and take note of the URL and the token
+  ```
+  jupyter notebook --no-browser --port=8080 --ip=0.0.0.0
+  ```
+- On your local machine, paste the given URL into a browser, replacing `127.0.0.1` with the WireGuard IP (here it is `10.20.4.2`) and setting the correct token.
+  ```
+  http://10.20.4.2:8080/tree?token=
+  ```
+
+## Run AI/ML Workloads
+
+After following the steps above, you should now be able to run Python code that uses your GPU node to compute AI and ML workloads.
+
+Feel free to explore different ways to use this feature. For example, the [Hugging Face course](https://huggingface.co/learn/nlp-course/chapter1/1) on natural language processing is a good introduction to machine learning.
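As a hedged illustration of the kind of workload this section refers to, the sketch below selects the GPU when PyTorch reports CUDA as available and otherwise falls back to the CPU, then multiplies two random matrices on that device. The helper names `pick_device` and `matmul_checksum` and the matrix size are assumptions made for this example; they are not part of the guide or of PyTorch.

```python
# Minimal sketch: run a small matrix multiplication on the GPU if one is usable.
# pick_device and matmul_checksum are hypothetical helper names for this example.
try:
    import torch
except ImportError:  # PyTorch is not installed in this environment
    torch = None


def pick_device():
    """Return "cuda" when PyTorch can use a CUDA device, otherwise "cpu"."""
    if torch is not None and torch.cuda.is_available():
        return "cuda"
    return "cpu"


def matmul_checksum(size=256):
    """Multiply two random matrices on the selected device and return the sum of the result."""
    if torch is None:
        raise RuntimeError("PyTorch is not installed; run `pip3 install torch` first")
    device = pick_device()
    a = torch.rand(size, size, device=device)
    b = torch.rand(size, size, device=device)
    # .item() copies the scalar result back to the host as a Python float
    return (a @ b).sum().item()


if __name__ == "__main__":
    print("device:", pick_device())
    if torch is not None:
        print("checksum:", matmul_checksum())
```

Run inside the activated virtual environment, this should report `device: cuda` on a correctly configured GPU node. A `cpu` result suggests revisiting the driver and PyTorch installation steps.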
\ No newline at end of file
diff --git a/src/documentation/system_administrators/system_administrators.md b/src/documentation/system_administrators/system_administrators.md
index d69401cd..0351bda4 100644
--- a/src/documentation/system_administrators/system_administrators.md
+++ b/src/documentation/system_administrators/system_administrators.md
@@ -81,4 +81,5 @@ For complementary information on ThreeFold grid and its cloud component, refer t
  - [Redis](./advanced/grid3_redis.md)
  - [IPFS](./advanced/ipfs/ipfs_toc.md)
  - [IPFS on a Full VM](./advanced/ipfs/ipfs_fullvm.md)
- - [IPFS on a Micro VM](./advanced/ipfs/ipfs_microvm.md)
\ No newline at end of file
+ - [IPFS on a Micro VM](./advanced/ipfs/ipfs_microvm.md)
+ - [AI & ML Workloads](./advanced/ai_ml_workloads.md)
\ No newline at end of file