user manuals: gpu share with HAMi
Signed-off-by: wangjianyu <[email protected]>
wangjianyu committed Dec 19, 2024
1 parent a7184fa commit 13a3062
Showing 2 changed files with 9 additions and 9 deletions.
18 changes: 9 additions & 9 deletions docs/user-manuals/device-scheduling-gpu-share-with-hami.md
@@ -24,7 +24,6 @@ Binding the scheduled GPU devices to the container requires support from the runtime environment.
| Runtime Environment | Installation |
| --------------------------------------------- | ------------------------------------------------------------ |
| Containerd >= 1.7.0 <br /> Koordinator >= 1.6 | Please make sure NRI is enabled in containerd. If not, please refer to [Enable NRI in Containerd](https://github.com/containerd/containerd/blob/main/docs/NRI.md) |
| others | Please make sure the koord-runtime-proxy component is correctly installed in your cluster. If not, please refer to [Installation Runtime Proxy](installation-runtime-proxy). |
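
If you are unsure whether NRI is already enabled, one way to check on a containerd >= 1.7 node is to dump the effective configuration and inspect the NRI plugin section; `disable = false` means NRI is on. The command below is only a sketch, and the exact output depends on your containerd version:

```bash
# Dump the effective containerd configuration and show the NRI plugin settings.
# NRI is enabled when the "io.containerd.nri.v1.nri" section reports disable = false.
containerd config dump | grep -A 5 'io.containerd.nri.v1.nri'
```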

#### HAMi-Core Installation

@@ -59,7 +58,7 @@ spec:
- /bin/sh
- -c
- |
cp -f /lib64/libvgpu.so /data/bin && sleep 3600000
cp -f /k8s-vgpu/lib/nvidia/libvgpu.so /data/bin && sleep 3600000
image: docker.m.daocloud.io/projecthami/hami:v2.4.0
imagePullPolicy: Always
name: name
@@ -73,10 +72,6 @@ spec:
volumeMounts:
- mountPath: /data/bin
name: data-bin
hostNetwork: true
hostPID: true
runtimeClassName: nvidia
schedulerName: kube-scheduler
tolerations:
- operator: Exists
volumes:
@@ -94,7 +89,7 @@ DeviceScheduling is *Enabled* by default. You can use it without any modification.
## Use GPU Share With HAMi
1. Create a Pod to request a GPU card with 50% of its computing power and 50% of its video memory, and specify HAMi-core isolation through the Pod label `koordinator.sh/gpu-isolation-provider`
1. Create a Pod to request a GPU card with 50% of its computing power and 50% of its GPU memory, and specify HAMi-core isolation through the Pod label `koordinator.sh/gpu-isolation-provider`
```yaml
apiVersion: v1
@@ -140,7 +135,7 @@ apiVersion: v1
kind: Pod
metadata:
annotations:
scheduling.koordinator.sh/device-allocated: '{"gpu":[{"minor":0,"resources":{"kubernetes.io/gpu-core":"50","kubernetes.io/gpu-memory":"12508288Ki","kubernetes.io/gpu-memory-ratio":"50"}}]}'
scheduling.koordinator.sh/device-allocated: '{"gpu":[{"minor":1,"resources":{"koordinator.sh/gpu-core":"50","koordinator.sh/gpu-memory":"11520Mi","koordinator.sh/gpu-memory-ratio":"50"}}]}'
name: pod-example
namespace: default
labels:
@@ -150,5 +145,10 @@ metadata:

You can find the concrete device allocation result in the annotation `scheduling.koordinator.sh/device-allocated`.
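
To read that annotation directly, a kubectl query along the following lines works (Pod name and namespace taken from the example above):

```bash
# Print the device allocation result recorded by koord-scheduler on the example Pod.
# Dots inside the annotation key must be escaped in the jsonpath expression.
kubectl get pod pod-example -n default \
  -o jsonpath='{.metadata.annotations.scheduling\.koordinator\.sh/device-allocated}'
```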

2. Enter the Pod via kubectl exec and use nvidia-smi to observe the upper limit of GPU memory available to the Pod
2. Enter the Pod and you can see that the upper limit of GPU memory visible to the program inside the Pod matches the value shown in the allocation result above.

```bash
$ kubectl exec -it -n default pod-example -- bash
```
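
Inside the Pod, running nvidia-smi (assuming the container image ships the NVIDIA utilities) should report a memory total of roughly the 50% allocated above rather than the full card:

```bash
# Run inside the Pod: the reported memory total reflects the HAMi-core limit (~50% of the card).
nvidia-smi
```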

![image](/img/gpu-share-with-hami-result.png)
Binary file added static/img/gpu-share-with-hami-result.png
