
# Running Hadoop YARN with K8s by Koordinator

## Introduction

Koordinator supports hybrid orchestration of workloads on Kubernetes, so that batch jobs can use requested but
unused resources at the `koord-batch` priority and `BE` QoS class to improve cluster utilization. However, many
applications still run outside K8s, such as Apache Hadoop YARN. As a resource management platform in the BigData
ecosystem, YARN supports a number of computing engines including MapReduce, Spark, Flink, Presto, etc.

To extend Koordinator's co-location scenario, the community now provides `Koordinator YARN Copilot`, an extension
suite for the BigData ecosystem that supports running Hadoop YARN jobs on koord-batch resources alongside other
K8s pods. `Koordinator YARN Copilot` has the following characteristics:

- Open-source native: implemented against the open-source version of Hadoop YARN, so there are no hacks inside YARN modules.
- Unified resource priority and QoS strategy: the suite targets the `koord-batch` priority of Koordinator, and YARN tasks are also managed by the QoS strategies of koordlet.
- Resource sharing at the node level: node resources of `koord-batch` priority can be requested by both YARN tasks and `Batch` pods.
- Adaptive to multiple environments: the suite can run in any environment, including public cloud and IDC.

## Prerequisites

- Kubernetes >= 1.18
- Koordinator >= 1.4
- Koordinator YARN Copilot >= 0.1
- Hadoop YARN >= 3.2.1

## Installation
All charts can be installed with Helm v3.5+, a command-line tool that you can get
from [here](https://github.com/helm/helm/releases).

![image](/img/hadoop-k8s.svg)

### Install Koordinator
Please make sure the Koordinator components are correctly installed in your cluster. For more information about
installation and upgrades, please refer to [Installation](/docs/installation).
```shell script
# First, add the koordinator charts repository if you haven't done so.
$ helm repo add koordinator-sh https://koordinator-sh.github.io/charts/

# [Optional]
$ helm repo update

# Install the latest version.
$ helm install koordinator koordinator-sh/koordinator
```
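
To confirm that the components are up before moving on, you can list the pods in the `koordinator-system`
namespace (the namespace Koordinator's charts install into, as used by the ConfigMaps later in this document):
```shell script
# koord-manager, koord-scheduler and the koordlet DaemonSet should be Running.
$ kubectl get pods -n koordinator-system
```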

### Install Hadoop YARN
Hadoop YARN consists of the ResourceManager and NodeManagers. Currently, we recommend deploying the ResourceManager
independently on hosts, while running the NodeManagers as pods.

The Koordinator community provides a demo chart `hadoop-yarn` with the Hadoop YARN ResourceManager and NodeManager,
optionally including HDFS components for running example jobs easily. You can use the demo chart for a quick start
with YARN co-location; otherwise, refer to the official [Installation](https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/YARN.html)
guide if you want to build your own YARN cluster.

```shell script
# First, add the koordinator charts repository if you haven't done so.
$ helm repo add koordinator-sh https://koordinator-sh.github.io/charts/

# [Optional]
$ helm repo update
# Install the latest version.
$ helm install hadoop-yarn koordinator-sh/hadoop-yarn

# Check the running status of the hadoop yarn pods.
$ kubectl get pod -n hadoop-yarn
```

Some key points should be known before you install YARN:
- The ResourceManager must be accessible from K8s pods, whether it is deployed in host mode or pod mode.
- The NodeManager must be deployed in pod mode with an annotation `yarn.hadoop.apache.org/node-id=${nm-hostname}:8041` to identify the node ID in YARN (see the sanity check after this list).
- The NodeManager must use CgroupsLCEResourcesHandler as the Linux container executor and specify its cgroup hierarchy under the K8s best-effort directory (a configuration sketch follows below).
- NodeManager pods request resources at the `koord-batch` priority, so Koordinator must be pre-installed with co-location enabled.
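
As a quick sanity check after installation, you can print the node ID annotation of each pod in the `hadoop-yarn`
namespace (a sketch; it assumes the demo chart's namespace and the annotation shown above):
```shell script
# Print each pod with its YARN node-id annotation (empty for non-NM pods).
$ kubectl get pods -n hadoop-yarn -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.annotations.yarn\.hadoop\.apache\.org/node-id}{"\n"}{end}'
```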

These features have already been configured in the Hadoop YARN chart in the Koordinator repo. If you are running a
self-maintained YARN, please check the [Koordinator repo](https://github.com/koordinator-sh/charts/blob/main/charts/hadoop-yarn)
for reference during installation.
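
For a self-maintained YARN, the NM requirements above roughly correspond to the following yarn-site.xml properties.
This is a minimal sketch: the property names come from the upstream Hadoop configuration, while the cgroup hierarchy
value is an assumption that must match the best-effort directory on your nodes.
```shell script
# Write the snippet to a file, then merge it into the <configuration> section
# of yarn-site.xml on each NM. The cgroups.hierarchy value is an assumption.
$ cat > nm-cgroup-snippet.xml <<'EOF'
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.util.CgroupsLCEResourcesHandler</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
  <value>/kubepods-besteffort/hadoop-yarn</value>
</property>
EOF
```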

### Install Koordinator YARN Copilot
Koordinator YARN Copilot consists of `yarn-operator` and `copilot-agent` (WIP).

```shell script
# First, add the koordinator charts repository if you haven't done so.
$ helm repo add koordinator-sh https://koordinator-sh.github.io/charts/

# [Optional]
$ helm repo update

# Install the latest version.
$ helm install koordinator-yarn-copilot koordinator-sh/koordinator-yarn-copilot
```
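
As above, you can verify the copilot components (a sketch; the workload names depend on the chart's defaults):
```shell script
# The yarn-operator should appear among the koordinator components.
$ kubectl get pods -n koordinator-system | grep yarn
```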

## Configuration
1. Configuration of koord-manager

After installing through the Helm chart, the ConfigMap `slo-controller-config` will be created in the koordinator-system
namespace. YARN tasks are managed under the best-effort cgroup and should be configured as a host-level application;
see the related [issue](https://github.com/koordinator-sh/koordinator/issues/1727) on managing YARN tasks under
Koordinator.

Create a configmap.yaml file based on the following ConfigMap content:
```yaml
apiVersion: v1
data:
  colocation-config: |
    {
      "enable": true
    }
  resource-threshold-config: |
    {
      "clusterStrategy": {
        "enable": true
      }
    }
  resource-qos-config: |
    {
      "clusterStrategy": {
        "lsrClass": {
          "cpuQOS": {
            "enable": true
          }
        },
        "lsClass": {
          "cpuQOS": {
            "enable": true
          }
        },
        "beClass": {
          "cpuQOS": {
            "enable": true
          }
        }
      }
    }
  host-application-config: |
    {
      "applications": [
        {
          "name": "yarn-task",
          "priority": "koord-batch",
          "qos": "BE",
          "cgroupPath": {
            "base": "KubepodsBesteffort",
            "relativePath": "hadoop-yarn/"
          }
        }
      ]
    }
kind: ConfigMap
metadata:
  name: slo-controller-config
  namespace: koordinator-system
```
To avoid changing other settings in the ConfigMap, we recommend that you run the kubectl patch command to update it.
```bash
$ kubectl patch cm -n koordinator-system slo-controller-config --patch "$(cat configmap.yaml)"
```
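
You can then confirm that the patch was merged, for example by reading back one of the keys:
```shell script
# Print the host-application-config entry from the merged ConfigMap.
$ kubectl get cm -n koordinator-system slo-controller-config -o jsonpath='{.data.host-application-config}'
```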

2. Configuration of koord-yarn-copilot

`koord-yarn-copilot` communicates with the YARN ResourceManager during resource syncing, and the following ConfigMap
defines the YARN-related configuration.
```yaml
apiVersion: v1
data:
  core-site.xml: |
    <configuration>
    </configuration>
  yarn-site.xml: |
    <configuration>
      <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>resource-manager.hadoop-yarn:8033</value>
      </property>
      <property>
        <name>yarn.resourcemanager.address</name>
        <value>resource-manager.hadoop-yarn:8032</value>
      </property>
    </configuration>
kind: ConfigMap
metadata:
  name: yarn-config
  namespace: koordinator-system
```
You can change the default address and port at `yarnConfiguration.resourceManager` in the chart values.
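
For example, the address can be overridden at upgrade time. The sub-key name below is hypothetical, for illustration
only; check the chart's values.yaml for the exact schema.
```shell script
# A sketch with a hypothetical sub-key; consult the chart's values.yaml.
$ helm upgrade koordinator-yarn-copilot koordinator-sh/koordinator-yarn-copilot \
    --set yarnConfiguration.resourceManager.address=my-rm.example.com:8032
```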

### (Optional) Advanced Settings
You can check the helm charts [hadoop-yarn](https://github.com/koordinator-sh/charts/blob/main/charts/hadoop-yarn) and
[koordinator-yarn-copilot](https://github.com/koordinator-sh/charts/blob/main/charts/koordinator-yarn-copilot) for more
advanced settings.

## Check YARN Available Resources
1. Check the allocatable batch resources of Koordinator on the nodes.
```bash
$ kubectl get node -o yaml | grep batch-cpu
kubernetes.io/batch-cpu: "60646"
kubernetes.io/batch-cpu: "60486"
$ kubectl get node -o yaml | grep batch-memory
kubernetes.io/batch-memory: "245976973438"
kubernetes.io/batch-memory: "243254790644"
```
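
You can also read both values for a single node at once (`${node-name}` is a placeholder):
```shell script
# Print the allocatable batch-cpu and batch-memory of one node.
$ kubectl get node ${node-name} -o jsonpath='batch-cpu: {.status.allocatable.kubernetes\.io/batch-cpu}{"\n"}batch-memory: {.status.allocatable.kubernetes\.io/batch-memory}{"\n"}'
```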

2. Check the allocatable resources in YARN.

Visit the YARN ResourceManager web UI at `${hadoop-yarn-rm-addr}:8088/cluster/nodes` in a browser to see the YARN NM
status and allocatable resources.

If you are using the hadoop-yarn demo chart in the Koordinator repo, execute the following command to make the RM accessible locally.
```shell script
$ kubectl port-forward -n hadoop-yarn service/resource-manager 8088:8088
```
Then open the UI in your browser: `http://localhost:8088/cluster/nodes`

The `VCores Avail` and `Mem Avail` will be exactly the same as the batch resources of the K8s nodes.

## Submit YARN Jobs
Spark, Flink, and other computing engines have supported submitting jobs to YARN since their first releases; check the
official manuals, such as [Spark](https://spark.apache.org/docs/latest/running-on-yarn.html) and
[Flink](https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/yarn/), before you
start.

Note that the hadoop-yarn demo chart in the Koordinator repo is already integrated with a Spark client. You can execute
the following command to submit an example job, and check its running status through the web UI of the ResourceManager.
```shell script
$ kubectl exec -n hadoop-yarn -it ${yarn-rm-pod-name} -c yarn-rm -- /opt/spark/bin/spark-submit --master yarn --deploy-mode cluster --class org.apache.spark.examples.SparkPi /opt/spark/examples/jars/spark-examples_2.12-3.3.3.jar 1000
```
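
Besides the web UI, you can also list the submitted applications from the RM's own CLI (`${yarn-rm-pod-name}` is a
placeholder as above):
```shell script
# List running YARN applications and their states.
$ kubectl exec -n hadoop-yarn -it ${yarn-rm-pod-name} -c yarn-rm -- yarn application -list
```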
---
# Koordinator YARN Copilot

## Introduction
Koordinator supports hybrid orchestration of workloads on Kubernetes, so that batch jobs can use requested but unused
resources at the koord-batch priority and BE QoS class to improve cluster utilization. However, many applications still
run outside K8s, such as [Apache Hadoop YARN](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html).
As a resource management platform in the BigData ecosystem, YARN supports a number of computing engines including
MapReduce, Spark, Flink, Presto, etc. Although some computing engines provide K8s operators that can submit jobs into
K8s, the Hadoop YARN ecosystem remains active, as shown by the fact that most cloud providers still sell commercial
products like [E-MapReduce](https://www.aliyun.com/product/bigdata/emapreduce).

To extend the co-location scenario, the Koordinator community, together with developers from Alibaba Cloud,
Xiaohongshu, and Ant Financial, set up this project for running Hadoop YARN jobs on koord-batch resources alongside
other K8s pods, which improves cluster resource utilization by providing `batch` resources to Hadoop YARN. This project
has been widely used in Xiaohongshu's production environment.

## Technical Details

### Principles
- Keep YARN as the portal for job submission.
- Build on the open-source version of Hadoop YARN, with no intrusive modifications to YARN.
- The co-location resources provided by Koordinator can be used by both K8s Pods and YARN tasks, which means different types of applications can run on the same node.
- The QoS policies of koordlet should be compatible with YARN tasks.

![image](/img/hadoop-k8s.svg)

### Resource Allocation
In Koordinator, the batch resources of nodes are dynamically calculated by koord-manager based on the node resource
load and updated as K8s extended resources on the Node. The `koord-yarn-operator` component synchronizes these batch
resources to the YARN RM, so that YARN tasks can request them. Since the K8s scheduler and the YARN scheduler share the
batch allocatable amount, each scheduler's allocations must be made known to the other (a worked example follows the
list below).

1. `koord-manager` calculates the original batch total `origin_batch_total` and records it as a K8s node annotation.
2. `koord-yarn-operator` collects the amount of resources that YARN nodes have already allocated from the YARN RM (`yarn_requested`) and records it as a K8s node annotation.
3. Before `koord-manager` updates the total batch resources of K8s, the resources already allocated by YARN must be excluded: `k8s_batch_total = origin_batch_total - yarn_requested`.
4. Likewise, before `koord-yarn-operator` updates resources to the YARN RM, the amount of resources that K8s has already allocated must be excluded: `yarn_batch_total = origin_batch_total - k8s_batch_requested`.
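
A worked example with assumed numbers may make this clearer:
```shell script
# Assumed numbers for one node, CPU in milli-cores:
#   origin_batch_total  = 60000   # calculated by koord-manager
#   yarn_requested      = 20000   # already allocated by the YARN scheduler
#   k8s_batch_requested = 10000   # already allocated by the K8s scheduler
#
# k8s_batch_total  = 60000 - 20000 = 40000   # what K8s advertises
# yarn_batch_total = 60000 - 10000 = 50000   # what the YARN RM advertises
#
# Between syncs the two views may together exceed the original total,
# which is why koordlet arbitrates on the node side (see below).
```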

![image](/img/koord-yarn-operator.svg)

Since multiple schedulers work in the cluster, `batch` priority resources may be overcommitted due to the sequence of
resource synchronization. `koordlet` performs arbitration for the allocated resources on the node side. However, unlike
the arbitration of `kubelet`, `koordlet` uses QoS policies as the arbitration method, with the goals of "avoiding
interference" and "ensuring the resource quality of batch priority", rejecting or evicting pods according to the
real-time resource usage.

### Node Runtime
The NodeManager works on the node side of a YARN cluster and is responsible for the lifecycle management of tasks.
Under the K8s co-location scenario, the NM is deployed as a DaemonSet. The resource management of the NM and of YARN
tasks is separated into different cgroups for fine-grained control, so that the NM only needs to request resources
according to its own consumption.

![image](/img/node-manager-runtime.svg)

Koordinator requires the YARN NM to enable LinuxContainerExecutor and to specify the cgroup path under the best-effort
hierarchy, because `kubelet` and `koordlet` use cgroups for QoS management; this way, all YARN tasks can be managed like
other K8s Pods.
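
On a co-located node, the YARN task cgroups then sit next to the best-effort pods. For example (a sketch, assuming
cgroup v1 with the cgroupfs driver and the `hadoop-yarn/` relative path used in the best-practices doc):
```shell script
# YARN container cgroups appear under the K8s best-effort hierarchy.
$ ls /sys/fs/cgroup/cpu/kubepods/besteffort/hadoop-yarn/
```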

### QoS Strategies
Currently, `koordlet` supports a series of QoS policies, which also need to be adapted for YARN tasks. For resource
isolation parameters, such as Group Identity, Memory QoS, and L3 Cache isolation, `koordlet` adapts them according to
the cgroup hierarchy. For dynamic strategies such as eviction and suppression, `koordlet` adds a new module
`yarn-copilot-agent`, which adapts these operations for YARN tasks, including meta-information collection, metrics
collection, and task eviction.

All QoS policies are still managed in `koordlet`, and the relevant modules in `koordlet` communicate with
`yarn-copilot-agent`. The API of `yarn-copilot-agent` is also designed to be extensible, so it can be used to connect
other resource frameworks in the future.

![image](/img/yarn-copilot-agent.svg)

`koordlet` will support all QoS policies for YARN scenarios in subsequent versions.

## Join Us

Koordinator has released some features for K8s and YARN co-location in the latest versions of each component, and the
community is still iterating on further features in the following milestones. If you have any questions or want to
participate, you are welcome to submit an [issue](https://github.com/koordinator-sh/yarn-copilot/issues) or comment in
the [discussion](https://github.com/koordinator-sh/koordinator/discussions/1297).