removed all instances of master from scripts and playbooks
Signed-off-by: John Lockman <[email protected]>
j0hnL committed Jul 7, 2020
1 parent 30ac8de commit 44fce1e
Showing 41 changed files with 46 additions and 46 deletions.
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -32,7 +32,7 @@ Contributions to Omnia are made through [Pull Requests (PRs)](https://help.githu
6. **Create a pull request:** [Create a pull request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request) with a title following this format Issue ###: Description (_i.e., Issue 1023: Reformat testutils_). It is important that you do a good job with the description to make the job of the code reviewer easier. A good description not only reduces review time, but also reduces the probability of a misunderstanding with the pull request.
* **Important:** When preparing a pull request it is important to stay up-to-date with the project repository. We recommend that you rebase against the upstream repo _frequently_. To do this, use the following commands:
```
git pull --rebase upstream master #upstream is dellhpc/omnia
git pull --rebase upstream devel #upstream is dellhpc/omnia
git push --force origin <pr-branch-name> #origin is your fork of the repository (e.g., <github_user_name>/omnia.git)
```
* **PR Description:** Be sure to fully describe the pull request. Ideally, your PR description will contain:
@@ -42,7 +42,7 @@ Contributions to Omnia are made through [Pull Requests (PRs)](https://help.githu
4. How to verify that the changes work correctly.

## Omnia Branches and Contribution Flow
The diagram below describes the contribution flow. Omnia has two lifetime branches: `devel` and `master`. The `master` branch is reserved for releases and their associated tags. The `devel` branch is where all development work occurs. The `devel` branch is also the default branch for the project.
The diagram below describes the contribution flow. Omnia has two lifetime branches: `devel` and `release`. The `release` branch is reserved for releases and their associated tags. The `devel` branch is where all development work occurs. The `devel` branch is also the default branch for the project.

![Omnia Branch Flowchart](docs/images/omnia-branch-structure.png "Flowchart of Omnia branches")

6 changes: 3 additions & 3 deletions docs/INSTALL.md
@@ -63,7 +63,7 @@ Omnia playbooks perform several tasks:
- Docker
- Kubelet

`master` playbook
`manager` playbook
* Install Helm v3
* (optional) add firewall rules for Slurm and kubernetes

@@ -72,10 +72,10 @@ Everything from this point on can be called by using the `init` tag
ansible-playbook -i host_inventory_file kubernetes/kubernetes.yml --tags "init"
```

`startmaster` playbook
`startmanager` playbook
* turn off swap
*Initialize Kubernetes
* Head/master
* Head/manager
- Start K8S pass startup token to compute/slaves
- Initialize software defined networking (Calico)

4 changes: 2 additions & 2 deletions docs/PREINSTALL.md
@@ -5,7 +5,7 @@ Omnia assumes that prior to installation:
* Systems have a base operating system (currently CentOS 7 or 8)
* Network(s) has been cabled and nodes can reach the internet
* SSH Keys for `root` have been installed on all nodes to allow for password-less SSH
* Ansible is installed on either the master node or a separate deployment node
* Ansible is installed on either the manager node or a separate deployment node
```
yum install ansible
```
@@ -22,6 +22,6 @@ Omnia assumes that servers are already connected to the network and have access
### Network Topology
Possible network configurations include:
* A flat topology where all nodes are connected to a switch which includes an uplink to the internet. This requires multiple externally-facing IP addresses
* A hierarchical topology where compute nodes are connected to a common switch, but the master node contains a second network connection which is connected to the internet. All outbound/inbound traffic would be routed through the master node. This requires setting up firewall rules for IP masquerade, see [here](https://www.server-world.info/en/note?os=CentOS_7&p=firewalld&f=2) for an example.
* A hierarchical topology where compute nodes are connected to a common switch, but the manager node contains a second network connection which is connected to the internet. All outbound/inbound traffic would be routed through the manager node. This requires setting up firewall rules for IP masquerade, see [here](https://www.server-world.info/en/note?os=CentOS_7&p=firewalld&f=2) for an example.
### IP and Hostname Assignment
The recommended setup is to assign IP addresses to individual servers. This can be done manually by logging onto each node, or via DHCP.
2 changes: 1 addition & 1 deletion docs/README.md
@@ -38,6 +38,6 @@ It's not just new features and bug fixes that can be contributed to the Omnia pr
* Feedback
* Validation that it works for your particular configuration

If you would like to contribute, see [CONTRIBUTING](https://github.com/dellhpc/omnia/blob/master/CONTRIBUTING.md).
If you would like to contribute, see [CONTRIBUTING](https://github.com/dellhpc/omnia/blob/devel/CONTRIBUTING.md).

### [Omnia Contributors](CONTRIBUTORS.md)
2 changes: 1 addition & 1 deletion examples/README.md
@@ -1,7 +1,7 @@
# Examples


The examples [K8s Submit](https://github.com/dellhpc/omnia/blob/master/examples/k8s-TensorFlow-resnet50-multinode-MPIOperator.yaml) and [SLURM submit](https://github.com/dellhpc/omnia/blob/master/examples/slurm-TensorFlow-resnet50-multinode-MPI.batch) are provided as examples for running the resnet50 benchmark with TensorFlow on 8 GPUs using 2 C4140s.
The examples [K8s Submit](https://github.com/dellhpc/omnia/blob/devel/examples/k8s-TensorFlow-resnet50-multinode-MPIOperator.yaml) and [SLURM submit](https://github.com/dellhpc/omnia/blob/devel/examples/slurm-TensorFlow-resnet50-multinode-MPI.batch) are provided as examples for running the resnet50 benchmark with TensorFlow on 8 GPUs using 2 C4140s.

## Submitting the example

6 changes: 3 additions & 3 deletions kubernetes/host_inventory_file
@@ -2,7 +2,7 @@ all:
children:
cluster:
children:
master:
manager:
hosts:
compute000:
workers:
@@ -14,7 +14,7 @@ all:
hosts:
compute002:
compute004:
compute005:
#compute005:
vars:
single_node: false
master_ip: 10.0.0.100
manager_ip: 10.0.0.100
2 changes: 1 addition & 1 deletion kubernetes/jupyterhub.yaml
@@ -16,7 +16,7 @@
#Playbook for installing JupyterHub v1.1.0 in Omnia

# Start K8s worker servers
- hosts: master
- hosts: manager
gather_facts: false
roles:
- jupyterhub
2 changes: 1 addition & 1 deletion kubernetes/kubeflow.yaml
@@ -16,7 +16,7 @@
#Playbook for installing Kubeflow v1.0 on Omnia

# Start K8s worker servers
- hosts: master
- hosts: manager
gather_facts: false
roles:
- kubeflow
14 changes: 7 additions & 7 deletions kubernetes/kubernetes.yml
@@ -30,17 +30,17 @@
roles:
- computeGPU

# Apply Master Config
- hosts: master
# Apply Manager Config
- hosts: manager
gather_facts: false
roles:
- master
- manager

# Start K8s on master server
- hosts: master
# Start K8s on manager server
- hosts: manager
gather_facts: false
roles:
- startmaster
- startmanager

# Start K8s worker servers
- hosts: compute,gpus
@@ -49,7 +49,7 @@
- startworkers

# Start K8s worker servers
- hosts: master
- hosts: manager
gather_facts: false
roles:
- startservices
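
Because the inventory group is renamed from `master` to `manager`, every `hosts:` pattern in the playbooks has to name a group that the inventory still defines. The following is a minimal Python sketch of that cross-check and is not part of this commit. The function name `undefined_groups` is hypothetical, the group set in the usage example is an assumption (the inventory hunks show only part of each file), and only simple comma-separated patterns are handled, not Ansible's `:`, `!`, `&`, or wildcard syntax.

```python
# Names that Ansible resolves without an explicit inventory entry.
IMPLICIT = {"all", "ungrouped", "localhost"}


def undefined_groups(host_patterns, inventory_groups):
    """Return the sorted names used in `hosts:` patterns that the inventory lacks.

    host_patterns: iterable of strings such as "manager" or "compute,gpus".
    inventory_groups: set of group names defined in the inventory.
    """
    missing = set()
    for pattern in host_patterns:
        for name in pattern.split(","):
            name = name.strip()
            if name and name not in IMPLICIT and name not in inventory_groups:
                missing.add(name)
    return sorted(missing)


if __name__ == "__main__":
    # Assumed group set for illustration; a real check would read the inventory.
    groups = {"manager", "compute", "gpus"}
    # A playbook that still targets the old group name is reported.
    print(undefined_groups(["manager", "master", "compute,gpus"], groups))
    # prints ['master']
```

An empty result means each pattern refers to a known group; it does not prove that the groups contain the intended hosts.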
File renamed without changes.
File renamed without changes.
@@ -15,15 +15,15 @@
---
#- name: Firewall Rule K8s:6443/tcp
#command: firewall-cmd --zone=internal --add-port=6443/tcp --permanent
#tags: master
#tags: manager
#
#- name: Firewall Rule K8s:10250/tcp
#command: firewall-cmd --zone=internal --add-port=10250/tcp --permanent
#tags: master
#tags: manager
##
#- name: Firewall Reload
#command: firewall-cmd --reload
#tags: master
#tags: manager
#
- name: Create /root/bin (if it doesn't exist)
file:
@@ -36,10 +36,10 @@
url: https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3
dest: /root/bin/get_helm.sh
mode: 700
tags: master
tags: manager

- name: Install Helm
command: /root/bin/get_helm.sh
tags: master
tags: manager

# install and start up OpenSM - III
File renamed without changes.
@@ -18,7 +18,7 @@
tags: init

- name: Initialize kubeadm
command: /bin/kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address={{ master_ip }}
command: /bin/kubeadm init --pod-network-cidr=10.244.0.0/16 --apiserver-advertise-address={{ manager_ip }}
#command: /bin/kubeadm init
register: init_output
tags: init
@@ -44,30 +44,30 @@

- name: CA Hash
shell: openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
register: K8S_MASTER_CA_HASH
register: K8S_MANAGER_CA_HASH
tags: init

- name: Add K8S Master IP, Token, and Hash to dummy host
- name: Add K8S Manager IP, Token, and Hash to dummy host
add_host:
name: "K8S_TOKEN_HOLDER"
token: "{{ K8S_TOKEN.stdout }}"
hash: "{{ K8S_MASTER_CA_HASH.stdout }}"
ip: "{{ master_ip }}"
hash: "{{ K8S_MANAGER_CA_HASH.stdout }}"
ip: "{{ manager_ip }}"
tags: init

- name:
debug:
msg: "[Master] K8S_TOKEN_HOLDER K8S token is {{ hostvars['K8S_TOKEN_HOLDER']['token'] }}"
msg: "[Manager] K8S_TOKEN_HOLDER K8S token is {{ hostvars['K8S_TOKEN_HOLDER']['token'] }}"
tags: init

- name:
debug:
msg: "[Master] K8S_TOKEN_HOLDER K8S Hash is {{ hostvars['K8S_TOKEN_HOLDER']['hash'] }}"
msg: "[Manager] K8S_TOKEN_HOLDER K8S Hash is {{ hostvars['K8S_TOKEN_HOLDER']['hash'] }}"
tags: init

- name:
debug:
msg: "[Master] K8S_MASTER_IP is {{ master_ip }}"
msg: "[Manager] K8S_MANAGER_IP is {{ manager_ip }}"
tags: init

- name: Setup Calico SDN network
@@ -115,7 +115,7 @@
shell: kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}') > /root/k8s/token
tags: init

- name: Edge / Workstation Install allows pods to schedule on master
- name: Edge / Workstation Install allows pods to schedule on manager
shell: kubectl taint nodes --all node-role.kubernetes.io/master-
when: single_node
tags: init
2 changes: 1 addition & 1 deletion kubernetes/roles/startworkers/tasks/main.yml
@@ -30,7 +30,7 @@

#- name:
#debug:
#msg: "[Worker] K8S_MASTER_IP is {{ hostvars['K8S_TOKEN_HOLDER']['ip'] }}"
#msg: "[Worker] K8S_MANAGER_IP is {{ hostvars['K8S_TOKEN_HOLDER']['ip'] }}"
#tags: init

- name: "Kubeadmn join"
@@ -33,7 +33,7 @@
- /root/rpmbuild/RPMS/x86_64/slurm*20*.rpm
tags: install

- name: Install SLURM RPMs on Master
- name: Install SLURM RPMs on Manager
yum:
name: "{{ item }}"
#name: "{{ query('fileglob', ['/home/rpms/slurm*20*.rpm']) }}" <-- how it should work to avoid loop
@@ -96,14 +96,14 @@
command: sacctmgr -i add user root DefaultAccount=defaultgroup
tags: install

- name: Start slurmctld on Master
- name: Start slurmctld on Manager
service:
name: slurmctld
state: restarted
enabled: yes
tags: install

- name: Enable Slurmdbd on Master
- name: Enable Slurmdbd on Manager
service:
name: slurmdbd
state: restarted
6 changes: 3 additions & 3 deletions slurm/slurm.yml
@@ -23,11 +23,11 @@
roles:
- slurm-common

# Apply Master Config, start services
- hosts: master
# Apply Manager Config, start services
- hosts: manager
gather_facts: false
roles:
- slurm-master
- slurm-manager

# Start SLURM workers
- hosts: compute
6 changes: 3 additions & 3 deletions slurm/slurm_inventory_file
@@ -1,7 +1,7 @@
[master]
[manager]
friday

[master:vars]
[manager:vars]
slurm_url=https://download.schedmd.com/slurm/slurm-20.02.0.tar.bz2
slurm_md5=md5:8ed2257471ff24ca213b510a4c1c3563

@@ -14,5 +14,5 @@ compute[002:005]
compute

[cluster:children]
master
manager
workers
2 changes: 1 addition & 1 deletion tools/install_tools.yml
@@ -13,7 +13,7 @@
# limitations under the License.
---

- hosts: master
- hosts: manager
tasks:
- name: Install Change Personality Script
copy:
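
A rename that touches this many files is easy to leave incomplete, so a scan for leftover occurrences can help during review. The sketch below is an illustration only and is not part of this commit. `find_leftovers` and the `ALLOWED` list are hypothetical names, and the allow-list assumes that the Kubernetes taint key and the Helm download URL shown in the diff are the only places where the old term must stay because other projects own those names.

```python
from pathlib import Path

OLD_TERM = "master"

# Assumed allow-list: substrings that keep the old term because they refer
# to names owned by other projects (both appear unchanged in the diff).
ALLOWED = (
    "node-role.kubernetes.io/master",
    "helm/helm/master",
)


def find_leftovers(root):
    """Return (relative_path, line_number, stripped_line) for each unexpected match.

    The match is a case-insensitive substring test, so identifiers such as
    `master_ip` or `startmaster` are caught. A line that contains an allowed
    substring is skipped entirely, even if it also holds another occurrence.
    """
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        if not path.is_file() or ".git" in relative.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if OLD_TERM in line.lower() and not any(a in line for a in ALLOWED):
                hits.append((str(relative), number, line.strip()))
    return hits


if __name__ == "__main__":
    # Scan the current directory and report each leftover occurrence.
    for name, number, line in find_leftovers("."):
        print(f"{name}:{number}: {line}")
```

A substring match will also flag unrelated words that merely contain the term, so the output is a list of candidates to review rather than a list of errors.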