F/76 nomad 0.10 (#81)
* chore: upgrade to fabio 1.5.13

* fix: missing terraform 0.12 migration

* chore: update to terraform_aws_consul module 0.7.3

* refactor: removed obsolete supervisor support

* refactor: making SSH key creation for tests configurable

* refactor: moved AMI cleanup into helper, making SSH cleanup configurable

* chore: upgrade to nomad 0.10.2, consul 1.6.2, tf consul 0.7.3

Upgrading the terraform module of consul includes refactoring
to support change to systemd init system.

* chore: update terratest dependencies

* doc: Add information to monitor nomad servers and nodes

* chore: cleanup install script

* chore: cleanup merge leftover

* feat: AMI with Consul Connect Support - UNTESTED

In order to support Consul Connect the CNI plugins must be installed.
They seem to provide the networking capabilities: bridge.

The AMI can be built and starts up a cluster, but the functionality of the
bridge network inside of nomad was not tested.

* doc: Consul Connect demo - NOT WORKING

Follows the tutorial referenced in the README.md and
attempts to set it up.

This example requires the CNI enabled AMI.

* feat: added make target for CNI AMI

Command to execute CNI AMI creation:
- make -f ./modules/ami2/Makefile nomad-consul-docker-ecr-cni.json

* chore: removed tflint

tflint was removed as dependency and
should be introduced with a separate PR.

* chore: added cni into AMI naming

Makes the content of the AMI more explicit by adding cni and
the version of the CNI plugins used to the name.

Co-authored-by: Matthias Scholz <[email protected]>
MatthiasScholz and MatthiasScholzTW authored Jan 6, 2020
1 parent 688653b commit ff36b0d
Showing 29 changed files with 545 additions and 369 deletions.
5 changes: 5 additions & 0 deletions README.md
@@ -73,6 +73,11 @@ The picture shows the dependencies within the modules of the cos-stack and the d

## Troubleshooting

### Monitoring Server and Nodes

- `nomad monitor -log-level error|warn|info|debug|trace -node-id <node_id> | -server-id <server_id>`
- supported since [nomad 0.10.2](https://www.nomadproject.io/docs/commands/monitor.html)
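
A minimal sketch of how these invocations can be assembled, assuming only the flags listed above. The helper name `monitor_cmd` is hypothetical and not part of this repository; it only prints the command so it can be reviewed before it is run against a cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repository): prints a `nomad monitor`
# command line for either a client node or a server.
monitor_cmd() {
  local level="$1" kind="$2" id="$3"

  # Only the log levels listed above are accepted.
  case "$level" in
    error|warn|info|debug|trace) ;;
    *) echo "unknown log level: ${level}" >&2; return 1 ;;
  esac

  # -node-id and -server-id are alternatives; pick exactly one.
  case "$kind" in
    node)   echo "nomad monitor -log-level=${level} -node-id=${id}" ;;
    server) echo "nomad monitor -log-level=${level} -server-id=${id}" ;;
    *) echo "kind must be 'node' or 'server'" >&2; return 1 ;;
  esac
}
```

For example, `monitor_cmd debug node <node_id>` prints the command for streaming debug logs from a single client node.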

### Nomad CLI complains about invalid Certificate

If you have deployed the cluster with https endpoints for the ui-albs and have created a self-signed certificate, you might get errors from the Nomad CLI complaining about an invalid certificate (`x509: certificate is..`). To fix this you have to integrate the custom root CA you used for signing your certificate appropriately into your system.
83 changes: 83 additions & 0 deletions examples/jobs/consulconnect/README.md
@@ -0,0 +1,83 @@
# Usage Example: Consul Connect - NOT WORKING

This folder provides an example of how to use Consul Connect in the cluster.
It is derived from this [tutorial](https://www.hashicorp.com/blog/consul-connect-integration-in-hashicorp-nomad/).

## !!! Errors !!!

Image: https://console.cloud.google.com/gcr/images/google-containers/GLOBAL/pause-amd64@sha256:163ac025575b775d1c0f9bf0bdd0f086883171eb475b5068e7defa4ca9e76516/details?tab=info

Local Pull Working:
- `docker pull gcr.io/google-containers/pause-amd64:3.0`

"""
failed to setup alloc:
pre-run hook "network" failed:
failed to create network for alloc:
Failed to pull `gcr.io/google_containers/pause-amd64:3.0`:
API error (500):
Get https://gcr.io/v2/: -> Google Container Registry
net/http:
request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
"""

### State of Investigation

Reason currently unclear. Further debugging necessary.

Checked:
- docker.config: the ECR credential helper is limited to Amazon registries,
  -> all other Docker registries should be supported as well.
- Security Group Configuration: outbound traffic for nodes in the private datacenter is fully open
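
One possible next debugging step, sketched under assumptions: probe the registry endpoint from the error message directly on an affected node, with a hard timeout. The function names `probe_registry` and `describe_probe` are hypothetical, and the sketch assumes `curl` is installed on the node. It does not go through the Docker daemon, so a successful probe only rules out basic connectivity problems.

```shell
#!/usr/bin/env bash
# Sketch: probe a registry endpoint from a Nomad client node with a hard timeout.
# Assumption: curl is installed on the node.
probe_registry() {
  local url="${1:-https://gcr.io/v2/}" timeout="${2:-10}"
  # curl prints 000 when no HTTP response was received (DNS failure, timeout, ...).
  curl --silent --output /dev/null --max-time "$timeout" --write-out '%{http_code}' "$url"
}

# Interprets the status code printed by probe_registry.
describe_probe() {
  case "$1" in
    000) echo "unreachable" ;;
    *)   echo "reachable (HTTP $1)" ;;
  esac
}
```

Running `describe_probe "$(probe_registry)"` on a node reports whether any HTTP response came back at all. Any status code, including an authentication error, means the connection itself works.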

## Usage

This usage example was tested with `examples/root-example/README.md`.

- `nomad run api_service.nomad`
- `nomad run dashboard_service.nomad`

- `curl ???`

### Prerequisites

- consul >=1.6
- CNI plugins installed on the instance
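
A minimal sketch for checking the second prerequisite on an instance. It assumes the plugins are installed in `/opt/cni/bin`, which is Nomad's default `cni_path`; the AMI in this repository may use a different location. The function name `has_cni_bridge` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: check that the CNI "bridge" plugin binary is present and executable.
# Assumption: plugins are installed in /opt/cni/bin (Nomad's default `cni_path`);
# pass another directory if the AMI installs them elsewhere.
has_cni_bridge() {
  local cni_dir="${1:-/opt/cni/bin}"
  [ -x "${cni_dir}/bridge" ]
}
```

Usage on a node: `has_cni_bridge && echo "bridge plugin found"`. The Consul version can be checked separately with `consul version`.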

### Limitations

- [Consul Connect Native](https://www.consul.io/docs/connect/native.html) is not yet supported.
  - -> Integrating Connect into the application without a sidecar is not possible.
- Consul Connect HTTP and gRPC checks are not yet supported.
  - -> No [health check](https://www.consul.io/docs/agent/checks.html) support.
  - -> __Fabio usage unclear__.
- [Consul ACLs](https://learn.hashicorp.com/consul/security-networking/production-acls) are not yet supported.
  - -> No additional access management, only network separation.
- __Variable interpolation for group services and checks is not yet supported.__ ???

## Background

- envoy via [Docker](https://hub.docker.com/r/envoyproxy/envoy)

> HashiCorp:
> Prior to Nomad 0.10, Nomad’s networking model ran all applications in _host networking mode_.
> Applications running on the same host could communicate with each other over localhost!
>
> Nomad 0.10 introduces network namespace support.
> This is a new network model within Nomad
> where task groups are a single network endpoint and
> share a network namespace.

### Job Specification

- `connect`
- `sidecar_service`

#### network stanza - New Networking Modes

- _none_
  - isolated network without any network interfaces
- _bridge_
  - isolated network namespace with an interface that is bridged with the host
- _host_
  - joins the host network namespace; no shared network namespace is created
  - _This matches the behavior in Nomad 0.9_
63 changes: 63 additions & 0 deletions examples/jobs/consulconnect/connect.nomad
@@ -0,0 +1,63 @@
job "countdash" {
  datacenters = ["private-services"]

  group "api" {
    network {
      mode = "bridge"
    }

    service {
      name = "count-api"
      port = "9001"

      connect {
        sidecar_service {}
      }
    }

    task "web" {
      driver = "docker"
      config {
        image = "hashicorpnomad/counter-api:v1"
      }
    }
  }

  group "dashboard" {
    network {
      mode = "bridge"
      port "http" {
        static = 9002
        to = 9002
      }
    }

    service {
      name = "count-dashboard"
      port = "9002"

      tags = ["urlprefix-/dashboard"] # fabio

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "count-api"
              local_bind_port = 8080
            }
          }
        }
      }
    }

    task "dashboard" {
      driver = "docker"
      env {
        COUNTING_SERVICE_URL = "http://${NOMAD_UPSTREAM_ADDR_count_api}"
      }
      config {
        image = "hashicorpnomad/counter-dashboard:v1"
      }
    }
  }
}
4 changes: 2 additions & 2 deletions examples/jobs/fabio.nomad
@@ -10,11 +10,11 @@ job "fabio" {
task "fabio" {
driver = "exec" # Linux only!
config {
- command = "fabio-1.5.10-go1.11.1-linux_amd64"
+ command = "fabio-1.5.13-go1.13.4-linux_amd64"
}

artifact {
- source = "https://github.com/fabiolb/fabio/releases/download/v1.5.10/fabio-1.5.10-go1.11.1-linux_amd64"
+ source = "https://github.com/fabiolb/fabio/releases/download/v1.5.13/fabio-1.5.13-go1.13.4-linux_amd64"
}

resources {
11 changes: 0 additions & 11 deletions modules/ami/setup_nomad_consul.sh

This file was deleted.

4 changes: 4 additions & 0 deletions modules/ami2/Makefile
@@ -13,6 +13,7 @@ aws_region:=us-east-1
.PHONY: ami
.PHONY: nomad-consul-docker-ecr.json
.PHONY: nomad-consul-docker.json
.PHONY: nomad-consul-docker-ecr-cni.json

ami: nomad-consul-docker-ecr.json

@@ -21,3 +22,6 @@ nomad-consul-docker-ecr.json:

nomad-consul-docker.json:
	packer build -var aws_region=$(aws_region) nomad-consul-docker.json

nomad-consul-docker-ecr-cni.json:
	packer build -var aws_region=$(aws_region) nomad-consul-docker-ecr-cni.json
66 changes: 3 additions & 63 deletions modules/ami2/install-nomad/install-nomad
@@ -13,9 +13,6 @@ readonly DEFAULT_NOMAD_USER="nomad"
readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
readonly SYSTEM_BIN_DIR="/usr/local/bin"

readonly SUPERVISOR_DIR="/etc/supervisor"
readonly SUPERVISOR_CONF_DIR="$SUPERVISOR_DIR/conf.d"

readonly SCRIPT_NAME="$(basename "$0")"

function print_usage {
@@ -68,71 +65,16 @@ function assert_not_empty {
fi
}

# Install steps are based on: http://unix.stackexchange.com/a/291098/215969
function install_supervisord_debian {
sudo apt-get install -y supervisor
sudo update-rc.d supervisor defaults

create_supervisor_config
sudo systemctl enable supervisor
}

function install_systemd_nomad_amazon_linux2 {
log_info "Installing systemd nomad.service"
# TODO: Add check for systemd?

# Moving service description script to the system
sudo cp "$SCRIPT_DIR/nomad.service" "/etc/systemd/system/"
# NOTE: Nomad is not yet configured - hence it makes no sense to try to start it!
sudo systemctl enable nomad.service
#sudo systemctl start nomad.service
#sudo systemctl status nomad.service

# FIXME: Adapt run-nomad script to reflect the configuration change
# Remove supervisord configration and adjust configuration file loaded in nomad.service
}


# Install steps are based on: http://stackoverflow.com/a/31576473/483528
function install_supervisord_amazon_linux2 {
echo "[DEBUG] [install-nomad] - supervisor - installing supervisor"
sudo pip install supervisor

echo "[DEBUG] [install-nomad] - supervisor - setup path"

sudo pip install supervisor
# On Amazon Linux, /usr/local/bin is not in PATH for the root user, so we add symlinks to /usr/bin, which is in PATH
if [[ ! -f "/usr/bin/supervisorctl" ]]; then
sudo ln -s /usr/local/bin/supervisorctl /usr/bin/supervisorctl
fi
if [[ ! -f "/usr/bin/supervisord" ]]; then
sudo ln -s /usr/local/bin/supervisord /usr/bin/supervisord
fi

echo "[DEBUG] [install-nomad] - supervisor - setup systemctl"
ls -l "/etc/systemd/"
sudo cp "$SCRIPT_DIR/supervisor.service" "/etc/systemd/system/"
echo "[DEBUG] [install-nomad] - supervisor - setup systemctl - enable"
sudo systemctl enable supervisor.service


echo "[DEBUG] [install-nomad] - supervisor - create supervisor config"
create_supervisor_config
#sudo chkconfig --add supervisor
#sudo chkconfig supervisor on

#echo "[DEBUG] [install-nomad] - supervisor - systemctl"
sudo systemctl start supervisor.service
# FIXME: DEBUG information
sudo systemctl status supervisor.service

#echo "[DEBUG] [install-nomad] - supervisor - journald"
#sudo journalctrl -xe
}

function create_supervisor_config {
sudo mkdir -p "$SUPERVISOR_CONF_DIR"
sudo cp "$SCRIPT_DIR/supervisord.conf" "$SUPERVISOR_DIR/supervisord.conf"
# NOTE: Nomad is not yet configured - hence it makes no sense to try to start it!
# DEBUG: sudo systemctl start nomad.service
# DEBUG: sudo systemctl status nomad.service
}

function has_yum {
@@ -149,11 +91,9 @@ function install_dependencies {
if $(has_apt_get); then
sudo apt-get update -y
sudo apt-get install -y awscli curl unzip jq
install_supervisord_debian
elif $(has_yum); then
# Amazon Linux 2 comes without pip
sudo yum install -y aws curl unzip jq python2-pip
install_supervisord_amazon_linux2
install_systemd_nomad_amazon_linux2
else
log_error "Could not find apt-get or yum. Cannot install dependencies on this OS."