feat(new source): Initial kubernetes_logs implementation #2653

Merged
merged 118 commits into from
Jul 22, 2020
Commits
1356830
Add kubernetes-integration-tests feature
MOZGIII May 7, 2020
264128e
Enable kubernetes tests
MOZGIII May 7, 2020
18a0e55
Add skaffold for quick development iterations
MOZGIII May 21, 2020
4bc782a
Add kubernetes mod and cargo feature
MOZGIII May 21, 2020
ebc5663
Add an HTTP client tailored for k8s API in an in-cluster environment
MOZGIII May 21, 2020
6709988
Add a decoder for chained k8s responses
MOZGIII May 21, 2020
15a7297
Add block_on_std to test_util
MOZGIII May 22, 2020
bc7a275
Add tools for processing HTTP bodies as streams of k8s responses
MOZGIII May 21, 2020
e8127b2
Add Watcher trait
MOZGIII May 21, 2020
c45dbd5
Add ApiWatcher implementation
MOZGIII May 21, 2020
f02bcb9
Add MockWatcher implementation
MOZGIII May 21, 2020
0fd88b7
Add a reflector implementation
MOZGIII May 21, 2020
1b472f9
Add a placeholder for kubernetes logs source
MOZGIII May 21, 2020
f4f23f9
Add paths provider implementation based on pods state driven by refle…
MOZGIII May 21, 2020
d0fbebb
Add parser for k8s log formats
MOZGIII May 21, 2020
0eabab2
Add partial events merger
MOZGIII May 22, 2020
66bcd79
Add pod metadata annotator
MOZGIII May 21, 2020
e46acdf
Add kubernetes logs source implementation
MOZGIII May 21, 2020
9c49908
Change reflector errors to be less restrictive
MOZGIII May 25, 2020
cccb735
Better error handling
MOZGIII May 25, 2020
bb81580
Improve error handling
MOZGIII May 25, 2020
ad8f2f8
Better error handling for event pipeline
MOZGIII May 25, 2020
34f7731
Abstract API watcher around watch request builder
MOZGIII May 25, 2020
d584a12
Add assertions to the reflector internal state in-between the invocat…
MOZGIII May 25, 2020
36919fc
Fix the comment at optional tranform mod
MOZGIII May 25, 2020
9636f3f
Add a to do comment to make transform utils part of core
MOZGIII May 25, 2020
366b8ca
Fix typo at kustomization.yaml
MOZGIII May 25, 2020
cc4a1a8
More flexible interface at resource version
MOZGIII May 26, 2020
234b1e3
Fix the mod comment at resource version
MOZGIII May 26, 2020
7c8358d
Remove manual minikube init from scripts/skaffold.sh
MOZGIII May 26, 2020
ca0d979
Specify shell at scripts/minikube-docker-env.sh
MOZGIII May 26, 2020
8d41c33
Switch scripts/copy-docker-image-to-minikube.sh to use scripts/miniku…
MOZGIII May 26, 2020
d21279a
Switch to using device inodes for file fingerprinting
MOZGIII May 28, 2020
24d5604
Fix grammar at in-cluster config
MOZGIII Jun 1, 2020
5ed884c
Add the variable that's missing to the error at in-cluster config
MOZGIII Jun 1, 2020
1c95b30
Move in_cluster mod declaration to the top of the file
MOZGIII Jun 1, 2020
363ef53
Cut some ununsed code from src/kubernetes/resource_version.rs
MOZGIII Jun 1, 2020
3035739
Fix typo at src/kubernetes/reflector.rs
MOZGIII Jun 1, 2020
9ae9034
Move the file key to the top of the file
MOZGIII Jun 1, 2020
19e31e1
Fix comment at parsers test
MOZGIII Jun 1, 2020
f0a9bfe
Add a comment for Config::self_node_name
MOZGIII Jun 1, 2020
dc68cf2
Allow disabling automatic partial merge
MOZGIII Jun 1, 2020
6dd5799
Allow customizing the fields names used by pod metadata annotator
MOZGIII Jun 1, 2020
be7adac
Abstract reflector around state
MOZGIII Jun 2, 2020
685f0f6
Add support for delayed delete at reflector
MOZGIII Jun 2, 2020
2b57f5e
Reimplement the tests to add delayed deletion testing capability
MOZGIII Jun 4, 2020
def5d4a
Improve the request preparation code at kubernetes::client::Client
MOZGIII Jun 5, 2020
0d70b38
Add reexports at src/kubernetes/mod.rs
MOZGIII Jun 5, 2020
242a243
Adjust and use watcher error constructors
MOZGIII Jun 5, 2020
c1ea628
Eliminate unused transform
MOZGIII Jun 5, 2020
1be23b5
Add a test for stream error behavior
MOZGIII Jun 5, 2020
3c16cfb
Add tests and derives at transform::util::pick
MOZGIII Jun 5, 2020
a8ad244
Require Watcher::Stream to be Send
MOZGIII Jun 8, 2020
145f0b0
Add instrumentation
MOZGIII Jun 8, 2020
d868ac5
Add state maintenance and move delayed delete logic into to a state w…
MOZGIII Jun 9, 2020
5be1a81
Ignore instrumenting state tests
MOZGIII Jun 9, 2020
1cc6efc
Clean up reflector tests
MOZGIII Jun 10, 2020
3b86b27
Add flush debouncing to the evmap state
MOZGIII Jun 10, 2020
bbe3fc9
Proper delay control at main test flow
MOZGIII Jun 10, 2020
f61c2ec
Add evmap tests with and without debounce
MOZGIII Jun 10, 2020
bab7767
Fix a typo
MOZGIII Jun 11, 2020
38da9ce
Document the controversial join_host_port
MOZGIII Jun 11, 2020
b4b5495
Improve instrumenting watcher events
MOZGIII Jun 12, 2020
473bb12
Improve api watcher events
MOZGIII Jun 12, 2020
484e4ee
Rename init to new
MOZGIII Jun 12, 2020
851dad6
Use Strings at parser tests
MOZGIII Jun 12, 2020
e2d7542
Corrected partial message detection hueristics at docker parser
MOZGIII Jun 12, 2020
8409d38
Hint for where's what at parser test case
MOZGIII Jun 12, 2020
3111ac4
Bump base image at skaffold/docker/Dockerfile to debian:bullseye-slim
MOZGIII Jun 15, 2020
22839e4
Better script layout at skaffold/docker/Dockerfile
MOZGIII Jun 15, 2020
dfaed5e
Convert an annotation failure warn log to internal event
MOZGIII Jun 15, 2020
0cdae0a
Correct the shutdown logic
MOZGIII Jun 15, 2020
1115c0d
Specify STOPSIGNAL at skaffold/docker/Dockerfile
MOZGIII Jun 15, 2020
ca03bb4
Set terminationGracePeriodSeconds at distribution/kubernetes/vector-n…
MOZGIII Jun 15, 2020
741d0bb
Fix paths generation on Windows
MOZGIII Jun 15, 2020
d11ef18
Add a to do to unignore instrumenting state tests
MOZGIII Jun 15, 2020
53d6958
Add Kubernetes section to the CONTRIBUTING.md
MOZGIII Jun 15, 2020
8170521
Solve the issue with the config generation
MOZGIII Jun 15, 2020
a6a515a
Better document the intent for the kubernetes_logs source
MOZGIII Jun 15, 2020
a18720c
Print vector version upon docker build at skaffold/docker/Dockerfile
MOZGIII Jun 16, 2020
c0327c3
Add more build-time output at skaffold/docker/Dockerfile
MOZGIII Jun 16, 2020
1e44894
Add patchelf at skaffold/docker/Dockerfile
MOZGIII Jun 16, 2020
bed7aa4
Optimize commands order at skaffold/docker/Dockerfile for layer caching
MOZGIII Jun 16, 2020
fe4b2c5
Add kubectl at CONTRIBUTING.md
MOZGIII Jun 17, 2020
f5dc48f
Add additional details at CONTRIBUTING.md
MOZGIII Jun 18, 2020
7c7fbbb
Fix the data dir description
MOZGIII Jun 19, 2020
2032d99
Move transform utils to the source mod to avoid introducing global items
MOZGIII Jun 22, 2020
8d12cf2
Fix the typo at CONTRIBUTING.md
MOZGIII Jun 22, 2020
a74d976
Eliminate the ApiWatcher::invoke_boxed_stream
MOZGIII Jun 22, 2020
9b80bb9
Add code docs for join_host_port
MOZGIII Jun 22, 2020
4052c8e
Add a lifecycle system to manage futures sanely and reliably
MOZGIII Jun 23, 2020
6840d39
Reorganized the mod and use clauses at src/sources/kubernetes_logs/mo…
MOZGIII Jun 23, 2020
43a0735
Switch fingerprinter to checksum
MOZGIII Jun 23, 2020
d01dd88
Revert "Switch fingerprinter to checksum"
MOZGIII Jun 23, 2020
343193e
Invert the condition at Debounce::signal
MOZGIII Jun 23, 2020
3c6046f
Fix typo at src/internal_events/kubernetes/instrumenting_state.rs
MOZGIII Jun 25, 2020
182469b
Remove the rate limit from KubernetesLogsEventReceived
MOZGIII Jun 25, 2020
17d3f8f
Adjust the log style at KubernetesLogsEventReceived
MOZGIII Jun 25, 2020
36679b0
Move pod metadata annotation to access file path directly
MOZGIII Jun 25, 2020
4809994
Add test to ensure MultiResponseDecoder doesn't leak memory
MOZGIII Jun 25, 2020
f0f2e36
Simplified the picker logic and added tests
MOZGIII Jun 27, 2020
a839e3c
Add a special case for `\n` at the MultiResponseDecoder::finish
MOZGIII Jun 27, 2020
91b7757
Workaround for watch API failures
MOZGIII Jun 27, 2020
7d2559e
Add a long line test case for the CRI format parser
MOZGIII Jun 27, 2020
70471dc
Log the buffer state at response decoding error
MOZGIII Jun 28, 2020
487b66d
Handle data parsing errors as stream ends
MOZGIII Jun 28, 2020
295f488
Ensure we only read logs under container name subdirectory
MOZGIII Jun 29, 2020
c651606
Promote trace to an error at src/kubernetes/multi_response_decoder.rs
MOZGIII Jun 29, 2020
8b7b221
Add a test case for bookmark parsing error
MOZGIII Jun 29, 2020
968e311
Remove meaningless leading \n from the boorkmark test
MOZGIII Jun 29, 2020
a64eae3
Update k8s-openapi to a branch with a fix for the bookmark parsing issue
MOZGIII Jun 29, 2020
670e46c
Switch kubernetes_logs source to Fingerprinter::FirstLineChecksum
MOZGIII Jun 29, 2020
3fd9d5b
Revert "Handle data parsing errors as stream ends"
MOZGIII Jul 1, 2020
f5b961a
Switch k8s-openapi git repo to our fork
MOZGIII Jul 1, 2020
8fd1fb3
Use cargo patch instead of per-crate git spec for k8s-openapi
MOZGIII Jul 1, 2020
2e4a7e1
Switch to the upstream of k8s-openapi
MOZGIII Jul 7, 2020
4a05afc
Fix clippy offences
MOZGIII Jul 14, 2020
86233a7
Bump k8s-openapi and switch to crates.io version
MOZGIII Jul 19, 2020
80 changes: 80 additions & 0 deletions CONTRIBUTING.md
@@ -36,6 +36,7 @@ expanding into more specifics.
1. [Tips and Tricks](#tips-and-tricks)
1. [Benchmarking](#benchmarking)
1. [Profiling](#profiling)
1. [Kubernetes](#kubernetes)
1. [Humans](#humans)
1. [Documentation](#documentation)
1. [Changelog](#changelog)
@@ -547,6 +548,85 @@ cat stacks.folded | inferno-flamegraph > flamegraph.svg
And that's it! You now have a flamegraph SVG file that can be opened and
navigated in your favorite web browser.

### Kubernetes

There is a special flow for when you develop portions of Vector that are
designed to work with Kubernetes, like the `kubernetes_logs` source or the
`distribution/kubernetes/*.yaml` configs.

This flow facilitates building Vector and deploying it into a cluster.

#### Requirements

There are some extra requirements besides what you'd normally need to work on
Vector:

* `linux` system (create an issue if you want to work with another OS and we'll
help)
* [`skaffold`](https://skaffold.dev/)
* [`docker`](https://www.docker.com/)
* [`kubectl`](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
* [`kustomize`](https://kustomize.io/)
* [`minikube`](https://minikube.sigs.k8s.io/)-powered or other k8s cluster
* [`cargo watch`](https://github.com/passcod/cargo-watch)
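
A quick way to confirm that the tools listed above are installed is to check that each one resolves on your `PATH`. The following is a minimal sketch, not a script from the repo: `missing_tools` is a hypothetical helper, and it assumes `cargo watch` is provided by a `cargo-watch` binary.

```shell
#!/bin/sh
set -eu

# Print each of the given commands that is not found on PATH, one per line.
# `missing_tools` is a hypothetical helper, not part of the Vector repo.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
    fi
  done
}

# Tool names taken from the requirements list; `cargo-watch` is assumed to be
# the binary behind the `cargo watch` subcommand.
missing="$(missing_tools skaffold docker kubectl kustomize minikube cargo-watch)"
if [ -n "$missing" ]; then
  # Left unquoted on purpose so the names are printed on a single line.
  echo "Missing tools:" $missing >&2
else
  echo "All required tools found."
fi
```

The check only verifies that the commands exist; it does not verify versions or that a cluster is reachable.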

#### The dev flow

Once you have the requirements, use the `scripts/skaffold.sh dev` command.

That's it, just one command should take care of everything!

It will

1. build the `vector` binary in development mode,
2. build a docker image from this binary via `skaffold/docker/Dockerfile`,
3. deploy `vector` into the Kubernetes cluster at your current kubectl context
using the built docker image and a mix of our production deployment
configuration from the `distribution/kubernetes/*.yaml` and the special
dev-flow configuration at `skaffold/manifests/*.yaml`; see
`kustomization.yaml` for the exact specification.

As a result of invoking `scripts/skaffold.sh dev`, you should see
a `skaffold` process running on your local machine, printing the logs from the
deployed `vector` instance.

To stop the process, press `Ctrl+C`, and wait for `skaffold` to clean up
the cluster state and exit.

`scripts/skaffold.sh` wraps `skaffold`, so you can use other `skaffold`
subcommands if they suit you better.

#### Troubleshooting

You might need to tweak `skaffold`; here are some hints:

* `skaffold` will try to detect whether a local cluster is used; if a local
cluster is used, `skaffold` won't push the docker images it builds to a
registry.
See [this page](https://skaffold.dev/docs/environment/local-cluster/)
for how you can troubleshoot and tweak this behavior.

* `skaffold` can rewrite the image name so that you don't try to push a docker
image to a repo that you don't have access to.
See [this page](https://skaffold.dev/docs/environment/image-registries/)
for more info.

* For the rest of the `skaffold` tweaks you might want to apply, check out
[this page](https://skaffold.dev/docs/environment/).

#### Going through the dev flow manually

In some cases `skaffold` may not work. It's possible to go through the dev flow
manually, without `skaffold`.

One of the important things `skaffold` does is patch the configuration to
tie things together. If you want to go without it, you'll have to take care of
that yourself, so some additional knowledge of Kubernetes inner workings is
required.

Essentially, the steps you have to take to deploy manually are the same ones
that `skaffold` performs, and they're outlined in the dev flow section above.
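
As a rough illustration, the three dev flow steps could be driven by hand along these lines. This is a sketch under assumptions, not the commands `skaffold` actually runs: the image name `vector-dev:local`, the `.` build context, and the `run` and `manual_dev_flow` helpers are all hypothetical. By default it only prints the commands.

```shell
#!/bin/sh
set -eu

# Hypothetical image name; skaffold normally generates and patches this.
IMAGE="${IMAGE:-vector-dev:local}"
# Print commands by default; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Either print the command (dry run) or execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

manual_dev_flow() {
  # 1. Build the vector binary in development mode.
  run cargo build
  # 2. Build a docker image from this binary (build context is an assumption).
  run docker build -f skaffold/docker/Dockerfile -t "$IMAGE" .
  # 3. Deploy using the kustomization.yaml at the repo root.
  run kubectl apply -k .
}

manual_dev_flow
```

Run it as-is to preview the commands, or set `DRY_RUN=0` to execute them. You will likely still need to point the manifests at the image you built, which is part of the patching `skaffold` normally does for you.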

## Humans

After making your change, you'll want to prepare it for Vector's users
86 changes: 71 additions & 15 deletions Cargo.lock

Some generated files are not rendered by default.

9 changes: 9 additions & 0 deletions Cargo.toml
@@ -142,6 +142,7 @@ strip-ansi-escapes = { version = "0.1.0"}
colored = "1.9"
warp = { package = "warp", version = "0.2", default-features = false, optional = true }
evmap = { version = "7", features = ["bytes"], optional = true }
evmap10 = { package = "evmap", version = "10", features = ["bytes"], optional = true }
logfmt = { version = "0.0.2", optional = true }
notify = "4.0.14"
once_cell = "1.3"
@@ -152,6 +153,7 @@ pulsar = { version = "1.0.0", default-features = false, features = ["tokio-runti
task-compat = "0.1"
cidr-utils = "0.4.2"
pin-project = "0.4.22"
k8s-openapi = { version = "0.9", features = ["v1_15"], optional = true }

# For WASM
vector-wasm = { path = "lib/vector-wasm", optional = true }
@@ -228,6 +230,10 @@ leveldb-cmake = ["leveldb", "leveldb/leveldb-sys-3"]
wasm = ["lucetc", "lucet-runtime", "lucet-wasi", "vector-wasm", "anyhow"]
wasm-timings = ["wasm"]

# Enables kubernetes dependencies and shared code. Kubernetes-related sources,
# transforms and sinks should depend on this feature.
kubernetes = ["k8s-openapi", "evmap10"]

# Sources
sources = [
"sources-docker",
@@ -246,6 +252,7 @@ sources = [
"sources-syslog",
"sources-tls",
"sources-vector",
"sources-kubernetes-logs",
]
sources-docker = ["bollard"]
sources-file = ["bytesize"]
@@ -263,6 +270,7 @@ sources-stdin = ["bytesize"]
sources-syslog = ["sources-socket", "syslog_loose"]
sources-tls = ["sources-http", "sources-logplex", "sources-socket", "sources-splunk_hec"]
sources-vector = ["sources-socket"]
sources-kubernetes-logs = ["kubernetes", "transforms-merge", "transforms-json_parser", "transforms-regex_parser"]

# Transforms
transforms = [
@@ -425,6 +433,7 @@ kafka-integration-tests = ["sources-kafka", "sinks-kafka"]
loki-integration-tests = ["sinks-loki"]
pulsar-integration-tests = ["sinks-pulsar"]
splunk-integration-tests = ["sinks-splunk_hec", "warp"]
kubernetes-integration-tests = ["sources-kubernetes-logs"]

shutdown-tests = ["sources","sinks-console","sinks-prometheus","sinks-blackhole","unix","rdkafka","transforms-log_to_metric","transforms-lua"]
disable-resolv-conf = []
9 changes: 5 additions & 4 deletions distribution/kubernetes/vector-namespaced.yaml
@@ -7,12 +7,12 @@ data:
# Configuration for vector.
# Docs: https://vector.dev/docs/

# Configure the controlled by the deployment.
# Data dir is location controlled at the `DaemonSet`.
data_dir = "/vector-data-dir"

# Ingest logs from Kubernetes.
[sources.kubernetes]
type = "kubernetes"
[sources.kubernetes_logs]
type = "kubernetes_logs"
---
apiVersion: apps/v1
kind: DaemonSet
@@ -28,11 +28,11 @@ spec:
metadata:
labels:
name: vector
vector.dev/exclude: "true"
spec:
containers:
- name: vector
image: timberio/vector:latest-alpine
imagePullPolicy: Always
args:
- --config
- /etc/vector/*.toml
@@ -61,6 +61,7 @@ spec:
- name: config-dir
mountPath: /etc/vector/
readOnly: true
terminationGracePeriodSeconds: 60
tolerations:
# This toleration is to have the daemonset runnable on master nodes.
# Remove it if your masters can't run pods.
10 changes: 10 additions & 0 deletions kustomization.yaml
@@ -0,0 +1,10 @@
# This is a part of our skaffold setup for development.
# Do not use in production.

namespace: vector

resources:
- distribution/kubernetes/vector-global.yaml
- skaffold/manifests/namespace.yaml
- skaffold/manifests/config.yaml
- distribution/kubernetes/vector-namespaced.yaml
3 changes: 2 additions & 1 deletion scripts/copy-docker-image-to-minikube.sh
@@ -27,7 +27,8 @@ docker save "${IMAGES[@]}" | gzip >"$IMAGES_ARCHIVE"
# Start a subshell to preserve the env state.
(
# Switch to minikube docker.
eval "$(minikube --shell bash docker-env)"
# shellcheck source=minikube-docker-env.sh disable=SC1091
. scripts/minikube-docker-env.sh

# Load images.
docker load -i "$IMAGES_ARCHIVE"
9 changes: 9 additions & 0 deletions scripts/minikube-docker-env.sh
@@ -0,0 +1,9 @@
#!/usr/bin/env bash
set -euo pipefail

if ! COMMANDS="$(minikube --shell bash docker-env)"; then
echo "Unable to obtain docker env from minikube; is minikube started?" >&2
exit 7
fi

eval "$COMMANDS"
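
The explicit check in this script matters because of how `eval` interacts with a failed command substitution: `eval "$(failing-command)"` evaluates an empty string and succeeds, so `set -e` alone would not stop the caller. The sketch below shows the same capture-then-eval pattern in isolation. `load_env_from` and `fake_docker_env` are hypothetical names used for illustration; `fake_docker_env` stands in for `minikube --shell bash docker-env`, and the address is a documentation placeholder.

```shell
#!/bin/sh
set -eu

# Evaluate the shell code printed by a command, but only if that command
# succeeded. `load_env_from` is a hypothetical helper, not part of the repo.
load_env_from() {
  if ! _commands="$("$@")"; then
    echo "Unable to obtain env from: $*" >&2
    return 7
  fi
  eval "$_commands"
}

# Stand-in for `minikube --shell bash docker-env`.
fake_docker_env() {
  echo 'export DOCKER_HOST="tcp://192.0.2.1:2376"'
}

load_env_from fake_docker_env
echo "DOCKER_HOST=$DOCKER_HOST"
```

Capturing the output first keeps the failure visible: the producing command's exit status is checked before anything is evaluated.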