From b13ce7dc10ca3cef9951a9f6f987fb98256fbdc6 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Mon, 20 Jan 2025 07:33:36 -0500 Subject: [PATCH 01/31] WIP --- docs/src/user-guide/guides-overview.md | 25 +++++++++++++++++++ .../user-guide/guides-using-object-storage.md | 20 +++++++++++++++ 2 files changed, 45 insertions(+) create mode 100644 docs/src/user-guide/guides-overview.md create mode 100644 docs/src/user-guide/guides-using-object-storage.md diff --git a/docs/src/user-guide/guides-overview.md b/docs/src/user-guide/guides-overview.md new file mode 100644 index 000000000..e055a8df6 --- /dev/null +++ b/docs/src/user-guide/guides-overview.md @@ -0,0 +1,25 @@ +# Guides + +The guides below describe how to use CLP in a variety of use cases. + +::::{grid} 1 1 2 2 +:gutter: 2 + +:::{grid-item-card} +:link: guides-using-object-storage +Using object storage +^^^ +Using CLP to ingest logs from object storage and store archives on object storage. +::: +:::: + +:::{toctree} +:hidden: +:caption: Core +:glob: + +core-overview +core-container +core-clp-s +core-unstructured/index +::: diff --git a/docs/src/user-guide/guides-using-object-storage.md b/docs/src/user-guide/guides-using-object-storage.md new file mode 100644 index 000000000..50ee0eda9 --- /dev/null +++ b/docs/src/user-guide/guides-using-object-storage.md @@ -0,0 +1,20 @@ +# Using object storage + +CLP can both ingest logs from object storage (e.g., S3) and store archives on object storage. This +guide explains how to configure CLP for both use cases. + +:::{note} +Currently, only the [clp-json][release-choices] release supports object storage. Support for clp-text will be added +in a future release. +::: + +:::{note} +Currently, CLP only supports using S3 as object storage. Support for other object storage services +will be added in a future release. 
+::: + +# Ingesting logs from object storage + +# Storing archives on object storage + +[release-choices]: http://localhost:8080/user-guide/quick-start-cluster-setup/index.html#choosing-a-release From 72398d8a31382ce4fb6a39360e7ceffc3897486e Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Mon, 20 Jan 2025 07:35:06 -0500 Subject: [PATCH 02/31] WIP --- docs/src/user-guide/index.md | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/docs/src/user-guide/index.md b/docs/src/user-guide/index.md index a38b606b3..46efcc8b7 100644 --- a/docs/src/user-guide/index.md +++ b/docs/src/user-guide/index.md @@ -15,6 +15,13 @@ Quick start A quick start guide for setting up a CLP cluster, compressing your logs, and searching them. ::: +:::{grid-item-card} +:link: guides-overview +Guides +^^^ +Guides for using CLP in a variety of use cases. +::: + :::{grid-item-card} :link: core-overview Core @@ -47,6 +54,15 @@ quick-start-compression/index quick-start-search/index ::: +:::{toctree} +:hidden: +:caption: Guides +:glob: + +guides-overview +guides-using-object-storage +::: + :::{toctree} :hidden: :caption: Core From 9f620b8887dfd06818eed7fc684180f15869431d Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Mon, 20 Jan 2025 07:54:51 -0500 Subject: [PATCH 03/31] WIP --- docs/src/user-guide/guides-overview.md | 13 +------------ docs/src/user-guide/guides-using-object-storage.md | 13 +++++++++++-- docs/src/user-guide/quick-start-compression/json.md | 8 +++++++- docs/src/user-guide/quick-start-compression/text.md | 2 +- 4 files changed, 20 insertions(+), 16 deletions(-) diff --git a/docs/src/user-guide/guides-overview.md b/docs/src/user-guide/guides-overview.md index e055a8df6..c9800fbf7 100644 --- a/docs/src/user-guide/guides-overview.md +++ b/docs/src/user-guide/guides-overview.md @@ -1,4 +1,4 @@ -# Guides +# Overview The guides below describe how to use CLP in a variety of 
use cases. @@ -12,14 +12,3 @@ Using object storage Using CLP to ingest logs from object storage and store archives on object storage. ::: :::: - -:::{toctree} -:hidden: -:caption: Core -:glob: - -core-overview -core-container -core-clp-s -core-unstructured/index -::: diff --git a/docs/src/user-guide/guides-using-object-storage.md b/docs/src/user-guide/guides-using-object-storage.md index 50ee0eda9..6d9c112bb 100644 --- a/docs/src/user-guide/guides-using-object-storage.md +++ b/docs/src/user-guide/guides-using-object-storage.md @@ -1,6 +1,6 @@ # Using object storage -CLP can both ingest logs from object storage (e.g., S3) and store archives on object storage. This +CLP can both compress logs from object storage (e.g., S3) and store archives on object storage. This guide explains how to configure CLP for both use cases. :::{note} @@ -13,7 +13,16 @@ Currently, CLP only supports using S3 as object storage. Support for other objec will be added in a future release. ::: -# Ingesting logs from object storage +# Compressing logs from object storage + +To ingest logs from S3, you can use the `s3` subcommand of the `compress.sh` script: + +```bash +sbin/compress.sh s3 s3:/// +``` + +* `` is the name of the S3 bucket containing your logs. +* `` is the path prefix of all logs you wish to compress. # Storing archives on object storage diff --git a/docs/src/user-guide/quick-start-compression/json.md b/docs/src/user-guide/quick-start-compression/json.md index 430363f83..272ce33bf 100644 --- a/docs/src/user-guide/quick-start-compression/json.md +++ b/docs/src/user-guide/quick-start-compression/json.md @@ -3,9 +3,15 @@ To compress JSON logs, from inside the package directory, run: ```bash -sbin/compress.sh --timestamp-key '' [ ...] +sbin/compress.sh fs --timestamp-key '' [ ...] ``` +* `fs` is a subcommand for compressing logs from the filesystem. + :::{tip} + To learn how to compress logs from object storage, see + [Using object storage](../guides-using-object-storage.md). 
+ ::: + * `` is the field path of the kv-pair that contains the timestamp in each log event. * E.g., if your log events look like `{"timestamp": {"iso8601": "2024-01-01 00:01:02.345", ...}}`, you should enter diff --git a/docs/src/user-guide/quick-start-compression/text.md b/docs/src/user-guide/quick-start-compression/text.md index 29e798b9d..18179a65b 100644 --- a/docs/src/user-guide/quick-start-compression/text.md +++ b/docs/src/user-guide/quick-start-compression/text.md @@ -3,7 +3,7 @@ To compress unstructured text logs, from inside the package directory, run: ```bash -sbin/compress.sh [ ...] +sbin/compress.sh fs [ ...] ``` `` are paths to unstructured text log files or directories containing such files. From 2bd546f959be1ef991b435629fcb0b976686b93b Mon Sep 17 00:00:00 2001 From: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Mon, 20 Jan 2025 12:00:42 -0500 Subject: [PATCH 04/31] Update guides-using-object-storage.md --- docs/src/user-guide/guides-using-object-storage.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/docs/src/user-guide/guides-using-object-storage.md b/docs/src/user-guide/guides-using-object-storage.md index 6d9c112bb..f28849275 100644 --- a/docs/src/user-guide/guides-using-object-storage.md +++ b/docs/src/user-guide/guides-using-object-storage.md @@ -15,7 +15,7 @@ will be added in a future release. # Compressing logs from object storage -To ingest logs from S3, you can use the `s3` subcommand of the `compress.sh` script: +To compress logs from S3, use the `s3` subcommand of the `compress.sh` script: ```bash sbin/compress.sh s3 s3:/// @@ -24,6 +24,12 @@ sbin/compress.sh s3 s3:/// * `` is the name of the S3 bucket containing your logs. * `` is the path prefix of all logs you wish to compress. +:::{note} +The `s3` subcommand only supports a single URL but will compress any logs that have the given path prefix. + +If you wish to compress a single log file, specify the entire path to the log file. 
However, there is no way, currently, to compress a single log file if that log file's path is a prefix of another log file. This limitation will be addressed in a future release. +::: + # Storing archives on object storage [release-choices]: http://localhost:8080/user-guide/quick-start-cluster-setup/index.html#choosing-a-release From a2986765d1053789b39a92f643219bc5807fb51f Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Mon, 20 Jan 2025 21:27:00 -0500 Subject: [PATCH 05/31] WIP --- .../user-guide/guides-using-object-storage.md | 74 +++++++++++++++++-- 1 file changed, 68 insertions(+), 6 deletions(-) diff --git a/docs/src/user-guide/guides-using-object-storage.md b/docs/src/user-guide/guides-using-object-storage.md index f28849275..35626332b 100644 --- a/docs/src/user-guide/guides-using-object-storage.md +++ b/docs/src/user-guide/guides-using-object-storage.md @@ -4,8 +4,8 @@ CLP can both compress logs from object storage (e.g., S3) and store archives on guide explains how to configure CLP for both use cases. :::{note} -Currently, only the [clp-json][release-choices] release supports object storage. Support for clp-text will be added -in a future release. +Currently, only the [clp-json][release-choices] release supports object storage. Support for +clp-text will be added in a future release. ::: :::{note} @@ -13,7 +13,7 @@ Currently, CLP only supports using S3 as object storage. Support for other objec will be added in a future release. ::: -# Compressing logs from object storage +## Compressing logs from object storage To compress logs from S3, use the `s3` subcommand of the `compress.sh` script: @@ -25,11 +25,73 @@ sbin/compress.sh s3 s3:/// * `` is the path prefix of all logs you wish to compress. :::{note} -The `s3` subcommand only supports a single URL but will compress any logs that have the given path prefix. 
+The `s3` subcommand only supports a single URL but will compress any logs that have the given path +prefix. -If you wish to compress a single log file, specify the entire path to the log file. However, there is no way, currently, to compress a single log file if that log file's path is a prefix of another log file. This limitation will be addressed in a future release. +If you wish to compress a single log file, specify the entire path to the log file. However, if that +log file's path is a prefix of another log file's path, then both log files will be compressed. This +limitation will be addressed in a future release. ::: -# Storing archives on object storage +## Storing archives on object storage +To configure CLP to store archives on S3, update the `archive_output.storage` key in +`/etc/clp-config.yml`: + +```yaml +archive_output: + storage: + type: "s3" + staging_directory: "var/data/staged-archives" # Or a path of your choosing + s3_config: + region: "" + bucket: "" + key-prefix: "" + credentials: + access_key_id: "" + secret_access_key: "" + + # archive_output's other config keys +``` + +* `s3_config` configures both the S3 bucket where archives should be stored as well as credentials + for accessing it. + * `` is the AWS region [code][aws-region-codes] for the bucket. + * `` is the bucket's name. + * `` is the "directory" where all archives will be stored within the bucket and + must end with `/`. + * `credentials` contains the S3 credentials necessary for accessing the bucket. + +To configure CLP to be able to view compressed log files from S3, you'll need to configure a bucket +where CLP can store intermediate files that the log viewer can open. 
To do so, update the +`stream_output.storage` key in `/etc/clp-config.yml`: + +```yaml +stream_output: + storage: + type: "s3" + staging_directory: "var/data/staged-streams" # Or a path of your choosing + s3_config: + region: "" + bucket: "" + key-prefix: "" + credentials: + access_key_id: "" + secret_access_key: "" + + # stream_output's other config keys +``` + +The configuration keys above function identically to those in `archive_output.storage`, except they +should be configured to use a different S3 path (i.e., a different key-prefix in the same bucket or +a different bucket entirely). + +:::{note} +To view compressed log files, clp-text currently converts them into IR streams that the log viewer +can open, while clp-json converts them into JSONL streams. These streams only need to be stored for +as long as the streams are being viewed in the viewer, however CLP currently doesn't explicitly +delete the streams. This limitation will be addressed in a future release. +::: + +[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability [release-choices]: http://localhost:8080/user-guide/quick-start-cluster-setup/index.html#choosing-a-release From 4ea2018d108e2357b7a1fd90542345a3aedbddc3 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Tue, 21 Jan 2025 08:02:47 -0500 Subject: [PATCH 06/31] Add details about configuring an IAM user; Fix config keys; General refactoring. 
--- .../user-guide/guides-using-object-storage.md | 121 +++++++++++++++--- 1 file changed, 101 insertions(+), 20 deletions(-) diff --git a/docs/src/user-guide/guides-using-object-storage.md b/docs/src/user-guide/guides-using-object-storage.md index 35626332b..70c150573 100644 --- a/docs/src/user-guide/guides-using-object-storage.md +++ b/docs/src/user-guide/guides-using-object-storage.md @@ -1,11 +1,16 @@ # Using object storage -CLP can both compress logs from object storage (e.g., S3) and store archives on object storage. This -guide explains how to configure CLP for both use cases. +CLP can: + +* [compress logs from object storage](#compressing-logs-from-object-storage) (e.g., S3); +* [store archives on object storage](#storing-archives-on-object-storage); and +* [view the compressed logs from object storage](#viewing-compressed-logs-from-object-storage). + +This guide explains how to configure CLP for all three use cases. :::{note} Currently, only the [clp-json][release-choices] release supports object storage. Support for -clp-text will be added in a future release. +`clp-text` will be added in a future release. ::: :::{note} @@ -15,17 +20,83 @@ will be added in a future release. ## Compressing logs from object storage -To compress logs from S3, use the `s3` subcommand of the `compress.sh` script: +To compress logs from S3, you'll need to: + +1. Set up an AWS IAM user that CLP can use to access the bucket containing your logs. +2. Use the `s3` subcommand of `sbin/compress.sh` to compress your logs. + +### Setting up an AWS IAM user + +To set up a user: + +1. Create a user by following [this guide][aws-create-iam-user]. + * If you already have a user to use for ingesting logs, you can skip this step. +2. Attach the following policy to the user by following [this guide][add-iam-policy]. 
+ + ```json + { + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": "s3:GetObject", + "Resource": [ + "arn:aws:s3::://*" + ] + }, + { + "Effect": "Allow", + "Action": [ + "s3:ListBucket" + ], + "Resource": [ + "arn:aws:s3:::" + ], + "Condition": { + "StringLike": { + "s3:prefix": "/*" + } + } + } + ] + } + ``` + + Replace the fields in angle brackets (`<>`) with the appropriate values: + * `` should be the name of the S3 bucket containing your logs. + * `` should be the prefix of all logs you wish to compress. + + :::{warning} + To follow the [principle of least privilege][least-privilege-principle], ensure the user doesn't + have other unnecessary permission policies attached. If the user does have other policies, + consider creating a new user with only the permission policy above. + ::: + +### Using `sbin/compress.sh s3` + +You can use the `s3` subcommand as follows: ```bash -sbin/compress.sh s3 s3:/// +sbin/compress.sh s3 --aws-credentials-file s3:/// ``` +* `` is the path to an AWS credentials file like the following: + + ```ini + [default] + aws_access_key_id = + aws_secret_access_key = + ``` + + * CLP expects the credentials to be in the `default` section. + * `` and `` are the access key ID and secret access + key of the IAM user you set up in the previous section. + * `` is the name of the S3 bucket containing your logs. -* `` is the path prefix of all logs you wish to compress. +* `` is the path prefix of all logs you wish to compress. :::{note} -The `s3` subcommand only supports a single URL but will compress any logs that have the given path +The `s3` subcommand only supports a single URL but will compress any logs that have the given key prefix. If you wish to compress a single log file, specify the entire path to the log file. 
However, if that @@ -44,9 +115,9 @@ archive_output: type: "s3" staging_directory: "var/data/staged-archives" # Or a path of your choosing s3_config: - region: "" - bucket: "" - key-prefix: "" + region_code: "" + bucket: "" + key_prefix: "" credentials: access_key_id: "" secret_access_key: "" @@ -54,14 +125,21 @@ archive_output: # archive_output's other config keys ``` -* `s3_config` configures both the S3 bucket where archives should be stored as well as credentials +* `s3_config` configures both the S3 bucket where archives should be stored and the credentials for accessing it. - * `` is the AWS region [code][aws-region-codes] for the bucket. - * `` is the bucket's name. - * `` is the "directory" where all archives will be stored within the bucket and + * `` is the AWS region [code][aws-region-codes] for the bucket. + * `` is the bucket's name. + * `` is the "directory" where all archives will be stored within the bucket and must end with `/`. * `credentials` contains the S3 credentials necessary for accessing the bucket. + :::{note} + These credentials can be for a different IAM user than the one set up in the previous section, + as long as they can access the bucket. + ::: + +## Viewing compressed logs from object storage + To configure CLP to be able to view compressed log files from S3, you'll need to configure a bucket where CLP can store intermediate files that the log viewer can open. To do so, update the `stream_output.storage` key in `/etc/clp-config.yml`: @@ -72,9 +150,9 @@ stream_output: type: "s3" staging_directory: "var/data/staged-streams" # Or a path of your choosing s3_config: - region: "" - bucket: "" - key-prefix: "" + region_code: "" + bucket: "" + key_prefix: "" credentials: access_key_id: "" secret_access_key: "" @@ -89,9 +167,12 @@ a different bucket entirely). :::{note} To view compressed log files, clp-text currently converts them into IR streams that the log viewer can open, while clp-json converts them into JSONL streams. 
These streams only need to be stored for -as long as the streams are being viewed in the viewer, however CLP currently doesn't explicitly -delete the streams. This limitation will be addressed in a future release. +as long as the streams are being viewed, but CLP currently doesn't explicitly delete the streams. +This limitation will be addressed in a future release. ::: +[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console +[aws-create-iam-user]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html [aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability -[release-choices]: http://localhost:8080/user-guide/quick-start-cluster-setup/index.html#choosing-a-release +[least-privilege-principle]: https://en.wikipedia.org/wiki/Principle_of_least_privilege +[release-choices]: quick-start-cluster-setup/index.md#choosing-a-release From 19e01c349665b1db505c668b8df1135fa428709f Mon Sep 17 00:00:00 2001 From: Haiqi Xu <14502009+haiqi96@users.noreply.github.com> Date: Tue, 21 Jan 2025 17:25:10 -0500 Subject: [PATCH 07/31] draft --- .../user-guide/guides-using-object-storage.md | 74 +++++++++++++++++++ 1 file changed, 74 insertions(+) diff --git a/docs/src/user-guide/guides-using-object-storage.md b/docs/src/user-guide/guides-using-object-storage.md index 70c150573..94efc2a8f 100644 --- a/docs/src/user-guide/guides-using-object-storage.md +++ b/docs/src/user-guide/guides-using-object-storage.md @@ -106,6 +106,41 @@ limitation will be addressed in a future release. ## Storing archives on object storage +To store compressed archives on S3, you'll need to: + +1. Set up an AWS IAM user that allows CLP to write to the bucket where archives should be stored. +2. Configure the S3 information in `clp-config.yml`. + +### Setting up an AWS IAM user +1. 
Create a user by following [this guide][aws-create-iam-user]. + * If you already created a user in the previous section, you can reuse it and proceed to step 2. + * You can also create a new user different from the previous section to follow the [principle of least privilege][least-privilege-principle]. +2. Attach the following policy to the user by following [this guide][add-iam-policy]. + + ```json + { + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:PutObject" + ], + "Resource": [ + "arn:aws:s3::://*" + ] + } + ] + } + ``` + + Replace the fields in angle brackets (`<>`) with the appropriate values: + * `` should be the name of the S3 bucket to store compressed archives. + * `` should be the path prefix where you want the compressed archives to be stored under. + +### Configuring `clp-config.yml` + To configure CLP to store archives on S3, update the `archive_output.storage` key in `/etc/clp-config.yml`: @@ -140,6 +175,44 @@ archive_output: ## Viewing compressed logs from object storage +To view compressed logs S3, you'll need to: +1. Set up cross-origin resource sharing (CORS) for the bucket to store stream files. +2. Set up an AWS IAM user that allows CLP to store stream files to the bucket. +3. Configure the S3 information in `clp-config.yml`. + +### Setting up cross-origin resource sharing + +CLP's log viewer webui requires the S3 bucket to support CORS for log viewing. + +1. Set up the cross-origin resource sharing by following [this guide][aws-cors-guide]. + * Use the following CORS configuration + + ```json + [ + { + "AllowedHeaders": [ + "*" + ], + "AllowedMethods": [ + "GET" + ], + "AllowedOrigins": [ + "http://localhost:3000" + ], + "ExposeHeaders": [ + "Access-Control-Allow-Origin" + ] + } + ] + ``` + :::{note} + By default, CLP hosts the log-viewer webui on http://localhost:3000. 
If you want to host the log-viewer webui with different URLs, you need to update the AllowedOrigins list to include those URLs. + +### Setting up an AWS IAM user + + +### Configuring `clp-config.yml` + To configure CLP to be able to view compressed log files from S3, you'll need to configure a bucket where CLP can store intermediate files that the log viewer can open. To do so, update the `stream_output.storage` key in `/etc/clp-config.yml`: @@ -172,6 +245,7 @@ This limitation will be addressed in a future release. ::: [add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console +[aws-cors-guide]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/enabling-cors-examples.html [aws-create-iam-user]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html [aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability [least-privilege-principle]: https://en.wikipedia.org/wiki/Principle_of_least_privilege From 542cba2f84f9ffd4f577d961a29da451742a9daf Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 22 Jan 2025 07:01:37 -0500 Subject: [PATCH 08/31] Add prerequisites section; Move AWS user setup into prerequisites; Finish log viewing section. --- .../user-guide/guides-using-object-storage.md | 298 +++++++++++------- 1 file changed, 176 insertions(+), 122 deletions(-) diff --git a/docs/src/user-guide/guides-using-object-storage.md b/docs/src/user-guide/guides-using-object-storage.md index 94efc2a8f..f04aa1eb9 100644 --- a/docs/src/user-guide/guides-using-object-storage.md +++ b/docs/src/user-guide/guides-using-object-storage.md @@ -6,7 +6,9 @@ CLP can: * [store archives on object storage](#storing-archives-on-object-storage); and * [view the compressed logs from object storage](#viewing-compressed-logs-from-object-storage). 
-This guide explains how to configure CLP for all three use cases. +This guide explains how to configure CLP for all three use cases. Note that you can choose to use +object storage for any combination of the three use cases (e.g., compress logs from S3 and view the +compressed logs from S3, but store archives on the local filesystem). :::{note} Currently, only the [clp-json][release-choices] release supports object storage. Support for @@ -18,59 +20,89 @@ Currently, CLP only supports using S3 as object storage. Support for other objec will be added in a future release. ::: +## Prerequisites + +1. An S3 bucket and [key prefix][aws-key-prefixes] containing the logs you wish to compress. + * An S3 URL is a combination of a bucket name and a key prefix as shown below: + + :::{mermaid} + %%{ + init: { + "theme": "base", + "themeVariables": { + "primaryColor": "#0066cc", + "primaryTextColor": "#fff", + "primaryBorderColor": "transparent", + "lineColor": "#9580ff", + "secondaryColor": "#9580ff", + "tertiaryColor": "#fff" + } + } + }%% + graph TD + A["s3://my-bucket-name/my-logs-dir/"] --"Bucket name"--> B[my-bucket-name] + A --"Key prefix"--> C["my-logs-dir/"] + ::: + +2. An S3 bucket and key prefix where you wish to store compressed archives. +3. An S3 bucket and key prefix where you wish to store intermediate files for viewing compressed + logs. +4. An AWS IAM user with the necessary permissions to access the S3 prefixes mentioned above. + * To create a user, follow [this guide][aws-create-iam-user]. + * You may use a different IAM user for each use case to follow the + [principle of least privilege][least-privilege-principle], or you can use the same user for + all three. + * For brevity, we'll refer to this user as the "CLP IAM user" in the rest of this guide. + +:::{note} +CLP currently requires IAM user (long-term) credentials to access the relevant S3 buckets. 
Support +for other authentication methods (e.g., temporary credentials) will be added in a future release. +::: + ## Compressing logs from object storage To compress logs from S3, you'll need to: -1. Set up an AWS IAM user that CLP can use to access the bucket containing your logs. +1. Enable the CLP IAM user to access the S3 path containing your logs. 2. Use the `s3` subcommand of `sbin/compress.sh` to compress your logs. -### Setting up an AWS IAM user - -To set up a user: - -1. Create a user by following [this guide][aws-create-iam-user]. - * If you already have a user to use for ingesting logs, you can skip this step. -2. Attach the following policy to the user by following [this guide][add-iam-policy]. - - ```json - { - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": "s3:GetObject", - "Resource": [ - "arn:aws:s3::://*" - ] - }, - { - "Effect": "Allow", - "Action": [ - "s3:ListBucket" - ], - "Resource": [ - "arn:aws:s3:::" - ], - "Condition": { - "StringLike": { - "s3:prefix": "/*" - } +### IAM user configuration + +Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy]. + +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": "s3:GetObject", + "Resource": [ + "arn:aws:s3::://*" + ] + }, + { + "Effect": "Allow", + "Action": [ + "s3:ListBucket" + ], + "Resource": [ + "arn:aws:s3:::" + ], + "Condition": { + "StringLike": { + "s3:prefix": "/*" } } - ] - } - ``` - - Replace the fields in angle brackets (`<>`) with the appropriate values: - * `` should be the name of the S3 bucket containing your logs. - * `` should be the prefix of all logs you wish to compress. + } + ] +} +``` - :::{warning} - To follow the [principle of least privilege][least-privilege-principle], ensure the user doesn't - have other unnecessary permission policies attached. If the user does have other policies, - consider creating a new user with only the permission policy above. 
- ::: +Replace the fields in angle brackets (`<>`) with the appropriate values: + +* `` should be the name of the S3 bucket containing your logs. +* `` should be the prefix of all logs you wish to compress. ### Using `sbin/compress.sh s3` @@ -90,13 +122,15 @@ sbin/compress.sh s3 --aws-credentials-file s3:// * CLP expects the credentials to be in the `default` section. * `` and `` are the access key ID and secret access - key of the IAM user you set up in the previous section. + key of the CLP IAM user. + * If you don't want to use a credentials file, you can specify the credentials on the command + line using the `--aws-access-key-id` and `--aws-secret-access-key` flags. * `` is the name of the S3 bucket containing your logs. -* `` is the path prefix of all logs you wish to compress. +* `` is the prefix of all logs you wish to compress. :::{note} -The `s3` subcommand only supports a single URL but will compress any logs that have the given key +The `s3` subcommand only supports a single URL but will compress any logs that have the given prefix. If you wish to compress a single log file, specify the entire path to the log file. However, if that @@ -108,38 +142,37 @@ limitation will be addressed in a future release. To store compressed archives on S3, you'll need to: -1. Set up an AWS IAM user that allows CLP to write to the bucket where archives should be stored. -2. Configure the S3 information in `clp-config.yml`. +1. Enable the CLP IAM user to access the S3 path where archives should be stored. +2. Configure CLP to store archives under the relevant S3 path. + +### IAM user configuration + +Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy]. + +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:PutObject" + ], + "Resource": [ + "arn:aws:s3::://*" + ] + } + ] +} +``` -### Setting up an AWS IAM user -1. 
Create a user by following [this guide][aws-create-iam-user]. - * If you already created a user in the previous section, you can reuse it and proceed to step 2. - * You can also create a new user different from the previous section to follow the [principle of least privilege][least-privilege-principle]. -2. Attach the following policy to the user by following [this guide][add-iam-policy]. +Replace the fields in angle brackets (`<>`) with the appropriate values: - ```json - { - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": [ - "s3:GetObject", - "s3:PutObject" - ], - "Resource": [ - "arn:aws:s3::://*" - ] - } - ] - } - ``` +* `` should be the name of the S3 bucket to store compressed archives. +* `` should be the prefix where you want the compressed archives to be stored under. - Replace the fields in angle brackets (`<>`) with the appropriate values: - * `` should be the name of the S3 bucket to store compressed archives. - * `` should be the path prefix where you want the compressed archives to be stored under. - -### Configuring `clp-config.yml` +### Configuring CLP's archive storage location To configure CLP to store archives on S3, update the `archive_output.storage` key in `/etc/clp-config.yml`: @@ -160,62 +193,83 @@ archive_output: # archive_output's other config keys ``` +Replace the fields in angle brackets (`<>`) with the appropriate values: + * `s3_config` configures both the S3 bucket where archives should be stored and the credentials for accessing it. * `` is the AWS region [code][aws-region-codes] for the bucket. * `` is the bucket's name. * `` is the "directory" where all archives will be stored within the bucket and must end with `/`. - * `credentials` contains the S3 credentials necessary for accessing the bucket. - - :::{note} - These credentials can be for a different IAM user than the one set up in the previous section, - as long as they can access the bucket. 
- ::: + * `credentials` contains the CLP IAM user's credentials. ## Viewing compressed logs from object storage -To view compressed logs S3, you'll need to: -1. Set up cross-origin resource sharing (CORS) for the bucket to store stream files. -2. Set up an AWS IAM user that allows CLP to store stream files to the bucket. -3. Configure the S3 information in `clp-config.yml`. - -### Setting up cross-origin resource sharing - -CLP's log viewer webui requires the S3 bucket to support CORS for log viewing. - -1. Set up the cross-origin resource sharing by following [this guide][aws-cors-guide]. - * Use the following CORS configuration - - ```json - [ - { - "AllowedHeaders": [ - "*" - ], - "AllowedMethods": [ - "GET" - ], - "AllowedOrigins": [ - "http://localhost:3000" - ], - "ExposeHeaders": [ - "Access-Control-Allow-Origin" - ] - } - ] - ``` - :::{note} - By default, CLP hosts the log-viewer webui on http://localhost:3000. If you want to host the log-viewer webui with different URLs, you need to update the AllowedOrigins list to include those URLs. +To view compressed logs from S3, you'll need to: + +1. Enable the CLP IAM user to access the S3 path where stream files (logs in a format viewable by + the log viewer) should be stored. +2. Set up a cross-origin resource sharing (CORS) policy for the S3 path in (1). +3. Configure CLP to store stream files under the S3 path in (1). + +### IAM user configuration + +Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy]. -### Setting up an AWS IAM user +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:PutObject" + ], + "Resource": [ + "arn:aws:s3::://*" + ] + } + ] +} +``` + +### Cross-origin resource sharing (CORS) configuration + +For CLP's log viewer to be able to view the compressed logs from S3 over the internet, the S3 bucket +must have a CORS policy configured. 
+ +Add the CORS configuration below to your bucket by following [this guide][aws-cors-guide]: + +```json +[ + { + "AllowedHeaders": [ + "*" + ], + "AllowedMethods": [ + "GET" + ], + "AllowedOrigins": [ + "*" + ], + "ExposeHeaders": [ + "Access-Control-Allow-Origin" + ] + } +] +``` +:::{tip} +The CORS policy above allows requests from any host (origin). If you already know what hosts will +access CLP's web interface, you can enhance security by changing `AllowedOrigins` from `*` to the +specific list of hosts that will access the web interface. +::: -### Configuring `clp-config.yml` +### Configuring CLP's stream storage location -To configure CLP to be able to view compressed log files from S3, you'll need to configure a bucket -where CLP can store intermediate files that the log viewer can open. To do so, update the -`stream_output.storage` key in `/etc/clp-config.yml`: +To configure CLP to store stream files on S3, update the `stream_output.storage` key in +`/etc/clp-config.yml`: ```yaml stream_output: @@ -234,8 +288,7 @@ stream_output: ``` The configuration keys above function identically to those in `archive_output.storage`, except they -should be configured to use a different S3 path (i.e., a different key-prefix in the same bucket or -a different bucket entirely). +should be configured to use a different S3 path. :::{note} To view compressed log files, clp-text currently converts them into IR streams that the log viewer @@ -247,6 +300,7 @@ This limitation will be addressed in a future release. 
[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console [aws-cors-guide]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/enabling-cors-examples.html [aws-create-iam-user]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html +[aws-key-prefixes]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-prefixes.html [aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability [least-privilege-principle]: https://en.wikipedia.org/wiki/Principle_of_least_privilege [release-choices]: quick-start-cluster-setup/index.md#choosing-a-release From 22feafbab64201a8e8df3e97e28ccc013ba68816 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 22 Jan 2025 07:17:35 -0500 Subject: [PATCH 09/31] Split docs into their own pages to reduce cognitive load. 
--- docs/src/user-guide/guides-overview.md | 2 +- .../user-guide/guides-using-object-storage.md | 306 ------------------ .../archive-storage.md | 64 ++++ .../guides-using-object-storage/compress.md | 78 +++++ .../guides-using-object-storage/index.md | 105 ++++++ .../stream-storage.md | 93 ++++++ docs/src/user-guide/index.md | 2 +- .../quick-start-compression/json.md | 2 +- 8 files changed, 343 insertions(+), 309 deletions(-) delete mode 100644 docs/src/user-guide/guides-using-object-storage.md create mode 100644 docs/src/user-guide/guides-using-object-storage/archive-storage.md create mode 100644 docs/src/user-guide/guides-using-object-storage/compress.md create mode 100644 docs/src/user-guide/guides-using-object-storage/index.md create mode 100644 docs/src/user-guide/guides-using-object-storage/stream-storage.md diff --git a/docs/src/user-guide/guides-overview.md b/docs/src/user-guide/guides-overview.md index c9800fbf7..23835a9ff 100644 --- a/docs/src/user-guide/guides-overview.md +++ b/docs/src/user-guide/guides-overview.md @@ -6,7 +6,7 @@ The guides below describe how to use CLP in a variety of use cases. :gutter: 2 :::{grid-item-card} -:link: guides-using-object-storage +:link: guides-using-object-storage/index Using object storage ^^^ Using CLP to ingest logs from object storage and store archives on object storage. diff --git a/docs/src/user-guide/guides-using-object-storage.md b/docs/src/user-guide/guides-using-object-storage.md deleted file mode 100644 index f04aa1eb9..000000000 --- a/docs/src/user-guide/guides-using-object-storage.md +++ /dev/null @@ -1,306 +0,0 @@ -# Using object storage - -CLP can: - -* [compress logs from object storage](#compressing-logs-from-object-storage) (e.g., S3); -* [store archives on object storage](#storing-archives-on-object-storage); and -* [view the compressed logs from object storage](#viewing-compressed-logs-from-object-storage). - -This guide explains how to configure CLP for all three use cases. 
Note that you can choose to use -object storage for any combination of the three use cases (e.g., compress logs from S3 and view the -compressed logs from S3, but store archives on the local filesystem). - -:::{note} -Currently, only the [clp-json][release-choices] release supports object storage. Support for -`clp-text` will be added in a future release. -::: - -:::{note} -Currently, CLP only supports using S3 as object storage. Support for other object storage services -will be added in a future release. -::: - -## Prerequisites - -1. An S3 bucket and [key prefix][aws-key-prefixes] containing the logs you wish to compress. - * An S3 URL is a combination of a bucket name and a key prefix as shown below: - - :::{mermaid} - %%{ - init: { - "theme": "base", - "themeVariables": { - "primaryColor": "#0066cc", - "primaryTextColor": "#fff", - "primaryBorderColor": "transparent", - "lineColor": "#9580ff", - "secondaryColor": "#9580ff", - "tertiaryColor": "#fff" - } - } - }%% - graph TD - A["s3://my-bucket-name/my-logs-dir/"] --"Bucket name"--> B[my-bucket-name] - A --"Key prefix"--> C[path/to/my/file.txt] - ::: - -2. An S3 bucket and key prefix where you wish to store compressed archives. -3. An S3 bucket and key prefix where you wish to store intermediate files for viewing compressed - logs. -4. An AWS IAM user with the necessary permissions to access the S3 prefixes mentioned above. - * To create a user, follow [this guide][aws-create-iam-user]. - * You may use a different IAM user for each use case to follow the - [principle of least privilege][least-privilege-principle], or you can use the same user for - all three. - * For brevity, we'll refer to this user as the "CLP IAM user" in the rest of this guide. - -:::{note} -CLP currently requires IAM user (long-term) credentials to access the relevant S3 buckets. Support -for other authentication methods (e.g., temporary credentials) will be added in a future release. 
-::: - -## Compressing logs from object storage - -To compress logs from S3, you'll need to: - -1. Enable the CLP IAM user to access the S3 path containing your logs. -2. Use the `s3` subcommand of `sbin/compress.sh` to compress your logs. - -### IAM user configuration - -Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy]. - -```json -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": "s3:GetObject", - "Resource": [ - "arn:aws:s3::://*" - ] - }, - { - "Effect": "Allow", - "Action": [ - "s3:ListBucket" - ], - "Resource": [ - "arn:aws:s3:::" - ], - "Condition": { - "StringLike": { - "s3:prefix": "/*" - } - } - } - ] -} -``` - -Replace the fields in angle brackets (`<>`) with the appropriate values: - -* `` should be the name of the S3 bucket containing your logs. -* `` should be the prefix of all logs you wish to compress. - -### Using `sbin/compress.sh s3` - -You can use the `s3` subcommand as follows: - -```bash -sbin/compress.sh s3 --aws-credentials-file s3:/// -``` - -* `` is the path to an AWS credentials file like the following: - - ```ini - [default] - aws_access_key_id = - aws_secret_access_key = - ``` - - * CLP expects the credentials to be in the `default` section. - * `` and `` are the access key ID and secret access - key of the CLP IAM user. - * If you don't want to use a credentials file, you can specify the credentials on the command - line using the `--aws-access-key-id` and `--aws-secret-access-key` flags. - -* `` is the name of the S3 bucket containing your logs. -* `` is the prefix of all logs you wish to compress. - -:::{note} -The `s3` subcommand only supports a single URL but will compress any logs that have the given -prefix. - -If you wish to compress a single log file, specify the entire path to the log file. However, if that -log file's path is a prefix of another log file's path, then both log files will be compressed. 
This -limitation will be addressed in a future release. -::: - -## Storing archives on object storage - -To store compressed archives on S3, you'll need to: - -1. Enable the CLP IAM user to access the S3 path where archives should be stored. -2. Configure CLP to store archives under the relevant S3 path. - -### IAM user configuration - -Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy]. - -```json -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": [ - "s3:GetObject", - "s3:PutObject" - ], - "Resource": [ - "arn:aws:s3::://*" - ] - } - ] -} -``` - -Replace the fields in angle brackets (`<>`) with the appropriate values: - -* `` should be the name of the S3 bucket to store compressed archives. -* `` should be the prefix where you want the compressed archives to be stored under. - -### Configuring CLP's archive storage location - -To configure CLP to store archives on S3, update the `archive_output.storage` key in -`/etc/clp-config.yml`: - -```yaml -archive_output: - storage: - type: "s3" - staging_directory: "var/data/staged-archives" # Or a path of your choosing - s3_config: - region_code: "" - bucket: "" - key_prefix: "" - credentials: - access_key_id: "" - secret_access_key: "" - - # archive_output's other config keys -``` - -Replace the fields in angle brackets (`<>`) with the appropriate values: - -* `s3_config` configures both the S3 bucket where archives should be stored and the credentials - for accessing it. - * `` is the AWS region [code][aws-region-codes] for the bucket. - * `` is the bucket's name. - * `` is the "directory" where all archives will be stored within the bucket and - must end with `/`. - * `credentials` contains the CLP IAM user's credentials. - -## Viewing compressed logs from object storage - -To view compressed logs from S3, you'll need to: - -1. 
Enable the CLP IAM user to access the S3 path where stream files (logs in a format viewable by - the log viewer) should be stored. -2. Set up a cross-origin resource sharing (CORS) policy for the S3 path in (1). -3. Configure CLP to store stream files under the S3 path in (1). - -### IAM user configuration - -Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy]. - -```json -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": [ - "s3:GetObject", - "s3:PutObject" - ], - "Resource": [ - "arn:aws:s3::://*" - ] - } - ] -} -``` - -### Cross-origin resource sharing (CORS) configuration - -For CLP's log viewer to be able to view the compressed logs from S3 over the internet, the S3 bucket -must have a CORS policy configured. - -Add the CORS configuration below to your bucket by following [this guide][aws-cors-guide]: - -```json -[ - { - "AllowedHeaders": [ - "*" - ], - "AllowedMethods": [ - "GET" - ], - "AllowedOrigins": [ - "*" - ], - "ExposeHeaders": [ - "Access-Control-Allow-Origin" - ] - } -] -``` - -:::{tip} -The CORS policy above allows requests from any host (origin). If you already know what hosts will -access CLP's web interface, you can enhance security by changing `AllowedOrigins` from `*` to the -specific list of hosts that will access the web interface. -::: - -### Configuring CLP's stream storage location - -To configure CLP to store stream files on S3, update the `stream_output.storage` key in -`/etc/clp-config.yml`: - -```yaml -stream_output: - storage: - type: "s3" - staging_directory: "var/data/staged-streams" # Or a path of your choosing - s3_config: - region_code: "" - bucket: "" - key_prefix: "" - credentials: - access_key_id: "" - secret_access_key: "" - - # stream_output's other config keys -``` - -The configuration keys above function identically to those in `archive_output.storage`, except they -should be configured to use a different S3 path. 
- -:::{note} -To view compressed log files, clp-text currently converts them into IR streams that the log viewer -can open, while clp-json converts them into JSONL streams. These streams only need to be stored for -as long as the streams are being viewed, but CLP currently doesn't explicitly delete the streams. -This limitation will be addressed in a future release. -::: - -[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console -[aws-cors-guide]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/enabling-cors-examples.html -[aws-create-iam-user]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html -[aws-key-prefixes]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-prefixes.html -[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability -[least-privilege-principle]: https://en.wikipedia.org/wiki/Principle_of_least_privilege -[release-choices]: quick-start-cluster-setup/index.md#choosing-a-release diff --git a/docs/src/user-guide/guides-using-object-storage/archive-storage.md b/docs/src/user-guide/guides-using-object-storage/archive-storage.md new file mode 100644 index 000000000..e2ff3bd7b --- /dev/null +++ b/docs/src/user-guide/guides-using-object-storage/archive-storage.md @@ -0,0 +1,64 @@ +# Storing archives + +To store compressed archives on S3, you'll need to: + +1. Enable the CLP IAM user to access the S3 path where archives should be stored. +2. Configure CLP to store archives under the relevant S3 path. + +## IAM user configuration + +Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy]. 
+ +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:PutObject" + ], + "Resource": [ + "arn:aws:s3::://*" + ] + } + ] +} +``` + +Replace the fields in angle brackets (`<>`) with the appropriate values: + +* `` should be the name of the S3 bucket to store compressed archives. +* `` should be the prefix where you want the compressed archives to be stored under. + +## Configuring CLP's archive storage location + +To configure CLP to store archives on S3, update the `archive_output.storage` key in +`/etc/clp-config.yml`: + +```yaml +archive_output: + storage: + type: "s3" + staging_directory: "var/data/staged-archives" # Or a path of your choosing + s3_config: + region_code: "" + bucket: "" + key_prefix: "" + credentials: + access_key_id: "" + secret_access_key: "" + + # archive_output's other config keys +``` + +Replace the fields in angle brackets (`<>`) with the appropriate values: + +* `s3_config` configures both the S3 bucket where archives should be stored and the credentials + for accessing it. + * `` is the AWS region [code][aws-region-codes] for the bucket. + * `` is the bucket's name. + * `` is the "directory" where all archives will be stored within the bucket and + must end with `/`. + * `credentials` contains the CLP IAM user's credentials. diff --git a/docs/src/user-guide/guides-using-object-storage/compress.md b/docs/src/user-guide/guides-using-object-storage/compress.md new file mode 100644 index 000000000..2487e2fb8 --- /dev/null +++ b/docs/src/user-guide/guides-using-object-storage/compress.md @@ -0,0 +1,78 @@ +# Compressing logs + +To compress logs from S3, you'll need to: + +1. Enable the CLP IAM user to access the S3 path containing your logs. +2. Use the `s3` subcommand of `sbin/compress.sh` to compress your logs. + +## IAM user configuration + +Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy]. 
+ +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": "s3:GetObject", + "Resource": [ + "arn:aws:s3::://*" + ] + }, + { + "Effect": "Allow", + "Action": [ + "s3:ListBucket" + ], + "Resource": [ + "arn:aws:s3:::" + ], + "Condition": { + "StringLike": { + "s3:prefix": "/*" + } + } + } + ] +} +``` + +Replace the fields in angle brackets (`<>`) with the appropriate values: + +* `` should be the name of the S3 bucket containing your logs. +* `` should be the prefix of all logs you wish to compress. + +## Using `sbin/compress.sh s3` + +You can use the `s3` subcommand as follows: + +```bash +sbin/compress.sh s3 --aws-credentials-file s3:/// +``` + +* `` is the path to an AWS credentials file like the following: + + ```ini + [default] + aws_access_key_id = + aws_secret_access_key = + ``` + + * CLP expects the credentials to be in the `default` section. + * `` and `` are the access key ID and secret access + key of the CLP IAM user. + * If you don't want to use a credentials file, you can specify the credentials on the command + line using the `--aws-access-key-id` and `--aws-secret-access-key` flags. + +* `` is the name of the S3 bucket containing your logs. +* `` is the prefix of all logs you wish to compress. + +:::{note} +The `s3` subcommand only supports a single URL but will compress any logs that have the given +prefix. + +If you wish to compress a single log file, specify the entire path to the log file. However, if that +log file's path is a prefix of another log file's path, then both log files will be compressed. This +limitation will be addressed in a future release. 
+::: diff --git a/docs/src/user-guide/guides-using-object-storage/index.md b/docs/src/user-guide/guides-using-object-storage/index.md new file mode 100644 index 000000000..a22bbc10a --- /dev/null +++ b/docs/src/user-guide/guides-using-object-storage/index.md @@ -0,0 +1,105 @@ +# Using object storage + +CLP can: + +* compress logs from object storage (e.g., S3); +* store archives on object storage; and +* view the compressed logs from object storage. + +This guide explains how to configure CLP for all three use cases. Note that you can choose to use +object storage for any combination of the three use cases (e.g., compress logs from S3 and view the +compressed logs from S3, but store archives on the local filesystem). + +:::{note} +Currently, only the [clp-json][release-choices] release supports object storage. Support for +`clp-text` will be added in a future release. +::: + +:::{note} +Currently, CLP only supports using S3 as object storage. Support for other object storage services +will be added in a future release. +::: + +## Prerequisites + +1. An S3 bucket and [key prefix][aws-key-prefixes] containing the logs you wish to compress. + * An S3 URL is a combination of a bucket name and a key prefix as shown below: + + :::{mermaid} + %%{ + init: { + "theme": "base", + "themeVariables": { + "primaryColor": "#0066cc", + "primaryTextColor": "#fff", + "primaryBorderColor": "transparent", + "lineColor": "#9580ff", + "secondaryColor": "#9580ff", + "tertiaryColor": "#fff" + } + } + }%% + graph TD + A["s3://my-bucket-name/my-logs-dir/"] --"Bucket name"--> B[my-bucket-name] + A --"Key prefix"--> C[my-logs-dir/] + ::: + +2. An S3 bucket and key prefix where you wish to store compressed archives. +3. An S3 bucket and key prefix where you wish to store intermediate files for viewing compressed + logs. +4. An AWS IAM user with the necessary permissions to access the S3 prefixes mentioned above. + * To create a user, follow [this guide][aws-create-iam-user]. 
+ * You may use a different IAM user for each use case to follow the + [principle of least privilege][least-privilege-principle], or you can use the same user for + all three. + * For brevity, we'll refer to this user as the "CLP IAM user" in the rest of this guide. + +:::{note} +CLP currently requires IAM user (long-term) credentials to access the relevant S3 buckets. Support +for other authentication methods (e.g., temporary credentials) will be added in a future release. +::: + +## Use cases + +The following subsections below explain how to set up each use case: + +::::{grid} 1 1 1 1 +:gutter: 2 + +:::{grid-item-card} +:link: compress +Compression +^^^ +Compressing logs from object storage +::: + +:::{grid-item-card} +:link: archive-storage +Archive storage +^^^ +Storing archives on object storage +::: + +:::{grid-item-card} +:link: stream-storage +Stream storage +^^^ +Viewing compressed logs from object storage +::: +:::: + +:::{toctree} +:hidden: + +compress +archive-storage +stream-storage +::: + +[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console +[aws-cors-guide]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/enabling-cors-examples.html +[aws-create-iam-user]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html +[aws-key-prefixes]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-prefixes.html +[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability +[least-privilege-principle]: https://en.wikipedia.org/wiki/Principle_of_least_privilege +[release-choices]: ../quick-start-cluster-setup/index.md#choosing-a-release diff --git a/docs/src/user-guide/guides-using-object-storage/stream-storage.md b/docs/src/user-guide/guides-using-object-storage/stream-storage.md new file mode 100644 index 000000000..481ca0739 --- /dev/null +++ 
b/docs/src/user-guide/guides-using-object-storage/stream-storage.md @@ -0,0 +1,93 @@ +# Viewing compressed logs + +To view compressed logs from S3, you'll need to: + +1. Enable the CLP IAM user to access the S3 path where stream files (logs in a format viewable by + the log viewer) should be stored. +2. Set up a cross-origin resource sharing (CORS) policy for the S3 path in (1). +3. Configure CLP to store stream files under the S3 path in (1). + +## IAM user configuration + +Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy]. + +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:PutObject" + ], + "Resource": [ + "arn:aws:s3::://*" + ] + } + ] +} +``` + +## Cross-origin resource sharing (CORS) configuration + +For CLP's log viewer to be able to view the compressed logs from S3 over the internet, the S3 bucket +must have a CORS policy configured. + +Add the CORS configuration below to your bucket by following [this guide][aws-cors-guide]: + +```json +[ + { + "AllowedHeaders": [ + "*" + ], + "AllowedMethods": [ + "GET" + ], + "AllowedOrigins": [ + "*" + ], + "ExposeHeaders": [ + "Access-Control-Allow-Origin" + ] + } +] +``` + +:::{tip} +The CORS policy above allows requests from any host (origin). If you already know what hosts will +access CLP's web interface, you can enhance security by changing `AllowedOrigins` from `*` to the +specific list of hosts that will access the web interface. 
+::: + +## Configuring CLP's stream storage location + +To configure CLP to store stream files on S3, update the `stream_output.storage` key in +`/etc/clp-config.yml`: + +```yaml +stream_output: + storage: + type: "s3" + staging_directory: "var/data/staged-streams" # Or a path of your choosing + s3_config: + region_code: "" + bucket: "" + key_prefix: "" + credentials: + access_key_id: "" + secret_access_key: "" + + # stream_output's other config keys +``` + +The configuration keys above function identically to those in `archive_output.storage`, except they +should be configured to use a different S3 path. + +:::{note} +To view compressed log files, clp-text currently converts them into IR streams that the log viewer +can open, while clp-json converts them into JSONL streams. These streams only need to be stored for +as long as the streams are being viewed, but CLP currently doesn't explicitly delete the streams. +This limitation will be addressed in a future release. +::: diff --git a/docs/src/user-guide/index.md b/docs/src/user-guide/index.md index 46efcc8b7..642eac4a6 100644 --- a/docs/src/user-guide/index.md +++ b/docs/src/user-guide/index.md @@ -60,7 +60,7 @@ quick-start-search/index :glob: guides-overview -guides-using-object-storage +guides-using-object-storage/index ::: :::{toctree} diff --git a/docs/src/user-guide/quick-start-compression/json.md b/docs/src/user-guide/quick-start-compression/json.md index 272ce33bf..d32b0abb7 100644 --- a/docs/src/user-guide/quick-start-compression/json.md +++ b/docs/src/user-guide/quick-start-compression/json.md @@ -9,7 +9,7 @@ sbin/compress.sh fs --timestamp-key '' [ ...] * `fs` is a subcommand for compressing logs from the filesystem. :::{tip} To learn how to compress logs from object storage, see - [Using object storage](../guides-using-object-storage.md). + [Using object storage](../guides-using-object-storage/index). ::: * `` is the field path of the kv-pair that contains the timestamp in each log event. 
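Note on the docs added in this patch: the read-only policy in `compress.md` pairs an `s3:GetObject` statement on the objects under a key prefix with an `s3:ListBucket` statement on the bucket, restricted by an `s3:prefix` condition. The sketch below shows one way such a policy could be generated for a concrete bucket and prefix. It is only an illustration: `build_read_policy`, `bucket_name`, and `logs_prefix` are hypothetical names that stand in for the placeholders in the documented policy, and none of them are part of CLP.

```python
import json


def build_read_policy(bucket_name: str, logs_prefix: str) -> dict:
    """Builds a read-only policy shaped like the one shown in compress.md.

    `bucket_name` and `logs_prefix` are hypothetical parameter names; they stand in
    for the bucket and key-prefix placeholders of the documented policy.
    """
    # Normalize so that "logs" and "logs/" both yield "logs/*" rather than "logs//*".
    prefix = logs_prefix.strip("/")
    if not prefix:
        # Assumption of this sketch: the policy is always scoped to a non-empty prefix.
        raise ValueError("logs_prefix must not be empty")

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allows reading the log objects stored under the prefix.
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": [f"arn:aws:s3:::{bucket_name}/{prefix}/*"],
            },
            {
                # Allows listing the bucket, but only for keys under the prefix.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket_name}"],
                "Condition": {"StringLike": {"s3:prefix": f"{prefix}/*"}},
            },
        ],
    }


if __name__ == "__main__":
    # Example values reuse the bucket and prefix from the diagram in index.md.
    print(json.dumps(build_read_policy("my-bucket-name", "my-logs-dir/"), indent=2))
```

The printed JSON can be pasted into the IAM console as an inline policy. Generating it from the same two values that are later passed to `sbin/compress.sh s3` keeps the policy and the S3 URL from drifting apart.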
From baeffb12ac0ef069f5daa447ed50bab586e0eb8e Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 22 Jan 2025 07:19:23 -0500 Subject: [PATCH 10/31] Move object storage compression tip. --- docs/src/user-guide/quick-start-compression/json.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/src/user-guide/quick-start-compression/json.md b/docs/src/user-guide/quick-start-compression/json.md index d32b0abb7..6091762a8 100644 --- a/docs/src/user-guide/quick-start-compression/json.md +++ b/docs/src/user-guide/quick-start-compression/json.md @@ -7,11 +7,6 @@ sbin/compress.sh fs --timestamp-key '' [ ...] ``` * `fs` is a subcommand for compressing logs from the filesystem. - :::{tip} - To learn how to compress logs from object storage, see - [Using object storage](../guides-using-object-storage/index). - ::: - * `` is the field path of the kv-pair that contains the timestamp in each log event. * E.g., if your log events look like `{"timestamp": {"iso8601": "2024-01-01 00:01:02.345", ...}}`, you should enter @@ -27,6 +22,11 @@ sbin/compress.sh fs --timestamp-key '' [ ...] * Each JSON log file should contain each log event as a [separate JSON object][json-log-format], i.e., _not_ as an array. +:::{tip} +To compress logs from object storage, see +[Using object storage](../guides-using-object-storage/index). +::: + # Sample logs For some sample logs, check out the open-source [datasets](../resources-datasets.md). From b95b45f9cd20255dc1cce86a16bdb4fe189ae7a4 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 22 Jan 2025 07:20:02 -0500 Subject: [PATCH 11/31] Touch-ups. 
--- docs/src/user-guide/guides-using-object-storage/index.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/src/user-guide/guides-using-object-storage/index.md b/docs/src/user-guide/guides-using-object-storage/index.md index a22bbc10a..928cb9350 100644 --- a/docs/src/user-guide/guides-using-object-storage/index.md +++ b/docs/src/user-guide/guides-using-object-storage/index.md @@ -68,21 +68,21 @@ The following subsections below explain how to set up each use case: :::{grid-item-card} :link: compress -Compression +Compressing logs ^^^ Compressing logs from object storage ::: :::{grid-item-card} :link: archive-storage -Archive storage +Storing archives ^^^ Storing archives on object storage ::: :::{grid-item-card} :link: stream-storage -Stream storage +Viewing compressed logs ^^^ Viewing compressed logs from object storage ::: From c99bce348b3c946eb87cda40e8f1de9c7e929fe5 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 22 Jan 2025 07:27:45 -0500 Subject: [PATCH 12/31] Explain staging_directory. --- .../user-guide/guides-using-object-storage/archive-storage.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/src/user-guide/guides-using-object-storage/archive-storage.md b/docs/src/user-guide/guides-using-object-storage/archive-storage.md index e2ff3bd7b..75d2fb5ae 100644 --- a/docs/src/user-guide/guides-using-object-storage/archive-storage.md +++ b/docs/src/user-guide/guides-using-object-storage/archive-storage.md @@ -55,6 +55,8 @@ archive_output: Replace the fields in angle brackets (`<>`) with the appropriate values: +* `staging_directory` is the local filesystem directory where archives will be temporarily stored + before being uploaded to S3. * `s3_config` configures both the S3 bucket where archives should be stored and the credentials for accessing it. * `` is the AWS region [code][aws-region-codes] for the bucket. 
From bdcf423092dfe8404f990f65214a50bd7a10e5ee Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 22 Jan 2025 07:29:57 -0500 Subject: [PATCH 13/31] Fix links. --- .../user-guide/guides-using-object-storage/archive-storage.md | 3 +++ docs/src/user-guide/guides-using-object-storage/compress.md | 2 ++ docs/src/user-guide/guides-using-object-storage/index.md | 3 --- .../user-guide/guides-using-object-storage/stream-storage.md | 3 +++ 4 files changed, 8 insertions(+), 3 deletions(-) diff --git a/docs/src/user-guide/guides-using-object-storage/archive-storage.md b/docs/src/user-guide/guides-using-object-storage/archive-storage.md index 75d2fb5ae..1e72907e5 100644 --- a/docs/src/user-guide/guides-using-object-storage/archive-storage.md +++ b/docs/src/user-guide/guides-using-object-storage/archive-storage.md @@ -64,3 +64,6 @@ Replace the fields in angle brackets (`<>`) with the appropriate values: * `` is the "directory" where all archives will be stored within the bucket and must end with `/`. * `credentials` contains the CLP IAM user's credentials. + +[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console +[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability diff --git a/docs/src/user-guide/guides-using-object-storage/compress.md b/docs/src/user-guide/guides-using-object-storage/compress.md index 2487e2fb8..a39f821f6 100644 --- a/docs/src/user-guide/guides-using-object-storage/compress.md +++ b/docs/src/user-guide/guides-using-object-storage/compress.md @@ -76,3 +76,5 @@ If you wish to compress a single log file, specify the entire path to the log fi log file's path is a prefix of another log file's path, then both log files will be compressed. This limitation will be addressed in a future release. 
::: + +[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console diff --git a/docs/src/user-guide/guides-using-object-storage/index.md b/docs/src/user-guide/guides-using-object-storage/index.md index 928cb9350..985286bc3 100644 --- a/docs/src/user-guide/guides-using-object-storage/index.md +++ b/docs/src/user-guide/guides-using-object-storage/index.md @@ -96,10 +96,7 @@ archive-storage stream-storage ::: -[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console -[aws-cors-guide]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/enabling-cors-examples.html [aws-create-iam-user]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html [aws-key-prefixes]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-prefixes.html -[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability [least-privilege-principle]: https://en.wikipedia.org/wiki/Principle_of_least_privilege [release-choices]: ../quick-start-cluster-setup/index.md#choosing-a-release diff --git a/docs/src/user-guide/guides-using-object-storage/stream-storage.md b/docs/src/user-guide/guides-using-object-storage/stream-storage.md index 481ca0739..80bbcab57 100644 --- a/docs/src/user-guide/guides-using-object-storage/stream-storage.md +++ b/docs/src/user-guide/guides-using-object-storage/stream-storage.md @@ -91,3 +91,6 @@ can open, while clp-json converts them into JSONL streams. These streams only ne as long as the streams are being viewed, but CLP currently doesn't explicitly delete the streams. This limitation will be addressed in a future release. 
::: + +[aws-cors-guide]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/enabling-cors-examples.html +[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console From 82caf114422f92859a41e91a3ea49ee1245719fa Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 22 Jan 2025 08:48:14 -0500 Subject: [PATCH 14/31] Apply the Rabbit's suggestions. --- docs/src/user-guide/guides-overview.md | 2 +- .../archive-storage.md | 2 +- .../guides-using-object-storage/compress.md | 28 ++++++++++--------- 3 files changed, 17 insertions(+), 15 deletions(-) diff --git a/docs/src/user-guide/guides-overview.md b/docs/src/user-guide/guides-overview.md index 23835a9ff..5e8179bf7 100644 --- a/docs/src/user-guide/guides-overview.md +++ b/docs/src/user-guide/guides-overview.md @@ -1,6 +1,6 @@ # Overview -The guides below describe how to use CLP in a variety of use cases. +The guides below describe how to use CLP in different use cases. ::::{grid} 1 1 2 2 :gutter: 2 diff --git a/docs/src/user-guide/guides-using-object-storage/archive-storage.md b/docs/src/user-guide/guides-using-object-storage/archive-storage.md index 1e72907e5..b87b791d0 100644 --- a/docs/src/user-guide/guides-using-object-storage/archive-storage.md +++ b/docs/src/user-guide/guides-using-object-storage/archive-storage.md @@ -62,7 +62,7 @@ Replace the fields in angle brackets (`<>`) with the appropriate values: * `` is the AWS region [code][aws-region-codes] for the bucket. * `` is the bucket's name. * `` is the "directory" where all archives will be stored within the bucket and - must end with `/`. + must end with trailing forward slash (e.g., `archives/`). * `credentials` contains the CLP IAM user's credentials. 
[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console diff --git a/docs/src/user-guide/guides-using-object-storage/compress.md b/docs/src/user-guide/guides-using-object-storage/compress.md index a39f821f6..87382c29d 100644 --- a/docs/src/user-guide/guides-using-object-storage/compress.md +++ b/docs/src/user-guide/guides-using-object-storage/compress.md @@ -53,17 +53,18 @@ sbin/compress.sh s3 --aws-credentials-file s3:// * `` is the path to an AWS credentials file like the following: - ```ini - [default] - aws_access_key_id = - aws_secret_access_key = - ``` - - * CLP expects the credentials to be in the `default` section. - * `` and `` are the access key ID and secret access - key of the CLP IAM user. - * If you don't want to use a credentials file, you can specify the credentials on the command - line using the `--aws-access-key-id` and `--aws-secret-access-key` flags. + ```ini + [default] + aws_access_key_id = + aws_secret_access_key = + ``` + + * CLP expects the credentials to be in the `default` section. + * `` and `` are the access key ID and secret access + key of the CLP IAM user. + * If you don't want to use a credentials file, you can specify the credentials on the command + line using the `--aws-access-key-id` and `--aws-secret-access-key` flags (note that this may + expose your credentials to other users running on the system). * `` is the name of the S3 bucket containing your logs. * `` is the prefix of all logs you wish to compress. @@ -73,8 +74,9 @@ The `s3` subcommand only supports a single URL but will compress any logs that h prefix. If you wish to compress a single log file, specify the entire path to the log file. However, if that -log file's path is a prefix of another log file's path, then both log files will be compressed. This -limitation will be addressed in a future release. 
+log file's path is a prefix of another log file's path, then both log files will be compressed +(e.g., with two files "logs/syslog" and "logs/syslog.1", a prefix like "logs/syslog" will cause +both logs to be compressed). This limitation will be addressed in a future release. ::: [add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console From d7d28007efa95cd70535260a0359f99f23faea39 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 22 Jan 2025 15:20:54 -0500 Subject: [PATCH 15/31] Remove misleading S3 path diagram; Add step for creating IAM user credentials. --- .../guides-using-object-storage/index.md | 33 +++++-------------- 1 file changed, 8 insertions(+), 25 deletions(-) diff --git a/docs/src/user-guide/guides-using-object-storage/index.md b/docs/src/user-guide/guides-using-object-storage/index.md index 985286bc3..2225cd9b4 100644 --- a/docs/src/user-guide/guides-using-object-storage/index.md +++ b/docs/src/user-guide/guides-using-object-storage/index.md @@ -23,27 +23,6 @@ will be added in a future release. ## Prerequisites 1. An S3 bucket and [key prefix][aws-key-prefixes] containing the logs you wish to compress. - * An S3 URL is a combination of a bucket name and a key prefix as shown below: - - :::{mermaid} - %%{ - init: { - "theme": "base", - "themeVariables": { - "primaryColor": "#0066cc", - "primaryTextColor": "#fff", - "primaryBorderColor": "transparent", - "lineColor": "#9580ff", - "secondaryColor": "#9580ff", - "tertiaryColor": "#fff" - } - } - }%% - graph TD - A["s3://my-bucket-name/my-logs-dir/"] --"Bucket name"--> B[my-bucket-name] - A --"Key prefix"--> C[path/to/my/file.txt] - ::: - 2. An S3 bucket and key prefix where you wish to store compressed archives. 3. An S3 bucket and key prefix where you wish to store intermediate files for viewing compressed logs. @@ -53,11 +32,14 @@ will be added in a future release. 
[principle of least privilege][least-privilege-principle], or you can use the same user for all three. * For brevity, we'll refer to this user as the "CLP IAM user" in the rest of this guide. +5. IAM user (long-term) credentials for the IAM user(s) created in step (4) above. + * To create these credentials, follow [this guide][aws-create-access-keys]. -:::{note} -CLP currently requires IAM user (long-term) credentials to access the relevant S3 buckets. Support -for other authentication methods (e.g., temporary credentials) will be added in a future release. -::: + :::{note} + CLP currently requires IAM user (long-term) credentials to access the relevant S3 buckets. + Support for other authentication methods (e.g., temporary credentials) will be added in a future + release. + ::: ## Use cases @@ -96,6 +78,7 @@ archive-storage stream-storage ::: +[aws-create-access-keys]: https://docs.aws.amazon.com/keyspaces/latest/devguide/create.keypair.html [aws-create-iam-user]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html [aws-key-prefixes]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-prefixes.html [least-privilege-principle]: https://en.wikipedia.org/wiki/Principle_of_least_privilege From 2f9ae7f6899a19b21623285d2e17b1cc42ac03eb Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 22 Jan 2025 15:38:30 -0500 Subject: [PATCH 16/31] Change from 'viewing compressed logs' to 'caching stream files'. 
--- .../guides-using-object-storage/index.md | 13 ++++--- .../stream-storage.md | 36 +++++++++++-------- 2 files changed, 28 insertions(+), 21 deletions(-) diff --git a/docs/src/user-guide/guides-using-object-storage/index.md b/docs/src/user-guide/guides-using-object-storage/index.md index 2225cd9b4..37cd00f77 100644 --- a/docs/src/user-guide/guides-using-object-storage/index.md +++ b/docs/src/user-guide/guides-using-object-storage/index.md @@ -4,11 +4,11 @@ CLP can: * compress logs from object storage (e.g., S3); * store archives on object storage; and -* view the compressed logs from object storage. +* cache stream files (used for viewing compressed logs) on object storage. This guide explains how to configure CLP for all three use cases. Note that you can choose to use -object storage for any combination of the three use cases (e.g., compress logs from S3 and view the -compressed logs from S3, but store archives on the local filesystem). +object storage for any combination of the three use cases (e.g., compress logs from S3 and cache the +stream files on S3, but store archives on the local filesystem). :::{note} Currently, only the [clp-json][release-choices] release supports object storage. Support for @@ -24,8 +24,7 @@ will be added in a future release. 1. An S3 bucket and [key prefix][aws-key-prefixes] containing the logs you wish to compress. 2. An S3 bucket and key prefix where you wish to store compressed archives. -3. An S3 bucket and key prefix where you wish to store intermediate files for viewing compressed - logs. +3. An S3 bucket and key prefix where you wish to cache stream files. 4. An AWS IAM user with the necessary permissions to access the S3 prefixes mentioned above. * To create a user, follow [this guide][aws-create-iam-user]. 
* You may use a different IAM user for each use case to follow the @@ -64,9 +63,9 @@ Storing archives on object storage :::{grid-item-card} :link: stream-storage -Viewing compressed logs +Caching stream files ^^^ -Viewing compressed logs from object storage +Caching stream files on object storage ::: :::: diff --git a/docs/src/user-guide/guides-using-object-storage/stream-storage.md b/docs/src/user-guide/guides-using-object-storage/stream-storage.md index 80bbcab57..fe7f8a31c 100644 --- a/docs/src/user-guide/guides-using-object-storage/stream-storage.md +++ b/docs/src/user-guide/guides-using-object-storage/stream-storage.md @@ -1,11 +1,19 @@ -# Viewing compressed logs +# Caching stream files -To view compressed logs from S3, you'll need to: +The [log viewer][yscope-log-viewer] currently supports viewing [IR][uber-clp-blog-1] and JSONL +stream files but not CLP archives; thus, to view the compressed logs from a CLP archive, CLP first +converts the compressed logs into stream files. These streams can be cached on the filesystem, or on +object storage as explained below. -1. Enable the CLP IAM user to access the S3 path where stream files (logs in a format viewable by - the log viewer) should be stored. +:::{note} +A future version of the log viewer will support viewing CLP archives directly. +::: + +To cache the stream files on S3, you'll need to: + +1. Enable the CLP IAM user to access the S3 path where stream files should be stored. 2. Set up a cross-origin resource sharing (CORS) policy for the S3 path in (1). -3. Configure CLP to store stream files under the S3 path in (1). +3. Configure CLP to cache stream files under the S3 path from step (1). ## IAM user configuration @@ -31,8 +39,8 @@ Attach the following policy to the CLP IAM user by following [this guide][add-ia ## Cross-origin resource sharing (CORS) configuration -For CLP's log viewer to be able to view the compressed logs from S3 over the internet, the S3 bucket -must have a CORS policy configured. 
+For CLP's log viewer to be able to open the cached stream files from S3 over the internet, the S3 +bucket must have a CORS policy configured. Add the CORS configuration below to your bucket by following [this guide][aws-cors-guide]: @@ -57,13 +65,13 @@ Add the CORS configuration below to your bucket by following [this guide][aws-co :::{tip} The CORS policy above allows requests from any host (origin). If you already know what hosts will -access CLP's web interface, you can enhance security by changing `AllowedOrigins` from `*` to the -specific list of hosts that will access the web interface. +access CLP's web interface, you can enhance security by changing `AllowedOrigins` from `["*"]` to +the specific list of hosts that will access the web interface. ::: ## Configuring CLP's stream storage location -To configure CLP to store stream files on S3, update the `stream_output.storage` key in +To configure CLP to cache stream files on S3, update the `stream_output.storage` key in `/etc/clp-config.yml`: ```yaml @@ -86,11 +94,11 @@ The configuration keys above function identically to those in `archive_output.st should be configured to use a different S3 path. :::{note} -To view compressed log files, clp-text currently converts them into IR streams that the log viewer -can open, while clp-json converts them into JSONL streams. These streams only need to be stored for -as long as the streams are being viewed, but CLP currently doesn't explicitly delete the streams. -This limitation will be addressed in a future release. +CLP currently doesn't explicitly delete the cached streams. This limitation will be addressed in a +future release. 
::: [aws-cors-guide]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/enabling-cors-examples.html [add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console +[uber-clp-blog-1]: https://www.uber.com/en-US/blog/reducing-logging-cost-by-two-orders-of-magnitude-using-clp +[yscope-log-viewer]: https://github.com/y-scope/yscope-log-viewer From 90dbe51daaeaf862d97c6b52dcd6227ba25c700f Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 22 Jan 2025 16:02:54 -0500 Subject: [PATCH 17/31] Diction. --- .../user-guide/guides-using-object-storage/stream-storage.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/user-guide/guides-using-object-storage/stream-storage.md b/docs/src/user-guide/guides-using-object-storage/stream-storage.md index fe7f8a31c..a4898121e 100644 --- a/docs/src/user-guide/guides-using-object-storage/stream-storage.md +++ b/docs/src/user-guide/guides-using-object-storage/stream-storage.md @@ -39,7 +39,7 @@ Attach the following policy to the CLP IAM user by following [this guide][add-ia ## Cross-origin resource sharing (CORS) configuration -For CLP's log viewer to be able to open the cached stream files from S3 over the internet, the S3 +For CLP's log viewer to be able to access the cached stream files from S3 over the internet, the S3 bucket must have a CORS policy configured. Add the CORS configuration below to your bucket by following [this guide][aws-cors-guide]: From f6171e7723709d664e2a49ff7eee3851a139959e Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 22 Jan 2025 22:40:13 -0500 Subject: [PATCH 18/31] Fix S3 permissions for ingestion; Use virtual-host style URL; Clarify bucket storage prefixes. 
--- .../guides-using-object-storage/archive-storage.md | 5 +++-- .../user-guide/guides-using-object-storage/compress.md | 10 +++++++--- .../guides-using-object-storage/stream-storage.md | 6 ++++++ 3 files changed, 16 insertions(+), 5 deletions(-) diff --git a/docs/src/user-guide/guides-using-object-storage/archive-storage.md b/docs/src/user-guide/guides-using-object-storage/archive-storage.md index b87b791d0..5da4744fb 100644 --- a/docs/src/user-guide/guides-using-object-storage/archive-storage.md +++ b/docs/src/user-guide/guides-using-object-storage/archive-storage.md @@ -29,8 +29,9 @@ Attach the following policy to the CLP IAM user by following [this guide][add-ia Replace the fields in angle brackets (`<>`) with the appropriate values: -* `` should be the name of the S3 bucket to store compressed archives. -* `` should be the prefix where you want the compressed archives to be stored under. +* `` should be the name of the S3 bucket where compressed archives should be stored. +* `` should be the prefix (used like a directory path) where compressed archives should + be stored. 
## Configuring CLP's archive storage location diff --git a/docs/src/user-guide/guides-using-object-storage/compress.md b/docs/src/user-guide/guides-using-object-storage/compress.md index 87382c29d..b3a43a3f1 100644 --- a/docs/src/user-guide/guides-using-object-storage/compress.md +++ b/docs/src/user-guide/guides-using-object-storage/compress.md @@ -17,7 +17,7 @@ Attach the following policy to the CLP IAM user by following [this guide][add-ia "Effect": "Allow", "Action": "s3:GetObject", "Resource": [ - "arn:aws:s3::://*" + "arn:aws:s3:::/*" ] }, { @@ -30,7 +30,7 @@ Attach the following policy to the CLP IAM user by following [this guide][add-ia ], "Condition": { "StringLike": { - "s3:prefix": "/*" + "s3:prefix": "*" } } } @@ -48,7 +48,9 @@ Replace the fields in angle brackets (`<>`) with the appropriate values: You can use the `s3` subcommand as follows: ```bash -sbin/compress.sh s3 --aws-credentials-file s3:/// +sbin/compress.sh s3 \ + --aws-credentials-file \ + https://.s3..amazonaws.com/ ``` * `` is the path to an AWS credentials file like the following: @@ -67,6 +69,7 @@ sbin/compress.sh s3 --aws-credentials-file s3:// expose your credentials to other users running on the system). * `` is the name of the S3 bucket containing your logs. +* `` is the AWS region [code][aws-region-codes] for the S3 bucket containing your logs. * `` is the prefix of all logs you wish to compress. :::{note} @@ -80,3 +83,4 @@ both logs to be compressed). 
This limitation will be addressed in a future relea ::: [add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console +[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability diff --git a/docs/src/user-guide/guides-using-object-storage/stream-storage.md b/docs/src/user-guide/guides-using-object-storage/stream-storage.md index a4898121e..3e59ecd11 100644 --- a/docs/src/user-guide/guides-using-object-storage/stream-storage.md +++ b/docs/src/user-guide/guides-using-object-storage/stream-storage.md @@ -37,6 +37,12 @@ Attach the following policy to the CLP IAM user by following [this guide][add-ia } ``` +Replace the fields in angle brackets (`<>`) with the appropriate values: + +* `` should be the name of the S3 bucket where cached streams should be stored. +* `` should be the prefix (used like a directory path) where cached streams should be + stored. + ## Cross-origin resource sharing (CORS) configuration For CLP's log viewer to be able to access the cached stream files from S3 over the internet, the S3 From 11e6991e0424cbe045a2daa4c19994998004d50b Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 22 Jan 2025 22:41:50 -0500 Subject: [PATCH 19/31] Add timestamp-key to compression example. 
--- docs/src/user-guide/guides-using-object-storage/compress.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/src/user-guide/guides-using-object-storage/compress.md b/docs/src/user-guide/guides-using-object-storage/compress.md index b3a43a3f1..dd6b9dda9 100644 --- a/docs/src/user-guide/guides-using-object-storage/compress.md +++ b/docs/src/user-guide/guides-using-object-storage/compress.md @@ -50,10 +50,12 @@ You can use the `s3` subcommand as follows: ```bash sbin/compress.sh s3 \ --aws-credentials-file \ + --timestamp-key \ https://.s3..amazonaws.com/ ``` * `` is the path to an AWS credentials file like the following: +* `` is the field path of the kv-pair that contains the timestamp in each log event. ```ini [default] From 9c4ee8a4c7befd3604a42c19f2d1d5275df9ed67 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 22 Jan 2025 23:30:22 -0500 Subject: [PATCH 20/31] Clarify key-prefix for ingestion. --- .../guides-using-object-storage/compress.md | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/docs/src/user-guide/guides-using-object-storage/compress.md b/docs/src/user-guide/guides-using-object-storage/compress.md index dd6b9dda9..4ff6ebae3 100644 --- a/docs/src/user-guide/guides-using-object-storage/compress.md +++ b/docs/src/user-guide/guides-using-object-storage/compress.md @@ -42,6 +42,14 @@ Replace the fields in angle brackets (`<>`) with the appropriate values: * `` should be the name of the S3 bucket containing your logs. * `` should be the prefix of all logs you wish to compress. + + :::{note} + If you want to enforce that only logs under a directory-like prefix, e.g., `logs/`, can be + compressed, you can append a trailing slash (`/`) after the `` value. This will + prevent CLP from compressing logs with prefixes like `logs-private`. 
However, note that to + compress all logs under the `logs/` prefix, you will need to include the trailing slash when + invoking `sbin/compress.sh` below. + ::: ## Using `sbin/compress.sh s3` @@ -72,7 +80,8 @@ sbin/compress.sh s3 \ * `` is the name of the S3 bucket containing your logs. * `` is the AWS region [code][aws-region-codes] for the S3 bucket containing your logs. -* `` is the prefix of all logs you wish to compress. +* `` is the prefix of all logs you wish to compress and must include the `` + value from the IAM policy above. :::{note} The `s3` subcommand only supports a single URL but will compress any logs that have the given From 59b72e9cf8426627626a46848bf6fbdb95511946 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Wed, 22 Jan 2025 23:43:16 -0500 Subject: [PATCH 21/31] Clarify the prefix of the prefix must be the prefix. --- .../guides-using-object-storage/compress.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/src/user-guide/guides-using-object-storage/compress.md b/docs/src/user-guide/guides-using-object-storage/compress.md index 4ff6ebae3..6c46d114d 100644 --- a/docs/src/user-guide/guides-using-object-storage/compress.md +++ b/docs/src/user-guide/guides-using-object-storage/compress.md @@ -17,7 +17,7 @@ Attach the following policy to the CLP IAM user by following [this guide][add-ia "Effect": "Allow", "Action": "s3:GetObject", "Resource": [ - "arn:aws:s3:::/*" + "arn:aws:s3:::/*" ] }, { @@ -30,7 +30,7 @@ Attach the following policy to the CLP IAM user by following [this guide][add-ia ], "Condition": { "StringLike": { - "s3:prefix": "*" + "s3:prefix": "*" } } } @@ -41,11 +41,11 @@ Attach the following policy to the CLP IAM user by following [this guide][add-ia Replace the fields in angle brackets (`<>`) with the appropriate values: * `` should be the name of the S3 bucket containing your logs. -* `` should be the prefix of all logs you wish to compress. 
+* `` should be the prefix of all logs you wish to compress. :::{note} If you want to enforce that only logs under a directory-like prefix, e.g., `logs/`, can be - compressed, you can append a trailing slash (`/`) after the `` value. This will + compressed, you can append a trailing slash (`/`) after the `` value. This will prevent CLP from compressing logs with prefixes like `logs-private`. However, note that to compress all logs under the `logs/` prefix, you will need to include the trailing slash when invoking `sbin/compress.sh` below. @@ -59,7 +59,7 @@ You can use the `s3` subcommand as follows: sbin/compress.sh s3 \ --aws-credentials-file \ --timestamp-key \ - https://.s3..amazonaws.com/ + https://.s3..amazonaws.com/ ``` * `` is the path to an AWS credentials file like the following: @@ -80,8 +80,8 @@ sbin/compress.sh s3 \ * `` is the name of the S3 bucket containing your logs. * `` is the AWS region [code][aws-region-codes] for the S3 bucket containing your logs. -* `` is the prefix of all logs you wish to compress and must include the `` - value from the IAM policy above. +* `` is the prefix of all logs you wish to compress and must begin with the + `` value from the IAM policy above. :::{note} The `s3` subcommand only supports a single URL but will compress any logs that have the given From 4ae7d43f940dcef8519b7845c6ec31dc668bbf19 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Thu, 23 Jan 2025 06:26:27 -0500 Subject: [PATCH 22/31] Fix placement of timestamp key line. 
--- docs/src/user-guide/guides-using-object-storage/compress.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/src/user-guide/guides-using-object-storage/compress.md b/docs/src/user-guide/guides-using-object-storage/compress.md index 6c46d114d..a65a8d8ef 100644 --- a/docs/src/user-guide/guides-using-object-storage/compress.md +++ b/docs/src/user-guide/guides-using-object-storage/compress.md @@ -63,7 +63,6 @@ sbin/compress.sh s3 \ ``` * `` is the path to an AWS credentials file like the following: -* `` is the field path of the kv-pair that contains the timestamp in each log event. ```ini [default] @@ -78,6 +77,7 @@ sbin/compress.sh s3 \ line using the `--aws-access-key-id` and `--aws-secret-access-key` flags (note that this may expose your credentials to other users running on the system). +* `` is the field path of the kv-pair that contains the timestamp in each log event. * `` is the name of the S3 bucket containing your logs. * `` is the AWS region [code][aws-region-codes] for the S3 bucket containing your logs. * `` is the prefix of all logs you wish to compress and must begin with the From 847c04250bd8fa309f8f5caf13ee2c8d940667d1 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Thu, 23 Jan 2025 06:26:51 -0500 Subject: [PATCH 23/31] Linebreak before s3 for clarity. 
--- docs/src/user-guide/guides-using-object-storage/compress.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/src/user-guide/guides-using-object-storage/compress.md b/docs/src/user-guide/guides-using-object-storage/compress.md index a65a8d8ef..51e1dce37 100644 --- a/docs/src/user-guide/guides-using-object-storage/compress.md +++ b/docs/src/user-guide/guides-using-object-storage/compress.md @@ -56,7 +56,8 @@ Replace the fields in angle brackets (`<>`) with the appropriate values: You can use the `s3` subcommand as follows: ```bash -sbin/compress.sh s3 \ +sbin/compress.sh \ + s3 \ --aws-credentials-file \ --timestamp-key \ https://.s3..amazonaws.com/ From c752945f63b1ffa0c576af89a1f489359903d4c5 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Thu, 23 Jan 2025 06:27:11 -0500 Subject: [PATCH 24/31] Clarify some instructions based on AWS' UI. --- docs/src/user-guide/guides-using-object-storage/index.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/src/user-guide/guides-using-object-storage/index.md b/docs/src/user-guide/guides-using-object-storage/index.md index 37cd00f77..8e8f218c9 100644 --- a/docs/src/user-guide/guides-using-object-storage/index.md +++ b/docs/src/user-guide/guides-using-object-storage/index.md @@ -27,12 +27,15 @@ will be added in a future release. 3. An S3 bucket and key prefix where you wish to cache stream files. 4. An AWS IAM user with the necessary permissions to access the S3 prefixes mentioned above. * To create a user, follow [this guide][aws-create-iam-user]. + * You don't need to assign any groups or policies to the user at this stage since we will + attach policies in later steps, depending on which object storage use cases you require. * You may use a different IAM user for each use case to follow the [principle of least privilege][least-privilege-principle], or you can use the same user for all three. 
* For brevity, we'll refer to this user as the "CLP IAM user" in the rest of this guide. 5. IAM user (long-term) credentials for the IAM user(s) created in step (4) above. * To create these credentials, follow [this guide][aws-create-access-keys]. + * Choose the "Other" use case to generate long-term credentials. :::{note} CLP currently requires IAM user (long-term) credentials to access the relevant S3 buckets. From 454cb87f5d88d9a1147d63a24a7992980f2ed18e Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Thu, 23 Jan 2025 07:11:30 -0500 Subject: [PATCH 25/31] Restructure to make docs easier to follow and use. --- .../archive-storage.md | 70 -------- .../guides-using-object-storage/clp-config.md | 78 +++++++++ .../{compress.md => clp-usage.md} | 62 +------ .../guides-using-object-storage/index.md | 42 +++-- .../object-storage-config.md | 152 ++++++++++++++++++ .../stream-storage.md | 110 ------------- 6 files changed, 263 insertions(+), 251 deletions(-) delete mode 100644 docs/src/user-guide/guides-using-object-storage/archive-storage.md create mode 100644 docs/src/user-guide/guides-using-object-storage/clp-config.md rename docs/src/user-guide/guides-using-object-storage/{compress.md => clp-usage.md} (53%) create mode 100644 docs/src/user-guide/guides-using-object-storage/object-storage-config.md delete mode 100644 docs/src/user-guide/guides-using-object-storage/stream-storage.md diff --git a/docs/src/user-guide/guides-using-object-storage/archive-storage.md b/docs/src/user-guide/guides-using-object-storage/archive-storage.md deleted file mode 100644 index 5da4744fb..000000000 --- a/docs/src/user-guide/guides-using-object-storage/archive-storage.md +++ /dev/null @@ -1,70 +0,0 @@ -# Storing archives - -To store compressed archives on S3, you'll need to: - -1. Enable the CLP IAM user to access the S3 path where archives should be stored. -2. Configure CLP to store archives under the relevant S3 path. 
- -## IAM user configuration - -Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy]. - -```json -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": [ - "s3:GetObject", - "s3:PutObject" - ], - "Resource": [ - "arn:aws:s3::://*" - ] - } - ] -} -``` - -Replace the fields in angle brackets (`<>`) with the appropriate values: - -* `` should be the name of the S3 bucket where compressed archives should be stored. -* `` should be the prefix (used like a directory path) where compressed archives should - be stored. - -## Configuring CLP's archive storage location - -To configure CLP to store archives on S3, update the `archive_output.storage` key in -`/etc/clp-config.yml`: - -```yaml -archive_output: - storage: - type: "s3" - staging_directory: "var/data/staged-archives" # Or a path of your choosing - s3_config: - region_code: "" - bucket: "" - key_prefix: "" - credentials: - access_key_id: "" - secret_access_key: "" - - # archive_output's other config keys -``` - -Replace the fields in angle brackets (`<>`) with the appropriate values: - -* `staging_directory` is the local filesystem directory where archives will be temporarily stored - before being uploaded to S3. -* `s3_config` configures both the S3 bucket where archives should be stored and the credentials - for accessing it. - * `` is the AWS region [code][aws-region-codes] for the bucket. - * `` is the bucket's name. - * `` is the "directory" where all archives will be stored within the bucket and - must end with trailing forward slash (e.g., `archives/`). - * `credentials` contains the CLP IAM user's credentials. 
- -[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console -[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability diff --git a/docs/src/user-guide/guides-using-object-storage/clp-config.md b/docs/src/user-guide/guides-using-object-storage/clp-config.md new file mode 100644 index 000000000..2e51ad4f0 --- /dev/null +++ b/docs/src/user-guide/guides-using-object-storage/clp-config.md @@ -0,0 +1,78 @@ +# Configuring CLP + +To use object storage with CLP, follow the steps below to configure each use case you require. + +:::{note} +If CLP is already running, shut it down, update its configuration, and then start it again. +::: + +## Configuration for archive storage + +To configure CLP to store archives on S3, update the `archive_output.storage` key in +`/etc/clp-config.yml` with the values in the code block below, replacing the fields in +angle brackets (`<>`) with the appropriate values: + +```yaml +archive_output: + storage: + type: "s3" + staging_directory: "var/data/staged-archives" # Or a path of your choosing + s3_config: + region_code: "" + bucket: "" + key_prefix: "" + credentials: + access_key_id: "" + secret_access_key: "" + + # archive_output's other config keys +``` + +* `staging_directory` is the local filesystem directory where archives will be temporarily stored + before being uploaded to S3. +* `s3_config` configures both the S3 bucket where archives should be stored and the credentials + for accessing it. + * `` is the AWS region [code][aws-region-codes] for the bucket. + * `` is the bucket's name. + * `` is the "directory" where all archives will be stored within the bucket and + must end with trailing forward slash (e.g., `archives/`). + * `credentials` contains the CLP IAM user's credentials. 
+ +## Configuration for stream storage + +To configure CLP to cache stream files on S3, update the `stream_output.storage` key in +`/etc/clp-config.yml` with the values in the code block below, replacing the fields in +angle brackets (`<>`) with the appropriate values: + +```yaml +stream_output: + storage: + type: "s3" + staging_directory: "var/data/staged-streams" # Or a path of your choosing + s3_config: + region_code: "" + bucket: "" + key_prefix: "" + credentials: + access_key_id: "" + secret_access_key: "" + + # stream_output's other config keys +``` + +* `staging_directory` is the local filesystem directory where streams will be temporarily stored + before being uploaded to S3. +* `s3_config` configures both the S3 bucket where streams should be stored and the credentials + for accessing it. + * `` is the AWS region [code][aws-region-codes] for the bucket. + * `` is the bucket's name. + * `` is the "directory" where all streams will be stored within the bucket and + must end with trailing forward slash (e.g., `streams/`). + * `credentials` contains the CLP IAM user's credentials. + +:::{note} +CLP currently doesn't explicitly delete the cached streams. This limitation will be addressed in a +future release. 
+::: + +[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability diff --git a/docs/src/user-guide/guides-using-object-storage/compress.md b/docs/src/user-guide/guides-using-object-storage/clp-usage.md similarity index 53% rename from docs/src/user-guide/guides-using-object-storage/compress.md rename to docs/src/user-guide/guides-using-object-storage/clp-usage.md index 51e1dce37..6fab2db44 100644 --- a/docs/src/user-guide/guides-using-object-storage/compress.md +++ b/docs/src/user-guide/guides-using-object-storage/clp-usage.md @@ -1,59 +1,12 @@ -# Compressing logs +# Using CLP with object storage -To compress logs from S3, you'll need to: +To compress logs from S3, follow the steps in the section below. For all other operations, you +should be able to use CLP as described in the [quick start](../quick-start-overview.md) guide. -1. Enable the CLP IAM user to access the S3 path containing your logs. -2. Use the `s3` subcommand of `sbin/compress.sh` to compress your logs. +## Compressing logs from S3 -## IAM user configuration - -Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy]. - -```json -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": "s3:GetObject", - "Resource": [ - "arn:aws:s3:::/*" - ] - }, - { - "Effect": "Allow", - "Action": [ - "s3:ListBucket" - ], - "Resource": [ - "arn:aws:s3:::" - ], - "Condition": { - "StringLike": { - "s3:prefix": "*" - } - } - } - ] -} -``` - -Replace the fields in angle brackets (`<>`) with the appropriate values: - -* `` should be the name of the S3 bucket containing your logs. -* `` should be the prefix of all logs you wish to compress. - - :::{note} - If you want to enforce that only logs under a directory-like prefix, e.g., `logs/`, can be - compressed, you can append a trailing slash (`/`) after the `` value. 
This will - prevent CLP from compressing logs with prefixes like `logs-private`. However, note that to - compress all logs under the `logs/` prefix, you will need to include the trailing slash when - invoking `sbin/compress.sh` below. - ::: - -## Using `sbin/compress.sh s3` - -You can use the `s3` subcommand as follows: +To compress logs from S3, use the `s3` subcommand as follows, replacing the fields in angle brackets +(`<>`) with the appropriate values: ```bash sbin/compress.sh \ @@ -82,7 +35,7 @@ sbin/compress.sh \ * `` is the name of the S3 bucket containing your logs. * `` is the AWS region [code][aws-region-codes] for the S3 bucket containing your logs. * `` is the prefix of all logs you wish to compress and must begin with the - `` value from the IAM policy above. + `` value from the [compression IAM policy][compression-iam-policy]. :::{note} The `s3` subcommand only supports a single URL but will compress any logs that have the given @@ -96,3 +49,4 @@ both logs to be compressed). This limitation will be addressed in a future relea [add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console [aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability +[compression-iam-policy]: ./object-storage-config.md#configuration-for-compression \ No newline at end of file diff --git a/docs/src/user-guide/guides-using-object-storage/index.md b/docs/src/user-guide/guides-using-object-storage/index.md index 8e8f218c9..119f1f013 100644 --- a/docs/src/user-guide/guides-using-object-storage/index.md +++ b/docs/src/user-guide/guides-using-object-storage/index.md @@ -6,9 +6,9 @@ CLP can: * store archives on object storage; and * cache stream files (used for viewing compressed logs) on object storage. -This guide explains how to configure CLP for all three use cases. 
Note that you can choose to use -object storage for any combination of the three use cases (e.g., compress logs from S3 and cache the -stream files on S3, but store archives on the local filesystem). +This guide explains how to configure and use CLP for all three use cases. Note that you can choose +to use object storage for any combination of the three use cases (e.g., compress logs from S3 and +cache the stream files on S3, but store archives on the local filesystem). :::{note} Currently, only the [clp-json][release-choices] release supports object storage. Support for @@ -43,41 +43,49 @@ will be added in a future release. release. ::: -## Use cases +## Configuration -The following subsections below explain how to set up each use case: +The subsections below explain how to configure your object storage bucket and CLP for each use case: ::::{grid} 1 1 1 1 :gutter: 2 :::{grid-item-card} -:link: compress -Compressing logs +:link: object-storage-config +Configuring object storage ^^^ -Compressing logs from object storage +Configuring your object storage bucket for each use case. ::: :::{grid-item-card} -:link: archive-storage -Storing archives +:link: clp-config +Configuring CLP ^^^ -Storing archives on object storage +Configuring CLP to use object storage for each use case. ::: +:::: + +## Using CLP with object storage + +The subsection below explains how to use CLP with object storage for each use case: + +::::{grid} 1 1 1 1 +:gutter: 2 :::{grid-item-card} -:link: stream-storage -Caching stream files +:link: clp-usage +Using CLP with object storage ^^^ -Caching stream files on object storage +Using CLP to compress, search, and view log files from object storage. 
::: :::: :::{toctree} :hidden: -compress -archive-storage -stream-storage +object-storage-config +clp-config +clp-usage ::: [aws-create-access-keys]: https://docs.aws.amazon.com/keyspaces/latest/devguide/create.keypair.html diff --git a/docs/src/user-guide/guides-using-object-storage/object-storage-config.md b/docs/src/user-guide/guides-using-object-storage/object-storage-config.md new file mode 100644 index 000000000..b0a1cdbda --- /dev/null +++ b/docs/src/user-guide/guides-using-object-storage/object-storage-config.md @@ -0,0 +1,152 @@ +# Configuring object storage + +To use object storage with CLP, follow the steps below to configure the CLP IAM user and your object +storage bucket(s) for each use case you require. + +## Configuration for compression + +[Attach the policy][add-iam-policy] below to the CLP IAM user (you can use the JSON editor), +replacing the fields in angle brackets (`<>`) with the appropriate values: + +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": "s3:GetObject", + "Resource": [ + "arn:aws:s3:::/*" + ] + }, + { + "Effect": "Allow", + "Action": "s3:ListBucket", + "Resource": [ + "arn:aws:s3:::" + ], + "Condition": { + "StringLike": { + "s3:prefix": "*" + } + } + } + ] +} +``` + +* `` should be the name of the S3 bucket containing your logs. +* `` should be the prefix of all logs you wish to compress. + + :::{note} + If you want to enforce that only logs under a directory-like prefix, e.g., `logs/`, can be + compressed, you can append a trailing slash (`/`) after the `` value. This will + prevent CLP from compressing logs with prefixes like `logs-private`. However, note that to + compress all logs under the `logs/` prefix, you will need to include the trailing slash when + invoking `sbin/compress.sh` below. 
+ ::: + +## Configuration for archive storage + +[Attach the policy][add-iam-policy] below to the CLP IAM user (you can use the JSON editor), +replacing the fields in angle brackets (`<>`) with the appropriate values: + +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:PutObject" + ], + "Resource": [ + "arn:aws:s3::://*" + ] + } + ] +} +``` + +* `` should be the name of the S3 bucket where compressed archives should be stored. +* `` should be the prefix (used like a directory path) where compressed archives should + be stored. + +## Configuration for stream storage + +The [log viewer][yscope-log-viewer] currently supports viewing [IR][uber-clp-blog-1] and JSONL +stream files but not CLP archives; thus, to view the compressed logs from a CLP archive, CLP first +converts the compressed logs into stream files. These streams can be cached on the filesystem, or on +object storage. + +:::{note} +A future version of the log viewer will support viewing CLP archives directly. +::: + +Storing streams on S3 requires both configuring the CLP IAM user and setting up a cross-origin +resource sharing (CORS) policy for the S3 bucket. + +### IAM user configuration + +[Attach the policy][add-iam-policy] below to the CLP IAM user (you can use the JSON editor), +replacing the fields in angle brackets (`<>`) with the appropriate values: + +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:PutObject" + ], + "Resource": [ + "arn:aws:s3::://*" + ] + } + ] +} +``` + +* `` should be the name of the S3 bucket where cached streams should be stored. +* `` should be the prefix (used like a directory path) where cached streams should be + stored. + +### Cross-origin resource sharing (CORS) configuration + +For CLP's log viewer to be able to access the cached stream files from S3 over the internet, the S3 +bucket must have a CORS policy configured. 
+ +Add the CORS configuration below to your bucket by following [this guide][aws-cors-guide]: + +```json +[ + { + "AllowedHeaders": [ + "*" + ], + "AllowedMethods": [ + "GET" + ], + "AllowedOrigins": [ + "*" + ], + "ExposeHeaders": [ + "Access-Control-Allow-Origin" + ] + } +] +``` + +:::{tip} +The CORS policy above allows requests from any host (origin). If you already know what hosts will +access CLP's web interface, you can enhance security by changing `AllowedOrigins` from `["*"]` to +the specific list of hosts that will access the web interface. +::: + +[aws-cors-guide]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/enabling-cors-examples.html +[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console +[uber-clp-blog-1]: https://www.uber.com/en-US/blog/reducing-logging-cost-by-two-orders-of-magnitude-using-clp +[yscope-log-viewer]: https://github.com/y-scope/yscope-log-viewer diff --git a/docs/src/user-guide/guides-using-object-storage/stream-storage.md b/docs/src/user-guide/guides-using-object-storage/stream-storage.md deleted file mode 100644 index 3e59ecd11..000000000 --- a/docs/src/user-guide/guides-using-object-storage/stream-storage.md +++ /dev/null @@ -1,110 +0,0 @@ -# Caching stream files - -The [log viewer][yscope-log-viewer] currently supports viewing [IR][uber-clp-blog-1] and JSONL -stream files but not CLP archives; thus, to view the compressed logs from a CLP archive, CLP first -converts the compressed logs into stream files. These streams can be cached on the filesystem, or on -object storage as explained below. - -:::{note} -A future version of the log viewer will support viewing CLP archives directly. -::: - -To cache the stream files on S3, you'll need to: - -1. Enable the CLP IAM user to access the S3 path where stream files should be stored. -2. Set up a cross-origin resource sharing (CORS) policy for the S3 path in (1). -3. 
Configure CLP to cache stream files under the S3 path from step (1). - -## IAM user configuration - -Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy]. - -```json -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": [ - "s3:GetObject", - "s3:PutObject" - ], - "Resource": [ - "arn:aws:s3::://*" - ] - } - ] -} -``` - -Replace the fields in angle brackets (`<>`) with the appropriate values: - -* `` should be the name of the S3 bucket where cached streams should be stored. -* `` should be the prefix (used like a directory path) where cached streams should be - stored. - -## Cross-origin resource sharing (CORS) configuration - -For CLP's log viewer to be able to access the cached stream files from S3 over the internet, the S3 -bucket must have a CORS policy configured. - -Add the CORS configuration below to your bucket by following [this guide][aws-cors-guide]: - -```json -[ - { - "AllowedHeaders": [ - "*" - ], - "AllowedMethods": [ - "GET" - ], - "AllowedOrigins": [ - "*" - ], - "ExposeHeaders": [ - "Access-Control-Allow-Origin" - ] - } -] -``` - -:::{tip} -The CORS policy above allows requests from any host (origin). If you already know what hosts will -access CLP's web interface, you can enhance security by changing `AllowedOrigins` from `["*"]` to -the specific list of hosts that will access the web interface. 
-::: - -## Configuring CLP's stream storage location - -To configure CLP to cache stream files on S3, update the `stream_output.storage` key in -`/etc/clp-config.yml`: - -```yaml -stream_output: - storage: - type: "s3" - staging_directory: "var/data/staged-streams" # Or a path of your choosing - s3_config: - region_code: "" - bucket: "" - key_prefix: "" - credentials: - access_key_id: "" - secret_access_key: "" - - # stream_output's other config keys -``` - -The configuration keys above function identically to those in `archive_output.storage`, except they -should be configured to use a different S3 path. - -:::{note} -CLP currently doesn't explicitly delete the cached streams. This limitation will be addressed in a -future release. -::: - -[aws-cors-guide]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/enabling-cors-examples.html -[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console -[uber-clp-blog-1]: https://www.uber.com/en-US/blog/reducing-logging-cost-by-two-orders-of-magnitude-using-clp -[yscope-log-viewer]: https://github.com/y-scope/yscope-log-viewer From 57690eadc164ccdef7a1485c28f73bde0abf1731 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Thu, 23 Jan 2025 07:18:58 -0500 Subject: [PATCH 26/31] Apply the rabbit's suggestions. --- docs/src/user-guide/guides-using-object-storage/clp-config.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/src/user-guide/guides-using-object-storage/clp-config.md b/docs/src/user-guide/guides-using-object-storage/clp-config.md index 2e51ad4f0..02e3b9360 100644 --- a/docs/src/user-guide/guides-using-object-storage/clp-config.md +++ b/docs/src/user-guide/guides-using-object-storage/clp-config.md @@ -35,7 +35,7 @@ archive_output: * `` is the AWS region [code][aws-region-codes] for the bucket. * `` is the bucket's name. 
* `` is the "directory" where all archives will be stored within the bucket and - must end with trailing forward slash (e.g., `archives/`). + must end with a trailing forward slash (e.g., `archives/`). * `credentials` contains the CLP IAM user's credentials. ## Configuration for stream storage @@ -67,7 +67,7 @@ stream_output: * `` is the AWS region [code][aws-region-codes] for the bucket. * `` is the bucket's name. * `` is the "directory" where all streams will be stored within the bucket and - must end with trailing forward slash (e.g., `streams/`). + must end with a trailing forward slash (e.g., `streams/`). * `credentials` contains the CLP IAM user's credentials. :::{note} From e1e44441f8c7dc653ad9d6c6983c3e053e3b9996 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Thu, 23 Jan 2025 08:04:09 -0500 Subject: [PATCH 27/31] Mention that users need to be familiar with the quick start guide. --- .../user-guide/guides-using-object-storage/index.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/docs/src/user-guide/guides-using-object-storage/index.md b/docs/src/user-guide/guides-using-object-storage/index.md index 119f1f013..b08c8a112 100644 --- a/docs/src/user-guide/guides-using-object-storage/index.md +++ b/docs/src/user-guide/guides-using-object-storage/index.md @@ -22,10 +22,12 @@ will be added in a future release. ## Prerequisites -1. An S3 bucket and [key prefix][aws-key-prefixes] containing the logs you wish to compress. -2. An S3 bucket and key prefix where you wish to store compressed archives. -3. An S3 bucket and key prefix where you wish to cache stream files. -4. An AWS IAM user with the necessary permissions to access the S3 prefixes mentioned above. +1. This guide assumes you're able to configure, start, stop, and use a CLP cluster as described in + the [quick-start guide](../quick-start-overview.md). +2. 
An S3 bucket and [key prefix][aws-key-prefixes] containing the logs you wish to compress. +3. An S3 bucket and key prefix where you wish to store compressed archives. +4. An S3 bucket and key prefix where you wish to cache stream files. +5. An AWS IAM user with the necessary permissions to access the S3 prefixes mentioned above. * To create a user, follow [this guide][aws-create-iam-user]. * You don't need to assign any groups or policies to the user at this stage since we will attach policies in later steps, depending on which object storage use cases you require. @@ -33,7 +35,7 @@ will be added in a future release. [principle of least privilege][least-privilege-principle], or you can use the same user for all three. * For brevity, we'll refer to this user as the "CLP IAM user" in the rest of this guide. -5. IAM user (long-term) credentials for the IAM user(s) created in step (4) above. +6. IAM user (long-term) credentials for the IAM user(s) created in step (4) above. * To create these credentials, follow [this guide][aws-create-access-keys]. * Choose the "Other" use case to generate long-term credentials. From a57daafe617ee4896e793ca356f16b912759e0d7 Mon Sep 17 00:00:00 2001 From: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Fri, 24 Jan 2025 00:35:55 -0500 Subject: [PATCH 28/31] Apply suggestions from code review Co-authored-by: haiqi96 <14502009+haiqi96@users.noreply.github.com> --- .../guides-using-object-storage/object-storage-config.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/src/user-guide/guides-using-object-storage/object-storage-config.md b/docs/src/user-guide/guides-using-object-storage/object-storage-config.md index b0a1cdbda..fcd449951 100644 --- a/docs/src/user-guide/guides-using-object-storage/object-storage-config.md +++ b/docs/src/user-guide/guides-using-object-storage/object-storage-config.md @@ -5,7 +5,7 @@ storage bucket(s) for each use case you require. 
## Configuration for compression -[Attach the policy][add-iam-policy] below to the CLP IAM user (you can use the JSON editor), +[Attach the inline policy][add-iam-policy] below to the CLP IAM user (you can use the JSON editor), replacing the fields in angle brackets (`<>`) with the appropriate values: ```json @@ -48,7 +48,7 @@ replacing the fields in angle brackets (`<>`) with the appropriate values: ## Configuration for archive storage -[Attach the policy][add-iam-policy] below to the CLP IAM user (you can use the JSON editor), +[Attach the inline policy][add-iam-policy] below to the CLP IAM user (you can use the JSON editor), replacing the fields in angle brackets (`<>`) with the appropriate values: ```json @@ -89,7 +89,7 @@ resource sharing (CORS) policy for the S3 bucket. ### IAM user configuration -[Attach the policy][add-iam-policy] below to the CLP IAM user (you can use the JSON editor), +[Attach the inline policy][add-iam-policy] below to the CLP IAM user (you can use the JSON editor), replacing the fields in angle brackets (`<>`) with the appropriate values: ```json From 73de4de48f1fd9a61cc34f3f274e05fdd1c7ae37 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Thu, 23 Jan 2025 21:33:50 -0500 Subject: [PATCH 29/31] Fix indentation. 
--- .../object-storage-config.md | 20 +++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/docs/src/user-guide/guides-using-object-storage/object-storage-config.md b/docs/src/user-guide/guides-using-object-storage/object-storage-config.md index fcd449951..9cce64b99 100644 --- a/docs/src/user-guide/guides-using-object-storage/object-storage-config.md +++ b/docs/src/user-guide/guides-using-object-storage/object-storage-config.md @@ -55,16 +55,16 @@ replacing the fields in angle brackets (`<>`) with the appropriate values: { "Version": "2012-10-17", "Statement": [ - { - "Effect": "Allow", - "Action": [ - "s3:GetObject", - "s3:PutObject" - ], - "Resource": [ - "arn:aws:s3::://*" - ] - } + { + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:PutObject" + ], + "Resource": [ + "arn:aws:s3::://*" + ] + } ] } ``` From 5808e0d5a6283249ef084c3031df4e8da489f3a5 Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Fri, 24 Jan 2025 00:36:51 -0500 Subject: [PATCH 30/31] Marco review. --- docs/src/user-guide/guides-using-object-storage/index.md | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/docs/src/user-guide/guides-using-object-storage/index.md b/docs/src/user-guide/guides-using-object-storage/index.md index b08c8a112..aebe45e4e 100644 --- a/docs/src/user-guide/guides-using-object-storage/index.md +++ b/docs/src/user-guide/guides-using-object-storage/index.md @@ -31,9 +31,7 @@ will be added in a future release. * To create a user, follow [this guide][aws-create-iam-user]. * You don't need to assign any groups or policies to the user at this stage since we will attach policies in later steps, depending on which object storage use cases you require. - * You may use a different IAM user for each use case to follow the - [principle of least privilege][least-privilege-principle], or you can use the same user for - all three. 
+ * You may use a single IAM user for all use cases, or a separate one for each. * For brevity, we'll refer to this user as the "CLP IAM user" in the rest of this guide. 6. IAM user (long-term) credentials for the IAM user(s) created in step (4) above. * To create these credentials, follow [this guide][aws-create-access-keys]. @@ -93,5 +91,4 @@ clp-usage [aws-create-access-keys]: https://docs.aws.amazon.com/keyspaces/latest/devguide/create.keypair.html [aws-create-iam-user]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html [aws-key-prefixes]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-prefixes.html -[least-privilege-principle]: https://en.wikipedia.org/wiki/Principle_of_least_privilege [release-choices]: ../quick-start-cluster-setup/index.md#choosing-a-release From 8e06c1ae8b23aadc7b88ab7e317e64be08ecb70d Mon Sep 17 00:00:00 2001 From: Kirk Rodrigues <2454684+kirkrodrigues@users.noreply.github.com> Date: Fri, 24 Jan 2025 01:18:50 -0500 Subject: [PATCH 31/31] Minor edit for consistency. --- docs/src/user-guide/guides-using-object-storage/index.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/docs/src/user-guide/guides-using-object-storage/index.md b/docs/src/user-guide/guides-using-object-storage/index.md index aebe45e4e..d3f0aa536 100644 --- a/docs/src/user-guide/guides-using-object-storage/index.md +++ b/docs/src/user-guide/guides-using-object-storage/index.md @@ -27,7 +27,8 @@ will be added in a future release. 2. An S3 bucket and [key prefix][aws-key-prefixes] containing the logs you wish to compress. 3. An S3 bucket and key prefix where you wish to store compressed archives. 4. An S3 bucket and key prefix where you wish to cache stream files. -5. An AWS IAM user with the necessary permissions to access the S3 prefixes mentioned above. +5. An AWS IAM user with the necessary permissions to access the S3 bucket(s) and prefixes mentioned + above. 
* To create a user, follow [this guide][aws-create-iam-user]. * You don't need to assign any groups or policies to the user at this stage since we will attach policies in later steps, depending on which object storage use cases you require.
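As a rough illustration of the compression IAM policy added in `object-storage-config.md` earlier in this series, the sketch below builds that policy document programmatically. It is a minimal sketch under stated assumptions, not part of the patch: the function name `build_compression_policy`, its parameters, and the example bucket and prefix are hypothetical, and the positions where the bucket name and log prefix are substituted are inferred from the surrounding prose (the prefix is assumed to be followed directly by `*`).

```python
import json


def build_compression_policy(bucket_name: str, logs_prefix: str) -> dict:
    """Build a read-only policy for objects under a prefix (hypothetical helper).

    Assumption: the bucket name and prefix fill the angle-bracket fields of the
    policy template, with the prefix immediately followed by a `*` wildcard.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow reading any object whose key starts with the prefix.
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": [f"arn:aws:s3:::{bucket_name}/{logs_prefix}*"],
            },
            {
                # Allow listing the bucket, restricted to keys under the prefix.
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": [f"arn:aws:s3:::{bucket_name}"],
                "Condition": {"StringLike": {"s3:prefix": f"{logs_prefix}*"}},
            },
        ],
    }


if __name__ == "__main__":
    # Example values are placeholders; substitute your own bucket and prefix.
    print(json.dumps(build_compression_policy("my-log-bucket", "logs/"), indent=2))
```

Passing a prefix with a trailing slash (e.g., `logs/`) mirrors the note in the guide: it keeps the policy from matching sibling prefixes such as `logs-private`.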