docs: Add guides for using clp-json with object storage; Update compression scripts docs missed in previous PRs. (#683)
Co-authored-by: Haiqi Xu <[email protected]>
1 parent 66067d6, commit 230d518
Showing 8 changed files with 415 additions and 2 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
# Overview | ||
|
||
The guides below describe how to use CLP in different use cases. | ||
|
||
::::{grid} 1 1 2 2 | ||
:gutter: 2 | ||
|
||
:::{grid-item-card} | ||
:link: guides-using-object-storage/index | ||
Using object storage | ||
^^^ | ||
Using CLP to ingest logs from object storage and store archives on object storage. | ||
::: | ||
:::: |
78 changes: 78 additions & 0 deletions
docs/src/user-guide/guides-using-object-storage/clp-config.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
# Configuring CLP | ||
|
||
To use object storage with CLP, follow the steps below to configure each use case you require. | ||
|
||
:::{note} | ||
If CLP is already running, shut it down, update its configuration, and then start it again. | ||
::: | ||
|
||
## Configuration for archive storage | ||
|
||
To configure CLP to store archives on S3, update the `archive_output.storage` key in | ||
`<package>/etc/clp-config.yml` with the values in the code block below, replacing the fields in | ||
angle brackets (`<>`) with the appropriate values: | ||
|
||
```yaml
archive_output:
  storage:
    type: "s3"
    staging_directory: "var/data/staged-archives"  # Or a path of your choosing
    s3_config:
      region_code: "<region-code>"
      bucket: "<bucket-name>"
      key_prefix: "<key-prefix>"
      credentials:
        access_key_id: "<aws-access-key-id>"
        secret_access_key: "<aws-secret-access-key>"

  # archive_output's other config keys
```
|
||
* `staging_directory` is the local filesystem directory where archives will be temporarily stored
  before being uploaded to S3.
* `s3_config` configures both the S3 bucket where archives should be stored and the credentials
  for accessing it.
  * `<region-code>` is the AWS region [code][aws-region-codes] for the bucket.
  * `<bucket-name>` is the bucket's name.
  * `<key-prefix>` is the "directory" where all archives will be stored within the bucket and
    must end with a trailing forward slash (e.g., `archives/`).
  * `credentials` contains the CLP IAM user's credentials.
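
The requirements in the list above can be checked before starting CLP. The following is a minimal sketch, not part of CLP: `validate_s3_storage` is a hypothetical helper, and the example values (`us-east-2`, `my-bucket`, the credential strings) are made up. It takes the `storage` mapping as a plain Python dict, so how the YAML file is loaded is left to the reader.

```python
def validate_s3_storage(storage: dict) -> list[str]:
    """Return a list of problems found in an S3 `storage` mapping (hypothetical helper)."""
    problems = []
    if storage.get("type") != "s3":
        problems.append('type must be "s3"')
    if not storage.get("staging_directory"):
        problems.append("staging_directory is missing")

    s3_config = storage.get("s3_config") or {}
    for key in ("region_code", "bucket", "key_prefix"):
        value = s3_config.get(key, "")
        # Values that still look like "<region-code>" were never filled in.
        if not value or value.startswith("<"):
            problems.append(f"s3_config.{key} is missing or still a placeholder")

    key_prefix = s3_config.get("key_prefix", "")
    if key_prefix and not key_prefix.endswith("/"):
        problems.append("s3_config.key_prefix must end with a trailing forward slash")

    credentials = s3_config.get("credentials") or {}
    for key in ("access_key_id", "secret_access_key"):
        if not credentials.get(key):
            problems.append(f"s3_config.credentials.{key} is missing")
    return problems


# Example values are assumptions for illustration only.
example_storage = {
    "type": "s3",
    "staging_directory": "var/data/staged-archives",
    "s3_config": {
        "region_code": "us-east-2",
        "bucket": "my-bucket",
        "key_prefix": "archives",  # Missing the trailing slash on purpose
        "credentials": {
            "access_key_id": "EXAMPLE-KEY-ID",
            "secret_access_key": "EXAMPLE-SECRET",
        },
    },
}
print(validate_s3_storage(example_storage))
```

The same check should apply to any `storage` block that uses the keys shown above, since it only inspects those keys.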
|
||
## Configuration for stream storage | ||
|
||
To configure CLP to cache stream files on S3, update the `stream_output.storage` key in | ||
`<package>/etc/clp-config.yml` with the values in the code block below, replacing the fields in | ||
angle brackets (`<>`) with the appropriate values: | ||
|
||
```yaml
stream_output:
  storage:
    type: "s3"
    staging_directory: "var/data/staged-streams"  # Or a path of your choosing
    s3_config:
      region_code: "<region-code>"
      bucket: "<bucket-name>"
      key_prefix: "<key-prefix>"
      credentials:
        access_key_id: "<aws-access-key-id>"
        secret_access_key: "<aws-secret-access-key>"

  # stream_output's other config keys
```
|
||
* `staging_directory` is the local filesystem directory where streams will be temporarily stored
  before being uploaded to S3.
* `s3_config` configures both the S3 bucket where streams should be stored and the credentials
  for accessing it.
  * `<region-code>` is the AWS region [code][aws-region-codes] for the bucket.
  * `<bucket-name>` is the bucket's name.
  * `<key-prefix>` is the "directory" where all streams will be stored within the bucket and
    must end with a trailing forward slash (e.g., `streams/`).
  * `credentials` contains the CLP IAM user's credentials.
|
||
:::{note} | ||
CLP currently doesn't explicitly delete the cached streams. This limitation will be addressed in a | ||
future release. | ||
::: | ||
|
||
[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability |
52 changes: 52 additions & 0 deletions
docs/src/user-guide/guides-using-object-storage/clp-usage.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
# Using CLP with object storage | ||
|
||
To compress logs from S3, follow the steps in the section below. For all other operations, you | ||
should be able to use CLP as described in the [quick start](../quick-start-overview.md) guide. | ||
|
||
## Compressing logs from S3 | ||
|
||
To compress logs from S3, use the `s3` subcommand as follows, replacing the fields in angle brackets | ||
(`<>`) with the appropriate values: | ||
|
||
```bash
sbin/compress.sh \
  s3 \
  --aws-credentials-file <credentials-file> \
  --timestamp-key <timestamp-key> \
  https://<bucket-name>.s3.<region-code>.amazonaws.com/<prefix>
```
|
||
* `<credentials-file>` is the path to an AWS credentials file like the following:

  ```ini
  [default]
  aws_access_key_id = <aws-access-key-id>
  aws_secret_access_key = <aws-secret-access-key>
  ```

  * CLP expects the credentials to be in the `default` section.
  * `<aws-access-key-id>` and `<aws-secret-access-key>` are the access key ID and secret access
    key of the CLP IAM user.
  * If you don't want to use a credentials file, you can specify the credentials on the command
    line using the `--aws-access-key-id` and `--aws-secret-access-key` flags. Note that this may
    expose your credentials to other users on the system, since command-line arguments are
    typically visible in the process list.

* `<timestamp-key>` is the field path of the kv-pair that contains the timestamp in each log event.
* `<bucket-name>` is the name of the S3 bucket containing your logs.
* `<region-code>` is the AWS region [code][aws-region-codes] for the S3 bucket containing your logs.
* `<prefix>` is the prefix of all logs you wish to compress and must begin with the
  `<all-logs-prefix>` value from the [compression IAM policy][compression-iam-policy].
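
Before invoking the command above, it can help to confirm that the credentials file has the expected shape and to assemble the URL from its parts. The sketch below is an illustration only and is not how CLP itself reads the file: `read_default_credentials` and `build_s3_url` are hypothetical helpers, and the bucket, region, and prefix values are made up.

```python
import configparser


def read_default_credentials(text: str) -> tuple[str, str]:
    """Return (access key ID, secret access key) from the [default] section."""
    # Interpolation is disabled so a "%" in a value is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    if not parser.has_section("default"):
        raise ValueError("credentials must be in the [default] section")
    section = parser["default"]
    try:
        return section["aws_access_key_id"], section["aws_secret_access_key"]
    except KeyError as err:
        raise ValueError(f"missing key in [default]: {err}") from None


def build_s3_url(bucket_name: str, region_code: str, prefix: str) -> str:
    """Assemble the URL form used in the command above."""
    return f"https://{bucket_name}.s3.{region_code}.amazonaws.com/{prefix}"


# Example values are assumptions for illustration only.
credentials_text = """
[default]
aws_access_key_id = EXAMPLE-KEY-ID
aws_secret_access_key = EXAMPLE-SECRET
"""
print(read_default_credentials(credentials_text))
print(build_s3_url("my-logs", "us-east-2", "logs/app/"))
```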
|
||
:::{note} | ||
The `s3` subcommand only supports a single URL but will compress any logs that have the given | ||
prefix. | ||
|
||
If you wish to compress a single log file, specify the entire path to the log file. However, if that | ||
log file's path is a prefix of another log file's path, then both log files will be compressed | ||
(e.g., with two files "logs/syslog" and "logs/syslog.1", a prefix like "logs/syslog" will cause | ||
both logs to be compressed). This limitation will be addressed in a future release. | ||
::: | ||
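
The prefix behaviour described in the note can be modelled as a plain string-prefix comparison over object keys. This is a simplified sketch under that assumption, not CLP's actual code; `keys_matching_prefix` and the key names are hypothetical.

```python
def keys_matching_prefix(keys: list[str], prefix: str) -> list[str]:
    """Return every object key that starts with the given prefix."""
    return [key for key in keys if key.startswith(prefix)]


# Hypothetical object keys in a bucket.
keys = ["logs/syslog", "logs/syslog.1", "logs/auth.log"]

# Asking for the single file "logs/syslog" also selects "logs/syslog.1".
print(keys_matching_prefix(keys, "logs/syslog"))  # ['logs/syslog', 'logs/syslog.1']
```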
|
||
[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console | ||
[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability | ||
[compression-iam-policy]: ./object-storage-config.md#configuration-for-compression |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,95 @@ | ||
# Using object storage | ||
|
||
CLP can: | ||
|
||
* compress logs from object storage (e.g., S3); | ||
* store archives on object storage; and | ||
* cache stream files (used for viewing compressed logs) on object storage. | ||
|
||
This guide explains how to configure and use CLP for all three use cases. Note that you can choose | ||
to use object storage for any combination of the three use cases (e.g., compress logs from S3 and | ||
cache the stream files on S3, but store archives on the local filesystem). | ||
|
||
:::{note} | ||
Currently, only the [clp-json][release-choices] release supports object storage. Support for | ||
`clp-text` will be added in a future release. | ||
::: | ||
|
||
:::{note} | ||
Currently, CLP only supports using S3 as object storage. Support for other object storage services | ||
will be added in a future release. | ||
::: | ||
|
||
## Prerequisites | ||
|
||
1. This guide assumes you're able to configure, start, stop, and use a CLP cluster as described in
   the [quick-start guide](../quick-start-overview.md).
2. An S3 bucket and [key prefix][aws-key-prefixes] containing the logs you wish to compress.
3. An S3 bucket and key prefix where you wish to store compressed archives.
4. An S3 bucket and key prefix where you wish to cache stream files.
5. An AWS IAM user with the necessary permissions to access the S3 bucket(s) and prefixes mentioned
   above.
   * To create a user, follow [this guide][aws-create-iam-user].
   * You don't need to assign any groups or policies to the user at this stage since we will
     attach policies in later steps, depending on which object storage use cases you require.
   * You may use a single IAM user for all use cases, or a separate one for each.
   * For brevity, we'll refer to this user as the "CLP IAM user" in the rest of this guide.
6. IAM user (long-term) credentials for the IAM user(s) created in step (5) above.
   * To create these credentials, follow [this guide][aws-create-access-keys].
   * Choose the "Other" use case to generate long-term credentials.
|
||
:::{note} | ||
CLP currently requires IAM user (long-term) credentials to access the relevant S3 buckets. | ||
Support for other authentication methods (e.g., temporary credentials) will be added in a future | ||
release. | ||
::: | ||
|
||
## Configuration | ||
|
||
The subsections below explain how to configure your object storage bucket and CLP for each use case: | ||
|
||
::::{grid} 1 1 1 1 | ||
:gutter: 2 | ||
|
||
:::{grid-item-card} | ||
:link: object-storage-config | ||
Configuring object storage | ||
^^^ | ||
Configuring your object storage bucket for each use case. | ||
::: | ||
|
||
:::{grid-item-card} | ||
:link: clp-config | ||
Configuring CLP | ||
^^^ | ||
Configuring CLP to use object storage for each use case. | ||
::: | ||
:::: | ||
|
||
## Using CLP with object storage | ||
|
||
The subsection below explains how to use CLP with object storage for each use case: | ||
|
||
::::{grid} 1 1 1 1 | ||
:gutter: 2 | ||
|
||
:::{grid-item-card} | ||
:link: clp-usage | ||
Using CLP with object storage | ||
^^^ | ||
Using CLP to compress, search, and view log files from object storage. | ||
::: | ||
:::: | ||
|
||
:::{toctree} | ||
:hidden: | ||
|
||
object-storage-config | ||
clp-config | ||
clp-usage | ||
::: | ||
|
||
[aws-create-access-keys]: https://docs.aws.amazon.com/keyspaces/latest/devguide/create.keypair.html | ||
[aws-create-iam-user]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html | ||
[aws-key-prefixes]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-prefixes.html | ||
[release-choices]: ../quick-start-cluster-setup/index.md#choosing-a-release |
152 changes: 152 additions & 0 deletions
docs/src/user-guide/guides-using-object-storage/object-storage-config.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,152 @@ | ||
# Configuring object storage | ||
|
||
To use object storage with CLP, follow the steps below to configure the CLP IAM user and your object | ||
storage bucket(s) for each use case you require. | ||
|
||
## Configuration for compression | ||
|
||
[Attach the inline policy][add-iam-policy] below to the CLP IAM user (you can use the JSON editor), | ||
replacing the fields in angle brackets (`<>`) with the appropriate values: | ||
|
||
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": [
        "arn:aws:s3:::<bucket-name>/<all-logs-prefix>*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": [
        "arn:aws:s3:::<bucket-name>"
      ],
      "Condition": {
        "StringLike": {
          "s3:prefix": "<all-logs-prefix>*"
        }
      }
    }
  ]
}
```
|
||
* `<bucket-name>` should be the name of the S3 bucket containing your logs. | ||
* `<all-logs-prefix>` should be the prefix of all logs you wish to compress. | ||
|
||
:::{note} | ||
If you want to enforce that only logs under a directory-like prefix, e.g., `logs/`, can be
compressed, you can append a trailing slash (`/`) after the `<all-logs-prefix>` value. This will
prevent CLP from compressing logs with prefixes like `logs-private`. However, note that to
compress all logs under the `logs/` prefix, you will then need to include the trailing slash in the
prefix you pass to `sbin/compress.sh`.
::: | ||
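
The effect of the trailing slash can be illustrated by matching keys against the policy's `<all-logs-prefix>*` pattern. The sketch below approximates IAM wildcard matching with Python's `fnmatchcase`; this is an assumption made for illustration, since `fnmatchcase` also gives `[` and `]` a special meaning that IAM patterns do not. `allowed_by_policy` and the key names are hypothetical.

```python
from fnmatch import fnmatchcase


def allowed_by_policy(key_or_prefix: str, all_logs_prefix: str) -> bool:
    """Approximate whether a key (or requested prefix) matches "<all-logs-prefix>*"."""
    return fnmatchcase(key_or_prefix, all_logs_prefix + "*")


# Without a trailing slash, keys under "logs-private" also match.
print(allowed_by_policy("logs-private/secret.log", "logs"))   # True

# With a trailing slash, only keys under "logs/" match.
print(allowed_by_policy("logs-private/secret.log", "logs/"))  # False
print(allowed_by_policy("logs/app/a.log", "logs/"))           # True

# A requested prefix of "logs" (no slash) no longer matches "logs/*".
print(allowed_by_policy("logs", "logs/"))                     # False
```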
|
||
## Configuration for archive storage | ||
|
||
[Attach the inline policy][add-iam-policy] below to the CLP IAM user (you can use the JSON editor), | ||
replacing the fields in angle brackets (`<>`) with the appropriate values: | ||
|
||
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::<bucket-name>/<key-prefix>/*"
      ]
    }
  ]
}
```
|
||
* `<bucket-name>` should be the name of the S3 bucket where compressed archives should be stored. | ||
* `<key-prefix>` should be the prefix (used like a directory path) where compressed archives should | ||
be stored. | ||
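
If you manage several buckets or prefixes, the policy above can be generated rather than edited by hand. The following is a minimal sketch: `build_read_write_policy` is a hypothetical helper and the bucket and prefix values are made up. Stripping a trailing slash from the prefix is an assumption made here so that a prefix written as `archives/` does not produce a double slash in the resource ARN.

```python
import json


def build_read_write_policy(bucket_name: str, key_prefix: str) -> dict:
    """Build a policy document of the same shape as the one shown above."""
    # Assumption: accept the prefix with or without a trailing slash.
    prefix = key_prefix.rstrip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/{prefix}/*"],
            }
        ],
    }


# Example values are assumptions for illustration only.
policy = build_read_write_policy("my-archive-bucket", "archives/")
print(json.dumps(policy, indent=2))
```

The printed JSON can then be pasted into the JSON editor when attaching the inline policy.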
|
||
## Configuration for stream storage | ||
|
||
The [log viewer][yscope-log-viewer] currently supports viewing [IR][uber-clp-blog-1] and JSONL
stream files but not CLP archives; thus, to view the compressed logs from a CLP archive, CLP first
converts the compressed logs into stream files. These stream files can be cached either on the
local filesystem or on object storage.
|
||
:::{note} | ||
A future version of the log viewer will support viewing CLP archives directly. | ||
::: | ||
|
||
Storing streams on S3 requires both configuring the CLP IAM user and setting up a cross-origin | ||
resource sharing (CORS) policy for the S3 bucket. | ||
|
||
### IAM user configuration | ||
|
||
[Attach the inline policy][add-iam-policy] below to the CLP IAM user (you can use the JSON editor), | ||
replacing the fields in angle brackets (`<>`) with the appropriate values: | ||
|
||
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::<bucket-name>/<key-prefix>/*"
      ]
    }
  ]
}
```
|
||
* `<bucket-name>` should be the name of the S3 bucket where cached streams should be stored. | ||
* `<key-prefix>` should be the prefix (used like a directory path) where cached streams should be | ||
stored. | ||
|
||
### Cross-origin resource sharing (CORS) configuration | ||
|
||
For CLP's log viewer to be able to access the cached stream files from S3 over the internet, the S3 | ||
bucket must have a CORS policy configured. | ||
|
||
Add the CORS configuration below to your bucket by following [this guide][aws-cors-guide]: | ||
|
||
```json
[
  {
    "AllowedHeaders": [
      "*"
    ],
    "AllowedMethods": [
      "GET"
    ],
    "AllowedOrigins": [
      "*"
    ],
    "ExposeHeaders": [
      "Access-Control-Allow-Origin"
    ]
  }
]
```
|
||
:::{tip} | ||
The CORS policy above allows requests from any host (origin). If you already know what hosts will | ||
access CLP's web interface, you can enhance security by changing `AllowedOrigins` from `["*"]` to | ||
the specific list of hosts that will access the web interface. | ||
::: | ||
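
As a sketch of the tip above, the CORS configuration can be generated with a restricted origin list. `build_cors_rules` is a hypothetical helper, and the origin `https://clp.example.com` is a made-up stand-in for wherever your CLP web interface is served from.

```python
import json


def build_cors_rules(allowed_origins: list[str]) -> list[dict]:
    """Build a CORS configuration like the one above, limited to the given origins."""
    if not allowed_origins:
        raise ValueError("at least one allowed origin is required")
    return [
        {
            "AllowedHeaders": ["*"],
            "AllowedMethods": ["GET"],
            "AllowedOrigins": list(allowed_origins),
            "ExposeHeaders": ["Access-Control-Allow-Origin"],
        }
    ]


# Hypothetical origin for illustration only.
cors_rules = build_cors_rules(["https://clp.example.com"])
print(json.dumps(cors_rules, indent=2))
```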
|
||
[aws-cors-guide]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/enabling-cors-examples.html | ||
[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console | ||
[uber-clp-blog-1]: https://www.uber.com/en-US/blog/reducing-logging-cost-by-two-orders-of-magnitude-using-clp | ||
[yscope-log-viewer]: https://github.com/y-scope/yscope-log-viewer |