docs: Add guides for using clp-json with object storage; Update compression scripts docs missed in previous PRs. #683
@@ -0,0 +1,14 @@
# Overview

The guides below describe how to use CLP in a variety of use cases.

::::{grid} 1 1 2 2
:gutter: 2

:::{grid-item-card}
:link: guides-using-object-storage
Using object storage
^^^
Using CLP to ingest logs from object storage and store archives on object storage.
:::
::::

@@ -0,0 +1,97 @@
# Using object storage

CLP can both compress logs from object storage (e.g., S3) and store archives on object storage. This
guide explains how to configure CLP for both use cases.

:::{note}
Currently, only the [clp-json][release-choices] release supports object storage. Support for
clp-text will be added in a future release.
:::

:::{note}
Currently, CLP only supports using S3 as object storage. Support for other object storage services
will be added in a future release.
:::

## Compressing logs from object storage

To compress logs from S3, use the `s3` subcommand of the `compress.sh` script:

```bash
sbin/compress.sh s3 s3://<bucket-name>/<path-prefix>
```

Review comment: Do we also need to mention credentials?

Review comment: Also, do we want to give users a list of the permissions required for ingestion credentials and for compression/stream extraction credentials? They have slightly different permission requirements.

Review comment: True. What permissions do they require?

Review comment: For ingestion:

Review comment: Not tested, but I believe that if you don't care about limiting permission to a specific path, just do:

Review comment: For compression/stream,

* `<bucket-name>` is the name of the S3 bucket containing your logs.
* `<path-prefix>` is the path prefix of all logs you wish to compress.

:::{note}
The `s3` subcommand only supports a single URL but will compress any logs that have the given path
prefix.

If you wish to compress a single log file, specify the entire path to the log file. However, if that
log file's path is a prefix of another log file's path, then both log files will be compressed. This
limitation will be addressed in a future release.
:::
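
To see why passing the full path of one log file can still select more than one file, here is a minimal Python sketch of prefix matching over object keys. The object keys and the `keys_with_prefix` helper are hypothetical illustrations and are not part of CLP; the sketch assumes that selection is a plain string-prefix comparison, as the note above describes.

```python
def keys_with_prefix(keys: list[str], prefix: str) -> list[str]:
    """Return every object key that starts with the given prefix."""
    return [key for key in keys if key.startswith(prefix)]


# Hypothetical object keys in a bucket (assumed for illustration only).
bucket_keys = [
    "logs/app.log",
    "logs/app.log.1",
    "logs/db.log",
]

# The full path of one file is also a prefix of its rotated sibling,
# so both keys are selected.
matched = keys_with_prefix(bucket_keys, "logs/app.log")
print(matched)
```

Here `logs/app.log.1` is selected alongside `logs/app.log`, which is the limitation the note describes.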

## Storing archives on object storage

To configure CLP to store archives on S3, update the `archive_output.storage` key in
`<package>/etc/clp-config.yml`:

```yaml
archive_output:
  storage:
    type: "s3"
    staging_directory: "var/data/staged-archives"  # Or a path of your choosing
    s3_config:
      region: "<aws-region-code>"
      bucket: "<s3-bucket-name>"
      key-prefix: "<s3-key-prefix>"
      credentials:
        access_key_id: "<aws-access-key-id>"
        secret_access_key: "<aws-secret-access-key>"

  # archive_output's other config keys
```

Review comment (on `staging_directory`): Should we explain what this means?

Review comment (on `credentials`): Should we note that we only support long-term credentials, as documented here: https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html

* `s3_config` configures both the S3 bucket where archives should be stored and the credentials
  for accessing it.
  * `<aws-region-code>` is the AWS region [code][aws-region-codes] for the bucket.
  * `<s3-bucket-name>` is the bucket's name.
  * `<s3-key-prefix>` is the "directory" where all archives will be stored within the bucket and
    must end with `/`.
  * `credentials` contains the S3 credentials necessary for accessing the bucket.
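
The constraints listed above can be checked before starting CLP. The following Python sketch is a hypothetical pre-flight check, not CLP's own validation logic: the `find_s3_config_problems` helper and the example values are assumptions, and the key names are copied from the YAML snippet in this guide.

```python
def find_s3_config_problems(s3_config: dict) -> list[str]:
    """Return human-readable problems found in an s3_config mapping."""
    problems = []

    # Keys shown in this guide's YAML snippet.
    for key in ("region", "bucket", "key-prefix", "credentials"):
        if key not in s3_config:
            problems.append(f"missing key: {key}")

    # The guide states that the key prefix must end with "/".
    key_prefix = s3_config.get("key-prefix", "")
    if key_prefix and not key_prefix.endswith("/"):
        problems.append("key-prefix must end with '/'")

    # Only inspect the credentials when the block is present.
    if "credentials" in s3_config:
        credentials = s3_config["credentials"] or {}
        for key in ("access_key_id", "secret_access_key"):
            if key not in credentials:
                problems.append(f"missing key: credentials.{key}")

    return problems


# Hypothetical values, for illustration only.
example = {
    "region": "us-east-1",
    "bucket": "my-clp-archives",
    "key-prefix": "archives",  # Missing the trailing "/"
    "credentials": {
        "access_key_id": "<aws-access-key-id>",
        "secret_access_key": "<aws-secret-access-key>",
    },
}
print(find_s3_config_problems(example))
```

Returning a list of problems rather than raising on the first one lets a user fix every issue in a single pass.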

To configure CLP to be able to view compressed log files from S3, you'll need to configure a bucket
where CLP can store intermediate files that the log viewer can open. To do so, update the
`stream_output.storage` key in `<package>/etc/clp-config.yml`:

```yaml
stream_output:
  storage:
    type: "s3"
    staging_directory: "var/data/staged-streams"  # Or a path of your choosing
    s3_config:
      region: "<aws-region-code>"
      bucket: "<s3-bucket-name>"
      key-prefix: "<s3-key-prefix>"
      credentials:
        access_key_id: "<aws-access-key-id>"
        secret_access_key: "<aws-secret-access-key>"

  # stream_output's other config keys
```

The configuration keys above function identically to those in `archive_output.storage`, except they
should be configured to use a different S3 path (i.e., a different key-prefix in the same bucket or
a different bucket entirely).

Review comment (on the paragraph above): We need to mention that log viewing requires the bucket to be configured with cross region access permission. A typical configuration to add is:

:::{note}
To view compressed log files, clp-text currently converts them into IR streams that the log viewer
can open, while clp-json converts them into JSONL streams. These streams only need to be stored for
as long as they are being viewed in the viewer; however, CLP currently doesn't explicitly delete
them. This limitation will be addressed in a future release.
:::
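
Returning to the requirement that `stream_output` use a different S3 path from `archive_output`, the Python sketch below shows one way to check two locations for overlap. The `locations_overlap` helper is hypothetical and not part of CLP. It also assumes that nested prefixes (e.g., `data/` and `data/streams/`) count as overlapping, which is stricter than what this guide requires.

```python
def locations_overlap(
    bucket_a: str, prefix_a: str, bucket_b: str, prefix_b: str
) -> bool:
    """Return True if two (bucket, key-prefix) locations could share object keys."""
    # Different buckets can never share keys.
    if bucket_a != bucket_b:
        return False
    # Within one bucket, two prefixes overlap when either is a prefix of the other.
    return prefix_a.startswith(prefix_b) or prefix_b.startswith(prefix_a)


# Hypothetical bucket names and prefixes, for illustration only.
print(locations_overlap("my-bucket", "archives/", "my-bucket", "streams/"))
print(locations_overlap("my-bucket", "data/", "my-bucket", "data/streams/"))
```

The first call reports no overlap because the prefixes differ within the same bucket; the second reports an overlap because one prefix is nested inside the other.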

[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability
[release-choices]: http://localhost:8080/user-guide/quick-start-cluster-setup/index.html#choosing-a-release

Review comment (on the `release-choices` link): Fix localhost URL in reference link. The reference link contains a localhost URL which won't work in production. Suggested change:
-[release-choices]: http://localhost:8080/user-guide/quick-start-cluster-setup/index.html#choosing-a-release
+[release-choices]: ../quick-start-cluster-setup/index.md#choosing-a-release

Review comment: I would consider renaming this section from `guides` to `object storage`. Everything in user-guide is technically a guide. We could then rename `Using object storage` to `Using AWS S3`.

Review comment: Our plan is actually to rename "User guide" to "User docs" and "Developer guide" to "Developer docs". Although technically everything is a guide, we do want to differentiate "guides" (tutorials) from reference docs. Technically we could also move the quick start section into the guides section, but I'd need to restructure it a little.
What do you think about moving in that direction instead?

Review comment: Personally, I feel maybe the restructure can wait until after we release the software to the intended user.