Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Add guides for using clp-json with object storage; Update compression scripts docs missed in previous PRs. #683

Merged
merged 32 commits into from
Jan 24, 2025

Conversation

kirkrodrigues
Copy link
Member

@kirkrodrigues kirkrodrigues commented Jan 21, 2025

Description

This PR:

  • Adds a new Guides section to the User guides category.
  • Adds a guide for using clp-json with object storage.
  • Updates the quick start compression docs to include the subcommand necessary in sbin/compress.sh (this was missed in previous PRs).

Validation performed

  • task docs:serve
    • The output was as expected.
    • All links were valid.
  • Followed the steps in the guide successfully.

Summary by CodeRabbit

  • Documentation

    • Added a new "Guides" section in the user documentation.
    • Introduced a comprehensive guide for using CLP with object storage (Amazon S3).
    • Updated compression commands to include new fs subcommand for both JSON and text logs.
    • Enhanced structure of user guide with new grid item cards and toctrees for easier navigation.
    • Added detailed instructions for configuring CLP with object storage use cases.
    • Included a new section on compressing logs from Amazon S3 with specific command examples.
  • New Features

    • Enhanced documentation for log compression from filesystem and object storage.
    • Provided detailed configuration instructions for S3 integration.

Copy link
Contributor

coderabbitai bot commented Jan 21, 2025

Walkthrough

This pull request enhances the CLP (Command Line Processor) documentation by introducing comprehensive guides for using object storage, specifically focusing on Amazon S3. The changes include adding new documentation sections that explain how to compress logs, store archives, and configure stream files using object storage. The documentation provides detailed instructions for configuration, prerequisites, and usage, with a particular emphasis on S3 integration and IAM user permissions.

Changes

File Change Summary
docs/src/user-guide/guides-overview.md Added overview section introducing available guides
docs/src/user-guide/index.md Added new "Guides" grid item and toctree with links to object storage guides
docs/src/user-guide/quick-start-compression/json.md Updated compression command to include fs subcommand and added tip about object storage compression
docs/src/user-guide/quick-start-compression/text.md Modified compression command to require fs argument
docs/src/user-guide/guides-using-object-storage/index.md Created comprehensive guide for object storage usage with S3
docs/src/user-guide/guides-using-object-storage/clp-config.md Added configuration instructions for object storage archiving and streaming
docs/src/user-guide/guides-using-object-storage/clp-usage.md Introduced usage guide for compressing logs from S3
docs/src/user-guide/guides-using-object-storage/object-storage-config.md Detailed IAM policy configuration for object storage use cases

Possibly related PRs

Suggested reviewers

  • haiqi96

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@kirkrodrigues kirkrodrigues marked this pull request as draft January 21, 2025 02:29
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
docs/src/user-guide/quick-start-compression/text.md (1)

6-6: Add explanation for the 'fs' subcommand

The command now includes an 'fs' subcommand, but its purpose isn't explained. Consider adding a note explaining that 'fs' indicates compression from the filesystem, similar to the explanation provided in the JSON compression guide.

 ```bash
 sbin/compress.sh fs <path1> [<path2> ...]

+* fs is a subcommand for compressing logs from the filesystem.


</blockquote></details>
<details>
<summary>docs/src/user-guide/index.md (1)</summary><blockquote>

`18-23`: **Consider rephrasing for clarity**

The description could be more concise. Consider revising "in a variety of use cases" to "in different use cases" or simply "for various use cases".

```diff
-Guides for using CLP in a variety of use cases.
+Guides for using CLP in various use cases.
🧰 Tools
🪛 LanguageTool

[style] ~22-~22: The phrase “a variety of” may be wordy. To make your writing clearer, consider replacing it.
Context: ...view Guides ^^^ Guides for using CLP in a variety of use cases. ::: :::{grid-item-card} :li...

(A_VARIETY_OF)

docs/src/user-guide/guides-using-object-storage.md (2)

57-58: Improve grammar in configuration description

Revise the sentence for better clarity.

-* `s3_config` configures both the S3 bucket where archives should be stored as well as credentials
-  for accessing it.
+* `s3_config` configures both the S3 bucket where archives should be stored and the credentials
+  for accessing it.
🧰 Tools
🪛 LanguageTool

[grammar] ~57-~57: Probable usage error. Use “and” after ‘both’.
Context: ... bucket where archives should be stored as well as credentials for accessing it. * `<a...

(BOTH_AS_WELL_AS)


89-94: Consider adding cleanup instructions

Since there's a known limitation regarding stream cleanup, consider adding temporary workaround instructions for manually cleaning up the streams until the automatic cleanup is implemented.

docs/src/user-guide/guides-overview.md (1)

3-3: Consider rephrasing for clarity

Similar to the index page, consider revising the phrase for conciseness.

-The guides below describe how to use CLP in a variety of use cases.
+The guides below describe how to use CLP in various use cases.
🧰 Tools
🪛 LanguageTool

[style] ~3-~3: The phrase “a variety of” may be wordy. To make your writing clearer, consider replacing it.
Context: ...guides below describe how to use CLP in a variety of use cases. ::::{grid} 1 1 2 2 :gutter:...

(A_VARIETY_OF)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ba63a76 and a298676.

📒 Files selected for processing (5)
  • docs/src/user-guide/guides-overview.md (1 hunks)
  • docs/src/user-guide/guides-using-object-storage.md (1 hunks)
  • docs/src/user-guide/index.md (2 hunks)
  • docs/src/user-guide/quick-start-compression/json.md (1 hunks)
  • docs/src/user-guide/quick-start-compression/text.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/src/user-guide/index.md

[style] ~22-~22: The phrase “a variety of” may be wordy. To make your writing clearer, consider replacing it.
Context: ...view Guides ^^^ Guides for using CLP in a variety of use cases. ::: :::{grid-item-card} :li...

(A_VARIETY_OF)

docs/src/user-guide/guides-using-object-storage.md

[grammar] ~31-~31: Possible subject-verb agreement error detected.
Context: ... path to the log file. However, if that log file's path is a prefix of another log ...

(THIS_THAT_AGR)


[grammar] ~57-~57: Probable usage error. Use “and” after ‘both’.
Context: ... bucket where archives should be stored as well as credentials for accessing it. * `<a...

(BOTH_AS_WELL_AS)


[style] ~61-~61: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...he bucket's name. * <s3-key-prefix> is the "directory" where all archives will...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

docs/src/user-guide/guides-overview.md

[style] ~3-~3: The phrase “a variety of” may be wordy. To make your writing clearer, consider replacing it.
Context: ...guides below describe how to use CLP in a variety of use cases. ::::{grid} 1 1 2 2 :gutter:...

(A_VARIETY_OF)

🔇 Additional comments (3)
docs/src/user-guide/quick-start-compression/json.md (1)

6-13: LGTM! Well-documented changes

The changes are clear and helpful:

  • Added the 'fs' subcommand with explanation
  • Included a useful tip linking to object storage documentation
docs/src/user-guide/index.md (1)

57-64: LGTM! Good documentation structure

The new toctree for guides improves navigation and organization.

docs/src/user-guide/guides-overview.md (1)

8-13: LGTM! Clear and concise card description

The grid card provides a clear overview of the object storage guide's purpose.

Comment on lines 96 to 97
[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability
[release-choices]: http://localhost:8080/user-guide/quick-start-cluster-setup/index.html#choosing-a-release
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix localhost URL in reference link

The reference link contains a localhost URL which won't work in production.

-[release-choices]: http://localhost:8080/user-guide/quick-start-cluster-setup/index.html#choosing-a-release
+[release-choices]: ../quick-start-cluster-setup/index.md#choosing-a-release
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability
[release-choices]: http://localhost:8080/user-guide/quick-start-cluster-setup/index.html#choosing-a-release
[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability
[release-choices]: ../quick-start-cluster-setup/index.md#choosing-a-release

To compress logs from S3, use the `s3` subcommand of the `compress.sh` script:

```bash
sbin/compress.sh s3 s3://<bucket-name>/<path-prefix>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to also mention about credentials?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, do we want to give user a list of permission that are required for ingestion credentials and compression/stream extraction credentials?

They have slightly different permission requirement.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. What permissions do they require?

Copy link
Contributor

@haiqi96 haiqi96 Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For ingestion:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": [
                "arn:aws:s3:::<bucket_name>/<access_allowed_path_prefix>/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket_name>"
            ],
            "Condition": {
                "StringLike": {
                    "s3:prefix": "<access_allowed_path_prefix>/*"
                }
            }
        }
    ]
}

Not tested, but I believe if you don't care about limiting premission to a specific path, just do:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": [
                "arn:aws:s3:::<bucket_name>/*"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:ListBucket"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket_name>"
            ]
        }
    ]
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For compression/stream,

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket_name>/<path_prefix>/*"
            ]
        }
    ]
}

archive_output:
storage:
type: "s3"
staging_directory: "var/data/staged-archives" # Or a path of your choosing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we explain what does this mean?

region: "<aws-region-code>"
bucket: "<s3-bucket-name>"
key-prefix: "<s3-key-prefix>"
credentials:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we note that we only support long term credential, as documented here https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html

# stream_output's other config keys
```

The configuration keys above function identically to those in `archive_output.storage`, except they
Copy link
Contributor

@haiqi96 haiqi96 Jan 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to mention that log viewing required the bucket to be configured with cross region access permission.

a typical configuration to add is:

[
    {
        "AllowedHeaders": [
            "*"
        ],
        "AllowedMethods": [
            "GET"
        ],
        "AllowedOrigins": [
            "http://localhost:<LOG_VIEWER_WEBUI_PORT_NUMBER>"
            # or "*", known to work but maybe too wild
            # or maybe "http://localhost:*", never tried though.
        ],
        "ExposeHeaders": []
    }
]

@kirkrodrigues kirkrodrigues changed the title WIP: docs: Add guides for using clp-json with object storage; Update compression scripts docs missed in previous PRs. docs: Add guides for using clp-json with object storage; Update compression scripts docs missed in previous PRs. Jan 22, 2025
@kirkrodrigues kirkrodrigues marked this pull request as ready for review January 22, 2025 12:23
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (5)
docs/src/user-guide/guides-overview.md (1)

3-3: Consider more concise wording.

The phrase "a variety of" could be simplified.

-The guides below describe how to use CLP in a variety of use cases.
+The guides below describe how to use CLP in different use cases.
🧰 Tools
🪛 LanguageTool

[style] ~3-~3: The phrase “a variety of” may be wordy. To make your writing clearer, consider replacing it.
Context: ...guides below describe how to use CLP in a variety of use cases. ::::{grid} 1 1 2 2 :gutter:...

(A_VARIETY_OF)

docs/src/user-guide/index.md (1)

22-22: Consider more concise wording.

The phrase "a variety of" could be simplified.

-Guides for using CLP in a variety of use cases.
+Guides for using CLP in different use cases.
🧰 Tools
🪛 LanguageTool

[style] ~22-~22: The phrase “a variety of” may be wordy. To make your writing clearer, consider replacing it.
Context: ...view Guides ^^^ Guides for using CLP in a variety of use cases. ::: :::{grid-item-card} :li...

(A_VARIETY_OF)

docs/src/user-guide/guides-using-object-storage/archive-storage.md (1)

63-63: Clarify key prefix requirement.

The documentation should explicitly state that the key prefix must end with a trailing slash.

-    must end with `/`.
+    must end with a trailing slash (e.g., "archives/").
docs/src/user-guide/guides-using-object-storage/compress.md (1)

62-66: Fix list indentation.

The nested list items should be indented by 2 spaces relative to their parent.

* `<credentials-file>` is the path to an AWS credentials file like the following:

    ```ini
    [default]
    aws_access_key_id = <aws-access-key-id>
    aws_secret_access_key = <aws-secret-access-key>
    ```

-    * CLP expects the credentials to be in the `default` section.
-    * `<aws-access-key-id>` and `<aws-secret-access-key>` are the access key ID and secret access
-      key of the CLP IAM user.
-    * If you don't want to use a credentials file, you can specify the credentials on the command
-      line using the `--aws-access-key-id` and `--aws-secret-access-key` flags.
+  * CLP expects the credentials to be in the `default` section.
+  * `<aws-access-key-id>` and `<aws-secret-access-key>` are the access key ID and secret access
+    key of the CLP IAM user.
+  * If you don't want to use a credentials file, you can specify the credentials on the command
+    line using the `--aws-access-key-id` and `--aws-secret-access-key` flags.
🧰 Tools
🪛 Markdownlint (0.37.0)

62-62: Expected: 2; Actual: 4
Unordered list indentation

(MD007, ul-indent)


63-63: Expected: 2; Actual: 4
Unordered list indentation

(MD007, ul-indent)


65-65: Expected: 2; Actual: 4
Unordered list indentation

(MD007, ul-indent)

docs/src/user-guide/guides-using-object-storage/index.md (1)

53-53: Add missing determiner "the" before "principle"

Add the missing determiner for better grammar.

-    * You may use a different IAM user for each use case to follow the principle of least privilege
+    * You may use a different IAM user for each use case to follow the principle of least privilege
🧰 Tools
🪛 LanguageTool

[grammar] ~53-~53: A determiner may be missing.
Context: ... case to follow the [principle of least privilege][least-privilege-principle], ...

(THE_SUPERLATIVE)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a298676 and b95b45f.

📒 Files selected for processing (7)
  • docs/src/user-guide/guides-overview.md (1 hunks)
  • docs/src/user-guide/guides-using-object-storage/archive-storage.md (1 hunks)
  • docs/src/user-guide/guides-using-object-storage/compress.md (1 hunks)
  • docs/src/user-guide/guides-using-object-storage/index.md (1 hunks)
  • docs/src/user-guide/guides-using-object-storage/stream-storage.md (1 hunks)
  • docs/src/user-guide/index.md (2 hunks)
  • docs/src/user-guide/quick-start-compression/json.md (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/src/user-guide/quick-start-compression/json.md
🧰 Additional context used
🪛 LanguageTool
docs/src/user-guide/guides-using-object-storage/compress.md

[style] ~43-~43: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... appropriate values: * <bucket-name> should be the name of the S3 bucket containing...

(MISSING_IT_THERE)


[style] ~44-~44: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... containing your logs. * <key-prefix> should be the prefix of all logs you wish to c...

(MISSING_IT_THERE)


[grammar] ~75-~75: Possible subject-verb agreement error detected.
Context: ... path to the log file. However, if that log file's path is a prefix of another log ...

(THIS_THAT_AGR)

docs/src/user-guide/guides-using-object-storage/archive-storage.md

[style] ~32-~32: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... appropriate values: * <bucket-name> should be the name of the S3 bucket to store c...

(MISSING_IT_THERE)


[style] ~33-~33: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ...e compressed archives. * <key-prefix> should be the prefix where you want the compre...

(MISSING_IT_THERE)


[style] ~62-~62: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...s the bucket's name. * <key-prefix> is the "directory" where all archives will...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

docs/src/user-guide/guides-using-object-storage/index.md

[grammar] ~53-~53: A determiner may be missing.
Context: ... case to follow the [principle of least privilege][least-privilege-principle], ...

(THE_SUPERLATIVE)

docs/src/user-guide/guides-overview.md

[style] ~3-~3: The phrase “a variety of” may be wordy. To make your writing clearer, consider replacing it.
Context: ...guides below describe how to use CLP in a variety of use cases. ::::{grid} 1 1 2 2 :gutter:...

(A_VARIETY_OF)

docs/src/user-guide/index.md

[style] ~22-~22: The phrase “a variety of” may be wordy. To make your writing clearer, consider replacing it.
Context: ...view Guides ^^^ Guides for using CLP in a variety of use cases. ::: :::{grid-item-card} :li...

(A_VARIETY_OF)

🪛 Markdownlint (0.37.0)
docs/src/user-guide/guides-using-object-storage/compress.md

62-62: Expected: 2; Actual: 4
Unordered list indentation

(MD007, ul-indent)


63-63: Expected: 2; Actual: 4
Unordered list indentation

(MD007, ul-indent)


65-65: Expected: 2; Actual: 4
Unordered list indentation

(MD007, ul-indent)


10-10: Missing link or image reference definition: "add-iam-policy"
Reference links and images should use a label that is defined

(MD052, reference-links-images)

docs/src/user-guide/guides-using-object-storage/archive-storage.md

10-10: Missing link or image reference definition: "add-iam-policy"
Reference links and images should use a label that is defined

(MD052, reference-links-images)


60-60: Missing link or image reference definition: "aws-region-codes"
Reference links and images should use a label that is defined

(MD052, reference-links-images)

docs/src/user-guide/guides-using-object-storage/stream-storage.md

12-12: Missing link or image reference definition: "add-iam-policy"
Reference links and images should use a label that is defined

(MD052, reference-links-images)


37-37: Missing link or image reference definition: "aws-cors-guide"
Reference links and images should use a label that is defined

(MD052, reference-links-images)

docs/src/user-guide/guides-using-object-storage/index.md

99-99: Unused link or image reference definition: "add-iam-policy"
Link and image reference definitions should be needed

(MD053, link-image-reference-definitions)


100-100: Unused link or image reference definition: "aws-cors-guide"
Link and image reference definitions should be needed

(MD053, link-image-reference-definitions)


103-103: Unused link or image reference definition: "aws-region-codes"
Link and image reference definitions should be needed

(MD053, link-image-reference-definitions)

🔇 Additional comments (4)
docs/src/user-guide/guides-using-object-storage/index.md (4)

1-21: Well-structured introduction with clear capabilities and limitations!

The introduction effectively communicates CLP's object storage capabilities and current limitations.


23-60: Comprehensive prerequisites with excellent visual aid!

The prerequisites section effectively outlines all requirements and includes a helpful diagram explaining the S3 URL structure.

🧰 Tools
🪛 LanguageTool

[grammar] ~53-~53: A determiner may be missing.
Context: ... case to follow the [principle of least privilege][least-privilege-principle], ...

(THE_SUPERLATIVE)


62-97: Excellent organization of use cases with clear navigation!

The grid layout effectively presents the three use cases, and the hidden toctree ensures proper navigation.


99-105: Verify usage of reference links

The following reference links appear to be unused in this file:

  • add-iam-policy
  • aws-cors-guide
  • aws-region-codes

They might be used in the related files. Let's verify their usage.

✅ Verification successful

Reference links are properly utilized across documentation files

All three reference links are actively used in related documentation files within the same directory:

  • [add-iam-policy] is used in archive-storage.md, compress.md, and stream-storage.md
  • [aws-cors-guide] is used in stream-storage.md
  • [aws-region-codes] is used in archive-storage.md
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if the reference links are used in related documentation files

echo "Checking for reference link usage..."
rg -l "\[.*\]\[(?:add-iam-policy|aws-cors-guide|aws-region-codes)\]" docs/src/user-guide/guides-using-object-storage/

Length of output: 395


Script:

#!/bin/bash
# Check specific link usage in each file
echo "Checking add-iam-policy usage..."
rg "\[.*?\]\[add-iam-policy\]" docs/src/user-guide/guides-using-object-storage/{stream-storage,compress,archive-storage}.md

echo -e "\nChecking aws-cors-guide usage..."
rg "\[.*?\]\[aws-cors-guide\]" docs/src/user-guide/guides-using-object-storage/{stream-storage,compress,archive-storage}.md

echo -e "\nChecking aws-region-codes usage..."
rg "\[.*?\]\[aws-region-codes\]" docs/src/user-guide/guides-using-object-storage/{stream-storage,compress,archive-storage}.md

Length of output: 1689

🧰 Tools
🪛 Markdownlint (0.37.0)

99-99: Unused link or image reference definition: "add-iam-policy"
Link and image reference definitions should be needed

(MD053, link-image-reference-definitions)


100-100: Unused link or image reference definition: "aws-cors-guide"
Link and image reference definitions should be needed

(MD053, link-image-reference-definitions)


103-103: Unused link or image reference definition: "aws-region-codes"
Link and image reference definitions should be needed

(MD053, link-image-reference-definitions)


## IAM user configuration

Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add missing link definition.

The link reference [add-iam-policy] is undefined.

Add the following link definition at the bottom of the file:

[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#add-policies-console
🧰 Tools
🪛 Markdownlint (0.37.0)

10-10: Missing link or image reference definition: "add-iam-policy"
Reference links and images should use a label that is defined

(MD052, reference-links-images)


* `s3_config` configures both the S3 bucket where archives should be stored and the credentials
for accessing it.
* `<region-code>` is the AWS region [code][aws-region-codes] for the bucket.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add missing link definition.

The link reference [aws-region-codes] is undefined.

Add the following link definition at the bottom of the file:

[aws-region-codes]: https://docs.aws.amazon.com/general/latest/gr/rande.html#regional-endpoints
🧰 Tools
🪛 Markdownlint (0.37.0)

60-60: Missing link or image reference definition: "aws-region-codes"
Reference links and images should use a label that is defined

(MD052, reference-links-images)


## IAM user configuration

Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add missing link definition.

The link reference [add-iam-policy] is undefined.

Add the following link definition at the bottom of the file:

[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#add-policies-console
🧰 Tools
🪛 Markdownlint (0.37.0)

10-10: Missing link or image reference definition: "add-iam-policy"
Reference links and images should use a label that is defined

(MD052, reference-links-images)


## IAM user configuration

Attach the following policy to the CLP IAM user by following [this guide][add-iam-policy].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add missing link definition.

The link reference [add-iam-policy] is undefined.

Add the following link definition at the bottom of the file:

[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#add-policies-console
🧰 Tools
🪛 Markdownlint (0.37.0)

12-12: Missing link or image reference definition: "add-iam-policy"
Reference links and images should use a label that is defined

(MD052, reference-links-images)

For CLP's log viewer to be able to view the compressed logs from S3 over the internet, the S3 bucket
must have a CORS policy configured.

Add the CORS configuration below to your bucket by following [this guide][aws-cors-guide]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Add missing link definition.

The link reference [aws-cors-guide] is undefined.

Add the following link definition at the bottom of the file:

[aws-cors-guide]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/enabling-cors-examples.html
🧰 Tools
🪛 Markdownlint (0.37.0)

37-37: Missing link or image reference definition: "aws-cors-guide"
Reference links and images should use a label that is defined

(MD052, reference-links-images)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
docs/src/user-guide/guides-using-object-storage/archive-storage.md (1)

64-65: Clarify key prefix requirement.

While the documentation mentions that the key prefix must end with /, it doesn't explain why this is required or what happens if it's omitted.

Add a note explaining the implications of omitting the trailing slash, for example:

  * `<key-prefix>` is the "directory" where all archives will be stored within the bucket and
    must end with `/`.
+ > **Note:** The trailing slash is required to ensure archives are stored in a directory-like
+ > structure. Without it, archives might be stored at unexpected locations in the bucket.
🧰 Tools
🪛 LanguageTool

[style] ~64-~64: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...s the bucket's name. * <key-prefix> is the "directory" where all archives will...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

docs/src/user-guide/guides-using-object-storage/compress.md (1)

75-77: Enhance clarity of the limitation note.

The current explanation about path prefix limitations might be confusing to users. Consider providing a concrete example.

Add an example to clarify:

 If you wish to compress a single log file, specify the entire path to the log file. However, if that
 log file's path is a prefix of another log file's path, then both log files will be compressed. This
 limitation will be addressed in a future release.
+ > For example, if you specify `s3://bucket/logs/app.log` but there exists another file
+ > `s3://bucket/logs/app.log.1`, both files will be compressed.
🧰 Tools
🪛 LanguageTool

[grammar] ~75-~75: Possible subject-verb agreement error detected.
Context: ... path to the log file. However, if that log file's path is a prefix of another log ...

(THIS_THAT_AGR)

docs/src/user-guide/guides-using-object-storage/index.md (3)

13-16: Add version information for future compatibility.

The note about clp-text support should include version information to help users track when the feature becomes available.

Add version tracking:

 Currently, only the [clp-json][release-choices] release supports object storage. Support for
-`clp-text` will be added in a future release.
+`clp-text` will be added in a future release (tracked in issue #XXX).

Would you like me to create an issue to track this feature request?


42-45: Improve Mermaid diagram clarity.

The current diagram shows a basic S3 URL structure but could be enhanced to show more common scenarios.

Consider adding more examples to the diagram:

 graph TD
   A["s3://my-bucket-name/my-logs-dir/"] --"Bucket name"--> B[my-bucket-name]
   A --"Key prefix"--> C[path/to/my/file.txt]
+  D["s3://my-bucket-name/logs/2024/01/"] --"Example: Date-based prefix"--> E[logs/2024/01/]
+  F["s3://my-bucket-name/app1/logs/"] --"Example: App-specific prefix"--> G[app1/logs/]

57-60: Add workaround for temporary credentials limitation.

While temporary credentials aren't supported yet, users might need a temporary solution.

Add a workaround note:

 CLP currently requires IAM user (long-term) credentials to access the relevant S3 buckets. Support
 for other authentication methods (e.g., temporary credentials) will be added in a future release.
+ > **Workaround:** If you need to use temporary credentials, consider creating a separate IAM user
+ > specifically for CLP with appropriate permissions and credential rotation policies.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b95b45f and bdcf423.

📒 Files selected for processing (4)
  • docs/src/user-guide/guides-using-object-storage/archive-storage.md (1 hunks)
  • docs/src/user-guide/guides-using-object-storage/compress.md (1 hunks)
  • docs/src/user-guide/guides-using-object-storage/index.md (1 hunks)
  • docs/src/user-guide/guides-using-object-storage/stream-storage.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/src/user-guide/guides-using-object-storage/stream-storage.md
🧰 Additional context used
🪛 LanguageTool
docs/src/user-guide/guides-using-object-storage/compress.md

[style] ~43-~43: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... appropriate values: * <bucket-name> should be the name of the S3 bucket containing...

(MISSING_IT_THERE)


[style] ~44-~44: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... containing your logs. * <key-prefix> should be the prefix of all logs you wish to c...

(MISSING_IT_THERE)


[grammar] ~75-~75: Possible subject-verb agreement error detected.
Context: ... path to the log file. However, if that log file's path is a prefix of another log ...

(THIS_THAT_AGR)

docs/src/user-guide/guides-using-object-storage/archive-storage.md

[style] ~32-~32: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... appropriate values: * <bucket-name> should be the name of the S3 bucket to store c...

(MISSING_IT_THERE)


[style] ~33-~33: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ...e compressed archives. * <key-prefix> should be the prefix where you want the compre...

(MISSING_IT_THERE)


[style] ~64-~64: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...s the bucket's name. * <key-prefix> is the "directory" where all archives will...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

docs/src/user-guide/guides-using-object-storage/index.md

[grammar] ~53-~53: A determiner may be missing.
Context: ... case to follow the [principle of least privilege][least-privilege-principle], ...

(THE_SUPERLATIVE)

🪛 Markdownlint (0.37.0)
docs/src/user-guide/guides-using-object-storage/compress.md

62-62: Expected: 2; Actual: 4
Unordered list indentation

(MD007, ul-indent)


63-63: Expected: 2; Actual: 4
Unordered list indentation

(MD007, ul-indent)


65-65: Expected: 2; Actual: 4
Unordered list indentation

(MD007, ul-indent)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: lint-check (ubuntu-latest)
  • GitHub Check: lint-check (macos-latest)
🔇 Additional comments (1)
docs/src/user-guide/guides-using-object-storage/archive-storage.md (1)

44-44: Validate staging directory path.

The staging_directory path should be validated to ensure:

  1. The directory exists or can be created
  2. The CLP process has write permissions
  3. Sufficient disk space is available

Comment on lines 65 to 66
* If you don't want to use a credentials file, you can specify the credentials on the command
line using the `--aws-access-key-id` and `--aws-secret-access-key` flags.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

🛠️ Refactor suggestion

Add security warning about command-line credential exposure

The suggestion to add a security note is valid. The codebase primarily promotes using credentials files, and there are currently no warnings about the security implications of command-line credentials. Adding this warning would help users make informed decisions about credential handling.

  • Location: docs/src/user-guide/guides-using-object-storage/compress.md
🔗 Analysis chain

Consider security implications of command-line credentials.

While offering command-line credentials options provides flexibility, it may expose sensitive information in shell history or process listings.

Add a security note:

    * If you don't want to use a credentials file, you can specify the credentials on the command
      line using the `--aws-access-key-id` and `--aws-secret-access-key` flags.
+ > **Security Note:** Providing credentials via command-line arguments may expose them in shell
+ > history or process listings. Using a credentials file is recommended for better security.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if there are any security-related comments or validations for command-line credentials
rg -A 3 "aws.*credentials.*command.*line.*security"

Length of output: 53


Script:

#!/bin/bash
# Search for any security-related warnings about credentials
rg -i "security.*warning.*credentials|warning.*credentials.*security"

# Search for command-line credential patterns
rg -i "credentials?.*(command|cli|terminal|shell)"

# Check documentation for security best practices
fd -e md -e txt . docs/ -x cat {} | rg -i "security|credentials"

Length of output: 1526

🧰 Tools
🪛 Markdownlint (0.37.0)

65-65: Expected: 2; Actual: 4
Unordered list indentation

(MD007, ul-indent)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (6)
docs/src/user-guide/guides-using-object-storage/compress.md (3)

8-45: Enhance IAM policy documentation with security context

Consider adding explanations about:

  • Why each permission (s3:GetObject and s3:ListBucket) is necessary
  • The principle of least privilege and why these specific resource constraints are important

Add this content after line 11:

This policy follows the principle of least privilege by:
- Allowing `s3:GetObject` only for objects under the specified prefix
- Allowing `s3:ListBucket` with a condition to list only objects under the specified prefix
- Excluding unnecessary S3 permissions like `s3:PutObject` or `s3:DeleteObject`
🧰 Tools
🪛 LanguageTool

[style] ~43-~43: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... appropriate values: * <bucket-name> should be the name of the S3 bucket containing...

(MISSING_IT_THERE)


[style] ~44-~44: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... containing your logs. * <key-prefix> should be the prefix of all logs you wish to c...

(MISSING_IT_THERE)


50-52: Add a concrete example with actual values

Consider adding a real-world example to help users understand the command structure better.

Add this after line 52:

For example:
```bash
sbin/compress.sh s3 --aws-credentials-file ~/.aws/credentials s3://my-logs-bucket/apache/2024/01/

---

`72-80`: **Enhance prefix matching explanation with visual example and workaround**

The prefix matching behaviour could be clearer with a tree structure example and a temporary workaround.


Add this content after line 79:
```markdown
For example, given this structure:

logs/
├── syslog
└── syslog.1


To compress only `syslog`, you can temporarily move or rename `syslog.1` until this limitation is addressed.
🧰 Tools
🪛 LanguageTool

[grammar] ~76-~76: Possible subject-verb agreement error detected.
Context: ... path to the log file. However, if that log file's path is a prefix of another log ...

(THIS_THAT_AGR)

docs/src/user-guide/guides-using-object-storage/archive-storage.md (3)

1-7: Consider adding context about why these steps are necessary.

The introduction could be more helpful by briefly explaining why these configuration steps are required (e.g., for security and access control purposes).

Add a brief explanation after line 4:

 To store compressed archives on S3, you'll need to:
+
+These steps ensure secure access to your S3 storage while maintaining proper access controls:

18-24: Add explanations for the IAM policy permissions.

Consider clarifying:

  1. Why both s3:GetObject and s3:PutObject permissions are needed
  2. The significance of the trailing wildcard in the Resource pattern

Add explanations after line 28:

 }

+Notes about the policy:
+* s3:GetObject is required for reading existing archives
+* s3:PutObject is required for storing new archives
+* The trailing /* in the Resource pattern allows access to all objects under the specified prefix


---

`69-69`: **Update the AWS regions documentation link.**

The current link points to RDS-specific documentation. Consider using the general AWS regions documentation instead.

```diff
-[aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability
+[aws-region-codes]: https://docs.aws.amazon.com/general/latest/gr/rande.html#regional-endpoints
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bdcf423 and 82caf11.

📒 Files selected for processing (3)
  • docs/src/user-guide/guides-overview.md (1 hunks)
  • docs/src/user-guide/guides-using-object-storage/archive-storage.md (1 hunks)
  • docs/src/user-guide/guides-using-object-storage/compress.md (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/src/user-guide/guides-overview.md
🧰 Additional context used
🪛 LanguageTool
docs/src/user-guide/guides-using-object-storage/compress.md

[style] ~43-~43: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... appropriate values: * <bucket-name> should be the name of the S3 bucket containing...

(MISSING_IT_THERE)


[style] ~44-~44: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... containing your logs. * <key-prefix> should be the prefix of all logs you wish to c...

(MISSING_IT_THERE)


[grammar] ~76-~76: Possible subject-verb agreement error detected.
Context: ... path to the log file. However, if that log file's path is a prefix of another log ...

(THIS_THAT_AGR)

docs/src/user-guide/guides-using-object-storage/archive-storage.md

[style] ~32-~32: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... appropriate values: * <bucket-name> should be the name of the S3 bucket to store c...

(MISSING_IT_THERE)


[style] ~33-~33: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ...e compressed archives. * <key-prefix> should be the prefix where you want the compre...

(MISSING_IT_THERE)


[style] ~64-~64: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...s the bucket's name. * <key-prefix> is the "directory" where all archives will...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: lint-check (ubuntu-latest)
  • GitHub Check: build (macos-latest)
  • GitHub Check: lint-check (macos-latest)

Comment on lines 54 to 68
* `<credentials-file>` is the path to an AWS credentials file like the following:

```ini
[default]
aws_access_key_id = <aws-access-key-id>
aws_secret_access_key = <aws-secret-access-key>
```

* CLP expects the credentials to be in the `default` section.
* `<aws-access-key-id>` and `<aws-secret-access-key>` are the access key ID and secret access
key of the CLP IAM user.
* If you don't want to use a credentials file, you can specify the credentials on the command
line using the `--aws-access-key-id` and `--aws-secret-access-key` flags (note that this may
expose your credentials to other users running on the system).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add AWS region configuration information

The credentials section should mention how to specify the AWS region.

Add this content after line 60:

  You can specify the AWS region in the credentials file:
  ```ini
  region = us-east-1

Alternatively, use the --aws-region flag on the command line.


<!-- This is an auto-generated comment by CodeRabbit -->

@kirkrodrigues kirkrodrigues requested a review from haiqi96 January 22, 2025 13:51
* For brevity, we'll refer to this user as the "CLP IAM user" in the rest of this guide.

:::{note}
CLP currently requires IAM user (long-term) credentials to access the relevant S3 buckets. Support
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just double checked the IAM user creation. The access key is not created by default when creating the user from AWS console, so user need to manually create it.

Should we point user to this guide somewhere? https://docs.aws.amazon.com/keyspaces/latest/devguide/create.keypair.html

Note that AWS's recommendation is to not use long-term key, so I guess it's better to mention that this solution is temporary and we will support temporary key (which you already mentioned in the note)

:link: stream-storage
Viewing compressed logs
^^^
Viewing compressed logs from object storage
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about term Viewing compressed logs from object storage.
It may give people an impression that it's about viewing the "logs stored in object storage", which could cause confusion.
It's hard to explain intermediate files in a short title though. should we briefly explain what's going on underthehood at the beginning of stream-storage.md?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could cause confusion

What would be the specific ambiguity?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's say my configuration is storing archives on S3, and use local FS storage for log viewer.
Can this be interpreted as Viewing compressed logs from object storage, since the logs to be viewed are indeed from object storage.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (11)
docs/src/user-guide/guides-using-object-storage/object-storage-config.md (5)

11-36: Consider adding explanatory comments for IAM policy actions.

The IAM policy would be more user-friendly with brief explanations of why each permission (s3:GetObject and s3:ListBucket) is necessary for the compression workflow.


41-47: Add examples of valid prefix values.

The note about directory-like prefixes would be more helpful with concrete examples. Consider adding examples like:

Examples:
- `logs/` - Allows compression of files under the logs directory
- `logs/2024/` - Allows compression of files under logs/2024
- `logs` (without trailing slash) - Also allows compression of files with prefixes like `logs-backup`

49-75: Consider adding context about permissions and lifecycle management.

The archive storage section would benefit from:

  1. Explaining why both s3:GetObject and s3:PutObject permissions are required
  2. Adding a note about considering S3 bucket lifecycle policies for managing archive retention and transitioning to cheaper storage classes
🧰 Tools
🪛 LanguageTool

[style] ~72-~72: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... ] } ] } ``` * <bucket-name> should be the name of the S3 bucket where comp...

(MISSING_IT_THERE)


[style] ~73-~73: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ...ives should be stored. * <key-prefix> should be the prefix (used like a directory pa...

(MISSING_IT_THERE)


127-129: Consider restricting allowed headers in CORS policy.

The current CORS configuration allows all headers ("*"). For enhanced security, consider listing only the specific headers required by the log viewer.


143-147: Enhance security recommendation with examples.

The security tip about allowed origins would be more actionable with examples:

Example of specific allowed origins:
```json
"AllowedOrigins": [
    "https://your-company-domain.com",
    "https://internal-tools.your-company-domain.com"
]

</blockquote></details>
<details>
<summary>docs/src/user-guide/guides-using-object-storage/clp-usage.md (2)</summary><blockquote>

`19-33`: **Consider adding environment variables as a secure alternative.**

The credentials configuration is well-explained, but consider mentioning environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY) as a more secure alternative to command-line arguments.

```diff
  * If you don't want to use a credentials file, you can specify the credentials on the command
    line using the `--aws-access-key-id` and `--aws-secret-access-key` flags (note that this may
-    expose your credentials to other users running on the system).
+    expose your credentials to other users running on the system). Alternatively, you can use
+    environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` for better security.

50-52: Remove unused reference definition.

The reference [add-iam-policy] is defined but not used in the document. Consider removing it to maintain clean documentation.

-[add-iam-policy]: https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_manage-attach-detach.html#embed-inline-policy-console
 [aws-region-codes]: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html#Concepts.RegionsAndAvailabilityZones.Availability
 [compression-iam-policy]: ./object-storage-config.md#configuration-for-compression
🧰 Tools
🪛 Markdownlint (0.37.0)

50-50: Unused link or image reference definition: "add-iam-policy"
Link and image reference definitions should be needed

(MD053, link-image-reference-definitions)

docs/src/user-guide/guides-using-object-storage/clp-config.md (2)

9-40: Add missing article in the key_prefix description.

The archive storage configuration is well-documented. However, there's a minor grammatical issue.

Apply this change:

-  must end with trailing forward slash (e.g., `archives/`).
+  must end with a trailing forward slash (e.g., `archives/`).
🧰 Tools
🪛 LanguageTool

[style] ~37-~37: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...s the bucket's name. * <key-prefix> is the "directory" where all archives will...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[uncategorized] ~38-~38: Possible missing article found.
Context: ...within the bucket and must end with trailing forward slash (e.g., archives/). * ...

(AI_HYDRA_LEO_MISSING_A)


41-72: Add missing article in the key_prefix description.

The stream storage configuration maintains excellent consistency with the archive storage section. However, there's the same grammatical issue.

Apply this change:

-  must end with trailing forward slash (e.g., `streams/`).
+  must end with a trailing forward slash (e.g., `streams/`).
🧰 Tools
🪛 LanguageTool

[style] ~69-~69: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...s the bucket's name. * <key-prefix> is the "directory" where all streams will ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

docs/src/user-guide/guides-using-object-storage/index.md (2)

5-7: Consider clarifying the "compress logs from object storage" description.

Based on previous discussions, the current wording might cause confusion about whether this refers to viewing logs that are already stored in object storage versus compressing logs from object storage.

Consider adding a brief clarification:

-* compress logs from object storage (e.g., S3);
+* compress logs stored in object storage (e.g., S3) into archives;

37-39: Emphasize manual access key creation requirement.

Based on previous feedback, it would be helpful to explicitly mention that access keys are not created by default during IAM user creation.

Consider adding this clarification:

    * To create these credentials, follow [this guide][aws-create-access-keys].
+     * Note: Access keys are not created automatically when creating an IAM user.
      * Choose the "Other" use case to generate long-term credentials.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 59b72e9 and 454cb87.

📒 Files selected for processing (4)
  • docs/src/user-guide/guides-using-object-storage/clp-config.md (1 hunks)
  • docs/src/user-guide/guides-using-object-storage/clp-usage.md (1 hunks)
  • docs/src/user-guide/guides-using-object-storage/index.md (1 hunks)
  • docs/src/user-guide/guides-using-object-storage/object-storage-config.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/src/user-guide/guides-using-object-storage/object-storage-config.md

[style] ~38-~38: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... } } ] } ``` * <bucket-name> should be the name of the S3 bucket containing...

(MISSING_IT_THERE)


[style] ~39-~39: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ...aining your logs. * <all-logs-prefix> should be the prefix of all logs you wish to c...

(MISSING_IT_THERE)


[style] ~72-~72: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... ] } ] } ``` * <bucket-name> should be the name of the S3 bucket where comp...

(MISSING_IT_THERE)


[style] ~73-~73: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ...ives should be stored. * <key-prefix> should be the prefix (used like a directory pa...

(MISSING_IT_THERE)


[style] ~113-~113: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... ] } ] } ``` * <bucket-name> should be the name of the S3 bucket where cach...

(MISSING_IT_THERE)


[style] ~114-~114: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ...eams should be stored. * <key-prefix> should be the prefix (used like a directory pa...

(MISSING_IT_THERE)

docs/src/user-guide/guides-using-object-storage/clp-config.md

[style] ~37-~37: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...s the bucket's name. * <key-prefix> is the "directory" where all archives will...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[uncategorized] ~38-~38: Possible missing article found.
Context: ...within the bucket and must end with trailing forward slash (e.g., archives/). * ...

(AI_HYDRA_LEO_MISSING_A)


[style] ~69-~69: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...s the bucket's name. * <key-prefix> is the "directory" where all streams will ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

docs/src/user-guide/guides-using-object-storage/clp-usage.md

[style] ~36-~36: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...containing your logs. * <region-code> is the AWS region [code][aws-region-codes]...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~37-~37: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...cket containing your logs. * <prefix> is the prefix of all logs you wish to comp...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[grammar] ~44-~44: Possible subject-verb agreement error detected.
Context: ... path to the log file. However, if that log file's path is a prefix of another log ...

(THIS_THAT_AGR)

docs/src/user-guide/guides-using-object-storage/index.md

[grammar] ~33-~33: A determiner may be missing.
Context: ... case to follow the [principle of least privilege][least-privilege-principle], ...

(THE_SUPERLATIVE)

🪛 Markdownlint (0.37.0)
docs/src/user-guide/guides-using-object-storage/clp-usage.md

50-50: Unused link or image reference definition: "add-iam-policy"
Link and image reference definitions should be needed

(MD053, link-image-reference-definitions)

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: lint-check (ubuntu-latest)
  • GitHub Check: lint-check (macos-latest)
🔇 Additional comments (9)
docs/src/user-guide/guides-using-object-storage/clp-usage.md (4)

1-5: LGTM! Clear and concise introduction.

The introduction effectively sets the context and provides helpful navigation to related documentation.


6-18: LGTM! Well-structured command example.

The command syntax is clear, and the use of line continuation improves readability.


34-39: LGTM! Clear parameter descriptions.

The parameter descriptions are comprehensive and well-linked to related documentation.

🧰 Tools
🪛 LanguageTool

[style] ~36-~36: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...containing your logs. * <region-code> is the AWS region [code][aws-region-codes]...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~37-~37: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...cket containing your logs. * <prefix> is the prefix of all logs you wish to comp...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


40-48: LGTM! Important limitations clearly documented.

The notes section effectively communicates current limitations and expected future improvements.

🧰 Tools
🪛 LanguageTool

[grammar] ~44-~44: Possible subject-verb agreement error detected.
Context: ... path to the log file. However, if that log file's path is a prefix of another log ...

(THIS_THAT_AGR)

docs/src/user-guide/guides-using-object-storage/clp-config.md (2)

1-7: LGTM! Clear introduction with important operational note.

The introduction effectively sets up the configuration guide, and the note about shutting down CLP before making changes is appropriately highlighted.


73-78: LGTM! Clear documentation of current limitations.

The note about stream caching limitations and the AWS region codes reference are well-documented.

docs/src/user-guide/guides-using-object-storage/index.md (3)

46-67: LGTM! Well-structured configuration section.

The grid layout effectively organizes the configuration subsections and provides clear navigation to detailed guides.


68-82: LGTM! Consistent and clear usage section.

The section maintains excellent consistency with the configuration section and provides clear guidance for users.


83-95: LGTM! Well-organized navigation and references.

The toctree structure and references are properly organized and complete.

@kirkrodrigues kirkrodrigues requested a review from haiqi96 January 23, 2025 12:19
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
docs/src/user-guide/guides-using-object-storage/clp-config.md (5)

1-8: Consider adding AWS-specific context and prerequisites.

The introduction would benefit from:

  1. Explicitly mentioning that this guide focuses on Amazon S3 storage.
  2. Adding a prerequisites section that outlines:
    • Required AWS IAM permissions
    • How to obtain AWS credentials
    • Supported S3 bucket configurations

24-26: Consider adding security best practices for credential management.

Recommend adding guidance on:

  1. Using AWS IAM roles instead of hardcoded credentials where possible
  2. Environment variables as an alternative to storing credentials in the config file
  3. AWS credential file usage (~/.aws/credentials)

31-40: Enhance the configuration parameters documentation.

Consider the following improvements:

  1. Add validation steps to verify the configuration
  2. Provide examples of valid key prefixes
  3. Rephrase the bullet points to avoid repetitive sentence beginnings

Example rephrasing:

- * `<region-code>` is the AWS region [code][aws-region-codes] for the bucket.
- * `<bucket-name>` is the bucket's name.
- * `<key-prefix>` is the "directory" where all archives will be stored...
+ * For `<region-code>`, specify the AWS region [code][aws-region-codes] for the bucket.
+ * The `<bucket-name>` specifies your S3 bucket's name.
+ * Within `<key-prefix>`, define the "directory" path where archives will be stored...
🧰 Tools
🪛 LanguageTool

[style] ~37-~37: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...s the bucket's name. * <key-prefix> is the "directory" where all archives will...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


41-72: Reduce documentation duplication.

Consider refactoring the stream storage section to:

  1. Extract common S3 configuration concepts into a shared section
  2. Reference the shared concepts from both archive and stream sections
  3. Focus each section on use-case specific details

This would improve maintainability and reduce the risk of documentation drift.

🧰 Tools
🪛 LanguageTool

[style] ~69-~69: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...s the bucket's name. * <key-prefix> is the "directory" where all streams will ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


73-76: Enhance the stream deletion limitation note.

Consider adding:

  1. Temporary workarounds for managing stream storage
  2. Best practices for monitoring S3 storage usage
  3. Guidance on setting up S3 lifecycle rules to manage storage costs
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 454cb87 and 57690ea.

📒 Files selected for processing (1)
  • docs/src/user-guide/guides-using-object-storage/clp-config.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/src/user-guide/guides-using-object-storage/clp-config.md

[style] ~37-~37: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...s the bucket's name. * <key-prefix> is the "directory" where all archives will...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)


[style] ~69-~69: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...s the bucket's name. * <key-prefix> is the "directory" where all streams will ...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: lint-check (ubuntu-latest)
  • GitHub Check: build (macos-latest)
  • GitHub Check: lint-check (macos-latest)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
docs/src/user-guide/guides-using-object-storage/index.md (3)

3-11: Consider adding a diagram to illustrate the relationships between functionalities.

The three capabilities are well-explained, but a visual diagram showing how they interact (e.g., how stream files relate to viewing compressed logs) would help users better understand the workflow.

Would you like me to help create a Mermaid diagram to illustrate these relationships?


32-33: Add missing comma for better readability.

Add a comma after "at this stage" to improve sentence flow.

-      * You don't need to assign any groups or policies to the user at this stage since we will
+      * You don't need to assign any groups or policies to the user at this stage, since we will
🧰 Tools
🪛 LanguageTool

[uncategorized] ~32-~32: Possible missing comma found.
Context: ... groups or policies to the user at this stage since we will attach policies i...

(AI_HYDRA_LEO_MISSING_COMMA)


55-60: Enhance the card description with specific configuration aspects.

The current description is quite general. Consider expanding it to briefly mention the key configuration aspects covered (e.g., bucket policies, IAM permissions).

 Configuring object storage
 ^^^
-Configuring your object storage bucket for each use case.
+Configuring your object storage bucket, including bucket policies and IAM permissions, for each use case.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 57690ea and e1e4444.

📒 Files selected for processing (1)
  • docs/src/user-guide/guides-using-object-storage/index.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/src/user-guide/guides-using-object-storage/index.md

[uncategorized] ~32-~32: Possible missing comma found.
Context: ... groups or policies to the user at this stage since we will attach policies i...

(AI_HYDRA_LEO_MISSING_COMMA)


[grammar] ~35-~35: A determiner may be missing.
Context: ... case to follow the [principle of least privilege][least-privilege-principle], ...

(THE_SUPERLATIVE)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: lint-check (macos-latest)
🔇 Additional comments (2)
docs/src/user-guide/guides-using-object-storage/index.md (2)

38-41: 🛠️ Refactor suggestion

Clarify the access key creation process.

Based on previous feedback, it would be helpful to explicitly mention that access keys are not created automatically when creating an IAM user through the AWS console.

 6. IAM user (long-term) credentials for the IAM user(s) created in step (4) above.
     * To create these credentials, follow [this guide][aws-create-access-keys].
+      * Note: Access keys are not created automatically when creating an IAM user through the AWS console.
       * Choose the "Other" use case to generate long-term credentials.

Likely invalid or redundant comment.


81-81: 🛠️ Refactor suggestion

Clarify the relationship between viewing and stream files.

Based on previous discussions about potential confusion, consider being more explicit about the relationship between viewing compressed logs and stream files.

-Using CLP to compress, search, and view log files from object storage.
+Using CLP to compress logs, store archives, and view compressed logs using locally cached stream files.

Likely invalid or redundant comment.

Comment on lines 54 to 70
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject"
],
"Resource": [
"arn:aws:s3:::<bucket-name>/<key-prefix>/*"
]
}
]
}
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The indentation is set to one space. let's be consistent with the other jsons which use two spaces.

Suggested change
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject"
],
"Resource": [
"arn:aws:s3:::<bucket-name>/<key-prefix>/*"
]
}
]
}
```
```json
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject",
"s3:PutObject"
],
"Resource": [
"arn:aws:s3:::<bucket-name>/<key-prefix>/*"
]
}
]
}

@haiqi96
Copy link
Contributor

haiqi96 commented Jan 23, 2025

As per offline discussion, it seems Marco's machine doesn't automatically pull our latest execution image since the image is already on his machine.

davemarco
davemarco previously approved these changes Jan 24, 2025
Copy link
Contributor

@davemarco davemarco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some comments. But feel free to ignore them. Most are out of scope of this PR. I 'm also out of the loop on this project, so I dont know whats going behind the scenes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would consider renaming this section from guides to object storage. Everything in user-guide is technically a guide. We could then renameUsing object storage to Using AWS S3.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our plan is actually to rename "User guide" to "User docs" and "Developer guide" to "Developer docs". Although technically everything is a guide, we do want to differentiate "guides" (tutorials) from reference docs. Technically we could also move the quick start section into the guides section, but I'd need to restructure it a little.

What do you think about moving in that direction instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I feel maybe the restructure can wait after we release the software to the intended user


* compress logs from object storage (e.g., S3);
* store archives on object storage; and
* cache stream files (used for viewing compressed logs) on object storage.
Copy link
Contributor

@davemarco davemarco Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The term cache stream files is a bit confusing I think for a new user. Maybe store decompressed archives on object storage for viewing in UI

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more for discussion for a later point, but why would a user want this, like what benefits does this provide over local disk? Also, perhaps this feature can eventually be removed if the log viewer can open archives?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I think this guide is more for a user who has some familiarity with CLP, but I do agree that stream files are not well-explained in our overall set of docs; we need to add more about the log viewer and perhaps we could link to those new docs from here.
  • As for why a user would want this, I think that could be explained in the log viewer docs mentioned above? I do want to keep this guide focused on how to use object storage.
  • Once archive viewing is supported in the log viewer, we probably won't need this cache as much anymore. I could still see it being useful if the archive is just too big to be downloaded/viewed in the viewer so we need to generate something more manageable.

bucket: "<bucket-name>"
key_prefix: "<key-prefix>"
credentials:
access_key_id: "<aws-access-key-id>"
Copy link
Contributor

@davemarco davemarco Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if in future these can be optional, and have boto magically find credentials if they are available (it will search env variables, ./credential files, EC2, etc... I believe. I dont think many people would actually have multiple IAM user like I mentioned before (and u may beable to specify them with a profile i believe).Also, it's not optimal for users to have to enter text keys. It will also help out eventually to run on EC2, where you dont need to pass in keys to instance. Also Iamrole support would be nice. But this is all out of scope of this PR.

Copy link
Member Author

@kirkrodrigues kirkrodrigues Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is planned for a future PR.

* You don't need to assign any groups or policies to the user at this stage since we will
attach policies in later steps, depending on which object storage use cases you require.
* You may use a different IAM user for each use case to follow the
[principle of least privilege][least-privilege-principle], or you can use the same user for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would get rid of this principle of least privilege sentence.

You may use a single IAM user for all buckets, or a separate IAM user for each bucket.

My knowledge of aws security rings is weak but...
I think it is a bit weird suggesting that they have multiple IAM users as best practice. I think in general someone working at a company would only have one IAM user, and can cannot create new ones. Like I don't think multiple IAM user is normal, and multiple IAM roles would make more sense. I may be completely wrong here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also in general it probably makes sense to provide IAM role support later on.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of agree that it might be weird for someone to have more than one users. Maybe we should just ask them to use one account as default?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the way I reworded the sentence helps a bit. By putting the single user case first. But maybe there is a better way

```

* CLP expects the credentials to be in the `default` section.
* `<aws-access-key-id>` and `<aws-secret-access-key>` are the access key ID and secret access
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again not for this PR. But it would nice if this was optional, and boto could do magic to find credentials. Also iamrole support

@davemarco davemarco dismissed their stale review January 24, 2025 00:55

Haiqi should approve

@haiqi96
Copy link
Contributor

haiqi96 commented Jan 24, 2025

@davemarco about the confusing error message, since you deleted the original comments. let me add some notes here.

It is kind of a known issue, where pydantic's error isn't very intuitive when a field can be one of a list of ObjectClass.

It seems pydantic will try to iterate and verify against all possible case, (in this case, S3config and FSconfig), and report errors for each case. Another example of this is explained in #676. (it's an unrelated case but has an example in the description)

Maybe we should open an issue to improve this

@davemarco
Copy link
Contributor

davemarco commented Jan 24, 2025

@davemarco about the confusing error message, since you deleted the original comments. let me add some notes here.

I deleted it because I noticed, it did actually give the error, but later on... Looks like the same thing in #676 . Maybe someone can look into, but it might just not be fixable...

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
docs/src/user-guide/guides-using-object-storage/index.md (2)

7-7: 🛠️ Refactor suggestion

Clarify the term "cache stream files" for new users

The term "cache stream files" might be unclear to new users. Consider rephrasing to "store decompressed archives on object storage for viewing in UI" and adding a brief explanation of why a user might choose object storage over local disk for this purpose.


34-34: 🛠️ Refactor suggestion

Simplify IAM user recommendation

The suggestion to use separate IAM users might not align with common practices. Consider recommending a single IAM user as the default approach, as multiple IAM users are not typical in enterprise environments.

-    * You may use a single IAM user for all use cases, or a separate one for each.
+    * We recommend using a single IAM user for all use cases.
🧹 Nitpick comments (2)
docs/src/user-guide/guides-using-object-storage/object-storage-config.md (2)

41-47: Consider adding a link to the compress.sh documentation.

The note references sbin/compress.sh without providing a link to its documentation. Consider adding a reference link to help users find the related compression documentation more easily.


83-85: Consider adding a tracking link for future log viewer capabilities.

The note about future log viewer support would be more helpful if it included a link to track progress (e.g., a GitHub issue or project board).

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e1e4444 and 5808e0d.

📒 Files selected for processing (2)
  • docs/src/user-guide/guides-using-object-storage/index.md (1 hunks)
  • docs/src/user-guide/guides-using-object-storage/object-storage-config.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/src/user-guide/guides-using-object-storage/object-storage-config.md

[style] ~38-~38: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... } } ] } ``` * <bucket-name> should be the name of the S3 bucket containing...

(MISSING_IT_THERE)


[style] ~39-~39: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ...aining your logs. * <all-logs-prefix> should be the prefix of all logs you wish to c...

(MISSING_IT_THERE)


[style] ~72-~72: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... ] } ] } ``` * <bucket-name> should be the name of the S3 bucket where comp...

(MISSING_IT_THERE)


[style] ~73-~73: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ...ives should be stored. * <key-prefix> should be the prefix (used like a directory pa...

(MISSING_IT_THERE)


[style] ~113-~113: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ... ] } ] } ``` * <bucket-name> should be the name of the S3 bucket where cach...

(MISSING_IT_THERE)


[style] ~114-~114: To form a complete sentence, be sure to include a subject or ‘there’.
Context: ...eams should be stored. * <key-prefix> should be the prefix (used like a directory pa...

(MISSING_IT_THERE)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: lint-check (macos-latest)
🔇 Additional comments (8)
docs/src/user-guide/guides-using-object-storage/object-storage-config.md (4)

11-36: LGTM! IAM policy is well-structured and follows JSON formatting conventions.

The policy correctly specifies the minimum required permissions with proper indentation and clear placeholder markers.


54-70: LGTM! Archive storage IAM policy is well-structured.

The policy correctly specifies both read and write permissions required for archive storage operations.


95-111: LGTM! Stream storage IAM policy is well-structured.

The policy correctly specifies both read and write permissions required for stream storage operations.


124-141: Consider security implications of wildcard CORS configuration.

While the current CORS policy works, using wildcards ("*") for AllowedHeaders and AllowedOrigins might be too permissive for production environments. Consider:

  1. Limiting AllowedHeaders to specific required headers
  2. Using a list of specific origins instead of "*"
docs/src/user-guide/guides-using-object-storage/index.md (4)

36-38: Add note about AWS best practices for access keys

Consider adding a note about AWS's recommendation against using long-term access keys, even though it's currently required. This helps users understand the security implications.

 6. IAM user (long-term) credentials for the IAM user(s) created in step (4) above.
     * To create these credentials, follow [this guide][aws-create-access-keys].
       * Choose the "Other" use case to generate long-term credentials.
+    * Note: While CLP currently requires long-term access keys, AWS generally recommends using temporary credentials where possible for enhanced security.

46-66: LGTM! Well-structured configuration section

The grid layout effectively organizes the configuration topics, making it easy for users to navigate to specific configuration instructions.


79-79: Clarify "viewing log files from object storage"

The current wording might cause confusion about whether this refers to viewing logs stored in object storage or viewing logs using stream files cached in object storage. Consider adding a brief explanation at the beginning of the referenced page to clarify this distinction.


83-94: LGTM! Complete reference section

All necessary references are included and properly formatted.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
docs/src/user-guide/guides-using-object-storage/index.md (3)

7-7: Consider clarifying the concept of stream files

Based on previous discussions, the term "cache stream files" might be unclear to new users. Consider adding a brief explanation or linking to documentation that explains what stream files are and their role in the log viewing process.

-* cache stream files (used for viewing compressed logs) on object storage.
+* cache stream files on object storage (these are temporary files generated when viewing compressed logs in the UI).

33-34: Add a comma for better readability

Add a comma after "stage" to improve sentence flow.

-      * You don't need to assign any groups or policies to the user at this stage since we will
+      * You don't need to assign any groups or policies to the user at this stage, since we will
🧰 Tools
🪛 LanguageTool

[uncategorized] ~33-~33: Possible missing comma found.
Context: ... groups or policies to the user at this stage since we will attach policies i...

(AI_HYDRA_LEO_MISSING_COMMA)


35-36: Simplify IAM user guidance

Based on previous discussions, having multiple IAM users isn't a common practice. Consider simplifying this guidance to focus on the typical single-user scenario.

-    * You may use a single IAM user for all use cases, or a separate one for each.
-    * For brevity, we'll refer to this user as the "CLP IAM user" in the rest of this guide.
+    * We recommend using a single IAM user for all use cases. We'll refer to this as the "CLP IAM user" in the rest of this guide.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5808e0d and 8e06c1a.

📒 Files selected for processing (1)
  • docs/src/user-guide/guides-using-object-storage/index.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/src/user-guide/guides-using-object-storage/index.md

[uncategorized] ~33-~33: Possible missing comma found.
Context: ... groups or policies to the user at this stage since we will attach policies i...

(AI_HYDRA_LEO_MISSING_COMMA)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: lint-check (ubuntu-latest)
🔇 Additional comments (2)
docs/src/user-guide/guides-using-object-storage/index.md (2)

51-67: Well-organized configuration section

The grid layout effectively organizes the configuration topics and provides clear navigation to detailed instructions.


69-95: Clear navigation and well-structured references

The usage section maintains a consistent layout with the configuration section, and the reference links are well-organized.

@kirkrodrigues kirkrodrigues merged commit 230d518 into y-scope:main Jan 24, 2025
5 checks passed
@kirkrodrigues kirkrodrigues deleted the s3-docs branch January 24, 2025 06:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants