
AWS Credentials cache must be protected by a lock #3364

Open
1 task
Veetaha opened this issue Jan 31, 2025 · 0 comments
Labels
bug This issue is a confirmed bug. needs-triage This issue or PR still needs to be triaged.

Comments

Veetaha commented Jan 31, 2025

Describe the bug

Today, session.get_credentials() and session.create_client() don't take a lock when sourcing credentials. This causes a performance problem when multiple AWS clients are created concurrently. It is a real-world issue whenever the script runs in an environment whose credential_process takes an exclusive file lock; for example, it reproduces in my environment, which uses an aws-vault SSO config with the pass backend.

So the problem is that credential loading is not locked: when the credential cache is empty, multiple threads each spawn their own credential process, resulting in significant lag.
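The race is independent of AWS itself. Here is a minimal standalone sketch (all names are illustrative, not botocore's internals) of the unlocked check-then-load pattern: every thread sees the empty cache and runs the expensive loader itself.

```python
import threading
import time

call_count = 0
cache = {}

def load_credentials():
    # Stand-in for a slow credential_process invocation (hypothetical).
    global call_count
    call_count += 1
    time.sleep(0.1)  # simulate the external process delay
    return {"AccessKeyId": "kkk"}

def get_credentials_unlocked():
    # Mirrors the current behavior: check, then load, without a lock.
    if "creds" not in cache:
        cache["creds"] = load_credentials()
    return cache["creds"]

threads = [threading.Thread(target=get_credentials_unlocked) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(call_count)  # typically prints 6: every thread spawned its own load
```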

Regression Issue

  • Select this option if this issue appears to be a regression.

Expected Behavior

Credential loading must be locked, so that only a single thread at a time loads the credentials and populates the cache.
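A minimal sketch of the expected behavior, using double-checked locking around a hypothetical cache wrapper (this is not botocore's actual API): the first thread holds the lock while loading, and every other thread re-checks the cache after acquiring it instead of loading again.

```python
import threading
import time

class LockedCredentialCache:
    """Hypothetical sketch of a lock-protected credential cache."""

    def __init__(self, loader):
        self._loader = loader
        self._lock = threading.Lock()
        self._creds = None

    def get(self):
        if self._creds is not None:  # fast path: cache already warm
            return self._creds
        with self._lock:
            # Re-check under the lock: a peer thread may have loaded already.
            if self._creds is None:
                self._creds = self._loader()
            return self._creds

calls = 0

def slow_loader():
    # Stand-in for the expensive credential_process invocation.
    global calls
    calls += 1
    time.sleep(0.1)
    return {"AccessKeyId": "kkk"}

cache = LockedCredentialCache(slow_loader)
threads = [threading.Thread(target=cache.get) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(calls)  # prints 1: the loader ran exactly once
```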

Current Behavior

Every thread that creates a client tries to mutate the credentials cache in parallel, spawning a separate credential_process per thread. This significantly hinders performance and is a footgun.

Reproduction Steps

Here is a minimized reproduction of the bug. Take this Python code as an example:

import concurrent.futures
import time
import botocore.session

session = botocore.session.get_session()

# This fixes the problem by populating the cache eagerly.
# Uncomment to see the difference
# session.get_credentials()

timer = time.perf_counter()

with concurrent.futures.ThreadPoolExecutor() as executor:
    executor.submit(session.create_client, 'sts')
    executor.submit(session.create_client, 'organizations')
    executor.submit(session.create_client, 's3')
    executor.submit(session.create_client, 'ec2')
    executor.submit(session.create_client, 'efs')
    executor.submit(session.create_client, 'fsx')

print(f"Client creation took {time.perf_counter() - timer:.2f} seconds")

Then create an AWS config with the following profile, and select it when running the script (e.g. via the AWS_PROFILE environment variable):

[profile test-cred-process]
region = eu-central-1
credential_process = /tmp/cred-process.sh

Put the following bash script into /tmp/cred-process.sh and make it executable. The script takes an advisory file lock on /tmp/lockfile and imitates a 0.5 s credential-resolution delay (just as aws-vault with an SSO setup has a considerable delay):

#!/usr/bin/env bash

(
    # Block until the advisory lock on fd 200 is free, then hold it
    # for the duration of the subshell.
    flock 200
    sleep 0.5s
    echo '{
        "Version": 1,
        "AccessKeyId": "kkk",
        "SecretAccessKey": "aaa",
        "SessionToken": "sss",
        "Expiration": "2025-01-31T11:16:40Z"
    }'
) 200>/tmp/lockfile

Now, if you run the provided Python code, it takes ~3 seconds to execute. This is because every create_client thread invokes its own credential process and waits ~0.5 s for the file lock to be released. If you call session.get_credentials() right before the executor.submit calls, the runtime drops to ~0.6 seconds.

Possible Solution

No response

Additional Information/Context

The original, analogous problem was reported against the aiobotocore package at aio-libs/aiobotocore#1282.

SDK version used

1.36.3

Environment details (OS name and version, etc.)

Ubuntu 22.04.5 LTS (Jammy Jellyfish)

@Veetaha Veetaha added bug This issue is a confirmed bug. needs-triage This issue or PR still needs to be triaged. labels Jan 31, 2025
@jakob-keller jakob-keller marked this as a duplicate of aio-libs/aiobotocore#1282 Feb 4, 2025