403 with pre-signed S3 URL #585

Open
TomAugspurger opened this issue Jan 13, 2025 · 2 comments

@TomAugspurger

S3 supports pre-signed URLs, a way to encode authorization into the URL so that it can be shared and used much like a public HTTP URL. Currently, it looks like kvikio does not support them. A pre-signed URL can be generated through the console, CLI, or SDKs:

In [1]: import boto3

In [2]: import boto3, httpx

In [3]: s3 = boto3.client("s3")

In [4]: url = s3.generate_presigned_url("get_object", Params={"Bucket": "kvikiobench-56481", "Key": "data/small/0000"}, ExpiresIn=600)

In [5]: httpx.get(url).status_code
Out[5]: 200

If we take that URL and use it with kvikio, we get a 403 error:

>>> import kvikio
>>> kvikio.RemoteFile.open_http(url="https://kvikiobench-56481.s3.us-east-2.amazonaws.com/data/small/0000?response-content-disposition=inline&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Security-Token=...")
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[3], line 1
----> 1 kvikio.RemoteFile.open_http(url="https://kvikiobench-56481.s3.us-east-2.amazonaws.com/data/small/0000?response-content-disposition=inline&X-Amz-Content-Sha256=UNSIGNED-PAYLOAD&X-Amz-Security-Token=...")

File /raid/toaugspurger/envs/kvikio-env/lib/python3.12/site-packages/kvikio/remote_file.py:69, in RemoteFile.open_http(cls, url, nbytes)
     53 @classmethod
     54 def open_http(
     55     cls,
     56     url: str,
     57     nbytes: Optional[int] = None,
     58 ) -> RemoteFile:
     59     """Open a http file.
     60
     61     Parameters
   (...)
     67         for the file size.
     68     """
---> 69     return RemoteFile(_get_remote_module().RemoteFile.open_http(url, nbytes))

File remote_handle.pyx:92, in kvikio._lib.remote_handle.RemoteFile.open_http()

File remote_handle.pyx:81, in kvikio._lib.remote_handle.RemoteFile._from_endpoint()

RuntimeError: curl_easy_perform() error near /opt/conda/conda-bld/work/cpp/src/remote_handle.cpp:47(The requested URL returned error: 403)
@madsbk
Member

madsbk commented Jan 21, 2025

The problem is that a presigned URL is only valid for the HTTP method it was signed for (GET here), so it doesn't support HEAD, and KvikIO fails when it tries to get the file size.

It should work when setting the file size manually:

import kvikio
kvikio.RemoteFile.open_http(url="presigned-aws-url", nbytes=100)
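
If the size isn't known ahead of time, one possible way to find it without HEAD is a one-byte ranged GET, reading the total from the `Content-Range` response header. This is only a sketch: `size_from_content_range` and `open_presigned` are hypothetical helper names, not part of kvikio, and it assumes the server accepts a `Range` header on the pre-signed GET and replies with a `Content-Range` of the form `bytes 0-0/<total>`.

```python
import re
from typing import Mapping, Optional


def size_from_content_range(headers: Mapping[str, str]) -> Optional[int]:
    """Return the total size from a Content-Range header, or None if absent/unknown."""
    value = headers.get("Content-Range") or headers.get("content-range")
    if value is None:
        return None
    # Expected form: "bytes 0-0/12345"; the total may be "*" when unknown.
    match = re.fullmatch(r"bytes\s+\d+-\d+/(\d+)", value.strip())
    return int(match.group(1)) if match else None


def open_presigned(url: str):
    """Hypothetical helper: open a pre-signed GET URL, supplying the size up front."""
    import httpx
    import kvikio

    # A one-byte ranged GET uses the method the URL was signed for (assumption:
    # the Range header is not part of the signed headers).
    response = httpx.get(url, headers={"Range": "bytes=0-0"})
    response.raise_for_status()
    nbytes = size_from_content_range(response.headers)
    if nbytes is None:
        raise ValueError("server did not report a total size in Content-Range")
    # Passing nbytes means kvikio does not need to query the size itself.
    return kvikio.RemoteFile.open_http(url, nbytes=nbytes)
```

With that, `open_presigned(url)` would take the place of the direct `open_http` call above.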

@TomAugspurger
Author

Ah, thanks for tracking that down.
