Add Instructions for Contributing to the project #85
Comments
Do you have docker-compose installed? I believe the AmazonSageMakerFullAccess policy has, by default, an S3 condition requiring the bucket name to contain the word sagemaker. In addition, for local mode, I believe that since you have your AWS credentials set, they should be passed properly to the container. https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/local/image.py#L631 Could you perhaps try exporting your credentials as environment variables?
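One way to try the suggestion above, as a minimal sketch: read a profile from the shared AWS credentials file and expose it under the standard environment variable names (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_SESSION_TOKEN`). The helper name `credentials_as_env` and its parameters are hypothetical and not part of the SageMaker SDK. Whether local mode forwards these variables into the container depends on the SDK version, so treat this as something to experiment with rather than a confirmed fix.

```python
import configparser
from pathlib import Path


def credentials_as_env(profile="default", path=None):
    """Map one profile of an AWS shared credentials file to env-var names.

    Hypothetical helper for illustration; it only reads the file and
    returns a dict, it does not modify the current environment.
    """
    # Default location of the shared credentials file.
    path = Path(path) if path else Path.home() / ".aws" / "credentials"

    parser = configparser.ConfigParser()
    if not parser.read(path):
        raise FileNotFoundError(f"no credentials file at {path}")
    if not parser.has_section(profile):
        raise KeyError(f"profile {profile!r} not found in {path}")

    section = parser[profile]
    env = {
        "AWS_ACCESS_KEY_ID": section["aws_access_key_id"],
        "AWS_SECRET_ACCESS_KEY": section["aws_secret_access_key"],
    }
    # Temporary credentials (for example from an assumed role) also carry a token.
    token = section.get("aws_session_token")
    if token:
        env["AWS_SESSION_TOKEN"] = token
    return env
```

Calling `os.environ.update(credentials_as_env())` in the notebook before creating the estimator would then export the values for the current process.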
Hi @ChoiByungWook, I appreciate your help. The problem is that inside the container, the ExecutionRole passed to the When debugging these issues I assume the role from my local machine and run under that role. You can see from the stacktrace above that I have explicitly built the docker image with a credentials file of a user that is able to assume the ExecutionRole:
After that point the
Now I use the very same credentials file to authenticate and assume the intended role. Observe the following executed with the
I hope this makes it clear that the role has permissions to I am sure that I am simply not running the local container properly. Perhaps there are requirements about the notebook environment that instantiate the Once again, thanks for your help. I think SageMaker has fantastic potential and am quite keen to contribute.
I agree that it's confusing that the role passed to the TensorFlow estimator is not actually used in the containers in local mode. As mentioned in aws/sagemaker-python-sdk#413, we will update our documentation.
I am trying to get to the bottom of a problem (#413) that causes my deployed TensorFlow model to fail.
The model is simple and deploys to GCP MLE with basic instructions. The serving function that errors out on SageMaker works fine on MLE.
The problem seems to be in the way the sagemaker container processes the input.
As such, I have started to debug locally, but I am guessing about how to do that properly and am currently unsure how the local SageMaker container assumes the role passed to the TensorFlow constructor.
Currently, I am building the latest sagemaker-tensorflow-container image at v1.10.0 and calling it from a local notebook instance using the MNIST example provided by amazon-sagemaker-examples. However, the local container fails because it cannot get an object from S3.
The role can be verified to copy the object, leading me to suppose that the container does not assume the role properly.
I wonder how it is meant to assume the role?
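For reference, this is roughly how I assume the role from my local machine when debugging, as a sketch only. It uses the boto3 STS `assume_role` call and maps the temporary credentials it returns to the standard AWS environment variable names. The function names and the session name `local-mode-debug` are hypothetical, and I am not claiming this is what the container does internally.

```python
def sts_credentials_to_env(response):
    """Convert an STS AssumeRole response into AWS environment variables."""
    creds = response["Credentials"]
    return {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["SessionToken"],
    }


def assume_role_env(role_arn, session_name="local-mode-debug"):
    """Assume role_arn with the caller's current credentials.

    Requires boto3 and network access. The import is done here so that
    sts_credentials_to_env stays usable without boto3 installed.
    """
    import boto3

    sts = boto3.client("sts")
    response = sts.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    return sts_credentials_to_env(response)
```

Exporting the returned values before starting the notebook makes everything in that process run under the assumed role, which is how I verified that the role itself can copy the object.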
The instructions to build the container image locally are clear, thank you for that. I would like to see something in the README.md or CONTRIBUTING.md that shows the recommended process for developing the container and calling the built image locally.