I get AWS ECR exit status 255 despite using AWS Ubuntu containers - amazon-ecs

I am trying to build a Docker container in AWS CodeBuild as a means to deploy a container to ECR, but I get this error:
Error while executing command: $(aws ecr get-login --region ap-southeast-1). Reason: exit status 255
This command was run from the buildspec.yml file, using the aws/codebuild/ubuntu-base:14.04 image with privileged mode enabled ("Enable this flag if you want to build Docker images or want your builds to get elevated privileges").
The log files are as follows:
[Container] 2018/10/11 00:52:49 Running command $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email)
An error occurred (AccessDeniedException) when calling the GetAuthorizationToken operation: User: arn:aws:sts::502776083946:assumed-role/code-build-timesheet/AWSCodeBuild-f1d205b1-b03f-4727-a4d7-a02118021eec is not authorized to perform: ecr:GetAuthorizationToken on resource: *
[Container] 2018/10/11 00:52:52 Command did not exit successfully $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email) exit status 255
[Container] 2018/10/11 00:52:52 Phase complete: INSTALL Success: false
[Container] 2018/10/11 00:52:52 Phase context status code: COMMAND_EXECUTION_ERROR Message: Error while executing command: $(aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email). Reason: exit status 255

This status code usually indicates an unauthorized user. To fix this, we need to allow our CodeBuild role to talk to ECR. To do this, go to IAM and attach the AmazonEC2ContainerRegistryPowerUser policy to your CodeBuild role.
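If you prefer the CLI to the console, the same policy can be attached with a single command; a sketch, using the code-build-timesheet role name from the log above:
aws iam attach-role-policy --role-name code-build-timesheet --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPowerUser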

The AWS CLI version 2 has been updated and the get-login command was deprecated; you should use get-login-password instead:
aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
See the updated documentation for building a Docker image in CodeBuild: https://docs.aws.amazon.com/codebuild/latest/userguide/sample-docker.html

In my case, I added the permission but was still getting the same issue. I later found that the "Permissions boundary" on the IAM role was not letting the permission through. So if you set the permission policies to allow ecr:GetAuthorizationToken but also have a permissions boundary enabled, you need to add the same permission to the permissions boundary (or remove the permissions boundary).
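To see whether a boundary is in play, you can inspect the role first; a sketch using the AWS CLI, with code-build-timesheet standing in for your role name (the query returns null when no boundary is attached):
aws iam get-role --role-name code-build-timesheet --query 'Role.PermissionsBoundary'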

Is there any more specific error in your CloudWatch log, like AccessDenied or something else? There should be some details in the log around the failed command.
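If nothing more specific shows up there, re-running the failing command with the AWS CLI's global --debug flag usually surfaces the underlying API error; for example:
aws ecr get-login --region $AWS_DEFAULT_REGION --no-include-email --debug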
Thanks,
Xin

Related

gcloud deploy - container failed to start

docker build -t "us.gcr.io/ek-airflow-stage/array_data:sree" .
Status: Downloaded newer image for python:3.7
---> 869a8debb0fd
Successfully built 869a8debb0fd
Successfully tagged us.gcr.io/ek-airflow-stage/array_data:sree
docker push "us.gcr.io/ek-airflow-stage/array_data:sree"
The push refers to repository [us.gcr.io/ek-airflow-stage/array_data]
a36ba9e322f7: Layer already exists
sree: b size: 2218
gcloud run deploy "ek-airflow-stage" \
--quiet \
--image "us.gcr.io/ek-airflow-stage/array_data:sree" \
--region "us-central1" \
--platform "managed"
Deploying container to Cloud Run service [ek-airflow-stage] in project ["project"] region [us-central1]
/ Deploying... Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable. Logs for this revision might contain more information.
Deployment failed
ERROR: (gcloud.run.deploy) Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable. Logs for this revision might contain more information.
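The error itself names the usual cause: the container never starts listening on the port Cloud Run injects through the PORT environment variable. A minimal sketch of an entrypoint that respects it, assuming a gunicorn-served app (app:app is a hypothetical module):
#!/bin/sh
# Bind to the port Cloud Run injects; fall back to 8080 for local runs.
exec gunicorn --bind "0.0.0.0:${PORT:-8080}" app:app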

When trying to connect Jenkins and Kubernetes, the Jenkins job throws the following error

Started by user admin.
Running as SYSTEM.
Building in workspace /var/lib/jenkins/workspace/myjob
[myjob] $ /bin/sh -xe /tmp/jenkins8491647919256685444.sh
+ sudo kubectl get pods
error: the server doesn't have a resource type "pods"
Build step 'Execute shell' marked build as failure
Finished: FAILURE
It looks to me like the authentication credentials were not set correctly. Could you copy the kubeconfig file /etc/kubernetes/admin.conf to ~/.kube/config? Also check that the KUBECONFIG variable is set.
It would also help to increase the verbosity using the --v=99 flag.
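A minimal sketch of that setup for the Jenkins agent, assuming it runs as the jenkins user on the control-plane node (paths may differ on your install):
sudo mkdir -p /var/lib/jenkins/.kube
sudo cp /etc/kubernetes/admin.conf /var/lib/jenkins/.kube/config
sudo chown -R jenkins:jenkins /var/lib/jenkins/.kube
export KUBECONFIG=/var/lib/jenkins/.kube/config
kubectl get pods --v=99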
Please take a look: kubernetes-configuration.

Error executing access token command "/google/google-cloud-sdk/bin/gcloud config-helper --format=json"

I'm trying to follow this guide step by step to deploy Airflow on Kubernetes (https://github.com/EamonKeane/airflow-GKE-k8sExecutor-helm), but at this part of the execution I run into problems, as follows. Researching the topic, I have not found anything that solves my problem so far; does anyone have any suggestions?
SQL_ALCHEMY_CONN=postgresql+psycopg2://$AIRFLOW_DB_USER:$AIRFLOW_DB_USER_PASSWORD@$KUBERNETES_POSTGRES_CLOUDSQLPROXY_SERVICE:$KUBERNETES_POSTGRES_CLOUDSQLPROXY_PORT/$AIRFLOW_DB_NAME
echo $SQL_ALCHEMY_CONN > /secrets/airflow/sql_alchemy_conn
# Create the fernet key, which is needed to decrypt the database
FERNET_KEY=$(dd if=/dev/urandom bs=32 count=1 2>/dev/null | openssl base64)
echo $FERNET_KEY > /secrets/airflow/fernet-key
kubectl create secret generic airflow \
--from-file=fernet-key=/secrets/airflow/fernet-key \
--from-file=sql_alchemy_conn=/secrets/airflow/sql_alchemy_conn
Unable to connect to the server: error executing access token command
"/google/google-cloud-sdk/bin/gcloud config config-helper --format=json":
err=exit status 1 output= stderr=ERROR: gcloud crashed (BadStatusLine): ''
If you would like to report this issue, please run the following command:
gcloud feedback
To check gcloud for common problems, please run the following command:
gcloud info --run-diagnostics
I solved this by opening a new Cloud Shell tab and connecting to the cluster again:
gcloud container clusters get-credentials testcluster1 --zone=your_zone
Example:
Get the name and location of your cluster:
gcloud container clusters list
then:
gcloud container clusters get-credentials demo --zone=us-west1-a

IBM Bluemix registry push authentication error

When pushing to the Bluemix registry, I get the following error:
47c2386f248c: Waiting
2be95f0d8a0c: Waiting
2df9b8def18a: Waiting
unauthorized: authentication required
I've got the cs and cr plugins both installed, and have verified bx is being added to my Docker auths file. I have tried both using the OSX keychain as a credstore and without.
When I pull the IBM Liberty example from the Bluemix registry, or build an image with Liberty as the base, it pulls without issue.
I'm running:
docker build . -t registry.ng.bluemix.net/my_space/ibm
docker push registry.ng.bluemix.net/my_space/ibm
Have also tried manually exporting BLUEMIX_TRACE=true and re-running the login and init commands.
Ensure you have logged in to the Bluemix Container Registry before doing docker push:
$ docker pull registry.ng.bluemix.net/myspace/myimage
Using default tag: latest
Please login prior to pull:
Username (bearer): XXXX
Password:
Error response from daemon: unauthorized: authentication required
$ bx cr login
Logging in to 'registry.ng.bluemix.net'...
Logged in to 'registry.ng.bluemix.net'.
$ docker pull registry.ng.bluemix.net/myspace/myimage:4
4: Pulling from myspace/myimage
7b6bb4652a1b: Downloading [===> ] 5.272MB/70.48MB
See:
$ bx cr login --help
NAME:
login - Log the local Docker client in to IBM Bluemix Container Registry.
USAGE:
bx cr login
It's not clear if you own the namespace my_space. Can you run bx cr namespaces to see which namespaces you can push to? If need be, you can add one with bx cr namespace-add <something unique to you>.
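Putting that together, a sketch of the full flow, assuming my_space is a namespace you still need to claim:
bx cr namespace-add my_space
bx cr login
docker build . -t registry.ng.bluemix.net/my_space/ibm
docker push registry.ng.bluemix.net/my_space/ibm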

Google Dataproc Agent reports failure when using initialization script

I am trying to set up a cluster with an initialization script, but I get the following error:
[BAD JSON: JSON Parse error: Unexpected identifier "Google"]
In the log folder the init script output log is absent.
This seems rather strange, as it seemed to work last week, and the error message does not seem related to the init script but rather to the input arguments for cluster creation. I used the following command:
gcloud beta dataproc clusters create <clustername> --bucket <bucket> --zone <zone> --master-machine-type n1-standard-1 --master-boot-disk-size 10 --num-workers 2 --worker-machine-type n1-standard-1 --worker-boot-disk-size 10 --project <projectname> --initialization-actions <gcs-uri of script>
Apparently changing
#!/bin/sh
to
#!/bin/bash
and removing all "sudo" occurrences did the trick.
This particular error occurs most often when the initialization script is in a Cloud Storage (GCS) bucket to which the project running the cluster does not have access.
I would recommend double-checking that the project being used for the cluster has read access to the bucket.
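A quick way to test that before creating the cluster, run from the project that will own it (placeholder as in the create command above):
gsutil ls <gcs-uri of script>
gsutil cat <gcs-uri of script> | head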