I'm setting up Spinnaker in K8s with aws-ecr. My setup and steps are:
on AWS side:
Added policies ecr-pull, ecr-push, and ecr-generate-token
Attached the policy to a role
Spinnaker setup:
Modified values.yaml with below above settings:
```accounts:
name: my-ecr
address: https://123456xxx.dkr.ecr.my-region.amazonaws.com
repositories:
123456xxx.dkr.ecr..amazonaws.com/spinnaker-test-project
```
Annotated clouddriver.yaml: deployment to use created role (using the IAM role in a pod by referencing the role name in an annotation on the pod specification)
But it doesn't work and the error on the cloudrvier side is :
.d.r.p.a.DockerRegistryImageCachingAgent : Could not load tags for 1234xxxxx.dkr.ecr.<my_region>.amazonaws.com/spinnaker-test-project in https://1234xxxxx.dkr.ecr.<my_region>.amazonaws.com
Would like to get some help or advice what I'm missing, thank you
Got the answer from an official Spinnaker slack channel. That adding an iam policy to the clouddriver pod won't work unfortunately since it uses the docker client instead of the aws client. The workaround to make it work can be found here
Note* Ecr support currently is broken in halyard.This might get fixed in future after halyard migrates from the kubernetes v1 -> v2 or earlier so please verify with community or docs.
Related
It seems that we cannot make the Snowplow container (snowplow/scala-stream-collector-kinesis) use the service account we provide. It always uses the shared-eks-node-role but not the provided service account. The config is set to default for both the accessKey as the secretKey.
This is the service account part we use:
apiVersion: v1
kind: ServiceAccount
metadata:
name: thijs-service-account
annotations:
eks.amazonaws.com/role-arn: arn:aws:iam::123:role/thijs-eks-service-account-role-snowplow
And when I inspect the pod I can see the account:
AWS_ROLE_ARN: arn:aws:iam::123:role/thijs-eks-service-account-role-snowplow
The error then shows not the right account.
Exception in thread "main" com.amazonaws.services.kinesis.model.AmazonKinesisException: User: arn:aws:sts::123:assumed-role/shared-eks-node-role/i-123 is not authorized to perform: kinesis:DescribeStream on resource: arn:aws:kinesis:eu-west-1:123:stream/snowplow-good (Service: AmazonKinesis; Status Code: 400; Error Code: AccessDeniedException; Request ID: 123-123-123; Proxy: null)
The collector itself doesn't do any role swapping. It only cares to receive credentials via one of three methods:
the default creds provider chain
a specific IAM role
environment variables.
The most popular deployment is on an EC2 instance, in which case the default EC2 role can be used to access other resources in the account.
It looks like when you are deploying it on EKS things are not as straightforward. The collector seems to work with this assumed role: arn:aws:sts::123:assumed-role/shared-eks-node-role/i-123 but it is not authorised with Kinesis permissions. Do you know what process creates that role? Perhaps you could add the missing Kinesis policies there?
I had the same issue. First make sure you have the IAM role setup correctly according to https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html. Make sure the names are consistent and it has the right permissions.
Once you've double-checked that, make sure you are on a recent version of snowplow. An old version might not have the right version of the AWS SDK. You need at least AWS SDK v1.12.128 or for AWS SDK v2, 2.10.11 [link].
Finally set the aws accessKey and secretKey in your snowplow configuration file to default. Redeploy and make sure the pod and service account has been recreated. You should be good at this point.
Reference:
https://github.com/snowplow/stream-collector/issues/186
I have the same issue.
It can't use env for the values, because those are not set. However, the collector is runnning as a container - it should use the default credential chain.
From the notes, it looks like without the the env variables being set, i should use iam - which when I do this, it uses the IAM Instance Profile, which loads the underlying nodes role - not the role specified by the SA.
The SDK supports IRSA (i have updated snowplow collector container image to 1 that has supported SDK of greater than 1.11.704 as per supported versions), and from what I can see from the collector docs, the streams config needs an aws block with either env or iam as values... but I want to use the default credential chain without specifying a method....
If i connect to the container, i can see that the creds are set up as per the SA:
$ env | grep -i aws
AWS_REGION=my-region
AWS_DEFAULT_REGION=my-region
AWS_ROLE_ARN=arn:aws:iam::<redacted>:role/sp-collector-role
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
But when I run the collector, it still uses the nodes IAM Instance profile, and I don't see any activity under sp-collector-role. Is there a way to use the default credential chain? e.g. with aws CLI in on a container in the same service account, I don't specify any credentials, but when I run aws sts get-caller-identity the SDK resolves the IRSA role correctly.
I am studying about CI/CD on AWS (CodePipeline/CodeBuild/CodeDeploy) and found it to be a very good tool for managing a pipeline on the cloud with everything managed (don't even need to install Jenkins on EC2).
I am now reading about container building and deployment. For the build phase, CodeBuild supports building container images. For the deploy phase, while I could find a CodeDeploy solution to ECS cluster, it seems there is no direct CodeDeploy solution for EKS (kindly correct if I am wrong).
May I know if there is a solution to integrate EKS cluster (i.e. the deploy phase can fetch the docker image from ECR or dockerhub and deploy to EKS)? I have come across some ideas using lamda functions to trigger the cluster to perform rolling update of the container image, but I could not find a step-by-step guide on this.
=========================
(Update 17 Sep 2020)
Somehow managed to create a lambda function to trigger an update to EKS to perform rolling update of the k8s deployment. Thanks Prashanna for the source base.
Just want to share the key setups in the process.
(1) Update the lambda execution role to include permission to describe EKS clusters
Create a policy with describe EKS cluster access, and attach to the role:
Policy snippet:
...
......
"Action": "eks:Describe*"
...
......
Or you can create a "EKSFullAccess" policy, and attach to the lambda execution role
(2) Update the k8s ConfigMap, and supplement the lambda execution role ARN to the mapRole section. The corresponding k8s role should be a role that has permission to update container images (say system:masters) used for the k8s deployment
You can edit the map with command like below:
kubectl edit -n kube-system configmap/aws-auth
You don't have to add/update another ConfigMap even if your deployment is in another namespace. It will take effect as well.
Sample lambda function call request and response:
Gitab provides the inbuilt integration of EKS and deployment with the help of Helm charts. If you plan to use other tools Using AWS lambda to update the image is the best bet!
I've added my github project.
Setup a lambda with below code and give RBAC access to this lambda in your EKS. Try invoking the lambda by passing the required information like namespace, deployment, image etc
Lambda for Kubernetes image update
The lambda must require EKS:describecluster policy.
The Lambda role must be provided atleast update image RBAC role in EKS cluster RBAC role setup
Since there's no built-in CI/CD for EKS at the moment, this is going to be a showcase of success/failure stories of a 3rd-party CI/CDs in EKS :) My take: https://github.com/fluxcd/flux
Pros:
Quick to set up initially (until you get into multiple teams/environments)
Tracks and deploys image releases out of box
Possibility to split what to auto-deploy in dev/prod using regex. E.g. all versions to dev, only minor to prod. Or separate tag prefixes for dev/prod.
All state is in git - a good practice to start with
Cons:
Getting complex for further pipeline expansion, e.g. blue-green, canary, auto-rollbacks, etc.
The dashboard is proprietary (weave works product)
Not for on-demand parametrized job runs like traditional CIs.
Setup:
Setup an automated image build (looks like you've already figured out)
Setup flux and helm-operator into the cluster, point them to your "gitops repo"
For each app, create a HelmRelease object that describes a regex of image tag to track
Done. A newly published image tag that falls into regex will be auto-deployed to the cluster and the new version is committed to a gitops repo.
GitLab offers to manage a Kubernetes cluster, which includes (e.g.) creating the namespace, adding some tokens, etc. In GitLab CI jobs, one can directly use the $KUBECONFIG variable for contacting the cluster and e.g. creating deployments using helm. This works like a charm, as long as the GitLab project is public and therefore Docker images hosted by the GitLab project's image registry are publicly accessible.
However, when working with private projects, Kubernetes of course needs an ImagePullSecret to authenticate the GitLab's image registry to retreive the image. As far as I can see, GitLab does not automatically provide an ImagePullSecret for repository access.
Therefore, my question is: What is the best way to access the image repository of private GitLab repositories in a Kubernetes deployment in a GitLab managed deployment environment?
In my opinion, these are the possibilities and why they are not eligible/optimal:
Permanent ImagePullSecret provided by GitLab: When doing a deployment on a GitLab managed Kubernetes cluster, GitLab provides a list of variables to the deployment script (e.g. Helm Chart or kubectl apply -f manifest.yml). As far as I can (not) see, there is a lot of stuff like ServiceAccounts and tokens etc., but no ImagePullSecret - and also no configuration option for enabling ImagePullSecret creation.
Using $CI_JOB_TOKEN: When working with GitLab CI/CD, GitLab provides a variable named $CI_JOB_TOKEN which can be used for uploading Docker images to the registry during job execution. This token expires after the job is done. It could be combined with helm install --wait, but when a rescheduling takes place to a new node which does not have the image yet, the token is expired and the node is not able to download the image anymore. Therefore, this only works right in the moment of deploying the app.
Creating an ImagePullSecret manually and add it to the Deployment or the default ServiceAccount: *This is a manual step, has to be repeated for each individual project and just sucks - we're trying to automate things/GitLab managed Kubernetes clusters is designed for avoiding any manual step.`
Something else but I don't know about it.
So, am I wrong in one of these points? Am I missing a eligible option in this listing?
Again: It's all about a seamless integration with the "Managed Cluster" features of GitLab. I know how to add tokens from GitLab as ImagePullSecrets in Kubernetes, but I want to know how to automate this with the Managed Cluster feature.
There is another way. You can bake the ImagePullSecret in your container runtime configuration. Docker, containerd or CRI-O (Whatever you are using)
Docker
As root run docker login <your-private-registry-url>. Then a file /root/.docker/config.json should be created/updated. Stick that in all your Kubernetes node and make sure your kubelet runs as root (which typically does). Some background info.
The content of the file should look something like this:
{
"auths": {
"my-private-registry": {
"auth": "xxxxxx"
}
},
"HttpHeaders": {
"User-Agent": "Docker-Client/18.09.2 (Linux)"
}
}
Containerd
Configure your containerd.toml file with something like this:
[plugins.cri.registry.auths]
[plugins.cri.registry.auths."https://gcr.io"]
username = ""
password = ""
auth = ""
identitytoken = ""
CRI-O
Specify the global_auth_file option in your crio.conf file.
✌️
Configure your account.
For example, for kubernetes pull image gitlab.com, use the address registry.gitlab.com:
kubectl create secret docker-registry regcred --docker-server=<your-registry-server> --docker-username=<your-name> --docker-password=<your-pword> --docker-email=<your-email>
I have one master node, and it looks like everything ok.
But I open the dashboard and I see many errors.
The system:anonymous is not authorized to perform the list actions within your cluster.
You can solve this case in 2 ways:
1 - Using RBAC Authorization, described in k8s documentation here.
2 - A NOT recommended way is explained in this GitHub thread:
kubectl create clusterrolebinding cluster-system-anonymous --clusterrole=dont-do-this --user=system:anonymous
At the end of the GitHub thread, the Kubernetes team explains why this is not a recommended approach:
[...] granting anonymous clients full access to the Kubernetes API [...] should not be considered as solutions to permission issues
But if you are not in Production and would like to check if it works first, that can help.
You can double check and create the proper Cluster Role and Cluster Role Binding afterwards.
I've followed the instructions to create an EKS cluster in AWS using Terraform.
https://www.terraform.io/docs/providers/aws/guides/eks-getting-started.html
I've also copied the output for connecting to the cluster to ~/.kube/config-eks. I've verified this successfully works as I've been able to connect to the cluster and manually deploy containers. However, now i'm trying to use the Terraform Kubernetes provider to connect to the cluster but cannot seem to be able to configure the provider properly.
I've configured the provider to use my kubectl configuration but when attempting to push a simple configmap, i get an error stating the following:
configmaps is forbidden: User "system:anonymous" cannot create configmaps in the namespace "kube-system"
I know that the provider is picking up part of the configuration but I cannot seem to get it to authenticate. I suspect this is because EKS uses heptio for authentication and i'm not sure if the K8s Go client used by Terraform can support heptio. However, given that Terraform released their AWS EKS support when EKS went GA, I'd doubt that they wouldn't also update their Terraform provider to work with it.
Is it possible to even do this now? Are there alternatives?
Exec auth was added here: https://github.com/kubernetes/client-go/commit/19c591bac28a94ca793a2f18a0cf0f2e800fad04
This is what is utilized for custom authentication plugins and was published Feb 7th.
Right now, Terraform doesn't support the new exec-based authentication provider, but there is an issue open with a workaround: https://github.com/terraform-providers/terraform-provider-kubernetes/issues/161
That said, if I get some free time I will work on a PR.