I want to pass a YAML file (#cloud-config) as the user-data of a GCP VM instance using Go. How can I do that?
Metadata: &compute.Metadata{
Items: []*compute.Items{
{Key: proto.String("user-data"),
Value: ...userdata.yaml...?
},
},
},
Compute Engine instances REST reference: https://cloud.google.com/compute/docs/reference/rest/v1/instances
Container-Optimized OS cloud-init: https://cloud.google.com/container-optimized-os/docs/how-to/create-configure-instance
Creating a GCP VM instance using Go: https://cloud.google.com/compute/docs/instances/create-start-instance#publicimage
I hope this isn't too niche for someone to help with.
I'd like to create an instance template using the create command.
When I run this:
gcloud compute instance-templates create jenkins-slave-instance-template-tmp1 --network-interface=network=default,network-tier=PREMIUM ...
I get the following networkInterfaces (from the describe command):
networkInterfaces:
- kind: compute#networkInterface
name: nic0
network: https://www.googleapis.com/compute/v1/projects/*******/global/networks/default
But when I create it from the GCP console UI, I get it (as I actually need it):
networkInterfaces:
- accessConfigs:
- kind: compute#accessConfig
name: External NAT
networkTier: PREMIUM
type: ONE_TO_ONE_NAT
kind: compute#networkInterface
name: nic0
network: https://www.googleapis.com/compute/v1/projects/*******/global/networks/default
How can I add the accessConfigs to the instance template at creation time? I can do the same from the UI, but the equivalent gcloud compute instance-templates create command creates it without the accessConfigs entry.
Thanks for your help.
You can create an instance template with the gcloud CLI by using the instance-templates create command with default parameters.
gcloud compute instance-templates create INSTANCE_TEMPLATE_NAME
gcloud compute uses the following default values if you do not provide explicit template configuration/properties:
Machine type: the default machine type, for example n1-standard-1
Image: the latest Debian image
Boot disk: a new standard boot disk named after the VM
Network: the default VPC network
IP address: an ephemeral external IP address
If you run the gcloud compute instance-templates describe INSTANCE_TEMPLATE_NAME command, you will see accessConfigs in the network interface parameters.
If you want to provide the template configuration explicitly (machine type, boot disk, image properties, service account, etc.),
you can specify them through the gcloud CLI, but if you want the accessConfigs parameters you should omit the network-interface flag
(--network-interface=network=default,network-tier=PREMIUM,nic-type=GVNIC) when running the instance-templates create command.
For example:
gcloud compute instance-templates create example-template-1 --machine-type=e2-standard-4 --image-family=debian-10 --image-project=debian-cloud --boot-disk-size=250GB
The above command will create the instance template with the mentioned configuration and give you the accessConfigs parameters, since you didn't specify the network-interface flag.
Refer to the documentation for creating a new instance template.
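If you prefer to create the template through the Compute Engine API instead of gcloud, you can also set the accessConfigs block from the question explicitly. A rough sketch using the Python client library (the project ID and template name are placeholders, not values from your setup):

from googleapiclient import discovery

# Build the Compute Engine API client (uses Application Default Credentials).
compute = discovery.build("compute", "v1")

template_body = {
    "name": "example-template-1",  # placeholder template name
    "properties": {
        "machineType": "e2-standard-4",
        "disks": [{
            "boot": True,
            "autoDelete": True,
            "initializeParams": {
                "sourceImage": "projects/debian-cloud/global/images/family/debian-10"
            },
        }],
        "networkInterfaces": [{
            "network": "global/networks/default",
            # The same accessConfigs entry the console adds for an ephemeral external IP.
            "accessConfigs": [{
                "type": "ONE_TO_ONE_NAT",
                "name": "External NAT",
                "networkTier": "PREMIUM",
            }],
        }],
    },
}

compute.instanceTemplates().insert(project="my-project", body=template_body).execute()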
I've deployed an EKS cluster using the Terraform module terraform-aws-modules/eks/aws. I’ve deployed Airflow on this EKS cluster with Helm (the official chart, not the community one), and I’ve annotated worker pods with the following IRSA:
serviceAccount:
# Specifies whether a ServiceAccount should be created
create: true
# The name of the ServiceAccount to use.
# If not set and create is true, a name is generated using the release name
name: "airflow-worker"
# Annotations to add to worker kubernetes service account.
annotations:
eks.amazonaws.com/role-arn: "arn:aws:iam::123456789:role/airflow-worker"
This airflow-worker role has a policy attached to it to enable it to assume a different role.
I have a Python program that assumes this other role and performs some S3 operations. I can exec into a running BashOperator pod, open a Python shell, assume this role, and issue the exact same S3 operations successfully.
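For reference, the relevant part of the program is roughly this (a simplified sketch; the bucket name is a placeholder):

import boto3

# Assume the cross-account role using whatever base credentials the pod has.
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::987654321:role/TheOtherRole",
    RoleSessionName="airflow-task",
)["Credentials"]

# Use the temporary credentials for the S3 operations.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print(s3.list_objects_v2(Bucket="my-example-bucket", MaxKeys=5))  # placeholder bucket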
But, when I create a Docker image with this program and try to call it from a KubernetesPodOperator task, I see the following error:
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the AssumeRole operation:
User: arn:aws:sts::123456789:assumed-role/core_node_group-eks-node-group-20220726041042973200000001/i-089c64b96cf7878d8 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::987654321:role/TheOtherRole
I don't really know what this role is, but I believe it was created automatically by the Terraform module. However, when I kubectl describe one of these failed pods, I see this:
Environment:
...
...
...
AWS_ROLE_ARN: arn:aws:iam::123456789:role/airflow-worker
My questions:
Why is this role being used, and not the IRSA airflow-worker that I've specified in the Helm chart's values?
What even is this role? It seems the Terraform module creates a number of roles automatically, but it is very difficult to tell what their purpose is or where they're used from the Terraform documentation.
How am I able to assume this role and do everything the Dockerized Python program does when in a shell in the pod? (It turns out this is because other operators, such as BashOperator, do use the airflow-worker role. Just not KubernetesPodOperator.)
What is the AWS_ROLE_ARN environment variable, and why isn't it being used?
Happy to provide more context if it's helpful.
In order to use the AWS role in an EKS pod, you need to add this trust policy to it:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"AWS": " arn:aws:iam::123456789:role/airflow-worker”
},
"Action": "sts:AssumeRole"
}
]
}
Here you can find some information about AWS Security Token Service (STS).
Tasks running in the worker pod will use the role automatically, but a new pod you create is separate from your worker pod, so you need to make it use the service account that has the role attached in order for the AWS role credentials file to be added to the pod.
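A quick way to confirm which identity a pod actually has is to check the caller identity from inside it; with the service account attached it should report the airflow-worker role rather than the node group's instance role. A minimal sketch:

import boto3

# When the pod uses the annotated service account, AWS_ROLE_ARN and
# AWS_WEB_IDENTITY_TOKEN_FILE are injected and the default credential chain
# exchanges the token for the airflow-worker role automatically; otherwise
# this prints the node group's instance role.
print(boto3.client("sts").get_caller_identity()["Arn"])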
This is pretty much by design. The non-KubernetesPodOperator operators use an auto-generated pod template file that has the Helm chart values as default properties, while the KubernetesPodOperator needs its own pod template file. That, or it needs to essentially create one by passing arguments to KubernetesPodOperator(...).
I fixed the ultimate issue by passing service_account="airflow-worker" to KubernetesPodOperator(...).
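For anyone hitting the same thing, here is a minimal sketch of that fix in the DAG; in the cncf.kubernetes provider the argument is named service_account_name, and the image, namespace, and task id below are placeholders:

from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator

run_s3_job = KubernetesPodOperator(
    task_id="run_s3_job",            # placeholder task id
    name="run-s3-job",
    namespace="airflow",             # namespace the Airflow workers run in
    image="my-registry/my-s3-job:latest",  # placeholder image
    # Run the pod under the IRSA-annotated service account so it gets the
    # airflow-worker role instead of the node group's instance role.
    service_account_name="airflow-worker",
)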
How can I use a secret object created from a Google Cloud JSON key file in a service account? I have MiniKF on a VM with Kubeflow installed. I am trying to build a container from a Jupyter notebook in the MiniKF Kubernetes cluster. The notebook has access to GCP through a PodDefault, but the Kaniko container that the notebook starts automatically can't access GCP.
The code in jupyter notebook for building the container is as follows:
IMAGE_NAME="mnist_training_kf_pipeline"
TAG="latest" # "v_$(date +%Y%m%d_%H%M%S)"
GCR_IMAGE="gcr.io/{PROJECT_ID}/{IMAGE_NAME}:{TAG}".format(
PROJECT_ID=PROJECT_ID,
IMAGE_NAME=IMAGE_NAME,
TAG=TAG
)
builder = kfp.containers._container_builder.ContainerBuilder(
gcs_staging=GCS_BUCKET + "/kfp_container_build_staging")
image_name = kfp.containers.build_image_from_working_dir(
image_name=GCR_IMAGE,
working_dir='./tmp/components/mnist_training/',
builder=builder
)
I get the error:
Error: error resolving source context: dialing: google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.
Usage:
executor [flags]
A pod whose name starts with kaniko gets created and then fails because it can't access Google Cloud Storage.
The proof that the Jupyter notebook is able to use my secret object "user-gcp-sa" is that the above code does stage files on GCS.
Say we have a couple of clusters on Amazon EKS. We have a new user or new machine that needs .kube/config to be populated with the latest cluster info.
Is there some easy way to get the context info from our clusters on EKS and put it in the .kube/config file? Something like:
eksctl init "cluster-1-ARN" "cluster-2-ARN"
So after some web sleuthing, I heard about:
aws eks update-kubeconfig
I tried that, and I get this:
$ aws eks update-kubeconfig
usage: aws [options] <command> <subcommand> [<subcommand> ...] [parameters]
To see help text, you can run:

  aws help
  aws <command> help
  aws <command> <subcommand> help

aws: error: argument --name is required
I would think it would just update for all clusters then, but it doesn't. So I put in the cluster names/ARNs, like so:
aws eks update-kubeconfig --name arn:aws:eks:us-west-2:913xxx371:cluster/eks-cluster-1
aws eks update-kubeconfig --name arn:aws:eks:us-west-2:913xxx371:cluster/ignitecluster
but then I get:
kbc stderr: An error occurred (ResourceNotFoundException) when calling the DescribeCluster operation: No cluster found for name: arn:aws:eks:us-west-2:913xxx371:cluster/eks-cluster-1.
kbc stderr: An error occurred (ResourceNotFoundException) when calling the DescribeCluster operation: No cluster found for name: arn:aws:eks:us-west-2:913xxx371:cluster/ignitecluster.
hmmm this is kinda dumb 😒 those cluster names exist..so what 🤷 do I do now
So yeah, those clusters I named don't actually exist (update-kubeconfig expects the cluster name, not the ARN). I discovered that via:
aws eks list-clusters
Ultimately, however, I still feel strongly that someone should make a tool that can just update your config with all the clusters that exist instead of having you name them.
So to do this programmatically, it would be:
aws eks list-clusters | jq -r '.clusters[]' | while read -r c; do
  aws eks update-kubeconfig --name "$c"
done
In my case, I was working with two AWS environments. My ~/.aws/credentials were pointing to one and had to be changed to point to the correct account. Once you change the account details, you can verify the change by running the following command:
eksctl get clusters
and then set the kubeconfig using the command below, after verifying the region:
aws eks --region your_aws_region update-kubeconfig --name your_eks_cluster
I've created clusters using the kops command. For each cluster I have to create a hosted zone and add its nameservers to the DNS provider. To create a hosted zone, I created one for a subdomain of the existing hosted zone in AWS (example.com) using the following command:
ID=$(uuidgen) && aws route53 create-hosted-zone --name subdomain1.example.com --caller-reference $ID | jq .DelegationSet.NameServers
The nameservers returned by the above command are included in a newly created file, subdomain1.json, with the following content:
{
"Comment": "Create a subdomain NS record in the parent domain",
"Changes": [
{
"Action": "CREATE",
"ResourceRecordSet": {
"Name": "subdomain1.example.com",
"Type": "NS",
"TTL": 300,
"ResourceRecords": [
{
"Value": "ns-1.awsdns-1.co.uk"
},
{
"Value": "ns-2.awsdns-2.org"
},
{
"Value": "ns-3.awsdns-3.com"
},
{
"Value": "ns-4.awsdns-4.net"
}
]
}
}
]
}
To get the parent-zone-id, I've used the following command:
aws route53 list-hosted-zones | jq '.HostedZones[] | select(.Name=="example.com.") | .Id'
To apply the subdomain NS records to the parent hosted zone-
aws route53 change-resource-record-sets --hosted-zone-id <parent-zone-id> --change-batch file://subdomain1.json
Then I created a cluster using the kops command-
kops create cluster --name=subdomain1.example.com --master-count=1 --master-zones ap-southeast-1a --node-count=1 --zones=ap-southeast-1a --authorization=rbac --state=s3://example.com --kubernetes-version=1.11.0 --yes
I'm able to create a cluster, validate it and get its nodes. By using the same procedure, I created one more cluster (subdomain2.example.com).
I've set aliases for the two clusters using these commands-
kubectl config set-context subdomain1 --cluster=subdomain1.example.com --user=subdomain1.example.com
kubectl config set-context subdomain2 --cluster=subdomain2.example.com --user=subdomain2.example.com
To set up federation between these two clusters, I've used these commands-
kubectl config use-context subdomain1
kubectl create clusterrolebinding admin-to-cluster-admin-binding --clusterrole=cluster-admin --user=admin
kubefed init interstellar --host-cluster-context=subdomain1 --dns-provider=aws-route53 --dns-zone-name=example.com
The output of the kubefed init command should show the federation control plane coming up, but for me it just keeps printing "waiting for the federation control plane to come up..." and it never does. What might be the error?
I've followed the following tutorial to create 2 clusters.
https://gist.github.com/arun-gupta/02f534c7720c8e9c9a875681b430441a
There was a problem with the default image used for the federation API server and controller manager binaries. By default, the kubefed init command uses the image mentioned below-
"gcr.io/k8s-jkns-e2e-gce-federation/fcp-amd64:v0.0.0-master_$Format:%h$".
But this image is old and no longer available; the federation control plane tries to pull the image and fails. This is the error I was getting.
To rectify it, build an fcp image of your own, push it to some repository, and use that image in the kubefed init command. Below are the instructions to execute (run all of these commands from "$GOPATH/src/k8s.io/kubernetes/federation")-
To create the fcp image and push it to a repository-
docker load -i _output/release-images/amd64/fcp-amd64.tar
docker tag gcr.io/google_containers/fcp-amd64:v1.9.0-alpha.2.60_430416309f9e58-dirty REGISTRY/REPO/IMAGENAME[:TAG]
docker push REGISTRY/REPO/IMAGENAME[:TAG]
Now create a federation control plane with the following command-
_output/dockerized/bin/linux/amd64/kubefed init myfed --host-cluster-context=HOST_CLUSTER_CONTEXT --image=REGISTRY/REPO/IMAGENAME[:TAG] --dns-provider="PROVIDER" --dns-zone-name="YOUR_ZONE" --dns-provider-config=/path/to/provider.conf