I've been using
helm install spinnaker stable/spinnaker -f spinnaker-config.yaml --timeout 1200s --version 2.0.0-rc9
which is the latest Helm chart for Spinnaker, on a freshly created Kubernetes cluster on GKE.
I just installed Helm, so I have the latest version.
The result is that it creates a Job named spinnaker-install-using-hal, and the pod for this job keeps restarting.
The container logs show:
/opt/halyard/scripts/config.sh: line 10: syntax error near unexpected token `newline'
I found that this file is mounted from a ConfigMap named *-spinnaker-halyard-config.
The ConfigMap value for config.sh is set to:
# Spinnaker version
$HAL_COMMAND config version edit --version 1.19.4
# Storage
$HAL_COMMAND config storage gcs edit --project XXXXXXXXXX --json-path /opt/gcs/key.json --bucket <GCS-BUCKET-NAME>
$HAL_COMMAND config storage edit --type gcs
# Docker Registry
$HAL_COMMAND config provider docker-registry enable
if $HAL_COMMAND config provider docker-registry account get dockerhub; then
PROVIDER_COMMAND='edit'
else
PROVIDER_COMMAND='add'
fi
$HAL_COMMAND config provider docker-registry account $PROVIDER_COMMAND dockerhub --address index.docker.io \
\
--repositories library/alpine,library/ubuntu,library/centos,library/nginx
$HAL_COMMAND config provider kubernetes enable
if $HAL_COMMAND config provider kubernetes account get default; then
PROVIDER_COMMAND='edit'
else
PROVIDER_COMMAND='add'
fi
$HAL_COMMAND config provider kubernetes account $PROVIDER_COMMAND default --docker-registries dockerhub \
--context default --service-account true \
\
\
\
\
--omit-namespaces=kube-system,kube-public \
\
\
--provider-version v2
$HAL_COMMAND config deploy edit --account-name default --type distributed \
--location default
# Use Deck to route to Gate
$HAL_COMMAND config security api edit --no-validate --override-base-url /gate
$HAL_COMMAND config features edit --artifacts true
In line #9 it has the placeholder value <GCS-BUCKET-NAME> instead of the real bucket name, which is what caused the script to fail.
I was still not sure what caused that value not to be populated.
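This parse failure is easy to reproduce in isolation: bash reads the unquoted angle brackets as redirection operators, and the trailing > with no target after it triggers exactly this error. A minimal sketch (the placeholder text itself is arbitrary):

```shell
# bash parses "<GCS-BUCKET-NAME" as an input redirection and then finds a
# ">" with no filename after it, yielding the same config.sh error message.
bash -c 'echo --bucket <GCS-BUCKET-NAME>' 2>&1 || true
```

In other words, it is not the missing bucket name per se but the unquoted angle brackets of the unsubstituted placeholder that break the script.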
Found the problem; maybe someone else will find this helpful.
I was following this guide: https://medium.com/velotio-perspectives/know-everything-about-spinnaker-how-to-deploy-using-kubernetes-engine-57090881c78f
It's a great guide, but I guess it's not quite up to date.
Anyway, it says you should configure
storageBucket: $BUCKET
gcs:
  enabled: true
  project: $PROJECT
  jsonKey: '$SA_JSON'
which is incorrect. It should be as follows:
gcs:
  enabled: true
  bucket: $BUCKET
  project: $PROJECT
  jsonKey: '$SA_JSON'
This solved it.
In CI, with the gcp auth plugin, I was using gcloud auth activate-service-account ***@developer.gserviceaccount.com --key-file ***.json before executing kubectl commands.
Now, with gke-gcloud-auth-plugin, I can't find an equivalent way to use a GCP service account key file.
I've installed gke-gcloud-auth-plugin, and gke-gcloud-auth-plugin --version gives me Kubernetes v1.25.2-alpha+ae91c1fc0c443c464a4c878ffa2a4544483c6d1f.
Would you know if there's a way?
I tried to add this command:
kubectl config set-credentials my-user --auth-provider=gcp
But I still get:
error: The gcp auth plugin has been removed. Please use the "gke-gcloud-auth-plugin" kubectl/client-go credential plugin instead.
Your existing gcloud auth activate-service-account step should keep working unchanged, since gke-gcloud-auth-plugin obtains its tokens from the active gcloud credentials. You will, however, need to set an environment variable to enable the new plugin before running get-credentials:
export USE_GKE_GCLOUD_AUTH_PLUGIN=True
gcloud container clusters get-credentials $CLUSTER \
--region $REGION \
--project $PROJECT \
--internal-ip
I would not have expected the environment variable to still be required now that the gcp auth plugin has been removed entirely, but it seems it still is.
Your kubeconfig will end up looking like this once the new auth provider is in use:
...
- name: $NAME
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: gke-gcloud-auth-plugin
      installHint: Install gke-gcloud-auth-plugin for use with kubectl by following
        https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke
      provideClusterInfo: true
I have a GKE cluster running with several persistent disks for storage.
To set up a staging environment, I created a second cluster inside the same project.
Now I want to use the data from the persistent disks of the production cluster in the staging cluster.
I already created persistent disks for the staging cluster. What is the best approach to move the production data over to the disks of the staging cluster?
You can use the open-source tool Velero, which is designed to migrate Kubernetes cluster resources.
Follow these steps to migrate persistent disks between GKE clusters:
Create a GCS bucket:
BUCKET=<your_bucket_name>
gsutil mb gs://$BUCKET/
Create a Google Service Account and store the associated email in a variable for later use:
GSA_NAME=<your_service_account_name>
gcloud iam service-accounts create $GSA_NAME \
--display-name "Velero service account"
SERVICE_ACCOUNT_EMAIL=$(gcloud iam service-accounts list \
--filter="displayName:Velero service account" \
--format 'value(email)')
Create a custom role for the Service Account:
PROJECT_ID=<your_project_id>
ROLE_PERMISSIONS=(
compute.disks.get
compute.disks.create
compute.disks.createSnapshot
compute.snapshots.get
compute.snapshots.create
compute.snapshots.useReadOnly
compute.snapshots.delete
compute.zones.get
storage.objects.create
storage.objects.delete
storage.objects.get
storage.objects.list
)
gcloud iam roles create velero.server \
--project $PROJECT_ID \
--title "Velero Server" \
--permissions "$(IFS=","; echo "${ROLE_PERMISSIONS[*]}")"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member serviceAccount:$SERVICE_ACCOUNT_EMAIL \
--role projects/$PROJECT_ID/roles/velero.server
gsutil iam ch serviceAccount:$SERVICE_ACCOUNT_EMAIL:objectAdmin gs://${BUCKET}
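As an aside, the $(IFS=","; echo "${ROLE_PERMISSIONS[*]}") expression above joins the bash array into the comma-separated string the --permissions flag expects. The trick in isolation, with a shortened permission list for illustration:

```shell
# "${arr[*]}" expands the array with the first character of IFS between
# elements; running it inside $( ) keeps the IFS change local.
ROLE_PERMISSIONS=(compute.disks.get compute.disks.create storage.objects.list)
joined=$(IFS=","; echo "${ROLE_PERMISSIONS[*]}")
echo "$joined"   # → compute.disks.get,compute.disks.create,storage.objects.list
```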
Create a service account key for Velero to use:
gcloud iam service-accounts keys create credentials-velero \
--iam-account $SERVICE_ACCOUNT_EMAIL
Download and install Velero on the source cluster:
wget https://github.com/vmware-tanzu/velero/releases/download/v1.8.1/velero-v1.8.1-linux-amd64.tar.gz
tar -xvzf velero-v1.8.1-linux-amd64.tar.gz
sudo mv velero-v1.8.1-linux-amd64/velero /usr/local/bin/velero
velero install \
--provider gcp \
--plugins velero/velero-plugin-for-gcp:v1.4.0 \
--bucket $BUCKET \
--secret-file ./credentials-velero
Note: The download and installation were performed on a Linux system, which is the OS used by Cloud Shell. If you are managing your GCP resources through a locally installed Cloud SDK, the release and installation process may vary.
Confirm that the velero pod is running:
$ kubectl get pods -n velero
NAME                      READY   STATUS    RESTARTS   AGE
velero-xxxxxxxxxxx-xxxx   1/1     Running   0          11s
Create a backup of the PVs and PVCs:
velero backup create <your_backup_name> --include-resources pvc,pv --selector app.kubernetes.io/<your_label_name>=<your_label_value>
Verify that your backup was successful with no errors or warnings:
$ velero backup describe <your_backup_name> --details
Name:         your_backup_name
Namespace:    velero
Labels:       velero.io/storage-location=default
Annotations:  velero.io/source-cluster-k8s-gitversion=v1.21.6-gke.1503
              velero.io/source-cluster-k8s-major-version=1
              velero.io/source-cluster-k8s-minor-version=21
Phase:     Completed
Errors:    0
Warnings:  0
Now that the Persistent Volumes are backed up, you can proceed with the migration to the destination cluster following these steps:
Authenticate against the destination cluster:
gcloud container clusters get-credentials <your_destination_cluster> --zone <your_zone> --project <your_project>
Install Velero using the same parameters as in the installation step on the source cluster:
velero install \
--provider gcp \
--plugins velero/velero-plugin-for-gcp:v1.4.0 \
--bucket $BUCKET \
--secret-file ./credentials-velero
Confirm that the velero pod is running:
kubectl get pods -n velero
NAME                      READY   STATUS    RESTARTS   AGE
velero-xxxxxxxxxx-xxxxx   1/1     Running   0          19s
To avoid the backup data being overwritten, change the bucket to read-only mode:
kubectl patch backupstoragelocation default -n velero --type merge --patch '{"spec":{"accessMode":"ReadOnly"}}'
Confirm that Velero is able to access the backup in the bucket:
velero backup describe <your_backup_name> --details
Restore the backed up Volumes:
velero restore create --from-backup <your_backup_name>
Confirm that the persistent volumes have been restored on the destination cluster:
kubectl get pvc
NAME                                     STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
redis-data-my-release-redis-master-0     Bound    pvc-ae11172a-13fa-4ac4-95c5-d0a51349d914   8Gi        RWO            standard       79s
redis-data-my-release-redis-replicas-0   Bound    pvc-f2cc7e07-b234-415d-afb0-47dd7b9993e7   8Gi        RWO            standard       79s
redis-data-my-release-redis-replicas-1   Bound    pvc-ef9d116d-2b12-4168-be7f-e30b8d5ccc69   8Gi        RWO            standard       79s
redis-data-my-release-redis-replicas-2   Bound    pvc-65d7471a-7885-46b6-a377-0703e7b01484   8Gi        RWO            standard       79s
Check out this tutorial as a reference.
I want to rotate my Kubernetes SSH access key using the commands from this page:
https://github.com/kubernetes/kops/blob/master/docs/security.md#ssh-access
namely:
kops delete secret --name <clustername> sshpublickey admin
kops create secret --name <clustername> sshpublickey admin -i ~/.ssh/newkey.pub
kops update cluster --yes
When I run the last command, kops update cluster --yes, I get this error:
completed cluster failed validation: spec.spec.kubeProxy.enabled: Forbidden: kube-router requires kubeProxy to be disabled
Does anybody have an idea how I can change the secret key without disabling kubeProxy?
This problem comes from having set
spec:
  networking:
    kuberouter: {}
but not
spec:
  kubeProxy:
    enabled: false
in the cluster spec.
Export the config using kops get -o yaml > myspec.yaml and edit it according to the error above. Then apply the spec using kops replace -f myspec.yaml.
It is considered a best practice to check the above yaml into version control to track any changes done to the cluster configuration.
Once the cluster spec has been amended, the new ssh key should work as well.
What version of Kubernetes are you running? If you are running the latest one, 1.18.xx, the user is not admin but ubuntu.
Another option is to first edit the cluster and set the kubeProxy spec to enabled, run kops update cluster and a rolling update, and then delete and re-create the secret.
For my e2e tests I'm spinning up a separate cluster into which I'd like to import my production TLS certificate. I'm having trouble switching the context between the two clusters (export/get from one and import/apply (in)to the other) because the cluster doesn't seem to be visible.
I extracted a MVCE using a GitLab CI and the following .gitlab-ci.yml where I create a secret for demonstration purposes:
stages:
  - main
  - tear-down

main:
  image: google/cloud-sdk
  stage: main
  script:
    - echo "$GOOGLE_KEY" > key.json
    - gcloud config set project secret-transfer
    - gcloud auth activate-service-account --key-file key.json --project secret-transfer
    - gcloud config set compute/zone us-central1-a
    - gcloud container clusters create secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID --project secret-transfer --machine-type=f1-micro
    - kubectl create secret generic secret-1 --from-literal=key=value
    - gcloud container clusters create secret-transfer-2-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID --project secret-transfer --machine-type=f1-micro
    - gcloud config set container/use_client_certificate True
    - gcloud config set container/cluster secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID
    - kubectl get secret letsencrypt-prod --cluster=secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID -o yaml > secret-1.yml
    - gcloud config set container/cluster secret-transfer-2-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID
    - kubectl apply --cluster=secret-transfer-2-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID -f secret-1.yml

tear-down:
  image: google/cloud-sdk
  stage: tear-down
  when: always
  script:
    - echo "$GOOGLE_KEY" > key.json
    - gcloud config set project secret-transfer
    - gcloud auth activate-service-account --key-file key.json
    - gcloud config set compute/zone us-central1-a
    - gcloud container clusters delete --quiet secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID
    - gcloud container clusters delete --quiet secret-transfer-2-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID
I added --cluster=secret-transfer-[1/2]-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID to the kubectl statements in order to avoid error: no server found for cluster "secret-transfer-1-...-...", but it doesn't change the outcome.
I created a project secret-transfer, activated the Kubernetes API and got a JSON key for the Compute Engine service account which I'm providing in the environment variable GOOGLE_KEY. The output after checkout is
$ echo "$GOOGLE_KEY" > key.json
$ gcloud config set project secret-transfer
Updated property [core/project].
$ gcloud auth activate-service-account --key-file key.json --project secret-transfer
Activated service account credentials for: [131478687181-compute#developer.gserviceaccount.com]
$ gcloud config set compute/zone us-central1-a
Updated property [compute/zone].
$ gcloud container clusters create secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID --project secret-transfer --machine-type=f1-micro
WARNING: In June 2019, node auto-upgrade will be enabled by default for newly created clusters and node pools. To disable it, use the `--no-enable-autoupgrade` flag.
WARNING: Starting in 1.12, new clusters will have basic authentication disabled by default. Basic authentication can be enabled (or disabled) manually using the `--[no-]enable-basic-auth` flag.
WARNING: Starting in 1.12, new clusters will not have a client certificate issued. You can manually enable (or disable) the issuance of the client certificate using the `--[no-]issue-client-certificate` flag.
WARNING: Currently VPC-native is not the default mode during cluster creation. In the future, this will become the default mode and can be disabled using `--no-enable-ip-alias` flag. Use `--[no-]enable-ip-alias` flag to suppress this warning.
WARNING: Starting in 1.12, default node pools in new clusters will have their legacy Compute Engine instance metadata endpoints disabled by default. To create a cluster with legacy instance metadata endpoints disabled in the default node pool, run `clusters create` with the flag `--metadata disable-legacy-endpoints=true`.
WARNING: Your Pod address range (`--cluster-ipv4-cidr`) can accommodate at most 1008 node(s).
This will enable the autorepair feature for nodes. Please see https://cloud.google.com/kubernetes-engine/docs/node-auto-repair for more information on node autorepairs.
Creating cluster secret-transfer-1-9b219ea8-9 in us-central1-a...
...done.
Created [https://container.googleapis.com/v1/projects/secret-transfer/zones/us-central1-a/clusters/secret-transfer-1-9b219ea8-9].
To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/us-central1-a/secret-transfer-1-9b219ea8-9?project=secret-transfer
kubeconfig entry generated for secret-transfer-1-9b219ea8-9.
NAME                           LOCATION       MASTER_VERSION   MASTER_IP       MACHINE_TYPE   NODE_VERSION    NUM_NODES   STATUS
secret-transfer-1-9b219ea8-9   us-central1-a  1.12.8-gke.10    34.68.118.165   f1-micro       1.12.8-gke.10   3           RUNNING
$ kubectl create secret generic secret-1 --from-literal=key=value
secret/secret-1 created
$ gcloud container clusters create secret-transfer-2-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID --project secret-transfer --machine-type=f1-micro
WARNING: In June 2019, node auto-upgrade will be enabled by default for newly created clusters and node pools. To disable it, use the `--no-enable-autoupgrade` flag.
WARNING: Starting in 1.12, new clusters will have basic authentication disabled by default. Basic authentication can be enabled (or disabled) manually using the `--[no-]enable-basic-auth` flag.
WARNING: Starting in 1.12, new clusters will not have a client certificate issued. You can manually enable (or disable) the issuance of the client certificate using the `--[no-]issue-client-certificate` flag.
WARNING: Currently VPC-native is not the default mode during cluster creation. In the future, this will become the default mode and can be disabled using `--no-enable-ip-alias` flag. Use `--[no-]enable-ip-alias` flag to suppress this warning.
WARNING: Starting in 1.12, default node pools in new clusters will have their legacy Compute Engine instance metadata endpoints disabled by default. To create a cluster with legacy instance metadata endpoints disabled in the default node pool, run `clusters create` with the flag `--metadata disable-legacy-endpoints=true`.
WARNING: Your Pod address range (`--cluster-ipv4-cidr`) can accommodate at most 1008 node(s).
This will enable the autorepair feature for nodes. Please see https://cloud.google.com/kubernetes-engine/docs/node-auto-repair for more information on node autorepairs.
Creating cluster secret-transfer-2-9b219ea8-9 in us-central1-a...
...done.
Created [https://container.googleapis.com/v1/projects/secret-transfer/zones/us-central1-a/clusters/secret-transfer-2-9b219ea8-9].
To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/us-central1-a/secret-transfer-2-9b219ea8-9?project=secret-transfer
kubeconfig entry generated for secret-transfer-2-9b219ea8-9.
NAME                           LOCATION       MASTER_VERSION   MASTER_IP       MACHINE_TYPE   NODE_VERSION    NUM_NODES   STATUS
secret-transfer-2-9b219ea8-9   us-central1-a  1.12.8-gke.10    104.198.37.21   f1-micro       1.12.8-gke.10   3           RUNNING
$ gcloud config set container/use_client_certificate True
Updated property [container/use_client_certificate].
$ gcloud config set container/cluster secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID
Updated property [container/cluster].
$ kubectl get secret secret-1 --cluster=secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID -o yaml > secret-1.yml
error: no server found for cluster "secret-transfer-1-9b219ea8-9"
I'm expecting kubectl get secret to work because both clusters exist and the --cluster argument points to the right cluster.
Generally gcloud commands are used to manage gcloud resources and handle how you authenticate with gcloud, whereas kubectl commands affect how you interact with Kubernetes clusters, whether or not they happen to be running on GCP and/or created in GKE. As such, I would avoid doing:
$ gcloud config set container/use_client_certificate True
Updated property [container/use_client_certificate].
$ gcloud config set container/cluster \
secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID
Updated property [container/cluster].
It's not doing what you probably think it's doing (namely, changing anything about how kubectl targets clusters), and might mess with how future gcloud commands work.
Another consequence of gcloud and kubectl being separate, and in particular of kubectl not knowing about your gcloud settings, is that the cluster name from the gcloud perspective is not the same as from the kubectl perspective. When you do things like gcloud config set compute/zone, kubectl doesn't know anything about that; it has to be able to uniquely identify clusters which may have the same name but live in different projects and zones, and may not even be in GKE (like minikube or some other cloud provider). That's why kubectl --cluster=<gke-cluster-name> <some_command> is not going to work, and it's why you're seeing the error message:
error: no server found for cluster "secret-transfer-1-9b219ea8-9"
As @coderanger pointed out, the cluster name that gets generated in your ~/.kube/config file after running gcloud container clusters create ... is more complex; it currently follows a pattern like gke_[project]_[region]_[name].
So you could run commands with kubectl --cluster gke_[project]_[region]_[name] ... (or kubectl --context [project]_[region]_[name] ... which would be more idiomatic, although both will happen to work in this case since you're using the same service account for both clusters), however that requires knowledge of how gcloud generates these strings for context and cluster names.
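As a sketch, the context name for the first cluster in the transcript above would be assembled like this (the gke_... pattern reflects current gcloud behavior and could change between versions):

```shell
# Reconstruct the kubectl context name gcloud generates for a GKE cluster:
# gke_[project]_[location]_[cluster-name]
PROJECT=secret-transfer
LOCATION=us-central1-a
NAME=secret-transfer-1-9b219ea8-9
CONTEXT="gke_${PROJECT}_${LOCATION}_${NAME}"
echo "$CONTEXT"   # → gke_secret-transfer_us-central1-a_secret-transfer-1-9b219ea8-9
```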
An alternative would be to do something like:
$ KUBECONFIG=~/.kube/config1 gcloud container clusters create \
secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID \
--project secret-transfer --machine-type=f1-micro
$ KUBECONFIG=~/.kube/config1 kubectl create secret generic secret-1 --from-literal=key=value
$ KUBECONFIG=~/.kube/config2 gcloud container clusters create \
secret-transfer-2-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID \
--project secret-transfer --machine-type=f1-micro
$ KUBECONFIG=~/.kube/config1 kubectl get secret secret-1 -o yaml > secret-1.yml
$ KUBECONFIG=~/.kube/config2 kubectl apply -f secret-1.yml
By having separate KUBECONFIG files that you control, you don't have to guess any strings. Setting the KUBECONFIG variable when creating a cluster will result in creating that file and gcloud putting the credentials for kubectl to access that cluster in that file. Setting the KUBECONFIG environment variable when running kubectl command will ensure kubectl uses the context as set in that particular file.
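Because KUBECONFIG is an ordinary environment variable, each command sees only the file it is pointed at. A toy illustration with dummy files standing in for real kubeconfigs (no cluster needed):

```shell
# Two stand-in kubeconfig files; setting KUBECONFIG per command scopes
# each invocation to its own file, just as with the kubectl calls above.
printf 'current-context: cluster-one\n' > config1
printf 'current-context: cluster-two\n' > config2
KUBECONFIG=config1 sh -c 'grep current-context "$KUBECONFIG"'   # → current-context: cluster-one
KUBECONFIG=config2 sh -c 'grep current-context "$KUBECONFIG"'   # → current-context: cluster-two
```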
You probably mean to be using --context rather than --cluster. The context sets both the cluster and the user in use. Additionally, the context and cluster (and user) names created by GKE are not just the cluster identifier; they follow the pattern gke_[project]_[region]_[name].
I have a CloudFormation template that deploys EKS and worker nodes along with several other resources like RDS, ES, etc.
I want to write a Terraform template that does the same job as my CloudFormation template.
I am new to Terraform, and I got stuck on the userdata section of the EKS worker nodes' launch configuration.
The section in my cloudformation is:
UserData:
  Fn::Base64:
    !Sub |
      #!/bin/bash
      set -o xtrace
      /etc/eks/bootstrap.sh ${AWS::StackName}-cluster
      /opt/aws/bin/cfn-signal --exit-code $? \
        --stack ${AWS::StackName} \
        --resource SpotNodeGroup \
        --region ${AWS::Region}
I want to replicate the same in Terraform, and I am not sure what the equivalent of /opt/aws/bin/cfn-signal is there.
Any idea what I could use instead?
Please refer to the Terraform guide "https://learn.hashicorp.com/terraform/aws/eks-intro". It will give you more details about the configuration.
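For what it's worth, cfn-signal exists only to satisfy a CloudFormation CreationPolicy/WaitCondition; Terraform tracks resource creation itself, so there is no direct equivalent and the signal call can simply be dropped. A hedged sketch of the remaining user data (my-cluster is a placeholder for the real cluster name), which would then be passed to the launch configuration's user_data argument in Terraform:

```shell
# User data for an EKS worker launch configuration, minus cfn-signal:
# Terraform does not use CloudFormation creation policies, so nothing
# needs to be signalled back.
cat > userdata.sh <<'EOF'
#!/bin/bash
set -o xtrace
/etc/eks/bootstrap.sh my-cluster
EOF
bash -n userdata.sh && echo "userdata syntax OK"
```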