How to migrate Persistent Volumes between GKE clusters in the same project?

I have a GKE cluster running with several persistent disks for storage.
To set up a staging environment, I created a second cluster inside the same project.
Now I want to use the data from the persistent disks of the production cluster in the staging cluster.
I already created persistent disks for the staging cluster. What is the best approach to move the production data over to the disks of the staging cluster?

You can use the open-source tool Velero, which is designed to back up and migrate Kubernetes cluster resources.
Follow these steps to migrate a persistent disk between GKE clusters:
Create a GCS bucket:
BUCKET=<your_bucket_name>
gsutil mb gs://$BUCKET/
Create a Google Service Account and store the associated email in a variable for later use:
GSA_NAME=<your_service_account_name>
gcloud iam service-accounts create $GSA_NAME \
--display-name "Velero service account"
SERVICE_ACCOUNT_EMAIL=$(gcloud iam service-accounts list \
--filter="displayName:Velero service account" \
--format 'value(email)')
Create a custom role for the Service Account:
PROJECT_ID=<your_project_id>
ROLE_PERMISSIONS=(
compute.disks.get
compute.disks.create
compute.disks.createSnapshot
compute.snapshots.get
compute.snapshots.create
compute.snapshots.useReadOnly
compute.snapshots.delete
compute.zones.get
storage.objects.create
storage.objects.delete
storage.objects.get
storage.objects.list
)
gcloud iam roles create velero.server \
--project $PROJECT_ID \
--title "Velero Server" \
--permissions "$(IFS=","; echo "${ROLE_PERMISSIONS[*]}")"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member serviceAccount:$SERVICE_ACCOUNT_EMAIL \
--role projects/$PROJECT_ID/roles/velero.server
gsutil iam ch serviceAccount:$SERVICE_ACCOUNT_EMAIL:objectAdmin gs://${BUCKET}
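As a quick sanity check (not part of the original steps), you can confirm the binding took effect by inspecting the bucket's IAM policy:
gsutil iam get gs://${BUCKET}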
Grant access to Velero:
gcloud iam service-accounts keys create credentials-velero \
--iam-account $SERVICE_ACCOUNT_EMAIL
Download and install Velero on the source cluster:
wget https://github.com/vmware-tanzu/velero/releases/download/v1.8.1/velero-v1.8.1-linux-amd64.tar.gz
tar -xvzf velero-v1.8.1-linux-amd64.tar.gz
sudo mv velero-v1.8.1-linux-amd64/velero /usr/local/bin/velero
velero install \
--provider gcp \
--plugins velero/velero-plugin-for-gcp:v1.4.0 \
--bucket $BUCKET \
--secret-file ./credentials-velero
Note: The download and installation were performed on a Linux system, which is the OS used by Cloud Shell. If you are managing your GCP resources from a different OS via the Cloud SDK, the release file and installation process may vary.
Confirm that the velero pod is running:
$ kubectl get pods -n velero
NAME READY STATUS RESTARTS AGE
velero-xxxxxxxxxxx-xxxx 1/1 Running 0 11s
Create a backup of the PVs and PVCs:
velero backup create <your_backup_name> --include-resources pvc,pv --selector app.kubernetes.io/<your_label_name>=<your_label_value>
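For example, for the Redis release shown in the restore output below, the backup command might look like this (the label key and value are illustrative and depend on how your app is labeled):
velero backup create redis-backup --include-resources pvc,pv --selector app.kubernetes.io/name=redis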
Verify that your backup was successful with no errors or warnings:
$ velero backup describe <your_backup_name> --details
Name: your_backup_name
Namespace: velero
Labels: velero.io/storage-location=default
Annotations: velero.io/source-cluster-k8s-gitversion=v1.21.6-gke.1503
velero.io/source-cluster-k8s-major-version=1
velero.io/source-cluster-k8s-minor-version=21
Phase: Completed
Errors: 0
Warnings: 0
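Since the GCP plugin backs up the volumes as Persistent Disk snapshots, you can optionally also confirm on the GCP side that the snapshots were created (a supplementary check, not part of the original steps):
gcloud compute snapshots list --project $PROJECT_ID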
Now that the Persistent Volumes are backed up, you can proceed with the migration to the destination cluster following these steps:
Authenticate to the destination cluster:
gcloud container clusters get-credentials <your_destination_cluster> --zone <your_zone> --project <your_project>
Install Velero using the same parameters as in step 5 of the first part:
velero install \
--provider gcp \
--plugins velero/velero-plugin-for-gcp:v1.4.0 \
--bucket $BUCKET \
--secret-file ./credentials-velero
Confirm that the velero pod is running:
kubectl get pods -n velero
NAME READY STATUS RESTARTS AGE
velero-xxxxxxxxxx-xxxxx 1/1 Running 0 19s
To prevent the backup data from being overwritten, switch the backup storage location to read-only mode:
kubectl patch backupstoragelocation default -n velero --type merge --patch '{"spec":{"accessMode":"ReadOnly"}}'
Confirm Velero is able to access the backup in the bucket:
velero backup describe <your_backup_name> --details
Restore the backed-up volumes:
velero restore create --from-backup <your_backup_name>
Confirm that the persistent volumes have been restored on the destination cluster:
kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
redis-data-my-release-redis-master-0 Bound pvc-ae11172a-13fa-4ac4-95c5-d0a51349d914 8Gi RWO standard 79s
redis-data-my-release-redis-replicas-0 Bound pvc-f2cc7e07-b234-415d-afb0-47dd7b9993e7 8Gi RWO standard 79s
redis-data-my-release-redis-replicas-1 Bound pvc-ef9d116d-2b12-4168-be7f-e30b8d5ccc69 8Gi RWO standard 79s
redis-data-my-release-redis-replicas-2 Bound pvc-65d7471a-7885-46b6-a377-0703e7b01484 8Gi RWO standard 79s
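If you intend to keep taking new backups from this cluster afterwards, you can switch the backup storage location back to read-write mode by reversing the earlier patch:
kubectl patch backupstoragelocation default -n velero --type merge --patch '{"spec":{"accessMode":"ReadWrite"}}'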
Check out this tutorial as a reference.

Related

How can I change my Velero credentials after they were reset?

I have an Azure Kubernetes cluster with Velero installed. A Service Principal was created for Velero, per option 1 of the instructions.
Velero was working fine until the credentials for the Service Principal were reset. Now the scheduled backups are failing.
NAME STATUS ERRORS WARNINGS CREATED EXPIRES STORAGE LOCATION SELECTOR
daily-entire-cluster-20210727030055 Failed 0 0 2021-07-26 23:00:55 -0000 13d default <none>
How can I update the secret for Velero?
1. Update credentials file
First, update your credentials file (for most providers this is credentials-velero; the contents are described in the plugin installation instructions: AWS, Azure, GCP).
2. Update secret
Now update the Velero secret. On Linux:
kubectl patch -n velero secret cloud-credentials -p '{"data": {"cloud": "'$(base64 -w 0 credentials-velero)'"}}'
patch tells kubectl to update a resource by merging the provided data
-n velero tells kubectl to use the velero namespace
secret is the resource type
cloud-credentials is the name of the secret used by Velero to store credentials
-p specifies that the next word is the patch data. It's more common to patch using JSON rather than YAML
'{"data": {"cloud": "<your-base64-encoded-secret-will-go-here>"}}' this is the JSON data that matches the existing structure of the Velero secret in Kubernetes. <your-base64-encoded-secret-will-go-here> is a placeholder for the command we'll insert.
$(base64 -w 0 credentials-velero) reads the file credentials-velero in the current directory, base64-encodes its contents with line wrapping of the output disabled (-w 0), and inserts the result into the data.
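To verify the patch, you can decode the secret and compare it with your local file; restarting the Velero deployment afterwards should ensure the new credentials are picked up (a suggested precaution; the mounted secret may also refresh on its own after a delay):
kubectl get secret cloud-credentials -n velero -o jsonpath='{.data.cloud}' | base64 --decode
kubectl rollout restart deployment/velero -n velero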

Migrating Kubernetes cluster to other OpenStack region

I am trying to migrate a Kubernetes cluster (master and worker instances) to a different OpenStack region. I managed to start the cluster after some simple modifications (changed cloud-config, node labels). There is one problem left - storage. In this setup I use the OpenStack internal cloud provider, which manages Cinder volumes as PVs for pods. The new region uses a different zone name and volume types. Also, the volume IDs have changed. It is not possible to change these values by modifying the SC and PV definitions via e.g. kubectl.
I wonder if it is possible to change this directly in etcd database?
So far, I have tried to modify the PV definition, but it appears that Kubernetes also inserts additional characters, so modifying it is not so straightforward.
What I did:
Got the PV definition from etcd and saved it to a file:
docker run --rm -it --net host -v /etc/kubernetes:/etc/kubernetes k8s.gcr.io/etcd:3.4.3-0 etcdctl --cert /etc/kubernetes/pki/etcd/healthcheck-client.crt --key /etc/kubernetes/pki/etcd/healthcheck-client.key --cacert /etc/kubernetes/pki/etcd/ca.crt --endpoints https://kube-dev02-master01:2379 get /registry/persistentvolumes/pvc-1625baa0-e36c-4e2b-ad3d-0dfecc910ae0 --print-value-only > pv1.txt
Changed the region name, zone name and volume ID (with vi).
Loaded modified value to etcd:
docker run --rm -it --net host -v /etc/kubernetes:/etc/kubernetes k8s.gcr.io/etcd:3.4.3-0 etcdctl --cert /etc/kubernetes/pki/etcd/healthcheck-client.crt --key /etc/kubernetes/pki/etcd/healthcheck-client.key --cacert /etc/kubernetes/pki/etcd/ca.crt --endpoints https://kube-dev02-master01:2379 put /registry/persistentvolumes/pvc-1625baa0-e36c-4e2b-ad3d-0dfecc910ae0 "$(cat pv1.txt)"
Checked the PV with kubectl:
[kubeadmin@kube-dev02-master01 ~]$ kubectl get pv
Error from server: illegal base64 data at input byte 5
So it seems that something could be wrong with encoding, but I do not know where.
Kubernetes v1.17.5, etcd v3.4.3-0, installed with kubeadm.
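A note on the base64 error: Kubernetes stores objects in etcd in a binary protobuf encoding rather than plain text, so round-tripping the value through a text file and shell substitution corrupts the non-printable bytes. One possible approach (a sketch, assuming the third-party tool auger from github.com/jpbetz/auger is available on the master) is to decode the value to YAML, edit it, and re-encode it, reusing the same etcdctl TLS and endpoint flags as above:
etcdctl <tls-and-endpoint-flags> get /registry/persistentvolumes/<pv-name> --print-value-only | auger decode > pv1.yaml
# edit pv1.yaml with vi, then re-encode; etcdctl put reads the value from stdin when it is omitted
auger encode < pv1.yaml | etcdctl <tls-and-endpoint-flags> put /registry/persistentvolumes/<pv-name>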

Vault Injector with externalVaultAddr permission denied

I'm trying to connect the Vault injector from my AWS Kubernetes cluster and log in using injector.externalVaultAddr="http://my-aws-instance-ip:8200".
Installation:
helm install vault \
--set='injector.externalVaultAddr=http://my-aws-instance-ip:8200' \
/path/to/my/vault-helm
After following the necessary steps here: https://www.hashicorp.com/blog/injecting-vault-secrets-into-kubernetes-pods-via-a-sidecar/ and customising this:
vault write auth/kubernetes/config \
token_reviewer_jwt="$(cat kube_token_from_my_k8s_cluster)" \
kubernetes_host="$KUBERNETES_HOST:443" \
kubernetes_ca_cert=@downloaded_ca_crt_from_my_k8s_cluster.crt
Now, after adding the deployment app.yaml with the Vault annotations to read secret/helloworld, my pod doesn't start, and the vault-agent-init container shows this error:
NAME READY STATUS RESTARTS AGE
app-aaaaaaa-bbbb 1/1 Running 0 34m
app-aaaaaaa-cccc 0/2 Init:0/1 0 34m
vault-agent-injector-xxxxxx-zzzzz 1/1 Running 0 35m
$ kubectl logs -f app-aaaaaaa-cccc -c vault-agent-init
...
URL: PUT http://my-aws-instance-ip:8200/v1/auth/kubernetes/login
Code: 403. Errors:
* permission denied" backoff=2.769289902
I have also tried doing it manually from my local machine:
$ export VAULT_ADDR="http://my-aws-instance-ip:8200"
$ curl --request POST \
--data "{\"role\": \"myapp\", \"jwt\": \"$(cat kube_token_from_my_k8s_cluster)\"}" \
$VAULT_ADDR/v1/auth/kubernetes/login
# RESPONSE OUTPUT
{"errors":["permission denied"]}
Try curling the Kubernetes API server with the values you used to set up the Kubernetes auth method in Vault.
vault write auth/kubernetes/config \
token_reviewer_jwt="$TOKEN_REVIEW_JWT" \
kubernetes_host="$KUBE_HOST" \
kubernetes_ca_cert="$KUBE_CA_CERT"
curl --cacert <ca-cert-file> -H "Authorization: Bearer $TOKEN_REVIEW_JWT" $KUBE_HOST
If that doesn't work, Vault won't be able to communicate with the cluster to get tokens verified.
I had this problem when trying to integrate Vault with a Rancher-managed cluster: $KUBE_HOST was pointing to the Rancher proxy, so I needed to change it to the IP of the cluster and extract the token and CA cert from the service account I was using.
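For reference, a sketch of how those values can be extracted from a service account (the secret name is a placeholder for the token secret of the service account you use, e.g. vault-auth in the HashiCorp guide):
TOKEN_REVIEW_JWT=$(kubectl get secret <vault-auth-token-secret> -o jsonpath='{.data.token}' | base64 --decode)
KUBE_CA_CERT=$(kubectl get secret <vault-auth-token-secret> -o jsonpath='{.data.ca\.crt}' | base64 --decode)
KUBE_HOST=$(kubectl config view --raw --minify --flatten -o jsonpath='{.clusters[].cluster.server}')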

Kubernetes on Azure AKS secrets for Docker Hub missing after autoscaling

I am dealing with some issues on Kubernetes on Azure (AKS) using the autoscaler and secrets for pulling images from Docker Hub.
I created the secret in my applications namespace while having 3 nodes enabled (initial cluster status).
kubectl create secret docker-registry mysecret --docker-server=https://index.docker.io/v1/ --docker-username=<docker_id> --docker-password=<docker_password> -n mynamespace
I deploy my application using the imagePullSecrets option after specifying the image's URL:
imagePullSecrets:
- name: mysecret
After deploying the application, I created the autoscaler rule:
kubectl autoscale deployment mydeployment --cpu-percent=50 --min=1 --max=20 -n mynamespace
All new pods pull the image correctly. However, at some point, when a new Kubernetes node is deployed automatically, all new pods requiring the Docker Hub-based image cannot start:
Failed to pull image "mydocherhubaccount/myimage:mytag": rpc error: code = Unknown desc = Error response from daemon: pull access denied for mydocherhubaccount/myimage:mytag, repository does not exist or may require 'docker login': denied: requested access to the resource is denied
Is there anything I am missing here? I waited for 15 minutes and recreated the pods, but it did not help.
I use Kubernetes 1.15.5 on Azure AKS. The cluster was created using the following command.
az aks create -g myresourcegroup -n mynamespace --location eastus --kubernetes-version 1.15.5 --node-count 3 --node-osdisk-size 100 --node-vm-size Standard_D4_v3 --enable-vmss --enable-cluster-autoscaler --min-count 3 --max-count 5
I appreciate any help provided. This really has me stuck.
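As a first diagnostic step, it may help to confirm that the secret still exists in the namespace, that the failing pods actually reference it, and what the pull error events say (a generic sketch; the pod name is a placeholder):
kubectl get secret mysecret -n mynamespace
kubectl get pod <failing_pod> -n mynamespace -o jsonpath='{.spec.imagePullSecrets}'
kubectl describe pod <failing_pod> -n mynamespace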

How to get Kubernetes secret from one cluster to apply to another?

For my e2e tests I'm spinning up a separate cluster into which I'd like to import my production TLS certificate. I'm having trouble switching the context between the two clusters (export/get from one and import/apply (in)to the other) because the cluster doesn't seem to be visible.
I extracted an MVCE using GitLab CI and the following .gitlab-ci.yml, where I create a secret for demonstration purposes:
stages:
- main
- tear-down
main:
  image: google/cloud-sdk
  stage: main
  script:
  - echo "$GOOGLE_KEY" > key.json
  - gcloud config set project secret-transfer
  - gcloud auth activate-service-account --key-file key.json --project secret-transfer
  - gcloud config set compute/zone us-central1-a
  - gcloud container clusters create secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID --project secret-transfer --machine-type=f1-micro
  - kubectl create secret generic secret-1 --from-literal=key=value
  - gcloud container clusters create secret-transfer-2-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID --project secret-transfer --machine-type=f1-micro
  - gcloud config set container/use_client_certificate True
  - gcloud config set container/cluster secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID
  - kubectl get secret secret-1 --cluster=secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID -o yaml > secret-1.yml
  - gcloud config set container/cluster secret-transfer-2-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID
  - kubectl apply --cluster=secret-transfer-2-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID -f secret-1.yml
tear-down:
  image: google/cloud-sdk
  stage: tear-down
  when: always
  script:
  - echo "$GOOGLE_KEY" > key.json
  - gcloud config set project secret-transfer
  - gcloud auth activate-service-account --key-file key.json
  - gcloud config set compute/zone us-central1-a
  - gcloud container clusters delete --quiet secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID
  - gcloud container clusters delete --quiet secret-transfer-2-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID
I added secret-transfer-[1/2]-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID before kubectl statements in order to avoid error: no server found for cluster "secret-transfer-1-...-...", but it doesn't change the outcome.
I created a project secret-transfer, activated the Kubernetes API and got a JSON key for the Compute Engine service account, which I'm providing in the environment variable GOOGLE_KEY. The output after checkout is:
$ echo "$GOOGLE_KEY" > key.json
$ gcloud config set project secret-transfer
Updated property [core/project].
$ gcloud auth activate-service-account --key-file key.json --project secret-transfer
Activated service account credentials for: [131478687181-compute@developer.gserviceaccount.com]
$ gcloud config set compute/zone us-central1-a
Updated property [compute/zone].
$ gcloud container clusters create secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID --project secret-transfer --machine-type=f1-micro
WARNING: In June 2019, node auto-upgrade will be enabled by default for newly created clusters and node pools. To disable it, use the `--no-enable-autoupgrade` flag.
WARNING: Starting in 1.12, new clusters will have basic authentication disabled by default. Basic authentication can be enabled (or disabled) manually using the `--[no-]enable-basic-auth` flag.
WARNING: Starting in 1.12, new clusters will not have a client certificate issued. You can manually enable (or disable) the issuance of the client certificate using the `--[no-]issue-client-certificate` flag.
WARNING: Currently VPC-native is not the default mode during cluster creation. In the future, this will become the default mode and can be disabled using `--no-enable-ip-alias` flag. Use `--[no-]enable-ip-alias` flag to suppress this warning.
WARNING: Starting in 1.12, default node pools in new clusters will have their legacy Compute Engine instance metadata endpoints disabled by default. To create a cluster with legacy instance metadata endpoints disabled in the default node pool, run `clusters create` with the flag `--metadata disable-legacy-endpoints=true`.
WARNING: Your Pod address range (`--cluster-ipv4-cidr`) can accommodate at most 1008 node(s).
This will enable the autorepair feature for nodes. Please see https://cloud.google.com/kubernetes-engine/docs/node-auto-repair for more information on node autorepairs.
Creating cluster secret-transfer-1-9b219ea8-9 in us-central1-a...
...done.
Created [https://container.googleapis.com/v1/projects/secret-transfer/zones/us-central1-a/clusters/secret-transfer-1-9b219ea8-9].
To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/us-central1-a/secret-transfer-1-9b219ea8-9?project=secret-transfer
kubeconfig entry generated for secret-transfer-1-9b219ea8-9.
NAME LOCATION MASTER_VERSION MASTER_IP MACHINE_TYPE NODE_VERSION NUM_NODES STATUS
secret-transfer-1-9b219ea8-9 us-central1-a 1.12.8-gke.10 34.68.118.165 f1-micro 1.12.8-gke.10 3 RUNNING
$ kubectl create secret generic secret-1 --from-literal=key=value
secret/secret-1 created
$ gcloud container clusters create secret-transfer-2-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID --project secret-transfer --machine-type=f1-micro
WARNING: In June 2019, node auto-upgrade will be enabled by default for newly created clusters and node pools. To disable it, use the `--no-enable-autoupgrade` flag.
WARNING: Starting in 1.12, new clusters will have basic authentication disabled by default. Basic authentication can be enabled (or disabled) manually using the `--[no-]enable-basic-auth` flag.
WARNING: Starting in 1.12, new clusters will not have a client certificate issued. You can manually enable (or disable) the issuance of the client certificate using the `--[no-]issue-client-certificate` flag.
WARNING: Currently VPC-native is not the default mode during cluster creation. In the future, this will become the default mode and can be disabled using `--no-enable-ip-alias` flag. Use `--[no-]enable-ip-alias` flag to suppress this warning.
WARNING: Starting in 1.12, default node pools in new clusters will have their legacy Compute Engine instance metadata endpoints disabled by default. To create a cluster with legacy instance metadata endpoints disabled in the default node pool, run `clusters create` with the flag `--metadata disable-legacy-endpoints=true`.
WARNING: Your Pod address range (`--cluster-ipv4-cidr`) can accommodate at most 1008 node(s).
This will enable the autorepair feature for nodes. Please see https://cloud.google.com/kubernetes-engine/docs/node-auto-repair for more information on node autorepairs.
Creating cluster secret-transfer-2-9b219ea8-9 in us-central1-a...
...done.
Created [https://container.googleapis.com/v1/projects/secret-transfer/zones/us-central1-a/clusters/secret-transfer-2-9b219ea8-9].
To inspect the contents of your cluster, go to: https://console.cloud.google.com/kubernetes/workload_/gcloud/us-central1-a/secret-transfer-2-9b219ea8-9?project=secret-transfer
kubeconfig entry generated for secret-transfer-2-9b219ea8-9.
NAME LOCATION MASTER_VERSION MASTER_IP MACHINE_TYPE NODE_VERSION NUM_NODES STATUS
secret-transfer-2-9b219ea8-9 us-central1-a 1.12.8-gke.10 104.198.37.21 f1-micro 1.12.8-gke.10 3 RUNNING
$ gcloud config set container/use_client_certificate True
Updated property [container/use_client_certificate].
$ gcloud config set container/cluster secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID
Updated property [container/cluster].
$ kubectl get secret secret-1 --cluster=secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID -o yaml > secret-1.yml
error: no server found for cluster "secret-transfer-1-9b219ea8-9"
I'm expecting kubectl get secret to work because both clusters exist and the --cluster argument points to the right cluster.
Generally, gcloud commands are used to manage GCP resources and handle how you authenticate with gcloud, whereas kubectl commands affect how you interact with Kubernetes clusters, whether or not they happen to be running on GCP and/or were created in GKE. As such, I would avoid doing:
$ gcloud config set container/use_client_certificate True
Updated property [container/use_client_certificate].
$ gcloud config set container/cluster \
secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID
Updated property [container/cluster].
It's not doing what you probably think it's doing (namely, changing anything about how kubectl targets clusters), and might mess with how future gcloud commands work.
Another consequence of gcloud and kubectl being separate, and in particular of kubectl not knowing intimately about your gcloud settings, is that the cluster name from the gcloud perspective is not the same as from the kubectl perspective. When you do things like gcloud config set compute/zone, kubectl doesn't know anything about that, so it has to be able to identify clusters uniquely, even ones that may have the same name but be in different projects and zones, and that may not even be in GKE (like minikube or some other cloud provider). That's why kubectl --cluster=<gke-cluster-name> <some_command> is not going to work, and it's why you're seeing the error message:
error: no server found for cluster "secret-transfer-1-9b219ea8-9"
As @coderanger pointed out, the cluster name that gets generated in your ~/.kube/config file after doing gcloud container clusters create ... has a more complex name, which currently has a pattern something like gke_[project]_[region]_[name].
So you could run commands with kubectl --cluster gke_[project]_[region]_[name] ... (or kubectl --context [project]_[region]_[name] ... which would be more idiomatic, although both will happen to work in this case since you're using the same service account for both clusters), however that requires knowledge of how gcloud generates these strings for context and cluster names.
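Rather than deriving those strings by hand, you can list the names gcloud actually wrote into your kubeconfig:
kubectl config get-contexts
kubectl config view -o jsonpath='{.contexts[*].name}'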
An alternative would be to do something like:
$ KUBECONFIG=~/.kube/config1 gcloud container clusters create \
secret-transfer-1-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID \
--project secret-transfer --machine-type=f1-micro
$ KUBECONFIG=~/.kube/config1 kubectl create secret generic secret-1 --from-literal=key=value
$ KUBECONFIG=~/.kube/config2 gcloud container clusters create \
secret-transfer-2-$CI_COMMIT_SHORT_SHA-$CI_PIPELINE_IID \
--project secret-transfer --machine-type=f1-micro
$ KUBECONFIG=~/.kube/config1 kubectl get secret secret-1 -o yaml > secret-1.yml
$ KUBECONFIG=~/.kube/config2 kubectl apply -f secret-1.yml
By having separate KUBECONFIG files that you control, you don't have to guess any strings. Setting the KUBECONFIG variable when creating a cluster will result in creating that file, with gcloud putting the credentials for kubectl to access that cluster in that file. Setting the KUBECONFIG environment variable when running a kubectl command will ensure kubectl uses the context set in that particular file.
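As a side note, KUBECONFIG also accepts a colon-delimited list of files, so if you later want to see both clusters from a single shell you could merge the two views (an optional convenience, not required for the approach above):
KUBECONFIG=~/.kube/config1:~/.kube/config2 kubectl config get-contexts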
You probably mean to be using --context rather than --cluster. The context sets both the cluster and the user in use. Additionally, the context and cluster (and user) names created by GKE are not just the cluster identifier; they follow the pattern gke_[project]_[region]_[name].