Let's assume I have an existing GKE cluster that contains all my applications. They were all deployed using different methods. Now I want to deploy some resources to that cluster using Terraform. The trouble is that Terraform doesn't have the cluster in its state file, so it can't interact with it. Another problem is that even if I get that cluster into my state file, Terraform doesn't see all of the resources already created in that cluster. This could lead to conflicts, e.g. trying to deploy two resources with the same name. Is there a way to solve this problem, or do I just have to accept reality and create a new cluster for every new project that I deploy with Terraform?
You can use the terraform import command to import your existing GKE cluster into the Terraform state. Before running it, you need to have the corresponding Terraform configuration for your cluster.
Example of the import command:
terraform import google_container_cluster.<TF_RESOURCE_NAME> projects/<PROJECT_ID>/locations/<YOUR-CLUSTER-ZONE>/clusters/<CLUSTER_NAME>
with the matching Terraform configuration:
resource "google_container_cluster" "<TF_RESOURCE_NAME>" {
  name     = "<CLUSTER_NAME>"
  location = "<YOUR-CLUSTER-ZONE>"
}
The CLUSTER_NAME is the name displayed in your GKE clusters list on Google Cloud Console.
Then you also need to import the cluster's node pool(s) in the same way, using the Terraform google_container_node_pool resource.
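For the node pool, the shape is similar; a sketch, with `<NODE_POOL_NAME>` as a placeholder (check the google_container_node_pool provider docs for the exact import ID format):

```hcl
resource "google_container_node_pool" "<TF_RESOURCE_NAME>" {
  name     = "<NODE_POOL_NAME>"
  location = "<YOUR-CLUSTER-ZONE>"
  cluster  = "<CLUSTER_NAME>"
}
```

and the corresponding import command:

```hcl
# terraform import google_container_node_pool.<TF_RESOURCE_NAME> projects/<PROJECT_ID>/locations/<YOUR-CLUSTER-ZONE>/clusters/<CLUSTER_NAME>/nodePools/<NODE_POOL_NAME>
```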
I've been using terraform for a while now and I have deployed everything into separate clusters. Now due to cost, we'd like to merge the different clusters into one cluster and use kubernetes namespaces.
My desired outcome would be that I could call terraform apply -var="kubernetes_namespace=my-namespace" and it would create the namespace, which could live alongside my other namespaces. However, due to how Terraform remote state is managed, any new deployment overwrites the old one, and I can't have co-existing namespaces.
When I try to redeploy another namespace I get:
namespace = "deploy-pr-image" -> "test-second-branch-pr" # forces replacement
I can see why because it's writing everything to a single workspace file.
terraform {
  backend "s3" {
    bucket               = "my-tf-state"
    key                  = "terraform-services.tfstate"
    region               = "us-west-1"
    workspace_key_prefix = "workspaces"
    #dynamodb_table      = "terraform-state-lock-dynamo"
  }
}
Is there some way to use the workspace/namespace combination to keep Terraform from overwriting my other namespaces?
Since you'll now be merging all your clusters into a single one, it would make sense to have only one backend where you manage the state of that cluster, rather than multiple backends per Kubernetes namespace.
I suggest updating your module or root deployment to be flexible enough to create any number of kubernetes_namespace resources rather than a single one, using count or for_each.
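For example, a single root configuration can manage all namespaces at once with for_each; a sketch (the variable name and defaults below are made up, the namespace values are taken from your plan output):

```hcl
variable "kubernetes_namespaces" {
  type    = set(string)
  default = ["deploy-pr-image", "test-second-branch-pr"]
}

resource "kubernetes_namespace" "this" {
  for_each = var.kubernetes_namespaces

  metadata {
    name = each.value
  }
}
```

Adding a namespace then means adding an entry to the set; Terraform creates the new one without touching the existing ones, and everything lives in one state file.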
I successfully maintain a Kubernetes cluster in DigitalOcean through Terraform. The core cluster configuration is the following:
resource "digitalocean_kubernetes_cluster" "cluster" {
  name     = var.name
  region   = var.region
  version  = var.k8s_version
  vpc_uuid = digitalocean_vpc.network.id

  node_pool {
    name       = "workers"
    size       = var.k8s_worker_size
    node_count = var.k8s_worker_count
  }
}
The problem is, I now need to increase the node size (stored in the variable k8s_worker_size).
If I simply change the variable to a new string, the terraform plan results in a full replace of the kubernetes cluster:
digitalocean_kubernetes_cluster.cluster must be replaced
This is not doable in our production environment.
The correct procedure to perform this operation in DigitalOcean is to:
Create a new node pool, with the required size
Use kubectl drain to remove our pods from the 'old' nodes
Remove the previous node pool.
Of course, by doing this manually in the DigitalOcean console, the Terraform state becomes completely out of sync and is therefore unusable.
Is there a way to perform that operation through terraform?
As an alternative option, is it possible to "manually" update the Terraform state in order to sync it with the real cluster state after I perform the migration manually?
Is there a way to perform that operation through terraform?
There might be some edge cases where there is a solution to this, but since I am not familiar with Kubernetes on DigitalOcean I can't share a specific one.
As an alternative option, is it possible to "manually" update the Terraform state in order to sync it with the real cluster state after I perform the migration manually?
Yes! Do the migration manually as you proposed, then remove the out-of-sync cluster from the state with
terraform state rm digitalocean_kubernetes_cluster.cluster
Please visit the corresponding documentation for state rm and adjust the address if your cluster is in a module, etc. Then use
terraform import digitalocean_kubernetes_cluster.cluster <id of your cluster>
to reimport the cluster. Please consult the documentation for importing the cluster for the details. The documentation mentions something about tagging the default node pool.
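As a third option, the migration itself can be kept inside Terraform by managing the new pool as a separate resource, so that changing the size creates a new pool instead of replacing the cluster. A sketch, not verified against your setup (the resource name workers_v2 and the new variable are assumptions):

```hcl
resource "digitalocean_kubernetes_node_pool" "workers_v2" {
  cluster_id = digitalocean_kubernetes_cluster.cluster.id
  name       = "workers-v2"
  size       = var.k8s_worker_size_v2 # hypothetical variable holding the new size
  node_count = var.k8s_worker_count
}
```

After the new pool is up, drain the old nodes with kubectl and remove the old pool definition in a later apply.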
When I use Terraform to create a cluster in GKE everything works fine and as expected.
After the cluster is created, I want to then use Terraform to deploy a workload.
My issue is how to point at the correct cluster; I'm not sure I understand the best way of achieving this.
I want to automate the retrieval of the cluster's kubeconfig file, the file which is generally stored at ~/.kube/config. This file is updated when users run gcloud container clusters get-credentials manually to authenticate to the correct cluster.
I am aware that if this file is stored on the host machine (the one Terraform is running on), it's possible to point at it to authenticate to the cluster like so:
provider "kubernetes" {
  # leave blank to pick up config from the local system's kubectl config
  config_path = "~/.kube/config"
}
However, running this command to generate the kubeconfig requires the Cloud SDK to be installed on the same machine that Terraform runs on, and its manual execution doesn't exactly seem very elegant.
I am sure I must be missing something in how to achieve this.
Is there a better way to retrieve the kubeconfig file via Terraform from a cluster created by Terraform?
Actually, there is another way to access a freshly created GKE cluster:
data "google_client_config" "client" {}

provider "kubernetes" {
  load_config_file       = false
  host                   = google_container_cluster.main.endpoint
  cluster_ca_certificate = base64decode(google_container_cluster.main.master_auth.0.cluster_ca_certificate)
  token                  = data.google_client_config.client.access_token
}
Basically: in one step, create your cluster and export the kubeconfig file to S3, for example. In another step, retrieve the file and move it to the default folder. Terraform should work following these steps; you can then apply your objects to the previously created cluster.
I deploy using a GitLab CI pipeline. I have one repository with the code for the k8s cluster (infra) and another with the k8s objects; the first pipeline triggers the second.
In my terraform infrastructure, I spin up several Kubernetes clusters based on parameters, then install some standard contents to those Kubernetes clusters using the kubernetes provider.
When I change the parameters and one of the clusters is no longer needed, Terraform is unable to tear it down because the provider and resources are both in the module. I don't see an alternative, however, because I create the Kubernetes cluster in that same module, and the Kubernetes objects are all per cluster.
All solutions I can think of involve adding a bunch of boilerplate to my terraform config. Should I consider generating my terraform config from a script?
I made a git repo that shows exactly the problems I'm having:
https://github.com/bukzor/terraform-gke-k8s-demo
TL;DR
Two solutions:
Create two separate modules with Terraform
Use interpolations and depends_on between the code that creates your Kubernetes cluster and the kubernetes resources:
resource "kubernetes_service" "example" {
  metadata {
    name = "my-service"
  }

  depends_on = [aws_vpc.kubernetes]
}

resource "aws_vpc" "kubernetes" {
  ...
}
When destroying resources
You are encountering a dependency lifecycle issue
PS: I don't know the code you've used to create/provision your Kubernetes cluster, but I guess it looks like this:
Write code for the Kubernetes cluster (creates a VPC)
Apply it
Write code for provisioning Kubernetes (create a Service that creates an ELB)
Apply it
Try to destroy everything => Error
What is happening is that by creating a LoadBalancer Service, Kubernetes will provision an ELB on AWS. But Terraform doesn't know that, and there is no link between the created ELB and any other resource managed by Terraform.
So when terraform tries to destroy the resources in the code, it will try to destroy the VPC. But it can't because there is an ELB inside that VPC that terraform doesn't know about.
The first thing would be to make sure that Terraform "deprovisions" the contents of the Kubernetes cluster first and then destroys the cluster itself.
Two solutions here:
Use different modules so there is no dependency lifecycle. For example, the first module could be k8s-infra and the other could be k8s-resources. The first one manages the skeleton of Kubernetes and is applied first / destroyed last. The second one manages what is inside the cluster and is applied last / destroyed first.
Use the depends_on parameter to write the dependency lifecycle explicitly
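The two-module layout can be sketched like this; the module names and paths are illustrative, and module-level depends_on requires Terraform 0.13 or later:

```hcl
module "k8s-infra" {
  source = "./k8s-infra" # creates the VPC and the cluster itself
}

module "k8s-resources" {
  source     = "./k8s-resources" # everything inside the cluster (Services, etc.)
  depends_on = [module.k8s-infra]
}
```

On destroy, Terraform then tears down the in-cluster resources (and the ELBs they provisioned) before touching the VPC.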
When creating resources
You might also run into a dependency issue when terraform apply cannot create resources even though nothing is applied yet. I'll give another example with PostgreSQL:
Write code to create an RDS PostgreSQL server
Apply it with Terraform
Write code, in the same module, to provision that RDS instance with the postgresql Terraform provider
Apply it with Terraform
Destroy everything
Try to apply everything => ERROR
By debugging Terraform a bit, I've learned that all the providers are initialized at the beginning of the plan/apply, so if one has an invalid config (wrong API keys / unreachable endpoint), then Terraform will fail.
The solution here is to use the -target parameter of a plan/apply command.
Terraform will only initialize providers that are related to the resources that are applied.
Apply the RDS code with the AWS provider: terraform apply -target=aws_db_instance
Apply everything with terraform apply. Because the RDS instance is now reachable, the PostgreSQL provider can initialize itself.
I've followed the instructions to create an EKS cluster in AWS using Terraform.
https://www.terraform.io/docs/providers/aws/guides/eks-getting-started.html
I've also copied the output for connecting to the cluster to ~/.kube/config-eks. I've verified this works, as I've been able to connect to the cluster and manually deploy containers. However, now I'm trying to use the Terraform Kubernetes provider to connect to the cluster but cannot seem to configure the provider properly.
I've configured the provider to use my kubectl configuration, but when attempting to push a simple ConfigMap, I get an error stating the following:
configmaps is forbidden: User "system:anonymous" cannot create configmaps in the namespace "kube-system"
I know that the provider is picking up part of the configuration, but I cannot seem to get it to authenticate. I suspect this is because EKS uses Heptio for authentication, and I'm not sure if the K8s Go client used by Terraform supports Heptio. However, given that Terraform released their AWS EKS support when EKS went GA, I doubt they wouldn't also update their Kubernetes provider to work with it.
Is it possible to even do this now? Are there alternatives?
Exec auth was added here: https://github.com/kubernetes/client-go/commit/19c591bac28a94ca793a2f18a0cf0f2e800fad04
This is what is used for custom authentication plugins, and it was published on Feb 7th.
Right now, Terraform doesn't support the new exec-based authentication provider, but there is an issue open with a workaround: https://github.com/terraform-providers/terraform-provider-kubernetes/issues/161
That said, if I get some free time I will work on a PR.
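For readers landing here later: newer versions of the Kubernetes provider do support an exec block for this. A sketch, assuming the aws_eks_cluster.demo resource name from the EKS getting-started guide and that aws-iam-authenticator is on the PATH:

```hcl
provider "kubernetes" {
  host                   = aws_eks_cluster.demo.endpoint
  cluster_ca_certificate = base64decode(aws_eks_cluster.demo.certificate_authority[0].data)

  # exec-based authentication: the provider shells out to the
  # authenticator to fetch a short-lived token for the cluster
  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws-iam-authenticator"
    args        = ["token", "-i", aws_eks_cluster.demo.name]
  }
}
```

Check the Kubernetes provider documentation for the minimum provider version and the supported api_version values before relying on this.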