Adding a secret to a freshly created Azure AKS cluster using the Terraform Kubernetes provider fails

I am creating a Kubernetes cluster with the Azure Terraform provider and trying to add a secret to it. The cluster gets created fine, but I get authentication errors against the cluster when creating the secret. I tried two different Terraform Kubernetes provider configurations. Here is the main configuration:
variable "client_id" {}
variable "client_secret" {}
resource "azurerm_resource_group" "rg-example" {
name = "rg-example"
location = "East US"
}
resource "azurerm_kubernetes_cluster" "k8s-example" {
name = "k8s-example"
location = azurerm_resource_group.rg-example.location
resource_group_name = azurerm_resource_group.rg-example.name
dns_prefix = "k8s-example"
default_node_pool {
name = "default"
node_count = 1
vm_size = "Standard_B2s"
}
service_principal {
client_id = var.client_id
client_secret = var.client_secret
}
role_based_access_control {
enabled = true
}
}
resource "kubernetes_secret" "secret_example" {
metadata {
name = "mysecret"
}
data = {
"something" = "super secret"
}
depends_on = [
azurerm_kubernetes_cluster.k8s-example
]
}
provider "azurerm" {
version = "=2.29.0"
features {}
}
output "host" {
value = azurerm_kubernetes_cluster.k8s-example.kube_config.0.host
}
output "cluster_username" {
value = azurerm_kubernetes_cluster.k8s-example.kube_config.0.username
}
output "cluster_password" {
value = azurerm_kubernetes_cluster.k8s-example.kube_config.0.password
}
output "client_key" {
value = azurerm_kubernetes_cluster.k8s-example.kube_config.0.client_key
}
output "client_certificate" {
value = azurerm_kubernetes_cluster.k8s-example.kube_config.0.client_certificate
}
output "cluster_ca_certificate" {
value = azurerm_kubernetes_cluster.k8s-example.kube_config.0.cluster_ca_certificate
}
Here is the first Kubernetes provider configuration using certificates:
provider "kubernetes" {
version = "=1.13.2"
load_config_file = "false"
host = azurerm_kubernetes_cluster.k8s-example.kube_config.0.host
client_certificate = azurerm_kubernetes_cluster.k8s-example.kube_config.0.client_certificate
client_key = azurerm_kubernetes_cluster.k8s-example.kube_config.0.client_key
cluster_ca_certificate = azurerm_kubernetes_cluster.k8s-example.kube_config.0.cluster_ca_certificate
}
And the error I'm receiving:
kubernetes_secret.secret_example: Creating...
Error: Failed to configure client: tls: failed to find any PEM data in certificate input
Here is the second Kubernetes provider configuration, using HTTP basic authentication:
provider "kubernetes" {
version = "=1.13.2"
load_config_file = "false"
host = azurerm_kubernetes_cluster.k8s-example.kube_config.0.host
username = azurerm_kubernetes_cluster.k8s-example.kube_config.0.username
password = azurerm_kubernetes_cluster.k8s-example.kube_config.0.password
}
And the error I'm receiving:
kubernetes_secret.secret_example: Creating...
Error: Post "https://k8s-example-c4a78c03.hcp.eastus.azmk8s.io:443/api/v1/namespaces/default/secrets": x509: certificate signed by unknown authority
ANALYSIS
I checked the outputs of azurerm_kubernetes_cluster.k8s-example and the data seems valid (username, password, host, etc.). Maybe I need an SSL certificate on my Kubernetes cluster, but I am not certain, as I'm new to this. Can someone help me out?

According to this issue in hashicorp/terraform-provider-kubernetes, you need to use base64decode(). The example the author used:
provider "kubernetes" {
host = "${google_container_cluster.k8sexample.endpoint}"
username = "${var.master_username}"
password = "${var.master_password}"
client_certificate = "${base64decode(google_container_cluster.k8sexample.master_auth.0.client_certificate)}"
client_key = "${base64decode(google_container_cluster.k8sexample.master_auth.0.client_key)}"
cluster_ca_certificate = "${base64decode(google_container_cluster.k8sexample.master_auth.0.cluster_ca_certificate)}"
}
That author said they got the same error as you if they left out the base64decode. You can read more about that function here: https://www.terraform.io/docs/configuration/functions/base64decode.html
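Applied to the AKS cluster in the question, that suggestion translates to a provider block along these lines (a sketch, not a tested configuration): the kube_config attributes exported by azurerm_kubernetes_cluster are base64-encoded, which is why the provider cannot find any PEM data without decoding them first.
provider "kubernetes" {
  version          = "=1.13.2"
  load_config_file = "false"

  host = azurerm_kubernetes_cluster.k8s-example.kube_config.0.host

  # kube_config returns base64-encoded certificates and keys,
  # so decode them before handing them to the provider as PEM.
  client_certificate     = base64decode(azurerm_kubernetes_cluster.k8s-example.kube_config.0.client_certificate)
  client_key             = base64decode(azurerm_kubernetes_cluster.k8s-example.kube_config.0.client_key)
  cluster_ca_certificate = base64decode(azurerm_kubernetes_cluster.k8s-example.kube_config.0.cluster_ca_certificate)
}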

Related

Terraform kubectl provider error: failed to create kubernetes rest client for read of resource

I have a Terraform config that (among other resources) creates a Google Kubernetes Engine cluster on Google Cloud. I'm using the kubectl provider to add YAML manifests for a ManagedCertificate and a FrontendConfig, since these are not part of the kubernetes or google providers.
This works as expected when applying the Terraform config from my local machine, but when I try to execute it in our CI pipeline, I get the following error for both of the kubectl_manifest resources:
Error: failed to create kubernetes rest client for read of resource: Get "http://localhost/api?timeout=32s": dial tcp 127.0.0.1:80: connect: connection refused
Since I'm only facing this issue during CI, my first guess is that the service account is missing the right scopes, but as far as I can tell, all scopes are present. Any suggestions and ideas are greatly appreciated!
The provider is trying to connect to localhost, which means you either need to provide a proper kubeconfig file or set the cluster connection details dynamically in Terraform.
You didn't mention how you are setting up authentication, but here are two ways:
Poor way
resource "null_resource" "deploy-app" {
provisioner "local-exec" {
interpreter = ["/bin/bash", "-c"]
command = <<EOT
kubectl apply -f myapp.yaml ./temp/kube-config.yaml;
EOT
}
# will run always, its bad
triggers = {
always_run = "${timestamp()}"
}
depends_on = [
local_file.kube_config
]
}
resource "local_file" "kube_config" {
content = var.my_kube_config # pass the config file from ci variable
filename = "${path.module}/temp/kube-config.yaml"
}
Proper way
data "google_container_cluster" "cluster" {
name = "your_cluster_name"
}
data "google_client_config" "current" {
}
provider "kubernetes" {
host = data.google_container_cluster.cluster.endpoint
token = data.google_client_config.current.access_token
cluster_ca_certificate = base64decode(
data.google_container_cluster.cluster.master_auth[0].cluster_ca_certificate
)
}
data "kubectl_file_documents" "app_yaml" {
content = file("myapp.yaml")
}
resource "kubectl_manifest" "app_installer" {
for_each = data.kubectl_file_documents.app_yaml.manifests
yaml_body = each.value
}
If the cluster is created in the same module, then the provider should be:
provider "kubernetes" {
load_config_file = "false"
host = google_container_cluster.my_cluster.endpoint
client_certificate = google_container_cluster.my_cluster.master_auth.0.client_certificate
client_key = google_container_cluster.my_cluster.master_auth.0.client_key
cluster_ca_certificate = google_container_cluster.my_cluster.master_auth.0.cluster_ca_certificate
}
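Note that, as the other answers on this page show, the master_auth certificate attributes come back base64-encoded, so the same-module provider most likely needs base64decode() around them as well; a sketch under that assumption:
provider "kubernetes" {
  load_config_file = "false"
  host             = google_container_cluster.my_cluster.endpoint

  # master_auth returns base64-encoded PEM data, so decode it first
  client_certificate     = base64decode(google_container_cluster.my_cluster.master_auth.0.client_certificate)
  client_key             = base64decode(google_container_cluster.my_cluster.master_auth.0.client_key)
  cluster_ca_certificate = base64decode(google_container_cluster.my_cluster.master_auth.0.cluster_ca_certificate)
}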
Fixed the issue by adding load_config_file = false to the kubectl provider config. My provider config now looks like this:
data "google_client_config" "default" {}
provider "kubernetes" {
host = "https://${endpoint from GKE}"
token = data.google_client_config.default.access_token
cluster_ca_certificate = base64decode(CA certificate from GKE)
}
provider "kubectl" {
host = "https://${endpoint from GKE}"
token = data.google_client_config.default.access_token
cluster_ca_certificate = base64decode(CA certificate from GKE)
load_config_file = false
}
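With the providers configured this way, the kubectl_manifest resources from the question can consume the YAML directly. A minimal sketch, assuming the ManagedCertificate manifest lives in a hypothetical managed-certificate.yaml next to the configuration:
resource "kubectl_manifest" "managed_certificate" {
  # managed-certificate.yaml is a placeholder name for the ManagedCertificate manifest
  yaml_body = file("${path.module}/managed-certificate.yaml")
}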

Automated .kube/config file created with Terraform GKE module doesn't work

We are using the Terraform module below to create the GKE cluster and the local config file, but the kubectl command doesn't work; the terminal just keeps waiting. We then have to manually run gcloud container clusters get-credentials to fetch and set up the local config credentials after cluster creation, and only then does the kubectl command work. It doesn't feel elegant to execute the gcloud command after the Terraform setup, so please help to correct what's wrong.
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 3.42.0"
    }
    google-beta = {
      source  = "hashicorp/google-beta"
      version = ">= 3.0.0"
    }
  }
}

provider "google" {
  credentials = var.gcp_creds
  project     = var.project_id
  region      = var.region
  zone        = var.zone
}

provider "google-beta" {
  credentials = var.gcp_creds
  project     = var.project_id
  region      = var.region
  zone        = var.zone
}

module "gke" {
  source     = "terraform-google-modules/kubernetes-engine/google//modules/beta-private-cluster"
  project_id = var.project_id
  name       = "cluster"
  regional   = true
  region     = var.region
  zones      = ["${var.region}-a", "${var.region}-b", "${var.region}-c"]

  network           = module.vpc.network_name
  subnetwork        = module.vpc.subnets_names[0]
  ip_range_pods     = "gke-pod-ip-range"
  ip_range_services = "gke-service-ip-range"

  horizontal_pod_autoscaling = true
  enable_private_nodes       = true
  master_ipv4_cidr_block     = "${var.control_plane_cidr}"

  node_pools = [
    {
      name           = "node-pool"
      machine_type   = "${var.machine_type}"
      node_locations = "${var.region}-a,${var.region}-b,${var.region}-c"
      min_count      = "${var.node_pools_min_count}"
      max_count      = "${var.node_pools_max_count}"
      disk_size_gb   = "${var.node_pools_disk_size_gb}"
      auto_repair    = true
      auto_upgrade   = true
    },
  ]
}

# Configure the authentication and authorisation of the cluster
module "gke_auth" {
  source     = "terraform-google-modules/kubernetes-engine/google//modules/auth"
  depends_on = [module.gke]

  project_id   = var.project_id
  location     = module.gke.location
  cluster_name = module.gke.name
}

# Define a local file that will store the necessary info such as certificate, user and endpoint to access the cluster
resource "local_file" "kubeconfig" {
  content  = module.gke_auth.kubeconfig_raw
  filename = "~/.kube/config"
}
UPDATE:
We found that the automated .kube/config file was created with the wrong public IP of the API server, whereas the one created with the gcloud command contains the correct IP of the API server. Any guesses why the Terraform one is fetching the wrong public IP?

Helm - Kubernetes cluster unreachable: the server has asked for the client to provide credentials

I'm trying to deploy an EKS cluster with self-managed node groups using Terraform. While I can deploy the cluster with add-ons, VPC, subnets and all other resources, it always fails at the Helm releases:
Error: Kubernetes cluster unreachable: the server has asked for the client to provide credentials
with module.eks-ssp-kubernetes-addons.module.ingress_nginx[0].helm_release.nginx[0]
on .terraform/modules/eks-ssp-kubernetes-addons/modules/kubernetes-addons/ingress-nginx/main.tf line 19, in resource "helm_release" "nginx":
resource "helm_release" "nginx" {
This error repeats for metrics_server, lb_ingress, argocd, but cluster-autoscaler throws:
Warning: Helm release "cluster-autoscaler" was created but has a failed status.
with module.eks-ssp-kubernetes-addons.module.cluster_autoscaler[0].helm_release.cluster_autoscaler[0]
on .terraform/modules/eks-ssp-kubernetes-addons/modules/kubernetes-addons/cluster-autoscaler/main.tf line 1, in resource "helm_release" "cluster_autoscaler":
resource "helm_release" "cluster_autoscaler" {
My main.tf looks like this:
terraform {
  backend "remote" {}

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 3.66.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = ">= 2.7.1"
    }
    helm = {
      source  = "hashicorp/helm"
      version = ">= 2.4.1"
    }
  }
}

data "aws_eks_cluster" "cluster" {
  name = module.eks-ssp.eks_cluster_id
}

data "aws_eks_cluster_auth" "cluster" {
  name = module.eks-ssp.eks_cluster_id
}

provider "aws" {
  access_key = "xxx"
  secret_key = "xxx"
  region     = "xxx"

  assume_role {
    role_arn = "xxx"
  }
}

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  token                  = data.aws_eks_cluster_auth.cluster.token
}

provider "helm" {
  kubernetes {
    host                   = data.aws_eks_cluster.cluster.endpoint
    token                  = data.aws_eks_cluster_auth.cluster.token
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  }
}
My eks.tf looks like this:
module "eks-ssp" {
source = "github.com/aws-samples/aws-eks-accelerator-for-terraform"
# EKS CLUSTER
tenant = "DevOpsLabs2b"
environment = "dev-test"
zone = ""
terraform_version = "Terraform v1.1.4"
# EKS Cluster VPC and Subnet mandatory config
vpc_id = "xxx"
private_subnet_ids = ["xxx","xxx", "xxx", "xxx"]
# EKS CONTROL PLANE VARIABLES
create_eks = true
kubernetes_version = "1.19"
# EKS SELF MANAGED NODE GROUPS
self_managed_node_groups = {
self_mg = {
node_group_name = "DevOpsLabs2b"
subnet_ids = ["xxx","xxx", "xxx", "xxx"]
create_launch_template = true
launch_template_os = "bottlerocket" # amazonlinux2eks or bottlerocket or windows
custom_ami_id = "xxx"
public_ip = true # Enable only for public subnets
pre_userdata = <<-EOT
yum install -y amazon-ssm-agent \
systemctl enable amazon-ssm-agent && systemctl start amazon-ssm-agent \
EOT
disk_size = 10
instance_type = "t2.small"
desired_size = 2
max_size = 10
min_size = 0
capacity_type = "" # Optional Use this only for SPOT capacity as capacity_type = "spot"
k8s_labels = {
Environment = "dev-test"
Zone = ""
WorkerType = "SELF_MANAGED_ON_DEMAND"
}
additional_tags = {
ExtraTag = "t2x-on-demand"
Name = "t2x-on-demand"
subnet_type = "public"
}
create_worker_security_group = false # Creates a dedicated sec group for this Node Group
},
}
}
enable_amazon_eks_vpc_cni = true
amazon_eks_vpc_cni_config = {
addon_name = "vpc-cni"
addon_version = "v1.7.5-eksbuild.2"
service_account = "aws-node"
resolve_conflicts = "OVERWRITE"
namespace = "kube-system"
additional_iam_policies = []
service_account_role_arn = ""
tags = {}
}
enable_amazon_eks_kube_proxy = true
amazon_eks_kube_proxy_config = {
addon_name = "kube-proxy"
addon_version = "v1.19.8-eksbuild.1"
service_account = "kube-proxy"
resolve_conflicts = "OVERWRITE"
namespace = "kube-system"
additional_iam_policies = []
service_account_role_arn = ""
tags = {}
}
#K8s Add-ons
enable_aws_load_balancer_controller = true
enable_metrics_server = true
enable_cluster_autoscaler = true
enable_aws_for_fluentbit = true
enable_argocd = true
enable_ingress_nginx = true
depends_on = [module.eks-ssp.self_managed_node_groups]
}
The OP confirmed in a comment that the problem was resolved:
Of course. I think I found the issue. Doing "kubectl get svc" throws: "An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:iam::xxx:user/terraform_deploy is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::xxx:user/terraform_deploy"
Solved it by using my actual role, that's crazy. No idea why it was calling itself.
For a similar problem, see also this issue.
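In other words, the assume_role block in the aws provider was pointing at an IAM user ARN rather than a role that could be assumed. A rough sketch of a corrected provider, with hypothetical values, keeps assume_role only if it actually references an IAM role:
provider "aws" {
  region = "eu-central-1" # hypothetical region; credentials supplied via environment or CI

  # role_arn must reference an IAM role the caller is allowed to assume,
  # not an IAM user (which is what produced the sts:AssumeRole error above)
  assume_role {
    role_arn = "arn:aws:iam::123456789012:role/terraform-deploy" # hypothetical role ARN
  }
}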
I solved this error by adding dependencies to the Helm installations.
depends_on waits for the referenced step to complete successfully, and only then does the Helm module run.
module "nginx-ingress" {
depends_on = [module.eks, module.aws-load-balancer-controller]
source = "terraform-module/release/helm"
...}
module "aws-load-balancer-controller" {
depends_on = [module.eks]
source = "terraform-module/release/helm"
...}
module "helm_autoscaler" {
depends_on = [module.eks]
source = "terraform-module/release/helm"
...}

Terraform GKE x509: certificate signed by unknown authority

Following this tutorial,
https://learn.hashicorp.com/tutorials/terraform/gke?in=terraform/kubernetes
I have deployed a GKE cluster in GCloud.
Now when I try to schedule a deployment following this link,
https://learn.hashicorp.com/tutorials/terraform/kubernetes-provider
It fails with,
kubernetes_deployment.nginx: Creating...
Error: Failed to create deployment: Post "https://<ip>/apis/apps/v1/namespaces/default/deployments": x509: certificate signed by unknown authority
on kubernetes.tf line 21, in resource "kubernetes_deployment" "nginx":
21: resource "kubernetes_deployment" "nginx" {
My kubernetes.tf looks like this,
terraform {
  required_providers {
    kubernetes = {
      source = "hashicorp/kubernetes"
    }
  }
}

provider "kubernetes" {
  load_config_file = false

  host     = google_container_cluster.primary.endpoint
  username = var.gke_username
  password = var.gke_password

  client_certificate     = google_container_cluster.primary.master_auth.0.client_certificate
  client_key             = google_container_cluster.primary.master_auth.0.client_key
  cluster_ca_certificate = google_container_cluster.primary.master_auth.0.cluster_ca_certificate
}

resource "kubernetes_deployment" "nginx" {
  metadata {
    name = "scalable-nginx-example"
    labels = {
      App = "ScalableNginxExample"
    }
  }

  spec {
    replicas = 2
    selector {
      match_labels = {
        App = "ScalableNginxExample"
      }
    }
    template {
      metadata {
        labels = {
          App = "ScalableNginxExample"
        }
      }
      spec {
        container {
          image = "nginx:1.7.8"
          name  = "example"

          port {
            container_port = 80
          }

          resources {
            limits {
              cpu    = "0.5"
              memory = "512Mi"
            }
            requests {
              cpu    = "250m"
              memory = "50Mi"
            }
          }
        }
      }
    }
  }
}
I am using macOS to run Terraform. Any help is appreciated.
Please note that kubectl get pods --all-namespaces is working fine, so I don't think it's an issue with my kubeconfig.
Thanks,
Arun
It was because the certificates were base64-encoded; changing the provider section to the snippet below got rid of the issue.
provider "kubernetes" {
load_config_file = false
host = google_container_cluster.primary.endpoint
username = var.gke_username
password = var.gke_password
client_certificate = base64decode(google_container_cluster.primary.master_auth.0.client_certificate)
client_key = base64decode(google_container_cluster.primary.master_auth.0.client_key)
cluster_ca_certificate = base64decode(google_container_cluster.primary.master_auth.0.cluster_ca_certificate)
}

Error when creating a namespace with the Terraform Kubernetes provider

I'm struggling to create a namespace with the Kubernetes provider.
This is the simple Terraform code I'm using:
provider "kubernetes" {
host = "https://ocp-test-1.srv.xxxx.it:8443"
username = "admin"
password = "admin"
load_config_file = "false" # when you wish not to load the local config file
}
resource "kubernetes_namespace" "gfexample" {
metadata {
annotations = {
name = "exampleannotation"
}
labels = {
mylabel = "labelvalue"
}
name = "terraformspace"
}
}
And here is the error:
kubernetes_namespace.gfexample: Creating...
Error: namespaces is forbidden: User "system:anonymous" cannot create namespaces at the cluster scope: no RBAC policy matched
on create_nm.tf line 14, in resource "kubernetes_namespace" "gfexample":
14: resource "kubernetes_namespace" "gfexample" {
Any suggestion will be welcome.
Gian Filippo
Finally I found the solution:
client_certificate     = file("/terraform/certificates/admin.crt")
client_key             = file("/terraform/certificates/admin.key")
cluster_ca_certificate = file("/terraform/certificates/ca.crt")
This worked fine. I found the certificates mentioned above under /etc/origin/master (I'm running OpenShift 3.11).
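For context, a sketch of what the full provider block looks like with those certificate arguments in place (host and file paths taken from the snippets above):
provider "kubernetes" {
  host             = "https://ocp-test-1.srv.xxxx.it:8443"
  load_config_file = "false"

  # certificates copied from the OpenShift master under /etc/origin/master
  client_certificate     = file("/terraform/certificates/admin.crt")
  client_key             = file("/terraform/certificates/admin.key")
  cluster_ca_certificate = file("/terraform/certificates/ca.crt")
}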