ERROR controller.provisioning Could not schedule pod, incompatible with provisioner "default", incompatible requirements, key karpenter.sh/provisioner - kubernetes

I read through the karpenter document at https://karpenter.sh/v0.16.1/getting-started/getting-started-with-terraform/#install-karpenter-helm-chart. I followed instructions step by step. I got errors at the end.
kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller
DEBUG controller.provisioning Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.topologySpreadConstraints = {"maxSkew":1,"topologyKey":"topology.kubernetes.io/zone","whenUnsatisfiable":"ScheduleAnyway","labelSelector":{"matchLabels":{"app.kubernetes.io/instance":"karpenter","app.kubernetes.io/name":"karpenter"}}} {"commit": "b157d45", "pod": "karpenter/karpenter-5755bb5b54-rh65t"}
2022-09-10T00:13:13.122Z
ERROR controller.provisioning Could not schedule pod, incompatible with provisioner "default", incompatible requirements, key karpenter.sh/provisioner-name, karpenter.sh/provisioner-name DoesNotExist not in karpenter.sh/provisioner-name In [default] {"commit": "b157d45", "pod": "karpenter/karpenter-5755bb5b54-rh65t"}
Below is the source code:
cat main.tf
terraform {
required_version = "~> 1.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 4.0"
}
helm = {
source = "hashicorp/helm"
version = "~> 2.5"
}
kubectl = {
source = "gavinbunney/kubectl"
version = "~> 1.14"
}
}
}
provider "aws" {
region = "us-east-1"
}
locals {
cluster_name = "karpenter-demo"
# Used to determine correct partition (i.e. - `aws`, `aws-gov`, `aws-cn`, etc.)
partition = data.aws_partition.current.partition
}
data "aws_partition" "current" {}
module "vpc" {
# https://registry.terraform.io/modules/terraform-aws-modules/vpc/aws/latest
source = "terraform-aws-modules/vpc/aws"
version = "3.14.4"
name = local.cluster_name
cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b", "us-east-1c"]
private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
public_subnets = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
enable_nat_gateway = true
single_nat_gateway = true
one_nat_gateway_per_az = false
public_subnet_tags = {
"kubernetes.io/cluster/${local.cluster_name}" = "shared"
"kubernetes.io/role/elb" = 1
}
private_subnet_tags = {
"kubernetes.io/cluster/${local.cluster_name}" = "shared"
"kubernetes.io/role/internal-elb" = 1
}
}
module "eks" {
# https://registry.terraform.io/modules/terraform-aws-modules/eks/aws/latest
source = "terraform-aws-modules/eks/aws"
version = "18.29.0"
cluster_name = local.cluster_name
cluster_version = "1.22"
vpc_id = module.vpc.vpc_id
subnet_ids = module.vpc.private_subnets
# Required for Karpenter role below
enable_irsa = true
node_security_group_additional_rules = {
ingress_nodes_karpenter_port = {
description = "Cluster API to Node group for Karpenter webhook"
protocol = "tcp"
from_port = 8443
to_port = 8443
type = "ingress"
source_cluster_security_group = true
}
}
node_security_group_tags = {
# NOTE - if creating multiple security groups with this module, only tag the
# security group that Karpenter should utilize with the following tag
# (i.e. - at most, only one security group should have this tag in your account)
"karpenter.sh/discovery/${local.cluster_name}" = local.cluster_name
}
# Only need one node to get Karpenter up and running.
# This ensures core services such as VPC CNI, CoreDNS, etc. are up and running
# so that Karpenter can be deployed and start managing compute capacity as required
eks_managed_node_groups = {
initial = {
instance_types = ["m5.large"]
# Not required nor used - avoid tagging two security groups with same tag as well
create_security_group = false
min_size = 1
max_size = 1
desired_size = 1
iam_role_additional_policies = [
"arn:${local.partition}:iam::aws:policy/AmazonSSMManagedInstanceCore", # Required by Karpenter
"arn:${local.partition}:iam::aws:policy/AmazonEKSWorkerNodePolicy",
"arn:${local.partition}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly", #for access to ECR images
"arn:${local.partition}:iam::aws:policy/CloudWatchAgentServerPolicy"
]
tags = {
# This will tag the launch template created for use by Karpenter
"karpenter.sh/discovery/${local.cluster_name}" = local.cluster_name
}
}
}
}
#The EKS module creates an IAM role for the EKS managed node group nodes. We’ll use that for Karpenter.
#We need to create an instance profile we can reference.
#Karpenter can use this instance profile to launch new EC2 instances and those instances will be able to connect to your cluster.
resource "aws_iam_instance_profile" "karpenter" {
name = "KarpenterNodeInstanceProfile-${local.cluster_name}"
role = module.eks.eks_managed_node_groups["initial"].iam_role_name
}
#Create the KarpenterController IAM Role
#Karpenter requires permissions like launching instances, which means it needs an IAM role that grants it access. The config
#below will create an AWS IAM Role, attach a policy, and authorize the Service Account to assume the role using IRSA. We will
#create the ServiceAccount and connect it to this role during the Helm chart install.
module "karpenter_irsa" {
source = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
version = "5.3.3"
role_name = "karpenter-controller-${local.cluster_name}"
attach_karpenter_controller_policy = true
karpenter_tag_key = "karpenter.sh/discovery/${local.cluster_name}"
karpenter_controller_cluster_id = module.eks.cluster_id
karpenter_controller_node_iam_role_arns = [
module.eks.eks_managed_node_groups["initial"].iam_role_arn
]
oidc_providers = {
ex = {
provider_arn = module.eks.oidc_provider_arn
namespace_service_accounts = ["karpenter:karpenter"]
}
}
}
#Install Karpenter Helm Chart
#Use helm to deploy Karpenter to the cluster. We are going to use the helm_release Terraform resource to do the deploy and pass in the
#cluster details and IAM role Karpenter needs to assume.
provider "helm" {
kubernetes {
host = module.eks.cluster_endpoint
cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = ["eks", "get-token", "--cluster-name", local.cluster_name]
}
}
}
resource "helm_release" "karpenter" {
namespace = "karpenter"
create_namespace = true
name = "karpenter"
repository = "https://charts.karpenter.sh"
chart = "karpenter"
version = "v0.16.1"
set {
name = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
value = module.karpenter_irsa.iam_role_arn
}
set {
name = "clusterName"
value = module.eks.cluster_id
}
set {
name = "clusterEndpoint"
value = module.eks.cluster_endpoint
}
set {
name = "aws.defaultInstanceProfile"
value = aws_iam_instance_profile.karpenter.name
}
}
#Provisioner
#Create a default provisioner using the command below. This provisioner configures instances to connect to your cluster’s endpoint and
#discovers resources like subnets and security groups using the cluster’s name.
#This provisioner will create capacity as long as the sum of all created capacity is less than the specified limit.
provider "kubectl" {
apply_retry_count = 5
host = module.eks.cluster_endpoint
cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
load_config_file = false
exec {
api_version = "client.authentication.k8s.io/v1beta1"
command = "aws"
args = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
}
}
resource "kubectl_manifest" "karpenter_provisioner" {
yaml_body = <<-YAML
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
name: default
spec:
requirements:
- key: karpenter.sh/capacity-type
operator: In
values: ["spot"]
limits:
resources:
cpu: 1000
provider:
subnetSelector:
Name: "*private*"
securityGroupSelector:
karpenter.sh/discovery/${module.eks.cluster_id}: ${module.eks.cluster_id}
tags:
karpenter.sh/discovery/${module.eks.cluster_id}: ${module.eks.cluster_id}
ttlSecondsAfterEmpty: 30
YAML
depends_on = [
helm_release.karpenter
]
}
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
name: inflate
spec:
replicas: 0
selector:
matchLabels:
app: inflate
template:
metadata:
labels:
app: inflate
spec:
terminationGracePeriodSeconds: 0
containers:
- name: inflate
image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
resources:
requests:
cpu: 1
EOF
kubectl scale deployment inflate --replicas 5
kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller
DEBUG controller.provisioning Relaxing soft constraints for pod since it previously failed to schedule, removing: spec.topologySpreadConstraints = {"maxSkew":1,"topologyKey":"topology.kubernetes.io/zone","whenUnsatisfiable":"ScheduleAnyway","labelSelector":{"matchLabels":{"app.kubernetes.io/instance":"karpenter","app.kubernetes.io/name":"karpenter"}}} {"commit": "b157d45", "pod": "karpenter/karpenter-5755bb5b54-rh65t"}
2022-09-10T00:13:13.122Z
ERROR controller.provisioning Could not schedule pod, incompatible with provisioner "default", incompatible requirements, key karpenter.sh/provisioner-name, karpenter.sh/provisioner-name DoesNotExist not in karpenter.sh/provisioner-name In [default] {"commit": "b157d45", "pod": "karpenter/karpenter-5755bb5b54-rh65t"}

I belive this is due to the pod topology defined in the Karpenter deployment here:
https://github.com/aws/karpenter/blob/main/charts/karpenter/values.yaml#L73-L77
, you can read further on what pod topologySpreadConstraints does here:
https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/
If you increase the desired_size to 2 which matches the default deployment replicas above, that should resove the error.

Related

Helm - Kubernetes cluster unreachable: the server has asked for the client to provide credentials

I'm trying to deploy an EKS self managed with Terraform. While I can deploy the cluster with addons, vpc, subnet and all other resources, it always fails at helm:
Error: Kubernetes cluster unreachable: the server has asked for the client to provide credentials
with module.eks-ssp-kubernetes-addons.module.ingress_nginx[0].helm_release.nginx[0]
on .terraform/modules/eks-ssp-kubernetes-addons/modules/kubernetes-addons/ingress-nginx/main.tf line 19, in resource "helm_release" "nginx":
resource "helm_release" "nginx" {
This error repeats for metrics_server, lb_ingress, argocd, but cluster-autoscaler throws:
Warning: Helm release "cluster-autoscaler" was created but has a failed status.
with module.eks-ssp-kubernetes-addons.module.cluster_autoscaler[0].helm_release.cluster_autoscaler[0]
on .terraform/modules/eks-ssp-kubernetes-addons/modules/kubernetes-addons/cluster-autoscaler/main.tf line 1, in resource "helm_release" "cluster_autoscaler":
resource "helm_release" "cluster_autoscaler" {
My main.tf looks like this:
terraform {
backend "remote" {}
required_providers {
aws = {
source = "hashicorp/aws"
version = ">= 3.66.0"
}
kubernetes = {
source = "hashicorp/kubernetes"
version = ">= 2.7.1"
}
helm = {
source = "hashicorp/helm"
version = ">= 2.4.1"
}
}
}
data "aws_eks_cluster" "cluster" {
name = module.eks-ssp.eks_cluster_id
}
data "aws_eks_cluster_auth" "cluster" {
name = module.eks-ssp.eks_cluster_id
}
provider "aws" {
access_key = "xxx"
secret_key = "xxx"
region = "xxx"
assume_role {
role_arn = "xxx"
}
}
provider "kubernetes" {
host = data.aws_eks_cluster.cluster.endpoint
cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
token = data.aws_eks_cluster_auth.cluster.token
}
provider "helm" {
kubernetes {
host = data.aws_eks_cluster.cluster.endpoint
token = data.aws_eks_cluster_auth.cluster.token
cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
}
}
My eks.tf looks like this:
module "eks-ssp" {
source = "github.com/aws-samples/aws-eks-accelerator-for-terraform"
# EKS CLUSTER
tenant = "DevOpsLabs2b"
environment = "dev-test"
zone = ""
terraform_version = "Terraform v1.1.4"
# EKS Cluster VPC and Subnet mandatory config
vpc_id = "xxx"
private_subnet_ids = ["xxx","xxx", "xxx", "xxx"]
# EKS CONTROL PLANE VARIABLES
create_eks = true
kubernetes_version = "1.19"
# EKS SELF MANAGED NODE GROUPS
self_managed_node_groups = {
self_mg = {
node_group_name = "DevOpsLabs2b"
subnet_ids = ["xxx","xxx", "xxx", "xxx"]
create_launch_template = true
launch_template_os = "bottlerocket" # amazonlinux2eks or bottlerocket or windows
custom_ami_id = "xxx"
public_ip = true # Enable only for public subnets
pre_userdata = <<-EOT
yum install -y amazon-ssm-agent \
systemctl enable amazon-ssm-agent && systemctl start amazon-ssm-agent \
EOT
disk_size = 10
instance_type = "t2.small"
desired_size = 2
max_size = 10
min_size = 0
capacity_type = "" # Optional Use this only for SPOT capacity as capacity_type = "spot"
k8s_labels = {
Environment = "dev-test"
Zone = ""
WorkerType = "SELF_MANAGED_ON_DEMAND"
}
additional_tags = {
ExtraTag = "t2x-on-demand"
Name = "t2x-on-demand"
subnet_type = "public"
}
create_worker_security_group = false # Creates a dedicated sec group for this Node Group
},
}
}
enable_amazon_eks_vpc_cni = true
amazon_eks_vpc_cni_config = {
addon_name = "vpc-cni"
addon_version = "v1.7.5-eksbuild.2"
service_account = "aws-node"
resolve_conflicts = "OVERWRITE"
namespace = "kube-system"
additional_iam_policies = []
service_account_role_arn = ""
tags = {}
}
enable_amazon_eks_kube_proxy = true
amazon_eks_kube_proxy_config = {
addon_name = "kube-proxy"
addon_version = "v1.19.8-eksbuild.1"
service_account = "kube-proxy"
resolve_conflicts = "OVERWRITE"
namespace = "kube-system"
additional_iam_policies = []
service_account_role_arn = ""
tags = {}
}
#K8s Add-ons
enable_aws_load_balancer_controller = true
enable_metrics_server = true
enable_cluster_autoscaler = true
enable_aws_for_fluentbit = true
enable_argocd = true
enable_ingress_nginx = true
depends_on = [module.eks-ssp.self_managed_node_groups]
}
OP has confirmed in the comment that the problem was resolved:
Of course. I think I found the issue. Doing "kubectl get svc" throws: "An error occurred (AccessDenied) when calling the AssumeRole operation: User: arn:aws:iam::xxx:user/terraform_deploy is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::xxx:user/terraform_deploy"
Solved it by using my actual role, that's crazy. No idea why it was calling itself.
For similar problem look also this issue.
I solved this error by adding dependencies in the helm installations.
The depends_on will wait for the step to successfully complete and then helm module runs.
module "nginx-ingress" {
depends_on = [module.eks, module.aws-load-balancer-controller]
source = "terraform-module/release/helm"
...}
module "aws-load-balancer-controller" {
depends_on = [module.eks]
source = "terraform-module/release/helm"
...}
module "helm_autoscaler" {
depends_on = [module.eks]
source = "terraform-module/release/helm"
...}

Scheduled restarts on kubernetes using terraform

I run a kubernetes cluster on aws managed by terraform. I'd like to automatically restart the pods in the cluster at some regular interval, maybe weekly. Since the entire cluster is managed by terraform, I'd like to run the auto restart command through terraform as well.
At first I assumed that kubernetes would have some kind of ttl for its pods, but that does not seem to be the case.
Elsewhere on SO I've seen the ability to run auto restarts using a cron job managed by kubernetes (eg: How to schedule pods restart). Terraform has a relevant resource -- the kubernetes_cron_job -- but I can't fully understand how to set it up with the permissions necessary to actually run.
Would appreciate some feedback!
Below is what I've tried:
resource "kubernetes_cron_job" "deployment_restart" {
metadata {
name = "deployment-restart"
}
spec {
concurrency_policy = "Forbid"
schedule = "0 8 * * *"
starting_deadline_seconds = 10
successful_jobs_history_limit = 10
job_template {
metadata {}
spec {
backoff_limit = 2
active_deadline_seconds = 600
template {
metadata {}
spec {
service_account_name = var.service_account.name
container {
name = "kubectl"
image = "bitnami/kubectl"
command = ["kubectl rollout restart deploy"]
}
}
}
}
}
}
}
resource "kubernetes_role" "deployment_restart" {
metadata {
name = "deployment-restart"
}
rule {
api_groups = ["apps", "extensions"]
resources = ["deployments"]
verbs = ["get", "list", "patch", "watch"]
}
}
resource "kubernetes_role_binding" "deployment_restart" {
metadata {
name = "deployment-restart"
}
role_ref {
api_group = "rbac.authorization.k8s.io"
kind = "Role"
name = kubernetes_role.deployment_restart.metadata[0].name
}
subject {
kind = "ServiceAccount"
name = var.service_account.name
api_group = "rbac.authorization.k8s.io"
}
}
This was based on a combination of Granting RBAC roles in k8s cluster using terraform and How to schedule pods restart.
Currently getting the following error:
Error: RoleBinding.rbac.authorization.k8s.io "deployment-restart" is invalid: subjects[0].apiGroup: Unsupported value: "rbac.authorization.k8s.io": supported values: ""
As per the official documentation rolebinding.subjects.apiGroup for Service Accounts should be empty.
kubectl explain rolebinding.subjects.apiGroup
KIND: RoleBinding VERSION: rbac.authorization.k8s.io/v1
FIELD: apiGroup
DESCRIPTION:
APIGroup holds the API group of the referenced subject. Defaults to "" for
ServiceAccount subjects. Defaults to "rbac.authorization.k8s.io" for User
and Group subjects.

How to create routing for kubernetes with nginx ingress in terraform - scaleway

I have a kubernetes setup with a cluster and two pools (nodes), I have also setup an (nginx) ingress server for kubernetes with helm. All of this is written in terraform for scaleway. What I am struggling with is how to config the ingress server to route to my kubernetes pools/nodes depending on the url path.
For example, I want [url]/api to go to my scaleway_k8s_pool.api and [url]/auth to go to my scaleway_k8s_pool.auth.
This is my terraform code
provider "scaleway" {
zone = "fr-par-1"
region = "fr-par"
}
resource "scaleway_registry_namespace" "main" {
name = "main_container_registry"
description = "Main container registry"
is_public = false
}
resource "scaleway_k8s_cluster" "main" {
name = "main"
description = "The main cluster"
version = "1.20.5"
cni = "calico"
tags = ["i'm an awsome tag"]
autoscaler_config {
disable_scale_down = false
scale_down_delay_after_add = "5m"
estimator = "binpacking"
expander = "random"
ignore_daemonsets_utilization = true
balance_similar_node_groups = true
expendable_pods_priority_cutoff = -5
}
}
resource "scaleway_k8s_pool" "api" {
cluster_id = scaleway_k8s_cluster.main.id
name = "api"
node_type = "DEV1-M"
size = 1
autoscaling = true
autohealing = true
min_size = 1
max_size = 5
}
resource "scaleway_k8s_pool" "auth" {
cluster_id = scaleway_k8s_cluster.main.id
name = "auth"
node_type = "DEV1-M"
size = 1
autoscaling = true
autohealing = true
min_size = 1
max_size = 5
}
resource "null_resource" "kubeconfig" {
depends_on = [scaleway_k8s_pool.api, scaleway_k8s_pool.auth] # at least one pool here
triggers = {
host = scaleway_k8s_cluster.main.kubeconfig[0].host
token = scaleway_k8s_cluster.main.kubeconfig[0].token
cluster_ca_certificate = scaleway_k8s_cluster.main.kubeconfig[0].cluster_ca_certificate
}
}
output "cluster_url" {
value = scaleway_k8s_cluster.main.apiserver_url
}
provider "helm" {
kubernetes {
host = null_resource.kubeconfig.triggers.host
token = null_resource.kubeconfig.triggers.token
cluster_ca_certificate = base64decode(
null_resource.kubeconfig.triggers.cluster_ca_certificate
)
}
}
resource "helm_release" "ingress" {
name = "ingress"
chart = "ingress-nginx"
repository = "https://kubernetes.github.io/ingress-nginx"
namespace = "kube-system"
}
How would i go about configuring the nginx ingress server for routing to my kubernetes pools?

Create kubernetes secret for docker registry - Terraform

Using kubectl we can create docker registry authentication secret as follows
kubectl create secret docker-registry regsecret \
--docker-server=docker.example.com \
--docker-username=kube \
--docker-password=PW_STRING \
--docker-email=my#email.com \
How do i create this secret using terraform, i saw this link, it has data, in the flow of terraform the kubernetes instance is being created in azure and i get the data required from there and i created something like below
resource "kubernetes_secret" "docker-registry" {
metadata {
name = "registry-credentials"
}
data = {
docker-server = data.azurerm_container_registry.docker_registry_data.login_server
docker-username = data.azurerm_container_registry.docker_registry_data.admin_username
docker-password = data.azurerm_container_registry.docker_registry_data.admin_password
}
}
It seems that it is wrong as the images are not being pulled. What am i missing here.
If you run following command
kubectl create secret docker-registry regsecret \
--docker-server=docker.example.com \
--docker-username=kube \
--docker-password=PW_STRING \
--docker-email=my#email.com
It will create a secret like following
$ kubectl get secrets regsecret -o yaml
apiVersion: v1
data:
.dockerconfigjson: eyJhdXRocyI6eyJkb2NrZXIuZXhhbXBsZS5jb20iOnsidXNlcm5hbWUiOiJrdWJlIiwicGFzc3dvcmQiOiJQV19TVFJJTkciLCJlbWFpbCI6Im15QGVtYWlsLmNvbSIsImF1dGgiOiJhM1ZpWlRwUVYxOVRWRkpKVGtjPSJ9fX0=
kind: Secret
metadata:
creationTimestamp: "2020-06-01T18:31:07Z"
name: regsecret
namespace: default
resourceVersion: "42304"
selfLink: /api/v1/namespaces/default/secrets/regsecret
uid: 59054483-2789-4dd2-9321-74d911eef610
type: kubernetes.io/dockerconfigjson
If we decode .dockerconfigjson we will get
{"auths":{"docker.example.com":{"username":"kube","password":"PW_STRING","email":"my#email.com","auth":"a3ViZTpQV19TVFJJTkc="}}}
So, how can we do that using terraform?
I created a file config.json with following data
{"auths":{"${docker-server}":{"username":"${docker-username}","password":"${docker-password}","email":"${docker-email}","auth":"${auth}"}}}
Then in main.tf file
resource "kubernetes_secret" "docker-registry" {
metadata {
name = "regsecret"
}
data = {
".dockerconfigjson" = "${data.template_file.docker_config_script.rendered}"
}
type = "kubernetes.io/dockerconfigjson"
}
data "template_file" "docker_config_script" {
template = "${file("${path.module}/config.json")}"
vars = {
docker-username = "${var.docker-username}"
docker-password = "${var.docker-password}"
docker-server = "${var.docker-server}"
docker-email = "${var.docker-email}"
auth = base64encode("${var.docker-username}:${var.docker-password}")
}
}
then run
$ terraform apply
This will generate same secrets. Hope it will helps
I would suggest creating a azurerm_role_assignement to give aks access to the acr:
resource "azurerm_role_assignment" "aks_sp_acr" {
scope = azurerm_container_registry.acr.id
role_definition_name = "AcrPull"
principal_id = var.service_principal_obj_id
depends_on = [
azurerm_kubernetes_cluster.aks,
azurerm_container_registry.acr
]
}
Update
You can create the service principal in the azure portal or with az cli and use client_id, client_secret and object-id in terraform.
Get Client_id and Object_id by running az ad sp list --filter "displayName eq '<name>'". The secret has to be created in the Certificates & secrets tab of the service principal. See this guide: https://pixelrobots.co.uk/2018/11/first-look-at-terraform-and-the-azure-cloud-shell/
Just set all three as variable, eg for obj_id:
variable "service_principal_obj_id" {
default = "<object-id>"
}
Now use the credentials with aks:
resource "azurerm_kubernetes_cluster" "aks" {
...
service_principal {
client_id = var.service_principal_app_id
client_secret = var.service_principal_password
}
...
}
And set the object id in the acr as described above.
Alternative
You can create the service principal with terraform (only works if you have the necessary permissions). https://www.terraform.io/docs/providers/azuread/r/service_principal.html combined with a random_password resource:
resource "azuread_application" "aks_sp" {
name = "somename"
available_to_other_tenants = false
oauth2_allow_implicit_flow = false
}
resource "azuread_service_principal" "aks_sp" {
application_id = azuread_application.aks_sp.application_id
depends_on = [
azuread_application.aks_sp
]
}
resource "azuread_service_principal_password" "aks_sp_pwd" {
service_principal_id = azuread_service_principal.aks_sp.id
value = random_password.aks_sp_pwd.result
end_date = "2099-01-01T01:02:03Z"
depends_on = [
azuread_service_principal.aks_sp
]
}
You need to assign the role "Conributer" to the sp and can use it directly in aks / acr.
resource "azurerm_role_assignment" "aks_sp_role_assignment" {
scope = var.subscription_id
role_definition_name = "Contributor"
principal_id = azuread_service_principal.aks_sp.id
depends_on = [
azuread_service_principal_password.aks_sp_pwd
]
}
Use them with aks:
resource "azurerm_kubernetes_cluster" "aks" {
...
service_principal {
client_id = azuread_service_principal.aks_sp.app_id
client_secret = azuread_service_principal_password.aks_sp_pwd.value
}
...
}
and the role assignment:
resource "azurerm_role_assignment" "aks_sp_acr" {
scope = azurerm_container_registry.acr.id
role_definition_name = "AcrPull"
principal_id = azuread_service_principal.aks_sp.object_id
depends_on = [
azurerm_kubernetes_cluster.aks,
azurerm_container_registry.acr
]
}
Update secret example
resource "random_password" "aks_sp_pwd" {
length = 32
special = true
}

How can I configure an AWS EKS autoscaler with Terraform?

I'm using the AWS EKS provider (github.com/terraform-aws-modules/terraform-aws-eks ). I'm following along the tutorial with https://learn.hashicorp.com/terraform/aws/eks-intro
However this does not seem to have autoscaling enabled... It seems it's missing the cluster-autoscaler pod / daemon?
Is Terraform able to provision this functionality? Or do I need to set this up following a guide like: https://eksworkshop.com/scaling/deploy_ca/
You can deploy Kubernetes resources using Terraform. There is both a Kubernetes provider and a Helm provider.
data "aws_eks_cluster_auth" "authentication" {
name = "${var.cluster_id}"
}
provider "kubernetes" {
# Use the token generated by AWS iam authenticator to connect as the provider does not support exec auth
# see: https://github.com/terraform-providers/terraform-provider-kubernetes/issues/161
host = "${var.cluster_endpoint}"
cluster_ca_certificate = "${base64decode(var.cluster_certificate_authority_data)}"
token = "${data.aws_eks_cluster_auth.authentication.token}"
load_config_file = false
}
provider "helm" {
install_tiller = "true"
tiller_image = "gcr.io/kubernetes-helm/tiller:v2.12.3"
}
resource "helm_release" "cluster_autoscaler" {
name = "cluster-autoscaler"
repository = "stable"
chart = "cluster-autoscaler"
namespace = "kube-system"
version = "0.12.2"
set {
name = "autoDiscovery.enabled"
value = "true"
}
set {
name = "autoDiscovery.clusterName"
value = "${var.cluster_name}"
}
set {
name = "cloudProvider"
value = "aws"
}
set {
name = "awsRegion"
value = "${data.aws_region.current_region.name}"
}
set {
name = "rbac.create"
value = "true"
}
set {
name = "sslCertPath"
value = "/etc/ssl/certs/ca-bundle.crt"
}
}
This answer below is still not complete... But at least it gets me partially further...
1.
kubectl create clusterrolebinding add-on-cluster-admin --clusterrole=cluster-admin --serviceaccount=kube-system:default
helm install stable/cluster-autoscaler --name my-release --set "autoscalingGroups[0].name=demo,autoscalingGroups[0].maxSize=10,autoscalingGroups[0].minSize=1" --set rbac.create=true
And then manually fix the certificate path:
kubectl edit deployments my-release-aws-cluster-autoscaler
replace the following:
path: /etc/ssl/certs/ca-bundle.crt
With
path: /etc/ssl/certs/ca-certificates.crt
2.
In the AWS console, give AdministratorAccess policy to the terraform-eks-demo-node role.
3.
Update the nodes parameter with (kubectl edit deployments my-release-aws-cluster-autoscaler)
- --nodes=1:10:terraform-eks-demo20190922124246790200000007