I would like to deploy an application whose pods do not go to Running status (the deployment should be non-operational), and only bring it up when a user actually needs it, using Infrastructure as Code (Terraform). I am aware of kubectl scale --replicas=0. Any other leads or info would be appreciated.
You can keep the replica count at zero for the Deployment in your YAML file, if you are using one.
Or, if you are using Terraform:
resource "kubernetes_deployment" "example" {
metadata {
name = "terraform-example"
labels = {
test = "MyExampleApp"
}
}
spec {
replicas = 0
selector {
match_labels = {
test = "MyExampleApp"
}
}
template {
metadata {
labels = {
test = "MyExampleApp"
}
}
spec {
container {
image = "nginx:1.7.8"
name = "example"
resources {
limits = {
cpu = "0.5"
memory = "512Mi"
}
requests = {
cpu = "250m"
memory = "50Mi"
}
}
liveness_probe {
http_get {
path = "/nginx_status"
port = 80
http_header {
name = "X-Custom-Header"
value = "Awesome"
}
}
initial_delay_seconds = 3
period_seconds = 3
}
}
}
}
}
}
Other than that, if you don't want to use Terraform, you can use a Kubernetes client or kubectl to do this, e.g. kubectl scale deployment <name> --replicas=0.
If you want to edit a local YAML file from Terraform, check out the local-exec provisioner:
This invokes a process on the machine running Terraform, not on the resource.
resource "aws_instance" "web" {
# ...
provisioner "local-exec" {
command = "echo ${self.private_ip} >> private_ips.txt"
}
}
Using a sed command (or any other command) in local-exec, you can update the YAML and apply it; see the sketch after the link below.
https://www.terraform.io/docs/language/resources/provisioners/local-exec.html
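For example, a minimal sketch of that pattern (the deployment.yaml file name and the null_resource are assumptions for illustration; it relies on sed and kubectl being available on the machine running Terraform):
resource "null_resource" "scale_down" {
  provisioner "local-exec" {
    # Hypothetical: set replicas to 0 in a local manifest, then apply it.
    # GNU sed syntax; on macOS use: sed -i ''
    command = "sed -i 's/replicas: .*/replicas: 0/' deployment.yaml && kubectl apply -f deployment.yaml"
  }
}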
I have a K8 cluster that has smb mounted drives connected to an AWS Storage Gateway / file share. We've recently undergone a migration of that SGW to another AWS account and while doing that the IP address and password for that SGW changed.
I noticed that our existing setup has a K8 storage class that looks for a K8 secret called "smbcreds". In that K8 secret they have keys "username" and "password". I'm assuming it's in line with the setup guide for the Helm chart we're using "csi-driver-smb".
I assumed changing the secret used for the storage class would update everything downstream that uses that storage class, but apparently it does not. I'm obviously a little cautious when it comes to potentially blowing away important data. What do I need to do to update everything to use the new secret and IP config?
Here is a simple example of our setup in Terraform -
provider "kubernetes" {
config_path = "~/.kube/config"
config_context = "minikube"
}
resource "helm_release" "container_storage_interface_for_aws" {
count = 1
name = "local-filesystem-csi"
repository = "https://raw.githubusercontent.com/kubernetes-csi/csi-driver-smb/master/charts"
chart = "csi-driver-smb"
namespace = "default"
}
resource "kubernetes_storage_class" "aws_storage_gateway" {
count = 1
metadata {
name = "smbmount"
}
storage_provisioner = "smb.csi.k8s.io"
reclaim_policy = "Retain"
volume_binding_mode = "WaitForFirstConsumer"
parameters = {
source = "//1.2.3.4/old-file-share"
"csi.storage.k8s.io/node-stage-secret-name" = "smbcreds"
"csi.storage.k8s.io/node-stage-secret-namespace" = "default"
}
mount_options = ["vers=3.0", "dir_mode=0777", "file_mode=0777"]
}
resource "kubernetes_persistent_volume_claim" "aws_storage_gateway" {
count = 1
metadata {
name = "smbmount-volume-claim"
}
spec {
access_modes = ["ReadWriteMany"]
resources {
requests = {
storage = "10Gi"
}
}
storage_class_name = "smbmount"
}
}
resource "kubernetes_deployment" "main" {
metadata {
name = "sample-pod"
}
spec {
replicas = 1
selector {
match_labels = {
app = "sample-pod"
}
}
template {
metadata {
labels = {
app = "sample-pod"
}
}
spec {
volume {
name = "shared-fileshare"
persistent_volume_claim {
claim_name = "smbmount-volume-claim"
}
}
container {
name = "ubuntu"
image = "ubuntu"
command = ["sleep", "3600"]
image_pull_policy = "IfNotPresent"
volume_mount {
name = "shared-fileshare"
read_only = false
mount_path = "/data"
}
}
}
}
}
}
My original change was to update the K8 secret "smbcreds" and change source = "//1.2.3.4/old-file-share" to source = "//5.6.7.8/new-file-share".
The solution I settled on was to create a second K8 storage class and persistent volume claim connected to the new AWS Storage Gateway, and then switch the K8 deployments over to the new PVC.
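A minimal sketch of that approach, mirroring the resources above (the new resource names, the //5.6.7.8/new-file-share source, and the smbcreds-new secret are assumptions for illustration):
# Hypothetical second storage class pointing at the new Storage Gateway.
resource "kubernetes_storage_class" "aws_storage_gateway_new" {
  metadata {
    name = "smbmount-new"
  }
  storage_provisioner = "smb.csi.k8s.io"
  reclaim_policy      = "Retain"
  volume_binding_mode = "WaitForFirstConsumer"
  parameters = {
    source                                           = "//5.6.7.8/new-file-share"
    "csi.storage.k8s.io/node-stage-secret-name"      = "smbcreds-new" # hypothetical secret holding the new credentials
    "csi.storage.k8s.io/node-stage-secret-namespace" = "default"
  }
  mount_options = ["vers=3.0", "dir_mode=0777", "file_mode=0777"]
}

# Hypothetical second PVC bound to the new storage class.
resource "kubernetes_persistent_volume_claim" "aws_storage_gateway_new" {
  metadata {
    name = "smbmount-volume-claim-new"
  }
  spec {
    access_modes = ["ReadWriteMany"]
    resources {
      requests = {
        storage = "10Gi"
      }
    }
    storage_class_name = "smbmount-new"
  }
}
The deployment's persistent_volume_claim.claim_name is then pointed at "smbmount-volume-claim-new".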
Following this tutorial,
https://learn.hashicorp.com/tutorials/terraform/gke?in=terraform/kubernetes
I have deployed a GKE cluster in GCloud.
Now when I try to schedule a deployment following this link,
https://learn.hashicorp.com/tutorials/terraform/kubernetes-provider
It fails with,
kubernetes_deployment.nginx: Creating...
Error: Failed to create deployment: Post "https://<ip>/apis/apps/v1/namespaces/default/deployments": x509: certificate signed by unknown authority
on kubernetes.tf line 21, in resource "kubernetes_deployment" "nginx":
21: resource "kubernetes_deployment" "nginx" {
My kubernetes.tf looks like this,
terraform {
  required_providers {
    kubernetes = {
      source = "hashicorp/kubernetes"
    }
  }
}

provider "kubernetes" {
  load_config_file = false

  host     = google_container_cluster.primary.endpoint
  username = var.gke_username
  password = var.gke_password

  client_certificate     = google_container_cluster.primary.master_auth.0.client_certificate
  client_key             = google_container_cluster.primary.master_auth.0.client_key
  cluster_ca_certificate = google_container_cluster.primary.master_auth.0.cluster_ca_certificate
}

resource "kubernetes_deployment" "nginx" {
  metadata {
    name = "scalable-nginx-example"
    labels = {
      App = "ScalableNginxExample"
    }
  }

  spec {
    replicas = 2
    selector {
      match_labels = {
        App = "ScalableNginxExample"
      }
    }
    template {
      metadata {
        labels = {
          App = "ScalableNginxExample"
        }
      }
      spec {
        container {
          image = "nginx:1.7.8"
          name  = "example"

          port {
            container_port = 80
          }

          resources {
            limits {
              cpu    = "0.5"
              memory = "512Mi"
            }
            requests {
              cpu    = "250m"
              memory = "50Mi"
            }
          }
        }
      }
    }
  }
}
I am using macOS to run Terraform. Any help is appreciated.
Please note that kubectl get pods --all-namespaces is working fine, so I don't think it's an issue with kube config.
Thanks,
Arun
It was because the certificates were base64 encoded. Changing the provider section to the snippet below got rid of the issue.
provider "kubernetes" {
load_config_file = false
host = google_container_cluster.primary.endpoint
username = var.gke_username
password = var.gke_password
client_certificate = base64decode(google_container_cluster.primary.master_auth.0.client_certificate)
client_key = base64decode(google_container_cluster.primary.master_auth.0.client_key)
cluster_ca_certificate = base64decode(google_container_cluster.primary.master_auth.0.cluster_ca_certificate)
}
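Another option worth noting (a sketch, not what the snippet above uses): instead of basic auth and client certificates, the provider can authenticate with an OAuth token from the google_client_config data source, which avoids handling the client certificate and key entirely.
# Hypothetical alternative: token-based auth for the Kubernetes provider.
data "google_client_config" "default" {}

provider "kubernetes" {
  host                   = "https://${google_container_cluster.primary.endpoint}"
  token                  = data.google_client_config.default.access_token
  cluster_ca_certificate = base64decode(google_container_cluster.primary.master_auth.0.cluster_ca_certificate)
}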
I am using Terraform 0.12.20 and I have provisioned an EKS cluster with 2 node groups.
How can I add name tags to EKS node workers according to their node group names?
I have tried adding a "Name" tag in the additional_tags section of each node group, but that tag did not take: my EC2 instance names are empty, while the other tags appear.
Here is the configuration - I have skipped the less relevant bits:
module "eks-cluster" {
...
node_groups_defaults = {
disk_size = 128
key_name = var.key_name
subnets = [
aws_subnet.{{}}.id,
aws_subnet.{{}}.id,
]
k8s_labels = {
env = var.environment
}
additional_tags = {
env = var.environment
"k8s.io/cluster-autoscaler/enabled" = "true"
"k8s.io/cluster-autoscaler/${var.cluster-name}" = "true"
}
}
node_groups = {
app = {
name = "app"
.....
k8s_labels = {
nodegroup = "app"
}
additional_tags = {
nodegroup = "app"
Name = "${var.environment}-app-node"
}
}
ml = {
name = "ml"
...
instance_type = "m5.xlarge"
k8s_labels = {
nodegroup = "ml"
}
additional_tags = {
nodegroup = "ml"
Name = "${var.environment}-ml-node"
}
}
}
tags = {
env = var.environment
}
map_roles = [{
......
}]
}
As per the documentation, the aws_eks_node_group resource doesn't allow modifying tags on your instances.
There is a nice feature coming soon to EKS node groups which will allow you to pass a custom userdata script. Using that, you will be able to modify tags for your instances programmatically. The issue can be tracked here: https://github.com/aws/containers-roadmap/issues/596
UPDATE:
As of 20/08/2020, you can now use a launch_template with your node group. This allows you to pass in a Name tag. Example:
resource "aws_launch_template" "cluster" {
image_id = data.aws_ssm_parameter.cluster.value
instance_type = "t3.medium"
name = "eks-launch-template-test"
update_default_version = true
tag_specifications {
resource_type = "instance"
tags = {
Name = "eks-node-group-instance-name"
}
}
}
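For the Name tag to take effect, the node group has to reference that launch template. A minimal sketch of the wiring (the cluster, IAM role, subnets and scaling values here are hypothetical placeholders, not from the original answer):
resource "aws_eks_node_group" "example" {
  cluster_name    = aws_eks_cluster.example.name # hypothetical cluster resource
  node_group_name = "app"
  node_role_arn   = aws_iam_role.node.arn # hypothetical node IAM role
  subnet_ids      = var.subnet_ids

  # Attach the launch template defined above so its tag_specifications apply.
  launch_template {
    id      = aws_launch_template.cluster.id
    version = aws_launch_template.cluster.latest_version
  }

  scaling_config {
    desired_size = 2
    max_size     = 3
    min_size     = 1
  }
}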
The following Terraform resource works.
resource "aws_autoscaling_group_tag" "your_group_tag" {
autoscaling_group_name = aws_eks_node_group.your_group.resources[0].autoscaling_groups[0].name
tag {
key = "Name"
value = "enter-your-name-here"
propagate_at_launch = true
}
depends_on = [
aws_eks_node_group.your_group
]
}
Had the same issue, and this is the solution I came up with, which works great.
First, create the ASG tags via the aws_autoscaling_group_tag resource.
resource "aws_autoscaling_group_tag" "mytag" {
autoscaling_group_name = aws_eks_node_group.main.resources[0].autoscaling_groups[0].name
tag {
key = "foo"
value = "bar"
propagate_at_launch = true
}
depends_on = [aws_eks_node_group.main]
}
Unfortunately this resource block doesn't accept multiple tags, so you'd have to create one resource block per tag (or generate them with for_each, as sketched below).
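A minimal sketch of the for_each variant (the tag map and the resource name are assumptions for illustration):
resource "aws_autoscaling_group_tag" "node_tags" {
  # One aws_autoscaling_group_tag resource is created per map entry.
  for_each = {
    Name      = "foo-cluster-node"
    nodegroup = "app"
  }

  autoscaling_group_name = aws_eks_node_group.main.resources[0].autoscaling_groups[0].name

  tag {
    key                 = each.key
    value               = each.value
    propagate_at_launch = true
  }

  depends_on = [aws_eks_node_group.main]
}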
Another thing to keep in mind is that the tags are applied to future scaled EC2 instances, not the currently running ones.
This means you need to either manually scale your nodes down and back up, or write a bash script and run it with Terraform's local-exec provisioner.
resource "null_resource" "refresh_autoscale" {
provisioner "local-exec" {
command = "cd ${path.module}/scripts ; bash ./scale_refresh.sh"
environment = {
ASG_NAME = aws_eks_node_group.main.resources[0].autoscaling_groups[0].name
CLUSTER_NAME = "foo_cluster"
NODE_GROUP_NAME = "foo_cluster_node"
REGION = var.region
AWS_PROFILE = var.aws_profile
DESIRED_SIZE = var.desired_size
MIN_SIZE = var.min_size
MAX_SIZE = var.max_size
}
}
depends_on = [aws_eks_node_group.main]
}
Your bash script can run commands via the AWS CLI to scale down and up your node groups.
aws --profile ${AWS_PROFILE} --region ${REGION} eks update-nodegroup-config --cluster-name ${CLUSTER_NAME} \
--scaling-config "minSize=0,maxSize=1,desiredSize=0" --nodegroup-name ${NODE_GROUP_NAME}
Because the instances are not scaled down immediately, there is a period of waiting for the scale down to complete. If you have jq installed, you can periodically query the state of your ASG and see how many instances are currently running.
INSTANCE_COUNT=$(aws --profile ${AWS_PROFILE} --region ${REGION} autoscaling describe-auto-scaling-groups --auto-scaling-group-name ${ASG_NAME} \
| jq '.[][0] | .Instances | length')
I just noticed that
"k8s.io/cluster-autoscaler/${var.cluster-name}" = "true"
Might need to be
"k8s.io/cluster-autoscaler/${var.cluster-name}" = "owned"
There is an existing issue asking for node groups to set the "Name" tag on the ASG: https://github.com/aws/containers-roadmap/issues/608 (open), and this one on the Terraform end: https://github.com/terraform-aws-modules/terraform-aws-eks/issues/860 (closed).
However, there is an alternative: use an AWS CLI command to add the tag explicitly.
Try the below in Terraform:
resource "null_resource" "add_custom_tags_to_asg" {
for_each = module.eks-cluster.node_groups
provisioner "local-exec" {
command = <<EOF
aws autoscaling create-or-update-tags \
--tags ResourceId=${each.value.resources[0].autoscaling_groups[0].name},ResourceType=auto-scaling-group,Key=Name,Value=k8s-node-groups-${each.value.labels["env"]},PropagateAtLaunch=true
EOF
}
}
I am trying to deploy a Windows VM on Google Cloud through Terraform. The VM is getting deployed and I am able to execute PowerShell scripts by using windows-startup-script-url.
With this approach, I can only use scripts that are already stored in Google Storage. If the script has parameters/variables, how do I pass them? Any clue?
provider "google" {
project = "my-project"
region = "my-location"
zone = "my-zone"
}
resource "google_compute_instance" "default" {
name = "my-name"
machine_type = "n1-standard-2"
zone = "my-zone"
boot_disk {
initialize_params {
image = "windows-cloud/windows-2019"
}
}
metadata {
windows-startup-script-url = "gs://<my-storage>/<my-script.ps1>"
}
network_interface {
network = "default"
access_config {
}
}
tags = ["http-server", "windows-server"]
}
resource "google_compute_firewall" "http-server" {
name = "default-allow-http"
network = "default"
allow {
protocol = "tcp"
ports = ["80"]
}
source_ranges = ["0.0.0.0/0"]
target_tags = ["http-server"]
}
resource "google_compute_firewall" "windows-server" {
name = "windows-server"
network = "default"
allow {
protocol = "tcp"
ports = ["3389"]
}
source_ranges = ["0.0.0.0/0"]
target_tags = ["windows-server"]
}
output "ip" {
value = "${google_compute_instance.default.network_interface.0.access_config.0.nat_ip}"
}
Terraform doesn't necessarily require startup scripts to be pulled from GCS buckets.
The example in the provider documentation shows:
resource "google_compute_instance" "default" {
  # ...

  metadata = {
    foo = "bar"
  }

  metadata_startup_script = "echo hi > /test.txt"

  service_account {
    scopes = ["userinfo-email", "compute-ro", "storage-ro"]
  }
}
More in the official docs for GCE and PowerShell scripting.
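For the parameter question specifically, one option (a sketch under the assumption that the script lives next to the Terraform code as a template; my-script.ps1.tpl and var.admin_user are hypothetical names) is to render the PowerShell script with templatefile() and pass it inline via the windows-startup-script-ps1 metadata key instead of windows-startup-script-url:
resource "google_compute_instance" "default" {
  name         = "my-name"
  machine_type = "n1-standard-2"
  zone         = "my-zone"

  boot_disk {
    initialize_params {
      image = "windows-cloud/windows-2019"
    }
  }

  network_interface {
    network = "default"
    access_config {
    }
  }

  metadata = {
    # my-script.ps1.tpl is a hypothetical PowerShell template containing
    # placeholders such as ${admin_user}, filled in by templatefile().
    windows-startup-script-ps1 = templatefile("${path.module}/my-script.ps1.tpl", {
      admin_user = var.admin_user # hypothetical variable
    })
  }
}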
EDIT:
SOLUTION: I forgot to add target_cpu_utilization_percentage to the autoscaler.tf file.
I want a web service in Python (or another language) running on Kubernetes, but with autoscaling.
I created a Deployment and a Horizontal Pod Autoscaler, but it is not working.
I'm using Terraform to configure Kubernetes.
I have these files:
Deployments.tf
resource "kubernetes_deployment" "rui-test" {
metadata {
name = "rui-test"
labels {
app = "rui-test"
}
}
spec {
strategy = {
type = "RollingUpdate"
rolling_update = {
max_unavailable = "26%" # This is not working
}
}
selector = {
match_labels = {
app = "rui-test"
}
}
template = {
metadata = {
labels = {
app = "rui-test"
}
}
spec = {
container {
name = "python-test1"
image = "***************************"
}
}
}
}
}
Autoscaler.tf
resource "kubernetes_horizontal_pod_autoscaler" "test-rui" {
metadata {
name = "test-rui"
}
spec {
max_replicas = 10 # THIS IS NOT WORKING
min_replicas = 3 # THIS IS NOT WORKING
scale_target_ref {
kind = "Deployment"
name = "test-rui" # Name of deployment
}
}
}
Service.tf
resource "kubernetes_service" "rui-test" {
metadata {
name = "rui-test"
labels {
app = "rui-test"
}
}
spec {
selector {
app = "rui-test"
}
type = "LoadBalancer" # Use 'cluster_ip = "None"' or 'type = "LoadBalancer"'
port {
name = "http"
port = 8080
}
}
}
When I run kubectl get hpa I see this:
NAME       REFERENCE             TARGETS         MINPODS   MAXPODS   REPLICAS   AGE
rui-test   Deployment/rui-test   <unknown>/80%   1         3         1          1h
Instead of what I want, which is:
rui-test   Deployment/rui-test   <unknown>/79%   3         10        1          1h
But if I run kubectl autoscale deployment rui-test --min=3 --max=10 --cpu-percent=81 I see this:
Error from server (AlreadyExists): horizontalpodautoscalers.autoscaling "rui-test" already exists
You are missing the metrics server. Kubernetes needs to determine current CPU/Memory usage so that it can autoscale up and down.
One way to know if you have the metrics server installed is to run:
$ kubectl top node
$ kubectl top pod
The Horizontal Pod Autoscaler is dependent on resource requests being configured for your Deployment.
From the documentation:
Please note that if some of the pod’s containers do not have the relevant resource request set, CPU utilization for the pod will not be defined and the autoscaler will not take any action for that metric.
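Putting that together, a minimal sketch of what the corrected Autoscaler.tf could look like (the 80% target is an assumption for illustration; the scale target name must match the Deployment's metadata name, and the original poster's fix was adding target_cpu_utilization_percentage):
resource "kubernetes_horizontal_pod_autoscaler" "test-rui" {
  metadata {
    name = "test-rui"
  }
  spec {
    min_replicas                      = 3
    max_replicas                      = 10
    target_cpu_utilization_percentage = 80

    scale_target_ref {
      api_version = "apps/v1"
      kind        = "Deployment"
      name        = "rui-test" # must match the Deployment's metadata name
    }
  }
}
On the Deployment side, the container block also needs a CPU request so utilization can be computed, for example resources { requests = { cpu = "250m" } } (map syntax for recent provider versions; older versions use a nested requests block).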