Terraform with Google Container Engine (Kubernetes): Error executing access token command "...\gcloud.cmd" - gcloud

I'm trying to deploy a module (a Docker image) to Google Container Engine. Here is what I have in my Terraform config file:
terraform.tf
# Google Cloud provider
provider "google" {
  credentials = "${file("google_credentials.json")}"
  project     = "${var.google_project_id}"
  region      = "${var.google_region}"
}

# Google Container Engine (Kubernetes) cluster resource
resource "google_container_cluster" "secureskye" {
  name               = "secureskye"
  zone               = "${var.google_kubernetes_zone}"
  additional_zones   = "${var.google_kubernetes_additional_zones}"
  initial_node_count = 2
}

# Kubernetes provider
provider "kubernetes" {
  host     = "${google_container_cluster.secureskye.endpoint}"
  username = "${var.google_kubernetes_username}"
  password = "${var.google_kubernetes_password}"

  client_certificate     = "${base64decode(google_container_cluster.secureskye.master_auth.0.client_certificate)}"
  client_key             = "${base64decode(google_container_cluster.secureskye.master_auth.0.client_key)}"
  cluster_ca_certificate = "${base64decode(google_container_cluster.secureskye.master_auth.0.cluster_ca_certificate)}"
}

# Module UI
module "ui" {
  source = "./modules/ui"
}
My problem is this: google_container_cluster was created successfully, but the creation of the ui module (which contains two resources, kubernetes_service and kubernetes_pod) fails with the error
* kubernetes_pod.ui: Post https://<ip>/api/v1/namespaces/default/pods: error executing access token command "<user_path>\\AppData\\Local\\Google\\Cloud SDK\\google-cloud-sdk\\bin\\gcloud.cmd config config-helper --format=json": err=exec: "<user_path>\\AppData\\Local\\Google\\Cloud SDK\\google-cloud-sdk\\bin\\gcloud.cmd": file does not exist output=
So, questions:
1. Do I need gcloud and kubectl installed? The google_container_cluster was created successfully before I had installed either gcloud or kubectl.
2. I want to use credentials, project, and region that are independent from the ones configured in the gcloud and kubectl CLIs. Am I doing this right?

I have been able to reproduce your scenario by running the Terraform config file you provided (except the Module UI part) on a Linux machine, so your issue should be related to that last part of the code.
Regarding your questions:
I am not sure, because I tried from Google Cloud Shell, where both gcloud and kubectl are already preinstalled, but I would recommend installing them just to make sure that is not the issue here.
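One detail that may matter: when the kubernetes provider's in-config credentials are incomplete, some provider versions fall back to the local kubeconfig, and GKE kubeconfig entries shell out to gcloud for a token, which would explain the gcloud.cmd error. A sketch that disables that fallback (this assumes a 1.x provider version, where the load_config_file argument exists):

```hcl
provider "kubernetes" {
  # Assumption: provider 1.x; prevents falling back to ~/.kube/config,
  # whose GKE entries invoke gcloud for an access token.
  load_config_file = false

  host                   = "${google_container_cluster.secureskye.endpoint}"
  client_certificate     = "${base64decode(google_container_cluster.secureskye.master_auth.0.client_certificate)}"
  client_key             = "${base64decode(google_container_cluster.secureskye.master_auth.0.client_key)}"
  cluster_ca_certificate = "${base64decode(google_container_cluster.secureskye.master_auth.0.cluster_ca_certificate)}"
}
```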
For the credentials part, I added two new variables to the variables.tf Terraform configuration file, as in the example below (those credentials do not need to be the same as the ones in gcloud or kubectl). Use your preferred credentials here.
variable "google_kubernetes_username" {
default = "<YOUR_USERNAME>"
}
variable "google_kubernetes_password" {
default = "<YOUR_PASSWORD>"
}
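Rather than hard-coding defaults, the same variables can be supplied through a terraform.tfvars file, which keeps credentials out of the versioned configuration. A sketch with made-up example values:

```hcl
# terraform.tfvars -- example values only, not real credentials
google_kubernetes_username = "k8s-admin"
google_kubernetes_password = "a-long-random-password"
```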
Maybe you could share more information about what is inside your ui module, in order to understand which file does not exist. I guess you are deploying from a Windows machine, judging by the notation of your file paths, but that should not be an important issue.

Related

How can I redeploy a docker-compose stack with terraform?

I use terraform to configure a GCE instance which runs a docker-compose stack. The docker-compose stack references an image with a tag and I would like to be able to rerun docker-compose up when the tag changes, so that a new version of the service can be run.
Currently, I do the following in my terraform files:
provisioner "file" {
source = "training-server/docker-compose.yml"
destination = "/home/curry/docker-compose.yml"
connection {
type = "ssh"
user = "curry"
host = google_compute_address.training-address.address
private_key = file(var.private_key_file)
}
}
provisioner "remote-exec" {
inline = [
"IMAGE_ID=${var.image_id} docker-compose -f /home/curry/docker-compose.yml up -d"
]
connection {
type = "ssh"
user = "root"
host = google_compute_address.training-address.address
private_key = file(var.private_key_file)
}
}
but this is wrong for various reasons:
Provisioners are somewhat frowned upon, according to the Terraform documentation.
If the image_id changes, Terraform won't consider that a configuration change, so it won't re-run the provisioners.
What I want is to consider my application stack like a resource, so that when one of its attributes change, eg. the image_id, the resource is recreated but the VM instance itself is not.
How can I do that with terraform? Or is there another better approach?
Terraform has a Docker provider, and if you wanted to use Terraform to manage your container stack, that's probably the right tool. But, using it requires essentially translating your Compose file into Terraform syntax.
I'm a little more used to a split where you use Terraform to manage infrastructure – set up EC2 instances and their network setup, for example – but use another tool like Ansible, Chef, or Salt Stack to actually run software on them. Then to update the software (Docker containers) you'd update your configuration management tool's settings to say which version (Docker image tag) you want, and then re-run that.
One trick that may help is to use the null_resource, which will let you "reprovision the resource" whenever the image ID changes:
resource "null_resource" "docker_compose" {
  triggers = {
    image_id = "${var.image_id}"
  }

  provisioner "remote-exec" {
    ...
  }
}
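If the Compose file itself can also change, its hash can be added as a second trigger so that edits to the file re-run the provisioner too. A sketch, reusing the file path from the question:

```hcl
resource "null_resource" "docker_compose" {
  triggers = {
    image_id     = "${var.image_id}"
    # Re-run whenever the Compose file contents change
    compose_file = "${sha1(file("training-server/docker-compose.yml"))}"
  }

  provisioner "remote-exec" {
    # same inline command and connection block as in the question
  }
}
```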
If you wanted to go down the all-Terraform route, in theory you could write a Terraform configuration like
provider "docker" {
host = "ssh://root#${google_compute_address.training-address.address}"
# (where do its credentials come from?)
}
resource "docker_image" "myapp" {
name = "myapp:${var.image_id}"
}
resource "docker_container" "myapp" {
name = "myapp"
image = "${docker_image.myapp.latest}"
}
but you'd have to translate your entire Docker Compose configuration to this syntax, and set it up so that there's an option for developers to run it locally, and replicate Compose features like the default network, and so on. I don't feel like this is generally done in practice.

Pass output (database password) from Terraform to Kubernetes manifest in CICD pipeline

I am using Terraform to provision resources in Azure, one of which is a Postgres database. My Terraform module includes the following to generate a random password and output to console.
resource "random_string" "db_master_pass" {
length = 40
special = true
min_special = 5
override_special = "!-_"
keepers = {
pass_version = 1
}
}
# For postgres
output "db_master_pass" {
value = "${module.postgres.db_master_pass}"
}
I am using Kubernetes deployment manifest to deploy the application to Azure managed Kubernetes service. Is there a way of passing the database password to Kubernetes in the deployment pipeline? I am using CircleCI for CICD. Currently, I'm copying the password, encoding it to base64 and pasting it to the secrets manifest before running the deployment.
One solution is to generate the Kubernetes YAML from a template.
The pattern uses the templatefile function in Terraform 0.12, or the template provider in earlier versions, to read, and a local_file resource to write. For example:
data "template_file" "service_template" {
template = "${file("${path.module}/templates/service.tpl")}"
vars {
postgres_password = ""${module.postgres.db_master_pass}"
}
}
resource "local_file" "template" {
content = "${data.template_file.service_template.rendered}"
filename = "postegres_service.yaml"
}
There are many other options, like using the Kubernetes provider, but I think this one better matches your question.
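For reference, the Kubernetes provider route could look roughly like this. A sketch only; it assumes the kubernetes provider is configured against your AKS cluster, and note that this resource base64-encodes the data values itself, so no manual encoding step is needed:

```hcl
resource "kubernetes_secret" "db" {
  metadata {
    name = "db-credentials"
  }

  # The provider base64-encodes these values automatically
  data {
    password = "${module.postgres.db_master_pass}"
  }
}
```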

Retrieve auto scaling group instance ip's and provide it to ansible

I'm currently developing a Terraform script and Ansible roles in order to install MongoDB with replication. I'm using an auto scaling group, and I need to pass the EC2 instances' private IPs to Ansible as extra vars. Is there any way to do that?
When it comes to rs.initiate(), is there any way to add the EC2 private IPs to the Mongo cluster while Terraform is creating the instances?
I'm not really sure how it's done with ASGs; probably a combination of user-data and EC2 metadata would be helpful.
But here is how I do it when there is a fixed number of nodes. Posting this answer as it can be helpful to someone in some way.
Using EC2 dynamic inventory scripts.
Ref - https://docs.ansible.com/ansible/2.5/user_guide/intro_dynamic_inventory.html
This is basically a Python script, ec2.py, which gets the instance's private IP using tags, etc. It comes with a config file named ec2.ini.
Tag your instance in the TF script (add a role tag) -
resource "aws_instance" "ec2" {
....
tags = "${merge(var.tags, map(
"description","mongodb-node",
"role", "mongodb-node",
"Environment", "${local.env}",))}"
}
output "ip" {
value = ["${aws_instance.ec2.private_ip}"]
}
Get the instance private IP in playbook -
- hosts: localhost
  connection: local
  tasks:
    - debug: msg="MongoDB Node IP is - {{ hostvars[groups['tag_role_mongodb-node'][0]].inventory_hostname }}"
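The group name tag_role_mongodb-node used above comes from how the dynamic inventory script names groups after instance tags. A simplified Python sketch of that naming convention (an assumption about ec2.py's internals, shown only for illustration):

```python
import re


def tag_to_group(key, value):
    # ec2.py exposes instances under inventory groups named
    # "tag_<key>_<value>", sanitizing characters that are unsafe
    # in group names. Simplified sketch, not the real ec2.py code.
    return re.sub(r"[^A-Za-z0-9_\-]", "_", "tag_%s_%s" % (key, value))


print(tag_to_group("role", "mongodb-node"))  # tag_role_mongodb-node
```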
Now run the playbook using TF null_resource -
resource "null_resource" "ansible_run" {
  triggers {
    ansible_file = "${sha1(file("${path.module}/${var.ansible_play}"))}"
  }

  provisioner "local-exec" {
    command = "ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -i ./ec2.py --private-key ${var.private_key} ${var.ansible_play}"
  }
}
You have to make sure the AWS-related environment variables are present/exported so Ansible can fetch the EC2 metadata. Also make sure ec2.py is executable.
If you want to get the private IP, change the following config in ec2.ini -
destination_variable = private_ip_address
vpc_destination_variable = private_ip_address

Run kubernetes build from terraform

I'm trying a simple test: deploying a simple nginx on Kubernetes from Terraform.
This is my first time working with Terraform.
This is the basic Terraform file:
provider "kubernetes" {
host = "https://xxx.xxx.xxx.xxx:8443"
client_certificate = "${file("~/.kube/master.server.crt")}"
client_key = "${file("~/.kube/master.server.key")}"
cluster_ca_certificate = "${file("~/.kube/ca.crt")}"
username = "xxxxxx"
password = "xxxxxx"
}
resource "kubernetes_service" "nginx" {
metadata {
name = "nginx-example"
}
spec {
selector {
App = "${kubernetes_pod.nginx.metadata.0.labels.App}"
}
port {
port = 80
target_port = 80
}
type = "LoadBalancer"
}
}
resource "kubernetes_pod" "nginx" {
metadata {
name = "nginx-example"
labels {
App = "nginx"
}
}
spec {
container {
image = "nginx:1.7.8"
name = "example"
port {
container_port = 80
}
}
}
}
I'm getting the following error after running terraform apply:
Error: Error applying plan:
1 error(s) occurred:
kubernetes_pod.nginx: 1 error(s) occurred:
kubernetes_pod.nginx: the server has asked for the client to provide credentials (post pods)
Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with any
resources that successfully completed. Please address the error above
and apply again to incrementally change your infrastructure.
I have admin permissions on Kubernetes and everything is working correctly, but for some reason I'm getting that error.
What am I doing wrong?
Thanks
Regarding @matthew-l-daniel's question:
When I'm only using the username/password I get this error:
Error: Error applying plan:
1 error(s) occurred:
kubernetes_pod.nginx: 1 error(s) occurred:
kubernetes_pod.nginx: Post https://xxx.xxx.xxx.xxx:8443/api/v1/namespaces/default/pods:
x509: certificate signed by unknown authority
Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with any
resources that successfully completed. Please address the error above
and apply again to incrementally change your infrastructure.
I tried using the server name and the server IP and got the same error every time.
When using the certs, I got the error from the original post regarding the "credentials".
I forgot to mention that this is an openshift installation. I don't believe it will have any impact in the end, but I thought I should mention it.
The solution was rather simple, I was using the master crt and key from openshift on terraform.
Then I tested it using the admin crt and key from openshift and it worked.
Aside from the official kubernetes provider documentation suggesting only certificate or basic (user/pass) should be required, this sounds like an OpenShift issue. Have you been able to obtain any logs from the OpenShift cluster?
Some searching links the message you are seeing to some instability bugs within Kubernetes wherein the kubelet does not properly register after a reboot. I would manually confirm the node shows as Ready in OpenShift before you attempt a deployment, as until this occurs Terraform will not be able to interact with it.
If in fact the node is not Ready, Terraform is just surfacing the underlying error passed back from OpenShift.
Separately, the error you are seeing when trying to authenticate using purely certificate parameters is indicative of a misconfiguration. A similar question was raised on the Kubernetes GitHub, and the suggestion there was to investigate the Certificate Authority loaded on to the cluster.

Creating Kubernetes Endpoint in VSTS generates error

When setting up a new Kubernetes endpoint and clicking "Verify Connection", the error message "The Kubconfig does not contain user field. Please check the kubeconfig." is always displayed.
Have tried multiple ways of outputting the config file to no avail. I've also copy and pasted many sample config files from the web and all end up with the same issue. Anyone been successful in creating a new endpoint?
This is followed up in TsuyoshiUshio/KubernetesTask issue 35.
I tried to reproduce this; however, I can't.
I'm not sure, but my guess is that it might be a mismatch between the versions of the cluster, the kubectl you download via the download task, and your kubeconfig.
A workaround might be like this:
Run kubectl version on your local machine and check the current server/client versions.
Specify the same version as the server on the download task (by default it is 1.5.2).
Check the log of your failing release pipeline; you can see which kubectl command has been executed. Do the same thing on your local machine, adapted to your local PC's environment.
The point is: before going to VSTS, download kubectl yourself.
Then put the kubeconfig in the default location, ~/.kube/config, or set the KUBECONFIG environment variable to the config file's path.
Then execute kubectl get nodes and make sure it works.
My kubeconfig is in a different format from yours. If you use AKS, use the az aks install-cli command and the az aks get-credentials command.
Please refer to https://learn.microsoft.com/en-us/azure/aks/kubernetes-walkthrough .
If it works locally, the config file should work in the VSTS task environment (or this task or VSTS has a bug).
I had the same problem on VSTS.
Here is my workaround to get a Service Connection working (in my case to GCloud):
Switched Authentication to "Service Account"
Ran the two commands indicated by the info icon next to the Token and Certificate fields: "Token to authenticate against Kubernetes. Use the ‘kubectl get serviceaccounts -o yaml’ and ‘kubectl get secret -o yaml’ commands to get the token."
kubectl get secret -o yaml > kubectl-secret.yaml
Searched inside the file kubectl-secret.yaml for the values ca.crt and token
Entered those values into the required fields in VSTS
The generated config I was using had a duplicate line, removing this corrected the issue for me.
users:
- name: cluster_stuff_here
- name: cluster_stuff_here