Vault Helm Chart not using Config from values.yaml

I'm trying to install HashiCorp Vault with the official Helm chart from HashiCorp, deploying it through the Argo CD UI. I have a git repo with a values.yaml file that specifies some non-default config (for example, HA mode and AWS KMS auto-unseal). When I set up the chart in the Argo CD web UI, I can point it at the values.yaml file and see the values I set in the app's parameters section. However, when I deploy the chart, the config doesn't get applied: the ConfigMap created by the chart follows the defaults despite my overrides. Perhaps I'm using Argo CD wrong, as I'm fairly new to it, although it very clearly shows the overrides from my values.yaml in the app's parameters.
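For reference, an Argo CD Application wired up this way looks roughly like the following. This is only a sketch: the repo URL, path, and namespaces are placeholders, and it assumes the values.yaml sits alongside the chart source in the git repo.

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: vault
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/my-org/my-config-repo.git   # placeholder
    targetRevision: main
    path: vault                   # directory containing the chart source and values.yaml
    helm:
      valueFiles:
        - values.yaml             # relative to spec.source.path
  destination:
    server: https://kubernetes.default.svc
    namespace: vault              # placeholder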
Here is the relevant section of my values.yaml
server:
  extraSecretEnvironmentVars:
    - envName: AWS_SECRET_ACCESS_KEY
      secretName: vault
      secretKey: AWS_SECRET_ACCESS_KEY
    - envName: AWS_ACCESS_KEY_ID
      secretName: vault
      secretKey: AWS_ACCESS_KEY_ID
    - envName: AWS_KMS_KEY_ID
      secretName: vault
      secretKey: AWS_KMS_KEY_ID
  ha:
    enabled: true
    replicas: 3
    apiAddr: https://myvault.com:8200
    clusterAddr: https://myvault.com:8201
    raft:
      enabled: true
      setNodeId: false
  config: |
    ui = true
    listener "tcp" {
      tls_disable = 1
      address = "[::]:8200"
      cluster_address = "[::]:8201"
    }
    storage "raft" {
      path = "/vault/data"
    }
    service_registration "kubernetes" {}
    seal "awskms" {
      region = "us-west-2"
      kms_key_id = "$VAULT_KMS_KEY_ID"
    }
However, the deployed config looks like this:
disable_mlock = true
ui = true

listener "tcp" {
  tls_disable = 1
  address = "[::]:8200"
  cluster_address = "[::]:8201"
  # Enable unauthenticated metrics access (necessary for Prometheus Operator)
  #telemetry {
  #  unauthenticated_metrics_access = "true"
  #}
}
storage "file" {
  path = "/vault/data"
}

# Example configuration for using auto-unseal, using Google Cloud KMS. The
# GKMS keys must already exist, and the cluster must have a service account
# that is authorized to access GCP KMS.
#seal "gcpckms" {
#  project    = "vault-helm-dev"
#  region     = "global"
#  key_ring   = "vault-helm-unseal-kr"
#  crypto_key = "vault-helm-unseal-key"
#}

# Example configuration for enabling Prometheus metrics in your config.
#telemetry {
#  prometheus_retention_time = "30s",
#  disable_hostname = true
#}
I've tried several changes to this config, such as setting the AWS_KMS_UNSEAL environment variable, which doesn't seem to get applied. I've also exec'd into the containers, and none of my environment variables show up when I run printenv. I can't figure out why it's deploying the pods with the default config.

With the help of murtiko I figured this out. My indentation of the config block was off. It needs to be nested below the ha block. My working config looks like this:
global:
  enabled: true
server:
  extraSecretEnvironmentVars:
    - envName: AWS_REGION
      secretName: vault
      secretKey: AWS_REGION
    - envName: AWS_ACCESS_KEY_ID
      secretName: vault
      secretKey: AWS_ACCESS_KEY_ID
    - envName: AWS_SECRET_ACCESS_KEY
      secretName: vault
      secretKey: AWS_SECRET_ACCESS_KEY
    - envName: VAULT_AWSKMS_SEAL_KEY_ID
      secretName: vault
      secretKey: VAULT_AWSKMS_SEAL_KEY_ID
  ha:
    enabled: true
    config: |
      ui = true
      listener "tcp" {
        tls_disable = 1
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }
      seal "awskms" {
      }
      storage "raft" {
        path = "/vault/data"
      }
    raft:
      enabled: true
      setNodeId: true
      config: |
        ui = true
        listener "tcp" {
          tls_disable = 1
          address = "[::]:8200"
          cluster_address = "[::]:8201"
        }
        seal "awskms" {
        }
        storage "raft" {
          path = "/vault/data"
        }
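Note that the extraSecretEnvironmentVars entries above assume a Kubernetes Secret named vault already exists in the release namespace with matching keys, roughly like the sketch below (all values are placeholders). That is also why the seal "awskms" stanza can stay empty: Vault can pick the region, credentials, and key ID up from those environment variables.

apiVersion: v1
kind: Secret
metadata:
  name: vault
  namespace: vault                                  # placeholder; the chart's namespace
type: Opaque
stringData:
  AWS_REGION: us-west-2                             # placeholder
  AWS_ACCESS_KEY_ID: "<access-key-id>"              # placeholder
  AWS_SECRET_ACCESS_KEY: "<secret-access-key>"      # placeholder
  VAULT_AWSKMS_SEAL_KEY_ID: "<kms-key-id-or-arn>"   # placeholder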

Related

Deploying HA Vault To GKE - dial tcp 127.0.0.1:8200: connect: connection refused

As per the official documentation (https://developer.hashicorp.com/vault/tutorials/kubernetes/kubernetes-google-cloud-gke), the following works as expected:
helm install vault hashicorp/vault \
--set='server.ha.enabled=true' \
--set='server.ha.raft.enabled=true'
I can then run:
kubectl exec vault-0 -- vault status
And this works perfectly fine.
However, I've noticed that if I don't have Raft enabled, I get the "dial tcp 127.0.0.1:8200: connect: connection refused" error message:
helm install vault hashicorp/vault \
--set='server.ha.enabled=true'
I'm trying to work out why my Vault deployment is giving me the same issue.
I'm trying to deploy Vault into GKE with auto-unseal keys and a Google Cloud Storage backend configured.
My values.yaml file contains:
global:
  enabled: true
  tlsDisable: false
injector:
  enabled: true
  replicas: 1
  port: 8080
  leaderElector:
    enabled: true
  image:
    repository: "hashicorp/vault-k8s"
    tag: "latest"
    pullPolicy: IfNotPresent
  agentImage:
    repository: "hashicorp/vault"
    tag: "latest"
  authPath: "auth/kubernetes"
  webhook:
    failurePolicy: Ignore
    matchPolicy: Exact
    objectSelector: |
      matchExpressions:
      - key: app.kubernetes.io/name
        operator: NotIn
        values:
        - {{ template "vault.name" . }}-agent-injector
  certs:
    secretName: vault-lab.company.com-cert
    certName: tls.crt
    keyName: tls.key
server:
  enabled: true
  image:
    repository: "hashicorp/vault"
    tag: "latest"
    pullPolicy: IfNotPresent
  extraEnvironmentVars:
    GOOGLE_APPLICATION_CREDENTIALS: /vault/userconfig/vault-gcs/service-account.json
    GOOGLE_REGION: europe-west2
    GOOGLE_PROJECT: sandbox-vault-lab
  volumes:
    - name: vault-gcs
      secret:
        secretName: vault-gcs
    - name: vault-lab-cert
      secret:
        secretName: vault-lab.company.com-cert
  volumeMounts:
    - name: vault-gcs
      mountPath: /vault/userconfig/vault-gcs
      readOnly: true
    - name: vault-lab-cert
      mountPath: /etc/tls
      readOnly: true
  service:
    enabled: true
    type: NodePort
    externalTrafficPolicy: Cluster
    port: 8200
    targetPort: 8200
    annotations:
      cloud.google.com/app-protocols: '{"http":"HTTPS"}'
      beta.cloud.google.com/backend-config: '{"ports": {"http":"config-default"}}'
  ha:
    enabled: true
    replicas: 3
    config: |
      listener "tcp" {
        tls_disable = 0
        tls_min_version = "tls12"
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }
      storage "gcs" {
        bucket = "vault-lab-bucket"
        ha_enabled = "true"
      }
      service_registration "kubernetes" {}
      # Example configuration for using auto-unseal, using Google Cloud KMS. The
      # GKMS keys must already exist, and the cluster must have a service account
      # that is authorized to access GCP KMS.
      seal "gcpckms" {
        project = "sandbox-vault-lab"
        region = "global"
        key_ring = "vault-helm-unseal-kr"
        crypto_key = "vault-helm-unseal-key"
      }
Something here must be misconfigured, but what, I'm unsure.
Any help would be appreciated.
EDIT:
Even after configuring Raft, I still encounter the same issue:
raft:
  enabled: true
  setNodeId: false
  config: |
    ui = false
    listener "tcp" {
      # tls_disable = 0
      address = "[::]:8200"
      cluster_address = "[::]:8201"
      tls_cert_file = "/etc/tls/tls.crt"
      tls_key_file = "/etc/tls/tls.key"
    }
    #storage "raft" {
    #  path = "/vault/data"
    #}
    storage "gcs" {
      bucket = "vault-lab-bucket"
      ha_enabled = "true"
    }
    service_registration "kubernetes" {}

UI 404 - Vault Kubernetes

I'm testing out Vault in Kubernetes and am installing it via the Helm chart. I've created an overrides file; it's an amalgamation of a few different pages from the official docs.
The pods come up OK and reach Ready status, and I can unseal Vault manually using 3 of the generated keys. However, I'm getting a 404 when browsing the UI, which is exposed externally on a load balancer in AKS. Here's my config:
global:
  enabled: true
  tlsDisable: false
injector:
  enabled: false
server:
  readinessProbe:
    enabled: true
    path: "/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204"
  # livenessProbe:
  #   enabled: true
  #   path: "/v1/sys/health?standbyok=true"
  #   initialDelaySeconds: 60
  extraEnvironmentVars:
    VAULT_CACERT: /vault/userconfig/vault-server-tls/vault.ca
  extraVolumes:
    - type: secret
      name: vault-server-tls # Matches the ${SECRET_NAME} from above
  standalone:
    enabled: true
    config: |
      listener "tcp" {
        address = "[::]:8200"
        cluster_address = "[::]:8201"
        tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
        tls_key_file = "/vault/userconfig/vault-server-tls/vault.key"
        tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"
      }
      storage "file" {
        path = "/vault/data"
      }
# Vault UI
ui:
  enabled: true
  serviceType: "LoadBalancer"
  serviceNodePort: null
  externalPort: 443
  # For Added Security, edit the below
  # loadBalancerSourceRanges:
  #   - 5.69.25.6/32
I'm still trying to get to grips with Vault. My liveness probe is commented out because it was permanently failing and causing the pod to be rescheduled, even though the Vault service itself appeared healthy and was just waiting to be unsealed. That's a side issue compared to the UI, but I'm mentioning it in case the failing liveness probe is related.
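If it matters, the probe config I'd try next (untested, just a sketch) re-enables the liveness probe with the same sealed/uninitialized status-code overrides as the readiness path above, so a sealed-but-otherwise-healthy Vault isn't killed:

server:
  livenessProbe:
    enabled: true
    # Same health endpoint as the readiness probe; returns 204 instead of 5xx
    # while Vault is sealed or uninitialized, so the pod isn't restarted
    # before it can be unsealed.
    path: "/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204"
    initialDelaySeconds: 60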
Thanks!
So, I don't think the documentation around deploying to Kubernetes with Helm is particularly clear, but I was basically missing a ui = true flag in the HCL config stanza. Note that this is in addition to the value passed to the Helm chart:
# Vault UI
ui:
  enabled: true
  serviceType: "LoadBalancer"
  serviceNodePort: null
  externalPort: 443
Which I had mistakenly assumed was enough to enable the UI.
Here's the config now, with working UI:
global:
  enabled: true
  tlsDisable: false
injector:
  enabled: false
server:
  readinessProbe:
    enabled: true
    path: "/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204"
  extraEnvironmentVars:
    VAULT_CACERT: /vault/userconfig/vault-server-tls/vault.ca
  extraVolumes:
    - type: secret
      name: vault-server-tls # Matches the ${SECRET_NAME} from above
  standalone:
    enabled: true
    config: |
      ui = true
      listener "tcp" {
        address = "[::]:8200"
        cluster_address = "[::]:8201"
        tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
        tls_key_file = "/vault/userconfig/vault-server-tls/vault.key"
        tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"
      }
      storage "file" {
        path = "/vault/data"
      }
# Vault UI
ui:
  enabled: true
  serviceType: "LoadBalancer"
  serviceNodePort: null
  externalPort: 443

Vault is already initialized error message

I deployed the following Helm chart for Vault, and I get the error "Vault is already initialized" when running the vault operator init command. I don't understand why it is already initialized.
Also, when I enable the readinessProbe, the pod keeps restarting, I assume because it is not initialized properly.
global:
  enabled: true
  tlsDisable: false
server:
  extraEnvironmentVars:
    VAULT_CACERT: /vault/userconfig/vault-server-tls/ca.crt
  logLevel: debug
  logFormat: standard
  readinessProbe:
    enabled: false
  authDelegator:
    enabled: true
  extraVolumes:
    - type: secret
      name: vault-server-tls # Matches the ${SECRET_NAME} from above
  standalone:
    enabled: true
    config: |
      listener "tcp" {
        address = "[::]:8200"
        cluster_address = "[::]:8201"
        tls_cert_file = "/vault/userconfig/vault-server-tls/tls.crt"
        tls_key_file = "/vault/userconfig/vault-server-tls/tls.key"
      }
      storage "file" {
        path = "/vault/data"
      }

How to convert yaml configmap file to terraform

I am trying to integrate Kubewatch into a Kubernetes cluster. The cluster was built using Terraform's kubernetes provider. How do I convert the data section of this ConfigMap YAML file to Terraform?
YAML
apiVersion: v1
kind: ConfigMap
metadata:
  name: kubewatch
data:
  .kubewatch.yaml: |
    namespace: "default"
    handler:
      slack:
        token: xoxb-OUR-BOT-TOKEN
        channel: kubernetes-events
    resource:
      deployment: true
      replicationcontroller: false
      replicaset: false
      daemonset: false
      services: true
      pod: true
      secret: true
      configmap: false
While I haven't done very complex config maps, this should get you pretty close.
resource "kubernetes_config_map" "example" {
metadata {
name = "kubewatch"
}
data {
namespace = "default"
handler {
slack {
token = "xoxb-OUR-BOT-TOKEN"
channel = "kubernetes-events"
}
}
resource {
deployment = true
replicationcontroller = false
replicaset = false
daemonset = false
services = true
pod = true
secret = true
configmap = false
}
api_host = "myhost:443"
db_host = "dbhost:5432"
}
}

Bad latency in GKE between Pods

We are seeing very strange behavior, with unacceptably high latency for communication within a Kubernetes cluster (GKE).
The latency jumps between 600 ms and 1 s for an endpoint that performs a Memorystore get/store and a CloudSQL query. The same setup running locally in our dev environment (although without k8s) does not show this kind of latency.
About our architecture:
We are running a k8s cluster on GKE, created with Terraform and service/deployment (YAML) files (I added those below).
We're running two Node.js APIs (Koa 2.5). One API is exposed to the public with an ingress and connects via a NodePort to the API pod.
The other API pod is reachable privately through an internal load balancer from Google. This API is connected to all the resources we need (CloudSQL, Cloud Storage).
Both APIs are also connected to a Memorystore (Redis) instance.
The communication between those pods is secured with self-signed server/client certificates (which isn't the problem; we already removed them temporarily to test).
We checked the logs and saw that a request from the public API to the private one takes about 200 ms just to reach it.
Also, the response back to the public API took about 600 ms (measured from the point where the private API's business logic finished until we received the response at the public API).
We're really out of things to try... We already connected all the Google Cloud resources to our local environment, which didn't show this kind of bad latency.
In a completely local setup the latency is only about 1/5 to 1/10 of what we see in the cloud setup.
We also tried to ping the private pod from the public one, which was in the 0.100 ms range.
Do you have any ideas where we can investigate further?
Here is the Terraform script for our Google Cloud setup:
// Configure the Google Cloud provider
provider "google" {
  project = "${var.project}"
  region  = "${var.region}"
}

data "google_compute_zones" "available" {}

# Ensuring relevant service APIs are enabled in your project. Alternatively visit and enable the needed services
resource "google_project_service" "serviceapi" {
  service            = "serviceusage.googleapis.com"
  disable_on_destroy = false
}

resource "google_project_service" "sqlapi" {
  service            = "sqladmin.googleapis.com"
  disable_on_destroy = false
  depends_on         = ["google_project_service.serviceapi"]
}

resource "google_project_service" "redisapi" {
  service            = "redis.googleapis.com"
  disable_on_destroy = false
  depends_on         = ["google_project_service.serviceapi"]
}

# Create a VPC and a subnetwork in our region
resource "google_compute_network" "appnetwork" {
  name                    = "${var.environment}-vpn"
  auto_create_subnetworks = "false"
}

resource "google_compute_subnetwork" "network-with-private-secondary-ip-ranges" {
  name          = "${var.environment}-vpn-subnet"
  ip_cidr_range = "10.2.0.0/16"
  region        = "europe-west1"
  network       = "${google_compute_network.appnetwork.self_link}"

  secondary_ip_range {
    range_name    = "kubernetes-secondary-range-pods"
    ip_cidr_range = "10.60.0.0/16"
  }

  secondary_ip_range {
    range_name    = "kubernetes-secondary-range-services"
    ip_cidr_range = "10.70.0.0/16"
  }
}

# GKE cluster setup
resource "google_container_cluster" "primary" {
  name               = "${var.environment}-cluster"
  zone               = "${data.google_compute_zones.available.names[1]}"
  initial_node_count = 1
  description        = "Kubernetes Cluster"
  network            = "${google_compute_network.appnetwork.self_link}"
  subnetwork         = "${google_compute_subnetwork.network-with-private-secondary-ip-ranges.self_link}"
  depends_on         = ["google_project_service.serviceapi"]

  additional_zones = [
    "${data.google_compute_zones.available.names[0]}",
    "${data.google_compute_zones.available.names[2]}",
  ]

  master_auth {
    username = "xxxxxxx"
    password = "xxxxxxx"
  }

  ip_allocation_policy {
    cluster_secondary_range_name  = "kubernetes-secondary-range-pods"
    services_secondary_range_name = "kubernetes-secondary-range-services"
  }

  node_config {
    oauth_scopes = [
      "https://www.googleapis.com/auth/compute",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
      "https://www.googleapis.com/auth/trace.append"
    ]

    tags = ["kubernetes", "${var.environment}"]
  }
}

##################
# MySQL DATABASES
##################
resource "google_sql_database_instance" "core" {
  name             = "${var.environment}-sql-core"
  database_version = "MYSQL_5_7"
  region           = "${var.region}"
  depends_on       = ["google_project_service.sqlapi"]

  settings {
    # Second-generation instance tiers are based on the machine
    # type. See argument reference below.
    tier = "db-n1-standard-1"
  }
}

resource "google_sql_database_instance" "tenant1" {
  name             = "${var.environment}-sql-tenant1"
  database_version = "MYSQL_5_7"
  region           = "${var.region}"
  depends_on       = ["google_project_service.sqlapi"]

  settings {
    # Second-generation instance tiers are based on the machine
    # type. See argument reference below.
    tier = "db-n1-standard-1"
  }
}

resource "google_sql_database_instance" "tenant2" {
  name             = "${var.environment}-sql-tenant2"
  database_version = "MYSQL_5_7"
  region           = "${var.region}"
  depends_on       = ["google_project_service.sqlapi"]

  settings {
    # Second-generation instance tiers are based on the machine
    # type. See argument reference below.
    tier = "db-n1-standard-1"
  }
}

resource "google_sql_database" "core" {
  name     = "project_core"
  instance = "${google_sql_database_instance.core.name}"
}

resource "google_sql_database" "tenant1" {
  name     = "project_tenant_1"
  instance = "${google_sql_database_instance.tenant1.name}"
}

resource "google_sql_database" "tenant2" {
  name     = "project_tenant_2"
  instance = "${google_sql_database_instance.tenant2.name}"
}

##################
# MySQL USERS
##################
resource "google_sql_user" "core-user" {
  name     = "${var.sqluser}"
  instance = "${google_sql_database_instance.core.name}"
  host     = "cloudsqlproxy~%"
  password = "${var.sqlpassword}"
}

resource "google_sql_user" "tenant1-user" {
  name     = "${var.sqluser}"
  instance = "${google_sql_database_instance.tenant1.name}"
  host     = "cloudsqlproxy~%"
  password = "${var.sqlpassword}"
}

resource "google_sql_user" "tenant2-user" {
  name     = "${var.sqluser}"
  instance = "${google_sql_database_instance.tenant2.name}"
  host     = "cloudsqlproxy~%"
  password = "${var.sqlpassword}"
}

##################
# REDIS
##################
resource "google_redis_instance" "redis" {
  name               = "${var.environment}-redis"
  tier               = "BASIC"
  memory_size_gb     = 1
  depends_on         = ["google_project_service.redisapi"]
  authorized_network = "${google_compute_network.appnetwork.self_link}"
  region             = "${var.region}"
  location_id        = "${data.google_compute_zones.available.names[0]}"
  redis_version      = "REDIS_3_2"
  display_name       = "Redis Instance"
}

# The following outputs allow authentication and connectivity to the GKE Cluster.
output "client_certificate" {
  value = "${google_container_cluster.primary.master_auth.0.client_certificate}"
}

output "client_key" {
  value = "${google_container_cluster.primary.master_auth.0.client_key}"
}

output "cluster_ca_certificate" {
  value = "${google_container_cluster.primary.master_auth.0.cluster_ca_certificate}"
}
The service and deployment of the private API
# START CRUD POD
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: crud-pod
  labels:
    app: crud
spec:
  template:
    metadata:
      labels:
        app: crud
    spec:
      containers:
        - name: crud
          image: eu.gcr.io/dev-xxxxx/crud:latest-unstable
          ports:
            - containerPort: 3333
          env:
            - name: NODE_ENV
              value: develop
          volumeMounts:
            - [..MountedConfigFiles..]
        # [START proxy_container]
        - name: cloudsql-proxy
          image: gcr.io/cloudsql-docker/gce-proxy:1.11
          command: ["/cloud_sql_proxy",
                    "-instances=dev-xxxx:europe-west1:dev-sql-core=tcp:3306,dev-xxxx:europe-west1:dev-sql-tenant1=tcp:3307,dev-xxxx:europe-west1:dev-sql-tenant2=tcp:3308",
                    "-credential_file=xxxx"]
          volumeMounts:
            - name: cloudsql-instance-credentials
              mountPath: /secrets/cloudsql
              readOnly: true
        # [END proxy_container]
      # [START volumes]
      volumes:
        - name: cloudsql-instance-credentials
          secret:
            secretName: cloudsql-instance-credentials
        - [..ConfigFilesVolumes..]
      # [END volumes]
# END CRUD POD
-------
# START CRUD SERVICE
apiVersion: v1
kind: Service
metadata:
  name: crud
  annotations:
    cloud.google.com/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  loadBalancerSourceRanges:
    - 10.60.0.0/16
  ports:
    - name: crud-port
      port: 3333
      protocol: TCP # default; can also specify UDP
  selector:
    app: crud # label selector for Pods to target
# END CRUD SERVICE
And the public one (including ingress)
# START SAPI POD
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: sapi-pod
  labels:
    app: sapi
spec:
  template:
    metadata:
      labels:
        app: sapi
    spec:
      containers:
        - name: sapi
          image: eu.gcr.io/dev-xxx/sapi:latest-unstable
          ports:
            - containerPort: 8080
          env:
            - name: NODE_ENV
              value: develop
          volumeMounts:
            - [..MountedConfigFiles..]
      volumes:
        - [..ConfigFilesVolumes..]
# END SAPI POD
-------------
# START SAPI SERVICE
kind: Service
apiVersion: v1
metadata:
  name: sapi # Service name
spec:
  selector:
    app: sapi
  ports:
    - port: 8080
      targetPort: 8080
  type: NodePort
# END SAPI SERVICE
--------------
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: dev-ingress
  annotations:
    kubernetes.io/ingress.global-static-ip-name: api-dev-static-ip
  labels:
    app: sapi-ingress
spec:
  backend:
    serviceName: sapi
    servicePort: 8080
  tls:
    - hosts:
        - xxxxx
      secretName: xxxxx
We fixed the issue by removing the @google-cloud/logging-winston transport from our log transports.
For some reason it was blocking our traffic, which caused the high latency.