UI 404 - Vault Kubernetes

I'm testing out Vault in Kubernetes and am installing it via the Helm chart. I've created an overrides file that's an amalgamation of a few different pages from the official docs.
The pods come up OK and reach Ready status, and I can unseal Vault manually using 3 of the generated keys. However, I'm getting a 404 when browsing the UI, which is exposed externally via a LoadBalancer in AKS. Here's my config:
global:
  enabled: true
  tlsDisable: false

injector:
  enabled: false

server:
  readinessProbe:
    enabled: true
    path: "/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204"
  # livenessProbe:
  #   enabled: true
  #   path: "/v1/sys/health?standbyok=true"
  #   initialDelaySeconds: 60
  extraEnvironmentVars:
    VAULT_CACERT: /vault/userconfig/vault-server-tls/vault.ca
  extraVolumes:
    - type: secret
      name: vault-server-tls # Matches the ${SECRET_NAME} from above
  standalone:
    enabled: true
    config: |
      listener "tcp" {
        address = "[::]:8200"
        cluster_address = "[::]:8201"
        tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
        tls_key_file = "/vault/userconfig/vault-server-tls/vault.key"
        tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"
      }
      storage "file" {
        path = "/vault/data"
      }

# Vault UI
ui:
  enabled: true
  serviceType: "LoadBalancer"
  serviceNodePort: null
  externalPort: 443
  # For Added Security, edit the below
  # loadBalancerSourceRanges:
  #   - 5.69.25.6/32
I'm still trying to get to grips with Vault. My liveness probe is commented out because it was permanently failing and causing the pod to be rescheduled, even though the Vault service status showed it was healthy and awaiting an unseal. That's a side issue compared to the UI, but I'm mentioning it in case the failing liveness probe is related.
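Side note on that probe: my working assumption (untested) is that it fails because a sealed Vault returns a non-2xx status on /v1/sys/health by default, so giving the liveness probe the same sealedcode/uninitcode overrides as the readiness probe would presumably stop the restarts, e.g.:
server:
  livenessProbe:
    enabled: true
    path: "/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204"
    initialDelaySeconds: 60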
Thanks!

So, I don't think the documentation around deploying to Kubernetes with Helm is all that clear, but I was basically missing a ui = true flag in the HCL config stanza. Note that this is in addition to the value passed to the Helm chart:
# Vault UI
ui:
  enabled: true
  serviceType: "LoadBalancer"
  serviceNodePort: null
  externalPort: 443
Which I had mistakenly assumed was enough to enable the UI.
Here's the config now, with working UI:
global:
  enabled: true
  tlsDisable: false

injector:
  enabled: false

server:
  readinessProbe:
    enabled: true
    path: "/v1/sys/health?standbyok=true&sealedcode=204&uninitcode=204"
  extraEnvironmentVars:
    VAULT_CACERT: /vault/userconfig/vault-server-tls/vault.ca
  extraVolumes:
    - type: secret
      name: vault-server-tls # Matches the ${SECRET_NAME} from above
  standalone:
    enabled: true
    config: |
      ui = true
      listener "tcp" {
        address = "[::]:8200"
        cluster_address = "[::]:8201"
        tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
        tls_key_file = "/vault/userconfig/vault-server-tls/vault.key"
        tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"
      }
      storage "file" {
        path = "/vault/data"
      }

# Vault UI
ui:
  enabled: true
  serviceType: "LoadBalancer"
  serviceNodePort: null
  externalPort: 443
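To double-check the UI end to end after applying this, it's worth grabbing the external IP of the UI service and hitting the /ui/ path directly. The service name below assumes the chart's default naming (vault-ui) for a release called vault, so adjust for your release and namespace:
kubectl get svc vault-ui
curl -k https://<EXTERNAL-IP>/ui/    # should return the UI's HTML rather than a 404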

Related

Vault Helm Chart not using Config from values.yaml

I'm trying to install Hashicorp Vault with the official Helm chart from Hashicorp, deploying it through the Argo CD UI. I have a git repo with a values.yaml file that specifies some non-default config (for example, HA mode and AWS KMS unseal). When I set up the chart via the Argo CD web UI, I can point it at the values.yaml file and see the values I set in the parameters section of the app. However, when I deploy the chart, the config doesn't get applied. I checked the ConfigMap created by the chart, and it seems to follow the defaults despite my overrides. I'm thinking perhaps I'm using Argo CD wrong, as I'm fairly new to it, although it very clearly shows the overrides from my values.yaml in the app's parameters.
Here is the relevant section of my values.yaml:
server:
  extraSecretEnvironmentVars:
    - envName: AWS_SECRET_ACCESS_KEY
      secretName: vault
      secretKey: AWS_SECRET_ACCESS_KEY
    - envName: AWS_ACCESS_KEY_ID
      secretName: vault
      secretKey: AWS_ACCESS_KEY_ID
    - envName: AWS_KMS_KEY_ID
      secretName: vault
      secretKey: AWS_KMS_KEY_ID
  ha:
    enabled: true
    replicas: 3
    apiAddr: https://myvault.com:8200
    clusterAddr: https://myvault.com:8201
    raft:
      enabled: true
      setNodeId: false
  config: |
    ui = true
    listener "tcp" {
      tls_disable = 1
      address = "[::]:8200"
      cluster_address = "[::]:8201"
    }
    storage "raft" {
      path = "/vault/data"
    }
    service_registration "kubernetes" {}
    seal "awskms" {
      region = "us-west-2"
      kms_key_id = "$VAULT_KMS_KEY_ID"
    }
However, the deployed config looks like this:
disable_mlock = true
ui = true

listener "tcp" {
  tls_disable = 1
  address = "[::]:8200"
  cluster_address = "[::]:8201"
  # Enable unauthenticated metrics access (necessary for Prometheus Operator)
  #telemetry {
  #  unauthenticated_metrics_access = "true"
  #}
}

storage "file" {
  path = "/vault/data"
}

# Example configuration for using auto-unseal, using Google Cloud KMS. The
# GKMS keys must already exist, and the cluster must have a service account
# that is authorized to access GCP KMS.
#seal "gcpckms" {
#  project    = "vault-helm-dev"
#  region     = "global"
#  key_ring   = "vault-helm-unseal-kr"
#  crypto_key = "vault-helm-unseal-key"
#}

# Example configuration for enabling Prometheus metrics in your config.
#telemetry {
#  prometheus_retention_time = "30s",
#  disable_hostname = true
#}
I've tried several changes to this config, such as setting the AWS_KMS_UNSEAL environment variable, which doesn't seem to get applied. I've also exec'd into the containers, and none of my environment variables appear to be set when I run printenv. I can't figure out why it's deploying the pods with the default config.
With the help of murtiko I figured this out. The indentation of my config block was off: it needs to be nested under the ha block. My working config looks like this:
global:
  enabled: true

server:
  extraSecretEnvironmentVars:
    - envName: AWS_REGION
      secretName: vault
      secretKey: AWS_REGION
    - envName: AWS_ACCESS_KEY_ID
      secretName: vault
      secretKey: AWS_ACCESS_KEY_ID
    - envName: AWS_SECRET_ACCESS_KEY
      secretName: vault
      secretKey: AWS_SECRET_ACCESS_KEY
    - envName: VAULT_AWSKMS_SEAL_KEY_ID
      secretName: vault
      secretKey: VAULT_AWSKMS_SEAL_KEY_ID
  ha:
    enabled: true
    config: |
      ui = true
      listener "tcp" {
        tls_disable = 1
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }
      seal "awskms" {
      }
      storage "raft" {
        path = "/vault/data"
      }
    raft:
      enabled: true
      setNodeId: true
      config: |
        ui = true
        listener "tcp" {
          tls_disable = 1
          address = "[::]:8200"
          cluster_address = "[::]:8201"
        }
        seal "awskms" {
        }
        storage "raft" {
          path = "/vault/data"
        }

Prometheus/Alertmanager - many alerts firing - Why?

I have a 4-node K8s cluster set up via kubeadm on local VMs. I am using the following:
Kubernetes 1.24
Helm 3.10.0
kube-prometheus-stack Helm chart 41.7.4 (app version 0.60.1)
When I go into either Prometheus or Alertmanager, there are many alerts that are always firing. Another thing to note is that Alertmanager "cluster status" is reporting as "disabled". Not sure what bearing (if any) that may have on this. I have not added any new alerts of my own - everything was presumably deployed with the Helm chart.
I do not understand why these alerts are firing, other than what I can glean from their names. It does not seem like a good thing that these alerts should be firing: either there is something seriously wrong with the cluster, or something is poorly configured in the alerting setup of the Helm chart. I'm leaning toward the second case, but will admit I really don't know.
Here is a listing of the firing alerts, along with label info:
etcdMembersDown
alertname=etcdMembersDown, job=kube-etcd, namespace=kube-system, pod=etcd-gagnon-m1, service=prometheus-stack-kube-prom-kube-etcd, severity=critical
etcdInsufficientMembers
alertname=etcdInsufficientMembers, endpoint=http-metrics, job=kube-etcd, namespace=kube-system, pod=etcd-gagnon-m1, service=prometheus-stack-kube-prom-kube-etcd, severity=critical
TargetDown
alertname=TargetDown, job=kube-scheduler, namespace=kube-system, service=prometheus-stack-kube-prom-kube-scheduler, severity=warning
alertname=TargetDown, job=kube-etcd, namespace=kube-system, service=prometheus-stack-kube-prom-kube-etcd, severity=warning
alertname=TargetDown, job=kube-proxy, namespace=kube-system, service=prometheus-stack-kube-prom-kube-proxy, severity=warning
alertname=TargetDown, job=kube-controller-manager, namespace=kube-system, service=prometheus-stack-kube-prom-kube-controller-manager, severity=warning
KubePodNotReady
alertname=KubePodNotReady, namespace=monitoring, pod=prometheus-stack-grafana-759774797c-r44sb, severity=warning
KubeDeploymentReplicasMismatch
alertname=KubeDeploymentReplicasMismatch, container=kube-state-metrics, deployment=prometheus-stack-grafana, endpoint=http, instance=192.168.42.19:8080, job=kube-state-metrics, namespace=monitoring, pod=prometheus-stack-kube-state-metrics-848f74474d-gp6pw, service=prometheus-stack-kube-state-metrics, severity=warning
KubeControllerManagerDown
alertname=KubeControllerManagerDown, severity=critical
KubeProxyDown
alertname=KubeProxyDown, severity=critical
KubeSchedulerDown
alertname=KubeSchedulerDown, severity=critical
Here is my values.yaml:
defaultRules:
  create: true
  rules:
    alertmanager: true
    etcd: true
    configReloaders: true
    general: true
    k8s: true
    kubeApiserverAvailability: true
    kubeApiserverBurnrate: true
    kubeApiserverHistogram: true
    kubeApiserverSlos: true
    kubeControllerManager: true
    kubelet: true
    kubeProxy: true
    kubePrometheusGeneral: true
    kubePrometheusNodeRecording: true
    kubernetesApps: true
    kubernetesResources: true
    kubernetesStorage: true
    kubernetesSystem: true
    kubeSchedulerAlerting: true
    kubeSchedulerRecording: true
    kubeStateMetrics: true
    network: true
    node: true
    nodeExporterAlerting: true
    nodeExporterRecording: true
    prometheus: true
    prometheusOperator: true

prometheus:
  enabled: true
  ingress:
    enabled: true
    ingressClassName: nginx
    hosts:
      - prometheus.<hidden>
    paths:
      - /
    pathType: ImplementationSpecific

grafana:
  enabled: true
  ingress:
    enabled: true
    ingressClassName: nginx
    hosts:
      - grafana.<hidden>
    path: /
  persistence:
    enabled: true
    size: 10Gi

alertmanager:
  enabled: true
  ingress:
    enabled: true
    ingressClassName: nginx
    hosts:
      - alerts.<hidden>
    paths:
      - /
    pathType: ImplementationSpecific
  config:
    global:
      slack_api_url: '<hidden>'
    route:
      receiver: "slack-default"
      group_by:
        - alertname
        - cluster
        - service
      group_wait: 30s
      group_interval: 5m # 5m
      repeat_interval: 2h # 4h
      routes:
        - receiver: "slack-warn-critical"
          matchers:
            - severity =~ "warning|critical"
          continue: true
    receivers:
      - name: "null"
      - name: "slack-default"
        slack_configs:
          - send_resolved: true # false
            channel: "#alerts-test"
      - name: "slack-warn-critical"
        slack_configs:
          - send_resolved: true # false
            channel: "#alerts-test"

kubeControllerManager:
  service:
    enabled: true
    ports:
      http: 10257
    targetPorts:
      http: 10257
  serviceMonitor:
    https: true
    insecureSkipVerify: "true"

kubeEtcd:
  serviceMonitor:
    scheme: https
    servername: <do I need it - don't know what this should be>
    cafile: <do I need it - don't know what this should be>
    certFile: <do I need it - don't know what this should be>
    keyFile: <do I need it - don't know what this should be>

kubeProxy:
  serviceMonitor:
    https: true

kubeScheduler:
  service:
    enabled: true
    ports:
      http: 10259
    targetPorts:
      http: 10259
  serviceMonitor:
    https: true
    insecureSkipVerify: "true"
Is there something wrong with this configuration? Are there any Kubernetes objects that might be missing or misconfigured? It seems very odd that one could install this Helm chart and experience this many "failures". Is there, perhaps, a major problem with my cluster? I would think that if there were really something wrong with etcd, kube-scheduler, or kube-proxy, I would be experiencing problems everywhere, but I am not.
If there is any other information I can pull from the cluster or related artifacts that might help, let me know and I will include them.
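For reference, these are the kinds of checks I can run to see whether Prometheus actually has endpoints to scrape for those components (the service names are taken from the alert labels above):
kubectl get endpoints -n kube-system prometheus-stack-kube-prom-kube-scheduler prometheus-stack-kube-prom-kube-etcd
kubectl get endpoints -n kube-system prometheus-stack-kube-prom-kube-proxy prometheus-stack-kube-prom-kube-controller-manager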

Deploying HA Vault To GKE - dial tcp 127.0.0.1:8200: connect: connection refused

As per the official documentation (https://developer.hashicorp.com/vault/tutorials/kubernetes/kubernetes-google-cloud-gke), the following works as expected:
helm install vault hashicorp/vault \
--set='server.ha.enabled=true' \
--set='server.ha.raft.enabled=true'
I can then run:
kubectl exec vault-0 -- vault status
And this works perfectly fine.
However, I've noticed that if I don't have Raft enabled, I get the dial tcp 127.0.0.1:8200: connect: connection refused error message:
helm install vault hashicorp/vault \
--set='server.ha.enabled=true'
I'm trying to work out why my own Vault deployment is giving me the same issue.
I'm deploying Vault into GKE with auto-unseal and a Google Cloud Storage backend configured.
My values.yaml file contains:
global:
  enabled: true
  tlsDisable: false

injector:
  enabled: true
  replicas: 1
  port: 8080
  leaderElector:
    enabled: true
  image:
    repository: "hashicorp/vault-k8s"
    tag: "latest"
    pullPolicy: IfNotPresent
  agentImage:
    repository: "hashicorp/vault"
    tag: "latest"
  authPath: "auth/kubernetes"
  webhook:
    failurePolicy: Ignore
    matchPolicy: Exact
    objectSelector: |
      matchExpressions:
      - key: app.kubernetes.io/name
        operator: NotIn
        values:
        - {{ template "vault.name" . }}-agent-injector
  certs:
    secretName: vault-lab.company.com-cert
    certName: tls.crt
    keyName: tls.key

server:
  enabled: true
  image:
    repository: "hashicorp/vault"
    tag: "latest"
    pullPolicy: IfNotPresent
  extraEnvironmentVars:
    GOOGLE_APPLICATION_CREDENTIALS: /vault/userconfig/vault-gcs/service-account.json
    GOOGLE_REGION: europe-west2
    GOOGLE_PROJECT: sandbox-vault-lab
  volumes:
    - name: vault-gcs
      secret:
        secretName: vault-gcs
    - name: vault-lab-cert
      secret:
        secretName: vault-lab.company.com-cert
  volumeMounts:
    - name: vault-gcs
      mountPath: /vault/userconfig/vault-gcs
      readOnly: true
    - name: vault-lab-cert
      mountPath: /etc/tls
      readOnly: true
  service:
    enabled: true
    type: NodePort
    externalTrafficPolicy: Cluster
    port: 8200
    targetPort: 8200
    annotations:
      cloud.google.com/app-protocols: '{"http":"HTTPS"}'
      beta.cloud.google.com/backend-config: '{"ports": {"http":"config-default"}}'
  ha:
    enabled: true
    replicas: 3
    config: |
      listener "tcp" {
        tls_disable = 0
        tls_min_version = "tls12"
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }
      storage "gcs" {
        bucket = "vault-lab-bucket"
        ha_enabled = "true"
      }
      service_registration "kubernetes" {}

      # Example configuration for using auto-unseal, using Google Cloud KMS. The
      # GKMS keys must already exist, and the cluster must have a service account
      # that is authorized to access GCP KMS.
      seal "gcpckms" {
        project    = "sandbox-vault-lab"
        region     = "global"
        key_ring   = "vault-helm-unseal-kr"
        crypto_key = "vault-helm-unseal-key"
      }
Something here must be misconfigured, but I'm unsure what.
Any help would be appreciated.
EDIT:
Even after configuring Raft, I still encounter the same issue:
raft:
  enabled: true
  setNodeId: false
  config: |
    ui = false
    listener "tcp" {
      # tls_disable = 0
      address = "[::]:8200"
      cluster_address = "[::]:8201"
      tls_cert_file = "/etc/tls/tls.crt"
      tls_key_file = "/etc/tls/tls.key"
    }
    #storage "raft" {
    #  path = "/vault/data"
    #}
    storage "gcs" {
      bucket = "vault-lab-bucket"
      ha_enabled = "true"
    }
    service_registration "kubernetes" {}
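In case it helps with diagnosing this, the first things I check when I hit the connection-refused error are the server logs and the pod events, to see whether the listener ever came up at all:
kubectl logs vault-0
kubectl describe pod vault-0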

Hashicorp Vault error - Client sent an HTTP request to an HTTPS server

Having trouble deploying Hashicorp Vault on Kubernetes with Helm. I can't get Vault to work at all. I've tried changing almost every parameter I could and still can't get it working, and I don't know where the issue lies exactly.
The error I get is mainly Error Checking Seal status / Client sent an HTTP request to an HTTPS server.
If I set tls_disable = true inside .Values.ha.config, then I get an error that Vault is sealed, but I still can't view the UI. Deploying Vault has felt very hit-and-miss: it sometimes works and sometimes doesn't, and I can't pin down where the bug lies either. This has been a headache.
Here is my values.yaml file:
server:
  enabled: true
  ingress:
    enabled: true
    annotations:
      cert.<issuer>.cloud/issuer: <intermediate-hostname>
      cert.<issuer>.cloud/secretname: vault-server-tls
      cert.<issuer>.cloud/purpose: managed
      dns.<issuer>.cloud/class: <class>
      dns.<issuer>.cloud/dnsnames: "<hostname>"
      dns.<issuer>.cloud/ttl: "600"
    hosts:
      - host: "vault.<hostname>"
        paths: []
    tls:
      - secretName: vault-server-tls
        hosts:
          - vault.<hostname>
  extraVolumes:
    - type: secret
      name: vault-server-tls
  service:
    enabled: true
    port: 8200
    targetPort: 443
  ha:
    enabled: true
    replicas: 3
    raft:
      enabled: true
      config: |
        ui = true
        listener "tcp" {
          tls_disable = false
          address = "[::]:8200"
          cluster_address = "[::]:8201"
          tls_cert_file = "/vault/userconfig/vault-server-tls/tls.crt"
          tls_key_file = "/vault/userconfig/vault-server-tls/tls.key"
          tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"
        }
        storage "raft" {
          path = "/vault/data"
        }
    config: |
      ui = true
      listener "tcp" {
        tls_disable = false
        address = "[::]:443"
        cluster_address = "[::]:8201"
        tls_cert_file = "/vault/userconfig/vault-server-tls/tls.crt"
        tls_key_file = "/vault/userconfig/vault-server-tls/tls.key"
        tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"
        tls_require_and_verify_client_cert = false
      }
      storage "consul" {
        path = "vault"
        address = "HOST_IP:8500"
      }
      disable_mlock = true

ui:
  enabled: true
  serviceType: LoadBalancer
  externalPort: 443
  targetPort: 8200
EDIT: I'm now able to view the UI from the LoadBalancer, but not from the hostname set in dns.<issuer>.cloud/dnsnames: "<hostname>" under ingress.annotations.
I still get the error, but can view the UI via the LoadBalancer: Readiness probe failed. Error unsealing: Error making API request. URL: PUT http://127.0.0.1:8200/v1/sys/unseal Code: 400. Raw Message: Client sent an HTTP request to an HTTPS server.
As you mentioned, you faced the errors Error Checking Seal status / Client sent an HTTP request to an HTTPS server and vault is sealed.
Once you have deployed Vault using the Helm chart, you have to unseal it via the CLI the first time; after that, the UI will be available to use.
Reference document: https://learn.hashicorp.com/tutorials/vault/kubernetes-raft-deployment-guide?in=vault/kubernetes#initialize-and-unseal-vault
Get the list of pods:
kubectl get pods --selector='app.kubernetes.io/name=vault' --namespace='vault'
Exec into the pods to initialize and unseal:
kubectl exec --stdin=true --tty=true vault-0 -- vault operator init
kubectl exec --stdin=true --tty=true vault-0 -- vault operator unseal
Once you unseal Vault, your pods' status will change to 1/1 Ready instead of 0/1.
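By default, vault operator init creates 5 unseal key shares with a threshold of 3 (unless you override those options), so the unseal command has to be run three times with three different keys before the pod goes Ready:
kubectl exec --stdin=true --tty=true vault-0 -- vault operator unseal   # enter unseal key 1
kubectl exec --stdin=true --tty=true vault-0 -- vault operator unseal   # enter unseal key 2
kubectl exec --stdin=true --tty=true vault-0 -- vault operator unseal   # enter unseal key 3
kubectl exec vault-0 -- vault status                                    # Sealed should now be false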

Vault is already initialized error message

I deployed the following Helm chart for Vault, and I get the error "Vault is already initialized" when running the vault operator init command. I do not understand why it is already initialized.
Also, when I enable the readinessProbe, the pod keeps restarting, I assume because it is not initialized properly.
global:
  enabled: true
  tlsDisable: false

server:
  extraEnvironmentVars:
    VAULT_CACERT: /vault/userconfig/vault-server-tls/ca.crt
  logLevel: debug
  logFormat: standard
  readinessProbe:
    enabled: false
  authDelegator:
    enabled: true
  extraVolumes:
    - type: secret
      name: vault-server-tls # Matches the ${SECRET_NAME} from above
  standalone:
    enabled: true
    config: |
      listener "tcp" {
        address = "[::]:8200"
        cluster_address = "[::]:8201"
        tls_cert_file = "/vault/userconfig/vault-server-tls/tls.crt"
        tls_key_file = "/vault/userconfig/vault-server-tls/tls.key"
      }
      storage "file" {
        path = "/vault/data"
      }