I have Vault deployed from the official Helm chart, running in HA mode with auto-unseal, TLS enabled, and Raft as the storage backend, on an EKS 1.17 cluster. All of the Raft followers are joined to the vault-0 pod as the leader. I have followed this tutorial to the letter, and I always end up with a TLS bad certificate error. The exact error is: http: TLS handshake error from 123.45.6.789:52936: remote error: tls: bad certificate
I did find one issue with following the tutorial exactly: the part where they pipe the Kubernetes CA through base64. For me that output was multi-line and the deploy failed, so I piped it through tr -d '\n'. But this is where I start getting the error. I've tried the step where you launch a container and test it with curl, and it fails; tailing the agent injector logs, I see that bad certificate error.
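For reference, this is the pipeline in question plus a quick round-trip check (GNU base64 flags; vault-injector.ca is the CA file name referenced in my values.yaml below, and ca-bundle.b64 is just an illustrative output file name):

cat vault-injector.ca | base64 | tr -d '\n' > ca-bundle.b64
# decode it again and diff against the original to confirm nothing was lost when stripping newlines
base64 -d ca-bundle.b64 | diff - vault-injector.ca && echo "caBundle round-trips cleanly"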
Here is my values.yaml if it helps.
global:
  tlsDisable: false
injector:
  metrics:
    enabled: true
  certs:
    secretName: vault-tls
    caBundle: "(output of cat vault-injector.ca | base64 | tr -d '\n')"
    certName: vault.crt
    keyName: vault.key
server:
  extraEnvironmentVars:
    VAULT_CACERT: "/vault/userconfig/vault-tls/vault.ca"
  extraSecretEnvironmentVars:
    - envName: AWS_ACCESS_KEY_ID
      secretName: eks-creds
      secretKey: AWS_ACCESS_KEY_ID
    - envName: AWS_SECRET_ACCESS_KEY
      secretName: eks-creds
      secretKey: AWS_SECRET_ACCESS_KEY
    - envName: VAULT_UNSEAL_KMS_KEY_ID
      secretName: vault-kms-id
      secretKey: VAULT_UNSEAL_KMS_KEY_ID
  extraVolumes:
    - type: secret
      name: vault-tls
    - type: secret
      name: eks-creds
    - type: secret
      name: vault-kms-id
  resources:
    requests:
      memory: 256Mi
      cpu: 250m
    limits:
      memory: 512Mi
      cpu: 500m
  auditStorage:
    enabled: true
    storageClass: gp2
  standalone:
    enabled: false
  ha:
    enabled: true
    raft:
      enabled: true
      config: |
        ui = true
        api_addr = "[::]:8200"
        cluster_addr = "[::]:8201"
        listener "tcp" {
          tls_disable = 0
          tls_cert_file = "/vault/userconfig/vault-tls/vault.crt"
          tls_key_file = "/vault/userconfig/vault-tls/vault.key"
          tls_client_ca_file = "/vault/userconfig/vault-tls/vault.ca"
          tls_min_version = "tls12"
          address = "[::]:8200"
          cluster_address = "[::]:8201"
        }
        storage "raft" {
          path = "/vault/data"
        }
        disable_mlock = true
        service_registration "kubernetes" {}
        seal "awskms" {
          region = "us-east-1"
          kms_key_id = "VAULT_UNSEAL_KMS_KEY_ID"
        }
ui:
  enabled: true
I've exec'd into the agent-injector pod and poked around. I can see the certs under /etc/webhook/certs/ are there and they look correct.
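For what it's worth, these are the cross-checks I can run with openssl and kubectl to compare what the injector serves against what the webhook configuration and the vault-tls secret contain (the MutatingWebhookConfiguration name vault-agent-injector-cfg is the one the chart created in my cluster; check kubectl get mutatingwebhookconfiguration if yours differs):

# inside the injector pod: which cert is actually being served?
openssl x509 -in /etc/webhook/certs/vault.crt -noout -subject -issuer -dates

# from outside: which CA does the webhook configuration tell the API server to trust?
kubectl get mutatingwebhookconfiguration vault-agent-injector-cfg -o jsonpath='{.webhooks[0].clientConfig.caBundle}' | base64 -d | openssl x509 -noout -subject -issuer

# and which CA is in the vault-tls secret that the injector mounts?
kubectl get secret vault-tls -o jsonpath='{.data.vault\.ca}' | base64 -d | openssl x509 -noout -subject -issuer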
Here is my vault-agent-injector pod
kubectl describe pod vault-agent-injector-6bbf84484c-q8flv
Name: vault-agent-injector-6bbf84484c-q8flv
Namespace: default
Priority: 0
Node: ip-172-16-3-151.ec2.internal/172.16.3.151
Start Time: Sat, 19 Dec 2020 16:27:14 -0800
Labels: app.kubernetes.io/instance=vault
app.kubernetes.io/name=vault-agent-injector
component=webhook
pod-template-hash=6bbf84484c
Annotations: kubernetes.io/psp: eks.privileged
Status: Running
IP: 172.16.3.154
IPs:
IP: 172.16.3.154
Controlled By: ReplicaSet/vault-agent-injector-6bbf84484c
Containers:
sidecar-injector:
Container ID: docker://2201b12c9bd72b6b85d855de6917548c9410e2b982fb5651a0acd8472c3554fa
Image: hashicorp/vault-k8s:0.6.0
Image ID: docker-pullable://hashicorp/vault-k8s#sha256:5697b85bc69aa07b593fb2a8a0cd38daefb5c3e4a4b98c139acffc9cfe5041c7
Port: <none>
Host Port: <none>
Args:
agent-inject
2>&1
State: Running
Started: Sat, 19 Dec 2020 16:27:15 -0800
Ready: True
Restart Count: 0
Liveness: http-get https://:8080/health/ready delay=1s timeout=5s period=2s #success=1 #failure=2
Readiness: http-get https://:8080/health/ready delay=2s timeout=5s period=2s #success=1 #failure=2
Environment:
AGENT_INJECT_LISTEN: :8080
AGENT_INJECT_LOG_LEVEL: info
AGENT_INJECT_VAULT_ADDR: https://vault.default.svc:8200
AGENT_INJECT_VAULT_AUTH_PATH: auth/kubernetes
AGENT_INJECT_VAULT_IMAGE: vault:1.5.4
AGENT_INJECT_TLS_CERT_FILE: /etc/webhook/certs/vault.crt
AGENT_INJECT_TLS_KEY_FILE: /etc/webhook/certs/vault.key
AGENT_INJECT_LOG_FORMAT: standard
AGENT_INJECT_REVOKE_ON_SHUTDOWN: false
AGENT_INJECT_TELEMETRY_PATH: /metrics
Mounts:
/etc/webhook/certs from webhook-certs (ro)
/var/run/secrets/kubernetes.io/serviceaccount from vault-agent-injector-token-k8ltm (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
webhook-certs:
Type: Secret (a volume populated by a Secret)
SecretName: vault-tls
Optional: false
vault-agent-injector-token-k8ltm:
Type: Secret (a volume populated by a Secret)
SecretName: vault-agent-injector-token-k8ltm
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 40m default-scheduler Successfully assigned default/vault-agent-injector-6bbf84484c-q8flv to ip-172-16-3-151.ec2.internal
Normal Pulled 40m kubelet, ip-172-16-3-151.ec2.internal Container image "hashicorp/vault-k8s:0.6.0" already present on machine
Normal Created 40m kubelet, ip-172-16-3-151.ec2.internal Created container sidecar-injector
Normal Started 40m kubelet, ip-172-16-3-151.ec2.internal Started container sidecar-injector
My vault deployment
kubectl describe deployment vault
Name: vault-agent-injector
Namespace: default
CreationTimestamp: Sat, 19 Dec 2020 16:27:14 -0800
Labels: app.kubernetes.io/instance=vault
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=vault-agent-injector
component=webhook
Annotations: deployment.kubernetes.io/revision: 1
Selector: app.kubernetes.io/instance=vault,app.kubernetes.io/name=vault-agent-injector,component=webhook
Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app.kubernetes.io/instance=vault
app.kubernetes.io/name=vault-agent-injector
component=webhook
Service Account: vault-agent-injector
Containers:
sidecar-injector:
Image: hashicorp/vault-k8s:0.6.0
Port: <none>
Host Port: <none>
Args:
agent-inject
2>&1
Liveness: http-get https://:8080/health/ready delay=1s timeout=5s period=2s #success=1 #failure=2
Readiness: http-get https://:8080/health/ready delay=2s timeout=5s period=2s #success=1 #failure=2
Environment:
AGENT_INJECT_LISTEN: :8080
AGENT_INJECT_LOG_LEVEL: info
AGENT_INJECT_VAULT_ADDR: https://vault.default.svc:8200
AGENT_INJECT_VAULT_AUTH_PATH: auth/kubernetes
AGENT_INJECT_VAULT_IMAGE: vault:1.5.4
AGENT_INJECT_TLS_CERT_FILE: /etc/webhook/certs/vault.crt
AGENT_INJECT_TLS_KEY_FILE: /etc/webhook/certs/vault.key
AGENT_INJECT_LOG_FORMAT: standard
AGENT_INJECT_REVOKE_ON_SHUTDOWN: false
AGENT_INJECT_TELEMETRY_PATH: /metrics
Mounts:
/etc/webhook/certs from webhook-certs (ro)
Volumes:
webhook-certs:
Type: Secret (a volume populated by a Secret)
SecretName: vault-tls
Optional: false
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: <none>
NewReplicaSet: vault-agent-injector-6bbf84484c (1/1 replicas created)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal ScalingReplicaSet 46m deployment-controller Scaled up replica set vault-agent-injector-6bbf84484c to 1
What else can I check, verify, or troubleshoot to figure out why the agent injector is throwing this error?
I keep getting this error when deploying into k8s. How can I get more info about what is happening in the pod and container?
Here is my Helm values file:
global:
  enabled: true
  tlsDisable: false
  extraEnvironmentVars:
    VAULT_CACERT: /vault/userconfig/vault-tls/vault.ca
server:
  extraVolumes:
    - type: secret
      name: vault-tls
  extraSecretEnvironmentVars:
    - envName: AWS_ACCESS_KEY_ID
      secretName: eks-creds
      secretKey: AWS_ACCESS_KEY_ID
    - envName: AWS_SECRET_ACCESS_KEY
      secretName: eks-creds
      secretKey: AWS_SECRET_ACCESS_KEY
  ha:
    enabled: true
    replicas: 3
    raft:
      enabled: true
      setNodeId: false
      config: |
        ui = true
        serviceType: "LoadBalancer"
        serviceNodePort: null
        externalPort: 8200
        listener "tcp" {
          address = "0.0.0.0:8200"
          cluster_address = "0.0.0.0:8201"
          tls_cert_file = "/vault/userconfig/vault-tls/vault.crt"
          tls_key_file = "/vault/userconfig/vault-tls/vault.key"
          tls_client_ca_file = "/vault/userconfig/vault-tls/vault.ca"
        }
        storage "raft" {
          path = "/vault/data"
        }
        seal "awskms" {
          region = "us-east-1"
          kms_key_id = "xxxxxxxxxxxx"
        }
        service_registration "kubernetes" {}
Running:
kubectl -n vault-perso logs -p vault-0
I'm getting:
error loading configuration from /tmp/storageconfig.hcl: At 3:12: illegal char
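Since the pod crashes immediately, the config Vault is trying to parse can also be dumped straight from the ConfigMap it gets copied from (the vault-config name and the extraconfig-from-values.hcl key are the ones shown in the describe output and container args below; adjust the namespace to yours):

kubectl -n vault-perso get configmap vault-config -o jsonpath='{.data.extraconfig-from-values\.hcl}'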
$ kubectl describe pod vault-0 -n vault-xxx
Name: vault-0
Namespace: vault-xxx
Priority: 0
Node: ip-10-xxx-0-xxx.ec2.internal/10.xxx.0.98
Start Time: Mon, 01 Feb 2021 16:48:47 +0200
Labels: app.kubernetes.io/instance=vault
app.kubernetes.io/name=vault
component=server
controller-revision-hash=vault-785bc949ff
helm.sh/chart=vault-0.9.0
statefulset.kubernetes.io/pod-name=vault-0
Annotations: kubernetes.io/psp: eks.privileged
Status: Running
IP: 1.1.1.1
IPs:
IP: 1.1.1.1
Controlled By: StatefulSet/vault
Containers:
vault:
Container ID: docker://57ef1439640967f6824031xxxxfa6b64cb95efae72
Image: vault:1.6.1
Image ID: docker-pullable://vault#sha256:efe6036315xxxx2643666a4aab1ad4
Ports: 8200/TCP, 8201/TCP, 8202/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
Command:
/bin/sh
-ec
Args:
cp /vault/config/extraconfig-from-values.hcl /tmp/storageconfig.hcl;
[ -n "${HOST_IP}" ] && sed -Ei "s|HOST_IP|${HOST_IP?}|g" /tmp/storageconfig.hcl;
[ -n "${POD_IP}" ] && sed -Ei "s|POD_IP|${POD_IP?}|g" /tmp/storageconfig.hcl;
[ -n "${HOSTNAME}" ] && sed -Ei "s|HOSTNAME|${HOSTNAME?}|g" /tmp/storageconfig.hcl;
[ -n "${API_ADDR}" ] && sed -Ei "s|API_ADDR|${API_ADDR?}|g" /tmp/storageconfig.hcl;
[ -n "${TRANSIT_ADDR}" ] && sed -Ei "s|TRANSIT_ADDR|${TRANSIT_ADDR?}|g" /tmp/storageconfig.hcl;
[ -n "${RAFT_ADDR}" ] && sed -Ei "s|RAFT_ADDR|${RAFT_ADDR?}|g" /tmp/storageconfig.hcl;
/usr/local/bin/docker-entrypoint.sh vault server -config=/tmp/storageconfig.hcl
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 01 Feb 2021 16:54:46 +0200
Finished: Mon, 01 Feb 2021 16:54:46 +0200
Ready: False
Restart Count: 6
Readiness: exec [/bin/sh -ec vault status -tls-skip-verify] delay=5s timeout=3s period=5s #success=1 #failure=2
Environment:
HOST_IP: (v1:status.hostIP)
POD_IP: (v1:status.podIP)
VAULT_K8S_POD_NAME: vault-0 (v1:metadata.name)
VAULT_K8S_NAMESPACE: vault-xxx (v1:metadata.namespace)
VAULT_ADDR: https://127.0.0.1:8200
VAULT_API_ADDR: https://$(POD_IP):8200
SKIP_CHOWN: true
SKIP_SETCAP: true
HOSTNAME: vault-0 (v1:metadata.name)
VAULT_CLUSTER_ADDR: https://$(HOSTNAME).vault-internal:8201
HOME: /home/vault
AWS_ACCESS_KEY_ID: <set to the key 'AWS_ACCESS_KEY_ID' in secret 'eks-creds'> Optional: false
AWS_SECRET_ACCESS_KEY: <set to the key 'AWS_SECRET_ACCESS_KEY' in secret 'eks-creds'> Optional: false
Mounts:
/home/vault from home (rw)
/var/run/secrets/kubernetes.io/serviceaccount from vault-token-xls5s (ro)
/vault/config from config (rw)
/vault/data from data (rw)
/vault/userconfig/vault-tls from userconfig-vault-tls (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-vault-0
ReadOnly: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: vault-config
Optional: false
userconfig-vault-tls:
Type: Secret (a volume populated by a Secret)
SecretName: vault-tls
Optional: false
home:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
vault-token-xls5s:
Type: Secret (a volume populated by a Secret)
SecretName: vault-token-xls5s
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8m9s default-scheduler Successfully assigned vault-xxx/vault-0 to ip-10-101-0-98.ec2.internal
Normal SuccessfulAttachVolume 8m7s attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-626895easssscec00cb845"
Normal Pulled 6m23s (x5 over 8m4s) kubelet Container image "vault:1.6.1" already present on machine
Normal Created 6m23s (x5 over 8m4s) kubelet Created container vault
Normal Started 6m23s (x5 over 8m4s) kubelet Started container vault
Warning BackOff 3m3s (x26 over 8m2s) kubelet Back-off restarting failed container
Your config is wrong. You have the following:
config: |
  ui = true
  serviceType: "LoadBalancer"
  serviceNodePort: null
  externalPort: 8200
  listener "tcp" {
The serviceType, serviceNodePort and externalPort lines look like they were copy/pasted from somewhere else.
See the Vault Helm docs; right at the end they show a config snippet with ui = true followed directly by the listener "tcp" block.
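A quick way to check the result before redeploying is to render the chart's server config ConfigMap locally and confirm that, once those three lines are removed, nothing but valid HCL is left inside the config: block (this assumes Helm 3 and the hashicorp/vault chart repo is added; the template path is the one the chart uses and may change between chart versions):

helm template vault hashicorp/vault -f values.yaml -s templates/server-config-configmap.yaml

With the values from the question, the serviceType/serviceNodePort/externalPort lines show up verbatim inside the rendered HCL, which is exactly what the parser is rejecting.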
I'm currently having a problem where the readiness probe fails when deploying the Vault Helm chart. Vault is working, but whenever I describe the pods I get the error below. How do I get the probe to use HTTPS instead of HTTP? If anyone knows how to solve this it would be great, as I'm slowly losing my mind.
Kubectl Describe pod
Name: vault-0
Namespace: default
Priority: 0
Node: ip-192-168-221-250.eu-west-2.compute.internal/192.168.221.250
Start Time: Mon, 24 Aug 2020 16:41:59 +0100
Labels: app.kubernetes.io/instance=vault
app.kubernetes.io/name=vault
component=server
controller-revision-hash=vault-768cd675b9
helm.sh/chart=vault-0.6.0
statefulset.kubernetes.io/pod-name=vault-0
Annotations: kubernetes.io/psp: eks.privileged
Status: Running
IP: 192.168.221.251
IPs:
IP: 192.168.221.251
Controlled By: StatefulSet/vault
Containers:
vault:
Container ID: docker://445d7cdc34cd01ef1d3a46f2d235cb20a94e48279db3fcdd84014d607af2fe1c
Image: vault:1.4.2
Image ID: docker-pullable://vault#sha256:12587718b79dc5aff542c410d0bcb97e7fa08a6b4a8d142c74464a9df0c76d4f
Ports: 8200/TCP, 8201/TCP, 8202/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
Command:
/bin/sh
-ec
Args:
sed -E "s/HOST_IP/${HOST_IP?}/g" /vault/config/extraconfig-from-values.hcl > /tmp/storageconfig.hcl;
sed -Ei "s/POD_IP/${POD_IP?}/g" /tmp/storageconfig.hcl;
/usr/local/bin/docker-entrypoint.sh vault server -config=/tmp/storageconfig.hcl
State: Running
Started: Mon, 24 Aug 2020 16:42:00 +0100
Ready: False
Restart Count: 0
Readiness: exec [/bin/sh -ec vault status -tls-skip-verify] delay=5s timeout=5s period=3s #success=1 #failure=2
Environment:
HOST_IP: (v1:status.hostIP)
POD_IP: (v1:status.podIP)
VAULT_K8S_POD_NAME: vault-0 (v1:metadata.name)
VAULT_K8S_NAMESPACE: default (v1:metadata.namespace)
VAULT_ADDR: http://127.0.0.1:8200
VAULT_API_ADDR: http://$(POD_IP):8200
SKIP_CHOWN: true
SKIP_SETCAP: true
HOSTNAME: vault-0 (v1:metadata.name)
VAULT_CLUSTER_ADDR: https://$(HOSTNAME).vault-internal:8201
HOME: /home/vault
VAULT_CACERT: /vault/userconfig/vault-server-tls/vault.ca
Mounts:
/home/vault from home (rw)
/var/run/secrets/kubernetes.io/serviceaccount from vault-token-cv9vx (ro)
/vault/config from config (rw)
/vault/userconfig/vault-server-tls from userconfig-vault-server-tls (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: vault-config
Optional: false
userconfig-vault-server-tls:
Type: Secret (a volume populated by a Secret)
SecretName: vault-server-tls
Optional: false
home:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
vault-token-cv9vx:
Type: Secret (a volume populated by a Secret)
SecretName: vault-token-cv9vx
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7s default-scheduler Successfully assigned default/vault-0 to ip-192-168-221-250.eu-west-2.compute.internal
Normal Pulled 6s kubelet, ip-192-168-221-250.eu-west-2.compute.internal Container image "vault:1.4.2" already present on machine
Normal Created 6s kubelet, ip-192-168-221-250.eu-west-2.compute.internal Created container vault
Normal Started 6s kubelet, ip-192-168-221-250.eu-west-2.compute.internal Started container vault
Warning Unhealthy 0s kubelet, ip-192-168-221-250.eu-west-2.compute.internal Readiness probe failed: Error checking seal status: Error making API request.
URL: GET http://127.0.0.1:8200/v1/sys/seal-status
Code: 400. Raw Message:
Client sent an HTTP request to an HTTPS server.
Vault Config File
# global:
#   tlsDisable: false
injector:
  enabled: false
server:
  extraEnvironmentVars:
    VAULT_CACERT: /vault/userconfig/vault-server-tls/vault.ca
  extraVolumes:
    - type: secret
      name: vault-server-tls # Matches the ${SECRET_NAME} from above
  affinity: ""
  readinessProbe:
    enabled: true
    path: /v1/sys/health
  # # livelinessProbe:
  # #   enabled: true
  # #   path: /v1/sys/health?standbyok=true
  # #   initialDelaySeconds: 60
  ha:
    enabled: true
    config: |
      ui = true
      api_addr = "https://127.0.0.1:8200" # Unsure if this is correct
      storage "dynamodb" {
        ha_enabled = "true"
        region = "eu-west-2"
        table = "global-vault-data"
        access_key = "KEY"
        secret_key = "SECRET"
      }
      # listener "tcp" {
      #   address = "0.0.0.0:8200"
      #   tls_disable = "true"
      # }
      listener "tcp" {
        address = "0.0.0.0:8200"
        cluster_address = "0.0.0.0:8201"
        tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
        tls_key_file = "/vault/userconfig/vault-server-tls/vault.key"
        tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"
      }
      seal "awskms" {
        region = "eu-west-2"
        access_key = "KEY"
        secret_key = "SECRET"
        kms_key_id = "ID"
      }
ui:
  enabled: true
  serviceType: LoadBalancer
In your environment variable definitions you have:
VAULT_ADDR: http://127.0.0.1:8200
And non-TLS is disabled in your Vault config, i.e. TLS is enabled:
listener "tcp" {
address = "0.0.0.0:8200"
cluster_address = "0.0.0.0:8201"
tls_cert_file = "/vault/userconfig/vault-server-tls/vault.crt"
tls_key_file = "/vault/userconfig/vault-server-tls/vault.key"
tls_client_ca_file = "/vault/userconfig/vault-server-tls/vault.ca"
}
And your Readiness probe is executing in the pod:
vault status -tls-skip-verify
So that probe is trying to connect to http://127.0.0.1:8200. You can try changing the environment variable to use HTTPS: VAULT_ADDR=https://127.0.0.1:8200
You may also have another (different) issue: your config and environment variables do not match.
K8s manifest:
VAULT_API_ADDR: http://$(POD_IP):8200
Vault configs:
api_addr = "https://127.0.0.1:8200"
✌️
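One way to sanity-check that suggestion before touching the chart values is to run the same command the readiness probe runs, but with the HTTPS address (pod name, address, and flag taken from the describe output and config above):

kubectl exec -it vault-0 -- sh -c 'VAULT_ADDR=https://127.0.0.1:8200 vault status -tls-skip-verify'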
If you are on a Mac, add the Vault URL to your .zshrc or .bash_profile file.
In the terminal, open either .zshrc or .bash_profile:
$ open .zshrc
Copy and paste this into it: export VAULT_ADDR='http://127.0.0.1:8200'
Save the file, then reload it in the terminal:
$ source .zshrc
You can also set tlsDisable to false in the global settings like this:
global:
  tlsDisable: false
As the documentation for the helm chart says here:
The http/https scheme is controlled by the tlsDisable value.
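To confirm which scheme the chart will actually inject into the pods, you can render the chart locally or check the live pod's environment (assuming Helm 3 and the hashicorp/vault chart; command names only, nothing chart-specific):

helm template vault hashicorp/vault -f values.yaml | grep -E -A1 'VAULT_ADDR|VAULT_API_ADDR'

kubectl exec vault-0 -- env | grep -E 'VAULT_ADDR|VAULT_API_ADDR'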
I am using Istio and hostnames to load balance and direct traffic. I have the following VirtualService applied:
kind: VirtualService
metadata:
  name: app-lab-app
  namespace: my-namespace
spec:
  gateways:
    - istio-system/ingressgateway
  hosts:
    - hostname1.lab
  http:
    - match:
      route:
        - destination:
            host: search-head-service
            port:
              number: 8000
When I try to reach this service via curl, I receive the following error (32271 is the hostPort mapped to port 80 on the ingressgateway):
curl -Hhost:hostname1.lab http://10.20.1.108:32271/ -L
curl: (7) Failed to connect to hostname1.lab port 80: Connection refused
The issue is this: the endpoint does a redirect. I can reach the first website, but once the redirect happens, it fails.
I can make this work by removing the hostname in the spec and changing it to '*', but that won't let me do host-based load balancing.
EDIT: ingress-gateway config (kubectl describe pod/ingress-gateway-xxxx)
Name: istio-ingressgateway-657df8bc75-cmghw
Namespace: istio-system
Priority: 0
Node: ip-10-20-1-108.us-west-2.compute.internal/10.20.1.108
Start Time: Tue, 21 Apr 2020 13:22:48 -0500
Labels: app=istio-ingressgateway
chart=gateways
heritage=Tiller
istio=ingressgateway
pod-template-hash=657df8bc75
release=istio
service.istio.io/canonical-name=istio-ingressgateway
service.istio.io/canonical-revision=1.5
Annotations: cni.projectcalico.org/podIP: 10.192.1.36/32
kubernetes.io/psp: 00-privileged
sidecar.istio.io/inject: false
Status: Running
IP: 10.192.1.36
IPs:
IP: 10.192.1.36
Controlled By: ReplicaSet/istio-ingressgateway-657df8bc75
Containers:
istio-proxy:
Container ID: docker://bfa29df838cd1e42a24674838bbf8454c8d56ec898b1833563f1b89a19a38030
Image: docker.io/istio/proxyv2:1.5.0
Image ID: docker-pullable://docker.io/istio/proxyv2#sha256:89b5fe2df96920189a193dd5f7dbd776e00024e4c1fd1b59bb53867278e9645a
Ports: 15020/TCP, 80/TCP, 443/TCP, 15029/TCP, 15030/TCP, 15031/TCP, 15032/TCP, 31400/TCP, 15443/TCP, 15011/TCP, 8060/TCP, 853/TCP, 15090/TCP
Host Ports: 0/TCP, 80/TCP, 443/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
Args:
proxy
router
--domain
$(POD_NAMESPACE).svc.cluster.local
--proxyLogLevel=warning
--proxyComponentLogLevel=misc:error
--log_output_level=default:info
--drainDuration
45s
--parentShutdownDuration
1m0s
--connectTimeout
10s
--serviceCluster
istio-ingressgateway
--zipkinAddress
zipkin.istio-system:9411
--proxyAdminPort
15000
--statusPort
15020
--controlPlaneAuthPolicy
NONE
--discoveryAddress
istio-pilot.istio-system.svc:15012
--trust-domain=cluster.local
State: Running
Started: Tue, 21 Apr 2020 13:22:50 -0500
Ready: True
Restart Count: 0
Limits:
cpu: 2
memory: 1Gi
Requests:
cpu: 10m
memory: 40Mi
Readiness: http-get http://:15020/healthz/ready delay=1s timeout=1s period=2s #success=1 #failure=30
Environment:
JWT_POLICY: first-party-jwt
PILOT_CERT_PROVIDER: istiod
ISTIO_META_USER_SDS: true
CA_ADDR: istio-pilot.istio-system.svc:15012
NODE_NAME: (v1:spec.nodeName)
POD_NAME: istio-ingressgateway-657df8bc75-cmghw (v1:metadata.name)
POD_NAMESPACE: istio-system (v1:metadata.namespace)
INSTANCE_IP: (v1:status.podIP)
HOST_IP: (v1:status.hostIP)
SERVICE_ACCOUNT: (v1:spec.serviceAccountName)
ISTIO_META_WORKLOAD_NAME: istio-ingressgateway
ISTIO_META_OWNER: kubernetes://apis/apps/v1/namespaces/istio-system/deployments/istio-ingressgateway
ISTIO_META_MESH_ID: cluster.local
ISTIO_AUTO_MTLS_ENABLED: true
ISTIO_META_POD_NAME: istio-ingressgateway-657df8bc75-cmghw (v1:metadata.name)
ISTIO_META_CONFIG_NAMESPACE: istio-system (v1:metadata.namespace)
ISTIO_META_ROUTER_MODE: sni-dnat
ISTIO_META_CLUSTER_ID: Kubernetes
Mounts:
/etc/istio/ingressgateway-ca-certs from ingressgateway-ca-certs (ro)
/etc/istio/ingressgateway-certs from ingressgateway-certs (ro)
/etc/istio/pod from podinfo (rw)
/var/run/ingress_gateway from ingressgatewaysdsudspath (rw)
/var/run/secrets/istio from istiod-ca-cert (rw)
/var/run/secrets/kubernetes.io/serviceaccount from istio-ingressgateway-service-account-token-7ssdg (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
istiod-ca-cert:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: istio-ca-root-cert
Optional: false
podinfo:
Type: DownwardAPI (a volume populated by information about the pod)
Items:
metadata.labels -> labels
metadata.annotations -> annotations
ingressgatewaysdsudspath:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
ingressgateway-certs:
Type: Secret (a volume populated by a Secret)
SecretName: istio-ingressgateway-certs
Optional: true
ingressgateway-ca-certs:
Type: Secret (a volume populated by a Secret)
SecretName: istio-ingressgateway-ca-certs
Optional: true
istio-ingressgateway-service-account-token-7ssdg:
Type: Secret (a volume populated by a Secret)
SecretName: istio-ingressgateway-service-account-token-7ssdg
Optional: false
QoS Class: Burstable
Node-Selectors: istio-ingressgateway=true
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
While I'd still like to understand what was happening originally, an Istio guru had me apply the patch below. These steps create an Istio Gateway (not an ingress gateway) on all nodes with the appropriate label:
Step 1 - Label certain nodes:
kubectl label nodes <hostname> istio-ingressgateway=true
kubectl label nodes <hostname> istio-ingressgateway=true
Step 2 - Save patch to a file like patch.json:
"spec": {
"replicas": 2,
"template": {
"spec": {
"nodeSelector": {"istio-ingressgateway" : "true"},
"containers": [
{"name" : "istio-proxy", "ports": [{"containerPort" : 80, "hostPort" : 80, "protocol": "TCP"}, {"containerPort":443, "hostPort": 443, "protocol" : "TCP"}]}
]
}
}
}
}
Step 3 - Apply the patch:
kubectl -n istio-system patch deployment/istio-ingressgateway --patch "$(cat patch.json)"
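After the patch, the proxy listens on hostPort 80/443 on every labelled node, so the original test from the question can go straight to port 80 (node IP and hostname taken from the question; adjust for your nodes):

curl -L -H 'Host: hostname1.lab' http://10.20.1.108/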
I am deploying traefik v2.1.6 using this YAML:
apiVersion: v1
kind: Service
metadata:
  name: traefik
  annotations:
    prometheus.io/scrape: 'true'
    prometheus.io/port: '8080'
spec:
  ports:
    - name: web
      port: 80
    - name: websecure
      port: 443
    - name: metrics
      port: 8080
  selector:
    app: traefik
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: traefik-ingress-controller
  labels:
    app: traefik
spec:
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      name: traefik
      labels:
        app: traefik
    spec:
      serviceAccountName: traefik-ingress-controller
      terminationGracePeriodSeconds: 1
      containers:
        - image: traefik:2.1.6
          name: traefik-ingress-lb
          ports:
            - name: web
              containerPort: 80
              hostPort: 80 # hostPort: expose the port directly on the cluster node
            - name: websecure
              containerPort: 443
              hostPort: 443 # hostPort: expose the port directly on the cluster node
            - name: metrics
              containerPort: 8080
          resources:
            limits:
              cpu: 2000m
              memory: 1024Mi
            requests:
              cpu: 1000m
              memory: 1024Mi
          securityContext:
            capabilities:
              drop:
                - ALL
              add:
                - NET_BIND_SERVICE
          envFrom:
            - secretRef:
                name: traefik-alidns-secret
          args:
            - --configfile=/config/traefik.yaml
            - --logLevel=INFO
            - --metrics=true
            - --metrics.prometheus=true
            - --entryPoints.metrics.address=:8080
            - --metrics.prometheus.entryPoint=metrics
            - --metrics.prometheus.addServicesLabels=true
            - --metrics.prometheus.addEntryPointsLabels=true
            - --metrics.prometheus.buckets=0.100000, 0.300000, 1.200000, 5.000000
            # HTTPS certificate configuration
            - --entryPoints.web.address=:80
            - --entryPoints.websecure.address=:443
            # email configuration
            - --certificatesResolvers.default.acme.email=jiangtingqiang@gmail.com
            # location where the ACME certificates are saved
            - --certificatesResolvers.default.acme.storage=/config/acme.json
            - --certificatesResolvers.default.acme.httpChallenge.entryPoint=web
            # the caserver below is the staging CA used for testing; remove that parameter once the HTTPS certificate is issued successfully
            - --certificatesResolvers.default.acme.dnsChallenge.provider=alidns
            - --certificatesResolvers.default.acme.dnsChallenge=true
            - --certificatesresolvers.default.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
          volumeMounts:
            - mountPath: "/config"
              name: "config"
      volumes:
        - name: config
          configMap:
            name: traefik-config
      tolerations: # tolerate all taints, in case nodes get tainted
        - operator: "Exists"
      nodeSelector: # node selector: only start on nodes with this label
        app-type: "online-app"
The service starts successfully:
$ k get daemonset -n kube-system
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
traefik-ingress-controller 1 1 1 1 1 app-type=online-app 61d
But when I access traefik using this command, it returns 404 not found:
[root@fat001 ~]# curl -k --header 'Host:traefik.example.com' https://172.19.104.230
404 page not found
172.19.104.230 is the Kubernetes cluster (v1.15.2) edge node that runs traefik. What should I do to access traefik successfully? This is the pod describe output:
$ k describe pod traefik-ingress-controller-t4rmx -n kube-system
Name: traefik-ingress-controller-t4rmx
Namespace: kube-system
Priority: 0
Node: azshara-k8s02/172.19.104.230
Start Time: Tue, 31 Mar 2020 00:14:38 +0800
Labels: app=traefik
controller-revision-hash=547587d6d5
pod-template-generation=44
Annotations: <none>
Status: Running
IP: 172.30.208.18
IPs: <none>
Controlled By: DaemonSet/traefik-ingress-controller
Containers:
traefik-ingress-lb:
Container ID: docker://88b74826c5e380e00a53d2d4741ab6b74d8628412275f062dda861ad26681971
Image: traefik:2.1.6
Image ID: docker-pullable://traefik#sha256:13c5e62a0757bd8bf57c8c36575f7686f06186994ad6d2bda773ed8f140415c2
Ports: 80/TCP, 443/TCP, 8080/TCP
Host Ports: 80/TCP, 443/TCP, 0/TCP
Args:
--configfile=/config/traefik.yaml
--logLevel=INFO
--metrics=true
--metrics.prometheus=true
--entryPoints.metrics.address=:8080
--metrics.prometheus.entryPoint=metrics
--metrics.prometheus.addServicesLabels=true
--metrics.prometheus.addEntryPointsLabels=true
--metrics.prometheus.buckets=0.100000, 0.300000, 1.200000, 5.000000
--entryPoints.web.address=:80
--entryPoints.websecure.address=:443
--certificatesResolvers.default.acme.email=jiangtingqiang#gmail.com
--certificatesResolvers.default.acme.storage=/config/acme.json
--certificatesResolvers.default.acme.httpChallenge.entryPoint=web
--certificatesResolvers.default.acme.dnsChallenge.provider=alidns
--certificatesResolvers.default.acme.dnsChallenge=true
--certificatesresolvers.default.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory
State: Running
Started: Tue, 31 Mar 2020 00:14:39 +0800
Ready: True
Restart Count: 0
Limits:
cpu: 2
memory: 1Gi
Requests:
cpu: 1
memory: 1Gi
Environment Variables from:
traefik-alidns-secret Secret Optional: false
Environment: <none>
Mounts:
/config from config (rw)
/var/run/secrets/kubernetes.io/serviceaccount from traefik-ingress-controller-token-92vsc (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: traefik-config
Optional: false
traefik-ingress-controller-token-92vsc:
Type: Secret (a volume populated by a Secret)
SecretName: traefik-ingress-controller-token-92vsc
Optional: false
QoS Class: Burstable
Node-Selectors: app-type=online-app
Tolerations:
node.kubernetes.io/disk-pressure:NoSchedule
node.kubernetes.io/memory-pressure:NoSchedule
node.kubernetes.io/not-ready:NoExecute
node.kubernetes.io/pid-pressure:NoSchedule
node.kubernetes.io/unreachable:NoExecute
node.kubernetes.io/unschedulable:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 102m default-scheduler Successfully assigned kube-system/traefik-ingress-controller-t4rmx to azshara-k8s02
Normal Pulled 102m kubelet, azshara-k8s02 Container image "traefik:2.1.6" already present on machine
Normal Created 102m kubelet, azshara-k8s02 Created container traefik-ingress-lb
Normal Started 102m kubelet, azshara-k8s02 Started container traefik-ingress-lb
And this is my traefik route config:
apiVersion: traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-dashboard-route
  namespace: kube-system
spec:
  entryPoints:
    - websecure
  tls:
    certResolver: default
  routes:
    - match: Host(`traefik.example.com`) && PathPrefix(`/default`)
      kind: Rule
      services:
        - name: traefik
          port: 8080
curl from a Kubernetes container works fine, like this:
/ # curl -L traefik.kube-system.svc.cluster.local:8080
<!DOCTYPE html><html><head><title>Traefik</title><meta charset=utf-8><meta name=description content="Traefik UI"><meta name=format-detection content="telephone=no"><meta name=msapplication-tap-highlight content=no><meta name=viewport content="user-scalable=no,initial-scale=1,maximum-scale=1,minimum-scale=1,width=device-width"><link rel=icon type=image/png href=statics/app-logo-128x128.png><link rel=icon type=image/png sizes=16x16 href=statics/icons/favicon-16x16.png><link rel=icon type=image/png sizes=32x32 href=statics/icons/favicon-32x32.png><link rel=icon type=image/png sizes=96x96 href=statics/icons/favicon-96x96.png><link rel=icon type=image/ico href=statics/icons/favicon.ico><link href=css/019be8e4.d05f1162.css rel=prefetch><link href=css/099399dd.9310dd1b.css rel=prefetch><link href=css/0af0fca4.e3d6530d.css rel=prefetch><link href=css/162d302c.9310dd1b.css rel=prefetch><link href=css/29ead7f5.9310dd1b.css rel=prefetch><link href=css/31ad66a3.9310dd1b.css rel=prefetch><link href=css/524389aa.619bfb84.css rel=prefetch><link href=css/61674343.9310dd1b.css rel=prefetch><link href=css/63c47f2b.294d1efb.css rel=prefetch><link href=css/691c1182.ed0ee510.css rel=prefetch><link href=css/7ba452e3.37efe53c.css rel=prefetch><link href=css/87fca1b4.8c8c2eec.css rel=prefetch><link href=js/019be8e4.d8726e8b.js rel=prefetch><link href=js/099399dd.a047d401.js rel=prefetch><link href=js/0af0fca4.271bd48d.js rel=prefetch><link href=js/162d302c.ce1f9159.js rel=prefetch><link href=js/29ead7f5.cd022784.js rel=prefetch><link href=js/2d21e8fd.f3d2bb6c.js rel=prefetch><link href=js/31ad66a3.12ab3f06.js rel=prefetch><link href=js/524389aa.21dfc9ee.js rel=prefetch><link href=js/61674343.adb358dd.js rel=prefetch><link href=js/63c47f2b.caf9b4a2.js rel=prefetch><link href=js/691c1182.5d4aa4c9.js rel=prefetch><link href=js/7ba452e3.71a69a60.js rel=prefetch><link href=js/87fca1b4.ac9c2dc6.js rel=prefetch><link href=css/app.e4fba3f1.css rel=preload as=style><link href=js/app.841031a8.js rel=preload as=script><link href=js/vendor.49a1849c.js rel=preload as=script><link href=css/app.e4fba3f1.css rel=stylesheet><link rel=manifest href=manifest.json><meta name=theme-color content=#027be3><meta name=apple-mobile-web-app-capable content=yes><meta name=apple-mobile-web-app-status-bar-style content=default><meta name=apple-mobile-web-app-title content=Traefik><link rel=apple-touch-icon href=statics/icons/apple-icon-120x120.png><link rel=apple-touch-icon sizes=180x180 href=statics/icons/apple-icon-180x180.png><link rel=apple-touch-icon sizes=152x152 href=statics/icons/apple-icon-152x152.png><link rel=apple-touch-icon sizes=167x167 href=statics/icons/apple-icon-167x167.png><link rel=mask-icon href=statics/icons/safari-pinned-tab.svg color=#027be3><meta name=msapplication-TileImage content=statics/icons/ms-icon-144x144.png><meta name=msapplication-TileColor content=#000000></head><body><div id=q-app></div><script type=text/javascript src=js/app.841031a8.js></script><script type=text/javascript src=js/vendor.49a1849c.js></script></body></html>/ #
curl from the host fails:
[root@fat001 ~]# curl -k --header 'Host:traefik.example.com' https://172.19.104.230
404 page not found
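One detail worth cross-checking against the IngressRoute above: its rule only matches Host(`traefik.example.com`) && PathPrefix(`/default`), and Traefik answers "404 page not found" for requests that match no router. A request from the host that would actually match that rule looks something like this (IP and hostname as in my setup above):

curl -k -H 'Host: traefik.example.com' https://172.19.104.230/default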
I bootstrapped a Kubernetes cluster using kubeadm. After a few months of inactivity, when I list our running pods, I see that the kube-apiserver is stuck in CreateContainerError!
kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
coredns-576cbf47c7-bcv8m 1/1 Running 435 175d
coredns-576cbf47c7-dwvmv 1/1 Running 435 175d
etcd-master 1/1 Running 23 175d
kube-apiserver-master 0/1 CreateContainerError 23 143m
kube-controller-manager-master 1/1 Running 27 175d
kube-proxy-2s9sx 1/1 Running 23 175d
kube-proxy-rrp7m 1/1 Running 20 127d
kube-scheduler-master 1/1 Running 24 175d
kubernetes-dashboard-65c76f6c97-7cwwp 1/1 Running 34 169d
tiller-deploy-779784fbd6-cwrqn 1/1 Running 0 152m
weave-net-2g8s5 2/2 Running 62 170d
weave-net-9r6cp 2/2 Running 44 127d
I deleted this pod to restart it, but the same problem persists.
More info:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready master 175d v1.12.1
worker Ready worker 175d v1.12.1
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.1", GitCommit:"4ed3216f3ec431b140b1d899130a69fc671678f4", GitTreeState:"clean", BuildDate:"2018-10-05T16:46:06Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.1", GitCommit:"4ed3216f3ec431b140b1d899130a69fc671678f4", GitTreeState:"clean", BuildDate:"2018-10-05T16:36:14Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
$ kubectl describe pod kube-apiserver-master -n kube-system
Name: kube-apiserver-master
Namespace: kube-system
Priority: 2000000000
PriorityClassName: system-cluster-critical
Node: master/192.168.88.205
Start Time: Wed, 07 Aug 2019 17:58:29 +0430
Labels: component=kube-apiserver
tier=control-plane
Annotations: kubernetes.io/config.hash: ce0f74ad5fcbf28c940c111df265f4c8
kubernetes.io/config.mirror: ce0f74ad5fcbf28c940c111df265f4c8
kubernetes.io/config.seen: 2019-08-07T17:58:28.178339939+04:30
kubernetes.io/config.source: file
scheduler.alpha.kubernetes.io/critical-pod:
Status: Running
IP: 192.168.88.205
Containers:
kube-apiserver:
Container ID: docker://3328849ad82745341717616f4ef6e951116fde376d19990610f670c30eb1e26f
Image: k8s.gcr.io/kube-apiserver:v1.12.1
Image ID: docker-pullable://k8s.gcr.io/kube-apiserver#sha256:52b9dae126b5a99675afb56416e9ae69239e012028668f7274e30ae16112bb1f
Port: <none>
Host Port: <none>
Command:
kube-apiserver
--authorization-mode=Node,RBAC
--advertise-address=192.168.88.205
--allow-privileged=true
--client-ca-file=/etc/kubernetes/pki/ca.crt
--enable-admission-plugins=NodeRestriction
--enable-bootstrap-token-auth=true
--etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
--etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
--etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
--etcd-servers=https://127.0.0.1:2379
--insecure-port=0
--kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
--kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
--proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
--proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
--requestheader-allowed-names=front-proxy-client
--requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
--requestheader-extra-headers-prefix=X-Remote-Extra-
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User
--secure-port=6443
--service-account-key-file=/etc/kubernetes/pki/sa.pub
--service-cluster-ip-range=10.96.0.0/12
--tls-cert-file=/etc/kubernetes/pki/apiserver.crt
--tls-private-key-file=/etc/kubernetes/pki/apiserver.key
State: Waiting
Reason: CreateContainerError
Last State: Terminated
Reason: Error
Exit Code: 255
Started: Wed, 07 Aug 2019 17:58:30 +0430
Finished: Wed, 07 Aug 2019 13:28:11 +0430
Ready: False
Restart Count: 23
Requests:
cpu: 250m
Liveness: http-get https://192.168.88.205:6443/healthz delay=15s timeout=15s period=10s #success=1 #failure=8
Environment: <none>
Mounts:
/etc/ca-certificates from etc-ca-certificates (ro)
/etc/kubernetes/pki from k8s-certs (ro)
/etc/ssl/certs from ca-certs (ro)
/usr/local/share/ca-certificates from usr-local-share-ca-certificates (ro)
/usr/share/ca-certificates from usr-share-ca-certificates (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
k8s-certs:
Type: HostPath (bare host directory volume)
Path: /etc/kubernetes/pki
HostPathType: DirectoryOrCreate
ca-certs:
Type: HostPath (bare host directory volume)
Path: /etc/ssl/certs
HostPathType: DirectoryOrCreate
usr-share-ca-certificates:
Type: HostPath (bare host directory volume)
Path: /usr/share/ca-certificates
HostPathType: DirectoryOrCreate
usr-local-share-ca-certificates:
Type: HostPath (bare host directory volume)
Path: /usr/local/share/ca-certificates
HostPathType: DirectoryOrCreate
etc-ca-certificates:
Type: HostPath (bare host directory volume)
Path: /etc/ca-certificates
HostPathType: DirectoryOrCreate
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: :NoExecute
Events: <none>
$ kubectl get pods kube-apiserver-master -n kube-system -o yaml
apiVersion: v1
kind: Pod
metadata:
annotations:
kubernetes.io/config.hash: ce0f74ad5fcbf28c940c111df265f4c8
kubernetes.io/config.mirror: ce0f74ad5fcbf28c940c111df265f4c8
kubernetes.io/config.seen: 2019-08-07T17:58:28.178339939+04:30
kubernetes.io/config.source: file
scheduler.alpha.kubernetes.io/critical-pod: ""
creationTimestamp: 2019-08-13T08:33:18Z
labels:
component: kube-apiserver
tier: control-plane
name: kube-apiserver-master
namespace: kube-system
resourceVersion: "19613877"
selfLink: /api/v1/namespaces/kube-system/pods/kube-apiserver-master
uid: 0032d68b-bda5-11e9-860c-000c292f9c9e
spec:
containers:
- command:
- kube-apiserver
- --authorization-mode=Node,RBAC
- --advertise-address=192.168.88.205
- --allow-privileged=true
- --client-ca-file=/etc/kubernetes/pki/ca.crt
- --enable-admission-plugins=NodeRestriction
- --enable-bootstrap-token-auth=true
- --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt
- --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt
- --etcd-keyfile=/etc/kubernetes/pki/apiserver-etcd-client.key
- --etcd-servers=https://127.0.0.1:2379
- --insecure-port=0
- --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt
- --kubelet-client-key=/etc/kubernetes/pki/apiserver-kubelet-client.key
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
- --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
- --proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
- --requestheader-allowed-names=front-proxy-client
- --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
- --requestheader-extra-headers-prefix=X-Remote-Extra-
- --requestheader-group-headers=X-Remote-Group
- --requestheader-username-headers=X-Remote-User
- --secure-port=6443
- --service-account-key-file=/etc/kubernetes/pki/sa.pub
- --service-cluster-ip-range=10.96.0.0/12
- --tls-cert-file=/etc/kubernetes/pki/apiserver.crt
- --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
image: k8s.gcr.io/kube-apiserver:v1.12.1
imagePullPolicy: IfNotPresent
livenessProbe:
failureThreshold: 8
httpGet:
host: 192.168.88.205
path: /healthz
port: 6443
scheme: HTTPS
initialDelaySeconds: 15
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 15
name: kube-apiserver
resources:
requests:
cpu: 250m
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/ca-certificates
name: etc-ca-certificates
readOnly: true
- mountPath: /etc/kubernetes/pki
name: k8s-certs
readOnly: true
- mountPath: /etc/ssl/certs
name: ca-certs
readOnly: true
- mountPath: /usr/share/ca-certificates
name: usr-share-ca-certificates
readOnly: true
- mountPath: /usr/local/share/ca-certificates
name: usr-local-share-ca-certificates
readOnly: true
dnsPolicy: ClusterFirst
hostNetwork: true
nodeName: master
priority: 2000000000
priorityClassName: system-cluster-critical
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoExecute
operator: Exists
volumes:
- hostPath:
path: /etc/kubernetes/pki
type: DirectoryOrCreate
name: k8s-certs
- hostPath:
path: /etc/ssl/certs
type: DirectoryOrCreate
name: ca-certs
- hostPath:
path: /usr/share/ca-certificates
type: DirectoryOrCreate
name: usr-share-ca-certificates
- hostPath:
path: /usr/local/share/ca-certificates
type: DirectoryOrCreate
name: usr-local-share-ca-certificates
- hostPath:
path: /etc/ca-certificates
type: DirectoryOrCreate
name: etc-ca-certificates
status:
conditions:
- lastProbeTime: null
lastTransitionTime: 2019-08-07T13:28:29Z
status: "True"
type: Initialized
- lastProbeTime: null
lastTransitionTime: 2019-08-07T08:58:11Z
message: 'containers with unready status: [kube-apiserver]'
reason: ContainersNotReady
status: "False"
type: Ready
- lastProbeTime: null
lastTransitionTime: 2019-08-07T08:58:11Z
message: 'containers with unready status: [kube-apiserver]'
reason: ContainersNotReady
status: "False"
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: 2019-08-07T13:28:29Z
status: "True"
type: PodScheduled
containerStatuses:
- containerID: docker://3328849ad82745341717616f4ef6e951116fde376d19990610f670c30eb1e26f
image: k8s.gcr.io/kube-apiserver:v1.12.1
imageID: docker-pullable://k8s.gcr.io/kube-apiserver#sha256:52b9dae126b5a99675afb56416e9ae69239e012028668f7274e30ae16112bb1f
lastState:
terminated:
containerID: docker://3328849ad82745341717616f4ef6e951116fde376d19990610f670c30eb1e26f
exitCode: 255
finishedAt: 2019-08-07T08:58:11Z
reason: Error
startedAt: 2019-08-07T13:28:30Z
name: kube-apiserver
ready: false
restartCount: 23
state:
waiting:
message: 'Error response from daemon: Conflict. The container name "/k8s_kube-apiserver_kube-apiserver-master_kube-system_ce0f74ad5fcbf28c940c111df265f4c8_24"
is already in use by container 14935b714aee924aa42295fa5d252c760264d24ee63ea74e67092ccc3fb2b530.
You have to remove (or rename) that container to be able to reuse that name.'
reason: CreateContainerError
hostIP: 192.168.88.205
phase: Running
podIP: 192.168.88.205
qosClass: Burstable
startTime: 2019-08-07T13:28:29Z
If any other information is needed let me know.
How can I make it run properly?
The issue is explained by this error message from the Docker daemon:
message: 'Error response from daemon: Conflict. The container name
"/k8s_kube-apiserver_kube-apiserver-master_kube-system_ce0f74ad5fcbf28c940c111df265f4c8_24"
is already in use by container 14935b714aee924aa42295fa5d252c760264d24ee63ea74e67092ccc3fb2b530.
You have to remove (or rename) that container to be able to reuse that name.'
reason: CreateContainerError
List all containers using:
docker ps -a
You should be able to find on the list container with following name:
/k8s_kube-apiserver_kube-apiserver-master_kube-system_ce0f74ad5fcbf28c940c111df265f4c8_24
or ID:
14935b714aee924aa42295fa5d252c760264d24ee63ea74e67092ccc3fb2b530
Then you can try to delete it by running:
docker rm "/k8s_kube-apiserver_kube-apiserver-master_kube-system_ce0f74ad5fcbf28c940c111df265f4c8_24"
or by providing its ID:
docker rm 14935b714aee924aa42295fa5d252c760264d24ee63ea74e67092ccc3fb2b530
If there is still any problem with removing it, add the -f flag to delete it forcefully:
docker rm -f 14935b714aee924aa42295fa5d252c760264d24ee63ea74e67092ccc3fb2b530
Once that is done, you can try to delete the kube-apiserver-master pod so it can be recreated.
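For that last step, it would be something like this (pod name and namespace taken from the question):

kubectl delete pod kube-apiserver-master -n kube-system

Since kube-apiserver-master is a static pod (kubernetes.io/config.source: file in the annotations above), the kubelet will recreate it on its own once the conflicting Docker container has been removed.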