What and where is the default kubeadm config file? - kubernetes

Running kubeadm init initializes the control plane with default configuration options.
Is there a way to see which default values/configuration it will use for the control plane? How can I view that configuration, and where is it stored?

Found the command (just in case someone needs it):
C02W84XMHTD5:~ iahmad$ kubectl get configMap kubeadm-config -o yaml --namespace=kube-system
apiVersion: v1
data:
  MasterConfiguration: |
    api:
      advertiseAddress: 192.168.64.4
      bindPort: 8443
      controlPlaneEndpoint: localhost
    apiServerExtraArgs:
      admission-control: Initializers,NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,DefaultTolerationSeconds,NodeRestriction,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota
    auditPolicy:
      logDir: /var/log/kubernetes/audit
      logMaxAge: 2
      path: ""
    authorizationModes:
    - Node
    - RBAC
    certificatesDir: /var/lib/minikube/certs/
    cloudProvider: ""
    criSocket: /var/run/dockershim.sock
    etcd:
      caFile: ""
      certFile: ""
      dataDir: /data/minikube
      endpoints: null
      image: ""
      keyFile: ""
    imageRepository: k8s.gcr.io
    kubeProxy:
      config:
        bindAddress: 0.0.0.0
        clientConnection:
          acceptContentTypes: ""
          burst: 10
          contentType: application/vnd.kubernetes.protobuf
          kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
          qps: 5
        clusterCIDR: ""
        configSyncPeriod: 15m0s
        conntrack:
          max: null
          maxPerCore: 32768
          min: 131072
          tcpCloseWaitTimeout: 1h0m0s
          tcpEstablishedTimeout: 24h0m0s
        enableProfiling: false
        healthzBindAddress: 0.0.0.0:10256
        hostnameOverride: ""
        iptables:
          masqueradeAll: false
          masqueradeBit: 14
          minSyncPeriod: 0s
          syncPeriod: 30s
        ipvs:
          minSyncPeriod: 0s
          scheduler: ""
          syncPeriod: 30s
        metricsBindAddress: 127.0.0.1:10249
        mode: ""
        nodePortAddresses: null
        oomScoreAdj: -999
        portRange: ""
        resourceContainer: /kube-proxy
        udpIdleTimeout: 250ms
    kubeletConfiguration: {}
    kubernetesVersion: v1.10.0
    networking:
      dnsDomain: cluster.local
      podSubnet: ""
      serviceSubnet: 10.96.0.0/12
    noTaintMaster: true
    nodeName: minikube
    privilegedPods: false
    token: ""
    tokenGroups:
    - system:bootstrappers:kubeadm:default-node-token
    tokenTTL: 24h0m0s
    tokenUsages:
    - signing
    - authentication
    unifiedControlPlaneImage: ""

kubeadm config print init-defaults (cf. join-defaults) will show the defaults:
$ kubeadm config print init-defaults
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 1.2.3.4
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: k1.london.vultr
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: k8s.gcr.io
kind: ClusterConfiguration
kubernetesVersion: v1.16.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
scheduler: {}
(kubeadm v1.16.0)
As far as I can tell these are built-in defaults and there is no overridable defaults file; to supply a different configuration you have to pass --config=<file> explicitly, for example as shown below.
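A typical workflow (the file name kubeadm-config.yaml below is just an arbitrary choice) is to dump the built-in defaults to a file, edit what you need, and pass that file back to kubeadm:

# Write the built-in defaults to a file you can edit
kubeadm config print init-defaults > kubeadm-config.yaml

# Adjust values such as kubernetesVersion or networking.podSubnet, then
# initialize the control plane with the customized configuration
sudo kubeadm init --config=kubeadm-config.yaml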

Related

Error on Telegraf Helm Chart update: Error parsing data

I'm trying to deploy the Telegraf Helm chart on Kubernetes.
helm upgrade --install telegraf-instance -f values.yaml influxdata/telegraf
When I add the modbus input plugin with holding_registers, I get this error:
[telegraf] Error running agent: Error loading config file /etc/telegraf/telegraf.conf: Error parsing data: line 49: key `name' is in conflict with line 2fd
My values.yaml is shown below:
## Default values.yaml for Telegraf
## This is a YAML-formatted file.
## ref: https://hub.docker.com/r/library/telegraf/tags/
replicaCount: 1
image:
  repo: "telegraf"
  tag: "1.21.4"
  pullPolicy: IfNotPresent
podAnnotations: {}
podLabels: {}
imagePullSecrets: []
args: []
env:
  - name: HOSTNAME
    value: "telegraf-polling-service"
resources: {}
nodeSelector: {}
affinity: {}
tolerations: []
service:
  enabled: true
  type: ClusterIP
  annotations: {}
rbac:
  create: true
  clusterWide: false
  rules: []
serviceAccount:
  create: false
  name:
  annotations: {}
config:
  agent:
    interval: 60s
    round_interval: true
    metric_batch_size: 1000000
    metric_buffer_limit: 100000000
    collection_jitter: 0s
    flush_interval: 60s
    flush_jitter: 0s
    precision: ''
    hostname: '9825128'
    omit_hostname: false
  processors:
    - enum:
        mapping:
          field: "status"
          dest: "status_code"
          value_mappings:
            healthy: 1
            problem: 2
            critical: 3
  inputs:
    - modbus:
        name: "PS MAIN ENGINE"
        controller: 'tcp://192.168.0.101:502'
        slave_id: 1
        holding_registers:
          - name: "Coolant Level"
            byte_order: CDAB
            data_type: FLOAT32
            scale: 0.001
            address: [51410, 51411]
    - modbus:
        name: "SB MAIN ENGINE"
        controller: 'tcp://192.168.0.102:502'
        slave_id: 1
        holding_registers:
          - name: "Coolant Level"
            byte_order: CDAB
            data_type: FLOAT32
            scale: 0.001
            address: [51410, 51411]
  outputs:
    - influxdb_v2:
        token: token
        organization: organisation
        bucket: bucket
        urls:
          - "url"
metrics:
  health:
    enabled: true
    service_address: "http://:8888"
    threshold: 5000.0
  internal:
    enabled: true
    collect_memstats: false
pdb:
  create: true
  minAvailable: 1
The problem was resolved by doing the following steps:
deleted the config section of my values.yaml
added my telegraf.conf to the /additional_config path
added a configmap to kubernetes with the following command:
kubectl create configmap external-config --from-file=/additional_config
added the following to values.yaml:
volumes:
  - name: my-config
    configMap:
      name: external-config
volumeMounts:
  - name: my-config
    mountPath: /additional_config
args:
  - "--config=/etc/telegraf/telegraf.conf"
  - "--config-directory=/additional_config"

Kubernetes Ingress looking for secret in wrong place

I have the Keycloak chart (https://codecentric.github.io/helm-charts), where I configured the ingress to use my secret for certificates, but instead it is looking in the wrong place:
W0830 15:05:12.330745 7 controller.go:1387] Error getting SSL certificate "default/tls-keycloak-czv9g": local SSL certificate default/tls-keycloak-czv9g was not found
Here is what the chart values look like:
keycloak:
  basepath: auth/
  username: admin
  password: password
  route:
    tls:
      enabled: true
  extraEnv: |
    - name: PROXY_ADDRESS_FORWARDING
      value: "true"
    - name: KEYCLOAK_IMPORT
      value: /keycloak/master-realm.json
    - name: JAVA_OPTS
      value: >-
        -Djboss.socket.binding.port-offset=1000
  extraVolumes: |
    - name: realm-secret
      secret:
        secretName: realm-secret
  extraVolumeMounts: |
    - name: realm-secret
      mountPath: "/keycloak/"
      readOnly: true
  ingress:
    enabled: true
    annotations:
      kubernetes.io/ingress.class: nginx
      nginx.ingress.kubernetes.io/use-regex: "true"
      cert-manager.io/cluster-issuer: "keycloak-issuer"
    path: /auth/?(.*)
    hosts:
      - keycloak.localtest.me
    tls:
      - hosts:
          - keycloak.localtest.me
        secretName: tls-keycloak-czv9g
This is what I see from the console:
$ kubectl get secret
NAME                      TYPE                                  DATA   AGE
default-token-lbt48       kubernetes.io/service-account-token   3      22m
keycloak-admin-password   Opaque                                1      15m
keycloak-realm-secret     Opaque                                1      15m
tls-keycloak-czv9g        Opaque                                1      15m
$ kubectl describe secrets/tls-keycloak-czv9g
Name:         tls-keycloak-czv9g
Namespace:    default
Labels:       cert-manager.io/next-private-key=true
Annotations:  <none>
Type:         Opaque
Data
====
tls.key:  1704 bytes
Why is the ingress looking in the wrong place?
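One generic thing to check (not from the original post): the secret above is of type Opaque, contains only tls.key, and carries the cert-manager.io/next-private-key label, which suggests cert-manager has not finished issuing the certificate into a kubernetes.io/tls secret yet; inspecting the Certificate resource usually shows why. The resource name below is a placeholder:

# List cert-manager Certificate resources and see whether they are Ready
kubectl get certificate
kubectl describe certificate <certificate-name>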

Failed to get job complete status for job rke-network-plugin-deploy-job

I deployed RKE in an air-gapped environment with the specification below:
Nodes:
3 controllers with etcd
2 workers
RKE version:
v1.0.0
Docker version:
Client:
 Debug Mode: false

Server:
 Containers: 24
  Running: 7
  Paused: 0
  Stopped: 17
 Images: 4
 Server Version: 19.03.1-ol
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: **************
 runc version: ******
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.14.35-1902.8.4.el7uek.x86_64
 Operating System: Oracle Linux Server 7.7
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 1.409GiB
 Name: rke01.kuberlocal.co
 ID: *******************************
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  registry.console:5000
  127.0.0.0/8
 Live Restore Enabled: false
 Registries:
Operating system and kernel: (Oracle linux 7)
Red Hat Enterprise Linux Server release 7.7
4.14.35-1902.8.4.el7uek.x86_64
Type/provider of hosts: VirtualBox (test environment)
cluster.yml file:
# If you intened to deploy Kubernetes in an air-gapped environment,
# please consult the documentation on how to configure custom RKE images.
nodes:
- address: rke01
  port: "22"
  internal_address: 192.168.40.11
  role:
  - controlplane
  - etcd
  hostname_override: ""
  user: rke
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: rke02
  port: "22"
  internal_address: 192.168.40.17
  role:
  - controlplane
  - etcd
  hostname_override: ""
  user: rke
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: rke03
  port: "22"
  internal_address: 192.168.40.13
  role:
  - controlplane
  - etcd
  hostname_override: ""
  user: rke
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: rke04
  port: "22"
  internal_address: 192.168.40.14
  role:
  - worker
  hostname_override: ""
  user: rke
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
- address: rke05
  port: "22"
  internal_address: 192.168.40.15
  role:
  - worker
  hostname_override: ""
  user: rke
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
  taints: []
services:
  etcd:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    external_urls: []
    ca_cert: ""
    cert: ""
    key: ""
    path: ""
    uid: 0
    gid: 0
    snapshot: null
    retention: ""
    creation: ""
    backup_config: null
  kube-api:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    service_cluster_ip_range: 10.43.0.0/16
    service_node_port_range: ""
    pod_security_policy: false
    always_pull_images: false
    secrets_encryption_config: null
    audit_log: null
    admission_configuration: null
    event_rate_limit: null
  kube-controller:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    cluster_cidr: 10.42.0.0/16
    service_cluster_ip_range: 10.43.0.0/16
  scheduler:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
  kubelet:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    cluster_domain: bmi.rke.cluster.local
    infra_container_image: ""
    cluster_dns_server: 10.43.0.10
    fail_swap_on: false
    generate_serving_certificate: false
  kubeproxy:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
network:
  plugin: weave
  weave_network_provider:
    password: "********"
  options: {}
  node_selector: {}
authentication:
  strategy: x509
  sans: []
  webhook: null
addons: ""
addons_include: []
system_images:
  etcd: registry.console:5000/rancher/coreos-etcd:v3.3.15-rancher1
  alpine: registry.console:5000/rancher/rke-tools:v0.1.51
  nginx_proxy: registry.console:5000/rancher/rke-tools:v0.1.51
  cert_downloader: registry.console:5000/rancher/rke-tools:v0.1.51
  kubernetes_services_sidecar: registry.console:5000/rancher/rke-tools:v0.1.51
  kubedns: registry.console:5000/rancher/k8s-dns-kube-dns:1.15.0
  dnsmasq: registry.console:5000/rancher/k8s-dns-dnsmasq-nanny:1.15.0
  kubedns_sidecar: registry.console:5000/rancher/k8s-dns-sidecar:1.15.0
  kubedns_autoscaler: registry.console:5000/rancher/cluster-proportional-autoscaler:1.7.1
  coredns: registry.console:5000/rancher/coredns-coredns:1.6.2
  coredns_autoscaler: registry.console:5000/rancher/cluster-proportional-autoscaler:1.7.1
  kubernetes: registry.console:5000/rancher/hyperkube:v1.16.3-rancher1
  flannel: registry.console:5000/rancher/coreos-flannel:v0.11.0-rancher1
  flannel_cni: registry.console:5000/rancher/flannel-cni:v0.3.0-rancher5
  calico_node: registry.console:5000/rancher/calico-node:v3.8.1
  calico_cni: registry.console:5000/rancher/calico-cni:v3.8.1
  calico_controllers: registry.console:5000/rancher/calico-kube-controllers:v3.8.1
  calico_ctl: ""
  calico_flexvol: registry.console:5000/rancher/calico-pod2daemon-flexvol:v3.8.1
  canal_node: registry.console:5000/rancher/calico-node:v3.8.1
  canal_cni: registry.console:5000/rancher/calico-cni:v3.8.1
  canal_flannel: registry.console:5000/rancher/coreos-flannel:v0.11.0
  canal_flexvol: registry.console:5000/rancher/calico-pod2daemon-flexvol:v3.8.1
  weave_node: registry.console:5000/weaveworks/weave-kube:2.5.2
  weave_cni: registry.console:5000/weaveworks/weave-npc:2.5.2
  pod_infra_container: registry.console:5000/rancher/pause:3.1
  ingress: registry.console:5000/rancher/nginx-ingress-controller:nginx-0.25.1-rancher1
  ingress_backend: registry.console:5000/rancher/nginx-ingress-controller-defaultbackend:1.5-rancher1
  metrics_server: registry.console:5000/rancher/metrics-server:v0.3.4
  windows_pod_infra_container: rancher/kubelet-pause:v0.1.3
ssh_key_path: ~/.ssh/id_rsa
ssh_cert_path: ""
ssh_agent_auth: false
authorization:
  mode: rbac
  options: {}
#ignore_docker_version: false
ignore_docker_version: true
kubernetes_version: ""
private_registries:
- url: registry.console:5000
  user: registry_user
  password: ***********
  is_default: true
ingress:
  provider: ""
  options: {}
  node_selector: {}
  extra_args: {}
  dns_policy: ""
  extra_envs: []
  extra_volumes: []
  extra_volume_mounts: []
cluster_name: ""
cloud_provider:
  name: ""
prefix_path: "/opt/rke/"
addon_job_timeout: 30
bastion_host:
  address: ""
  port: ""
  user: ""
  ssh_key: ""
  ssh_key_path: ""
  ssh_cert: ""
  ssh_cert_path: ""
monitoring:
  provider: ""
  options: {}
  node_selector: {}
restore:
  restore: false
  snapshot_name: ""
dns:
  provider: coredns
Steps to Reproduce:
rke -d up --config cluster.yml
Results:
INFO[0129] [sync] Successfully synced nodes Labels and Taints
DEBU[0129] Host: rke01 has role: controlplane
DEBU[0129] Host: rke01 has role: etcd
DEBU[0129] Host: rke03 has role: controlplane
DEBU[0129] Host: rke03 has role: etcd
DEBU[0129] Host: rke04 has role: worker
DEBU[0129] Host: rke05 has role: worker
INFO[0129] [network] Setting up network plugin: weave
INFO[0129] [addons] Saving ConfigMap for addon rke-network-plugin to Kubernetes
INFO[0129] [addons] Successfully saved ConfigMap for addon rke-network-plugin to Kubernetes
INFO[0129] [addons] Executing deploy job rke-network-plugin
DEBU[0129] [k8s] waiting for job rke-network-plugin-deploy-job to complete..
FATA[0159] Failed to get job complete status for job rke-network-plugin-deploy-job in namespace kube-system
kubectl get pods --all-namespaces
NAMESPACE     NAME                                  READY   STATUS   RESTARTS   AGE
kube-system   rke-network-plugin-deploy-job-4jgcq   0/1     Error    0          4m6s
kube-system   rke-network-plugin-deploy-job-57jr8   0/1     Error    0          3m50s
kube-system   rke-network-plugin-deploy-job-h2gr8   0/1     Error    0          90s
kube-system   rke-network-plugin-deploy-job-p92br   0/1     Error    0          2m50s
kube-system   rke-network-plugin-deploy-job-xrgpl   0/1     Error    0          4m1s
kube-system   rke-network-plugin-deploy-job-zqhmk   0/1     Error    0          3m30s
kubectl describe pod rke-network-plugin-deploy-job-zqhmk -n kube-system
Name:           rke-network-plugin-deploy-job-zqhmk
Namespace:      kube-system
Priority:       0
Node:           rke01/192.168.40.11
Start Time:     Sun, 12 Jan 2020 09:40:00 +0330
Labels:         controller-uid=*******************
                job-name=rke-network-plugin-deploy-job
Annotations:
Status:         Failed
IP:             192.168.40.11
IPs:
  IP:  192.168.40.11
Controlled By:  Job/rke-network-plugin-deploy-job
Containers:
  rke-network-plugin-pod:
    Container ID:  docker://7658aecff174e4ac53caaf088782dab50654911065371cd0d8dcdd50b8fbef3b
    Image:         registry.console:5000/rancher/hyperkube:v1.16.3-rancher1
    Image ID:      docker-pullable://registry.console:5000/rancher/hyperkube#sha256:0a55590eb8453bcc46a4bdb8217a48cf56a7c7f7c52d72a267632ffa35b3b8c8
    Port:
    Host Port:
    Command:
      kubectl
      apply
      -f
      /etc/config/rke-network-plugin.yaml
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Sun, 12 Jan 2020 09:40:00 +0330
      Finished:     Sun, 12 Jan 2020 09:40:01 +0330
    Ready:          False
    Restart Count:  0
    Environment:
    Mounts:
      /etc/config from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from rke-job-deployer-token-9dt6n (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      rke-network-plugin
    Optional:  false
  rke-job-deployer-token-9dt6n:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  rke-job-deployer-token-9dt6n
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:
Tolerations:
Events:
  Type    Reason   Age    From            Message
  Normal  Pulled   4m10s  kubelet, rke01  Container image "registry.console:5000/rancher/hyperkube:v1.16.3-rancher1" already present on machine
  Normal  Created  4m10s  kubelet, rke01  Created container rke-network-plugin-pod
  Normal  Started  4m10s  kubelet, rke01  Started container rke-network-plugin-pod
container logs:
docker logs -f 267a894bb999
unable to recognize "/etc/config/rke-network-plugin.yaml": Get https://10.43.0.1:443/api?timeout=32s: dial tcp 10.43.0.1:443: connect: network is unreachable
unable to recognize "/etc/config/rke-network-plugin.yaml": Get https://10.43.0.1:443/api?timeout=32s: dial tcp 10.43.0.1:443: connect: network is unreachable
unable to recognize "/etc/config/rke-network-plugin.yaml": Get https://10.43.0.1:443/api?timeout=32s: dial tcp 10.43.0.1:443: connect: network is unreachable
unable to recognize "/etc/config/rke-network-plugin.yaml": Get https://10.43.0.1:443/api?timeout=32s: dial tcp 10.43.0.1:443: connect: network is unreachable
unable to recognize "/etc/config/rke-network-plugin.yaml": Get https://10.43.0.1:443/api?timeout=32s: dial tcp 10.43.0.1:443: connect: network is unreachable
unable to recognize "/etc/config/rke-network-plugin.yaml": Get https://10.43.0.1:443/api?timeout=32s: dial tcp 10.43.0.1:443: connect: network is unreachable
network interfaces
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: enp0s8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether *********** brd ff:ff:ff:ff:ff:ff
    inet 192.168.40.11/24 brd 192.168.40.255 scope global dynamic enp0s8
       valid_lft 847sec preferred_lft 847sec
    inet6 ************* scope link
       valid_lft forever preferred_lft forever
3: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether *************** brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 ************* scope link
       valid_lft forever preferred_lft forever
docker network status
docker network ls
NETWORK ID     NAME     DRIVER   SCOPE
c6063ba5a4d0   bridge   bridge   local
822441eae3cf   host     host     local
314798c82599   none     null     local
Is the issue related to the network interfaces? If yes, how can I create it?
That was resolved by the command below; I created a network interface:
docker network create --driver=bridge --subnet=10.43.0.0/16 br0_rke
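To confirm the new bridge afterwards (a generic Docker check, not part of the original answer):

docker network ls
docker network inspect br0_rke   # should show the bridge driver and the 10.43.0.0/16 subnet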
I had the same issue, and these two steps solved my problem (see the sketch after this list):
Increase addon_job_timeout
Check node free space (at least 15%)
In my case, one of the nodes had a DiskPressure condition.
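A minimal sketch of those two checks, assuming the cluster.yml shown above (the timeout value 120 is just an example):

# cluster.yml: give addon deploy jobs more time to finish
addon_job_timeout: 120

# Look for DiskPressure / MemoryPressure in the node conditions
kubectl describe nodes | grep -A 6 "Conditions:"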
I was creating Ubuntu VMs on my local machine and had this problem. I got it working by increasing the disk and memory capacities at VM creation: multipass launch --name node1 -m 2G -d 8G.

(RKE) FATA[0212] [etcd] Failed to bring up Etcd Plane: [etcd] Etcd Cluster is not healthy

I am trying to set up a small Kubernetes cluster with 2 nodes using RKE. The 2 nodes are Ubuntu Server VMs running in VirtualBox, both with a bridged connection.
ip of vm 1: xx.xx.xx.61
ip of vm 2: xx.xx.xx.67
When I launch the cluster using rke up I get the following error:
FATA[0212] [etcd] Failed to bring up Etcd Plane: [etcd] Etcd Cluster is not healthy
When I subsequently run kubectl commands such as kubectl --kubeconfig kube_config_cluster.yml version, I get the following error:
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.1", GitCommit:"4485c6f18cee9a5d3c3b4e523bd27972b1b53892", GitTreeState:"clean", BuildDate:"2019-07-18T09:18:22Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
The connection to the server xx.xx.xx.67:6443 was refused - did you specify the right host or port?
Not sure if these errors are caused by the same underlying issue.
What could be causing this, and how could I troubleshoot it? Are there any particular log files that I could look into?
This is what the cluster.yml looks like:
# If you intened to deploy Kubernetes in an air-gapped environment,
# please consult the documentation on how to configure custom RKE images.
nodes:
- address: xx.xx.xx.61
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - worker
  - etcd
  hostname_override: ""
  user: rke
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
- address: xx.xx.xx.67
  port: "22"
  internal_address: ""
  role:
  - controlplane
  - worker
  - etcd
  hostname_override: ""
  user: rke
  docker_socket: /var/run/docker.sock
  ssh_key: ""
  ssh_key_path: ~/.ssh/id_rsa
  ssh_cert: ""
  ssh_cert_path: ""
  labels: {}
services:
  etcd:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    external_urls: []
    ca_cert: ""
    cert: ""
    key: ""
    path: ""
    snapshot: null
    retention: ""
    creation: ""
    backup_config: null
  kube-api:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    service_cluster_ip_range: 10.43.0.0/16
    service_node_port_range: ""
    pod_security_policy: false
    always_pull_images: false
  kube-controller:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    cluster_cidr: 10.42.0.0/16
    service_cluster_ip_range: 10.43.0.0/16
  scheduler:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
  kubelet:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
    cluster_domain: cluster.local
    infra_container_image: ""
    cluster_dns_server: 10.43.0.10
    fail_swap_on: false
  kubeproxy:
    image: ""
    extra_args: {}
    extra_binds: []
    extra_env: []
network:
  plugin: canal
  options: {}
authentication:
  strategy: x509
  sans: []
  webhook: null
addons: ""
addons_include:
- https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
- https://gist.githubusercontent.com/superseb/499f2caa2637c404af41cfb7e5f4a938/raw/930841ac00653fdff8beca61dab9a20bb8983782/k8s-dashboard-user.yml
system_images:
  etcd: rancher/coreos-etcd:v3.3.10-rancher1
  alpine: rancher/rke-tools:v0.1.34
  nginx_proxy: rancher/rke-tools:v0.1.34
  cert_downloader: rancher/rke-tools:v0.1.34
  kubernetes_services_sidecar: rancher/rke-tools:v0.1.34
  kubedns: rancher/k8s-dns-kube-dns:1.15.0
  dnsmasq: rancher/k8s-dns-dnsmasq-nanny:1.15.0
  kubedns_sidecar: rancher/k8s-dns-sidecar:1.15.0
  kubedns_autoscaler: rancher/cluster-proportional-autoscaler:1.3.0
  coredns: rancher/coredns-coredns:1.3.1
  coredns_autoscaler: rancher/cluster-proportional-autoscaler:1.3.0
  kubernetes: rancher/hyperkube:v1.14.3-rancher1
  flannel: rancher/coreos-flannel:v0.10.0-rancher1
  flannel_cni: rancher/flannel-cni:v0.3.0-rancher1
  calico_node: rancher/calico-node:v3.4.0
  calico_cni: rancher/calico-cni:v3.4.0
  calico_controllers: ""
  calico_ctl: rancher/calico-ctl:v2.0.0
  canal_node: rancher/calico-node:v3.4.0
  canal_cni: rancher/calico-cni:v3.4.0
  canal_flannel: rancher/coreos-flannel:v0.10.0
  weave_node: weaveworks/weave-kube:2.5.0
  weave_cni: weaveworks/weave-npc:2.5.0
  pod_infra_container: rancher/pause:3.1
  ingress: rancher/nginx-ingress-controller:0.21.0-rancher3
  ingress_backend: rancher/nginx-ingress-controller-defaultbackend:1.5-rancher1
  metrics_server: rancher/metrics-server:v0.3.1
ssh_key_path: ~/.ssh/id_rsa
ssh_cert_path: ""
ssh_agent_auth: false
authorization:
  mode: rbac
  options: {}
ignore_docker_version: false
kubernetes_version: ""
private_registries: []
ingress:
  provider: ""
  options: {}
  node_selector: {}
  extra_args: {}
cluster_name: ""
cloud_provider:
  name: ""
prefix_path: ""
addon_job_timeout: 0
bastion_host:
  address: ""
  port: ""
  user: ""
  ssh_key: ""
  ssh_key_path: ""
  ssh_cert: ""
  ssh_cert_path: ""
monitoring:
  provider: ""
  options: {}
restore:
  restore: false
  snapshot_name: ""
dns: null
The scale of an etcd deployment is generally an odd number (1, 3, 5, ...). I see that you have selected both of the nodes for the etcd role, which might be causing issues with quorum. Can you use the etcd role on only one of the nodes, as sketched below, and run rke up again?
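A minimal sketch of that change to the nodes section, keeping the addresses from the question (only the role lists change):

nodes:
- address: xx.xx.xx.61
  role:
  - controlplane
  - worker
  - etcd          # etcd on one node only, giving an odd-sized (single-member) etcd cluster
- address: xx.xx.xx.67
  role:
  - controlplane
  - worker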

kubernetes upgrade from 1.9.0 to 1.10 FATAL: could not decode configuration: unable to decode config from bytes

I am trying to upgrade my 1.9.0 cluster to 1.10. The kubeadm upgrade plan command gives the error message below. How do I resolve this error?
kubeadm upgrade plan
[preflight] Running pre-flight checks.
[upgrade] Making sure the cluster is healthy:
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade/config] FATAL: could not decode configuration: unable to decode config from bytes: v1alpha1.MasterConfiguration: KubeProxy: v1alpha1.KubeProxy: Config: v1alpha1.KubeProxyConfiguration: FeatureGates: ReadMapCB: expect { or n, but found ", error found in #10 byte of ...|reGates":"","healthz|..., bigger context ...|24h0m0s"},"enableProfiling":false,"featureGates":"","healthzBindAddress":"0.0.0.0:10256","hostnameOv|...
YAML config file output:
apiVersion: v1
data:
  MasterConfiguration: |
    api:
      advertiseAddress: 192.168.16.211
      bindPort: 6443
    authorizationModes:
    - Node
    - RBAC
    certificatesDir: /etc/kubernetes/pki
    cloudProvider: ""
    etcd:
      caFile: ""
      certFile: ""
      dataDir: /var/lib/etcd
      endpoints: null
      image: ""
      keyFile: ""
    imageRepository: gcr.io/google_containers
    kubeProxy:
      config:
        bindAddress: 0.0.0.0
        clientConnection:
          acceptContentTypes: ""
          burst: 10
          contentType: application/vnd.kubernetes.protobuf
          kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
          qps: 5
        clusterCIDR: 10.244.0.0/16
        configSyncPeriod: 15m0s
        conntrack:
          max: null
          maxPerCore: 32768
          min: 131072
          tcpCloseWaitTimeout: 1h0m0s
          tcpEstablishedTimeout: 24h0m0s
        enableProfiling: false
        featureGates: ""
        healthzBindAddress: 0.0.0.0:10256
        hostnameOverride: ""
        iptables:
          masqueradeAll: false
          masqueradeBit: 14
          minSyncPeriod: 0s
          syncPeriod: 30s
        ipvs:
          minSyncPeriod: 0s
          scheduler: ""
          syncPeriod: 30s
        metricsBindAddress: 127.0.0.1:10249
        mode: ""
        oomScoreAdj: -999
        portRange: ""
        resourceContainer: /kube-proxy
        udpTimeoutMilliseconds: 250ms
    kubeletConfiguration: {}
    kubernetesVersion: v1.9.0
    networking:
      dnsDomain: cluster.local
      podSubnet: 10.244.0.0/16
      serviceSubnet: 10.96.0.0/12
    nodeName: k8sm-01
    token: ""
    tokenTTL: 24h0m0s
    unifiedControlPlaneImage: ""
kind: ConfigMap
metadata:
  creationTimestamp: 2017-10-06T20:44:05Z
  name: kubeadm-config
  namespace: kube-system
  resourceVersion: "2462269"
  selfLink: /api/v1/namespaces/kube-system/configmaps/kubeadm-config
  uid: 1818b79c-aad7-11e7-9ef5-525400ada096
This follows kubernetes issue 61764, which mentions in the "Before upgrading" section:
kube-proxy: feature gates are now specified as a map when provided via a JSON or YAML KubeProxyConfiguration, rather than as a string of key-value pairs.
For example:
KubeProxyConfiguration Before:
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates: "SupportIPVSProxyMode=true"
KubeProxyConfiguration After:
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates:
  SupportIPVSProxyMode: true
And:
if featureGates: "", replace it with featureGates: {}
Actually, the OP sfgroups adds in the comments:
changed config like this: featureGates: {""}
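In practice that means editing the value inside the kubeadm-config ConfigMap shown above (the edit command is generic kubectl usage, not from the original answer):

# Open the stored kubeadm configuration for editing
kubectl -n kube-system edit configmap kubeadm-config

# In the MasterConfiguration block, change
#   featureGates: ""
# to
#   featureGates: {}
# then re-run: kubeadm upgrade plan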