I currently have SkyDNS that is trying to connect via HTTPS and is using port 443.
I1221 01:15:28.199437 1 server.go:91] Using https://10.100.0.1:443 for kubernetes master
I1221 01:15:28.199440 1 server.go:92] Using kubernetes API <nil>
I1221 01:15:28.199637 1 server.go:132] Starting SkyDNS server. Listening on port:10053
I want it to use HTTP and port 8080 instead.
My YAML file is:
apiVersion: v1
kind: ReplicationController
spec:
  replicas: 1
  selector:
    k8s-app: kube-dns
    version: v18
  template:
    metadata:
      creationTimestamp: null
      labels:
        k8s-app: kube-dns
        kubernetes.io/cluster-service: "true"
        version: v18
    spec:
      containers:
      - args:
        - --domain=kube.local
        - --dns-port=10053
        image: gcr.io/google_containers/kubedns-amd64:1.6
        imagePullPolicy: IfNotPresent
        name: kubedns
        ports:
        - containerPort: 10053
          name: dns-local
          protocol: UDP
        - containerPort: 10053
          name: dns-tcp-local
          protocol: TCP
        resources:
          limits:
            cpu: 100m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        terminationMessagePath: /dev/termination-log
      - args:
        - --cache-size=1000
        - --no-resolv
        - --server=127.0.0.1#10053
        image: gcr.io/google_containers/kube-dnsmasq-amd64:1.3
        imagePullPolicy: IfNotPresent
        name: dnsmasq
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        resources: {}
        terminationMessagePath: /dev/termination-log
      - args:
        - -cmd=nslookup kubernetes.default.svc.kube.local 127.0.0.1 >/dev/null &&
          nslookup kubernetes.default.svc.kube.local 127.0.0.1:10053 >/dev/null
        - -port=8080
        - -quiet
        image: gcr.io/google_containers/exechealthz-amd64:1.0
        imagePullPolicy: IfNotPresent
        name: healthz
        ports:
        - containerPort: 8080
          protocol: TCP
        resources:
          limits:
            cpu: 10m
            memory: 20Mi
          requests:
            cpu: 10m
            memory: 20Mi
I understand this might not be a favorable design, but is there a way I can change the protocol and port?
Adding the line below to the args section of the kubedns container (under spec.template.spec.containers) did the trick.
- --kube-master-url=http://master:8080
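For context, here is a minimal sketch of where that flag sits in the kubedns container from the manifest in the question. The host name master and port 8080 come from the line above; the sketch assumes the API server's plain HTTP port is enabled and reachable under that name, so adjust both for your cluster.

```yaml
containers:
- name: kubedns
  image: gcr.io/google_containers/kubedns-amd64:1.6
  args:
  - --domain=kube.local
  - --dns-port=10053
  # Added: talk to the API server over plain HTTP instead of the
  # https://10.100.0.1:443 address seen in the log output.
  - --kube-master-url=http://master:8080
```

After the pod is recreated, the "Using ... for kubernetes master" log line should show the new URL.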
Related
I have deployed a fluentd sidecar container with my application in a pod to collect logs from my app.
Here's my sidecar manifest sidecar.yaml:
spec:
  template:
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd
        ports:
        - containerPort: 24224
          protocol: TCP
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            cpu: 100m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        terminationMessagePath: /dev/termination-log
        volumeMounts:
        - mountPath: /etc/td-agent/config.d
          name: configmap-sidecar-volume
        securityContext:
          runAsUser: 101
          runAsGroup: 101
I used this manifest and patched it to my deployment using the following command:
kubectl patch deployment my-deployment --patch "$(cat sidecar.yaml)"
The deployment was updated successfully; however, my fluentd container can't start and throws the following error:
2020-10-16 09:07:07 +0000 [info]: parsing config file is succeeded path="/fluentd/etc/fluent.conf"
2020-10-16 09:07:08 +0000 [info]: gem 'fluentd' version '1.11.2'
2020-10-16 09:07:08 +0000 [warn]: [output_docker1] 'time_format' specified without 'time_key', will be ignored
2020-10-16 09:07:08 +0000 [error]: config error file="/fluentd/etc/fluent.conf" error_class=Fluent::ConfigError error="out_file: `/fluentd/log/docker.20201016.log` is not writable"
This is my fluent.conf file:
<source>
  @type forward
  bind 127.0.0.1
  port 24224
  <parse>
    @type json
  </parse>
</source>
What is causing this issue?
fluentd's UID defaults to 1000 unless it is changed via the FLUENT_UID environment variable.
The error "/fluentd/log/docker.20201016.log is not writable" says that your user 101 doesn't have write permission on the log file. To fix the issue, change the security context to UID and GID 1000, or set the environment variable FLUENT_UID=101.
spec:
  template:
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd
        ports:
        - containerPort: 24224
          protocol: TCP
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            cpu: 100m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        terminationMessagePath: /dev/termination-log
        volumeMounts:
        - mountPath: /etc/td-agent/config.d
          name: configmap-sidecar-volume
        securityContext:
          runAsUser: 1000
          runAsGroup: 1000
Related resources:
User id mapping between host and container
No more access to mounted /fluentd/logs folder
Background:
I'm trying to use goreplay to mirror the traffic to other destination.
I found that a k8s Service load-balances on layer 4, which means the traffic cannot be captured by goreplay, so I decided to add a reverse-proxy sidecar inside the pod, just like Istio does.
Here is my pod yaml:
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: default
  labels:
    app: nginx
spec:
  containers:
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: nginx
    ports:
    - containerPort: 80
      name: http
      protocol: TCP
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
  - image: nginx
    imagePullPolicy: IfNotPresent
    name: proxy
    resources:
      limits:
        cpu: "2"
        memory: 1Gi
      requests:
        cpu: 10m
        memory: 40Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/nginx/conf.d
      name: default
  initContainers:
  - command:
    - iptables
    args:
    - -t
    - nat
    - -A
    - PREROUTING
    - -p
    - tcp
    - --dport
    - "80"
    - -j
    - REDIRECT
    - --to-ports
    - "15001"
    image: soarinferret/iptablesproxy
    imagePullPolicy: IfNotPresent
    name: istio-init
    resources:
      limits:
        cpu: 100m
        memory: 50Mi
      requests:
        cpu: 10m
        memory: 10Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        add:
        - NET_ADMIN
        - NET_RAW
        drop:
        - ALL
      privileged: false
      readOnlyRootFilesystem: false
      runAsGroup: 0
      runAsNonRoot: false
      runAsUser: 0
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
  dnsPolicy: ClusterFirst
  terminationGracePeriodSeconds: 30
  volumes:
  - configMap:
      defaultMode: 256
      name: default
      optional: false
    name: default
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  namespace: default
spec:
  ports:
  - port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: nginx
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
---
apiVersion: v1
data:
  default.conf: |
    server {
        listen 15001;
        server_name localhost;
        access_log /var/log/nginx/host.access.log main;
        location / {
            root /usr/share/nginx/html;
            index index.html index.htm;
        }
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root /usr/share/nginx/html;
        }
    }
kind: ConfigMap
metadata:
  name: default
  namespace: default
I run kubectl port-forward service/nginx 8080:80 and then curl http://localhost:8080, but the traffic is sent directly to nginx, not to my proxy.
WHAT I WANT:
A way to let goreplay capture traffic that is load-balanced by a k8s Service.
A correct iptables rule so that traffic is successfully routed to my proxy sidecar.
Thanks for any help!
As @Jonyhy96 mentioned in the comments, the only thing that needs to change here is the privileged value, which must be set to true in the securityContext field of the initContainer.
Privileged - determines if any container in a pod can enable privileged mode. By default a container is not allowed to access any devices on the host, but a "privileged" container is given access to all devices on the host. This allows the container nearly all the same access as processes running on the host. This is useful for containers that want to use linux capabilities like manipulating the network stack and accessing devices.
So the initContainer would look like this:
initContainers:
- command:
  - iptables
  args:
  - -t
  - nat
  - -A
  - PREROUTING
  - -p
  - tcp
  - --dport
  - "80"
  - -j
  - REDIRECT
  - --to-ports
  - "15001"
  image: soarinferret/iptablesproxy
  imagePullPolicy: IfNotPresent
  name: istio-init
  resources:
    limits:
      cpu: 100m
      memory: 50Mi
    requests:
      cpu: 10m
      memory: 10Mi
  securityContext:
    allowPrivilegeEscalation: false
    capabilities:
      add:
      - NET_ADMIN
      - NET_RAW
      drop:
      - ALL
    privileged: true  # changed from false
    readOnlyRootFilesystem: false
    runAsGroup: 0
    runAsNonRoot: false
    runAsUser: 0
  terminationMessagePath: /dev/termination-log
  terminationMessagePolicy: File
There is a very good tutorial about this; it is not specifically about nginx, but it explains how to actually build the proxy.
The above securityContext works except for requiring a change to
allowPrivilegeEscalation: true
The following trimmed down version also works on GKE (Google Kubernetes Engine):
securityContext:
  capabilities:
    add:
    - NET_ADMIN
    drop:
    - ALL
  privileged: true
I get the following error when trying to access Cassandra from within the same namespace:
All host(s) tried for query failed (tried: 10.244.0.72/10.244.0.72:9042 (com.datastax.driver.core.exceptions.TransportException: [10.244.0.72/10.244.0.72:9042] Channel has been closed))
However, when I forward the port it works fine from localhost, and the keyspace is created successfully:
kubectl port-forward cassandra1-0 9042:9042
My YAML:
apiVersion: v1
kind: Service
metadata:
  name: cassandra1
  labels:
    app: cassandra1
spec:
  ports:
  - name: "cql"
    protocol: "TCP"
    port: 9042
    targetPort: 9042
  - name: "thrift"
    protocol: "TCP"
    port: 9160
    targetPort: 9160
  selector:
    app: cassandra1
  type: NodePort
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: cassandra1
  labels:
    app: cassandra1
spec:
  serviceName: cassandra1
  replicas: 1
  selector:
    matchLabels:
      app: cassandra1
  template:
    metadata:
      labels:
        app: cassandra1
    spec:
      terminationGracePeriodSeconds: 1800
      containers:
      - name: cassandra1
        image: gcr.io/google-samples/cassandra:v13
        imagePullPolicy: Always
        ports:
        - containerPort: 7000
          name: intra-node
        - containerPort: 7001
          name: tls-intra-node
        - containerPort: 7199
          name: jmx
        - containerPort: 9042
          name: cql
        - containerPort: 9160
          name: thrift
        resources:
          limits:
            cpu: "500m"
            memory: 1Gi
          requests:
            cpu: "500m"
            memory: 1Gi
        securityContext:
          capabilities:
            add:
            - IPC_LOCK
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/sh
              - -c
              - nodetool drain
        env:
        - name: MAX_HEAP_SIZE
          value: 512M
        - name: HEAP_NEWSIZE
          value: 100M
        - name: CASSANDRA_SEEDS
          value: "cassandra1-0.cassandra1.default.svc.cluster.local"
        - name: CASSANDRA_CLUSTER_NAME
          value: "cassandra1"
        - name: CASSANDRA_DC
          value: "DC1-cassandra1"
        - name: CASSANDRA_RACK
          value: "Rack1-cassandra1"
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        readinessProbe:
          exec:
            command:
            - /bin/bash
            - -c
            - /ready-probe.sh
          initialDelaySeconds: 15
          timeoutSeconds: 5
        # These volume mounts are persistent. They are like inline claims,
        # but not exactly because the names need to match exactly one of
        # the stateful pod volumes.
        volumeMounts:
        - name: cassandra1-data
          mountPath: /cassandra1_data
  volumeClaimTemplates:
  - metadata:
      name: cassandra1-data
      namespace: default
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
Cassandra starts with following properties:
Starting Cassandra on 10.244.0.72
CASSANDRA_CONF_DIR /etc/cassandra
CASSANDRA_CFG /etc/cassandra/cassandra.yaml
CASSANDRA_AUTO_BOOTSTRAP true
CASSANDRA_BROADCAST_ADDRESS 10.244.0.72
CASSANDRA_BROADCAST_RPC_ADDRESS 10.244.0.72
CASSANDRA_CLUSTER_NAME cassandra1
CASSANDRA_COMPACTION_THROUGHPUT_MB_PER_SEC
CASSANDRA_CONCURRENT_COMPACTORS
CASSANDRA_CONCURRENT_READS
CASSANDRA_CONCURRENT_WRITES
CASSANDRA_COUNTER_CACHE_SIZE_IN_MB
CASSANDRA_DC DC1-cassandra1
CASSANDRA_DISK_OPTIMIZATION_STRATEGY ssd
CASSANDRA_ENDPOINT_SNITCH SimpleSnitch
CASSANDRA_GC_WARN_THRESHOLD_IN_MS
CASSANDRA_INTERNODE_COMPRESSION
CASSANDRA_KEY_CACHE_SIZE_IN_MB
CASSANDRA_LISTEN_ADDRESS 10.244.0.72
CASSANDRA_LISTEN_INTERFACE
CASSANDRA_MEMTABLE_ALLOCATION_TYPE
CASSANDRA_MEMTABLE_CLEANUP_THRESHOLD
CASSANDRA_MEMTABLE_FLUSH_WRITERS
CASSANDRA_MIGRATION_WAIT 1
CASSANDRA_NUM_TOKENS 32
CASSANDRA_RACK Rack1-cassandra1
CASSANDRA_RING_DELAY 30000
CASSANDRA_RPC_ADDRESS 0.0.0.0
CASSANDRA_RPC_INTERFACE
CASSANDRA_SEEDS cassandra1-0.cassandra1.default.svc.cluster.local
CASSANDRA_SEED_PROVIDER org.apache.cassandra.locator.SimpleSeedProvider
changed ownership of '/cassandra_data/data' from root to cassandra
changed ownership of '/cassandra_data' from root to cassandra
In my application, which runs in the same namespace, I tried setting the Cassandra port to 9042 and the host to each of the following:
10.240.0.4 (hostIP)
10.244.0.72 (podIP)
cassandra1 (name of the service)
cassandra1.default
cassandra1.default.svc.cluster.local
cassandra1-0.cassandra1.default.svc.cluster.local
_cql._tcp.cassandra1.default.svc.cluster.local
I also tried different types of a service:
headless, ClusterIP, NodePort
Does anybody have any idea what is wrong, or what else I can try to get this to work?
When running a deployment I get downtime: requests start failing after a variable amount of time (20-40 seconds).
The readiness check for the entry container fails when the preStop sends SIGUSR1, waits for 31 seconds, then sends SIGTERM. In that timeframe the pod should be removed from the service as the readiness check is set to fail after 2 failed attempts with 5 second intervals.
How can I see the events for pods being added and removed from the service to find out what's causing this?
And events around the readiness checks themselves?
I use Google Container Engine version 1.2.2 and use GCE's network load balancer.
service:
apiVersion: v1
kind: Service
metadata:
  name: myapp
  labels:
    app: myapp
spec:
  type: LoadBalancer
  ports:
  - name: http
    port: 80
    targetPort: http
    protocol: TCP
  - name: https
    port: 443
    targetPort: https
    protocol: TCP
  selector:
    app: myapp
deployment:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
        version: 1.0.0-61--66-6
    spec:
      containers:
      - name: myapp
        image: ****
        resources:
          limits:
            cpu: 100m
            memory: 250Mi
          requests:
            cpu: 10m
            memory: 125Mi
        ports:
        - name: http-direct
          containerPort: 5000
        livenessProbe:
          httpGet:
            path: /status
            port: 5000
          initialDelaySeconds: 30
          timeoutSeconds: 1
        lifecycle:
          preStop:
            exec:
              # SIGTERM triggers a quick exit; gracefully terminate instead
              command: ["sleep 31;"]
      - name: haproxy
        image: travix/haproxy:1.6.2-r0
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 100m
            memory: 100Mi
          requests:
            cpu: 10m
            memory: 25Mi
        ports:
        - name: http
          containerPort: 80
        - name: https
          containerPort: 443
        env:
        - name: "SSL_CERTIFICATE_NAME"
          value: "ssl.pem"
        - name: "OFFLOAD_TO_PORT"
          value: "5000"
        - name: "HEALT_CHECK_PATH"
          value: "/status"
        volumeMounts:
        - name: ssl-certificate
          mountPath: /etc/ssl/private
        livenessProbe:
          httpGet:
            path: /status
            port: 443
            scheme: HTTPS
          initialDelaySeconds: 30
          timeoutSeconds: 1
        readinessProbe:
          httpGet:
            path: /readiness
            port: 81
          initialDelaySeconds: 0
          timeoutSeconds: 1
          periodSeconds: 5
          successThreshold: 1
          failureThreshold: 2
        lifecycle:
          preStop:
            exec:
              # SIGTERM triggers a quick exit; gracefully terminate instead
              command: ["kill -USR1 1; sleep 31; kill 1"]
      volumes:
      - name: ssl-certificate
        secret:
          secretName: ssl-c324c2a587ee-20160331
When a probe fails, the prober emits a warning event with the reason Unhealthy and a message of the form "xx probe errored: xxx".
You should be able to find those events using either kubectl get events or kubectl describe pods -l app=myapp,version=1.0.0-61--66-6 (which filters pods by their labels).
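One detail in the posted deployment may also be worth checking; this is offered as a guess, not a confirmed diagnosis. The command of an exec hook is executed directly rather than through a shell, so a single array element such as "kill -USR1 1; sleep 31; kill 1" is treated as the name of one executable. If the hook fails for that reason, the intended 31-second drain window never happens. A sketch of a shell-wrapped form for the haproxy container, assuming /bin/sh exists in the image:

```yaml
lifecycle:
  preStop:
    exec:
      # Run the hook through a shell so that ';' separates the commands.
      # Assumes /bin/sh is available inside the container image.
      command: ["/bin/sh", "-c", "kill -USR1 1; sleep 31; kill 1"]
```

The same wrapping would apply to the "sleep 31;" hook on the myapp container. If a hook does fail, that should also show up among the pod's events.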
My pods cannot resolve external names (for example, for mail). How can I add Google's nameserver to the cluster? For info, the host resolves them without problems and has a nameserver configured.
The problem was that the liveness check made skydns fail; I changed it as shown below.
apiVersion: v1
kind: ReplicationController
metadata:
  name: kube-dns-v10
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    version: v10
    kubernetes.io/cluster-service: "true"
spec:
  replicas: 1
  selector:
    k8s-app: kube-dns
    version: v10
  template:
    metadata:
      labels:
        k8s-app: kube-dns
        version: v10
        kubernetes.io/cluster-service: "true"
    spec:
      containers:
      - name: etcd
        image: gcr.io/google_containers/etcd:2.0.9
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        command:
        - /usr/local/bin/etcd
        - -data-dir
        - /var/etcd/data
        - -listen-client-urls
        - http://127.0.0.1:2379,http://127.0.0.1:4001
        - -advertise-client-urls
        - http://127.0.0.1:2379,http://127.0.0.1:4001
        - -initial-cluster-token
        - skydns-etcd
        volumeMounts:
        - name: etcd-storage
          mountPath: /var/etcd/data
      - name: kube2sky
        image: gcr.io/google_containers/kube2sky:1.12
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        args:
        # command = "/kube2sky"
        - --domain=cluster.local
      - name: skydns
        image: gcr.io/google_containers/skydns:2015-10-13-8c72f8c
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 100m
            memory: 50Mi
          requests:
            cpu: 100m
            memory: 50Mi
        args:
        # command = "/skydns"
        - -machines=http://127.0.0.1:4001
        - -addr=0.0.0.0:53
        - -ns-rotate=false
        - -domain=cluster.local.
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 30
          timeoutSeconds: 15
        #readinessProbe:
        #  httpGet:
        #    path: /healthz
        #    port: 8080
        #    scheme: HTTP
        #  initialDelaySeconds: 1
        #  timeoutSeconds: 5
      - name: healthz
        image: gcr.io/google_containers/exechealthz:1.0
        resources:
          # keep request = limit to keep this container in guaranteed class
          limits:
            cpu: 10m
            memory: 20Mi
          requests:
            cpu: 10m
            memory: 20Mi
        args:
        - -cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
        - -port=8080
        ports:
        - containerPort: 8080
          protocol: TCP
      volumes:
      - name: etcd-storage
        emptyDir: {}
      dnsPolicy: Default # Don't use cluster DNS.