Fluentd is unable to write logs to the /fluentd/log directory - Kubernetes

I have deployed a fluentd sidecar container with my application in a pod to collect logs from my app.
Here's my sidecar manifest sidecar.yaml:
spec:
template:
spec:
containers:
- name: fluentd
image: fluent/fluentd
ports:
- containerPort: 24224
protocol: TCP
imagePullPolicy: IfNotPresent
resources:
limits:
cpu: 100m
memory: 200Mi
requests:
cpu: 100m
memory: 200Mi
terminationMessagePath: /dev/termination-log
volumeMounts:
- mountPath: /etc/td-agent/config.d
name: configmap-sidecar-volume
securityContext:
runAsUser: 101
runAsGroup: 101
I used this manifest to patch my deployment with the following command:
kubectl patch deployment my-deployment --patch "$(cat sidecar.yaml)"
The deployment was updated successfully; however, the fluentd container can't start and throws the following error:
2020-10-16 09:07:07 +0000 [info]: parsing config file is succeeded path="/fluentd/etc/fluent.conf"
2020-10-16 09:07:08 +0000 [info]: gem 'fluentd' version '1.11.2'
2020-10-16 09:07:08 +0000 [warn]: [output_docker1] 'time_format' specified without 'time_key', will be ignored
2020-10-16 09:07:08 +0000 [error]: config error file="/fluentd/etc/fluent.conf" error_class=Fluent::ConfigError error="out_file: `/fluentd/log/docker.20201016.log` is not writable"
This is my fluent.conf file:
<source>
@type forward
bind 127.0.0.1
port 24224
<parse>
@type json
</parse>
</source>
What is causing this issue?

fluentd's UID defaults to 1000 unless it is changed via the FLUENT_UID environment variable.
The error "/fluentd/log/docker.20201016.log is not writable" means that your user 101 doesn't have write permission on that log file. Change the security context to run as 1000, or set the environment variable FLUENT_UID=101, to fix the issue (a sketch of the env-variable option follows the manifest below).
spec:
template:
spec:
containers:
- name: fluentd
image: fluent/fluentd
ports:
- containerPort: 24224
protocol: TCP
imagePullPolicy: IfNotPresent
resources:
limits:
cpu: 100m
memory: 200Mi
requests:
cpu: 100m
memory: 200Mi
terminationMessagePath: /dev/termination-log
volumeMounts:
- mountPath: /etc/td-agent/config.d
name: configmap-sidecar-volume
securityContext:
runAsUser: 1000
runAsGroup: 1000
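If you need the log files to be owned by UID 101 instead, the environment-variable route mentioned above can be used in place of the securityContext override: drop runAsUser/runAsGroup and let the image remap its fluent user. A minimal sketch, assuming the stock fluent/fluentd image (which reads FLUENT_UID at startup, as noted above):
containers:
- name: fluentd
  image: fluent/fluentd
  env:
  - name: FLUENT_UID
    value: "101"
  # assumption: the image's entrypoint starts as root and then drops to the
  # fluent user with the UID given above, so no runAsUser/runAsGroup is set here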
Related resources:
User id mapping between host and container
No more access to mounted /fluentd/logs folder

Related

Argo Workflows pods missing cpu/memory resources

I'm running into a missing resources issue when submitting a Workflow. The Kubernetes namespace my-namespace has a quota enabled, and for whatever reason the pods being created after submitting the workflow are failing with:
pods "hello" is forbidden: failed quota: team: must specify limits.cpu,limits.memory,requests.cpu,requests.memory
I'm submitting the following Workflow,
apiVersion: "argoproj.io/v1alpha1"
kind: "Workflow"
metadata:
name: "hello"
namespace: "my-namespace"
spec:
entrypoint: "main"
templates:
- name: "main"
container:
image: "docker/whalesay"
resources:
requests:
memory: 0
cpu: 0
limits:
memory: "128Mi"
cpu: "250m"
Argo is running on Kubernetes 1.19.6 and was deployed with the official Helm chart version 0.16.10. Here are my Helm values:
controller:
workflowNamespaces:
- "my-namespace"
resources:
requests:
memory: 0
cpu: 0
limits:
memory: 500Mi
cpu: 0.5
pdb:
enabled: true
# See https://argoproj.github.io/argo-workflows/workflow-executors/
# docker container runtime is not present in the TKGI clusters
containerRuntimeExecutor: "k8sapi"
workflow:
namespace: "my-namespace"
serviceAccount:
create: true
rbac:
create: true
server:
replicas: 2
secure: false
resources:
requests:
memory: 0
cpu: 0
limits:
memory: 500Mi
cpu: 0.5
pdb:
enabled: true
executer:
resources:
requests:
memory: 0
cpu: 0
limits:
memory: 500Mi
cpu: 0.5
Any ideas on what I may be missing? Thanks, Weldon
Update 1: I tried another namespace without quotas enabled and got past the missing resources issue. However, I now see: Failed to establish pod watch: timed out waiting for the condition. Here's what the spec looks like for this pod. You can see the wait container has no resources set; this is the container causing the issue reported in this question.
spec:
containers:
- command:
- argoexec
- wait
env:
- name: ARGO_POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: ARGO_CONTAINER_RUNTIME_EXECUTOR
value: k8sapi
image: argoproj/argoexec:v2.12.5
imagePullPolicy: IfNotPresent
name: wait
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /argo/podmetadata
name: podmetadata
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: default-token-v4jlb
readOnly: true
- image: docker/whalesay
imagePullPolicy: Always
name: main
resources:
limits:
cpu: 250m
memory: 128Mi
requests:
cpu: "0"
memory: "0"
Try deploying the workflow in another namespace if you can, and verify whether it works there.
If possible, also try removing the quota from that namespace.
Instead of a quota, you can also use a LimitRange:
apiVersion: v1
kind: LimitRange
metadata:
name: default-limit-range
spec:
limits:
- default:
memory: 512Mi
cpu: 250m
defaultRequest:
cpu: 50m
memory: 64Mi
type: Container
With this in place, any container that does not specify resource requests or limits will get this default of 50m CPU and 64Mi memory.
https://kubernetes.io/docs/concepts/policy/limit-range/
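A quick way to sanity-check the LimitRange (a sketch; the file name limit-range.yaml is an assumption, while the pod name hello and namespace my-namespace come from the question):
kubectl apply -n my-namespace -f limit-range.yaml
kubectl describe limitrange default-limit-range -n my-namespace
# after resubmitting the workflow, the injected defaults should appear on the wait container:
kubectl get pod hello -n my-namespace -o jsonpath='{.spec.containers[?(@.name=="wait")].resources}'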

How to resolve Kubernetes Deployment warning?

$ kubectl version --short
Client Version: v1.20.2
Server Version: v1.19.6-eks-49a6c0
I have the following Deployment manifest
apiVersion: apps/v1
kind: Deployment
metadata:
name: stats-service
namespace: my-system
labels:
app: stats-service
spec:
selector:
matchLabels:
app: stats-service
template:
metadata:
labels:
app: stats-service
spec:
containers:
- name: stats-service
image: 0123456789.dkr.ecr.us-east-1.amazonaws.com/stats-service:3.12.1
resources:
requests:
memory: "1024m"
cpu: "512m"
limits:
memory: "2048m"
cpu: "1024m"
ports:
- name: http
containerPort: 5000
protocol: TCP
startupProbe:
httpGet:
path: /manage/health
port: 5000
failureThreshold: 30
periodSeconds: 10
livenessProbe:
httpGet:
path: /manage/health
port: 5000
failureThreshold: 3
periodSeconds: 10
readinessProbe:
httpGet:
path: /manage/health
port: 5000
failureThreshold: 6
periodSeconds: 10
env:
- name: SPRING_PROFILES_ACTIVE
value: test
- name: JAVA_OPTS
value: "my_java_opts"
When I apply it I get the following warning, and the Pod never gets created. What does it mean and how do I resolve it? In my case, I'm running an EKS Fargate (only) cluster. Thanks!
$ kubectl describe pod stats-service-797784dfd5-tvh84
...
Warning FailedCreatePodSandBox 12s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:319: getting the final child's pid from pipe caused \"read init-p: connection reset by peer\"": unknown
NOTES:
Seems like the warning is related to the spec.template.spec.containers.resources.limits block. If I remove that, the Pod gets created.
A lot of the solutions I read online say to reset Docker, which obviously can't apply to me.
You are using the wrong notation for your resources. As per Meaning of memory in the Kubernetes documentation:
Limits and requests for memory are measured in bytes. You can express
memory as a plain integer or as a fixed-point number using one of
these suffixes: E, P, T, G, M, K. You can also use the power-of-two
equivalents: Ei, Pi, Ti, Gi, Mi, Ki.
If you want,
Requests:
1GB RAM
0.5 vCPU/Core
Limits:
2GB RAM
1 vCPU/Core
This should work:
resources:
requests:
memory: "1G"
cpu: "0.5"
limits:
memory: "2G"
cpu: "1"
The following is equivalent, but using different notations:
resources:
requests:
memory: "1024M"
cpu: "500m"
limits:
memory: "2048M"
cpu: "1000m"
Notice that the above example uses M for memory, not m.
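For context on why the original manifest misbehaved: on memory, the lowercase m suffix means milli, so memory: "1024m" requests roughly one byte (and the 2048m limit allows about two bytes), which is consistent with the Pod only starting once the limits block was removed. A quick way to see how the API server interpreted the quantities (a sketch, using the names from the question):
kubectl get deployment stats-service -n my-system -o jsonpath='{.spec.template.spec.containers[0].resources}'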

iptables in Kubernetes init container doesn't work

Background:
I'm trying to use goreplay to mirror traffic to another destination.
I found that a Kubernetes Service load balances at layer 4, which means the traffic cannot be captured by goreplay, so I decided to add a reverse-proxy sidecar inside the pod, just like Istio does.
Here is my pod yaml:
apiVersion: v1
kind: Pod
metadata:
name: nginx
namespace: default
labels:
app: nginx
spec:
containers:
- image: nginx
imagePullPolicy: IfNotPresent
name: nginx
ports:
- containerPort: 80
name: http
protocol: TCP
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
- image: nginx
imagePullPolicy: IfNotPresent
name: proxy
resources:
limits:
cpu: "2"
memory: 1Gi
requests:
cpu: 10m
memory: 40Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /etc/nginx/conf.d
name: default
initContainers:
- command:
- iptables
args:
- -t
- nat
- -A
- PREROUTING
- -p
- tcp
- --dport
- "80"
- -j
- REDIRECT
- --to-ports
- "15001"
image: soarinferret/iptablesproxy
imagePullPolicy: IfNotPresent
name: istio-init
resources:
limits:
cpu: 100m
memory: 50Mi
requests:
cpu: 10m
memory: 10Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
add:
- NET_ADMIN
- NET_RAW
drop:
- ALL
privileged: false
readOnlyRootFilesystem: false
runAsGroup: 0
runAsNonRoot: false
runAsUser: 0
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
dnsPolicy: ClusterFirst
terminationGracePeriodSeconds: 30
volumes:
- configMap:
defaultMode: 256
name: default
optional: false
name: default
---
apiVersion: v1
kind: Service
metadata:
name: nginx
namespace: default
spec:
ports:
- port: 80
protocol: TCP
targetPort: 80
selector:
app: nginx
sessionAffinity: None
type: ClusterIP
status:
loadBalancer: {}
---
apiVersion: v1
data:
default.conf: |
server {
listen 15001;
server_name localhost;
access_log /var/log/nginx/host.access.log main;
location / {
root /usr/share/nginx/html;
index index.html index.htm;
}
error_page 500 502 503 504 /50x.html;
location = /50x.html {
root /usr/share/nginx/html;
}
}
kind: ConfigMap
metadata:
name: default
namespace: default
I use kubectl port-forward service/nginx 8080:80 and then curl http://localhost:8080, but the traffic is sent directly to nginx, not to my proxy.
WHAT I WANT:
A way to let goreplay capture traffic that is load balanced by a Kubernetes Service.
The correct iptables rule to route traffic to my proxy sidecar.
Thanks for any help!
As #Jonyhy96 mentioned in the comments, the only thing that needs to change here is the privileged value, which must be set to true in the securityContext field of the initContainer.
Privileged - determines if any container in a pod can enable privileged mode. By default a container is not allowed to access any devices on the host, but a "privileged" container is given access to all devices on the host. This allows the container nearly all the same access as processes running on the host. This is useful for containers that want to use linux capabilities like manipulating the network stack and accessing devices.
So the initContainer would look like this
initContainers:
- command:
- iptables
args:
- -t
- nat
- -A
- PREROUTING
- -p
- tcp
- --dport
- "80"
- -j
- REDIRECT
- --to-ports
- "15001"
image: soarinferret/iptablesproxy
imagePullPolicy: IfNotPresent
name: istio-init
resources:
limits:
cpu: 100m
memory: 50Mi
requests:
cpu: 10m
memory: 10Mi
securityContext:
allowPrivilegeEscalation: false
capabilities:
add:
- NET_ADMIN
- NET_RAW
drop:
- ALL
privileged: true <---- changed from false
readOnlyRootFilesystem: false
runAsGroup: 0
runAsNonRoot: false
runAsUser: 0
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
There is a very good tutorial about this; it is not specifically about nginx, but it explains how to actually build the proxy.
The above securityContext works, except that it also requires this change:
allowPrivilegeEscalation: true
The following trimmed down version also works on GKE (Google Kubernetes Engine):
securityContext:
capabilities:
add:
- NET_ADMIN
drop:
- ALL
privileged: true
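One way to check that the redirect is actually in effect (a rough sketch; the curl image is only an example) is to call the Service from another pod, so the traffic traverses PREROUTING, and then read the proxy container's access log configured in the ConfigMap above:
kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- curl -s http://nginx.default.svc.cluster.local/
kubectl exec nginx -c proxy -- tail /var/log/nginx/host.access.log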

Disable HTTPS and change port for SkyDNS

I currently have SkyDNS that is trying to connect via HTTPS and is using port 443.
I1221 01:15:28.199437 1 server.go:91] Using https://10.100.0.1:443 for kubernetes master
I1221 01:15:28.199440 1 server.go:92] Using kubernetes API <nil>
I1221 01:15:28.199637 1 server.go:132] Starting SkyDNS server. Listening on port:10053
I want it to use HTTP and port 8080 instead.
My YAML file is:
apiVersion: v1
kind: ReplicationController
spec:
replicas: 1
selector:
k8s-app: kube-dns
version: v18
template:
metadata:
creationTimestamp: null
labels:
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
version: v18
spec:
containers:
- args:
- --domain=kube.local
- --dns-port=10053
image: gcr.io/google_containers/kubedns-amd64:1.6
imagePullPolicy: IfNotPresent
name: kubedns
ports:
- containerPort: 10053
name: dns-local
protocol: UDP
- containerPort: 10053
name: dns-tcp-local
protocol: TCP
resources:
limits:
cpu: 100m
memory: 200Mi
requests:
cpu: 100m
memory: 100Mi
terminationMessagePath: /dev/termination-log
- args:
- --cache-size=1000
- --no-resolv
- --server=127.0.0.1#10053
image: gcr.io/google_containers/kube-dnsmasq-amd64:1.3
imagePullPolicy: IfNotPresent
name: dnsmasq
ports:
- containerPort: 53
name: dns
protocol: UDP
- containerPort: 53
name: dns-tcp
protocol: TCP
resources: {}
terminationMessagePath: /dev/termination-log
- args:
- -cmd=nslookup kubernetes.default.svc.kube.local 127.0.0.1 >/dev/null &&
nslookup kubernetes.default.svc.kube.local 127.0.0.1:10053 >/dev/null
- -port=8080
- -quiet
image: gcr.io/google_containers/exechealthz-amd64:1.0
imagePullPolicy: IfNotPresent
name: healthz
ports:
- containerPort: 8080
protocol: TCP
resources:
limits:
cpu: 10m
memory: 20Mi
requests:
cpu: 10m
memory: 20Mi
I understand this might not be a favorable design, but is there a way I can change the protocol and port?
Adding the line below to the kubedns container's spec.containers.args section did the trick.
- --kube-master-url=http://master:8080
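For clarity, this is roughly where the flag ends up in the kubedns container's args from the manifest above:
- args:
  - --domain=kube.local
  - --dns-port=10053
  - --kube-master-url=http://master:8080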

External nameserver trouble for Kubernetes pods

My pods cannot resolve names in the external world (e.g. for mail). How can I add Google's nameservers to the cluster? For reference, the host resolves them without a problem and has nameservers configured.
The problem was that the liveness check was making SkyDNS fail; I changed it as shown below.
apiVersion: v1
kind: ReplicationController
metadata:
name: kube-dns-v10
namespace: kube-system
labels:
k8s-app: kube-dns
version: v10
kubernetes.io/cluster-service: "true"
spec:
replicas: 1
selector:
k8s-app: kube-dns
version: v10
template:
metadata:
labels:
k8s-app: kube-dns
version: v10
kubernetes.io/cluster-service: "true"
spec:
containers:
- name: etcd
image: gcr.io/google_containers/etcd:2.0.9
resources:
# keep request = limit to keep this container in guaranteed class
limits:
cpu: 100m
memory: 50Mi
requests:
cpu: 100m
memory: 50Mi
command:
- /usr/local/bin/etcd
- -data-dir
- /var/etcd/data
- -listen-client-urls
- http://127.0.0.1:2379,http://127.0.0.1:4001
- -advertise-client-urls
- http://127.0.0.1:2379,http://127.0.0.1:4001
- -initial-cluster-token
- skydns-etcd
volumeMounts:
- name: etcd-storage
mountPath: /var/etcd/data
- name: kube2sky
image: gcr.io/google_containers/kube2sky:1.12
resources:
# keep request = limit to keep this container in guaranteed class
limits:
cpu: 100m
memory: 50Mi
requests:
cpu: 100m
memory: 50Mi
args:
# command = "/kube2sky"
- --domain=cluster.local
- name: skydns
image: gcr.io/google_containers/skydns:2015-10-13-8c72f8c
resources:
# keep request = limit to keep this container in guaranteed class
limits:
cpu: 100m
memory: 50Mi
requests:
cpu: 100m
memory: 50Mi
args:
# command = "/skydns"
- -machines=http://127.0.0.1:4001
- -addr=0.0.0.0:53
- -ns-rotate=false
- -domain=cluster.local.
ports:
- containerPort: 53
name: dns
protocol: UDP
- containerPort: 53
name: dns-tcp
protocol: TCP
livenessProbe:
httpGet:
path: /healthz
port: 8080
scheme: HTTP
initialDelaySeconds: 30
timeoutSeconds: 15
#readinessProbe:
#httpGet:
#path: /healthz
#port: 8080
#scheme: HTTP
#initialDelaySeconds: 1
#timeoutSeconds: 5
- name: healthz
image: gcr.io/google_containers/exechealthz:1.0
resources:
# keep request = limit to keep this container in guaranteed class
limits:
cpu: 10m
memory: 20Mi
requests:
cpu: 10m
memory: 20Mi
args:
- -cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
- -port=8080
ports:
- containerPort: 8080
protocol: TCP
volumes:
- name: etcd-storage
emptyDir: {}
dnsPolicy: Default # Don't use cluster DNS.
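Once the updated controller is running, a quick check that pods can resolve external names again (a sketch; busybox is only an example image):
kubectl run dns-test --rm -it --restart=Never --image=busybox -- nslookup google.com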