kubelet Error while processing event /sys/fs/cgroup/memory/libcontainer_10010_systemd_test_default.slice - kubernetes

I have set up a Kubernetes 1.15.3 cluster on CentOS 7 using the systemd cgroup driver. On all my nodes, syslog has started logging the message below frequently.
How can I fix this error?
kubelet: W0907 watcher.go:87 Error while processing event ("/sys/fs/cgroup/memory/libcontainer_10010_systemd_test_default.slice": 0x40000100 == IN_CREATE|IN_ISDIR): readdirent: no such file or directory
Thanks

It's a known issue caused by a bad interaction with runc. Someone observed that it is actually triggered by a repeated etcd health check, but that wasn't my experience on Ubuntu, which exhibits the same behavior on every node.
They allege that updating the runc binary on your hosts will make the problem go away, but I haven't tried that myself.
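If you want to see which runc build your nodes are currently on before deciding, a quick check (assuming runc is on the PATH and Docker is the container runtime) is:
# Version of the standalone runc binary
runc --version
# runc commit that Docker reports it was built against
docker info | grep -i runc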

I had exactly the same problem with the same Kubernetes version and in the same context, that is, after changing the cgroup driver to systemd. A GitHub issue for this error has been created here.
After changing the container runtime's cgroup driver to systemd, as described in this tutorial, the error started popping up in the kubelet service log.
What worked for me was updating docker and containerd to the following versions:
docker: v19.03.5
containerd: v1.2.10
I assume that any version higher than the above will fix the problem as well.
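On CentOS 7 with Docker's official repository, the upgrade looks roughly like this (a sketch; the exact package version strings available in your repositories may differ, so list them first):
# See which package versions the configured repos offer
yum list --showduplicates docker-ce containerd.io
# Upgrade to at least the versions mentioned above
sudo yum install -y docker-ce-19.03.5 docker-ce-cli-19.03.5 containerd.io-1.2.10
# Restart the runtime and confirm the cgroup driver is still systemd
sudo systemctl restart docker
docker info | grep -i "cgroup driver"
Restarting kubelet on the node afterwards is also a reasonable precaution.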

Related

Cadence - Cadence Canary failing to start

I am running into this error after Cadence Canary gets started on my cluster nodes.
After the error message "error starting cron workflow...", Cadence Canary does nothing and just hangs there.
Any thoughts/suggestions?
UPDATE: I have turned on debug level logging and I am getting hammered with the following (note: it's a fresh cluster):
This error message says that cadence-canary was not able to call the cadence-frontend service, which might indicate that cadence-frontend is not running or is not reachable. Check whether cadence-frontend is running and whether your cadence-canary config points to the correct cadence-frontend address.
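On Kubernetes, a quick way to check both conditions is something like the following (a sketch only; the namespace, service name, and port 7933 are assumptions based on a typical Cadence deployment and may differ in yours, and nc must exist in the canary image):
# Is the frontend actually running?
kubectl get pods -n cadence | grep -i frontend
# Which address and port does its Service expose?
kubectl get svc -n cadence | grep -i frontend
# From inside the canary pod, is that address reachable?
kubectl exec -n cadence deploy/cadence-canary -- nc -zv cadence-frontend 7933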

How to overcome an IllegalAccessError during start-up of a Kafka Connect connector

I am writing a connector for Kafka Connect. The error I see during start-up of the connector is:
java.lang.IllegalAccessError: tried to access field org.apache.kafka.common.config.ConfigTransformer.DEFAULT_PATTERN from class org.apache.kafka.connect.runtime.AbstractHerder
The error seems to happen at https://github.com/apache/kafka/blob/trunk/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/AbstractHerder.java#L449
Do I need to set this DEFAULT_PATTERN manually? Is it not set by default?
I am using the docker image confluentinc/cp-kafka:5.0.1. The version of connect-api I am using in my connector app is org.apache.kafka:connect-api:2.0.0. I am running my set up inside Kubernetes.
The issue was resolved when I changed the image to confluentinc/cp-kafka:5.0.0-2.
I had already tried this option before posting the question, but the pod was stuck in a Pending state and refused to start, so I thought it could be an issue with the image. After doing some more research later, I learned that Kubernetes sometimes cannot allocate enough resources for a pod, which leaves it stuck in the Pending state.
I tried the image confluentinc/cp-kafka:5.0.0-2 again and it works fine.
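To check whether a Pending pod is stuck because of insufficient resources, the scheduler's reason normally shows up in the pod's events (a sketch; the pod name and namespace are placeholders):
# The Events section explains why the pod could not be scheduled
kubectl describe pod <pod-name> -n <namespace>
# Compare requested resources against what the nodes can allocate
kubectl describe nodes | grep -A 8 "Allocated resources"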

kubelet.exe on Windows Server Core (1709): unable to read physical memory

I've built kubelet and kube-proxy off the master branch of the kubernetes repository, and when running kubelet.exe without any parameters I get the following error:
error: failed to run Kubelet: unable to read physical memory
I'm building off master because none of the unstable branches included the fix described (and fixed) in 55031, which I was hitting on 1.9.0-alpha.3.
After looking into it for the last week or so, it seems VMware (Fusion, in my case) doesn't correctly populate the SMBIOS data about installed memory, which is what GetPhysicallyInstalledSystemMemory reads and why it reports the above error.
I was able to confirm that it works fine on Hyper-V hosts. I will attempt a fix for the issue.
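To verify whether the hypervisor is exposing the installed memory modules at all, you can query Win32_PhysicalMemory from PowerShell on the node; like GetPhysicallyInstalledSystemMemory, it is populated from the SMBIOS tables (a quick check, not a fix):
# Sum the capacities of the memory modules reported via SMBIOS.
# An empty or zero result here means the firmware tables are not populated.
Get-CimInstance Win32_PhysicalMemory | Measure-Object -Property Capacity -Sum | Select-Object Sum, Count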

Linux kernel tune in Google Container Engine

I deployed a Redis container to Google Container Engine and get the following warning.
10:M 01 Mar 05:01:46.140 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
I know that to correct the warning I need to execute
echo never > /sys/kernel/mm/transparent_hugepage/enabled
I tried that inside the container, but it does not help.
How do I resolve this warning on Google Container Engine?
As I understand it, my pods run on a node, and the node is a VM private to me only, so should I SSH to the node and modify the kernel setting directly?
Yes, you own the nodes and can ssh into them and modify them as you need.
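For example, something along these lines should work (a sketch; the node name and zone are placeholders, and keep in mind the setting is lost if the node is recreated, e.g. by an upgrade or autoscaling):
# Find the node the Redis pod is scheduled on
kubectl get pods -o wide | grep redis
# SSH into that node (GKE nodes are ordinary Compute Engine VMs)
gcloud compute ssh <node-name> --zone <zone>
# On the node, disable Transparent Huge Pages
echo never | sudo tee /sys/kernel/mm/transparent_hugepage/enabled
To make the change survive node recreation, people commonly run the same command from a privileged DaemonSet or a node startup script instead, then restart Redis.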

A monitoring stack for containers (Grafana + Heapster + InfluxDB + cAdvisor) on bare metal

Experts,
I have a question about a monitoring system for containers.
The picture below shows my plan for monitoring.
I'd like to run the monitoring components (Grafana, Heapster, InfluxDB, cAdvisor) on bare metal as daemon processes instead of running them in containers.
When I configure those four components, I get the error below.
It might come from linking InfluxDB to Heapster (i.e. the --sink option).
Below is the command I use to run Heapster:
"./heapster --source=cadvisor:external?cadvisorPort=8081 --sink=influxdb:http://192.168.56.20:8086/db=k8s&user=root&pw=root"
Then I get the following message:
driver.go:326 Database creation failed : Server returned (404): 404
page not found
What does this error indicate?
If anyone has a solution, please advise.
Thanks in advance
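For reference, the Heapster InfluxDB sink documentation I'm familiar with passes its options (db, user, pw) as query parameters after a ?, and the & characters need to be quoted so the shell does not split the argument; a sketch, assuming InfluxDB really is listening on 192.168.56.20:8086 and the heapster binary is in the current directory:
# Options go after '?', and the argument is quoted so '&' is not
# interpreted by the shell.
./heapster --source="cadvisor:external?cadvisorPort=8081" --sink="influxdb:http://192.168.56.20:8086?db=k8s&user=root&pw=root"
A 404 "page not found" from InfluxDB can also mean the InfluxDB API version does not match what this Heapster build expects, so it is worth checking both versions as well.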