Kubernetes scheduler: watch of *api.Pod ended with error: unexpected end of JSON input

Yesterday the service worked fine, but today when I checked its state I saw:
Mar 11 14:03:16 coreos-1 systemd[1]: scheduler.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Mar 11 14:03:16 coreos-1 systemd[1]: Unit scheduler.service entered failed state.
Mar 11 14:03:16 coreos-1 systemd[1]: scheduler.service failed.
Mar 11 14:03:16 coreos-1 systemd[1]: Starting Kubernetes Scheduler...
Mar 11 14:03:16 coreos-1 systemd[1]: Started Kubernetes Scheduler.
Mar 11 14:08:16 coreos-1 kube-scheduler[4659]: E0311 14:08:16.808349 4659 reflector.go:118] watch of *api.Service ended with error: very short watch
Mar 11 14:08:16 coreos-1 kube-scheduler[4659]: E0311 14:08:16.811434 4659 reflector.go:118] watch of *api.Pod ended with error: unexpected end of JSON input
Mar 11 14:08:16 coreos-1 kube-scheduler[4659]: E0311 14:08:16.847595 4659 reflector.go:118] watch of *api.Pod ended with error: unexpected end of JSON input
This is really confusing, because etcd, flannel and the apiserver all work fine.
The only strange log entries are from etcd:
Mar 11 20:22:21 coreos-1 etcd[472]: [etcd] Mar 11 20:22:21.572 INFO | aba44aa0670b4b2e8437c03a0286d779: warning: heartbeat time out peer="6f4934635b6b4291bf29763add9bf4c7" missed=1 backoff="2s"
Mar 11 20:22:48 coreos-1 etcd[472]: [etcd] Mar 11 20:22:48.269 INFO | aba44aa0670b4b2e8437c03a0286d779: warning: heartbeat time out peer="6f4934635b6b4291bf29763add9bf4c7" missed=1 backoff="2s"
Mar 11 20:48:12 coreos-1 etcd[472]: [etcd] Mar 11 20:48:12.070 INFO | aba44aa0670b4b2e8437c03a0286d779: warning: heartbeat time out peer="6f4934635b6b4291bf29763add9bf4c7" missed=1 backoff="2s"
So I'm really stuck and don't know what's wrong. How can I resolve this problem? Or how can I get more detailed logs for the scheduler?
journalctl gives me the same logs as systemctl status.

Please see: https://github.com/GoogleCloudPlatform/kubernetes/issues/5311
It means apiserver accepted the watch request but then immediately terminated the connection.
If you see it occasionally, it implies a transient error and is not alarming. If you see it repeatedly, it implies that apiserver (or etcd) is sick.
Is something actually not working for you?
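For the second part of the question (getting more detail out of the scheduler), one option in this kind of setup is to raise the glog verbosity through the KUBE_LOG_LEVEL variable that these units pass to their binaries. The file path below is the conventional one and may differ on your install; check the EnvironmentFile= line in your unit:

```ini
# /etc/kubernetes/scheduler (path is an assumption -- check your unit's
# EnvironmentFile): raise kube-scheduler's log verbosity from the default
KUBE_LOG_LEVEL="--v=4"
```

After editing, restart the unit and follow it with journalctl -u scheduler.service -f; the watch errors should then come with much more surrounding detail.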

Related

MongoDB keeps crashing--is it running out of memory or CPU?

I have an issue where MongoDB keeps crashing on me.
Running sudo cat /var/log/syslog shows:
Nov 28 14:06:58 ns557017 systemd[1]: mongod.service: Main process exited, code=killed, status=6/ABRT
Nov 28 14:06:58 ns557017 systemd[1]: mongod.service: Unit entered failed state.
Nov 28 14:06:58 ns557017 systemd[1]: mongod.service: Failed with result 'signal'.
Nov 28 14:06:59 ns557017 systemd[1]: mongod.service: Service hold-off time over, scheduling restart.
Nov 28 14:06:59 ns557017 systemd[1]: Stopped MongoDB Database Server.
I am using Mongo's free monitoring, and it points me towards the CPU being overused.
However, if I look at htop, the CPU always seems fine.
How can I deduce what is causing Mongo to crash? Thanks
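One detail worth decoding before blaming the CPU: systemd reported code=killed, status=6/ABRT. Signal 6 is SIGABRT, which mongod typically raises on itself when an internal assertion fails; a kernel OOM kill would instead show up as signal 9 (SIGKILL) and leave a record in the kernel log. A quick sketch of the signal numbers:

```shell
# Decode the signal numbers systemd reports in "status=N/<NAME>"
kill -l 6    # prints ABRT -- mongod aborted itself; check mongod.log for why
kill -l 9    # prints KILL -- what a kernel OOM kill would look like instead
```

So the place to look is mongod's own log just before the stack trace of the abort, rather than the CPU graph.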

Kafka services aren't coming up

When I start the Kafka service on my RHEL machine, it fails and prints the following error. Nothing is printed in the logs either.
[root@node01 java]# systemctl start kafka
Job for kafka.service failed because a configured resource limit was exceeded. See "systemctl status kafka.service" and "journalctl -xe" for details.
I have cross-verified the Kafka configuration files against another machine with a similar setup, and it all looks fine. I also checked a few online resources, but nothing turned out to be helpful.
Any thoughts?
The output of journalctl -xe is as follows:
-- Unit kafka.service has begun starting up.
May 28 15:30:09 hostm01 runuser[30740]: pam_unix(runuser:session): session opened for user ossadm by (uid=0)
May 28 15:30:09 hostm01 runuser[30740]: pam_unix(runuser:session): session closed for user ossadm
May 28 15:30:09 hostm01 kafka[30733]: Starting kafka ... [ OK ]
May 28 15:30:11 hostm01 kafka[30733]: [ OK ]
May 28 15:30:11 hostm01 systemd[1]: Started Apache Kafka.
-- Subject: Unit kafka.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit kafka.service has finished starting up.
--
-- The start-up result is done.
May 28 15:30:11 hostm01 systemd[1]: kafka.service: main process exited, code=exited, status=1/FAILURE
May 28 15:30:12 hostm01 runuser[31178]: pam_unix(runuser:session): session opened for user ossadm by (uid=0)
May 28 15:30:12 hostm01 kafka[31171]: Stopping kafka ... STOPPED
May 28 15:30:12 hostm01 runuser[31178]: pam_unix(runuser:session): session closed for user ossadm
May 28 15:30:12 hostm01 kafka[31171]: [17B blob data]
May 28 15:30:12 hostm01 systemd[1]: Unit kafka.service entered failed state.
May 28 15:30:12 hostm01 systemd[1]: kafka.service failed.
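For what it's worth, "failed because a configured resource limit was exceeded" is systemd's 'resources' failure result, which usually means the unit tripped over one of its own Limit*/accounting settings rather than a Kafka bug. Comparing systemctl show kafka.service -p LimitNOFILE against what Kafka needs is a reasonable first check; if the open-file limit turns out to be the culprit, a drop-in like this raises it (the path and value here are assumptions, not a confirmed fix):

```ini
# /etc/systemd/system/kafka.service.d/limits.conf (hypothetical drop-in)
[Service]
LimitNOFILE=100000
```

followed by systemctl daemon-reload and systemctl restart kafka.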

Ubuntu service stops randomly with "Main Process exited, status 143/n/a"

My apps are deployed as Debian packages and started via systemd services. They are crashing randomly, and I am unable to find the reason.
I have 4 applications running (built with Java and Scala), of which two (named op and common) are getting killed. All are started using systemd services.
The error in syslog is:
Jul 22 11:45:44 misqa mosquitto[2930]: Socket error on client 005056b76983-Common, disconnecting
Jul 22 11:45:44 misqa systemd[1]: commonmod.service: Main process exited, code=exited, status=143/n/a
Jul 22 11:45:44 misqa systemd[1]: commonmod.service: Unit entered failed state
Jul 22 11:45:44 misqa systemd[1]: commonmod.service: Failed with result 'exit-code'
Jul 22 11:45:44 misqa systemd[1]: opmod.service: Main process exited, code=exited, status=143/n/a
Jul 22 11:45:44 misqa systemd[1]: opmod.service: Unit entered failed state
Jul 22 11:45:44 misqa systemd[1]: opmod.service: Failed with result 'exit-code'
But I am not getting any error in the application log files for either op or common.
Reading further, I understood that the crash is caused by a SIGTERM signal, but I am unable to find out what is sending it. None of these applications runs exec commands such as killall.
Is there any way to identify which process is killing my applications?
My systemd service is like this:
[Unit]
Description=common Module
After=common-api
Requires=common-api
[Service]
TimeoutStartSec=0
ExecStart=/usr/bin/common-api
[Install]
WantedBy=multi-user.target
Basically Java programs sometimes don't send back the expected exit status when shutting down in response to SIGTERM.
You should be able to suppress this by adding the exit code into the systemd service file as a "success" exit status:
[Service]
SuccessExitStatus=143
This solution was successfully applied here (Server Fault) and here (Stack Overflow), both with Java apps.
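For context on why 143 specifically is the right value: exit statuses above 128 encode death by a signal, with the signal number being the status minus 128, so 143 means signal 15, SIGTERM, which is exactly what systemd sends on a normal stop. A quick sketch:

```shell
# Exit statuses above 128 mean "terminated by signal (status - 128)"
status=143
sig=$((status - 128))
name=$(kill -l "$sig")
echo "status $status => signal $sig ($name)"   # signal 15 is TERM
```

That is why whitelisting 143 via SuccessExitStatus is safe: it only reclassifies a clean SIGTERM shutdown, not genuine failures.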

Job for kube-apiserver.service failed because the control process exited with error code

At the beginning I want to point out that I am fairly new to Linux systems, and totally new to Kubernetes, so my question may be trivial.
As stated in the title, I have a problem setting up a Kubernetes cluster. I am working on Atomic Host Version 7.1707 (2017-07-31 16:12:06).
I am following this guide:
http://www.projectatomic.io/docs/gettingstarted/
In addition to that, I followed this:
http://www.projectatomic.io/docs/kubernetes/
To be precise, I ran this command:
rpm-ostree install kubernetes-master --reboot
Everything was going fine until this point:
systemctl start etcd kube-apiserver kube-controller-manager kube-scheduler
The problem is with:
systemctl start etcd kube-apiserver
as it gives me back this response:
Job for kube-apiserver.service failed because the control process
exited with error code. See "systemctl status kube-apiserver.service"
and "journalctl -xe" for details.
systemctl status kube-apiserver.service
gives me back:
● kube-apiserver.service - Kubernetes API Server
Loaded: loaded (/usr/lib/systemd/system/kube-apiserver.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Fri 2017-08-25 14:29:56 CEST; 2s ago
Docs: https://github.com/GoogleCloudPlatform/kubernetes
Process: 17876 ExecStart=/usr/bin/kube-apiserver $KUBE_LOGTOSTDERR $KUBE_LOG_LEVEL $KUBE_ETCD_SERVERS $KUBE_API_ADDRESS $KUBE_API_PORT $KUBELET_PORT $KUBE_ALLOW_PRIV $KUBE_SERVICE_ADDRESSES $KUBE_ADMISSION_CONTROL $KUBE_API_ARGS (code=exited, status=255)
Main PID: 17876 (code=exited, status=255)
Aug 25 14:29:56 master systemd[1]: kube-apiserver.service: main process exited, code=exited, status=255/n/a
Aug 25 14:29:56 master systemd[1]: Failed to start Kubernetes API Server.
Aug 25 14:29:56 master systemd[1]: Unit kube-apiserver.service entered failed state.
Aug 25 14:29:56 master systemd[1]: kube-apiserver.service failed.
Aug 25 14:29:56 master systemd[1]: kube-apiserver.service holdoff time over, scheduling restart.
Aug 25 14:29:56 master systemd[1]: start request repeated too quickly for kube-apiserver.service
Aug 25 14:29:56 master systemd[1]: Failed to start Kubernetes API Server.
Aug 25 14:29:56 master systemd[1]: Unit kube-apiserver.service entered failed state.
Aug 25 14:29:56 master systemd[1]: kube-apiserver.service failed.
I have no clue where to start and will be more than thankful for any advice.
It turned out to be a typo in /etc/kubernetes/config: I had misunderstood the "# Comma separated list of nodes in the etcd cluster" comment.
I don't know how to close the thread or anything.
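For anyone hitting the same thing: that comment refers to the KUBE_ETCD_SERVERS line in /etc/kubernetes/config, where "comma separated" means a single flag whose value joins all etcd member URLs with commas, not multiple flags. A sketch with made-up addresses (the flag may be spelled --etcd_servers on older builds):

```ini
# /etc/kubernetes/config -- the addresses below are examples only
KUBE_ETCD_SERVERS="--etcd-servers=http://10.0.0.1:2379,http://10.0.0.2:2379"
```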

Could not start RStudio Server 0.99.893-x86_64

I installed it on a CentOS 7 box.
The RStudio Server service could not start.
I ran the command
systemctl status rstudio-server.service
and it showed:
● rstudio-server.service - RStudio Server
Loaded: loaded (/etc/systemd/system/rstudio-server.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit) since Thu 2016-01-28 20:18:20 ICT; 1min 6s ago
Process: 48820 ExecStart=/usr/lib/rstudio-server/bin/rserver (code=exited, status=203/EXEC)
Jan 28 20:18:20 localhost.localdomain systemd[1]: rstudio-server.service: control process exited, code=exited s...=203
Jan 28 20:18:20 localhost.localdomain systemd[1]: Failed to start RStudio Server.
Jan 28 20:18:20 localhost.localdomain systemd[1]: Unit rstudio-server.service entered failed state.
Jan 28 20:18:20 localhost.localdomain systemd[1]: rstudio-server.service failed.
Jan 28 20:18:20 localhost.localdomain systemd[1]: rstudio-server.service holdoff time over, scheduling restart.
Jan 28 20:18:20 localhost.localdomain systemd[1]: start request repeated too quickly for rstudio-server.service
Jan 28 20:18:20 localhost.localdomain systemd[1]: Failed to start RStudio Server.
Jan 28 20:18:20 localhost.localdomain systemd[1]: Unit rstudio-server.service entered failed state.
Jan 28 20:18:20 localhost.localdomain systemd[1]: rstudio-server.service failed.
I installed and ran an old version (rstudio-server-0.99.491-1.x86_64) on the same box without any problem.
How can I fix this issue?
Although you asked this question 3 years ago, I think it's still worth sharing my solution to this problem.
I encountered it after I updated R.
The reason you cannot restart rstudio-server is that port 8787 is still in use by the previous rserver process. Once you know this, the solution is easy.
First, check the PID that is using port 8787:
sudo netstat -anp | grep 8787
tcp 0 0 0.0.0.0:8787 0.0.0.0:* LISTEN pid/rserver
Second, kill this PID (use your own PID):
sudo kill -9 pid
Third, restart rstudio-server or reinstall the RStudio Server package.