My Kubernetes version is 1.2.
I want to watch the scheduler's logs, so how can I set kube-scheduler to print its logs to a file?
The kube-scheduler's configuration is at this path: /etc/kubernetes/scheduler.
And the global configuration is at this path: /etc/kubernetes/config.
In those files we can see these notes:
# logging to stderr means we get it in the systemd journal
KUBE_LOGTOSTDERR="--logtostderr=true"
# journal message level, 0 is debug
KUBE_LOG_LEVEL="--v=0"
You can tail the service's logs (if running under systemd): journalctl -u kube-scheduler -f
Or, if it runs as a container, find the container ID of the scheduler and tail it with Docker: docker logs -f <container_id>
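If you specifically want the logs in a file rather than the journal, here is a minimal sketch of /etc/kubernetes/scheduler, assuming the standard glog flags these components accept (the log directory and the KUBE_SCHEDULER_ARGS variable name follow the common RPM packaging and may differ on your system):
# stop sending everything to stderr/journald
KUBE_LOGTOSTDERR="--logtostderr=false"
# glog writes kube-scheduler.* log files into this directory
KUBE_SCHEDULER_ARGS="--log-dir=/var/log/kubernetes"
Create the directory and restart the scheduler service afterwards.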
For testing purposes, I'm trying to connect a module that introduces an abstraction layer over s3fs with custom business logic.
It seems like I'm having trouble connecting the s3fs client to the MinIO container.
Here's how I created the container and attached the s3fs client (below I describe how I validated that the container is running properly):
import s3fs
import docker

client = docker.from_env()
container = client.containers.run(
    'minio/minio',
    "server /data --console-address ':9090'",
    environment={
        "MINIO_ACCESS_KEY": "minio",
        "MINIO_SECRET_KEY": "minio123",
    },
    ports={
        "9000/tcp": 9000,
        "9090/tcp": 9090,
    },
    volumes={'/tmp/minio': {'bind': '/data', 'mode': 'rw'}},
    detach=True,
)
container.reload()  # why reload: https://github.com/docker/docker-py/issues/2681

fs = s3fs.S3FileSystem(
    anon=False,
    key='minio',
    secret='minio123',
    use_ssl=False,
    client_kwargs={
        'endpoint_url': 'http://localhost:9000',  # tried 127.0.0.1:9000 with no success
    },
)
===========
>>> fs.ls('/')
[]
>>> fs.ls('/data')
(raises a "bucket does not exist" exception)
Check that the container is running:
➜ ~ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
127e22c19a65 minio/minio "/usr/bin/docker-ent…" 56 seconds ago Up 55 seconds 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp, 0.0.0.0:9090->9090/tcp, :::9090->9090/tcp hardcore_ride
Check that the relevant volume is attached:
➜ ~ docker exec -it 127e22c19a65 bash
[root@127e22c19a65 /]# ls -l /data/
total 4
-rw-rw-r-- 1 1000 1000 4 Jan 11 16:02 foo.txt
[root@127e22c19a65 /]# exit
Since I proved the volume binding works properly by shelling into the container, I expected to see the same results when accessing the container's filesystem via the s3fs client.
What is the bucket name that was created as part of this setup?
From the docs, I see you have to use <bucket_name>/<object_path> syntax to access the resources:
fs.ls('my-bucket')
['my-file.txt']
Also, if you look at the docs below, there are a couple of other ways to access it using fs.open. Can you give that a try?
https://buildmedia.readthedocs.org/media/pdf/s3fs/latest/s3fs.pdf
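For completeness: MinIO exposes each top-level directory under /data as a bucket, so a file sitting directly in /tmp/minio (like foo.txt above) lives outside any bucket and is invisible to S3 clients. A minimal sketch reusing the fs object from the question (the bucket name my-bucket is illustrative):
fs.mkdir('my-bucket')  # creates /data/my-bucket inside the container, i.e. a new bucket
with fs.open('my-bucket/hello.txt', 'wb') as f:  # write an object into the bucket
    f.write(b'hello')
print(fs.ls('my-bucket'))  # ['my-bucket/hello.txt']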
I use Filebeat with ELK. I started it with the nohup command:
nohup ./filebeat -e -c filebeat.yml -d "publish" > filebeat.log &
The application stopped automatically after one day. The close_inactive parameter does not seem to work. Is there any configuration I missed that would cause this problem?
2020-10-22T09:55:36.814+0100 INFO crawler/crawler.go:165 Crawler stopped
2020-10-22T09:55:36.815+0100 INFO registrar/registrar.go:367 Stopping Registrar
2020-10-22T09:55:36.815+0100 INFO registrar/registrar.go:293 Ending Registrar
2020-10-22T09:55:36.820+0100 INFO [monitoring] log/log.go:153 Total non-zero metrics {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":10540,"time":{"ms":10547}},"total":{"ticks":68190,"time":{"ms":68203},"value":68190},"user":{"ticks":57650,"time":{"ms":57656}}},"handles":{"limit":{"hard":16000,"soft":16000},"open":10},"info":{"ephemeral_id":"b57f1c4d-7a80-4f1f-aaba-5ab9ee057757","uptime":{"ms":7119571}},"memstats":{"gc_next":22377264,"memory_alloc":11462592,"memory_total":18240359416,"rss":50831360},"runtime":{"goroutines":21}},"filebeat":{"events":{"added":528063,"done":528063},"harvester":{"closed":77,"open_files":0,"running":0,"started":77},"input":{"log":{"files":{"truncated":38}}}},"libbeat":{"config":{"module":{"running":0},"reloads":1},"output":{"events":{"acked":527884,"batches":4732,"failed":51426,"total":579310},"read":{"bytes":32364,"errors":4},"type":"logstash","write":{"bytes":180629879,"errors":19}},"pipeline":{"clients":0,"events":{"active":0,"filtered":179,"published":527884,"retry":99719,"total":528063},"queue":{"acked":527884}}},"registrar":{"states":{"cleanup":8,"current":38,"update":528063},"writes":{"success":4356,"total":4356}},"system":{"cpu":{"cores":8},"load":{"1":0.66,"15":0.52,"5":0.56,"norm":{"1":0.0825,"15":0.065,"5":0.07}}}}}}
2020-10-22T09:55:36.820+0100 INFO [monitoring] log/log.go:154 Uptime: 1h58m39.572210325s
2020-10-22T09:55:36.820+0100 INFO [monitoring] log/log.go:131 Stopping metrics logging.
2020-10-22T09:55:36.820+0100 INFO instance/beat.go:432 filebeat stopped.
What is the content of filebeat.yml? It can stop, for example, if you didn't define any paths.
Also, you might want to change the logging level to get more information about what happened:
logging.level: debug
Stop the Filebeat service and run Filebeat in debug mode from the command line to check for any issue in your configuration, using the command below from the Filebeat home directory:
filebeat -e -c filebeat.yml -d "*"
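For reference, a minimal filebeat.yml sketch with input paths defined and Filebeat's own logs persisted to disk (all paths and the Logstash host are placeholders, not values from the question; key names match Filebeat 6.x/7.x):
filebeat.inputs:
  - type: log
    paths:
      - /var/log/myapp/*.log   # placeholder path
    close_inactive: 5m
output.logstash:
  hosts: ["localhost:5044"]    # placeholder host
logging.level: debug
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat
  keepfiles: 7
Running Filebeat as a proper systemd service instead of under nohup also keeps it from dying with the shell session.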
I'm running Fluent Bit (td-agent-bit) on a CentOS system in order to ship all logs to a centralized system. Every time Fluent Bit pushes a record to the remote location, it adds a record to /var/log/messages as well, leading to a huge log file.
Jul 21 08:48:53 hostname td-agent-bit: [2020/07/21 08:48:53] [ info] [out_azure] customer_id=XXXXXXXXXXXXXXXXXXXXXXXX, HTTP status=200
Any idea how I can stop a service (td-agent-bit) from writing to /var/log/messages? I couldn't find any configuration parameter (e.g. verbose) for this in the Fluent Bit documentation. Thanks!
Your log_level is "info", which includes a lot of messages from the pipeline. You can either decrease the log level inside the output section of the plugin to "error" only, e.g.:
[OUTPUT]
    name      azure
    match     *
    log_level error
Note: you can also decrease the general log_level in the main [SERVICE] section.
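For example, a sketch of the main section (the log_file path is an arbitrary choice; writing Fluent Bit's own logs to a file also keeps them out of the journal and hence out of /var/log/messages):
[SERVICE]
    log_level error
    log_file  /var/log/td-agent-bit.log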
Problem encountered
When deploying a cluster with Kubespray, CRI-O and Cilium, I get an error about having multiple CRI sockets to choose from.
Full error
fatal: [p3kubemaster1]: FAILED! => {"changed": true, "cmd": " mkdir -p /etc/kubernetes/external_kubeconfig && /usr/local/bin/kubeadm init phase kubeconfig admin --kubeconfig-dir /etc/kubernetes/external_kubeconfig --cert-dir /etc/kubernetes/ssl --apiserver-advertise-address 10.10.3.15 --apiserver-bind-port 6443 >/dev/null && cat /etc/kubernetes/external_kubeconfig/admin.conf && rm -rf /etc/kubernetes/external_kubeconfig ", "delta": "0:00:00.028808", "end": "2019-09-02 13:01:11.472480", "msg": "non-zero return code", "rc": 1, "start": "2019-09-02 13:01:11.443672", "stderr": "Found multiple CRI sockets, please use --cri-socket to select one: /var/run/dockershim.sock, /var/run/crio/crio.sock", "stderr_lines": ["Found multiple CRI sockets, please use --cri-socket to select one: /var/run/dockershim.sock, /var/run/crio/crio.sock"], "stdout": "", "stdout_lines": []}
Interesting part
kubeadm init phase kubeconfig admin --kubeconfig-dir /etc/kubernetes/external_kubeconfig [...] >/dev/null,"stderr": "Found multiple CRI sockets, please use --cri-socket to select one: /var/run/dockershim.sock, /var/run/crio/crio.sock"}
What I've tried
1) I've tried to set the --cri-socket flag inside /var/lib/kubelet/kubeadm-flags.env:
KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=/var/run/crio/crio.sock --cri-socket=/var/run/crio/crio.sock"
=> Makes no difference
2) I've checked /etc/kubernetes/kubeadm-config.yaml, but it already contains the following section:
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.10.3.15
  bindPort: 6443
certificateKey: 9063a1ccc9c5e926e02f245c06b8d9f2ff3xxxxxxxxxxxx
nodeRegistration:
  name: p3kubemaster1
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
  criSocket: /var/run/crio/crio.sock
=> It already ends with the criSocket key, so nothing to do...
3) I tried to edit the Ansible script to add --cri-socket to the existing command, but it fails with "Unknown command --cri-socket".
Existing:
{% if kubeadm_version is version('v1.14.0', '>=') %}
init phase
Tried:
{% if kubeadm_version is version('v1.14.0', '>=') %}
init phase --cri-socket /var/run/crio/crio.sock
Theories
It seems that the problem comes from the kubeadm init phase command, which is not compatible with the --cri-socket flag (see point 3).
Even though the correct socket is set in the config file (see point 2), kubeadm init phase is not using it.
Any ideas would be appreciated ;-)
Thanks!
This worked for me with multiple CRI sockets:
kubeadm init --pod-network-cidr=10.244.0.0/16 --cri-socket=unix:///var/run/cri-dockerd.sock
Image pull command before initialization, when multiple CRI sockets are present:
kubeadm config images pull --cri-socket=unix:///var/run/cri-dockerd.sock
You can choose the CRI socket path from the following table. See the original documentation here.
Runtime                              Path to Unix domain socket
containerd                           unix:///var/run/containerd/containerd.sock
CRI-O                                unix:///var/run/crio/crio.sock
Docker Engine (using cri-dockerd)    unix:///var/run/cri-dockerd.sock
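Before running kubeadm, you can sanity-check which runtime is actually listening by pointing crictl at a socket from the table (assuming crictl is installed on the node):
crictl --runtime-endpoint unix:///var/run/crio/crio.sock info
If this prints the runtime status, that socket is live and safe to pass via --cri-socket.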
I finally got it!
The initial kubespray command was:
kubeadm init phase kubeconfig admin --kubeconfig-dir {{ kube_config_dir }}/external_kubeconfig
⚠️ It seems that, when called with the --kubeconfig-dir flag, kubeadm was not taking the configured CRI socket into account.
So I changed the line to:
kubeadm init phase kubeconfig admin --config /etc/kubernetes/kubeadm-config.yaml
For people having similar issues:
The InitConfig part that made it work on the master is the following:
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 10.10.3.15
  bindPort: 6443
certificateKey: 9063a1ccc9c5e926e02f245c06b8d9f2ff3c1eb2dafe5fbe2595ab4ab2d3eb1a
nodeRegistration:
  name: p3kubemaster1
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
  criSocket: /var/run/crio/crio.sock
In Kubespray you must update the file roles/kubernetes/client/tasks/main.yml around line 57.
You'll have to comment out the initial --kubeconfig-dir section and replace it with the path to the InitConfig file.
For me it was generated by Kubespray in /etc/kubernetes/kubeadm-config.yaml on the kube master. Check that this file exists on your side and that it contains the criSocket key in the nodeRegistration section.
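For reference, a sketch of what the edited command in roles/kubernetes/client/tasks/main.yml can look like (the {{ bin_dir }} variable and the exact surrounding Jinja are assumptions based on the snippet quoted above, not the verbatim Kubespray source):
{% if kubeadm_version is version('v1.14.0', '>=') %}
{{ bin_dir }}/kubeadm init phase kubeconfig admin --config /etc/kubernetes/kubeadm-config.yaml
{% endif %}
With --config, kubeadm reads criSocket from the InitConfiguration instead of auto-detecting the runtime, so the "multiple CRI sockets" check never triggers.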
I have done some research and came upon this GitHub thread,
which then pointed me to another one here.
This seems to be a kubeadm issue that was already fixed, so the solution is available in v1.15.
Could you please upgrade to that version (I am not sure which one you are using, based on both of your questions that I have worked on) and see if the problem still persists?
I am using the Kubernetes load-balancer (here the HAProxy configuration is rewritten every 10s and the process restarted). Since I want to pass the socket connection while reloading HAProxy, I changed the Dockerfile of HAProxy so that it uses the HAProxy 1.8-dev2 version. The image used is haproxytech/haproxy-ubuntu:1.8-dev2. I also added the following line under the global section of the template.cfg file (this is the template from which the HAProxy configuration is written):
stats socket /var/run/haproxy/admin.sock mode 660 level admin expose-fd listeners
I also changed the reload command in the haproxy_reload file as follows:
haproxy -f /etc/haproxy/haproxy.cfg -p /var/run/haproxy.pid -x /var/run/haproxy/admin.sock -sf $(cat /var/run/haproxy.pid)
Once I run the Docker image, I get the following error (kubectl create -f rc.yaml --namespace load-balancer):
W1027 07:13:37.922565 5 service_loadbalancer.go:687] Requeuing kube-system/kube-dns because of error: error restarting haproxy -- [WARNING] 299/071337 (21) : We didn't get the expected number of sockets (expecting 1347703880 got 0)
[ALERT] 299/071337 (21) : Failed to get the sockets from the old process!
: exit status 1
FYI:
I commented out the stats socket line in the template.cfg file and ran the Docker image to verify whether the restart command identifies the socket. The same error occurred. It seems the soft restart command doesn't identify the stats socket created by HAProxy.
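One quick sanity check is to confirm, inside the container, that the stats socket exists and answers before the reload runs (assuming socat is available in the image; the paths are the ones from the question):
ls -l /var/run/haproxy/admin.sock
echo "show info" | socat stdio /var/run/haproxy/admin.sock
If the socket is missing, make sure the /var/run/haproxy directory exists before HAProxy starts, since HAProxy cannot create the socket in a non-existent directory.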