Ceph configuration file and ceph-deploy - ceph

I set up a test cluster following the documentation.
I created the cluster with the command ceph-deploy new node1. After that, a ceph configuration file appeared in the current directory, containing information about the monitor on the node with hostname node1. Then I added two OSDs to the cluster.
So now I have a cluster with 1 monitor and 2 OSDs, and the ceph status command reports HEALTH_OK.
Following the same documentation, I moved on to the section "Expanding your cluster" and added two new monitors with the commands ceph-deploy mon add node2 and ceph-deploy mon add node3. Now I have a cluster with three monitors in the quorum and status HEALTH_OK, but there is one small discrepancy: ceph.conf is still the same. It contains the old information about only one monitor. Why didn't the ceph-deploy mon add {node-name} command update the configuration file? And the main question: why does ceph status display correct information about the new cluster state with 3 monitors while ceph.conf doesn't contain this information? Where is the real configuration, and why does ceph-deploy know about it while I don't?
And it works even after a reboot. All ceph daemons start, read the outdated ceph.conf (I checked this with strace) and, ignoring it, work fine with the new configuration.
And the last question: why didn't the ceph-deploy osd activate {ceph-node}:/path/to/directory command update the configuration file either? Why do we need a ceph.conf file at all if we now have such a smart ceph-deploy?

You have multiple questions here.
1) ceph.conf doesn't need to be identical on every node for the cluster to run. For example, an OSD only needs the OSD-related configuration it cares about, and a MON only needs the MON-related configuration (unless you run everything on the same node, which is also not recommended). So MON1 may only list MON1, MON2 only MON2, and MON3 only MON3.
2) When a MON is created and then added, the monitor map (monmap) is updated, so the monitors themselves already know which other MONs are required to form a quorum. A MON does not rely on ceph.conf for quorum information; ceph.conf is mainly there to change run-time configuration.
3) ceph-deploy is just a Python script that prepares and runs the ceph commands for you. If you look into the details, ceph-deploy uses e.g. ceph-disk zap, ceph-disk prepare and ceph-disk activate.
Once an OSD has been prepared and activated, and the disk is formatted as a Ceph partition, udev knows where to mount it. Then the systemd ceph-osd@ service starts ceph-osd at boot. That's why it doesn't need the OSD information in ceph.conf at all.
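If you want the ceph.conf in your ceph-deploy working directory to reflect the new monitors anyway, a minimal sketch (the host names and IPs below are placeholders) is to look at the live monmap, edit the generated file, and push it out:
ceph mon dump        # the authoritative monitor list lives in the monmap, not in ceph.conf
# then edit the ceph.conf that ceph-deploy generated, e.g.
#   mon_initial_members = node1, node2, node3
#   mon_host = 192.168.0.1,192.168.0.2,192.168.0.3
ceph-deploy --overwrite-conf config push node1 node2 node3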

Related

All ceph daemons container images disappeared on a single node after reconfiguring docker logs driver

I changed log_driver to "local" in the daemon.json docker configuration file, because a high level of activity in the RADOS gateway logs had saturated the disk space. My intention was to switch to journald so I could use logrotate. Unfortunately, after restarting the docker daemon, many Ceph services disappeared, as did their container images. So now that node has put the cluster in HEALTH_ERR, because it lost 1 mgr, 1 mon and 3 osd services at the same time.
I've tried to use some ceph commands inside a cephadm shell (on another node), but it freezes and nothing happens. What can I try to do to restore the node's services and the cluster's health?

Monitor daemon running but not in quorum

I'm currently testing OS and version upgrades for a ceph cluster. Starting info:
The cluster is currently on CentOS 7 and Ceph version Nautilus. I'm trying to change the OS to Ubuntu 20.04 and the Ceph version to Octopus. I started by upgrading mon1 first. I will write down the things done, in order.
First I stopped the monitor service - systemctl stop ceph-mon@mon1
Then I removed the monitor from the cluster - ceph mon remove mon1
Then I installed Ubuntu 20.04 on mon1, updated the system and configured ufw.
Installed the Ceph Octopus packages.
Copied ceph.client.admin.keyring and ceph.conf to /etc/ceph/ on mon1.
Copied ceph.mon.keyring to a temporary folder on mon1 and changed ownership to ceph:ceph.
Got the monmap - ceph mon getmap -o ${MONMAP}. The thing is, I did this after removing the monitor.
Created the /var/lib/ceph/mon/ceph-mon1 folder and changed ownership to ceph:ceph.
Created the filesystem for the monitor - sudo -u ceph ceph-mon --mkfs -i mon1 --monmap /folder/monmap --keyring /folder/ceph.mon.keyring
After noticing that I had fetched the monmap after the monitor's removal, I added the monitor manually - ceph mon add mon1 <ip> --fsid <fsid>
After starting it manually and checking the cluster state with ceph -s, I can see mon1 is listed but is not in quorum. The monitor daemon runs fine on the said mon1 node. I noticed in the logs that mon1 is stuck in the "probe" state, and the other monitors' logs show output such as mon1 (rank 2) addr [v2:<ip>:3300/0,v1:<ip>:6789/0] is down (out of quorum). As I said, the monitor daemon is running on mon1 without any visible errors, it is just stuck in the probe state.
I wondered if it was caused by the OS and version change, so I first tried configuring the manager, mds and radosgw daemons by creating the respective folders in /var/lib/ceph/... and copying the keyrings. All these services work fine; I was able to reach my buckets, open the Octopus version dashboard, and the metadata server is listed as active in ceph -s. So evidently my problem is only with the monitor configuration.
After doing some checking, I found this in the Red Hat Ceph documentation:
If the Ceph Monitor is in the probing state longer than expected, it
cannot find the other Ceph Monitors. This problem can be caused by
networking issues, or the Ceph Monitor can have an outdated Ceph
Monitor map (monmap) and be trying to reach the other Ceph Monitors on
incorrect IP addresses. Alternatively, if the monmap is up-to-date,
Ceph Monitor’s clock might not be synchronized.
There is no network error on the monitor; I can reach all the other machines in the cluster. The clocks are synchronized. If this problem is caused by the monmap situation, how can I fix it?
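(For the monmap angle specifically, here is a minimal sketch of how a stale map on mon1 could be inspected and replaced; the mon ID, the /tmp paths and the systemd unit name are assumptions, and as the answer below explains, the actual culprit here turned out to be hostname resolution.)
ceph mon dump                                                      # the map the quorum currently uses
sudo systemctl stop ceph-mon@mon1
sudo -u ceph ceph-mon -i mon1 --extract-monmap /tmp/monmap.local
monmaptool --print /tmp/monmap.local                               # the map the stuck monitor has
ceph mon getmap -o /tmp/monmap.latest                              # fetch a fresh map from the quorum
sudo -u ceph ceph-mon -i mon1 --inject-monmap /tmp/monmap.latest
sudo systemctl start ceph-mon@mon1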
So, as a result: going directly from CentOS 7/Nautilus to Ubuntu 20.04/Octopus is not possible for the monitor services only; apparently the issue is hostname resolution differing between the operating systems. The rest of the services are fine. There is a longer way to do this without issues, and it is the correct solution: first change the OS from CentOS 7 to Ubuntu 18.04, install the Ceph Nautilus packages and add the machines back to the cluster (no issues at all). Then update and upgrade the system and apply "do-release-upgrade". Works like a charm. I think this is what eblock mentioned.
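A rough sketch of that longer route on the monitor node, once it has been re-added on Ubuntu 18.04 with Nautilus (treat this as an outline; the package-handling details are assumptions):
sudo apt update && sudo apt full-upgrade -y    # bring Ubuntu 18.04 fully up to date
sudo do-release-upgrade                        # move the OS to Ubuntu 20.04
# then upgrade the ceph packages to Octopus and restart the monitor
sudo systemctl restart ceph-mon@mon1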

Kubelet failing start attempts pollutes logs

I have a bunch of fresh CentOS servers installed on AWS. The kubelet service pollutes the log file (/var/log/messages) with its attempts to start, but as I have no use for it, I would like to remove it. Is this an optional component of CentOS that I can safely remove (or disable kubelet.service)? I believe so, but I would not expect a brand-new server to push out so many errors.
Currently, 97% of my /var/log/messages logs contain rows like:
Jan 17 03:21:03 systemd: Started kubelet: The Kubernetes Node Agent.
Jan 17 03:21:03 kubelet: F0117 03:21:03.101812 29626 server.go:198] failed to load Kubelet
config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file
"/var/lib/kubelet/config.yaml", error: open /var/lib/kubelet/config.yaml: no such file or
directory
***da da da, 40 more rows***
Jan 17 03:21:03 systemd: Unit kubelet.service entered failed state.
Jan 17 03:21:03 systemd: kubelet.service failed.
Jan 17 03:21:13 systemd: kubelet.service holdoff time over, scheduling restart.
Jan 17 03:21:13 systemd: Stopped kubelet: The Kubernetes Node Agent.
Jan 17 03:21:13 systemd: Started kubelet: The Kubernetes Node Agent.
***sleep for 10s and start all over***
As I have already mentioned in my comment, kubelet is part of a kubernetes cluster; it's the primary node agent that runs on each node. I sincerely doubt that this CentOS image came with it preinstalled. If it really did, and, as you said, it's a "fresh CentOS server" that nobody had previously tinkered with, I would recommend choosing a different image if your servers have nothing to do with a kubernetes cluster. However, if it serves as some kind of production environment and runs other important things, you should investigate how it was installed and then remove it.
I did not do the setup myself, but the template used is
258751437250/ami-centos-7-1.13.0-00-1543960911. We have not asked for
Kubernetes on it and are not using clusters
The simplest answer to your question is:
You can safely stop and disable it so it doesn't pollute your /var/log/messages any more:
sudo systemctl stop kubelet.service && sudo systemctl disable kubelet.service
You can also remove it. Depending on how it was installed, you may need to do it in a specific way.
First check:
yum list installed | grep kubelet
If it's there you can:
yum remove kubelet
If it doesn't return any result you may try:
rpm -qa | grep kubelet
and if anything is found, remove it:
rpm -e kubelet
It may also be a remnant of an old kubernetes installation which was set up with a tool like minikube or kubeadm. To check that, run:
sudo systemctl cat kubelet.service
and take a look at the ExecStart section. Depending on what you find there, it's very likely you'll need to uninstall some other unnecessary components, e.g. if you find something like /var/lib/minikube/binaries/v1.16.0/kubelet, it means it's part of a minikube installation.
Chances are that it was even only partially uninstalled, but there are still some leftovers. As you can see, even its config file cannot be found:
error: open /var/lib/kubelet/config.yaml: no such file or
directory
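If it does turn out to be such a leftover, a minimal cleanup sketch could look like this (the paths are typical for kubeadm/minikube installs and are assumptions; check what actually exists on your host first):
sudo systemctl stop kubelet.service && sudo systemctl disable kubelet.service
# typical leftovers of a kubelet setup; verify each path before deleting
sudo rm -rf /var/lib/kubelet /etc/systemd/system/kubelet.service.d /etc/kubernetes
sudo systemctl daemon-reload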
In case of any doubts or additional questions, don't hesitate to ask.

The connection to the server 10.0.x.x:6443 was refused after restarting the VM where kubernetes master was installed using kubeadm

I installed a Kubernetes master using kubeadm successfully on a VM (VirtualBox). The problem is that if I stop the machine and restart it, the master node seems to be down:
kubectl get nodes
The connection to the server 10.0.x.x:6443 was refused - did you specify the right host or port?
How can I make sure it will always be up after restarting the VM?
UPDATE:
After restarting the VM, this is what I have to do to make the master node start:
sudo swapoff -a
sudo systemctl restart kubelet.service
Why? How can I fix it so that it starts without having to input that?
The problem is that if I stop the machine and restart it the master node seems to be down
Since it was a kubeadm installation that worked properly before the restart, it looks like an environment variable is missing after the restart. Try running this before kubectl get nodes:
export KUBECONFIG=/etc/kubernetes/admin.conf
If it then works normally, you need to make sure that the KUBECONFIG environment variable is properly configured after a restart, e.g. by adding it to .bashrc or similar.
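A common way to make that permanent on a kubeadm master is the standard post-install step of copying admin.conf into your home directory (a sketch; it assumes you run kubectl as a regular user on the master node):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config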
Edited:
Why? How can I fix it so that it starts without having to input that?
Ah, the swap file is teasing you. By default kubelet will not start if swap is enabled. You have two options:
Remove swap: That's easy, just disable it as you already listed, but make it permanent by commenting out the swap line in the /etc/fstab file. Add # before the line that creates the swap mount point and after the next restart you won't have swap (see the sketch after these two options).
Allow kubelet to run with swap enabled: I know, not recommended by the documentation, but if you like to live dangerously, you can add/edit the following line in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf:
Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false"
and after the next restart you will be able to run kubelet with swap enabled.
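A minimal sketch of the first option (the sed pattern assumes a standard swap entry in /etc/fstab; it keeps a backup copy, but check the file afterwards):
sudo swapoff -a                                    # turn swap off right now
sudo sed -i.bak '/\sswap\s/ s/^/#/' /etc/fstab     # comment out the swap line(s)
sudo systemctl restart kubelet.service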
I got my problem fixed by clearing some space on the HDD; it seems that disk space was low. Then I restarted the server and that fixed my problem.
I encountered a similar issue where kubectl commands work on my master node, but the same commands executed on a slave node give me this error:
The connection to the server 10.0.x.x:6443 was refused - did you specify the right host or port?
The solution that worked for me is as follows:
I copied the $KUBECONFIG file from the master and placed it in the slave node's .kube/ location, and it worked (I have only 2 nodes, one master and one slave).
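Roughly, something like this from the slave node (the hostname and the use of scp are assumptions, and reading admin.conf on the master may require root permissions):
mkdir -p $HOME/.kube
scp <master-host>:/etc/kubernetes/admin.conf $HOME/.kube/config
kubectl get nodes    # should now reach the API server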
You just need to kill the kubelet service and restart it again. Pods and containers will be running just as they were before the reboot.
pkill kubelet
and
systemctl restart kubelet
good luck

Ceph-deploy is not creating ceph.bootstrap-rbd.keyring file

I am learning Ceph storage (Luminous) with one admin node and two nodes for OSD, MON etc. I am following the doc http://docs.ceph.com/docs/master/start/quick-ceph-deploy/ to set up my initial storage cluster and got stuck after executing the command below. As per the document, the command should output 6 files, but the file "ceph.bootstrap-rbd.keyring" is missing from the admin node directory where I execute the ceph-deploy commands.
ceph-deploy --username sanadmin mon create-initial
I am not sure whether this is normal behaviour or whether I am really missing something. I appreciate your help with this.
Thanks.
It is not important, because RBD is a native service of Ceph. Do not worry about that.
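If you ever do need that keyring, it can usually be pulled from the cluster afterwards with ceph auth (this assumes the client.bootstrap-rbd entity exists on your monitors):
ceph auth ls | grep bootstrap                                       # list the bootstrap keys the monitors hold
ceph auth get client.bootstrap-rbd -o ceph.bootstrap-rbd.keyring    # write it to a local file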