Ceph-deploy is not creating the ceph.bootstrap-rbd.keyring file

I am learning Ceph storage (Luminous) with one admin node and two nodes for OSD, MON, etc. I am following the guide at http://docs.ceph.com/docs/master/start/quick-ceph-deploy/ to set up my initial storage cluster and got stuck after executing the command below. According to the document, this command should output six keyring files, but ceph.bootstrap-rbd.keyring is missing from the admin node directory where I run the ceph-deploy commands.
ceph-deploy --username sanadmin mon create-initial
I am not sure whether this is normal behaviour or I am really missing something. I appreciate your help on this.
Thanks.

It is not important, because RBD is a native service in Ceph. Do not worry about it.
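If you do want the keyring anyway, it can be checked for and regenerated by hand; the commands below are only a sketch (node1 stands in for one of your monitor hosts, and the cap uses the bootstrap-rbd auth profile that Luminous ships with). First check whether the key exists in the cluster:
ceph auth get client.bootstrap-rbd
If it exists, re-gathering the keys from the admin working directory should pull the file down:
ceph-deploy --username sanadmin gatherkeys node1
If it does not exist, it can be created and written to the expected file:
ceph auth get-or-create client.bootstrap-rbd mon 'allow profile bootstrap-rbd' -o ceph.bootstrap-rbd.keyring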


How can I change the config file of the mongo running on ECS

I changed the mongod.conf.orig of the MongoDB instance running on ECS, but when I restart, the changes are gone.
Here are the details:
I have a MongoDB instance running on ECS, and it keeps crashing due to running out of memory.
I have found the reason: I set the ECS memory to 8 GB, but because MongoDB is running in a container, it detects a larger amount of memory.
When I run db.hostInfo()
I get a memSizeMB higher than 16 GB.
As a result, when I run db.serverStatus().wiredTiger.cache
I get a "maximum bytes configured" higher than 8 GB,
so I need to reduce wiredTigerCacheSizeGB in the config file.
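For reference, that setting lives under storage.wiredTiger.engineConfig in mongod.conf; the value below is only an illustration, not a recommendation:
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 4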
I used the command copilot svc exec -c /bin/sh -n mongo to connect to it.
Then I found a file named mongod.conf.orig.
I ran apt-get install vim to install vi and edited mongod.conf.orig.
But after I restart the mongo task, all my changes are gone, including the vi I just installed.
Did anyone meet the same problem? Any information will be appreciated.
ECS containers have ephemeral storage. In your case, you could create an EFS file system and mount it into the container, then share the configuration through it.
If you use CloudFormation, look at mount points; a rough sketch follows.
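A minimal sketch of what that could look like in a CloudFormation task definition; the file system ID, image, volume name, and mount path are placeholders:
TaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Volumes:
      - Name: mongo-config
        EFSVolumeConfiguration:
          FilesystemId: fs-12345678        # your EFS file system ID
    ContainerDefinitions:
      - Name: mongo
        Image: mongo:latest
        MountPoints:
          - SourceVolume: mongo-config
            ContainerPath: /etc/mongo      # keep the edited mongod.conf here
You would then start mongod with --config /etc/mongo/mongod.conf (path assumed here) so it reads the persisted file, since anything written only to the container's ephemeral storage is lost on restart.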

Monitor daemon running but not in quorum

I'm currently testing OS and version upgrades for a ceph cluster. Starting info:
The cluster is currently on CentOS 7 with Ceph Nautilus. I'm trying to move the OS to Ubuntu 20.04 and the Ceph version to Octopus. I started by upgrading mon1 first. I will write down the steps in order.
First I stopped the monitor service - systemctl stop ceph-mon@mon1
Then I removed the monitor from the cluster - ceph mon remove mon1
Then I installed Ubuntu 20.04 on mon1, updated the system, and configured ufw.
Installed the Ceph Octopus packages.
Copied ceph.client.admin.keyring and ceph.conf to /etc/ceph/ on mon1.
Copied ceph.mon.keyring to a temporary folder on mon1 and changed its ownership to ceph:ceph.
Got the monmap with ceph mon getmap -o ${MONMAP} - the thing is, I did this after removing the monitor.
Created the /var/lib/ceph/mon/ceph-mon1 folder and changed its ownership to ceph:ceph.
Created the filesystem for the monitor - sudo -u ceph ceph-mon --mkfs -i mon1 --monmap /folder/monmap --keyring /folder/ceph.mon.keyring
After noticing that I had fetched the monmap after the monitor's removal, I added the monitor back manually - ceph mon add mon1 <ip> --fsid <fsid>
After starting the daemon manually and checking the cluster state with ceph -s, I can see mon1 is listed but is not in quorum. The monitor daemon runs fine on the mon1 node. I noticed in the logs that mon1 is stuck in the "probe" state, and the other monitors' logs show output such as mon1 (rank 2) addr [v2:<ip>:3300/0,v1:<ip>:6789/0] is down (out of quorum). As I said, the monitor daemon is running on mon1 without any visible errors; it is just stuck in the probe state.
I wondered whether it was caused by the OS and version change, so I first tried configuring the manager, MDS, and RadosGW daemons by creating the respective folders in /var/lib/ceph/... and copying the keyrings. All of these services work fine: I was able to reach my buckets and open the Octopus dashboard, and the metadata server is listed as active in ceph -s. So evidently my problem is only with the monitor configuration.
After doing some checking, I found this in the Red Hat Ceph documentation:
If the Ceph Monitor is in the probing state longer than expected, it
cannot find the other Ceph Monitors. This problem can be caused by
networking issues, or the Ceph Monitor can have an outdated Ceph
Monitor map (monmap) and be trying to reach the other Ceph Monitors on
incorrect IP addresses. Alternatively, if the monmap is up-to-date,
Ceph Monitor’s clock might not be synchronized.
There is no network error on the monitor; I can reach all the other machines in the cluster. The clocks are synchronized. If this problem is caused by the monmap situation, how can I fix it?
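If the local monmap really is stale, the usual approach from the Ceph monitor troubleshooting docs is to fetch the current map from a healthy monitor and inject it into the stuck one while its daemon is stopped. A sketch, with placeholder paths:
ceph mon getmap -o /tmp/monmap        # run on a healthy monitor
# copy /tmp/monmap to mon1, then on mon1:
systemctl stop ceph-mon@mon1
sudo -u ceph ceph-mon -i mon1 --inject-monmap /tmp/monmap
systemctl start ceph-mon@mon1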
OK, so to conclude: going directly from CentOS 7 / Nautilus to Ubuntu 20.04 / Octopus is not possible for the monitor services only; apparently the issue is hostname resolution differing between operating systems. The rest of the services are fine. There is a longer way to do this without issues, and it is the correct solution: first change the OS from CentOS 7 to Ubuntu 18.04, install the ceph-nautilus packages, and add the machines back to the cluster (no issues at all). Then update and upgrade the system and run do-release-upgrade. Works like a charm. I think this is what eblock mentioned.
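A rough outline of that sequence; it assumes a Nautilus package source is already configured on the reinstalled Ubuntu 18.04 node, and the exact package set depends on the node's role:
apt update && apt install ceph-mon
# re-add the node to the cluster the usual way, then upgrade the OS in place:
apt update && apt full-upgrade
do-release-upgrade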

Postgres data still "in use" after server stop

I am running PostgreSQL in a Docker container. Now I want to enable checksums on the database cluster, so I stopped the Docker container and waited some time. But the pg_checksums tool is still complaining:
pg_checksums: error: cluster must be shut down
There is no postgres process or anything similar running any longer, inside Docker or outside it.
Renaming the file postmaster.pid did not change anything.
What do I need to do to convince pg_checksums that it can safely work on the cluster data?
I'm using PostgreSQL 12 and Docker version 19.03.13, build 4484c46d9d, on a CentOS 8 machine.
You need to shut down the database cleanly. Shutting down the container itself apparently did not do that. A tool which does not attach to PostgreSQL's shared memory has no way to know whether the database has crashed or is still running, so you need a clean shutdown.
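One way to get that clean shutdown, assuming the data directory is bind-mounted from the host and the container is named pg (both names are placeholders):
docker start pg
docker exec -u postgres pg pg_ctl stop -m fast -D /var/lib/postgresql/data
Starting the container again lets PostgreSQL finish crash recovery, and pg_ctl stop -m fast then shuts it down cleanly. With the cluster cleanly stopped, pg_checksums can be run from a throwaway container against the same data directory:
docker run --rm -u postgres -v /path/on/host/pgdata:/var/lib/postgresql/data postgres:12 pg_checksums --enable -D /var/lib/postgresql/data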

Ceph configuration file and ceph-deploy

I set up a test cluster following the documentation.
I created the cluster with the command ceph-deploy new node1. After that, a Ceph configuration file appeared in the current directory, containing information about the monitor on the node with hostname node1. Then I added two OSDs to the cluster.
So now I have a cluster with 1 monitor and 2 OSDs. The ceph status command says the status is HEALTH_OK.
Following the same documentation, I moved on to the section "Expanding your cluster" and added two new monitors with the commands ceph-deploy mon add node2 and ceph-deploy mon add node3. Now I have a cluster with three monitors in quorum and status HEALTH_OK, but there is one little discrepancy for me: ceph.conf is still the same. It contains the old information about only one monitor. Why didn't the ceph-deploy mon add {node-name} command update the configuration file? And the main question: why does ceph status display correct information about the new cluster state with 3 monitors while ceph.conf doesn't contain this information? Where is the real configuration, and why does ceph-deploy know about it while I don't?
And it works even after a reboot. All Ceph daemons start, read the outdated ceph.conf (I checked this with strace) and, ignoring it, work fine with the new configuration.
And the last question: why didn't the ceph-deploy osd activate {ceph-node}:/path/to/directory command update the configuration file either? After all, why do we need a ceph.conf file at all if ceph-deploy is so smart now?
You have multiple questions here.
1) ceph.conf doesn't need to be identical on all nodes for them to run. E.g. OSDs only need the OSD configuration they care about, and MONs only need the configuration MONs care about (unless you run everything on the same node, which is also not recommended). So maybe your MON1's ceph.conf lists only MON1, MON2's lists only MON2, and MON3's lists only MON3.
2) When a MON is created and then added, the monitor map is updated, so the MONs themselves already know which other MONs are required for quorum. A MON doesn't rely on ceph.conf for quorum information; ceph.conf is there to change run-time configuration.
3) ceph-deploy is just a Python script that prepares and runs the Ceph commands for you. If you look into the details, ceph-deploy uses e.g. ceph-disk zap, prepare, and activate.
Once an OSD has been prepared and activated and the disk is formatted as a Ceph partition, udev knows where to mount it. Then the systemd ceph-osd service activates ceph-osd at boot. That's why it doesn't need the OSD information in ceph.conf at all.
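Two commands are useful here (node names are placeholders). ceph mon dump shows the live monitor map the daemons actually rely on, and if you edit ceph.conf by hand in the ceph-deploy working directory (e.g. adding the new monitors to mon_initial_members and mon_host), you can redistribute it with ceph-deploy:
ceph mon dump
ceph-deploy --overwrite-conf config push node1 node2 node3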

Moving MongoDB dbpath to an AWS EBS device

I'm using CentOS 7 via AWS.
I'd like to store MongoDB data on an attached EBS instead of the default /var/lib path.
However, when I edit /etc/mongod.conf to point to a new dbpath, I'm getting a permission denied error.
Permissions on the directory are set correctly to mongod:mongod.
What gives?
TL;DR - The issue is SELinux, which restricts what daemons can access. Run setenforce 0 to disable it temporarily.
You're using a flavour of Linux that uses SELinux.
From Wikipedia:
SELinux can potentially control which activities a system allows each
user, process and daemon, with very precise specifications. However,
it is mostly used to confine daemons like database
engines or web servers that have more clearly defined data access and
activity rights. This limits potential harm from a confined daemon
that becomes compromised. Ordinary user-processes often run in the
unconfined domain, not restricted by SELinux but still restricted by
the classic Linux access rights
To fix temporarily:
sudo setenforce 0
This should disable SELinux policies and allow the service to run.
To fix permanently:
Edit /etc/sysconfig/selinux and set this:
SELINUX=disabled
Then reboot.
The service should now start up fine.
The data dir will also work with Docker, i.e. something like:
docker run --name db -v /mnt/path-to-mounted-ebs:/data/db -p 27017:27017 mongo:latest
Warning: Both solutions DISABLE the security that SELinux provides, which will weaken your overall security. A better solution is to understand how SELinux works, and create a policy on your new data dir that works with mongod. See https://wiki.centos.org/HowTos/SELinux for a more complete tutorial.
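As a sketch of that better approach, assuming the mongod SELinux policy module is present and the new data dir is /mnt/ebs/mongo (adjust the path; semanage comes from policycoreutils-python on CentOS 7):
sudo semanage fcontext -a -t mongod_var_lib_t "/mnt/ebs/mongo(/.*)?"
sudo restorecon -R -v /mnt/ebs/mongo
sudo systemctl restart mongod
This labels the new directory with the type the mongod policy already allows, instead of turning SELinux off.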