Accelerate qemu-img convert (copy) of RBD volumes between different Ceph clusters

Is there an elegant way to copy an RBD volume to another Ceph cluster?
I measured the conversion time with qemu-img version 2.5 and version 6.0 by copying a volume (2.5T capacity, only 18G used) to another Ceph cluster.
qemu-img [2.5 or 6.0] convert -p -f raw rbd:pool_1/volume-orig_id:id=cinder:conf=1_ceph.conf:keyring=1_ceph.client.cinder.keyring -O raw rbd:pool_2/volume-new_id:id=cinder:conf=2_ceph.conf:keyring=2_ceph.client.cinder.keyring [-n -m 16 -W -S 4k]
Test qemu-img convert result:
qemu-img 2.5 took 2 hours and 40 minutes with no extra option parameters.
qemu-img 6.0 took 3 hours and 3 minutes with the option parameters -m 16 -W -S 4k.
Questions:
1. Why does version 2.5 write only the used disk capacity (18G), while version 6.0 writes the whole 2.5T disk?
2. How can qemu-img (2.5 or 6.0) be used to speed up converting an RBD volume to another Ceph cluster, or is there some other approach?

The key factor is the -n option of qemu-img convert.
If you convert the disk with -n (which skips target volume creation; useful if the volume is created prior to running qemu-img), qemu-img writes the whole disk capacity to the destination RBD volume. Without it, qemu-img convert reads only the used capacity of the source volume and writes just that to the destination volume.
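As a sketch of the two variants (pool, volume, and config names taken from the question; newer qemu-img releases also accept --target-is-zero together with -n, which may restore the sparse behaviour if the pre-created destination is known to be empty, but check qemu-img --help for your version):
# Without -n: qemu-img creates the destination image itself, knows it starts out
# empty, and copies only the ~18G that is actually allocated in the source.
qemu-img convert -p -f raw \
  rbd:pool_1/volume-orig_id:id=cinder:conf=1_ceph.conf:keyring=1_ceph.client.cinder.keyring \
  -O raw \
  rbd:pool_2/volume-new_id:id=cinder:conf=2_ceph.conf:keyring=2_ceph.client.cinder.keyring

# With -n: the destination volume must already exist; qemu-img cannot assume it
# is zeroed, so it writes out the full 2.5T of virtual capacity.
qemu-img convert -p -n -m 16 -W -S 4k -f raw \
  rbd:pool_1/volume-orig_id:id=cinder:conf=1_ceph.conf:keyring=1_ceph.client.cinder.keyring \
  -O raw \
  rbd:pool_2/volume-new_id:id=cinder:conf=2_ceph.conf:keyring=2_ceph.client.cinder.keyring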

Related

pg_dump with -j option and -Z

I am about to back up a 120 GB database. I kept failing when using the pgAdmin backup (because of a VPN disconnection after 7 hours of running) or SQLMaestro (an out-of-memory issue after 3 hours of running).
So I want to run it on the server using pg_dump. The command I want to use is: time pg_dump -j 5 -Fc -Z 1 db_profile_20210714 -f /var/lib/postgresql/backup2/ (I want to measure the time as well, so I prefixed it with time). After that I will run pg_dumpall -g.
The server has 30 cores and the backup drive is mounted over NFS. This is Postgres 12 running on Ubuntu 12.
Questions:
If I use -Z 0, will it undo the default compression of -Fc? (-Fc is compressed by default.)
Is using -j 5 and -Z 1 counterproductive? I read in an article that to throttle the pg_dump process so that it won't cause I/O spikes, one can use -Z between 3 and 5. But what if someone wants to utilize the cores and compress at the same time, is that effective/efficient?
Thanks
Yes, if you use -Z 0, the custom-format dump will be uncompressed. -j and -Z are independent of each other, and you cannot use -j with the custom format. Whether compression speeds up the dump depends on your bottleneck: if that is the network, compression can help; otherwise, compression usually makes pg_dump slower.
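As a sketch of the parallel variant (database name and paths taken from the question; the target directory name is an assumption): parallel dumps need the directory format -Fd, since -j is not accepted with -Fc.
# Parallel, lightly compressed directory-format dump with 5 workers
time pg_dump -j 5 -Fd -Z 1 -f /var/lib/postgresql/backup2/db_profile_20210714.dir db_profile_20210714
# Roles and tablespaces are not included in pg_dump output, so still run:
pg_dumpall -g > /var/lib/postgresql/backup2/globals.sql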

Ceph RBD real space usage is much larger than disk usage once mounted

I'm trying to understand how to find out the current, real disk usage of a Ceph cluster, and I noticed that the output of rbd du is very different from the output of df -h once that RBD is mounted as a disk.
Example:
Inside the ToolBox I have the following:
$ rbd du replicapool/csi-vol-da731ad9-eebe-11eb-9fbd-f2c976e9e23a
warning: fast-diff map is not enabled for csi-vol-da731ad9-eebe-11eb-9fbd-f2c976e9e23a. operation may be slow.
2021-09-01T13:53:23.482+0000 7f8c56ffd700 -1 librbd::object_map::DiffRequest: 0x557402c909c0 handle_load_object_map: failed to load object map: rbd_object_map.8cdeb6e704c7e0
NAME PROVISIONED USED
csi-vol-da731ad9-eebe-11eb-9fbd-f2c976e9e23a 100 GiB 95 GiB
But, inside the Pod that is mounting this rbd, I have:
$ k exec -it -n monitoring prometheus-prometheus-operator-prometheus-1 -- sh
Defaulting container name to prometheus.
Use 'kubectl describe pod/prometheus-prometheus-operator-prometheus-1 -n monitoring' to see all of the containers in this pod.
/prometheus $ df -h
Filesystem Size Used Available Use% Mounted on
overlay 38.0G 19.8G 18.2G 52% /
...
/dev/rbd5 97.9G 23.7G 74.2G 24% /prometheus
...
Is there a reason for the two results to be so different? Can this be a problem when Ceph keeps track of the total space used by the cluster to know how much space is available?
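Incidentally, the warning in the rbd du output above is about missing fast-diff data; a sketch of enabling it, assuming the image name from the output and that exclusive-lock is already enabled (as it usually is for CSI-provisioned images):
# Enable object-map/fast-diff so rbd du is fast and accurate
rbd feature enable replicapool/csi-vol-da731ad9-eebe-11eb-9fbd-f2c976e9e23a object-map fast-diff
# Rebuild the object map for the existing data
rbd object-map rebuild replicapool/csi-vol-da731ad9-eebe-11eb-9fbd-f2c976e9e23a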

Slow query time with Postgres 10 inside Docker vs bare-metal for AWS Linux 2

I've been trying to deploy Postgres within Docker for portability reasons, and noticed that query performance as measured by EXPLAIN ANALYZE is painfully slow compared to bare metal.
For a table with 1.7 million rows, a query on bare-metal Postgres takes about 1.2 s vs 4.8 s on Dockerized Postgres, a 4x increase. This comparison uses the same mounted volume for both bare metal and Docker (for Docker, I'm using the -v option). The volume is a 60 GB gp2 volume, mounted through the AWS console.
A couple of things I tried:
Increasing the shared buffers option in postgresql.conf, which had a negligible effect
Trying several volume mapping options (delegated, cached, consistent)
Upgrading Docker from 17.06-ce to 17.12-ce
This is all done on an AWS Linux 2 instance. At this point I'm hoping to get more suggestions on what to do to improve performance.
The docker run command I use:
docker run -p 5432:5432 --name postgres -v /vol/pgsql/10.0/data:/var/lib/postgresql/data postgres:latest
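For reference, a sketch of how the shared-buffers change could be applied on the docker run line itself (the official postgres image passes extra arguments through to the server; the 2GB value is only an illustrative assumption):
docker run -p 5432:5432 --name postgres \
  -v /vol/pgsql/10.0/data:/var/lib/postgresql/data \
  postgres:latest \
  -c shared_buffers=2GB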

Kickstart: create LVM volume group without partition

I am trying to set up a Red Hat 7 kickstart for a server with 2 disks.
On the second disk, I want to use the whole disk in LVM without partitioning it.
Once the system is installed, the following configuration works:
pvcreate /dev/sdb
vgcreate data /dev/sdb
lvcreate -l +100%FREE -n data data
mkfs.xfs /dev/mapper/data-data
echo -e "/dev/mapper/data-data\t/data\txfs\tdefaults\t0 1" >> /etc/fstab
mount /data
But I cannot manage to get the equivalent partitioning to work as expected in kickstart; see the sketch below.
The kickstart partitioning directives, as far as I know them, will only create a partition on /dev/sdb, so the volume ends up on /dev/sdb1.
I managed to work around the issue with a %post script, but I have compiled packages to install into this directory, so the formatting would need to be done earlier, at least in a %pre script, if native partitioning is not possible.
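As a sketch, the standard kickstart directives I am referring to look roughly like this (names taken from the commands above); they always create a partition first, so the PV ends up on /dev/sdb1 rather than on the bare /dev/sdb:
# Kickstart LVM layout: partition-backed PV, not a whole-disk PV
part pv.data --ondisk=sdb --size=1 --grow
volgroup data pv.data
logvol /data --vgname=data --name=data --fstype=xfs --size=1 --grow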

How to mount multiple CephFS file systems on a client node?

I created three CephFS file systems and tried to mount them on a client node, but could not find a way to mount one specific CephFS. I tried:
mount -t ceph mon-node:/ /mnt/apachefs/ -o mds_namespace=webfs,secret=ceph-authtool -p /etc/ceph/ceph.client.admin.keyring
But it fails. Is there any other way to mount multiple file systems on a client node using the kernel driver (mount.ceph) or ceph-fuse?
It is possible to select a specific CephFS with the following options; see the sketch below.
-o mds_namespace ... kernel driver (mount -t ceph)
--client_mds_namespace ... FUSE client (ceph-fuse)
I am pretty sure that -o mds_namespace did not work due to an old kernel version. If you are using CentOS 7, please test it with ceph-fuse 12.2.4 or a later version and --client_mds_namespace; it worked fine in my environment.
If you are using a Debian-based system, you can install the ceph-fs-common package with apt: apt-get install -y ceph-fs-common.
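A sketch of the two variants (mon-node and the webfs file system name are taken from the question; the admin secret file path is an assumption):
# Kernel driver: select the file system with mds_namespace
mount -t ceph mon-node:6789:/ /mnt/webfs -o name=admin,secretfile=/etc/ceph/admin.secret,mds_namespace=webfs
# FUSE client: select the file system with --client_mds_namespace
ceph-fuse /mnt/webfs --id admin --client_mds_namespace=webfs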
Another option is to create each file system as a named volume and mount it via /etc/fstab with the fs= option:
ceph fs volume create nextcloud [<placement>]
ceph fs volume create okd-admin [<placement>]
#/etc/fstab
### one
10.10.20.6:6789:/folder1 /USERDATA ceph name=admin,secretfile=/etc/ceph/secret.key,fs=nextcloud,noatime,_netdev 0 2
### two
10.10.20.5:6789:/folder2 /mnt/cephfs ceph name=okd-admin,secretfile=/etc/ceph/secret-openshift.key,fs=openshift,noatime,_netdev 0 2