Remove rbd image,but lot of rbd_data files still exists - ceph

I create a cache tier pool and some rbd images,then I write data to images,remove these images,but lot of rbd_data files still exists,this is why?
enter image description here

Because that's what a cache tier is for, it caches images. You can run rados -p <poo> cache-try-flush-evict-all and rados -p <pool> cache-flush-evict-all to flush the cache tier.

Related

How listing objects in ceph works

I know that object locations in ceph are computed from the cluster map using the hash of the object. On the other hand, we have commands like this that list objects:
rados -p POOL_NAME ls
How does this command work? Are object names stored somewhere? If yes, is it all in the monitor database? What will happen in ceph when we run this command?
Monitors keeps pool -> PG map in their database and when you run rados -p POOL_NAME ls it will ask monitor to get PGs associated with this pool. Each PG has an up/acting set that keeps the running OSDs for that PG. After that it will ask PG on the primary OSD to return objects within it.
You can find more info within source code: https://github.com/ceph/ceph/blob/master/src/tools/rados/rados.cc#L2399

Mongorestore seems to run out of memory and kills the mongo process

In current setup there are two Mongo Docker containers, running on hosts A and B, with Mongo version of 3.4 and running in a replica set. I would like to upgrade them to 3.6 and increase a member so the containers would run on hosts A, B and C. Containers have 8GB memory limit and no swap allocated (currently), and are administrated in Rancher. So my plan was to boot up the three new containers, initialize a replica set for those, take a dump from the 3.4 container, and restore it the the new replica set master.
Taking the dump went fine, and its size was about 16GB. When I tried to restore it to the new 3.6 master, restoring starts fine, but after it has restored roughly 5GB of the data, mongo process seems to be killed by OS/Rancher, and while the container itself doesn't restart, MongoDB process just crashes and reloads itself back up again. If I run mongorestore to the same database again, it says unique key error for all the already inserted entries and then continue where it left off, only to do the same again after 5GB or so. So it seems that mongorestore loads all the entries it restores to memory.
So I've got to get some solution to this, and:
Every time it crashes, just run the mongorestore command so it continues where it left off. It probably should work, but I feel a bit uneasy doing it.
Restore the database one collection at a time, but the largest collection is bigger than 5GB so it wouldn't work properly either.
Add swap or physical memory (temporarily) to the container so the process doesn't get killed after the process runs out of physical memory.
Something else, hopefully a better solution?
Increasing the swap size as the other answer pointed out worked out for me. Also, The --numParallelCollections option controls the number of collections mongodump/mongorestore should dump/restore in parallel. The default is 4 which may consume a lot of memory.
Since it sounds like you're not running out of disk space due to mongorestore continuing where it left off successfully, focusing on memory issues is the correct response. You're definitely running out of memory during the mongorestore process.
I would highly recommend going with the swap space, as this is the simplest, most reliable, least hacky, and arguably the most officially supported way to handle this problem.
Alternatively, if you're for some reason completely opposed to using swap space, you could temporarily use a node with a larger amount of memory, perform the mongorestore on this node, allow it to replicate, then take the node down and replace it with a node that has fewer resources allocated to it. This option should work, but could become quite difficult with larger data sets and is pretty overkill for something like this.
I solved the OOM problem by using the --wiredTigerCacheSizeGB parameter of mongod. Excerpt from my docker-compose.yaml below:
version: '3.6'
services:
db:
container_name: db
image: mongo:3.2
volumes:
- ./vol/db/:/data/db
restart: always
# use 1.5GB for cache instead of the default (Total RAM - 1GB)/2:
command: mongod --wiredTigerCacheSizeGB 1.5
Just documenting here my experience in 2020 using mongodb 4.4:
I ran into this problem restoring a 5GB collection on a machine with 4GB mem. I added 4GB swap which seemed to work, I was no longer seeing the KILLED message.
However, a while later I noticed I was missing a lot of data! Turns out if mongorestore runs out of memory during the final step (at 100%) it will not show killed, BUT IT HASNT IMPORTED YOUR DATA.
You want to make sure you see this final line:
[########################] cranlike.files.chunks 5.00GB/5.00GB (100.0%)
[########################] cranlike.files.chunks 5.00GB/5.00GB (100.0%)
[########################] cranlike.files.chunks 5.00GB/5.00GB (100.0%)
[########################] cranlike.files.chunks 5.00GB/5.00GB (100.0%)
[########################] cranlike.files.chunks 5.00GB/5.00GB (100.0%)
restoring indexes for collection cranlike.files.chunks from metadata
finished restoring cranlike.files.chunks (23674 documents, 0 failures)
34632 document(s) restored successfully. 0 document(s) failed to restore.
In my case I needed 4GB mem + 8GB swap, to import 5GB GridFS collection.
Rather than starting up a new replica set, it's possible to do the entire expansion and upgrade without even going offline.
Start MongoDB 3.6 on host C
On the primary (currently A or B), add node C into the replica set
Node C will do an initial sync of the data; this may take some time
Once that is finished, take down node B; your replica set has two working nodes still (A and C) so will continue uninterrupted
Replace v3.4 on node B with v3.6 and start back up again
When node B is ready, take down node A
Replace v3.4 on node A with v3.6 and start back up again
You'll be left with the same replica set running as before, but now with three nodes all running v.3.4.
PS Be sure to check out the documentation on Upgrade a Replica Set to 3.6 before you start.
I ran into a similar issue running 3 nodes on a single machine (8GB RAM total) as part of testing a replicaset. The default storage cache size is .5 * (Total RAM - 1GB). The mongorestore caused each node to use the full cache size on restore and consume all available RAM.
I am using ansible to template this part of mongod.conf, but you can set your cacheSizeGB to any reasonable amount so multiple instances do not consume the RAM.
storage:
wiredTiger:
engineConfig:
cacheSizeGB: {{ ansible_memtotal_mb / 1024 * 0.2 }}
My scenario is similar to #qwertz but to be able to upload all collection to my database the following script was created to handle partials uploads; uploading every single collection one-by-one instead of trying to send all database at once was the only way to properly populate it.
populate.sh:
#!/bin/bash
backupFiles=`ls ./backup/${DB_NAME}/*.bson.gz`
for file in $backupFiles
do
file="${file:1}"
collection=(${file//./ })
collection=(${collection//// })
collection=${collection[2]}
mongorestore \
$file \
--gzip \
--db=$DB_NAME \
--collection=$collection \
--drop \
--uri="${DB_URI}://${DB_USERNAME}:${DB_PASSWORD}#${DB_HOST}:${DB_PORT}/"
done
Dockerfile.mongorestore:
FROM alpine:3.12.4
RUN [ "apk", "add", "--no-cache", "bash", "mongodb-tools" ]
COPY populate.sh .
ENTRYPOINT [ "./populate.sh" ]
docker-compose.yml:
...
mongorestore:
build:
context: .
dockerfile: Dockerfile.mongorestore
restart: on-failure
environment:
- DB_URI=${DB_URI}
- DB_NAME=${DB_NAME}
- DB_USERNAME=${DB_USERNAME}
- DB_PASSWORD=${DB_PASSWORD}
- DB_HOST=${DB_HOST}
- DB_PORT=${DB_PORT}
volumes:
- ${PWD}/backup:/backup
...

mongorestore hangs while restoring fs.chunks

I am trying to upgrade from the mongodb sandbox option onto a shared cluster, and to keep my current data I have to do a mongodump and mongorestore to migrate the old data onto the new database.
This is what I put in the command line.
mongorestore -h url:host -d heroku_zc -u heroku_zc -p 470grupv030prq5uj0fm mongo-dump-dir/heroku_9r
It seems to all go fine and restores all the data entries, but while uploading the file chunks it hangs part way through. Sometimes 5% of the ay through sometimes 20% of the way through sometimes 50%.
As I say, when I look at the new database, all the rows are there correctly, and only the actual data files are missing.
This is what happens in the terminal, it doesn't give an error it just stops.
2017-02-09T15:45:20.509+0100 [#.......................] heroku_z25kbwmc.fs.chunks 15.8 MB/299.6 MB (5.3%)
2017-02-09T15:45:23.509+0100 [#.......................] heroku_z25kbwmc.fs.chunks 15.8 MB/299.6 MB (5.3%)
2017-02-09T15:45:26.510+0100 [#.......................] heroku_z25kbwmc.fs.chunks 15.8 MB/299.6 MB (5.3%)
Both db's are created from heroku as addons to my parse server.
EDIT: I also don't know if this is a problem, the local system database says 2.03GB. I don't understand how this can be, as the total database size is only 500mb

performance issue until mongodump

we operate for our customer a server with a single mongo instance, gradle, postgres and nginx running on it. The problem is we had massiv performance problmes until mongodump is running. The mongo queue is growing and no data be queried. The next problem is the costumer want not invest in a replica-set or a software update (mongod 3.x).
Has somebody any idea how i clould improve the performance.
command to create the dump:
mongodump -u ${MONGO_USER} -p ${MONGO_PASSWORD} -o ${MONGO_DUMP_DIR} -d ${MONGO_DATABASE} --authenticationDatabase ${MONGO_DATABASE} > /backup/logs/mongobackup.log
tar cjf ${ZIPPED_FILENAME} ${MONGO_DUMP_DIR}
System:
6 Cores
36 GB RAM
1TB SATA HDD
+ 2TB (backup NAS)
MongoDB 2.6.7
Thanks
Best regards,
Markus
As you have heavy load, adding a replica set is a good solution, as backup could be taken on secondary node, but be aware that replica need at least three servers (you can have an master/slave/arbiter - where the last need a little amount of resources)
MongoDump makes general query lock which will have an impact if there is a lot of writes in dumped database.
Hint: try to make backup when there is light load on system.
Try with volume snapshots. Check with your cloud provider what are the options available to take snapshots. It is super fast and cheaper if you compare actual pricing used in taking a backup(RAM and CPU used and if HDD then transactions const(even if it is little)).

What are important mongo data files for backup

If I want to backup database by copying raw files. What files do I need to copy ? only db-name.ns, db-name.0, db-name.1.... or whole folder (local.ns.., journal). I'm running replica set. I understand procedure for locking hidden secondary node and then copying files to new location. But I'm wondering do I need to copy whole folder or just some files.
Thx
Simple answer: All of them. As obvious as it might sound. And here is why:
If you don't copy a namespaces file, your database will most likely not work.
When not copying all datafiles, some of your data is missing and your indices will point to void locations. The database in question might work (minus the data stored in the missing data file), but I would not bet on that – and since the data was important enough to create a backup in the first place, you don't want this to happen, do you?
Config, admin and local databases are vitally necessary for their respective features – and since you used the feature, you probably want to use it after a restore, too.
How do I backup all files?
The best solution save for MMS backup I have found so far is to create LVM snapshots of the filesystem the MongoDB data resides on. In order for tis to work, the journal needs to be included. Usually, you don't need a dedicated backup node for this approach. It is a bit complicated to set up, though.
Preparing LVM backups
Let's assume you have your data in the default data directory /data/db and you have not changed any paths. Then you would mount a logical volume to /data/db and use this to hold the data. Assuming that you don't have anything like this, here is a step by step guide:
Create a logical volume big enough to hold your data. I will call that one /dev/VolGroup/LogVol1 from now on. Make sure that you only use about 80% of the available disk space in the volume group for creating the logical volume.
Create a filesystem on the logical volume. I prefer XFS, so we create an xfs filesystem on /dev/VolGroup/LogVol1:
mkfs.xfs /dev/VolGroup/LogVol1
Mount the newly created filesystem on /mnt
mount /dev/VolGroup/LogVol1 /mnt
Shut down mongod:
killall mongod
(Note that the upstart scripts sometimes have problems shutting down mongod, and this command gracefully stops mongod anyway).
Copy the datafiles from /data/dbto /mntby issuing
cp -a /data/db/* /mnt
Adjust your /etc/fstab so that the logical volume gets mounted on reboot:
# The noatime parameter increases io speed of mongod significantly
/dev/VolGroup/LogVol1 /data/db xfs defaults,noatime 0 1
Umount the logical volume from it's current outpoint and remount it on the correct one:
cd && umount /mnt/ && mount /data/db
Restart mongod
Creating a backup
Creating a backup now becomes as easy as
Create a snapshot:
lvcreate -l100%FREE -s -n mongo_backup /dev/VolGroup/LogVol1
Mount the snapshot:
mount /dev/VolGroup/mongo_backup /mnt
Copy it somewhere. The reason we need to do this is that the snapshot can only be held up until the changes to the data files do not exceed the space in the volume group you did not allocate during preparation. For example, if you have a 100GB disk and you allocated 80GB for /dev/VolGroup/LogVol1, the snapshot size would be 20GB. While the changes on the filesystem from the point you took the snapshot are less than 20GB, everything runs fine. After that, the filesystem will refuse to take any changes. So you aren't in a hurry, but you should definitely move the data to an offsite location, an FTP server or whatever you deem appropriate. Note that compressing the datafiles can take quite long and you might run out of "change space" before finishing that. Personally, I like to have a slower HDD as a temporary place to store the backup, doing all other operations on the HDD. So my copy command looks like
cp -a /mnt/* /home/mongobackup/backups
when the HDD is mounted on /home/mongobackup.
Destroy the snapshot:
umount /mnt && lvremove /dev/VolGroup/mongo_backup
The space allocated for the snapshot is released and the restrictions to the amount of changes to the filesystem are removed.
Whole db-Data folder + where ever you have your logs and journalling
The best solution to backup data on MongoDB would be to use Mongo monitoring Service(MMS). All other solutions including copying files manually, mongodump, mongoexport are way behind MMS.