How does listing objects in Ceph work?

I know that object locations in Ceph are computed from the cluster map using the hash of the object name. On the other hand, there are commands like this one that list objects:
rados -p POOL_NAME ls
How does this command work? Are object names stored somewhere? If so, are they all in the monitor database? What happens in Ceph when we run this command?

Monitors keep the pool -> PG map in their database. When you run rados -p POOL_NAME ls, the client asks a monitor for the PGs associated with that pool. Each PG has an up/acting set that lists the running OSDs for that PG. The client then asks each PG, on its primary OSD, to return the objects it holds. Object names are therefore stored on the OSDs, not in the monitor database.
You can find more info in the source code: https://github.com/ceph/ceph/blob/master/src/tools/rados/rados.cc#L2399
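The flow described above can be sketched with a toy model. This is only an illustration under loud assumptions: `ToyPool`, `pg_for` and `PG_NUM` are hypothetical names, `zlib.crc32` stands in for Ceph's own object hash, and a plain dictionary stands in for the primary OSDs. It is not how librados is implemented.

```python
import zlib
from collections import defaultdict

PG_NUM = 8  # hypothetical number of placement groups in the pool


def pg_for(name, pg_num=PG_NUM):
    """Map an object name to a PG id (crc32 is a stand-in for Ceph's hash)."""
    return zlib.crc32(name.encode()) % pg_num


class ToyPool:
    """Toy pool: each PG keeps the names of the objects placed in it."""

    def __init__(self, pg_num=PG_NUM):
        self.pg_num = pg_num
        # pg id -> object names held by that PG's (imaginary) primary OSD
        self.pgs = defaultdict(set)

    def put(self, name):
        # Placement is computed from the name; nothing is registered centrally.
        self.pgs[pg_for(name, self.pg_num)].add(name)

    def ls(self):
        # Listing walks every PG of the pool and asks it for its objects.
        for pg in range(self.pg_num):
            yield from sorted(self.pgs[pg])


pool = ToyPool()
for obj in ("alpha", "beta", "gamma"):
    pool.put(obj)
print(list(pool.ls()))
```

The point of the sketch is that a lookup needs only the name and the map, while a listing has to visit every PG, which is why listing a large pool is comparatively expensive.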

Related

Are there any quick ways to move PostgreSQL database between clusters on the same server?

We have two big databases (200GB and 330GB) in our "9.6 main" PostgreSQL cluster.
If we create another cluster (instance) on the same server, is there any way to quickly move the database files to the new cluster's folder?
We would like to avoid pg_dump and pg_restore and keep downtime to a minimum.
We want to be able to replicate the 200GB database to another server without pumping all 530GB of data.
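Databases aren't portable, so the only ways to move one to another cluster are pg_dump (which I'm aware you want to avoid) or logical replication. For the latter, you would need to set wal_level to 'logical' in postgresql.conf and create a publication that includes all tables. Note that built-in logical replication (CREATE PUBLICATION / CREATE SUBSCRIPTION) is only available from PostgreSQL 10 onward, so a 9.6 cluster would have to be upgraded first or rely on an extension such as pglogical.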
CREATE PUBLICATION my_pub FOR ALL TABLES;
Then, on your new cluster, you'd create a subscription:
CREATE SUBSCRIPTION my_sub
CONNECTION 'host=172.100.100.1 port=5432 dbname=postgres'
PUBLICATION my_pub;
More information on this is available in the PostgreSQL documentation: https://www.postgresql.org/docs/current/logical-replication.html
TL;DR: no.
PostgreSQL itself does not allow moving the data files of a single database from a source PG cluster to a target PG cluster, whether the target runs on the same machine or on another one. In this respect it is less flexible than, for example, Oracle transportable tablespaces or SQL Server attach/detach database commands.
The usual way to clone a PG cluster is to use streaming physical replication to build a physical standby of all databases. This requires backing up and restoring all databases with pg_basebackup (a physical backup), which can be slow depending on the database sizes. Once the standby is synchronized, however, failing over to it by promoting it should be very fast, so minimal downtime is possible. After promotion you can drop the databases you do not need.
It may also be possible to use storage snapshots to quickly copy all data files from the source cluster to another cluster (and then drop the unneeded databases in the target cluster). I have not tried this myself, and it does not seem to be widely used (except maybe in some managed cloud services).
(Here, "PG cluster" means a PG instance.)
If you would like to avoid pg_dump/pg_restore, then use one of:
1. logical replication (lets you replicate only the desired databases)
2. streaming replication via a replication slot (copy the whole cluster to another one and then drop the undesired databases)
Option 1 is described above, so I will briefly describe option 2:
a) create role with replication privileges on master (cluster I want to copy from)
master# psql> CREATE USER replikator WITH REPLICATION ENCRYPTED PASSWORD 'replikator123';
b) Log in to the slave cluster and switch to the postgres user. Stop the PostgreSQL instance and delete the DB data files. Then initiate replication from the slave (watch versions and directories!):
pg_basebackup -h MASTER_IP -U replikator -D /var/lib/pgsql/11/data -r 50M -R --waldir /var/lib/pgwal/11/pg_wal -X stream -c fast -C -S master1_to_slave1 -v -P
What does this command do? It connects to the master with the replikator credentials and runs pg_basebackup through a replication slot that it creates (-C -S). It also throttles bandwidth (-r 50M), among other options. Once the base backup has finished and the slave is started, it begins streaming replication and you have fail-safe replication.
c) Then, whenever you want, promote the slave to a standalone instance and delete the undesired databases:
rm -f /var/lib/pgsql/11/data/recovery.conf
systemctl restart postgresql11.service
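As a rough aid for step b), the pg_basebackup invocation can be assembled programmatically so that each flag is documented. This is a minimal sketch: `basebackup_command` is a hypothetical helper, and the host, user, paths and slot name are simply the placeholders from the answer above. It only builds the argument list and does not run anything.

```python
import shlex


def basebackup_command(master_host, user, data_dir, wal_dir, slot, max_rate="50M"):
    """Build the pg_basebackup argument list used in step b)."""
    return [
        "pg_basebackup",
        "-h", master_host,    # master to copy from
        "-U", user,           # role with the REPLICATION privilege
        "-D", data_dir,       # target data directory (must be empty)
        "-r", max_rate,       # throttle the transfer rate
        "-R",                 # write the standby recovery configuration
        "--waldir", wal_dir,  # keep WAL in a separate directory
        "-X", "stream",       # stream WAL while the backup runs
        "-c", "fast",         # request an immediate checkpoint
        "-C", "-S", slot,     # create and use this replication slot
        "-v", "-P",           # verbose output with progress
    ]


cmd = basebackup_command(
    "MASTER_IP",
    "replikator",
    "/var/lib/pgsql/11/data",
    "/var/lib/pgwal/11/pg_wal",
    "master1_to_slave1",
)
print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd, check=True) as the postgres user would execute it. That step is left out here on purpose, since it overwrites the slave's data directory.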

Restore postgres data on GCloud VM instance restart

We run Postgres in a Docker container, and each time we restart the machine, the data and tables are lost.
Even the data files under the base and global folders are gone.
On top of that, we have now run into RAM problems, so I cannot connect to Postgres or to the VM instance itself in any way: not by ssh, not with the psql client, not through Redash. The last one reports the following:
Error running query: could not send SSL negotiation packet: Resource temporarily unavailable.
Is there any way to somehow solve the problem?
UPD. I've tried to create a disk snapshot, but it doesn't help.
UPD2. BTW, the pgdata folder is set to be a container volume and could be used outside of it.

Persisting a single, static, large Postgres database beyond removal of the db cluster?

I have an application which, for local development, has multiple Docker containers (organized under Docker Compose). One of those containers is a Postgres 10 instance, based on the official postgres:10 image. That instance has its data directory mounted as a Docker volume, which persists data across container runs. All fine so far.
As part of testing the creation and initialization of the postgres cluster, it is frequently the case that I need to remove the Docker volume that holds the data. (The official postgres image runs cluster init if-and-only-if the data directory is found to be empty at container start.) This is also fine.
However! I now have a situation where in order to test and use a third party Postgres extension, I need to load around 6GB of (entirely static) geocoding lookup data into a database on the cluster, from Postgres backup dump files. It's certainly possible to load the data from a local mount point at container start, and the resulting (very large) tables would persist across container restarts in the volume that holds the entire cluster.
Unfortunately, they won't survive the removal of the docker volume which, again, needs to happen with some frequency. I am looking for a way to speed up or avoid the rebuilding of the single database which holds the geocoding data.
Approaches I have been or currently am considering:
Using a separate Docker volume on the same container to create persistent storage for a separate Postgres tablespace that holds only the geocoder database. This appears to be unworkable because while I can definitely set it up, the official PG docs say that tablespaces and clusters are inextricably linked such that the loss of the rest of the cluster would render the additional tablespace unusable. I would love to be wrong about this, since it seems like the simplest solution.
Creating an entirely separate container running Postgres, which mounts a volume to hold a separate cluster containing only the geocoding data. Presumably I would then need to do something kludgy with foreign data wrappers (or some more arcane postgres admin trickery that I don't know of at this point) to make the data seamlessly accessible from the application code.
So, my question: Does anyone know of a way to persist a single database from a dockerized Postgres cluster, without resorting to a dump and reload strategy?
If you want to speed things up, you could convert your database dump into a data directory: import the dump into a clean postgres container, stop it, create a tarball of the data directory, and upload it somewhere. Then, whenever you need to create a new postgres container, use an init script to stop the database, download and unpack the tarball into the data directory, and start the database again. This way you skip the whole restore process.
Note: The data tarball has to match the postgres major version, otherwise the container will not be able to start from it.
If you want to speed things up even more, create a custom postgres image with the tarball and init script bundled, so that every time it starts it wipes the empty cluster and copies in your own.
You could even change the entrypoint to a custom script that loads the database data first and then calls docker-entrypoint.sh, so there is no empty cluster to delete in the first place.
This only works if you are OK with replacing the whole cluster every time you run your tests; otherwise you are stuck with importing the database dump.
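The pack-and-unpack part of this approach can be sketched as follows. It is a minimal illustration, not a drop-in init script: `pack_data_dir`, `restore_data_dir` and `major_version` are hypothetical helper names, and downloading the tarball and stopping or starting Postgres are assumed to happen elsewhere. The version check relies on the PG_VERSION file that Postgres keeps at the top of its data directory.

```python
import tarfile
from pathlib import Path


def pack_data_dir(data_dir, tarball):
    """Archive the data directory of a stopped cluster into a gzip tarball."""
    with tarfile.open(tarball, "w:gz") as tar:
        # arcname="." stores paths relative to the data directory itself.
        tar.add(data_dir, arcname=".")


def restore_data_dir(tarball, data_dir):
    """Unpack the tarball into an empty data directory; refuse to overwrite."""
    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)
    if any(target.iterdir()):
        raise RuntimeError(f"{target} is not empty")
    with tarfile.open(tarball, "r:gz") as tar:
        tar.extractall(target)


def major_version(data_dir):
    """Return the major version recorded in the data directory, e.g. '10'."""
    return (Path(data_dir) / "PG_VERSION").read_text().strip()
```

Comparing major_version() of the restored directory with the image's major version before starting the server is a cheap way to catch the mismatch mentioned in the note above.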

VERITAS: VxVM vxvol ERROR V-5-1-1654 keyword clean not recognized for init operation

I have installed Veritas Volume Manager on Solaris 10 and am trying to create volumes from the VM disks. I created the VM disks and a disk group containing 2 of them. After creating the disk group, I created 3 subdisks (of the same length) on each VM disk. I then created 3 plexes from the 6 subdisks and a logical volume from the 3 plexes with the command below.
# vxmake -g testgrp -Uraid5 vol testvol1 plex=testplex,testplex-2,testplex-3
But when I run the command below, the plexes and volume show up as disabled.
# vxprint -hg testgrp
Since the volume is empty, I tried to run the command below before starting it.
# vxvol -g testgrp init clean testvol1
At this point I get the error "VxVM vxvol ERROR V-5-1-1654 keyword clean not recognized for init operation".
Could anyone please help me in solving this issue? Thanks in advance.
If raid5 is the type of volume you want to create, I think you can use the following command:
vxassist -g testgrp make testvol1 layout=raid5 disk1,disk2,disk3,...
It takes care of creating the appropriate plexes and subdisks on the specified disks in the given disk group.

MongoDB does not see database or collections after migrating from localhost to EBS volume

full disclosure: I am a complete n00b to mongodb and am just getting my feet wet with using mongo on AWS (but have 2 decades working in IT so not a total n00b :P)
I set up an EBS volume and installed mongo on an EC2 instance.
My problem is that I provisioned too small an EBS volume initially.
When I realized this I:
created a new larger EBS volume
mounted it on the server
stopped mongo ( $ sudo service mongod stop)
copied all my /data/db files into the new volume
updated conf files and fstab (dbpath, logpath, pidfilepath and mount point for new volume respectively)
restarted mongod
When I execute: $ sudo service mongod start
- everything runs fine.
- I can futz about in the admin and local databases.
However, when I run the mongos command: > show databases
- I only see the admin and local.
- the database I copied into the new volume (named encompass) is not listed.
I still have a working local copy of the database so my data is not lost, just not sure how best to move mongo data around other than:
A) start all over importing the data to the db on the AWS server (not what I would like since it is already loaded in my local db)
B) copy the local db to the new EBS volume again (also not preferred but better than importing all the data from scratch again!).
NOTE: originally I secure copied the data into the EBS volume with this command:
$ scp -r -i / / ec2-user#:/
then when I copied between volumes I used a vanilla cp command.
Did I miss something here?
The best I could find on SO and the web was this process (How to scale MongoDB?), but perhaps I missed a switch in a command or a nuance to the process that rendered my database files inert/useless?
Any idea how I can get mongo to see my other database files and collections?
Or did I make an irreversible error somewhere along the way?
Thanks for any help!!
Are you sure your conf file is being loaded? As a test, you can start mongod directly and specify the path to your db on the command line, i.e.:
mongod --dbpath c:\mongo\data\db (unix syntax may vary a bit, this is windows)
Run this from the command line and see what, if anything, mongo complains about.
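Another quick check is to read the conf file and confirm which dbpath it really contains. The sketch below assumes the legacy `key = value` config format, which the option names in the question (dbpath, logpath, pidfilepath) suggest; a YAML-style mongod.conf would need a different parser. `read_mongod_conf` is a hypothetical helper and the paths in `sample` are made up for illustration.

```python
def read_mongod_conf(text):
    """Parse legacy `key = value` mongod.conf text into a dict."""
    settings = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical config contents, for illustration only.
sample = """
# mongod.conf
dbpath = /mnt/newvolume/data/db
logpath = /mnt/newvolume/log/mongod.log
"""

conf = read_mongod_conf(sample)
print(conf["dbpath"])
```

If the printed dbpath is not the directory the files were copied into, mongod is simply serving a different, empty data directory.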
Database files are finicky and easy to damage when copied by hand. Before copying data from one instance to another, you should probably seed the new database: a few dummy entries will tell you whether it is working.