How can I achieve remote ZFS snapshot and restoration with logical domains? - solaris

Hardware: Solaris SPARC with ZFS filesystems
I have set up the primary domain (dom0) on machine 1 with a logical domain called test.
I also have machine 2, where test-backup is supposed to reside and act as a failover. test-backup is a logical domain that replicates the original; it stays inactive and is brought up only when machine 1 fails.
May I know how to achieve the above?
I have already tested the following from dom0:
zfs snapshot rpool/logicaldomain/test/datadisk@sync0
zfs send rpool/logicaldomain/test/datadisk@sync0 | zfs recv rpool/logicaldomain/test-backup/datadisk
but whenever I bring up the machine and run zpool status, the disk's state is faulted and the cksum is corrupted.
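For the "remote" part of the question, one common pattern is to pipe zfs send through ssh and to follow the initial full stream with incremental ones. The sketch below only prints the commands it would run. The host name machine2, the snapshot name sync1, and the helper function names are assumptions for illustration, and the sketch does not by itself explain the checksum errors described above.

```shell
#!/bin/sh
# Dry-run sketch: prints the replication commands instead of running them.
# SRC and DST come from the question; REMOTE is a hypothetical host name.
SRC="rpool/logicaldomain/test/datadisk"
DST="rpool/logicaldomain/test-backup/datadisk"
REMOTE="machine2"

# Full send of the first snapshot to the remote pool.
full_send_cmd() {
    # $1 = snapshot name
    printf 'zfs snapshot %s@%s\n' "$SRC" "$1"
    printf 'zfs send %s@%s | ssh %s zfs recv -F %s\n' "$SRC" "$1" "$REMOTE" "$DST"
}

# Incremental send: only the blocks changed between two snapshots.
incr_send_cmd() {
    # $1 = previous snapshot, $2 = new snapshot
    printf 'zfs snapshot %s@%s\n' "$SRC" "$2"
    printf 'zfs send -i %s@%s %s@%s | ssh %s zfs recv -F %s\n' \
        "$SRC" "$1" "$SRC" "$2" "$REMOTE" "$DST"
}

full_send_cmd sync0
incr_send_cmd sync0 sync1
```

The receiving dataset should not be in use by a running domain while a stream is being applied to it.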

How can I fix ceph commands hanging after a reboot?

I'm pretty new to Ceph, so I've included all my steps I used to set up my cluster since I'm not sure what is or is not useful information to fix my problem.
I have 4 CentOS 8 VMs in VirtualBox set up to teach myself how to bring up Ceph. 1 is a client and 3 are Ceph monitors. Each ceph node has 6 8Gb drives. Once I learned how the networking worked, it was pretty easy.
I set each VM to have a NAT (for downloading packages) and an internal network that I called "ceph-public". This network would be accessed by each VM on the 10.19.10.0/24 subnet. I then copied the ssh keys from each VM to every other VM.
I followed this documentation to install cephadm, bootstrap my first monitor, and added the other two nodes as hosts. Then I added all available devices as OSDs, created my pools, then created my images, then copied my /etc/ceph folder from the bootstrapped node to my client node. On the client, I ran rbd map mypool/myimage to mount the image as a block device, then used mkfs to create a filesystem on it, and I was able to write data and see the IO from the bootstrapped node. All was well.
Then, as a test, I shut down and restarted the bootstrapped node. When it came back up, I ran ceph status, but it just hung with no output. Every single ceph and rbd command now hangs, and I have no idea how to recover or properly reset or fix my cluster.
Has anyone ever had the ceph command hang on their cluster, and what did you do to solve it?
Let me share a similar experience. Some time ago I also ran some tests on Ceph (Mimic, I think), and the VMs on my VirtualBox behaved very strangely, nothing like actual bare-metal servers. So please bear this in mind: the tests are not really representative.
As for your problem, check the following:
Have at least 3 monitors (and an odd number of them). It's possible that the hang is caused by the monitor election.
Make sure the networking is OK (separate VLANs for Ceph servers and clients).
Make sure DNS resolves correctly (or that you have added the server names to the hosts file).
...just my 2 cents...
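To illustrate why the monitor count matters: monitors need a strict majority to form a quorum, so the number of failures a cluster can ride out follows from simple arithmetic. A minimal sketch (the function names are made up for this example):

```shell
#!/bin/sh
# Majority needed for N monitors to form a quorum: floor(N/2) + 1.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Monitors that can be down while a majority is still reachable.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5; do
    echo "monitors=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Going from 3 to 4 monitors raises the quorum size without tolerating any additional failure, which is why odd counts are usually recommended.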

Install Postgres on removable volume on linux?

Cloud platforms like Linode.com often provide hot-pluggable storage volumes that you can easily attach and detach from a Linux virtual machine without restarting it.
I am looking for a way to install Postgres so that its data and configuration ends up on a volume that I have mounted to the virtual machine. The end result should allow me to shut down the machine, detach the volume, spin up another machine with an identical version of Postgres already installed, attach the volume and have Postgres work just like it did on the old machine with all the data, file system permissions and server-wide configuration intact.
Is such a thing possible? Is there a reliable way to move installations (i.e. databases and configuration, not the actual binaries) of Postgres across machines?
CLARIFICATION: the virtual machine has two disks:
the "built-in" one which is created when the VM is created and mounted to /. That's where Postgres gets installed to and you can't move this disk.
the hot-pluggable disk which you can easily attach and detach from a running VM. This is where I want Postgres data and configuration to be so I can just detach the disk (after shutting down the VM to prevent data loss/corruption) and attach it to another VM when I want my data to move so it behaves like it did on the old VM (i.e. no failures to start Postgres, no errors about permissions or missing files, etc).
This works just fine. It is not really any different to starting and stopping PostgreSQL and not removing the disk. There are a couple of things to consider though.
You have to make sure it is stopped + writing synced before unmounting the volume. Obvious enough, and I can't believe you'd be able to unmount before sync completed, but worth repeating.
You will want the same version of PostgreSQL, probably on the same version of operating system with the same locales too. Different distributions might compile it with different options.
Although you can put configuration and data in the same directory hierarchy, most distros tend to put config in /etc. If you compile from source yourself this won't be a problem. Alternatively, you can usually override the default locations or, and this is probably simpler, bind-mount the data and config directories into the places your distro expects.
Note that if your storage allows you to connect the same volume to multiple hosts in some sort of "read only" mode that won't work.
Edit: steps from comment moved into body for easier reading.
1. Start up PG, create a table, and put one row in it.
2. Stop PG.
3. Mount your volume at /mnt/db
4. rsync /var/lib/postgresql/NN/main to /mnt/db/pg_data and /etc/postgresql/NN/main to /mnt/db/pg_etc
5. Rename /var/lib/postgresql/NN/main by adding .OLD to the name, and do the same with the /etc directory
6. Bind-mount the dirs from /mnt to replace them
7. Restart PG
8. Test
9. Repeat
10. Return to step 8 until you are happy
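The copy, rename and bind-mount steps above can be sketched as a dry-run script that prints each command instead of executing it. PGVER stands in for the "NN" placeholder and has to be set by the reader; the systemctl service name and the run and relocate helpers are assumptions made for this example, not something the steps above prescribe.

```shell
#!/bin/sh
# Dry-run sketch of the relocation steps: each command is printed, not executed.
# PGVER stands in for the "NN" placeholder and must be set by the reader.
PGVER="${PGVER:-NN}"
VOL="/mnt/db"
DATA="/var/lib/postgresql/$PGVER/main"
CONF="/etc/postgresql/$PGVER/main"

# Print the command; replace the body with "$@" to really execute it.
run() { printf '%s\n' "$*"; }

relocate() {
    run systemctl stop postgresql            # assumed service name
    run rsync -a "$DATA/" "$VOL/pg_data/"    # -a keeps owners and permissions
    run rsync -a "$CONF/" "$VOL/pg_etc/"
    run mv "$DATA" "$DATA.OLD"
    run mv "$CONF" "$CONF.OLD"
    run mkdir "$DATA" "$CONF"                # empty mount points
    run mount --bind "$VOL/pg_data" "$DATA"
    run mount --bind "$VOL/pg_etc" "$CONF"
    run systemctl start postgresql
}

relocate
```

On the second machine only the mkdir, mount and start commands would be needed, since the data and configuration already live on the volume.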

Processor, RAM, and disk usage

How can I find out how much RAM, processor and disk MongoDB consumes when I'm running find queries, insert queries, update queries, bulk queries, etc.?
I thought about MongoPerf, but it only shows me disk usage. It is awesome, though, because it can create threads, choose an amount of GB, and read or write. But I also need to know how much RAM and processor it takes.
It could be like doing htop for MongoDB
You could use the ps(1) command (I guess you are on Linux).
Programmatically, you could (on Linux) use the /proc/ file system (which is used by ps, top, htop). For details, read proc(5).
To get the pid of your MongoDB process, you could use pidof(1) or pgrep(1). If the pid of the mongod server is 1234, you should be interested in /proc/1234/status.
Notice that (on Linux) a process does not directly consume RAM. The (mongod server) process has a virtual address space, and the kernel manages the RAM (and distributes it amongst processes). You could be interested in the resident set size (which you can query with ps or via /proc/).
The virtual address space of the process with pid 1234 can be queried via /proc/1234/status and /proc/1234/maps (see also pmap(1)).
If you are not familiar with /proc/, play with it first on the command line, for your shell, by running cat /proc/$$/status and cat /proc/$$/maps and exploring /proc/$$/.
On my machine, sudo cat /proc/$(pidof mongod)/status gives some interesting output.
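As a small sketch of the programmatic route, the helper below pulls a single field such as VmRSS (resident set size) or VmSize (virtual address space size) out of /proc/<pid>/status. The function name and the PROC_ROOT override are assumptions added for the example; VmRSS and VmSize are field names documented in proc(5).

```shell
#!/bin/sh
# Print one field (for example VmRSS or VmSize) from <root>/<pid>/status, in kB.
# PROC_ROOT is a hypothetical override that defaults to /proc.
proc_status_kb() {
    # $1 = pid, $2 = field name
    awk -v key="$2:" '$1 == key { print $2 }' "${PROC_ROOT:-/proc}/$1/status"
}

# Demonstrated on the current shell; for MongoDB, pass "$(pidof mongod)" instead
# (assuming a single mongod process is running).
pid=$$
echo "pid $pid: VmRSS=$(proc_status_kb "$pid" VmRSS) kB, VmSize=$(proc_status_kb "$pid" VmSize) kB"
```

Running it in a loop with sleep while the queries execute gives a crude htop-like trace for one process.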

HowTo zdb -e poolname to recover data from a single ZFS device

I have the following situation:
1 x 10 TB drive, full of data on ZFS
I wanted to add a 100 GB NVMe partition as a cache
Instead of using zpool add poolname cache nvmepartition, I wrote zpool add poolname nvmepartition
I did not see my mistake and exported this pool.
The NVMe drive is no longer available, and the system has no information about this pool in the ZFS cache (due to the export).
current status:
zpool import shows the pool, but I cannot import it using any method found on the internet.
zdb -e poolname shows me what I already know: the pool, its name, that it (sadly) has 2 children, one of which is no longer available, and that the system has no information about the missing child (so all the tricks I found on the internet, such as linking a ghost device, won't work either).
As far as I know, the only way is to use zdb to generate all files through the "journal" and pipe/save them to another path.
But how? I have not found any documentation on that anywhere.
Note: the 10 TB drive was 90% full when I added the NVMe partition as a sibling. Since ZFS is not a real RAID 0, since these siblings were so unequal in size, and since I did not write much data after my mistake, I am quite sure that most of my data is still there.

Linux Page Cache Replacement

I have two PostgreSQL databases named data-1 and data-2 that sit on the same machine. Both databases keep 40 GB of data, and the total memory available on the machine is 68GB.
I started data-1 and data-2, and ran several queries to go over all their data. Then, I shut down data-1 and kept issuing queries against data-2. For some reason, the OS still holds on to large parts of data-1's pages in its page cache, and reserves about 35 GB of RAM to data-2's files. As a result, my queries on data-2 keep hitting disk.
I'm checking page cache usage with fincore. When I run a table scan query against data-2, I see that data-2's pages get evicted and put back into the cache in a round-robin manner. Nothing happens to data-1's pages, although they haven't been touched for days.
Does anybody know why data-1's pages aren't evicted from the page cache? I'm open to any suggestions you think might relate to the problem.
This is an EC2 m2.4xlarge instance on Amazon with 68 GB of RAM and no swap space. The kernel version is:
$ uname -r
3.2.28-45.62.amzn1.x86_64
Edit-1:
It seems that there is no NUMA configuration:
$ dmesg | grep -i numa
[ 0.000000] No NUMA configuration found
Edit-2:
I used the page-types tool in the Linux kernel source tree to monitor page cache statuses. From the results I conclude that:
data-1 pages are in state : referenced,uptodate,lru,active,private
data-2 pages are in state : referenced,uptodate,lru,mappedtodisk
Take a look at the cpusets you have configured in /dev/cpusets. If you have multiple directories in here then you have multiple cpusets, and potentially multiple memory nodes.
The cpusets mechanism is documented in detail here: http://www.kernel.org/doc/man-pages/online/pages/man7/cpuset.7.html
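A quick way to see which CPUs and memory nodes a given process is allowed to use is to read the Cpus_allowed_list and Mems_allowed_list fields from /proc/<pid>/status, which the cpuset(7) man page describes. The sketch below is only an illustration: the function name and the PROC_ROOT override are assumptions, and the Mems_allowed_list field is only present on kernels built with cpuset support.

```shell
#!/bin/sh
# Show which CPUs and memory nodes a process may use, from <root>/<pid>/status.
# PROC_ROOT is a hypothetical override that defaults to /proc.
allowed_lists() {
    # $1 = pid
    grep -E '^(Cpus|Mems)_allowed_list:' "${PROC_ROOT:-/proc}/$1/status"
}

# Demonstrated on the current shell; pass a PostgreSQL backend pid instead.
allowed_lists $$ || echo "fields not present on this system"
```

If the list of memory nodes differs between the processes serving data-1 and data-2, the two databases are confined to different memory nodes.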