Is there a way to restore a neo4j 3.5 database without a backup from a persistent volume?

I had a Neo4j 3.5 Enterprise Edition instance running in a Kubernetes cluster.
The cluster was deleted by mistake, without any chance to make a recent backup of the Neo4j database.
The only things remaining from the old database are three persistent disks in Google Cloud Compute Engine.
Is it possible to recover or restore the data stored on them? How?
The disk detail:
{"kubernetes.io/created-for/pv/name":"pvc-fd4fe6eb-2c24-11ea-bd38-42010a8e0228",
"kubernetes.io/created-for/pvc/name":"datadir-neo4j-neo4j-core-2",
"kubernetes.io/created-for/pvc/namespace":"default"}
The Secret storing the old password is lost.
Thanks

You probably need to first identify which disk holds the <neo4j-home>/data directory. Then create a snapshot of that disk (to be safe). Finally, start a new neo4j pod by creating a volume from the snapshot and mounting it at <neo4j-home>/data.
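The steps above could be sketched roughly as follows. This is only a sketch under assumptions, not a tested recovery procedure: it builds the gcloud command lines and a PersistentVolume manifest so you can review them, and runs nothing. The disk name, snapshot name, new disk name, zone, size, and the ext4 filesystem type are all hypothetical placeholders, and the manifest uses the in-tree gcePersistentDisk volume type, which may not be what your new cluster expects.

```python
import json


def snapshot_cmd(disk, snapshot, zone):
    """gcloud command that snapshots an existing persistent disk."""
    return ["gcloud", "compute", "disks", "snapshot", disk,
            f"--snapshot-names={snapshot}", f"--zone={zone}"]


def restore_disk_cmd(new_disk, snapshot, zone):
    """gcloud command that creates a fresh disk from that snapshot."""
    return ["gcloud", "compute", "disks", "create", new_disk,
            f"--source-snapshot={snapshot}", f"--zone={zone}"]


def pv_manifest(pv_name, disk, size):
    """PersistentVolume pointing at a pre-existing GCE disk.

    fsType "ext4" is an assumption; check what the old disk actually uses.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": pv_name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            # Retain, so that deleting the claim never deletes the disk.
            "persistentVolumeReclaimPolicy": "Retain",
            "gcePersistentDisk": {"pdName": disk, "fsType": "ext4"},
        },
    }


if __name__ == "__main__":
    # Every name below is a hypothetical placeholder.
    zone = "europe-west1-b"
    print(" ".join(snapshot_cmd("old-neo4j-disk", "neo4j-rescue", zone)))
    print(" ".join(restore_disk_cmd("neo4j-restored", "neo4j-rescue", zone)))
    print(json.dumps(pv_manifest("neo4j-restored-pv", "neo4j-restored", "10Gi"),
                     indent=2))
```

Working on a disk created from the snapshot, rather than on the original disk, means a failed attempt does not touch the only remaining copy of the data.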

Related

Is it a good idea to backup/restore Neo4j databases with Kubernetes VolumeSnapshot?

I have a Neo4j database running on Kubernetes. I want to make scheduled backups for the database. I know that Neo4j provides a set of tools for backup and restore. However, Kubernetes VolumeSnapshot also looks viable for backup and restore.
I wonder whether it's a good idea to use Kubernetes VolumeSnapshot to back up and restore Neo4j databases. Will it cause errors such as an inconsistent database state or faulty-disk problems? Thanks.

Generally, if it is not supported by the database, then it is a bad idea.
Think of your database as being stored across:
Database files on disk
Page cache (in volatile memory)
Write ahead transaction logs on disk
A volume snapshot would not save enough information to get a consistent state of your database (unless the database is gracefully shut down).
Use the backup/restore tools that the database itself provides.

Backing up EC2 instance from Ephemeral to Persistent Storage

I'm pretty new with EC2 and backing up data, but currently, the app that I've built has no backup strategy and I want to know how to build a proper one. Currently, I have my RoR app and my MongoDB database on one instance. I've just now read about EBS volumes and snapshots, but I just can't wrap my head around it.
Supposedly EBS can be used as a datastore. If that is so, how do I set up a MongoDB database in EBS and migrate the data I have in my EC2 instance to it? I'm not familiar with configuring EBS and I've read the documentation and have more questions than answers.
In short, my instance is ephemeral storage right now and I want to turn it into persistent storage.
Thank you,
Don

It is pretty simple.
EBS provides network-attached disk volumes that are used to store data.
A snapshot is a compressed, point-in-time image backup. Snapshots exist for EBS volumes (and therefore for EBS-backed EC2 instances) and for RDS instances. A snapshot has to be stored somewhere; AWS keeps EBS snapshots in Amazon S3, managed on your behalf.
Configuring EBS is not difficult; it is little different from putting in a new hard drive. You just need to "attach" an EBS volume to your instance. Then, inside the EC2 instance, do the usual OS disk initialisation work.
Because EBS is dynamic storage, as long as your EC2 instance's OS supports it, you can extend the disk space whenever you need to (although it is recommended to take a backup before doing so).
But from an operations perspective, you may want to consider putting your data into a managed database service if it runs 24x7x365, so that you don't need to deal with DB installation, complicated replication updates, etc. (Note that RDS itself does not offer MongoDB, so this would mean a MongoDB-compatible managed service.) If you run the DB only occasionally, then you might want to stick with MongoDB on the EC2 instance.
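As a rough illustration of the attach-and-initialise steps, here is a minimal sketch that only builds the commands and does not run them. The volume ID, instance ID, device names, mount point, old data directory, and the mongod service name are all hypothetical; in particular, the device name the OS shows after attaching can differ from the one you pass to attach-volume.

```python
def ebs_migration_plan(volume_id, instance_id, attach_device, os_device,
                       mount_point, old_data_dir):
    """Return the commands, in order, to move MongoDB data onto a new EBS volume."""
    return [
        # Attach the (already created) volume to the instance.
        ["aws", "ec2", "attach-volume", "--volume-id", volume_id,
         "--instance-id", instance_id, "--device", attach_device],
        # Create a filesystem. This erases the device, so only do it
        # on a brand-new, empty volume.
        ["sudo", "mkfs", "-t", "ext4", os_device],
        ["sudo", "mkdir", "-p", mount_point],
        ["sudo", "mount", os_device, mount_point],
        # Stop the database before copying its files (service name assumed).
        ["sudo", "systemctl", "stop", "mongod"],
        # Copy the data directory contents, preserving ownership and modes.
        ["sudo", "cp", "-a", f"{old_data_dir}/.", mount_point],
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    plan = ebs_migration_plan(
        volume_id="vol-0123456789abcdef0",
        instance_id="i-0123456789abcdef0",
        attach_device="/dev/sdf",
        os_device="/dev/xvdf",
        mount_point="/data/db",
        old_data_dir="/var/lib/mongodb",
    )
    for cmd in plan:
        print(" ".join(cmd))
```

After the copy, mongod would need its storage.dbPath setting pointed at the new mount point before being started again.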

Accessing mongodb data on aws instance

Due to a hardware issue, my AWS instance stopped functioning. The team suggested that I stop and start the instance.
AWS then assigned a new IP to the instance where all my data had been. I had installed MongoDB there and had a couple of databases on it.
When I checked the new server, MongoDB was not running. I started mongod and was then asked to create the /data/db directory. Now MongoDB is running, but when I run
"show databases" none of my previous databases appear. Any help on getting this data back?

An AWS EC2 instance has two types of storage: ephemeral storage and EBS volume storage.
Ephemeral storage should be used for temporary data only. If you reboot your EC2 instance, the data on it is not lost, but if you stop and then start the instance, you lose it all. When you try to stop an EC2 instance, AWS shows this message:
Note that when your instances are stopped: Any data on the ephemeral
storage of your instances will be lost.
This kind of storage is provisioned very close to the instance and because of that it is faster.
EBS is persistent storage that is independent of your EC2 instance. It can be attached to and detached from your instance. This is the kind of storage you want to use when running a database on your instance.

What is the best way to take snapshots of an EC2 instance running MongoDB?

I want to automate taking snapshots of the volume attached to an EC2 instance running the primary node of our production MongoDB replica set. While researching the pitfalls and best practices, I learned that data inconsistency and corruption are very much possible when creating a snapshot, but not if journaling is enabled, which it is in our case.
So my question is - is it safe to go ahead and execute aws ec2 create-snapshot --volume-id <volume-id> to get clean backups of my data?
Moreover, I plan on running the same command via a cron job that runs once every week. Is that a good enough plan to have scheduled backups?

For MongoDB on an EC2 instance I do the following:
mongodump to a backup directory on the EBS volume
zip the mongodump output directory
copy the zip file to an S3 bucket (with encryption and versioning enabled)
initiate a snapshot of the EBS volume
I write a script to perform the above tasks, and schedule it to run daily via cron. Now you will have a backup of the database on the EC2 instance, in the EBS snapshot, and on S3. You could go one step further by enabling cross region replication on the S3 bucket.
This setup provides multiple levels of backup. You should now be able to recover your database in the event of an EC2 failure, an EBS failure, an AWS Availability Zone failure or even a complete AWS Region failure.
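A script along those lines might look roughly like the sketch below. It is not the script I actually run, just an illustration of the four steps; the backup directory, bucket name, and volume ID are hypothetical placeholders, and by default it only prints the commands instead of executing them.

```python
import subprocess
from datetime import date


def backup_plan(backup_root, bucket, volume_id, stamp):
    """Return the four backup steps as command lists, in order."""
    dump_dir = f"{backup_root}/dump-{stamp}"
    archive = f"{dump_dir}.zip"
    return [
        # 1. Dump the database to a directory on the EBS volume.
        ["mongodump", "--out", dump_dir],
        # 2. Zip the dump directory.
        ["zip", "-r", archive, dump_dir],
        # 3. Copy the archive to S3.
        ["aws", "s3", "cp", archive, f"s3://{bucket}/dump-{stamp}.zip"],
        # 4. Start a snapshot of the EBS volume.
        ["aws", "ec2", "create-snapshot", "--volume-id", volume_id,
         "--description", f"mongodb backup {stamp}"],
    ]


def run_plan(plan, dry_run=True):
    """Print each command, or execute it and stop at the first failure."""
    for cmd in plan:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical values; replace with your own.
    plan = backup_plan("/backup", "my-backup-bucket",
                       "vol-0123456789abcdef0", date.today().isoformat())
    run_plan(plan)  # dry run: prints the commands only
```

Using check=True means a failed mongodump stops the run before a stale archive is uploaded or a snapshot is taken.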
I would recommend reading the official MongoDB documentation on EC2 deployments:
https://docs.mongodb.org/ecosystem/platforms/amazon-ec2/
https://docs.mongodb.org/ecosystem/tutorial/backup-and-restore-mongodb-on-amazon-ec2/

EBS snapshots vs. WAL-E for PostgreSQL on EC2

I'm getting ready to move our PostgreSQL databases to EC2, but I'm a little unclear on the best backup and recovery strategy. The original plan was to build an EBS-backed server and set up WAL-E to handle WAL archiving and base backups to S3. I would take snapshots of the final production server volume to be used if the instance crashed. I also see that many people take frequent snapshots of the EBS volume for recovery purposes.
What is the recommended strategy? Is there a reason to archive with WAL and perform scheduled EBS snapshots?

EBS snapshots give you a slightly different type of backup than WAL-E backups. EBS snapshots back up entire drives, which means that if your EC2 instance goes down you can just restart it from your last EBS snapshot and things will pick up right where you last took a snapshot.
The frequency of your EBS snapshots determines how good your database backups are: anything written after the last snapshot is not covered by it.
The appealing thing about WAL-E is the "continuous archiving". If I needed every DB transaction backed up, then WAL-E seems the right choice. Many apps I can envision cannot afford to lose transactions, so that seems a very prudent choice.
I think your plan to snapshot the production volumes as a baseline, then use WAL-E to continuously archive the database seems very reasonable. Personally I would likely add a periodic snapshot (once a day?) to that plan just to take a hard baseline and make your recovery process a bit easier.
The usual caveat of "Test your recovery plans!" applies here. You're mixing a number of technologies (EC2, EBS, Postgres, snapshots, S3, WAL-E), so making sure you can actually recover, rather than just back up, is of critical importance.

EBS snapshots save the image of an entire disk, so you can back up all the disks in the server and recover it as a whole in case of data loss or disaster. Besides that, the block-level nature of EBS snapshots allows very fast recovery: you can have a 1TB database restored and up and running in a few minutes. Recovering a 1TB database from scratch using a file-based solution (like WAL-E) requires copying the data from S3 first, a process that can take hours. Using WAL files for recovery is a good approach, since you can go back to any point in time by transaction, but a snapshot of the entire server includes the WAL files as well, so you’ll still have that option. The backup and rapid recovery process using EBS snapshots can be automated with scripts or EC2 backup solutions (for example, Backup solutions for AWS EC2 instances).
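To put a rough number on "hours", here is a back-of-the-envelope sketch. The 100 MB/s download throughput is purely an assumption for illustration, not a measured S3 figure, and the calculation ignores WAL replay time.

```python
def restore_hours(size_gb, throughput_mb_s):
    """Hours needed to copy size_gb at a sustained throughput (1 GB = 1000 MB)."""
    seconds = size_gb * 1000 / throughput_mb_s
    return seconds / 3600


def worst_case_loss_hours(snapshot_interval_hours):
    """With snapshots alone, everything since the last snapshot can be lost."""
    return snapshot_interval_hours


if __name__ == "__main__":
    # 1 TB at an assumed 100 MB/s
    print(f"{restore_hours(1000, 100):.1f} h")  # prints "2.8 h"
    # Daily snapshots without WAL archiving
    print(f"{worst_case_loss_hours(24)} h")     # prints "24 h"
```

The second function is the reason for combining the two approaches: snapshots give the fast baseline, and WAL archiving covers the window since the last snapshot.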