Accessing mongodb data on aws instance

Due to a hardware issue, my AWS instance stopped functioning. The team suggested that I stop and start the instance.
AWS has now assigned a new IP, where all data is present. I had installed MongoDB and had a couple of databases there.
When I checked on the new server, MongoDB was not running. I started mongod, and it then asked me to create the /data/db directory. MongoDB is now functioning, but when I run
"show databases", none of my previous databases appear. Any help with getting this data back?

An AWS EC2 instance can have two types of storage: ephemeral storage (the instance store) and EBS volume storage.
Ephemeral storage should be used for temporary data only. If you reboot your EC2 instance the data on it is not lost, but if you stop and then start the instance you lose it all. When you try to stop an EC2 instance, AWS gives you this message:
Note that when your instances are stopped: Any data on the ephemeral
storage of your instances will be lost.
Ephemeral storage is provisioned very close to the instance, and because of that it is faster.
EBS is persistent storage that is independent of your EC2 instance. It can be attached to and detached from your instance. This is the kind of storage you want to use when running a database inside your instance.
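To see which kind of storage your databases actually live on, it can help to look up the data directory mongod is configured to use and then check which device backs it. The following is a minimal sketch, assuming a YAML-style /etc/mongod.conf with a `dbPath:` line; the helper names `get_db_path` and `show_backing_device` are made up for this example. Keep in mind that a mongod started without a config file falls back to /data/db, which may not be where a packaged install kept its data.

```shell
#!/bin/sh
# Print the dbPath configured in a mongod.conf-style YAML file.
# Assumes the usual "  dbPath: /some/dir" layout under "storage:".
get_db_path() {
    sed -n 's/^[[:space:]]*dbPath:[[:space:]]*//p' "$1" | head -n 1
}

# Show which filesystem and device back a directory.
show_backing_device() {
    if [ -d "$1" ]; then
        df -P "$1"
    else
        echo "directory not found: $1"
    fi
}

# MONGOD_CONF is an optional override; the default path is an assumption.
CONF="${MONGOD_CONF:-/etc/mongod.conf}"
if [ -r "$CONF" ]; then
    DB_PATH="$(get_db_path "$CONF")"
    echo "configured dbPath: $DB_PATH"
    show_backing_device "$DB_PATH"
fi
```

Running `lsblk` afterwards lists all attached block devices, so you can compare the device reported by `df` against the volumes shown for the instance in the EC2 console.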

Related

Configure MongoDB in EC2 with EBS (SSD Volume)

I am a bit confused.
MongoDB is required for one of my applications. Should I go with MongoDB Atlas
or with MongoDB installed on EC2 (which is what I have chosen, by the way)?
If I go with MongoDB on EC2, my next question is: for an EC2 instance of some type, e.g. "MAD5 LARGE", how can I store all my DB data on a separate EBS volume (one that is not deleted on EC2 termination) rather than on the built-in EC2 storage?
That way, whenever I want to terminate my instance, I can do so without any worries and attach the volume to a new instance.
First of all, you can choose not to delete your root EBS volume when you terminate the EC2 instance.
Second, you can attach additional EBS volumes to your EC2 instance, which won't get deleted when you terminate the instance.
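On the AWS side, these two options could be sketched with the AWS CLI roughly as follows. All IDs and device names are placeholders, not real resources, and the `run` helper is only a dry-run wrapper made up for this example: by default it prints each command instead of executing it.

```shell
#!/bin/sh
# Sketch only: every ID below is a placeholder.
INSTANCE_ID="${INSTANCE_ID:-i-0123456789abcdef0}"
VOLUME_ID="${VOLUME_ID:-vol-0123456789abcdef0}"
ROOT_DEVICE="${ROOT_DEVICE:-/dev/xvda}"   # assumption: root device name
DATA_DEVICE="${DATA_DEVICE:-/dev/sdf}"    # assumption: free device name
DRY_RUN="${DRY_RUN:-1}"                   # 1 = only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Keep the root volume when the instance is terminated.
keep_root_volume() {
    run aws ec2 modify-instance-attribute \
        --instance-id "$INSTANCE_ID" \
        --block-device-mappings \
        "[{\"DeviceName\":\"$ROOT_DEVICE\",\"Ebs\":{\"DeleteOnTermination\":false}}]"
}

# Attach an existing EBS volume; it must be in the instance's Availability Zone.
attach_data_volume() {
    run aws ec2 attach-volume \
        --volume-id "$VOLUME_ID" \
        --instance-id "$INSTANCE_ID" \
        --device "$DATA_DEVICE"
}

keep_root_volume
attach_data_volume
```

Set `DRY_RUN=0` only after replacing the placeholders with your own instance and volume IDs.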
Once you have attached the EBS volume, you need to mount the newly created volume in the Linux OS.
Then you need to configure the MongoDB dbPath in the /etc/mongod.conf file.
You can check the step-by-step process here in my answer.
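Inside the instance, the mount and dbPath steps might look like the sketch below. It assumes a brand-new, empty volume that shows up as /dev/xvdf, an XFS filesystem, a mount point of /data/mongodb and a service user named mongod; all of these are assumptions to adjust for your system. The `run` helper is a made-up dry-run wrapper that prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: prepare a freshly attached, EMPTY EBS volume for MongoDB.
DEVICE="${DEVICE:-/dev/xvdf}"               # assumption: device name in the OS
MOUNT_POINT="${MOUNT_POINT:-/data/mongodb}" # assumption: new data directory
DB_USER="${DB_USER:-mongod}"                # "mongodb" on Debian/Ubuntu packages
DRY_RUN="${DRY_RUN:-1}"                     # 1 = only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

prepare_volume() {
    run sudo mkfs -t xfs "$DEVICE"           # destroys any data on DEVICE
    run sudo mkdir -p "$MOUNT_POINT"
    run sudo mount "$DEVICE" "$MOUNT_POINT"
    run sudo chown -R "$DB_USER:$DB_USER" "$MOUNT_POINT"
    # Point the dbPath line in /etc/mongod.conf at the new mount point.
    run sudo sed -i "s|^\([[:space:]]*dbPath:\).*|\1 $MOUNT_POINT|" /etc/mongod.conf
    run sudo systemctl restart mongod
}

prepare_volume
```

The mount made this way does not survive a reboot; an /etc/fstab entry for the volume is needed for that.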

Backing up EC2 instance from Ephemeral to Persistent Storage

I'm pretty new with EC2 and backing up data, but currently, the app that I've built has no backup strategy and I want to know how to build a proper one. Currently, I have my RoR app and my MongoDB database on one instance. I've just now read about EBS volumes and snapshots, but I just can't wrap my head around it.
Supposedly EBS can be used as a datastore. If that is so, how do I set up a MongoDB database in EBS and migrate the data I have in my EC2 instance to it? I'm not familiar with configuring EBS and I've read the documentation and have more questions than answers.
In short, my instance uses ephemeral storage right now and I want to move it to persistent storage.
Thank you,
Don
It is pretty simple.
EBS provides network-attached disk volumes that are used to store data.
A snapshot is a compressed image backup; it can be taken of EC2 instances, RDS instances, and EBS volumes themselves. Once created, a snapshot has to be stored somewhere, and AWS keeps EBS snapshots in Amazon S3 behind the scenes.
Configuring EBS is not difficult; it is little different from adding a new hard drive. You just need to "attach" an EBS volume to your instance. Then, inside the EC2 instance, do the usual OS disk initialisation work.
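As for migrating the existing data, once the EBS volume is formatted and mounted, one possible approach is to stop mongod, copy the data directory across, and point dbPath at the new location. This is a minimal sketch, not a tested procedure: both paths are assumptions, and `run` is a made-up dry-run wrapper that only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: copy an existing MongoDB data directory onto a mounted EBS volume.
OLD_DB_PATH="${OLD_DB_PATH:-/var/lib/mongo}"  # assumption: current dbPath
NEW_DB_PATH="${NEW_DB_PATH:-/data/mongodb}"   # assumption: EBS mount point
DRY_RUN="${DRY_RUN:-1}"                       # 1 = only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

migrate_data() {
    run sudo systemctl stop mongod                     # stop writes first
    run sudo cp -a "$OLD_DB_PATH/." "$NEW_DB_PATH/"    # -a keeps owner and modes
    # Point the dbPath line in /etc/mongod.conf at the new location.
    run sudo sed -i "s|^\([[:space:]]*dbPath:\).*|\1 $NEW_DB_PATH|" /etc/mongod.conf
    run sudo systemctl start mongod
}

migrate_data
```

Keeping the old directory untouched until the database has been checked on the new volume leaves a way back if something goes wrong.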
Because EBS is dynamic storage, as long as your EC2 instance's OS supports it, you can extend the disk space any time you need to (although it is recommended to take a backup before doing so).
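Extending a volume is a two-step job: enlarge the EBS volume through AWS, then grow the filesystem inside the instance. The sketch below assumes an unpartitioned XFS volume mounted at /data/mongodb; the volume ID and size are placeholders, and `run` is a made-up dry-run wrapper that prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: enlarge an EBS volume, then grow the filesystem on it.
VOLUME_ID="${VOLUME_ID:-vol-0123456789abcdef0}"  # placeholder
NEW_SIZE_GIB="${NEW_SIZE_GIB:-200}"              # placeholder target size
MOUNT_POINT="${MOUNT_POINT:-/data/mongodb}"      # assumption: XFS mount point
DRY_RUN="${DRY_RUN:-1}"                          # 1 = only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Step 1: ask AWS to enlarge the volume.
request_resize() {
    run aws ec2 modify-volume --volume-id "$VOLUME_ID" --size "$NEW_SIZE_GIB"
}

# Step 2: once the OS sees the new size, grow the XFS filesystem.
# For ext4, resize2fs on the device would be used instead.
grow_filesystem() {
    run sudo xfs_growfs -d "$MOUNT_POINT"
}

case "${1:-}" in
    resize) request_resize ;;
    grow)   grow_filesystem ;;
    *)      echo "usage: $0 {resize|grow}" ;;
esac
```

The steps are kept separate because the filesystem can only be grown after the volume modification has taken effect. A volume that carries a partition table would also need the partition extended in between.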
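From an operations perspective, though, you may want to consider a managed database service if the database runs 24x7x365, so that you don't need to deal with DB installation, complicated replication updates, etc. Note that RDS itself does not offer MongoDB as an engine, so for MongoDB this would mean a managed MongoDB service such as MongoDB Atlas. If you only run the DB occasionally, you might want to stick with MongoDB on the EC2 instance.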

What is the best way to take snapshots of an EC2 instance running MongoDB?

I want to automate taking snapshots of the volume attached to an EC2 instance running the primary node of our production MongoDB replica set. While trying to gauge the pitfalls and best practices via Google, I came across the fact that data inconsistency and corruption are very much possible while creating a snapshot, but not if journaling is enabled, which it is in our case.
So my question is - is it safe to go ahead and execute aws ec2 create-snapshot --volume-id <volume-id> to get clean backups of my data?
Moreover, I plan on running the same command via a cron job that runs once every week. Is that a good enough plan to have scheduled backups?
For MongoDB on an EC2 instance I do the following:
mongodump to a backup directory on the EBS volume
zip the mongodump output directory
copy the zip file to an S3 bucket (with encryption and versioning enabled)
initiate a snapshot of the EBS volume
I wrote a script to perform the above tasks and scheduled it to run daily via cron. You will then have a backup of the database on the EC2 instance, in the EBS snapshot, and on S3. You could go one step further by enabling cross-region replication on the S3 bucket.
This setup provides multiple levels of backup. You should now be able to recover your database in the event of an EC2 failure, an EBS failure, an AWS Availability Zone failure or even a complete AWS Region failure.
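The four steps listed above could be scripted along these lines. This is a sketch under stated assumptions, not the script the answer refers to: the backup directory, bucket name and volume ID are placeholders, and `run` is a made-up dry-run wrapper that prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: mongodump, zip, copy to S3, then snapshot the EBS volume.
BACKUP_ROOT="${BACKUP_ROOT:-/data/backups}"           # assumption: on the EBS volume
S3_BUCKET="${S3_BUCKET:-s3://example-mongo-backups}"  # placeholder bucket
VOLUME_ID="${VOLUME_ID:-vol-0123456789abcdef0}"       # placeholder
STAMP="$(date +%Y%m%d-%H%M%S)"
DUMP_DIR="$BACKUP_ROOT/dump-$STAMP"
ARCHIVE="$DUMP_DIR.zip"
DRY_RUN="${DRY_RUN:-1}"                               # 1 = only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

backup() {
    run mkdir -p "$BACKUP_ROOT"
    run mongodump --out "$DUMP_DIR"
    run zip -r "$ARCHIVE" "$DUMP_DIR"
    run aws s3 cp "$ARCHIVE" "$S3_BUCKET/"
    run aws ec2 create-snapshot --volume-id "$VOLUME_ID" \
        --description "mongodb backup $STAMP"
}

backup
```

A daily cron entry for it might look like `0 2 * * * DRY_RUN=0 /usr/local/bin/mongo-backup.sh`, where the script path is hypothetical. Encryption and versioning are settings on the S3 bucket itself and are not handled by this script.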
I would recommend reading the official MongoDB documentation on EC2 deployments:
https://docs.mongodb.org/ecosystem/platforms/amazon-ec2/
https://docs.mongodb.org/ecosystem/tutorial/backup-and-restore-mongodb-on-amazon-ec2/

AMI for EC2 instance with a MongoDB?

I am running an Amazon EC2 instance with a MongoDB running on it.
Since I will only need it from time to time, I was wondering whether it is possible to keep just an image of the system, as an Amazon Machine Image, for when I need it. Any ideas?
You can actually create an AMI from your server and then terminate the server when you don't need it.
When you need it again, you can launch a new server based on the AMI you created. The downside is that the AMI may not contain your latest data, so I recommend creating the AMI right before you terminate the server.
Another alternative is to use EBS-backed storage/instances and simply shut down the instance when you don't need it. You can start the instance again when you need it. There's little cost associated with keeping an EBS volume around, certainly much less than keeping your EC2 instance running all the time.
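Both options map onto a handful of AWS CLI calls. The sketch below uses a placeholder instance ID and a made-up AMI name, and its `run` helper is a dry-run wrapper that only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: create an AMI, or stop/start an EBS-backed instance.
INSTANCE_ID="${INSTANCE_ID:-i-0123456789abcdef0}"   # placeholder
AMI_NAME="${AMI_NAME:-mongodb-$(date +%Y%m%d)}"     # hypothetical naming scheme
DRY_RUN="${DRY_RUN:-1}"                             # 1 = only print the commands

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Note: create-image reboots the instance by default.
create_ami() {
    run aws ec2 create-image --instance-id "$INSTANCE_ID" --name "$AMI_NAME"
}

stop_instance() {
    run aws ec2 stop-instances --instance-ids "$INSTANCE_ID"
}

start_instance() {
    run aws ec2 start-instances --instance-ids "$INSTANCE_ID"
}

case "${1:-}" in
    ami)   create_ami ;;
    stop)  stop_instance ;;
    start) start_instance ;;
    *)     echo "usage: $0 {ami|stop|start}" ;;
esac
```

With the stop/start route the EBS volumes stay in place, so MongoDB's data is still there when the instance comes back, although the public IP may change unless an Elastic IP is attached.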
Hope this helps.
Amazon does not charge you for a machine that is stopped.
You get charged for:
Online time
Storage space (presumably you store the image on S3 [EBS])
Elastic IP addresses
Bandwidth
But Amazon does charge you for the AMIs you have created.
So you can stop your machine and just start it when you need to use it.

Does the data in MongoDB provisioned on EC2 get replicated while autoscaling?

To deploy a server on Amazon EC2, I wish to have the MongoDB master database on the EC2 instance itself, and on average I would have around 5-6 EC2 instances running in parallel, scaled by an Amazon Auto Scaling group.
As the database is updated frequently and all instances are behind an Elastic Load Balancer, it is hard to predict which user's data is in which instance's database. With this approach, am I assured of data consistency in MongoDB across the instances while scaling up and down? If this is not a good approach, please suggest alternative ways of doing it.
When using Amazon autoscaling, new EC2 instances will be created from a root AMI image (for example, with an empty database).
As data is added to your database, that data is not synced back to the AMI image. So when a second EC2 instance is launched due to a scaling event, that new EC2 instance will have its own blank database, because it will be based on the same root AMI image (with the blank database).
The two databases will not know about each other and no syncing will occur. Also, at any time, any of the EC2 instances may be deleted due to a scale-down event. So any data on that instance may be lost.
Separate your web layer from the database layer: use autoscaling to scale your web layer, but don't use autoscaling for your data layer.
MongoDB has its own form of clustering for load balancing and high availability. Use it rather than rolling your own using autoscaling.
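MongoDB's built-in clustering for high availability is the replica set. Initiating one on a fixed set of database hosts (outside the autoscaling group) could be sketched as follows. The host names and the replica set name are hypothetical, each mongod is assumed to have been started with the same replica set name already, and `run` is a made-up dry-run wrapper that prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: initiate a replica set across dedicated database hosts.
RS_NAME="${RS_NAME:-rs0}"   # assumption: matches each mongod's replSet setting
MEMBERS="${MEMBERS:-db1.example.internal:27017 db2.example.internal:27017 db3.example.internal:27017}"
DRY_RUN="${DRY_RUN:-1}"     # 1 = only print the command

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Build the rs.initiate() call from the space-separated member list.
build_rs_config() {
    i=0
    members=""
    for host in $MEMBERS; do
        if [ -n "$members" ]; then
            members="$members, "
        fi
        members="$members{ _id: $i, host: \"$host\" }"
        i=$((i + 1))
    done
    printf 'rs.initiate({ _id: "%s", members: [ %s ] })\n' "$RS_NAME" "$members"
}

# Run the generated call against the first member in the list.
init_replica_set() {
    run mongosh --host "${MEMBERS%% *}" --eval "$(build_rs_config)"
}

init_replica_set
```

The web instances in the autoscaling group would then connect to the replica set by name, so they can come and go without touching the data.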
It is not standard practice to couple your web server with the database server. Here is what I would suggest.
Implement load balancing on your web servers as well as your MongoDB instances so that, for the sake of argument, you have 4 web servers and 4 MongoDB servers.
For load balancing on your MongoDB servers, it is up to you whether to go with a master-slave tier where each MongoDB server is a master as well as a slave (so that all instances have the data synced), or you can look into sharding.