I see that MongoDB has the configuration option storage.directoryPerDB, but I only see storage.dbPath for specifying where data is stored.
We have two small, frequently used "settings" databases that will be stored locally in the default location. There is another "results" database for large image files, which is written often but queried infrequently, and which has a dedicated SSD drive for its storage. This data needs to be on its own drive because our application can generate hundreds of gigabytes of image data in a short amount of time.
How can I configure mongod to store a database on a different drive? The server is running on Windows, if that makes any difference.
Never mind. The documentation at http://docs.mongodb.org/manual/reference/configuration-options/#storage.directoryPerDB explains how to do it perfectly, along with http://technet.microsoft.com/en-us/library/cc753321.aspx#BKMK_CMD, which describes how to mount a drive to a folder location.
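For anyone landing here later, a minimal sketch of the two pieces, assuming mongod 2.6+'s YAML config format (all paths are example values). With storage.directoryPerDB enabled, each database gets its own subdirectory under dbPath:

```yaml
# mongod.cfg -- example paths; directoryPerDB gives every database its own folder
storage:
  dbPath: C:\data\db
  directoryPerDB: true
```

The dedicated SSD can then be mounted as an NTFS mounted folder at the subdirectory the "results" database will use, per the TechNet article above (the volume GUID below is a placeholder; running mountvol with no arguments lists the real ones):

```
rem Mount the SSD at the folder where the "results" database will live
mountvol C:\data\db\results \\?\Volume{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}\
```

Note that enabling directoryPerDB on an existing data directory is not compatible with the existing files, so it requires a dump and restore.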
Cloud platforms like Linode.com often provide hot-pluggable storage volumes that you can easily attach and detach from a Linux virtual machine without restarting it.
I am looking for a way to install Postgres so that its data and configuration end up on a volume that I have mounted to the virtual machine. The end result should allow me to shut down the machine, detach the volume, spin up another machine with an identical version of Postgres already installed, attach the volume, and have Postgres work just like it did on the old machine, with all the data, file system permissions, and server-wide configuration intact.
Is such a thing possible? Is there a reliable way to move installations (i.e., databases and configuration, not the actual binaries) of Postgres across machines?
CLARIFICATION: the virtual machine has two disks:
the "built-in" one which is created when the VM is created and mounted to /. That's where Postgres gets installed to and you can't move this disk.
the hot-pluggable disk which you can easily attach and detach from a running VM. This is where I want Postgres data and configuration to be so I can just detach the disk (after shutting down the VM to prevent data loss/corruption) and attach it to another VM when I want my data to move so it behaves like it did on the old VM (i.e. no failures to start Postgres, no errors about permissions or missing files, etc).
This works just fine. It is not really any different from starting and stopping PostgreSQL without removing the disk. There are a couple of things to consider, though.
You have to make sure PostgreSQL is stopped and its writes are synced before unmounting the volume. Obvious enough, and I can't believe you'd be able to unmount before the sync completed, but it's worth repeating.
You will want the same version of PostgreSQL, probably on the same version of the operating system with the same locales too. Different distributions might compile it with different options.
Although you can put configuration and data in the same directory hierarchy, most distros tend to put config in /etc. If you compile from source yourself this won't be a problem. Alternatively, you can usually override the default locations or, and this is probably simpler, bind-mount the data and config directories into the places your distro expects.
Note that if your storage allows you to connect the same volume to multiple hosts in some sort of "read only" mode, that won't work.
Edit: steps from comment moved into body for easier reading.
1. Start up PG; create a table and put one row in it.
2. Stop PG.
3. Mount your volume at /mnt/db.
4. rsync /var/lib/postgresql/NN/main to /mnt/db/pg_data and /etc/postgresql/NN/main to /mnt/db/pg_etc.
5. Rename /var/lib/postgresql/NN/main, adding .OLD to the name, and do the same with the /etc directory.
6. Bind-mount the directories from /mnt to replace them.
7. Restart PG.
8. Test.
9. Return to step 8 until you are happy.
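For concreteness, here is a sketch of steps 3-6 as shell commands, assuming the Debian/Ubuntu directory layout; NN stands for your PostgreSQL major version, and /dev/sdc is a stand-in for your volume's actual device:

```bash
sudo mount /dev/sdc /mnt/db                                  # step 3

sudo rsync -a /var/lib/postgresql/NN/main/ /mnt/db/pg_data/  # step 4: data
sudo rsync -a /etc/postgresql/NN/main/     /mnt/db/pg_etc/   #         config

sudo mv /var/lib/postgresql/NN/main /var/lib/postgresql/NN/main.OLD   # step 5
sudo mv /etc/postgresql/NN/main     /etc/postgresql/NN/main.OLD

sudo mkdir /var/lib/postgresql/NN/main /etc/postgresql/NN/main        # step 6
sudo mount --bind /mnt/db/pg_data /var/lib/postgresql/NN/main
sudo mount --bind /mnt/db/pg_etc  /etc/postgresql/NN/main
```

rsync -a preserves the ownership and permissions PostgreSQL is fussy about, and the bind mounts can go in /etc/fstab if you want them to survive a reboot.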
The project is currently using AEM 6.0 with Mongo 2.6.10. Because of a known issue with the maxPasses assertion, Mongo fails to allocate the required space.
The official Adobe documentation mentions that CRX storage can be configured to use another file system under certain conditions. In this case, DAM assets larger than 16 MB need to be stored on the local file system instead of in MongoDB; see the repository setup with repository.xml. However, the details of how to do this are not specified.
The question is: how do we configure repository.xml to use the local file system instead of MongoDB for files larger than a specific size?
I think you have not configured a separate DataStore; that is why your larger files are being persisted in MongoDB. AEM allows you to configure the NodeStore and the DataStore separately.
Once you configure a separate DataStore, all your larger files (by default, those over 16 MB) will be stored separately in the DataStore, while all your regular nodes and properties will be stored in the NodeStore.
There are multiple NodeStore and DataStore options to choose from. In your case, I would suggest continuing to use MongoDB as your NodeStore while configuring a separate file system DataStore to store the binaries.
Please check the documentation below on how to configure a separate DataStore:
https://docs.adobe.com/docs/en/aem/6-1/deploy/platform/data-store-config.html
For the 6.0 version:
https://docs.adobe.com/docs/en/aem/6-0/deploy/upgrade/data-store-config.html
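For what it's worth, on AEM 6.x with Oak the file DataStore is enabled through an OSGi configuration rather than repository.xml (the repository.xml route applies to the older CRX2-style persistence). As a sketch based on the documentation above, a file named org.apache.jackrabbit.oak.plugins.blob.datastore.FileDataStore.config dropped into crx-quickstart/install might look like this, where path is where binaries land on disk and minRecordLength (in bytes) is the size below which binaries stay inline; the values here are examples only:

```
path="/mnt/datastore"
minRecordLength="4096"
cacheSizeInMB="128"
```

Check the exact property names against the data-store-config page for your version before relying on this.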
Is there a way to mount a storage bucket to an instance so it can be used by the webserver as storage? If not, how can I add more storage to the instance without adding another persistent disk with an OS?
Aside from attaching a new persistent disk, you could also use a number of FUSE based utilities to mount either a Google Cloud Storage or AWS S3 bucket as a local disk.
s3fs:
* Works with Google Cloud Storage or AWS S3.
* The bucket can be mounted on multiple systems at the same time.
* Files are stored as plain objects in the bucket, so they can be manipulated externally.
* A con is that it can be a little slow if you have a lot of files.
S3QL:
* Works with Google Cloud Storage or AWS S3.
* The bucket can be mounted on only one system at a time.
* Files are stored in a proprietary format and can't be manipulated outside of the mounted filesystem.
* Much faster than s3fs when handling many files.
* Doesn't handle network connectivity issues very well (a manual fsck and remount are needed if you lose the network).
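As a rough sketch of the s3fs route (bucket name, mount point, and credential values are placeholders; for Google Cloud Storage you need HMAC interoperability keys, and depending on versions you may need to force a signature version):

```bash
# Credentials file holds "ACCESSKEY:SECRET" (example values).
echo 'ACCESSKEY:SECRET' | sudo tee /etc/passwd-s3fs >/dev/null
sudo chmod 600 /etc/passwd-s3fs
sudo mkdir -p /mnt/bucket

# AWS S3:
sudo s3fs my-bucket /mnt/bucket -o passwd_file=/etc/passwd-s3fs

# Google Cloud Storage, via its S3-interoperability endpoint:
sudo s3fs my-bucket /mnt/bucket -o passwd_file=/etc/passwd-s3fs \
    -o url=https://storage.googleapis.com
```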
Hope this helps.
You can certainly create a new (larger) Persistent Disk and attach it to your instance as a data disk. This is a very good option, since it keeps your website data separate from your operating system. See the Persistent Disk docs for details on all the options.
In your case:
Create a new Persistent Disk for the data. Pick a size large enough for your data and large enough to get the I/O throughput you want. (See this chart for details)
Attach the disk to your instance.
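Using today's gcloud CLI, the whole thing is a few commands; disk name, instance name, zone, and mount point below are all example values:

```bash
# Create a 500 GB data disk and attach it to the instance.
gcloud compute disks create www-data --size=500GB --zone=us-central1-a
gcloud compute instances attach-disk my-instance --disk=www-data --zone=us-central1-a

# On the instance: format it (first time only) and mount it.
sudo mkfs.ext4 -F /dev/disk/by-id/google-www-data
sudo mkdir -p /var/www/data
sudo mount /dev/disk/by-id/google-www-data /var/www/data
```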
I have freshly provisioned instances of Apache and Postgres all set to go. I would like to restore a dump or mount a logical volume with data onto the Postgres instance. Likewise, I'd like to ensure that the dump is written out, or the volume unmounted, when I bring the instance down.
Can I use a logical volume this way? How should I approach this?
I see this:
How to handle data such as Mysql, web sites sources with Vagrant?
The other answer had the following suggestions. Below I will discuss their implications for PostgreSQL.
In the current version of Vagrant (1.0.3), you have two main options:
1. Use shared folders. You can put your MySQL data directory into a shared folder so that the data comes back onto your host machine. The con of this is that shared folders are actually quite slow compared to the native VM filesystem in VirtualBox, and you can run into weird permission issues as well.
2. Set up a task (rake, make, etc.) to copy your MySQL data to your shared folder on demand. Then, before you decide to destroy your VM, you can run the task to export your data to your shared folder, and you can reimport the data when you bring your VM back up.
The shared folders approach may work, but if you do this you need to be extremely careful with file permissions. PostgreSQL tends to be very paranoid about this, so you may have to be cautious about group permissions.
I would recommend something based on the second approach, using a base backup (pg_basebackup), since that gives you a full copy of your database. You can also archive your WAL segments to that directory so you have something that can be restored on demand to near-present conditions.
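As a sketch of that approach (the role name and paths are examples; /vagrant is Vagrant's default shared folder):

```bash
# Take a compressed, tar-format base backup of the cluster into the shared folder.
pg_basebackup -h localhost -U postgres -D /vagrant/pg_backup -Ft -z -P

# To keep it restorable to near-present, also archive WAL segments there
# by setting in postgresql.conf:
#   archive_mode = on
#   archive_command = 'cp %p /vagrant/wal_archive/%f'
```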
I know that MongoDB can scale vertically. What about if I am running out of disk?
I am currently using EC2 with EBS. As you know, an EBS volume has to be allocated at a fixed size.
What if MongoDB grows beyond the EBS volume's size? Do I have to create a larger EBS volume and copy the files across?
Or should we start more MongoDB instances, each connected to a different EBS disk? In that case, I could connect to a different instance for different databases.
If you're running out of disk, you obviously need to get a bigger disk.
There are several ways to migrate your data; it really depends on the kind of up-time you need. The first steps, of course, involve bundling the machine and creating the new volume.
These tips go from easiest to hardest.
Can you take the database completely off-line for several minutes?
If so, do this (migration by copy):
Mount the new EBS volume on the server.
Stop your app from connecting to Mongo.
Shut down mongod and wait for everything to be written out (check the logs).
Copy all of the data files (and probably the logs) to the new EBS volume.
While the copy is happening, update your mongod start script (or config file) to point to the new volume.
Start mongod and check the connection.
Restart your app.
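A sketch of those steps on Linux, assuming a 2.x-era setup where the config file uses dbpath= and the new volume shows up as /dev/sdf (all device names and paths are examples):

```bash
sudo mkfs.ext4 /dev/sdf                 # prepare and mount the new EBS volume
sudo mkdir -p /data/db-new
sudo mount /dev/sdf /data/db-new

sudo service mongod stop                # shut down and let writes flush
sudo rsync -a /data/db/ /data/db-new/   # copy data files (and logs, if kept there)

# Point mongod at the new volume, e.g. in /etc/mongod.conf:
#   dbpath=/data/db-new
sudo service mongod start
```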
Can you take the database off-line for just a few minutes?
If so, do this (slaving and switch):
Start up a new instance and mount the new EBS on that server.
Install/start mongod as a --slave pointing at the current database. (You may need to restart the current instance as --master.)
The slave will do a fresh synchronization. Once the slave is up-to-date, you'll do a "switch" (next steps).
Turn off writes from the system.
Shut down the original mongod process.
Re-start the "new" mongod as a master instead of the slave.
Re-activate system writes pointing at the new master.
Done correctly those last three steps can happen in minutes or even seconds.
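For reference, the mongod invocations for this slave-and-switch approach might look roughly like this, using the legacy --master/--slave flags from that era (host names and paths are examples):

```bash
# On the current server (restart it with --master if it is not one already):
mongod --dbpath /data/db --master

# On the new server, with the new EBS volume mounted at /data/db:
mongod --dbpath /data/db --slave --source old-server:27017

# Once the slave has synced and writes are stopped: shut both down, then
# bring the new server back up as the master.
mongod --dbpath /data/db --master
```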
Can you not afford any down-time?
If so, do this (master-master):
Start up a new instance and mount the new EBS on that server.
Install/start mongod as both a master and a slave against the current database. (You may need to restart the current one as a master; minimal down-time?)
The new computer should do a fresh synchronization.
Once the new computer is up-to-date, switch the system to point at the new server.
I know it seems like this last version is actually the best, but it can be a little dicey (as of this writing). The reason is simply that I've honestly had a lot of issues with "Master-Master" replication, especially if you don't start with both active.
If you plan on using this method, I highly suggest a smaller practice run first. If something bombs here, Mongo might simply wipe all of your data files, which will have the effect of taking more stuff down.
If you get a good version of this please post the commands, I'd like to see it in action.
Doesn't the "E" in EBS stand for "elastic", meaning something like resizing on the fly?
Currently the MongoDB team is working on finishing sharding, which will give you horizontal scaling by partitioning data across different servers. Give it a month or two and it will work fine. The developers are quite good at keeping their promises.
http://api.mongodb.org/wiki/current/Sharding%20Introduction.html
http://api.mongodb.org/wiki/current/Sharding%20Limits.html
You could slave the bigger disk off the smaller one until it's caught up, or fsync+lock, take a file system snapshot, and copy it onto the bigger disk.
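The fsync+lock dance looks roughly like this from the shell (era-appropriate; on older versions the underlying command is db.runCommand({fsync: 1, lock: true})):

```bash
mongo --eval 'printjson(db.fsyncLock())'     # flush to disk and block writes
# ... take the file-system / EBS snapshot of the data volume here ...
mongo --eval 'printjson(db.fsyncUnlock())'   # release the lock
```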
Well, I am using MongoDB now. I am pretty amazed at the performance it delivers, especially on some simple sorting.
I believe it's a good tool for simple web application logic. My remaining concerns are how to scale and back up; I will continue to explore.
The only disadvantage for me is that I don't have any good tools to inspect the data stored inside. For example, I want to put my logging from MySQL into Mongo as well, but it's pretty difficult for me to view the logs; previously, I could use a MySQL query to fetch what I wanted easily.
Anyway, it's a good tool and I will continue to use it.