Benefits of using directoryperdb in MongoDB

I have found out that there is an option called directoryperdb, but what are the benefits of using it instead of the default file organization?
cheers,
/Marcin

The main benefit is being able to mount different volumes per database.
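For illustration, a minimal sketch of enabling it (the dbpath is a placeholder; this is the pre-2.6 ini-style config file the 2.x era used):

```
# /etc/mongodb.conf
dbpath = /data/db
directoryperdb = true   # each database gets its own subdirectory under dbpath
```

Each database then lives under /data/db/<dbname>, so you can mount a dedicated volume at that path. Note that you cannot simply flip this on for an existing deployment: the data files must be rebuilt, e.g. via mongodump/mongorestore or an initial sync from another replica set member.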


What are the challenges or problems in backing up and restoring open-source software like OpenStack and Kubernetes?

I would like to understand the existing problems in the field of OpenStack and Kubernetes with respect to backup and restore. Any link or reference to related research would also be helpful.
My USD $0.02: there are several projects I know of that attempt to back up k8s clusters at the metadata level.
We focus on etcd backups, since our risk is wider than just the k8s descriptors. But, as the adage goes, "they're not backups until you've tested them," which I (unfortunately) have not made the time to do.
One also needs to exercise caution because any cluster-state backup will contain cleartext Secret descriptors, so any backup solution needs to be encryption-aware or to skip over those descriptors.
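A sketch of what an etcd-level backup of cluster state can look like. The endpoints, certificate paths and snapshot file name are placeholders, and the script runs in dry-run mode (it only prints the commands it would execute); clear RUN to run them for real:

```shell
#!/bin/sh
# Dry-run sketch of an etcd snapshot backup for a Kubernetes cluster.
# Endpoints, cert paths and file names are assumptions -- adjust for your setup.
RUN="echo"   # set RUN="" to actually execute the commands

backup_etcd() {
  # Take a consistent snapshot of the etcd keyspace
  $RUN env ETCDCTL_API=3 etcdctl \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/server.crt \
    --key=/etc/kubernetes/pki/etcd/server.key \
    snapshot save /backup/etcd-snapshot.db
  # "Not a backup until tested": at minimum, verify the snapshot is readable
  $RUN env ETCDCTL_API=3 etcdctl snapshot status /backup/etcd-snapshot.db
}

backup_etcd
```

Per the caution above: the snapshot contains every Secret in cleartext unless encryption at rest is enabled, so the snapshot file itself should be encrypted before it is shipped anywhere.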

Mongo Replication

I have a mongo 2.4.8 database set up and running in a live environment. I want to add a replica, but I would like to use the latest version, 3.2.9, for the replica.
Is the only way to do this to upgrade the current node to 3.2.9 and then add the replica?
My plan would be to sync all the data to the new node, make it primary, then update the old node to the latest version. Is this possible?
Yes, you can create a new node, make it a replica, and then update the old node.
A few things to keep in mind:
The default storage engine for 3.2.9 is WiredTiger, while 2.4.8 uses MMAPv1, so you would have to change the configuration of the new node so that you keep using MMAPv1 as your storage engine.
Do the replication very carefully. If not done properly, there is a chance the whole database gets blown away, so I recommend taking a backup of the database before starting the replication.
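On the storage engine point, a minimal sketch of the new 3.2.9 node's config (YAML format; the dbPath is illustrative) that pins MMAPv1 instead of the 3.2 default:

```yaml
# mongod.conf on the new 3.2.9 member
storage:
  dbPath: /data/db
  engine: mmapv1   # match the 2.4.8 member rather than the 3.2 default (wiredTiger)
```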
I would definitely go with the first method you mentioned: upgrade the current standalone database and then create a replica set. I tried to find the best practice from MongoDB, but I couldn't find an answer, so I asked Adam, an ex-employee of MongoDB and creator of the M202 course, for his opinion.
Source: Adam, ex-employee of MongoDB
I have gone the route of a full mongo backup and then a restore into the new nodes.
Replication from old to new was very fragile, and the backup is very fast to do, as long as you are allowed to bring the server down.

How to config AEM to use local file system instead of mongoDB for files larger than a specific size?

Currently the project is using AEM 6.0 with mongo 2.6.10. Because of a known issue with the maxPasses assertion, mongo fails to allocate the required space.
Adobe's official documentation mentions that CRX storage can be configured to use another file system under certain conditions. In this case, the requirement is to store DAM assets larger than 16 MB on local file storage instead of in MongoDB; see "repository set up with repository.xml". However, the details of how to do this are not specified.
The question is: how do you configure repository.xml to use the local file system instead of MongoDB for files larger than a specific size?
I think you have not configured a separate DataStore; that is why your larger files are being persisted in MongoDB. AEM allows you to configure the NodeStore and DataStore separately.
Once you configure a separate DataStore, all your larger files (by default, > 16 MB) will be stored separately in the DataStore, while all your regular nodes and properties will be stored in the NodeStore.
There are multiple options for choosing NodeStores and DataStores. In your case, I would suggest continuing to use MongoDB as your NodeStore while configuring a separate file system DataStore to store the binaries.
Please check the documentation below on how to configure a separate DataStore:
https://docs.adobe.com/docs/en/aem/6-1/deploy/platform/data-store-config.html
For the 6.0 version:
https://docs.adobe.com/docs/en/aem/6-0/deploy/upgrade/data-store-config.html
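As an illustration of what such a configuration can look like (the values are placeholders; check the linked docs for your exact version), the FileDataStore is typically set up with an OSGi config file dropped into the install folder:

```
# crx-quickstart/install/org.apache.jackrabbit.oak.plugins.blob.datastore.FileDataStore.config
path="./crx-quickstart/repository/datastore"
minRecordLength="16384"
cacheSizeInMB="128"
```

Binaries larger than minRecordLength bytes are written under path on the local file system, while smaller records stay inline in the NodeStore, so MongoDB stops holding the large binaries.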

Sphinx - NFS index

We are running Sphinx version 1.10. We have multiple Sphinx servers running searchd behind a load balancer. We want to share the same index file across the servers via NFS. We do not want to use rsync, because the servers would receive the updated indexes at different times, which would create inconsistency in the search output.
Because of the .lock file creation, we are currently unable to start searchd on multiple servers over NFS. Any solution would be a great help!
You can use rsync and then rotate all the servers in unison: do the reindexing and the syncing, and then control when the servers actually rotate in the new index.
Works well. A couple of mentions of it here:
http://sphinxsearch.com/forum/search.html?q=rsync+sighup&f=1
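A sketch of that rsync-then-rotate-in-unison approach. The host names, index directory and pid file path are placeholders, and the script runs in dry-run mode (it only prints the commands it would execute); clear RUN to run them for real:

```shell
#!/bin/sh
# Dry-run sketch: push a freshly built index (from 'indexer --all') to every
# search node, then SIGHUP all searchd daemons at once so the new index rotates
# in on all servers at the same moment. All paths/hosts are assumptions.
SERVERS="search1 search2"       # assumption: your searchd hosts
INDEX_DIR="/var/data/sphinx"    # assumption: index location on every host
RUN="echo"                      # set RUN="" to actually execute

rotate() {
  # 1) copy the new index files to every server first
  for host in $SERVERS; do
    $RUN rsync -a "$INDEX_DIR/" "$host:$INDEX_DIR/"
  done
  # 2) only once every host has the files, rotate them all in unison
  for host in $SERVERS; do
    $RUN ssh "$host" "kill -HUP \$(cat /var/run/searchd.pid)"
  done
}

rotate
```

Because the copy phase completes on every server before any of them rotates, the load balancer never mixes old and new indexes across servers.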
I can say it is impossible to share indexes between two or more searchd instances.
You have to implement something similar to rsync; see how we do Sphinx replication.

mongodb automatic failover / high availability on aws

I need a proper failover mechanism for MongoDB on AWS EC2. I know failover can be accomplished with replica sets, but what is the best way to launch a new Ubuntu EC2 AMI node with mongo installed, add it back to the replica set automatically (with zero manual operation), and return the replica set to its proper state?
EBS has some problems, but if I use local instance storage, I will lose the dead node's data. Does the replica have all of the master's data, so that a replica alone is enough to recover everything (on mongo 1.8 with journaling), or do I have to use only EBS?
How should I start the mongo instances? If I should start them with the repair option, how can I separate a node's first run from a failover restart?
Regards,
The easiest way is to bring up a new node from a recent backup.
So now it's a question of how you take your backup and how quickly you can restore from it.
The MongoDB site has a write up for backups (in general) and backups on EC2 specifically. There's also a write-up for adding a new set member.
You can do this with instance storage or EBS drives, but you'll need different strategies for each. There's really no single way to do this, so I would check out the docs I've linked to for a primer.
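As one concrete shape of the backup-then-restore approach, here is a sketch of seeding a replacement node from a dump and re-adding it to the replica set. The host names and paths are placeholders, and the script runs in dry-run mode (it only prints the commands it would execute); clear RUN to run them for real:

```shell
#!/bin/sh
# Dry-run sketch: seed a freshly launched node from a recent dump, then add it
# back to the replica set. Hosts and paths are assumptions for illustration.
RUN="echo"   # set RUN="" to actually execute

seed_and_add() {
  # 1) take a dump from a healthy member (an EBS snapshot restore also works)
  $RUN mongodump --host rs0-primary.example.com --out /backup/dump
  # 2) restore it onto the new node so the initial sync has little to catch up
  $RUN mongorestore --host rs0-new.example.com /backup/dump
  # 3) add the node to the replica set from the primary
  $RUN mongo --host rs0-primary.example.com --eval 'rs.add("rs0-new.example.com:27017")'
}

seed_and_add
```

To make this zero-touch, the same steps can be baked into the instance's launch script so a replacement node seeds and registers itself on first boot.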
I highly recommend reading Sean Coates' article on multi-node MongoDB elections, failover and AWS; specifically, note the subtlety around distributed arbiter nodes (e.g., make sure to give yourself a voting majority when an AZ goes down). A similar recommendation can be found in a comment on this (now-closed) MongoDB vs. Cassandra thread.