Put memcached on db or web server instance? - postgresql

For my Drupal-based site, I have an architecture with 3 instances running nginx, postgresql, & solr, respectively. I'd like to install Memcached. Should I put it on the nginx or postgresql server? What are the performance implications?

Memcached is very light on CPU usage, so it is a great candidate to gobble up spare web server RAM. Also, you will scale out your web tier much more than your other tiers, and Memcached clustering can pool that RAM together into one logical cache.
If you have any spare RAM on the DB, it is almost always best for performance to let the DB gobble it up.
TL;DR Let DB have all of the RAM, colocate memcached on web tier.
Source: http://code.google.com/p/memcached/wiki/NewHardware
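To illustrate how RAM on several web servers can act as one logical cache: memcached servers do not talk to each other, the client decides which server holds each key. Below is a minimal sketch of that idea using a plain hash-modulo scheme. The host names are hypothetical, and real clients typically use consistent hashing instead, so that adding a server does not remap most keys.

```python
import hashlib
from typing import Sequence


def pick_server(key: str, servers: Sequence[str]) -> str:
    """Map a cache key to one server with a stable hash (simple modulo scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]


# Hypothetical web-tier hosts, each running a local memcached on port 11211.
WEB_SERVERS = ["web1:11211", "web2:11211", "web3:11211"]

if __name__ == "__main__":
    for key in ("node:42", "user:7", "views:front_page"):
        print(key, "->", pick_server(key, WEB_SERVERS))
```

Every web server runs the same mapping, so they all agree on where a given key lives, and the usable cache size is roughly the sum of the memory given to each instance.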

The best is to have a separate server (if you can do that).
Otherwise, it depends on your servers' CPU and memory utilization and on your availability requirements. In general I would avoid running anything extra on a DB server machine, since the DB is the foundation of the system and has to be available and performing well.
If your Solr server does not have high traffic and doesn't use much memory, I'd put it there. Memcached servers are known to be light on CPU. You should also estimate how much memory the memcached instance will need, to make sure the server has enough.

Related

Cassandra and MongoDB minimum system requirements for Windows 10 Pro

RAM- 4GB,
PROCESSOR- i3-5010U CPU @ 2.10 GHz
64 bit OS
Can Cassandra and MongoDB be installed on such a laptop? Will they run successfully?
The hardware configuration proposed does not meet the minimum requirements. For Cassandra, the documentation requests a minimum of 8GB of RAM and at least 2 cores.
MongoDB's documentation also states that it will need at least 2 real cores or one multi-core physical CPU. With 4GB of RAM, WiredTiger will by default allocate 1.5GB for its cache. Please also note that MongoDB may require BIOS changes to allow memory interleaving on Non-Uniform Memory Access (NUMA) hardware; such changes will affect the performance of the laptop for other processes.
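The 1.5GB figure follows from WiredTiger's default cache sizing rule, which is the larger of 50% of (RAM minus 1GB) or 256MB. A small sketch of that arithmetic, with a function name that is mine and not part of MongoDB:

```python
def wiredtiger_default_cache_gb(total_ram_gb: float) -> float:
    """Default WiredTiger cache: the larger of 50% of (RAM - 1 GB) or 256 MB."""
    return max(0.5 * (total_ram_gb - 1.0), 0.25)


if __name__ == "__main__":
    for ram in (1, 4, 8, 16):
        print(f"{ram} GB RAM -> {wiredtiger_default_cache_gb(ram):.2f} GB cache")
```

This is only the internal cache; the mongod process and the filesystem cache use memory on top of it.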
Will it run successfully?
This will depend on the workload you expect to run. There are documented examples where Cassandra was installed on a Raspberry Pi array; by design, such a cluster was expected to perform slowly and to hold only a limited amount of data.
If you are looking for a small sandbox to start using these databases, there are other options. MongoDB has a database-as-a-service offering named Atlas, with a free tier that provides a 3-node replica set and up to 512MB of storage. For Cassandra there are similar options: AWS offers a small cluster of its Managed Cassandra Service (MCS) in the free tier, and Datastax is also planning to offer similar services with Constellation.

Running MongoDB and Redis on two different containers in the same host machine

I have read somewhere that MongoDB and Redis server shouldn't run on the same host because the way Redis manages memory hurts MongoDB. That was before Docker.io. Are things different now? Is it reasonable to run Redis server and MongoDB in two different containers on the same host machine?
Docker does not change your hardware. It is also the OS that manages resources, and the OS is not virtualized, so the same rules apply here as on ordinary hardware.
RAM
MongoDB and Redis don't share any memory. The problem with using the same host is that you can run out of RAM with these two processes. You can set a maximum size for Redis, and you can probably do the same for MongoDB; doing so is mandatory.
If your sizing is good (MongoDB RAM + Redis RAM < Hardware RAM), Redis won't swap to disk (which is absolutely what you want to prevent), but the MongoDB cache may not be as effective (not enough room for optimization). Less memory for Redis is always a challenge if your data grows: beware of running out of memory if the data size is unpredictable!
If you use backups with Redis, it uses more RAM than its dataset to produce the dump, so beware of that. It also implies IO.
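The sizing rule above can be turned into a quick budget check. This is only a sketch under stated assumptions: the snapshot overhead (worst case, the Redis dataset is effectively doubled while the dump is written) and the 1GB OS reserve are my own illustrative defaults, not measured values.

```python
def fits_in_ram(host_gb, mongo_cache_gb, redis_maxmemory_gb,
                redis_snapshot_overhead=1.0, os_reserve_gb=1.0):
    """Return (fits, headroom_gb) for a worst-case memory budget.

    redis_snapshot_overhead is the assumed extra fraction of the Redis
    dataset used while a dump is written (1.0 = dataset doubled).
    """
    redis_peak_gb = redis_maxmemory_gb * (1.0 + redis_snapshot_overhead)
    needed_gb = mongo_cache_gb + redis_peak_gb + os_reserve_gb
    headroom_gb = host_gb - needed_gb
    return headroom_gb > 0, headroom_gb


if __name__ == "__main__":
    # Hypothetical hosts: 16 GB and 8 GB of RAM.
    print(fits_in_ram(16, mongo_cache_gb=6, redis_maxmemory_gb=4))
    print(fits_in_ram(8, mongo_cache_gb=3, redis_maxmemory_gb=3))
```

A negative headroom means the host could start swapping during a Redis dump even though the steady-state sizes fit.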
IO
In this case (less RAM), MongoDB will do a lot more IO to access data. Redis, depending on your backup policy, may or may not use IO (your choice). Worst case: if you use AOF on Redis, it generates a lot of IO, so IO can become a bottleneck in this architecture. If you don't use backups with Redis, you won't have problems. Also, an SSD is a good choice for MongoDB.
CPU
I don't know if MongoDB uses a lot of CPU, but Redis most of the time does not, except during backups. If you use backups with Redis, try to have two CPU cores available for it (one for Redis, one for the backup task).
Network
It depends on your number of clients. But you should check the throughput / input load of your machine to see whether you are saturating it (using monit with alerts, for instance). Sometimes the network is the bottleneck: not enough throughput in one machine!
Many of today's services, databases in particular, consume resources aggressively and are designed on the assumption that they will (or should) run on a dedicated machine. MongoDB and Redis try to keep a lot of data in memory and will take as much memory as they can for themselves. To prevent these services from taking all the memory of your host machine, you can limit the maximum memory used by a container with -m="<number><optional unit>" in docker run. E.g.: docker run -d -m="2g" -p 27017:27017 --name mongodb dockerfile/mongodb
So you can easily control the resource limits of your services and run them on the same host with fine-grained control of the resources. Still, it's important to remember that these services are designed on the assumption that the resources of the host machine will be fully available to them. For example, other databases such as Cassandra consume a lot of memory and, furthermore, are designed around sequential writes to disk. In these cases Docker will let you limit the resources used, but if you run multiple services on the same host their performance will decrease severely.
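As a rough sketch of running both services with explicit limits, the snippet below only builds the docker run command lines, using the same -m, -p and --name flags as the example above. The memory split and the redis image name are assumptions for illustration; pick limits from your own measurements.

```python
import shlex


def docker_run_command(name, image, memory, port):
    """Build a 'docker run' command line with a hard memory limit (-m)."""
    args = ["docker", "run", "-d", "-m", memory,
            "-p", f"{port}:{port}", "--name", name, image]
    return shlex.join(args)


# Hypothetical split for one host; image names and sizes are illustrative.
SERVICES = [
    ("mongodb", "dockerfile/mongodb", "2g", 27017),
    ("redis", "redis", "1g", 6379),
]

if __name__ == "__main__":
    for name, image, memory, port in SERVICES:
        print(docker_run_command(name, image, memory, port))
```

Note that a container limit is a ceiling enforced from outside. The service inside should still be configured with its own cache or maxmemory setting below that ceiling, otherwise it may be killed when it hits the limit.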

Docker instead of multiple VMs

So we have around 8 VMs running on a server with 32 GB of RAM and 8 physical cores. Six of them each run a mail server (Zimbra), and two of them run multiple web applications. The load on the server is very high, primarily because of the heavy load on each VM.
We recently came across Docker. It seems to be a cool idea to create containers of applications. Do you think it's viable to run the applications of each of these VMs inside 8 Docker containers? Currently the server is heavily utilized because multiple VMs have serious I/O issues.
Or can docker be utilized in cases where we are only running web applications, and not email or any other infra apps. Do advise...
Docker will certainly alleviate your server's CPU load, since it removes the hypervisor's overhead in that respect.
Regarding I/O, my tests revealed that Docker has its own I/O overhead, due to how AUFS (or lately device mapper) works. On that front you will still gain some benefit over the hypervisor's I/O overhead, but you will not get bare-metal I/O performance. My observations, for my own needs, indicated that Docker did not deliver bare-metal-like performance for I/O-intensive services.
Have you thought about adding more RAM, 64GB or more? For a large Zimbra deployment, 4GB per VM may not be enough. Zimbra, like all messaging and collaboration systems, is an IO-bound application.
Having zmdiaglog (/opt/zimbra/libexec/zmdiaglog) data would help to see whether you are allocating memory correctly, as described here:
http://wiki.zimbra.com/wiki/Performance_Tuning_Guidelines_for_Large_Deployments#Memory_Allocation

Are there major downsides to running MongoDB locally on VPS vs on MongoLab?

I have an account with MongoLab for MongoDB and the constant calls to this remote server from my app slow it down quite a lot. When I run the app locally on my computer with a local version of Mongod and MongoDB it's far, far faster, as would be expected.
When I deploy my app (running on Node/Express) it will run on a CentOS VPS. I have plenty of storage space available on my VPS. Are there any major downsides to running MongoDB locally rather than remotely on MongoLab?
Specs of the VPS:
1024MB RAM
1024MB VSwap
4 CPU Cores @ 3.3GHz+
60GB SSD space
1Gbps Port
3000GB Bandwidth
Nothing apart from the obvious:
You will have to worry about storage yourself. MongoDB does tend to take a lot of disk space. Upgrading storage will probably be harder to manage than letting MongoLab take care of it.
You will have to worry about making sure the Mongo server doesn't crash and it's running fine.
You will have scaling issues in the future once the load on your application and your database increases.
Using a "database-as-a-service" like MongoLab frees you from worrying about a lot of hardware/OS/system level requirements and configuration. Memory optimization? Which file system? Connection limits? Virtualization and IO ops issues? (thanks to Nikolay for pointing that one out)
If your VPS provider doesn't account for local traffic in your bandwidth, then you can set up another VPS for MongoDB. That way, the server will be closer so the requests will be faster, and also, it will have the benefits of being an independent server. It still won't be fully managed like MongoLab though.
[ Edit: As Chris pointed out, MongoLab also helps you with your database schema design and bundles MongoDB support with their plans, so that's also nice. ]
Also, this is a good question, but probably not appropriate for StackOverflow. dba.stackexchange.com and serverfault.com are good places for this question.

What are possible reasons for memcached to be significantly slower on a remote server?

I have a PHP/Apache server with 12GB of RAM. I have been running Memcached on the same machine with 6GB of allotted RAM.
I wanted to run Memcached on a separate server (same datacenter, vlan, subnet), just as I do for MySQL. I setup a separate, identical server with the same memcached configuration.
I am seeing roughly 10x the page load time when using Memcached on the remote server compared with running it locally. I have primed both caches and I still see a 10x load time from remote.
I'm having trouble troubleshooting this.
You're loading 500kb of data per pageload, in all small keys? How many keys per pageload is this?
Latency to a remote server is very low, but running many roundtrips is still a bad idea. Memcached clients support multi-get operations, where you batch many keys into a single request/response with much lower latency.
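To see why the number of round trips matters more than the latency of each one, here is a back-of-the-envelope model. The key count and round-trip time are made-up illustrative numbers, not measurements from the question.

```python
def fetch_time_ms(num_keys, rtt_ms, batch_size=1):
    """Estimated network wait: one round trip per batch of keys."""
    round_trips = -(-num_keys // batch_size)  # ceiling division
    return round_trips * rtt_ms


if __name__ == "__main__":
    keys, rtt = 500, 0.25  # hypothetical: 500 keys per page, 0.25 ms round trip
    print("one get per key :", fetch_time_ms(keys, rtt), "ms")
    print("single multi-get:", fetch_time_ms(keys, rtt, batch_size=keys), "ms")
```

With these assumed numbers, fetching keys one at a time spends 125 ms per page just waiting on the network, while a single multi-get spends 0.25 ms. On localhost the round trip is so short that the same access pattern goes unnoticed.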
Just for info, DDR3-1333 peaks at about 10667 MB/s.
If you have, let's say, Gigabit Ethernet, I guess that can explain some of the problems you are experiencing...
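A rough comparison of the raw transfer time, assuming the 500kb per page load mentioned above means about 0.5 MB and taking Gigabit Ethernet at its theoretical 125 MB/s (1000 Mbit/s divided by 8). Both figures are best-case peaks, so real numbers will be worse.

```python
def transfer_ms(payload_mb, bandwidth_mb_per_s):
    """Time to move a payload at a given bandwidth, in milliseconds."""
    return payload_mb / bandwidth_mb_per_s * 1000.0


if __name__ == "__main__":
    payload_mb = 0.5  # assumption: ~500 kilobytes per page load
    print(f"DDR3-1333 : {transfer_ms(payload_mb, 10667.0):.3f} ms")
    print(f"Gigabit   : {transfer_ms(payload_mb, 125.0):.3f} ms")
```

Under these assumptions the bandwidth gap alone accounts for only a few milliseconds per page, so it explains part of the slowdown at most; per-request round trips are likely the larger factor.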