How do I set the maximum memory size that Redis can use? - nosql

To be specific, I only have 1GB of free memory and would like to use only 300MB for Redis. How can I configure it so that it is only uses up to 300MB of memory?
Out of curiosity, what happen when you try to insert a new data and Redis is already used all the memory allocated?

maxmemory is the correct configuration option to prevent Redis from using too much RAM.
If an insert causes maxmemory to be exceeded, the insert operation will sometimes fail.
Redis will do everything in its power to prevent the operation from failing, though. In the newer versions of Redis, you can configure the memory reclaiming policies in the configuration, as well by setting the maxmemory-policy option.
Also, if you have virtual memory options turned on, Redis will begin to store stale data to the disk.
More info:
What does Redis do when it runs out of memory?

You can do that using maxmemory option: maxmemory 314572800 means 300mb.

Since the last answer is from 2011. I am going to write some updated information for users reading in 2019 using Ubuntu 18.04. The configuration file is located in /etc/redis/redis.conf and if you have installed using (default/recommended method) apt install redis-server the default memory limit is set to "0" which practically means there is "no limit" which can be troublesome if user has limited/small amount of RAM/memory. To set your custom memory limit you may simply edit configuration file and type "maxmemory 1gb" as the very first line. Restart redis service for changes to take effect. To verify changes use redis-cli config get maxmemory
Ubuntu 18.04 users may read more here: How to install and configure REDIS on Ubuntu 18.04

Related

AWS RDS with Postgres : Is OOM killer configured

We are running load test against an application that hits a Postgres database.
During the test, we suddenly get an increase in error rate.
After analysing the platform and application behaviour, we notice that:
CPU of Postgres RDS is 100%
Freeable memory drops on this same server
And in the postgres logs, we see:
2018-08-21 08:19:48 UTC::#:[XXXXX]:LOG: server process (PID XXXX) was terminated by signal 9: Killed
After investigating and reading documentation, it appears one possibility is linux oomkiller running having killed the process.
But since we're on RDS, we cannot access system logs /var/log messages to confirm.
So can somebody:
confirm that oom killer really runs on AWS RDS for Postgres
give us a way to check this ?
give us a way to compute max memory used by Postgres based on number of connections ?
I didn't find the answer here:
http://postgresql.freeideas.cz/server-process-was-terminated-by-signal-9-killed/
https://www.postgresql.org/message-id/CAOR%3Dd%3D25iOzXpZFY%3DSjL%3DWD0noBL2Fio9LwpvO2%3DSTnjTW%3DMqQ%40mail.gmail.com
https://www.postgresql.org/message-id/04e301d1fee9%24537ab200%24fa701600%24%40JetBrains.com
AWS maintains a page with best practices for their RDS service: https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_BestPractices.html
In terms of memory allocation, that's the recommendation:
An Amazon RDS performance best practice is to allocate enough RAM so
that your working set resides almost completely in memory. To tell if
your working set is almost all in memory, check the ReadIOPS metric
(using Amazon CloudWatch) while the DB instance is under load. The
value of ReadIOPS should be small and stable. If scaling up the DB
instance class—to a class with more RAM—results in a dramatic drop in
ReadIOPS, your working set was not almost completely in memory.
Continue to scale up until ReadIOPS no longer drops dramatically after
a scaling operation, or ReadIOPS is reduced to a very small amount.
For information on monitoring a DB instance's metrics, see Viewing DB Instance Metrics.
Also, that's their recommendation to troubleshoot possible OS issues:
Amazon RDS provides metrics in real time for the operating system (OS)
that your DB instance runs on. You can view the metrics for your DB
instance using the console, or consume the Enhanced Monitoring JSON
output from Amazon CloudWatch Logs in a monitoring system of your
choice. For more information about Enhanced Monitoring, see Enhanced
Monitoring
There's a lot of good recommendations there, including query tuning.
Note that, as a last resort, you could switch to Aurora, which is compatible with PostgreSQL:
Aurora features a distributed, fault-tolerant, self-healing storage
system that auto-scales up to 64TB per database instance. Aurora
delivers high performance and availability with up to 15 low-latency
read replicas, point-in-time recovery, continuous backup to Amazon S3,
and replication across three Availability Zones.
EDIT: talking specifically about your issue w/ PostgreSQL, check this Stack Exchange thread -- they had a long connection with auto commit set to false.
We had a long connection with auto commit set to false:
connection.setAutoCommit(false)
During that time we were doing a lot
of small queries and a few queries with a cursor:
statement.setFetchSize(SOME_FETCH_SIZE)
In JDBC you create a connection object, and from that connection you
create statements. When you execute the statments you get a result
set.
Now, every one of these objects needs to be closed, but if you close
statement, the entry set is closed, and if you close the connection
all the statements are closed and their result sets.
We were used to short living queries with connections of their own so
we never closed statements assuming the connection will handle the
things once it is closed.
The problem was now with this long transaction (~24 hours) which never
closed the connection. The statements were never closed. Apparently,
the statement object holds resources both on the server that runs the
code and on the PostgreSQL database.
My best guess to what resources are left in the DB is the things
related to the cursor. The statements that used the cursor were never
closed, so the result set they returned never closed as well. This
meant the database didn't free the relevant cursor resources in the
DB, and since it was over a huge table it took a lot of RAM.
Hope it helps!
TLDR: If you need PostgreSQL on AWS and you need rock solid stability, run PostgreSQL on EC2 (for now) and do some kernel tuning for overcommitting
I'll try to be concise, but you're not the only one who has seen this and it is a known (internal to Amazon) issue with RDS and Aurora PostgreSQL.
OOM Killer on RDS/Aurora
The OOM killer does run on RDS and Aurora instances because they are backed by linux VMs and OOM is an integral part of the kernel.
Root Cause
The root cause is that the default Linux kernel configuration assumes that you have virtual memory (swap file or partition), but EC2 instances (and the VMs that back RDS and Aurora) do not have virtual memory by default. There is a single partition and no swap file is defined. When linux thinks it has virtual memory, it uses a strategy called "overcommitting" which means that it allows processes to request and be granted a larger amount of memory than the amount of ram the system actually has. Two tunable parameters govern this behavior:
vm.overcommit_memory - governs whether the kernel allows overcommitting (0=yes=default)
vm.overcommit_ratio - what percent of system+swap the kernel can overcommit. If you have 8GB of ram and 8GB of swap, and your vm.overcommit_ratio = 75, the kernel will grant up to 12GB or memory to processes.
We set up an EC2 instance (where we could tune these parameters) and the following settings completely stopped PostgreSQL backends from getting killed:
vm.overcommit_memory = 2
vm.overcommit_ratio = 75
vm.overcommit_memory = 2 tells linux not to overcommit (work within the constraints of system memory) and vm.overcommit_ratio = 75 tells linux not to grant requests for more than 75% of memory (only allow user processes to get up to 75% of memory).
We have an open case with AWS and they have committed to coming up with a long-term fix (using kernel tuning params or cgroups, etc) but we don't have an ETA yet. If you are having this problem, I encourage you to open a case with AWS and reference case #5881116231 so they are aware that you are impacted by this issue, too.
In short, if you need stability in the near term, use PostgreSQL on EC2. If you must use RDS or Aurora PostgreSQL, you will need to oversize your instance (at additional cost to you) and hope for the best as oversizing doesn't guarantee you won't still have the problem.

AWS EB should create new instance once my docker reached its maximum memory limit

I have deployed my dockerized micro services in AWS server using Elastic Beanstalk which is written using Akka-HTTP(https://github.com/theiterators/akka-http-microservice) and Scala.
I have allocated 512mb memory size for each docker and performance problems. I have noticed that the CPU usage increased when server getting more number of requests(like 20%, 23%, 45%...) & depends on load, then it automatically came down to the normal state (0.88%). But Memory usage keeps on increasing for every request and it failed to release unused memory even after CPU usage came to the normal stage and it reached 100% and docker killed by itself and restarted again.
I have also enabled auto scaling feature in EB to handle a huge number of requests. So it created another duplicate instance only after CPU usage of the running instance is reached its maximum.
How can I setup auto-scaling to create another instance once memory usage is reached its maximum limit(i.e 500mb out of 512mb)?
Please provide us a solution/way to resolve these problems as soon as possible as it is a very critical problem for us?
CloudWatch doesn't natively report memory statistics. But there are some scripts that Amazon provides (usually just referred to as the "CloudWatch Monitoring Scripts for Linux) that will get the statistics into CloudWatch so you can use those metrics to build a scaling policy.
The Elastic Beanstalk documentation provides some information on installing the scripts on the Linux platform at http://docs.aws.amazon.com/elasticbeanstalk/latest/dg/customize-containers-cw.html.
However, this will come with another caveat in that you cannot use the native Docker deployment JSON as it won't pick up the .ebextensions folder (see Where to put ebextensions config in AWS Elastic Beanstalk Docker deploy with dockerrun source bundle?). The solution here would be to create a zip of your application that includes the JSON file and .ebextensions folder and use that as the deployment artifact.
There is also one thing I am unclear on and that is if these metrics will be available to choose from under the Configuration -> Scaling section of the application. You may need to create another .ebextensions config file to set the custom metric such as:
option_settings:
aws:elasticbeanstalk:customoption:
BreachDuration: 3
LowerBreachScaleIncrement: -1
MeasureName: MemoryUtilization
Period: 60
Statistic: Average
Threshold: 90
UpperBreachScaleIncrement: 2
Now, even if this works, if the application will not lower its memory usage after scaling and load goes down then the scaling policy would just continue to trigger and reach max instances eventually.
I'd first see if you can get some garbage collection statistics for the JVM and maybe tune the JVM to do garbage collection more often to help bring memory down faster after application load goes down.

Memory issues with new version of Mongodb

I'm using Mongodb on my Windows server 2012 for more than two years. Since the last update some weird issues started to happen which in the end lead to usage of the entire RAM memory.
The service Iv'e configured for Mongodb is as follows:
logpath=d:\data\log\mongod.log
dbpath=d:\data\db
storageEngine=wiredTiger
rest=true
#override port
port=27017
#configsvr = true
shardsvr = true
And in order to limit the Cache memory usage Iv'e added the following line:
wiredTigerCacheSizeGB=10
And this is where the weird stuff started happening. When I check the task manager it says that now Mongodb is really limited to 10GB as I defined in the service but it is actually using a lot more than 10GB.
In the first image you can see the memory consumption sorted by RAM consumption
While in fact the machine I'm using has 28GB in total
This crazy consumption leads to failure in the scripts I'm running, even the most basic ones, even when I only run simple queries like 'count' or 'distinct', I believe that this is a direct results of the memory consumption.
When I checked the log files I saw that there are many open connections that even when the session ends it indicates that still the same amount of connections is opened:
So in the end I have two major questions:
1. Is there a way of solving this issue without downgrading the Mongodb version?
2. The config file looks right? is everything there is necessary?
Memory usage in WiredTiger is a two-level cache:
First is the WiredTiger cache as controlled by --wiredTigerCacheSizeGB
Second is the Operating System filesystem cache. MongoDB automatically uses all free memory that is not used by the WiredTiger cache or by other processes
See also WiredTiger memory usage
For OS filesystem cache, MongoDB doesn't manage the memory it uses directly - it lets the OS manage it. Windows will try to use every last scrap of physical memory if it can - but lots of it should and will be thrown out if other processes request memory.
An alternative is to run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM available in a system.
Having said the above:
You are also running another database in the server i.e. mysqld. MongoDB, like some databases will perform better on a dedicated server to reduce memory contention.
Task Manager shows mongod is using 10GB, although the machine is using up to ~28GB. This may or may not be mongod as you have other processes as well.
Useful resources:
FAQ: Memory diagnostics for WiredTiger
FAQ: MongoDB Cache Handling
MongoDB Production Notes

Running MongoDB and Redis on two different containers in the same host machine

I have read somewhere that MongoDB and Redis server shouldn't be executed in the same host because the way that Redis manages the memory damages MongoDb. This is before Docker.io. But now thing seems are pretty different or not? Is is convenient running Redis server and MongoDB on two different containers on the same host machine?
Docker does not change your hardware, also it is the OS that deals with resources which is not virtualized so the same rules as a normal hardware should apply here.
RAM
MongoDB and Redis don't share any memory. The problem of using the same host will be that you can run out of RAM with these two processes, you can put a max size for redis, you can probably do the same for MongoDB, it is mandatory.
If your sizing is good (MongoDB RAM + Redis RAM < Hardware RAM), you won't get any swap on disk for redis (which is absolutely what you want to prevent) but maybe mongodb cache won't be as good (not enough place for optimization). Less memory for redis is always a challenge if your data grows: beware of out of memory if the data size is unpredictable!
If you use backups with redis, it uses more RAM than its dataset to produce the dump, so beware of that. It implies also using IO.
IO
In this case (less RAM) mongo will do a lot more of IO to access data. Redis, depending on your backup policy, can use IO or not (your choice). Worst case: if you use AOF on redis, it is a lot of IO so maybe IO can become a bottleneck in this architecture. If you don't use backups with redis: you won't have problems. Also a SSD is a good choice for Mongo.
CPU
I don't know if MongoDB uses a lot of CPU, but redis most of the time does not except during backups. If you use backups with redis: try to have two CPU cores available for it (one for redis, one for backup task).
Network
It depends on your number of clients. But you should check the throughput / input load of your machine to see if you are not saturating (using monit for instance with alerts). Sometimes it is the bottleneck, not enought throughput in one machine!
Many of today's services, in particular Databases, are very aggressive consuming resources and are designed thinking they will (or should) be executed in a dedicated machine for them. MongoDB and Redis try to keep a lot of data in memory and will try to take the more memory they can for themselves. To avoid this services take all the memory of your host machine you can limit the maximum memory used by a container using -m="<number><optional unit>" in docker run. E.g.: docker run -d -m="2g" -p 27017:27017 --name mongodb dockerfile/mongodb
So you can control in an easy way the resource limits of your services, and run them in the same host with a fine grained control of the resources. Anyway it's important to consider that the performance of these services is designed thought that the resources of the host machine will be fully available for them. For example there are other databases as Cassandra that will consume a lot of memory, and furthermore, are designed to have sequential access writing to disk. In these cases Docker will let you to run limiting the resources used, but if you run multiple services in the same host the performance of them will decrease severely.

MongoDB single server production setup

I am developing a server to a customer who has only one machine for his production deployment.
It's a CentOS 64bit with 8Gb of memory.
I am using Mongo and the question is, do I still need to deploy a replica set even though it's a single machine?
Will I get the advantages of a replica set or since it's a single machine it really does not matter and journaling is enough?
You definitely have to enable journaling (It will ensure consistent state even in cases of HW failure scenarios, you will not have to run costy repair command after a crash). You should enable RAID under the data directrory (Anyway this is in general recommended), while here will be crucial not to lose data due to a disk failure (You do not have copy on an other box or so). There is no option for HA within one box it is quite straightforward, however it is not harmful, and in some cases useful to configure 1 node (1 mongod) replicaset (Than you will have oplog). This will help for example when you likely to have MMS backup, or just for enable point in time backup feature of mongodump. Later if you will likely to scale out for HA this way you will only have to add the new nodes to your initially established replicaset.
Make no sense to run several replicas inside one box, while they will race on HW resources and will bring nothing as an advantage.