I am running a Spark job (spark-submit) and keep hitting OutOfMemory and "too many open files" errors. I have searched all over but couldn't find anything helpful.
Can somebody help me increase the default memory settings on Amazon EMR?
[hadoop#ip-10-0-52-76 emr]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 31862
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 31862
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
I believe increasing the Java heap size and the open files limit will resolve my issue.
For reference, I am using an r3.4xlarge EMR cluster. Thanks.
On EMR you can change the memory settings in the /etc/spark/conf/spark-defaults.conf file.
If tasks are running out of memory, you should increase your executor memory. Choose the executor memory based on your data size.
spark.executor.memory 5120M
In case the driver throws an OutOfMemory error, you can increase the driver memory.
spark.driver.memory 5120M
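If the job is also hitting "too many open files", the nofile limit has to be raised on the cluster nodes in addition to the Spark memory settings. A rough sketch of both, based on the defaults shown in the question (the 20480 limit is an illustrative value, and applying it via /etc/security/limits.conf on every node, e.g. from a bootstrap action, is just one common approach):

# /etc/spark/conf/spark-defaults.conf
spark.executor.memory   5120M
spark.driver.memory     5120M

# /etc/security/limits.conf -- raise the open-files limit for the hadoop user
hadoop  soft  nofile  20480
hadoop  hard  nofile  20480

Re-login (or restart the affected services) and check the new limit with ulimit -n before resubmitting the job.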
We are running Kafka version 2.4.0. After 4-5 days of running, the application dies without any logs. We have a 20 GB box with Xmx and Xms set to 5 GB. The GC activity of the application is healthy and there are no GC issues. I don't see the OOM killer being invoked, as checked from the system logs. There was 13 GB of available memory when the process died.
total used free shared buff/cache available
Mem: 19 5 0 0 13 13
Swap: 0 0 0
The root cause was the vm.max_map_count limit (default being ~65k) being hit by the application. We concluded this by looking at the
jmx.java.nio.BufferPool.mapped.Count
metric in the JMX MBeans.
Another way to check this is
cat /proc/<kafka broker pid>/maps | wc -l
Updating the max_map_count limit fixed the issue for us.
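For reference, a minimal sketch of checking and raising the limit with sysctl (262144 is an arbitrary example value, not a recommendation):

# current limit and current number of mappings held by the broker
sysctl vm.max_map_count
cat /proc/<kafka broker pid>/maps | wc -l

# raise the limit at runtime ...
sudo sysctl -w vm.max_map_count=262144

# ... and persist it across reboots
echo 'vm.max_map_count=262144' | sudo tee -a /etc/sysctl.conf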
Other ways to fix this issue could have been:
Increasing the segment roll time or the segment size at which a new segment is created (see the sketch below), so that each partition holds fewer memory-mapped segment files.
Running more instances so that each instance is assigned a smaller number of partitions.
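A minimal sketch of the broker settings involved in the first option, assuming no topic-level overrides are in place (the values are illustrative only):

# server.properties
log.segment.bytes=2147483647    # roll a new segment after ~2 GB instead of the 1 GB default
# log.roll.ms controls the time-based roll threshold (default 7 days)

Fewer, larger segments mean fewer mapped files per partition, which keeps the mapping count further away from vm.max_map_count.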
I am using DSBulk to unload data into CSV from a DSE cluster installed under Kubernetes. My cluster consists of 9 Kubernetes pods, each with 120 GB of RAM.
I monitored the resources while unloading the data and observed that the more data is fetched into CSV, the more RAM is used, and pods are restarting due to lack of memory.
If only one pod is down at a time, the DSBulk unload won't fail, but if 2 pods are down the unload fails with the exception:
Cassandra timeout during read query at consistency LOCAL_ONE (1 responses were required but
only 0 replica responded).
Is there a way to avoid this excessive memory usage, or is there a way to increase the timeout duration?
The command I am using is:
dsbulk unload -maxErrors -1 -h '["< My Host >"]' -port 9042 -u < My user name >
-p < Password > -k < Key Space > -t < My Table > -url < My Table >
--dsbulk.executor.continuousPaging.enabled false --datastax-java-driver.basic.request.page-size 1000
--dsbulk.engine.maxConcurrentQueries 128 --driver.advanced.retry-policy.max-retries 100000
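On the timeout part of the question: DSBulk forwards any option prefixed with --datastax-java-driver. to the Java driver, as the page-size setting above already does, so the driver's request timeout (basic.request.timeout in driver 4.x) can be raised the same way. A sketch (the "5 minutes" value is an arbitrary example, not a recommendation):

dsbulk unload <other options as above> \
  --datastax-java-driver.basic.request.timeout "5 minutes"

This only gives slow reads more time; it does not address the memory pressure on the pods described in the answer below.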
After a lot of trial and error, we found that the problem was the Kubernetes Cassandra pods using the host server's memory size as the Max Direct Memory Size, rather than the pod's assigned RAM limit.
The pods were assigned 120 GB of RAM, but Cassandra on each pod was assigning 185 GB of RAM to file_cache_size, which made the unloading process fail because Kubernetes was restarting every pod that used more than 120 GB of RAM.
The reason is that Max Direct Memory Size is calculated as:
Max direct memory = (system memory - JVM heap size) / 2
Each pod was therefore using 325 GB as its Max Direct Memory Size, and each pod's file_cache_size is automatically set to half of the Max Direct Memory Size value, so whenever a pod requested more than 120 GB of memory, Kubernetes restarted it.
The solution was to set the Max Direct Memory Size as an environment variable in the Kubernetes cluster's YAML file with an explicit value, or to override it by setting the file_cache_size value in each pod's Cassandra YAML file.
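A minimal sketch of the environment-variable approach, assuming the container image appends JVM_EXTRA_OPTS to the JVM flags the way the stock cassandra-env.sh does (the 60g figure is illustrative; pick a value that, together with the heap, stays inside the pod's 120 GB limit):

# pod / statefulset spec (excerpt)
env:
  - name: JVM_EXTRA_OPTS
    value: "-XX:MaxDirectMemorySize=60g"

Alternatively, cap the cache directly in the Cassandra config; in Apache Cassandra the setting is file_cache_size_in_mb (check the exact name for your DSE version):

file_cache_size_in_mb: 30720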
I am monitoring a WebLogic server using the JConsole tool. I found there is no memory leak in the heap, but I see the resident memory size growing very high, and it does not come down even when the heap drops under 1 GB. I have a 6 GB heap size and 12 GB of RAM. A single Java process is holding most of the memory. I am using WebLogic 9 and JDK 1.5.
Once the server is restarted the memory comes down, but then it starts growing again and reaches the maximum within a short time.
-Xms1024m -Xmx6144m
Can someone help me resolve this issue? Thanks in advance.
We have a data warehouse server running on Debian Linux. We are using PostgreSQL, Jenkins, and Python.
For the past few days a lot of memory has been consumed by Jenkins and Postgres. I tried everything I could find on Google, but the issue is still there.
If anyone can give me a lead on how to reduce this memory consumption, it would be very helpful.
Below is the output from free -m:
total used free shared buff/cache available
Mem: 63805 9152 429 16780 54223 37166
Swap: 0 0 0
Below are the postgresql.conf file, the system configuration, and the results from htop (posted as images).
Please don't post text as images. It is hard to read and process.
I don't see your problem.
Your machine has 64 GB of RAM: 16 GB are used for PostgreSQL shared memory as you configured, 9 GB are private memory used by processes, and 37 GB are available (the available column).
Linux uses available memory for the file system cache, which boosts PostgreSQL performance. The low value for free just means that the cache is in use.
For Jenkins, run it with these Java options:
JAVA_OPTS="-Xms200m -Xmx300m -XX:PermSize=68m -XX:MaxPermSize=100m"
For Postgres, start it with the option:
-c shared_buffers=256MB
These values are the ones I use on a small homelab with 8 GB of memory; you might want to increase them to match your hardware.
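A minimal sketch of where those settings typically live on a Debian box (the Debian Jenkins package reads JAVA_ARGS from /etc/default/jenkins, while other installs use JAVA_OPTS; paths and values remain illustrative):

# /etc/default/jenkins
JAVA_ARGS="-Xms200m -Xmx300m"

# postgresql.conf
shared_buffers = 256MB

Restart both services afterwards for the changes to take effect.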
I'm running MongoDB 3.4 on a t2.micro EC2 instance (Amazon Linux 2.0 (2017.12)).
Following is the ulimit -a configuration on the instance:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 3867
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 50000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 3867
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
You can see that the open files limit is set to 50000.
So I expected that mongod would allow nearly 50,000 simultaneous connections. But I'm unable to get more than 4077 connections at the same time. In /var/log/mongodb/mongod.log I can see that the current open connections count is 4077 and new connections are being rejected because mongod fails to create threads for the new connection requests.
I'm not even able to connect to the mongo shell from the terminal; it's unable to create the sockets. I can connect to the DB if I release the 4077 connections that are currently open.
How can I specify the maximum simultaneous connections within the mongo config file? Do I change any other parameters in the OS environment like ulimit?
I can see that the current open connections is 4077 and new connections are getting rejected because it fails to create threads for new connection requests.
A t2.micro instance only provides 1GiB of RAM. Each database connection will use up to 1 MB of RAM, so with 4000+ connections you are likely to have exhausted the available resources of your server. Assuming you are using the default WiredTiger storage engine in MongoDB 3.4, you probably have 256MB of RAM allocated to the WiredTiger cache by default and the remaining memory has to be shared between connection threads, your O/S, and any other temporary memory allocation required by mongod.
How can I specify the maximum simultaneous connections within the mongo config file? Do I change any other parameters in the OS environment like ulimit?
Resource limits are intended to impose a reasonable upper bound so a system administrator can intervene before the system becomes non-responsive. There are two general categories of limits for connections: those imposed by your MongoDB server configuration (in this case net.maxIncomingConnections ) and those imposed by your operating system (ulimit -a).
In MongoDB 3.4, net.maxIncomingConnections defaults to 65,536 simultaneous connections, so ulimit settings for files or threads are typically reached before the connection limit.
For a server with more capacity than a t2.micro, it typically makes sense to increase limits from the default. However, given the limited resources of a t2.micro instance I would actually recommend reducing limits if you want your deployment to be stable.
For example, a more realistic limit would be to set net.maxIncomingConnections to 100 connections (or an expected max of 100MB of RAM for connections). In your case you are aiming for 50,000 connections so you could either set that value or leave the default (65,536) and rely on ulimit restrictions.
Your ulimit settings already allow more consumption than your instance can reasonably cope with, but the MongoDB manual has a reference if you'd like to Review and Set Resource Limits. You could consider increasing your -u value (max processes/threads) as this is likely the current ceiling you are hitting, but as with connections I would consider what is reasonable given available resources and your workload.
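For example, a minimal mongod.conf sketch capping connections at the 100 suggested above (tune the number to your workload):

# /etc/mongod.conf
net:
  maxIncomingConnections: 100

Restart mongod after the change and confirm the limit with db.serverStatus().connections from the mongo shell.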