How does Kafka Connect behave when a connector has a memory leak? - apache-kafka

I launched the Kafka Connect image and configured about 25 running source and sink connectors.
When I drop inside this container I see only 1 java process:
root@connect:/# ps -ef | grep java
root 1 0 3 Jun20 ? 01:32:06 java -Xms256M -Xmx2G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dkafka.logs.dir=/var/log/kafka -Dlog4j.configuration=file:/etc/kafka/connect-log4j.properties -cp /etc/kafka-connect/jars/*:/usr/share/java/kafka/*:/usr/share/java/confluent-common/*:/usr/share/java/kafka-serde-tools/*:/usr/share/java/monitoring-interceptors/*:/usr/bin/../share/java/kafka/*:/usr/bin/../share/java/confluent-support-metrics/*:/usr/share/java/confluent-support-metrics/* org.apache.kafka.connect.cli.ConnectDistributed /etc/kafka-connect/kafka-connect.properties
root 6263 6252 0 08:58 pts/1 00:00:00 grep java
root@connect:/#
Does it mean that a memory leak in one running custom connector will crash the whole Kafka Connect node?

All connectors and their tasks run as threads inside a single JVM. But yes, if even one connect task hits an OutOfMemoryError, it will bring down the whole JVM, which is why you should add more workers (since you're running distributed mode) and increase the heap beyond the 2G max by setting KAFKA_HEAP_OPTS.
Also, if running containers, a typical pattern is one container per topic grouping. For example, if 5 topics are going to Elasticsearch, 2 others are going to HDFS, and 4 others to JDBC, that would make 3 separate containers. That way your "blast radius" for a failed java process is smaller.
If you're using the Confluent containers, set CONNECT_GROUP_ID to be the same within each set of containers, and make sure each grouping you make has its own config, offset, and status topics, as in the sketch below.
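A minimal sketch, assuming the confluentinc/cp-kafka-connect image; the container name, heap sizes, and topic names are illustrative, and other required CONNECT_* settings (converters, REST advertised host name, replication factors) are omitted for brevity:
docker run -d --name connect-elasticsearch \
  -e KAFKA_HEAP_OPTS="-Xms1g -Xmx4g" \
  -e CONNECT_BOOTSTRAP_SERVERS="kafka:9092" \
  -e CONNECT_GROUP_ID="connect-elasticsearch" \
  -e CONNECT_CONFIG_STORAGE_TOPIC="connect-elasticsearch-configs" \
  -e CONNECT_OFFSET_STORAGE_TOPIC="connect-elasticsearch-offsets" \
  -e CONNECT_STATUS_STORAGE_TOPIC="connect-elasticsearch-status" \
  confluentinc/cp-kafka-connect:latest
Repeat with a different CONNECT_GROUP_ID and different topic names for the HDFS and JDBC groupings, so a leak in one grouping only takes down that worker.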

Related

Kafka broker dying abruptly without any error log

We are running Kafka version 2.4.0. After 4-5 days of running, the application dies without any logs. We have a 20 GB box with Xmx and Xms set to 5 GB. The GC activity of the application is healthy and there are no GC issues. I don't see the OOM killer being invoked, as checked from the system logs. There was 13 GB of available memory when the process died:
       total  used  free  shared  buff/cache  available
Mem:      19     5     0       0          13         13
Swap:      0     0     0
The root cause for this was the vm.max_map_count limit (default being around 65k) being hit by the application. We concluded this by looking at the
jmx.java.nio.BufferPool.mapped.Count
metric in the JMX MBean.
Another way to check this is
cat /proc/<kafka broker pid>/maps | wc -l
Updating the max_map_count limit fixed the issue for us; a sketch of the relevant commands follows the list below.
Other ways to fix this issue could have been:
Increasing the segment roll time or size so that fewer segments (and therefore fewer memory maps) are created.
Running more instances so that each instance gets assigned fewer partitions.
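A hedged sketch of the commands involved (the PID lookup and the new limit value are illustrative, not a prescription):
cat /proc/$(pgrep -f kafka.Kafka)/maps | wc -l                  # map areas currently used by the broker
sysctl vm.max_map_count                                         # current limit (often 65530 by default)
sudo sysctl -w vm.max_map_count=262144                          # raise it for the running kernel
echo 'vm.max_map_count=262144' | sudo tee -a /etc/sysctl.conf   # persist the change across reboots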

How to limit the Kafka server memory?

I want to set the maximum and minimum memory values for the Kafka server [port 9092].
Let's say the maximum value is 2 GB; then memory usage should not exceed 2 GB, but currently it does.
I have this link - https://kafka.apache.org/documentation/#java
Config From Apache site
-Xmx6g -Xms6g -XX:MetaspaceSize=96m -XX:+UseG1GC
-XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:G1HeapRegionSize=16M
-XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80 -XX:+ExplicitGCInvokesConcurrent
But I don't know how to configure it.
My goal is to set a maximum memory limit so that the memory value shown in the Kubernetes Dashboard does not exceed it.
Note -
The maximum memory limit should not be set at the Kubernetes pod level; it should be set while starting ZooKeeper, the Kafka server, and Kafka Connect.
Currently the process starts with -Xmx1G -Xms256M.
Depending on the image you are using for Kafka, you can supply these settings via the environment variable KAFKA_OPTS.
The documentation you are referring to supplies these options to the call to 'java'. Kafka, ZooKeeper etc. are jars and are therefore started via java.
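A minimal sketch, assuming the broker is started from the standard distribution scripts (the 2 GB figure simply mirrors the goal stated in the question):
export KAFKA_HEAP_OPTS="-Xms2g -Xmx2g"               # cap the broker heap at 2 GB
bin/kafka-server-start.sh config/server.properties
# the same variable is honoured by bin/zookeeper-server-start.sh and bin/connect-distributed.sh
Keep in mind that -Xmx only bounds the JVM heap; the total process memory reported by the Kubernetes Dashboard will still be somewhat higher because of metaspace, thread stacks, and direct buffers.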

Kafka failed to map 1073741824 bytes for committing reserved memory

I am installing Kafka on an AWS t2 instance (one that has 1 GB of memory).
(1) I download kafka_2.11-0.9.0.0
(2) I run zookeeper bin/zookeeper-server-start.sh config/zookeeper.properties
(3) I try running bin/kafka-server-start.sh config/server.properties
and I get
Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000c0000000, 1073741824, 0) failed; error='Cannot allocate memory' (errno=12)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 1073741824 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /home/machine_name/kafka_2.11-0.9.0.0/hs_err_pid9161.log
I checked all the properties in the server.properties config file and in the documentation for anything that could cause this, but couldn't find anything.
Does anyone know why Kafka is trying to allocate 1 GB when starting?
Kafka defaults to the following JVM memory parameters, which mean that Kafka will allocate 1 GB at startup and use a maximum of 1 GB of memory:
-Xmx1G -Xms1G
Just set the KAFKA_HEAP_OPTS env variable to whatever you want to use instead. You may also just edit ./bin/kafka-server-start.sh and replace the values.
If you have less memory available, try reducing the heap size, e.g.
-Xmx400M -Xms400M for both ZooKeeper and Kafka, as sketched below.
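A hedged example for a 1 GB instance (the values are illustrative; 400M follows the suggestion above, with ZooKeeper given even less):
export KAFKA_HEAP_OPTS="-Xmx256M -Xms128M"           # smaller heap for ZooKeeper
bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
export KAFKA_HEAP_OPTS="-Xmx400M -Xms400M"           # heap for the broker itself
bin/kafka-server-start.sh -daemon config/server.properties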
This issue might also relate to the maximum number of memory map areas allocated. It throws exactly the same error.
To remedy this you can run the following:
sysctl -w vm.max_map_count=200000
You want to set this in relation to your File Descriptor Limits. In summary, for every log segment on a broker, you require two map areas - one for index and one for time index.
For reference see the Kafka OS section: https://kafka.apache.org/documentation/#os
I was getting Java IO Exception: Map failed while starting the Kafka server. Analyzing previous logs, it looked like it failed because of insufficient memory in the Java heap while loading logs. I changed the maximum memory size but was not able to fix it. Finally, after more research on Google, I learned that I had downloaded the 32-bit version of Java; downloading the 64-bit version of Java solved my problem.
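If useful, a quick way to check which JVM build is installed:
java -version        # a 64-bit build prints something like "64-Bit Server VM" in its last line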
Pass the KAFKA_HEAP_OPTS environment variable with the memory value you want to run with.
Make sure to pass it in quotes - KAFKA_HEAP_OPTS="-Xmx512M -Xms512M"
docker run -it --rm --network app-tier -e KAFKA_HEAP_OPTS="-Xmx512M -Xms512M" -e KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper-server:2181 bitnami/kafka:latest kafka-topics.sh --list --bootstrap-server kafka-server:9092

How to check orientdb disk cache size at runtime?

I'm trying to tune an OrientDB 2.1.5 embedded application, reducing disk I/O to a minimum.
I've read the documentation (http://orientdb.com/docs/last/Performance-Tuning.html) and used the storage.diskCache.bufferSize flag to increase the disk cache size.
Looking at htop and top (I'm on Linux) I've not noticed any increase in the java process's memory usage though. Even the two MBeans exposed by Orient (O2QCacheMXBean and OWOWCacheMXBean) don't show any evidence of the increase. So how can I be sure of the current disk cache size?
This is part of my java command line:
java -server -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=9999 -Djava.rmi.server.hostname=192.168.20.154 -Dstorage.useWAL=false -Dstorage.wal.syncOnPageFlush=false -Dstorage.diskCache.bufferSize=16384 -Dtx.useLog=false -Xms4g -Xmx4g -XX:+PerfDisableSharedMem -XX:+PrintGCDetails -XX:+PrintTenuringDistribution -XX:+PrintGCTimeStamps -Xloggc:/home/nexse/local/gc.log -jar my.jar
Thanks a lot.

Cassandra Node Memory Usage Imbalance

I am using Cassandra 1.2 with the new MurMur3Partitioner on centos.
On a 2 node cluster, both set up with num_tokens=256,
I see that one node is using much more memory than the other after inserting a couple million rows with CQL3.
When I run the free command,
it shows 6 GB usage on the second node and 1 GB on the seed node.
However, when running
ps -e -o pid,vsz,comm= | sort -n -k 2
It shows the java process using about 6.8 GB on each node.
Note that I have
MAX_HEAP_SIZE="4GB"
HEAP_NEWSIZE="400M"
set in cassandra-env.sh on each node.
Can anyone provide some insight?
This is most likely related to the general difficulties around reporting accurate memory utilization in Linux, especially as it relates to Java processes. Since Java processes reserve and allocate memory automatically, what the operating system sees can be misleading. The best way to understand what a Java process is doing is using JMX to monitor heap utilization. Tools such as VisualVM and jconsole work well for this.
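As a rough sketch (the process lookup and the default Cassandra JMX port 7199 are assumptions about a typical setup), heap occupancy can be sampled directly instead of relying on free or ps:
pid=$(pgrep -f CassandraDaemon)      # find the Cassandra JVM
jstat -gc "$pid" 5s                  # sample GC/heap statistics every 5 seconds
jconsole localhost:7199              # or attach a JMX console to watch the heap graphically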