ZooKeeper overall data size limit

I am new to Zookeeper, trying to understand if it fits for my use case.
I have about 10 million hierarchical key-value entries that I want to store in ZooKeeper.
Each key and each value is about 1KB, so every entry is roughly 2KB.
So the total data size is approximately 20GB (10M * 2KB) without replication.
I know the per-znode data size limit is 1MB (which can be changed).
Questions:
Will ZooKeeper be able to support 20GB of data without a performance impact?
Is there a maximum size after which performance degrades?
Is there a limit to the total number of nodes?

ZooKeeper is in no way suitable for this use case. ZooKeeper periodically dumps/snapshots its entire data tree to disk, which means it would be writing the whole 20GB of data every few minutes. Moreover, the ZooKeeper nodes in a cluster/ensemble are replicas of one another, so the whole data set is replicated to every ZooKeeper node; there is no data partitioning either. ZooKeeper is not a database.
For your use case it would be much better to go with a database or a distributed cache (Redis, Hazelcast, etc.).
As for the last question: there is no limit on the total number of znodes in ZooKeeper.
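To put rough numbers on the snapshotting and replication points above, here is a back-of-the-envelope sketch; the 10M x ~2KB figures come from the question, while the 3-node ensemble is an assumed example:

```python
# Back-of-the-envelope sizing for storing the data set in ZooKeeper.
entries = 10_000_000
bytes_per_entry = 2 * 1024            # ~1KB key + ~1KB value per znode
ensemble_size = 3                     # assumed 3-node ensemble

raw_data_gb = entries * bytes_per_entry / 1024**3
print(f"raw data tree:              ~{raw_data_gb:.1f} GB")

# Every ensemble member keeps a full copy of the tree (no partitioning),
# and each periodic snapshot rewrites the whole tree to disk.
print(f"storage across ensemble:    ~{raw_data_gb * ensemble_size:.1f} GB")
print(f"I/O per snapshot, per node: ~{raw_data_gb:.1f} GB")
```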

Related

NiFi PutDatabaseRecord to Postgres database: Performance improvement in loading data into Postgres database

We are trying to load data from Oracle into Postgres using NiFi.
We are using PutDatabaseRecord to load the data (which is in Avro format).
We use ExecuteSQL to extract the data, which is very fast, but even with 150+ threads for PutDatabaseRecord we only see an average of about 1GB written per 5 minutes.
If we instead run 3 PutDatabaseRecord processors (one per table) with 50 threads each, the overall rate is still about 1GB per 5 minutes (e.g. 250MB for the first processor, 350MB for the second and 400MB for the third, or some other combination, but still roughly 1GB overall).
We are not sure whether the write throughput is being limited on the Postgres side or on the NiFi side.
We need help deciding whether to change NiFi properties or Postgres settings to improve the data loading performance.
One observation: data extraction from Oracle is very fast, and the NiFi queues fill up quickly while waiting to be processed by PutDatabaseRecord.
If you have a single NiFi instance, there will be a limit on how much data you can push through regardless of the number of threads (once the thread count reaches the number of cores on your machine). To increase throughput, you could set up a 3-5 node NiFi cluster and run the PutDatabaseRecord processors in parallel; then you should see 3-5x the throughput (roughly 3-5 GB per 5 minutes) to Postgres, as long as Postgres can handle that.
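For a rough sense of the numbers, here is a sketch based on the ~1GB per 5 minutes reported in the question; the linear scaling with cluster size is an assumption that only holds if Postgres keeps up:

```python
# Convert the observed rate into MB/s and project it for a small cluster.
observed_bytes = 1 * 1024**3          # ~1GB written...
window_s = 5 * 60                     # ...per 5-minute window
single_node_mb_s = observed_bytes / window_s / 1024**2
print(f"single NiFi node: ~{single_node_mb_s:.1f} MB/s")

# With a 3-5 node cluster running PutDatabaseRecord in parallel,
# throughput should scale roughly linearly (assuming Postgres keeps up).
for nodes in (3, 5):
    print(f"{nodes}-node cluster: ~{single_node_mb_s * nodes:.0f} MB/s "
          f"(~{nodes} GB per 5 minutes)")
```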

How much memory does a Kafka cluster need?

How can I calculate how much memory and CPU my Kafka cluster needs?
My cluster consists of 3 nodes, with a throughput of ~800 messages per second.
Currently each node has 6GB RAM, 2 CPUs and a 1TB disk, and that does not seem to be enough. How much would you allocate?
You would need to provide some more details about your use case, such as the average message size, but here are my 2 cents anyway:
Confluent's documentation might shed some light:
CPUs: Most Kafka deployments tend to be rather light on CPU requirements. As such, the exact processor setup matters less than the other resources. Note that if SSL is enabled, the CPU requirements can be significantly higher (the exact details depend on the CPU type and JVM implementation).
You should choose a modern processor with multiple cores. Common clusters utilize 24-core machines.
If you need to choose between faster CPUs or more cores, choose more cores. The extra concurrency that multiple cores offers will far outweigh a slightly faster clock speed.
How to compute your throughput
It might also be helpful to compute the throughput. For example, if you have 800 messages per second of 500 bytes each, then your throughput is 800*500/(1024*1024) ≈ 0.4MB/s. Now, if your topic is partitioned and you have 3 brokers up and running with replication factor 3, that leads to 0.4/3*3 = 0.4MB/s per broker.
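The same calculation as a small script; the 500-byte message size is the example value assumed above:

```python
# Throughput arithmetic from the example above.
messages_per_s = 800
message_bytes = 500                   # assumed average message size
brokers = 3
replication_factor = 3

throughput_mb_s = messages_per_s * message_bytes / 1024**2
print(f"ingest rate:           ~{throughput_mb_s:.2f} MB/s")

# Each broker takes ingest / brokers, times the replication factor,
# since every replica has to be written to some broker.
per_broker = throughput_mb_s / brokers * replication_factor
print(f"per-broker write rate: ~{per_broker:.2f} MB/s")
```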
More details regarding your architecture can be found in Confluent's whitepaper Apache Kafka and Confluent Reference Architecture. Here is the section on memory usage:
ZooKeeper uses the JVM heap, and 4GB RAM is typically sufficient. Too small of a heap will result in high CPU due to constant garbage collection, while too large a heap may result in long garbage collection pauses and loss of connectivity within the ZooKeeper cluster.
Kafka brokers use both the JVM heap and the OS page cache. The JVM heap is used for replication of partitions between brokers and for log compaction. Replication requires 1MB (the default replica.fetch.max.bytes) for each partition on the broker. In Apache Kafka 0.10.1 (Confluent Platform 3.1), we added a new configuration (replica.fetch.response.max.bytes) that limits the total RAM used for replication to 10MB, to avoid memory and garbage collection issues when the number of partitions on a broker is high. For log compaction, calculating the required memory is more complicated and we recommend referring to the Kafka documentation if you are using this feature. For small to medium-sized deployments, a 4GB heap size is usually sufficient. In addition, it is highly recommended that consumers always read from memory, i.e. from data that was written to Kafka and is still stored in the OS page cache. The amount of memory this requires depends on the rate at which this data is written and how far behind you expect consumers to get. If you write 20GB per hour per broker and you allow brokers to fall 3 hours behind in a normal scenario, you will want to reserve 60GB for the OS page cache. In cases where consumers are forced to read from disk, performance will drop significantly.
Kafka Connect itself does not use much memory, but some connectors buffer data internally for efficiency. If you run multiple connectors that use buffering, you will want to increase the JVM heap size to 1GB or higher.
Consumers use at least 2MB per consumer and up to 64MB in cases of large responses from brokers (typical for bursty traffic). Producers will have a buffer of 64MB each. Start by allocating 1GB RAM and add 64MB for each producer and 16MB for each consumer planned.
There are many different factors that need to be taken into consideration when it comes to tuning the configuration of your architecture. I would suggest going through the aforementioned documentation, monitoring your existing cluster and resources, and tuning them accordingly.
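Putting the whitepaper's rules of thumb quoted above into numbers, here is a small sketch; the write rate, allowed consumer lag, and producer/consumer counts are hypothetical placeholders:

```python
# Rough memory figures following the rules of thumb quoted above.
write_gb_per_hour = 20      # assumed write rate per broker
consumer_lag_hours = 3      # how far behind consumers are allowed to fall
producers = 10              # assumed number of producers (client side)
consumers = 20              # assumed number of consumers (client side)

broker_heap_gb = 4                                     # small/medium deployment
page_cache_gb = write_gb_per_hour * consumer_lag_hours
client_ram_gb = (1024 + 64 * producers + 16 * consumers) / 1024

print(f"broker JVM heap: ~{broker_heap_gb} GB")
print(f"OS page cache:   ~{page_cache_gb} GB (write rate x allowed consumer lag)")
print(f"client machines: ~{client_ram_gb:.1f} GB "
      "(1GB base + 64MB per producer + 16MB per consumer)")
```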
I think you want to start by profiling your Kafka cluster.
See the answer to this post: CPU Profiling kafka brokers.
It basically recommends that you use a Prometheus and Grafana stack to visualize your load on a timeline, from which you should be able to determine your bottleneck. It also links to an article that describes how.
You may also find that post interesting, because the poster seems to have about the same workload as you.

kafka + how to avoid running out of disk storage

I want to describe the following case that occurred on one of our production clusters.
We have an Ambari cluster with HDP version 2.6.4.
The cluster includes 3 Kafka machines, and each Kafka broker has a 5TB disk.
What we saw is that all the Kafka disks were at 100% usage, so the Kafka disks were full, and this is the reason all the Kafka brokers failed:
df -h /kafka
Filesystem Size Used Avail Use% Mounted on
/dev/sdb 5T 5T 23M 100% /var/kafka
After investigating, we saw that log.retention.hours was set to 7 days (168 hours).
So it seems that purging happens only after 7 days, and maybe this is the reason the Kafka disks reached 100% usage even though they are huge (5TB).
What we want to know now is how to avoid this situation in the future.
So:
How can we avoid reaching full capacity on the Kafka disks?
What do we need to set in the Kafka config in order to purge data according to the disk size? Is that possible?
And how do we determine the right value for log.retention.hours? Based on the disk size, or something else?
In Kafka, there are two types of log retention: size-based and time-based. The former is triggered by log.retention.bytes while the latter by log.retention.hours.
In your case, you should pay attention to size-based retention, which can sometimes be quite tricky to configure. Assuming that you want a delete cleanup policy, you'd need to set the following parameters:
log.cleaner.enable=true
log.cleanup.policy=delete
Then you need to think about the configuration of log.retention.bytes, log.segment.bytes and log.retention.check.interval.ms. To do so, you have to take into consideration the following factors:
log.retention.bytes is a minimum guarantee for a single partition of a topic, meaning that if you set log.retention.bytes to 512MB, you will always have at least 512MB of data (per partition) on your disk.
Again, if you set log.retention.bytes to 512MB and log.retention.check.interval.ms to 5 minutes (which is the default value), then at any given time you will have at least 512MB of data plus whatever is produced within the 5-minute window before the retention policy is triggered.
A topic log on disk is made up of segments. The segment size depends on the log.segment.bytes parameter. For log.retention.bytes=1GB and log.segment.bytes=512MB, you will always have up to 3 segments on disk (2 segments that have reached the retention threshold, plus a 3rd, active segment that data is currently being written to).
Finally, you should do the math and compute the maximum size that might be reserved by Kafka logs at any given time on your disk, and tune the aforementioned parameters accordingly. Of course, I would also advise setting a time retention policy as well and configuring log.retention.hours accordingly. If after 2 days you don't need your data anymore, then set log.retention.hours=48.
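As a rough illustration of that math, here is a sketch; the per-partition produce rate and the partition count are hypothetical, and the other values are the examples used above:

```python
# Worst-case on-disk size per partition for a delete cleanup policy.
retention_bytes = 1 * 1024**3         # log.retention.bytes = 1GB
segment_bytes = 512 * 1024**2         # log.segment.bytes = 512MB
check_interval_s = 300                # log.retention.check.interval.ms = 5 min
produce_mb_s = 2                      # assumed per-partition produce rate
partitions_on_broker = 100            # assumed partition count on this broker

# Retained data + the active segment + whatever arrives between two
# retention checks.
per_partition = (retention_bytes + segment_bytes
                 + produce_mb_s * 1024**2 * check_interval_s)
per_broker_gb = per_partition * partitions_on_broker / 1024**3
print(f"per partition: ~{per_partition / 1024**3:.2f} GB")
print(f"per broker:    ~{per_broker_gb:.0f} GB")
```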
I think you have three options:
1) Increase the size of the disks until you have a comfortable amount of free space under your current retention policy of 7 days. For me, a comfortable amount of free space is around 40% (but that is personal preference).
2) Lower your retention policy to, for example, 3 days and see if your disks are still full after a period of time. The right retention period varies between use cases (see the sizing sketch after this list). If you don't need a backup of the data on Kafka when something goes wrong, then just pick a very low retention period. If it is crucial that you keep those 7 days' worth of data, then you should not change the retention period but the disk sizes.
3) A combination of options 1 and 2.
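If you go with option 2, here is a rough way to estimate the longest retention your disks can support; the per-broker write rate and the 40% headroom are assumptions to be replaced with your own measurements:

```python
# Rough estimate of the longest retention a 5TB Kafka disk can hold.
disk_tb = 5
headroom = 0.40                       # keep ~40% free, as suggested above
write_gb_per_day = 300                # assumed per-broker write rate (incl. replication)

usable_gb = disk_tb * 1024 * (1 - headroom)
retention_days = usable_gb / write_gb_per_day
print(f"usable space:  ~{usable_gb:.0f} GB")
print(f"max retention: ~{retention_days:.1f} days "
      f"(log.retention.hours={int(retention_days * 24)})")
```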
More information about optimal retention policies: Kafka optimal retention and deletion policy

What happens when maxSize is exceeded

When the size of the cluster grows, chunks are divided. The docs say that "the balancer will not move chunks off an overloaded shard. This must happen manually." (doc here). So will redundant chunks of a shard that has reached the maxSize limit be moved to another shard that hasn't exceeded maxSize, or will they stay on the same shard so that one must manually move those extra bytes and chunks off the shard?
Docs say that "the balancer will not move chunks off an overloaded shard. This must happen manually.". So will redudant chunks of a shard, that has reached the maxsize limit, be moved to another shard that hasn't exceeded the maxsize, or will they stay on the same shard and one must manually move those extra bytes and chunks off the shard?
This is specific to the case where you have set a maxSize limit for a shard and that limit has been reached. The balancer will no longer migrate chunks to that shard, and it will remain "full" unless you manually move some chunks to another shard via sh.moveChunk(). The default behaviour is to have no maxSize set, so shards can use as much disk space as is available.
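For reference, a minimal sketch of manually moving a chunk off a "full" shard from a driver, the equivalent of sh.moveChunk() in the mongo shell; the namespace, shard-key value, and shard name below are hypothetical and must match your own deployment:

```python
# Manually move a chunk to a shard that still has room (run against a mongos).
from pymongo import MongoClient

client = MongoClient("mongodb://mongos-host:27017")   # hypothetical mongos address

client.admin.command(
    "moveChunk", "mydb.mycollection",   # hypothetical sharded namespace
    find={"user_id": 12345},            # a document inside the chunk to move
    to="shard0001",                     # destination shard (below its maxSize)
)
```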
My scenario is that I have, for example, 2 servers, one with a bigger hard drive than the other. So if one is 500GB and the other is 1TB and the first one fills up with data, what happens when I add more data to the servers? Will the balancer know that the first is full and transfer the extra data to the second server?
MongoDB balances data between shards on the basis of logical chunks that are contiguous ranges of values based on the shard key you have selected. By default a chunk represents roughly 64MB of data.
MongoDB is unaware of the underlying disk configuration, so if the server hosting shardA has twice as much disk space as the server hosting shardB, the balancer still only considers the number of chunks associated with each shard (not the actual disk usage). Ideally all shards should have a similar configuration in terms of hardware and disk space.
If you use the maxSize option to limit the storage on a specific shard, this setting only controls whether the balancer will move chunks to that shard once the maxSize has been reached.
For more information see Sharded Collection Balancing in the MongoDB documentation.

Does mongo replication split data or duplicate it

I am creating a MongoDB/Node.js-based CMS and I am using GridFS to store all the uploaded docs. The question I have is this:
Do MongoDB replica sets allow an increased amount of DB storage, or do they simply duplicate the database? For instance, if I have 5 servers with 1TB of storage each and I replicate Mongo across all of them, would my GridFS system theoretically have 5TB of storage (minus caching and padding), or 1TB of storage duplicated several times for better read performance?
Thanks!
Informal description:
Replication = The same copy of the data on multiple nodes, i.e., 5 nodes with 1TB each provide 1TB overall.
Sharding / Partitioning = A fraction of the data goes to each node, i.e., 5 nodes with 1TB each provide 5TB overall.
Each approach has certain advantages and disadvantages: e.g., replication can help with read throughput and works as a backup, but slows down inserts (depending on the commit level), whereas partitioning can help with insert throughput and distributed lookups.
Again, the details are left to the storage system implementor; a tiny numeric illustration follows below.
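A tiny numeric illustration of the difference, using the 5 x 1TB example from the question:

```python
# Usable capacity under the two approaches described above.
def usable_capacity_tb(nodes: int, per_node_tb: float, sharded: bool) -> float:
    """Replication stores a full copy on every node; sharding splits the data."""
    return nodes * per_node_tb if sharded else per_node_tb

print(usable_capacity_tb(5, 1, sharded=False))  # replica set -> 1 TB usable
print(usable_capacity_tb(5, 1, sharded=True))   # sharded     -> 5 TB usable
```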
Sharding means splitting your data across multiple nodes; this is useful when you have a huge amount of data.
Replication means copying the data from one node to another node, and it's useful when your application is read-heavy or when you want to back up your data, for example.
Resources:
http://www.mongodb.org/display/DOCS/Sharding
http://www.mongodb.org/display/DOCS/Replication
http://nosql-exp.blogspot.com/2010/09/mongodb-sharding-and-replication-with.html
Do MongoDB replica sets allow an increased amount of DB storage, or do they simply duplicate the database?
Mongo can do both.
The first case is called sharding.
The second case is called replication.