HornetQ stores fewer messages than max-queue-size - hornetq

HornetQ=2.2.14-Final
I have messages of 10 bytes each;
I set max-queue-size to 100 bytes;
I expected to be able to put 10 messages on the topic, but I successfully put fewer than 10.
I understand that HornetQ stores some metadata for each message and that this counts toward the queue size; is this understanding correct?
How can I find the size of the metadata that HornetQ stores for each message?
Is it affected by the message size?
(I would like to calculate how many messages I can store.)
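Since the per-message overhead isn't documented as a single number, one practical option is to measure it: send fixed-size messages until the size limit kicks in and divide. Below is a minimal sketch of that idea using plain JMS. It assumes a ConnectionFactory and Topic are available via JNDI (the lookup names are placeholders), that a durable subscription exists so the messages are actually retained, and that the address-full policy is configured to fail the send once the limit is reached.
import javax.jms.BytesMessage;
import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.Topic;
import javax.naming.InitialContext;

public class OverheadProbe {
    public static void main(String[] args) throws Exception {
        InitialContext ctx = new InitialContext();
        ConnectionFactory cf = (ConnectionFactory) ctx.lookup("ConnectionFactory"); // placeholder JNDI name
        Topic topic = (Topic) ctx.lookup("topic/testTopic");                        // placeholder JNDI name

        Connection conn = cf.createConnection();
        try {
            Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(topic);
            byte[] body = new byte[10];   // 10-byte payload, as in the question
            int accepted = 0;
            try {
                for (int i = 0; i < 1000; i++) {
                    BytesMessage msg = session.createBytesMessage();
                    msg.writeBytes(body);
                    producer.send(msg);
                    accepted++;
                }
            } catch (JMSException full) {
                // the address-full policy rejected the send; stop counting
            }
            // If the size limit is 100 bytes and only N messages fit, the broker is
            // charging roughly 100 / N bytes per message (payload plus metadata).
            System.out.println("Messages accepted before hitting the limit: " + accepted);
        } finally {
            conn.close();
        }
    }
}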

Related

What is the difference between message size and request size for AWS MSK?

Reading over the MSK Quota documentation, I see there are two values:
Maximum message size: 8 MB
Maximum request size: 100 MB
Is the message size for individual messages and the request size for a batch request?
Yes. Kafka producer clients batch requests, rather than send one record at a time.
A batch contains many messages, each of a certain size.
A request's size is the total of all the messages in the batch, plus extra metadata.
"Request" size may also cover other Kafka network interactions; however, the Produce (batch) request should be the largest.

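To make the relationship concrete, here is a minimal sketch of the standard Java producer configs involved (these are plain Kafka client settings, not MSK-specific; the broker address and topic name are placeholders). Many small records are gathered into one batch per partition, and one or more batches go out in a single Produce request:
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class RequestVsMessageSize {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");   // placeholder address
        props.put("key.serializer", ByteArraySerializer.class.getName());
        props.put("value.serializer", ByteArraySerializer.class.getName());

        // Upper bound on a single request sent to the broker (and effectively on one record batch).
        props.put("max.request.size", String.valueOf(8 * 1024 * 1024));
        // Target size of one per-partition batch; many small records are grouped up to this size.
        props.put("batch.size", String.valueOf(1024 * 1024));
        // Wait a little so more records can join the same batch/request.
        props.put("linger.ms", "50");

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            // 1000 records of ~1 KB each end up in a handful of requests, not 1000 requests.
            for (int i = 0; i < 1000; i++) {
                producer.send(new ProducerRecord<>("demo-topic", new byte[1024]));
            }
        }
    }
}
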
Consume only a specific number of messages from the queue

Is it possible to consume only a specific number of messages, like 10 or 50 or 100, out of the 1000 that are in the queue? I was looking at the 'fetch.max.bytes' config, but that seems to relate to message size rather than the number of messages.
I don't know how to set "max.partition.fetch.bytes" as my byte size is not the same in every message.
Is there a way to dynamically set this to read 10 or 50 or 100 messages per minute?
Or is there any way I can do this?
Note: please be aware that I cannot use the "max.poll.records" option.
Per minute? No, not really, because you have little control as a consumer client over producer speeds or even network speeds.
If you just want a static number, seek the consumer to a specific partition offset and simply count the number of records consumed until you're satisfied with the number, then commit the offsets back (or don't).
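A minimal sketch of that approach with the plain Java consumer (the topic, partition, starting offset, and broker address are placeholders; it assumes you assign the partition and seek yourself rather than relying on group rebalancing):
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;

public class BoundedConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");   // placeholder address
        props.put("group.id", "bounded-demo");           // placeholder group id
        props.put("enable.auto.commit", "false");        // commit manually once done
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        int limit = 50;                                  // how many records to take
        TopicPartition tp = new TopicPartition("demo-topic", 0);

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            long nextOffset = 0L;                        // starting point; could come from a store
            consumer.assign(Collections.singletonList(tp));
            consumer.seek(tp, nextOffset);

            int taken = 0;
            while (taken < limit) {                      // keeps polling until enough records arrive
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    process(record);
                    nextOffset = record.offset() + 1;
                    if (++taken >= limit) break;
                }
            }
            // Commit exactly where we stopped (or skip this to re-read next time).
            consumer.commitSync(Collections.singletonMap(tp, new OffsetAndMetadata(nextOffset)));
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
    }
}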

ActiveMQ Artemis Persisted Message Size Very Large

I am using ActiveMQ Artemis 2.11.0 with the native, file-based persistence configuration. Most of the content of my messages is around 5K characters. However, the "Persistent Size" of the message is often 7 to 10 times larger than that.
The message data is a JSON string of various sizes. The "Persistent Size" value that I'm seeing comes from the queue browsing feature of the ActiveMQ Artemis web console. For example, a message with a display body character count of 4553 characters has a persistent size of 26,930 bytes. The persisted record is 6 times larger than the message itself.
There are headers in the message, but not enough I think to account for the difference in message and persisted record size.
Can anyone please tell me why this is and whether or not I can do something to reduce the persisted size of messages?
Here's a related screenshot of the web console:

Producer side compression in apache kafka

I have enabled snappy compression on the producer side with a batch size of 64 KB, I am processing messages of 1 KB each, and I have set the linger time to infinity. Does this mean that until I have processed 64 messages, the producer won't send them to the Kafka output topic?
In other words, will the producer send each message to Kafka individually, or wait for 64 messages and send them in a single batch?
I ask because the offsets are increasing one by one rather than in multiples of 64.
Edit: I am using the flink-kafka connectors.
Messages are batched by the producer to minimize network usage, not so that they are written "as a batch" into Kafka's commit log. What you are seeing is correct: each message has to be accounted for individually, i.e. its key/partition relationship is identified, it is appended to the commit log, and only then is the offset incremented. Unless the first two steps are done, the offset is not incremented.
There is also data replication to take care of, depending on configuration, and message-tracking state is updated for each message received (to support the lag APIs).
Also note that the batch.size parameter considers the ready-to-ship size of a message, i.e. after it has been 1. compressed and 2. serialized by your favorite serializer.
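For reference, a minimal sketch of the producer configuration being described (the broker address and topic name are placeholders, and Integer.MAX_VALUE stands in for an "infinite" linger). With linger.ms that high, a per-partition batch is sent when it reaches batch.size, or when the producer is flushed or closed:
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class CompressedBatchingProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker:9092");            // placeholder address
        props.put("key.serializer", ByteArraySerializer.class.getName());
        props.put("value.serializer", ByteArraySerializer.class.getName());

        props.put("compression.type", "snappy");                   // whole batches are compressed
        props.put("batch.size", String.valueOf(64 * 1024));        // 64 KB per-partition batch
        props.put("linger.ms", String.valueOf(Integer.MAX_VALUE)); // effectively "wait forever"

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 64; i++) {
                producer.send(new ProducerRecord<>("out-topic", new byte[1024])); // ~1 KB records
            }
            producer.flush(); // forces the pending batch out even though linger.ms hasn't elapsed
        }
        // Even though the 64 records travel in one request, the broker still assigns them
        // consecutive offsets, which is why offsets grow one by one.
    }
}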

How does kafka compression relate to configurations that refer to bytes?

It's unclear to me (and I haven't managed to find any documentation that makes it perfectly clear) how compression affects kafka configurations that deal with bytes.
Take a hypothetical message that is exactly 100 bytes, a producer with a batch size of 1000 bytes, and a consumer with a fetch size of 1000 bytes.
With no compression it seems pretty clear that my producer would batch 10 messages at a time and my consumer would poll 10 messages at a time.
Now assume a compression (specified at the producer -- not on the broker) that (for simplicity) compresses to exactly 10% of the uncompressed size.
With that same config, would my producer still batch 10 messages at a time, or would it start batching 100 messages at a time? I.e. is the batch size pre- or post-compression? The docs do say this:
Compression is of full batches of data
...which I take to mean that it would compress 1000 bytes (the batch size) down to 100 bytes and send that. Is that correct?
Same question for the consumer fetch. Given a 1K fetch size, would it poll just 10 messages at a time (because the uncompressed size is 1K) or would it poll 100 messages (because the compressed size is 1K)? I believe that the fetch size will cover the compressed batch, in which case the consumer would be fetching 10 batches as-produced-by-the-producer at a time. Is this correct?
It seems confusing to me that, if I understand correctly, the producer is dealing with pre-compression sizes and the consumer is dealing with post-compression sizes.
It's both simpler and more complicated ;-)
It's simpler in that both the producer and the consumer compress and uncompress the same Kafka protocol Produce requests and Fetch requests, and the broker just stores them, with zero copy, in their native wire format. Kafka does not compress individual messages before they are sent. It waits until a batch of messages (all going to the same partition) is ready to send, then compresses the entire batch and sends it as one Produce request.
It's more complicated because you also have to factor in the linger time, which will trigger a send of a batch of messages earlier than when the producer buffer is full. You also have to consider that messages may have different keys, or for other reasons be going to different topic partitions on different brokers, so it's not necessarily true that 10 records of 100 bytes each all go to one broker as a single Produce request of 1000 bytes (unless all the messages are being sent to a topic with a single partition).
From https://kafka.apache.org/0110/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html
The producer maintains buffers of unsent records for each partition. These buffers are of a size specified by the batch.size config. Making this larger can result in more batching, but requires more memory (since we will generally have one of these buffers for each active partition).
By default a buffer is available to send immediately even if there is additional unused space in the buffer. However if you want to reduce the number of requests you can set linger.ms to something greater than 0. This will instruct the producer to wait up to that number of milliseconds before sending a request in hope that more records will arrive to fill up the same batch. This is analogous to Nagle's algorithm in TCP. For example, in the code snippet above, likely all 100 records would be sent in a single request since we set our linger time to 1 millisecond. However this setting would add 1 millisecond of latency to our request waiting for more records to arrive if we didn't fill up the buffer. Note that records that arrive close together in time will generally batch together even with linger.ms=0 so under heavy load batching will occur regardless of the linger configuration; however setting this to something larger than 0 can lead to fewer, more efficient requests when not under maximal load at the cost of a small amount of latency.
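To make the moving parts concrete, here is a small sketch of the producer and consumer settings from the hypothetical above, using the standard Java client config names (broker address, topic, and group id are placeholders; the numbers mirror the 100-byte-message / 1000-byte-limit example):
import java.util.Properties;

public class CompressionSizingConfigs {

    // Producer side of the hypothetical: 100-byte records, 1000-byte batch target.
    static Properties producerProps() {
        Properties p = new Properties();
        p.put("bootstrap.servers", "broker:9092");   // placeholder address
        p.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        p.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        p.put("compression.type", "gzip");   // whole batches are compressed, never single records
        p.put("batch.size", "1000");         // target size of one per-partition batch of records
        p.put("linger.ms", "5");             // a batch may also be sent before it is full
        return p;
    }

    // Consumer side of the hypothetical: a 1000-byte fetch limit.
    static Properties consumerProps() {
        Properties p = new Properties();
        p.put("bootstrap.servers", "broker:9092");   // placeholder address
        p.put("group.id", "sizing-demo");            // placeholder group id
        p.put("key.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        p.put("value.deserializer", "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        p.put("fetch.max.bytes", "1000");            // overall cap on data returned per fetch
        p.put("max.partition.fetch.bytes", "1000");  // per-partition cap; the first batch is
                                                     // returned whole even if it is larger
        return p;
    }

    public static void main(String[] args) {
        System.out.println("producer: " + producerProps());
        System.out.println("consumer: " + consumerProps());
    }
}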