ActiveMQ Artemis Persisted Message Size Very Large - persistence

I am using ActiveMQ Artemis 2.11.0 with the native, file-based persistence configuration. Most of the content of my messages is around 5K characters. However, the "Persistent Size" of the message is often 7 to 10 times larger than that.
The message data is a JSON string of various sizes. The "Persistent Size" value that I'm seeing comes from the queue browsing feature of the ActiveMQ Artemis web console. For example, a message whose body displays as 4,553 characters has a persistent size of 26,930 bytes, so the persisted record is nearly 6 times larger than the message body itself.
There are headers in the message, but I don't think they are enough to account for the difference between the message size and the persisted record size.
Can anyone please tell me why this is and whether or not I can do something to reduce the persisted size of messages?

Related

Artemis - Messages Sync Between Memory & Journal

From reading the Artemis docs, I understand that Artemis keeps all currently active messages in memory and can offload messages to the paging area for a given queue/topic according to the settings, and that the Artemis journals are append-only.
With respect to this:
How and when does the broker sync messages to and from the journal (only during restart?)
How does it identify the message to be deleted from the journal? (For example, if the journal is append-only and a consumer ACKs a persistent message, how does the broker remove that single message from the journal without keeping an index?)
Isn't it a performance hit to keep every active message in memory, and can't that even make the broker run out of memory? To avoid this, paging settings have to be configured for every queue/topic, otherwise the broker may fill up with messages. Please correct me if I'm wrong.
Any reference link that explains message syncing and these points would be helpful. The Artemis docs cover the append-only mode, but there may be a section/article explaining these storage concepts that I'm missing.
By default, a durable message is persisted to disk after the broker receives it and before the broker sends a response back to the client that the message was received. In this way the client can know for sure that if it receives the response back from the broker that the durable message it sent was received and persisted to disk.
When using the NIO journal-type in broker.xml (i.e. the default configuration), data is synced to disk using java.nio.channels.FileChannel.force(boolean).
Since the journal is append-only during normal operation then when a message is acknowledged it is not actually deleted from the journal. The broker simply appends a delete record to the journal for that particular message. The message will then be physically removed from the journal later during "compaction". This process is controlled by the journal-compact-min-files & journal-compact-percentage parameters in broker.xml. See the documentation for more details on that.
Keeping message data in memory actually improves performance dramatically vs. evicting it from memory and then having to read it back from disk later. As you note, this can lead to memory consumption problems, which is why the broker supports paging, blocking, etc. The main thing to keep in mind is that a message broker is not a storage medium like a database. Paging is a palliative measure meant to be used as a last resort to keep the broker functioning. Ideally the broker should be configured to handle the expected load without paging (e.g. acquire more RAM, allocate more heap). In other words, message production and message consumption should be balanced. The broker is designed for messages to flow through it. It can certainly buffer messages (potentially millions depending on the configuration & hardware), but when it's forced to page the performance will drop substantially simply because disk is orders of magnitude slower than RAM.
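For reference, both the compaction parameters and the paging thresholds mentioned above live in broker.xml. Here is a minimal sketch of the relevant elements; the compaction values shown are the defaults, while the address-settings values are purely illustrative and need to be sized for your own workload:

<core xmlns="urn:activemq:core">
   <!-- compaction runs once at least 10 journal files exist and
        less than 30% of their data is still live (the defaults) -->
   <journal-compact-min-files>10</journal-compact-min-files>
   <journal-compact-percentage>30</journal-compact-percentage>

   <address-settings>
      <address-setting match="#">
         <!-- illustrative: start paging to disk once an address holds ~100MB of messages -->
         <max-size-bytes>104857600</max-size-bytes>
         <page-size-bytes>10485760</page-size-bytes>
         <address-full-policy>PAGE</address-full-policy>
      </address-setting>
   </address-settings>
</core>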

What defines the scope of a Kafka topic

I'm looking to try out Kafka for an existing system, to replace an older message protocol. Currently we have a number of types of messages (hundreds) used to communicate among ~40 applications. Some are asynchronous at high rates and some are driven by user requests/events.
Now looking at Kafka, it breaks things out into topics, partitions, etc. But I'm a bit confused as to what constitutes a topic. Does every type of message my applications produce get its own topic, allowing hundreds of topics, or do I cluster them together into related message types? If the latter, is it bad practice for an application to read a message and drop it when its contents are not what it's looking for?
I'm also in a dilemma where there will be upwards of 10 copies of a single application (a display), all of which receive a very large amount of data (a lightweight video stream of sorts) and send out user commands from each particular node. Would Kafka be a sufficient form of communication for this? Assume at most 10 copies, and that these applications may not always want to receive the video stream.
A third and final question: I read a bit about the replay-ability of messages. Is this only within a single topic, or can replay span a slew of different topics?
Kafka itself doesn't care about "types" of message. The only type it knows about is bytes, meaning that you are completely flexible in how you serialize your datasets. Note, however, that the default max message size is just 1MB, so "streaming video/images/media" is arguably the wrong use case for Kafka alone. A protocol like RTMP would probably make more sense.
Kafka consumer groups scale horizontally, not in response to load. Consumers poll data at a rate at which they can process it. If they don't need data, they can be stopped; if they need to reprocess data, they can independently seek back to earlier offsets, as sketched below.
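To make the replay point concrete, here is a minimal sketch of a consumer that rewinds its assigned partitions and reprocesses from the beginning. The broker address, group id, and topic name are placeholders for illustration, not part of your system:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ReplayExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // placeholder broker address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "display-replay");            // placeholder group id
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("display-events"));    // placeholder topic
            consumer.poll(Duration.ofSeconds(1));                               // join the group and receive partition assignments
            consumer.seekToBeginning(consumer.assignment());                    // rewind every assigned partition
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset=%d value=%s%n", record.offset(), record.value());
            }
        }
    }
}

Seeking operates on topic-partitions, so replaying several topics simply means seeking each subscribed topic's partitions independently.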

Apache ActiveMQ Artemis message size configuration

I am trying out ActiveMQ Artemis for a messaging design. I am expecting messages with embedded file content (bytes). I do not expect them to be any bigger than 10MB. However, I want to know if there is a configurable way to handle that in Artemis.
Also is there a default maximum message size it supports?
I tried and searched for an answer but could not find any.
Also, my producer and consumer are both .Net AMQP implementations.
ActiveMQ Artemis itself doesn't place a limit on the size of the message. It supports arbitrarily large messages. However, you will be constrained by a few things:
The broker's heap space: If the client sends the message all in one chunk and that causes the broker to exceed its available heap space, then sending the message will fail. The broker has no control over how the AMQP client sends the message. I believe AMQP supports sending messages in chunks, but I'm not 100% certain of that.
The broker's disk space: AMQP messages that are deemed "large" by the broker (i.e. those that cannot fit into a single journal file) will be stored directly on disk in the data/largemessages directory. The ActiveMQ Artemis journal file size is controlled by the journal-file-size configuration parameter in broker.xml. The default journal-file-size is 10MB. By default the broker will stop providing credits to the producer when disk space utilization hits 90%. This is controlled by the max-disk-usage configuration parameter in broker.xml.
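For reference, both of those parameters live in the <core> section of broker.xml; a minimal sketch showing the defaults mentioned above:

<core xmlns="urn:activemq:core">
   <!-- messages too large to fit into a single journal file are stored as "large" messages -->
   <journal-file-size>10485760</journal-file-size>   <!-- 10MB, the default -->
   <!-- stop granting producer credits once disk utilization reaches this percentage -->
   <max-disk-usage>90</max-disk-usage>               <!-- the default -->
</core>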

Kafka large message configuration support for Spring Boot application producer and consumer

When I try to publish through a Kafka producer in a Spring Boot application, I get a RecordTooLargeException.
The error is:
org.apache.kafka.common.errors.RecordTooLargeException: The message is 1235934 bytes when serialized which is larger than the maximum request size you have configured with the max.request.size configuration.
I have read other discussions about this problem but did not find a suitable solution, as I need to both publish and consume these messages from the client side.
Please help me with brief configuration steps to do this.
The nice thing about Kafka is that it has great exception messages that are pretty much self-explanatory. It is basically saying that your message is too large (which you have concluded yourself, I believe).
If you check the docs for producer config search for the max.request.size in the table for an explanation, it says:
The maximum size of a request in bytes. This setting will limit the number of record batches the producer will send in a single request to avoid sending huge requests. This is also effectively a cap on the maximum record batch size. Note that the server has its own cap on record batch size which may be different from this.
You can configure this value in your producer configuration, like so:
properties.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, "value-in-bytes");
However, the default is pretty much good for 90% of use cases. If you can, avoid sending such large messages in the first place, or try compressing the messages (this works wonders for throughput), like so:
properties.setProperty(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");
There are other compression types, but this one comes from Google and is pretty efficient. Along with compression, you can tweak two other values to get much better performance (batch.size and linger.ms), but you would have to test for your use case.
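Putting the pieces above together, a sketch of the producer configuration might look like the following. The broker address, topic, payload, and all numeric values are placeholders to adapt to your setup, not recommendations:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class LargeMessageProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");        // placeholder broker address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        props.put(ProducerConfig.MAX_REQUEST_SIZE_CONFIG, "2097152");                // 2MB, above the ~1.2MB record from the error
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "snappy");                 // shrink what is actually sent over the wire
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, "65536");                        // illustrative; tune for your workload
        props.put(ProducerConfig.LINGER_MS_CONFIG, "20");                            // illustrative; tune for your workload

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "key", "large-json-payload")); // placeholder topic and payload
            producer.flush();
        }
    }
}

Keep in mind that, as the quoted documentation notes, the broker has its own cap on record batch size (message.max.bytes at the broker level, max.message.bytes per topic), so raising max.request.size on the producer alone may not be enough, and the consumer's fetch settings may also need adjusting for larger records.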

HornetQ stores fewer messages than max-queue-size

HornetQ=2.2.14-Final
I have messages of 10 bytes each;
I set max-queue-size to be 100 bytes;
I expected to put 10 messages on the topic, but I successfully put fewer than 10.
I understand that HornetQ stores some metadata for each message and that this affects the queue size; is this understanding correct?
How could I find the size of the metadata that HornetQ stores for each message?
Is it affected by the message size?
(I would like to calculate how many messages I can store.)