ActiveMQ Artemis data journal folder

I am trying to understand how ActiveMQ Artemis manages its journal files (under data/journal) and when it creates new ones. I read the documentation, but it wasn't clear how the files are created. My broker.xml has simple settings (I won't be able to share it, unfortunately). Here are a few:
journal-min-files - 2
journal-pool-files - 50
journal-file-size - left at the default of 10MB.
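For reference, these map onto a broker.xml roughly like the sketch below (illustrative only, since I can't share the real file; the element names are from the Artemis configuration reference):

```xml
<core xmlns="urn:activemq:core">
  <!-- files created up front when the broker starts -->
  <journal-min-files>2</journal-min-files>
  <!-- upper bound on files kept in the pool for reuse -->
  <journal-pool-files>50</journal-pool-files>
  <!-- 10485760 bytes = 10MiB, the default -->
  <journal-file-size>10485760</journal-file-size>
</core>
```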
ActiveMQ Artemis starts, and I see 2 files already created under /data/journal. I then run a request that posts a lot of messages in a very short time. These messages are being actively consumed, so although I am publishing a lot of messages they are not accumulating quickly on the broker. However, this doesn't cause the files to grow enough to recreate the space issue I am chasing.
As the message volume goes up I don't see the number of files growing in step; it climbs to 12 files and then stays there for a long time.
I can understand that the message count alone wouldn't trigger additional files if only the latest journal file were being written to. However, I see that the other 11 files also have updated timestamps, which makes me think they are being rotated.
My paging directory is empty.
I am trying to understand why the journal is not growing despite the message volume.

From what I can tell, message consumption has reached a break-even point with message production, so messages are not accumulating on the broker and the journal is not actually growing in size. The journal files are simply being reused as messages are added and acknowledged.

Related

JBoss HornetQ/ActiveMQ Artemis saving queue message to file system

Can someone please help me understand the impact of saving HornetQ/ActiveMQ Artemis messages to the file system and bypassing the queue every time?
The message is more than 2GB, and I run into the "Maximum size 2GB exceeded" exception in HornetQ described here. So I was planning not to add the message to the queue, but to write it manually to disk, pass the file's path in a header, and have the consumer read the message from the file. I really don't know the performance impact, so I'm asking: if I do this for all messages, even those less than 2GB, will there be any performance impact?
Given the information you've provided I don't think anybody but you can determine the "performance impact" of manually writing the file to disk vs. sending the file to the broker.
Generally speaking, you will save the time required to send the file to the broker in the first place, but you don't indicate how fast the hard drives are. If the hard drive on the broker is much faster than the hard drive on the client, then it may take longer overall to write the file to disk manually.
Also, if the network is slow between the clients and the broker but fast between the clients and the shared drive where you're going to write the file, then it may be faster overall to write the file to disk manually.
Ultimately it's going to be up to you to test the performance impact of your changes.

How does persistence work in ActiveMQ Artemis?

In ActiveMQ 5.x, when using KahaDB for persistence, all the files are managed in a single database. This can have serious consequences.
I have hundreds of queues that see millions of messages per day. If the consumer of a queue is temporarily stopped for maintenance, the other queues continue to fill and empty, while the one whose consumer is suspended sees its messages accumulate. On disk, though, it is different: KahaDB does mark consumed messages as deleted, but it cannot free the space as long as a more recent live message is kept in the same data file. This is the case with the messages that accumulate in the suspended queue.
Very quickly the disk fills up.
To remedy this, you have to change the configuration and use mKahaDB. In this case there is one database per queue, and therefore only the suspended queue takes up space on disk.
I am considering switching to Artemis, but its persistence has been completely redesigned. So what happens in terms of disk occupancy when a consumer is suspended?
This question is pretty broad, but I'll take a crack at it...
By default ActiveMQ Artemis uses a file-based journal. The journal consists of a pool of files that can grow and shrink based on configuration (see journal-min-files and journal-pool-files in the documentation). The size of each file is also configurable (i.e. via journal-file-size).
An initial pool of files will be created when the broker starts and as messages are stored and the initial pool of files fills up then additional files will be created. As messages are consumed the pool can shrink through a process called "compaction" which is also configurable (see journal-compact-min-files and journal-compact-percentage in the documentation). As long as 1 record in a journal file is considered "live" (e.g. an unconsumed message) then the whole journal file will remain. However, you can tune the impact of this to fit your environment (e.g. by lowering the journal-file-size, making compaction more aggressive, etc.). To be clear, if compaction runs and there is a journal file with only 1 "live" record that means all the other journal files are "full" and at most you will only ever have 1 journal file like that.
Also, you can configure max-disk-usage to block producers from sending more messages once disk utilization reaches a certain point.
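For illustration, the knobs mentioned above might look like this in broker.xml (a hedged sketch; the values shown are the documented defaults, which you would tighten to compact more aggressively or block producers earlier):

```xml
<core xmlns="urn:activemq:core">
  <!-- compact only once at least this many journal files exist... -->
  <journal-compact-min-files>10</journal-compact-min-files>
  <!-- ...and live data has dropped below this percentage of the journal -->
  <journal-compact-percentage>30</journal-compact-percentage>
  <!-- block producers once the disk holding the journal is this % full -->
  <max-disk-usage>90</max-disk-usage>
</core>
```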
Ultimately, if a consumer becomes inactive (for whatever reason) then the messages that consumer was supposed to receive will accumulate in the queues (and potentially on disk). If you want to prevent messages from accumulating in the first place you could implement flow control or blocking for producers. However, even if they do accumulate the file-based journal should be able to grow and shrink as needed.
If I understood correctly, there is no way to guarantee that only the live messages occupy disk space (as with mKahaDB).
But we can limit the size of the journal files and fix their number.
Considering the very large number of queues I have to manage, I think the best approach is to divide this into a cluster. But I am worried, because when one application is down for maintenance (and I have 10,000 of them) the messages of the others cannot be erased, since messages keep accumulating in one queue. It is clear that, whatever the configuration, I will crash or have to stop within a few seconds.
I am surprised to see communication between two applications stop because two other applications are no longer communicating with each other. This is a strong limitation compared to ActiveMQ.
This will limit the problem but not solve it.
With mKahaDB, if I have two queues A and B, where A receives one message per second and B receives 5000/s and B's consumers consume them immediately, then queue B is always (or almost always) empty and occupies very little disk. If A's consumer is stopped, queue A grows but queue B does not occupy any more disk.
With Artemis, if I reduce the journal file size to hold 5000 messages, then every second a journal file fills up and is deleted. If A's consumer stops, each journal file will contain one message from A, so we keep 5000 messages on disk for every second that passes, even though queue B is almost always empty. If I reduce the journal size to 500, I keep fewer messages per file, but disk usage still grows 500 times faster than with mKahaDB. And if I reduce the journal size to one message to get the same behavior as mKahaDB, I force Artemis to handle millions of files, which collapses performance.
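To spell out the arithmetic (my rough worst-case model, assuming one live message from A pins each full journal file and compaction never reclaims the space):

```latex
\[
  \text{disk}_{\text{Artemis}}(t) \approx 5000\,t \text{ messages after } t \text{ seconds},
  \qquad
  \text{disk}_{\text{mKahaDB}}(t) \approx t \text{ messages}
\]
```

Note that when compaction does run it copies the live records forward and reclaims the dead space, so this is the no-compaction worst case rather than steady-state behavior.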
I have the impression that Artemis is not made for very large numbers of queues, contrary to ActiveMQ.
Thanks.

What happens if you delete Kafka snapshot files?

The question is simple: what will happen if you delete the Kafka snapshot files in the Kafka log dir? Will Kafka be able to start? Will it have to do a slow rebuild of something?
Bonus question: what exactly do the snapshot files contain?
Background for this question
I have a cluster that has been down for a couple of days due to simultaneous downtime on all brokers and a resulting corrupted broker. Now when it starts it is silent for hours (no new messages appear in the log file). By inspecting the JVM I have found that all of the (very limited) CPU usage is spent in the loadProducersFromLog method. The comments above that method suggest it is an attempt to recover producer state from the snapshots. I do not care about that state; I just want my broker back, so I am wondering if I can simply delete the snapshots to get Kafka started again.
If the snapshot files are deleted, then during startup, in log.loadSegmentFiles(), all messages in the partition will have to be read to recreate the snapshots, even if the log and index files are present. This will increase the time it takes to load the partition.
For the contents of the snapshot files, refer to writeSnapshot() in ProducerStateManager:
https://github.com/apache/kafka/blob/980b725bb09ee42469534bf50d01118ce650880a/core/src/main/scala/kafka/log/ProducerStateManager.scala
The log.dir parameter defines where topics (i.e., data) are stored (it is supplemental to the log.dirs property).
A snapshot basically gives you a copy of your data at one point in time.
In a situation like yours, instead of waiting for a response, you could:
change the log.dirs path, restart everything, and see how it goes;
back up the snapshots by saving them to a different location, delete them all from the original one, and see how it goes.
After that you're meant to be able to start up Kafka.
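A minimal Java sketch of the second option (the log directory and backup paths are hypothetical; the .snapshot suffix is the naming Kafka uses for producer-state snapshot files; stop the broker before running anything like this):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.*;
import java.util.stream.Stream;

public class SnapshotBackup {
    public static void main(String[] args) throws IOException {
        Path logDir = Paths.get("/var/lib/kafka/data"); // hypothetical log.dirs entry
        Path backup = Paths.get("/tmp/kafka-snapshots"); // hypothetical backup location
        try (Stream<Path> files = Files.walk(logDir)) {
            files.filter(p -> p.toString().endsWith(".snapshot"))
                 .forEach(p -> {
                     try {
                         // mirror the partition directory layout under the backup dir,
                         // then move (i.e., back up and delete) the snapshot file
                         Path target = backup.resolve(logDir.relativize(p));
                         Files.createDirectories(target.getParent());
                         Files.move(p, target);
                     } catch (IOException e) {
                         throw new UncheckedIOException(e);
                     }
                 });
        }
    }
}
```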

How does Kafka guarantee sequential disk access?

I'm a Kafka newbie. When I read the Kafka documentation, I saw that Kafka performs well because of sequential disk access.
But how is that possible? In Java (or any other language), if I use file I/O the OS handles it as it sees fit, and I can't know whether the OS stores my files in contiguous sectors or scatters them across multiple sectors. So, in my opinion, Kafka cannot always claim that disk access is sequential.
Am I right or not?
Kafka does not always access the disk sequentially, but it does some things that make it much more likely that disk access is often sequential. All Kafka messages are stored in larger segment files (1GB each by default), and since Kafka messages are not deleted when consumed (unlike in other message brokers), Kafka does not end up creating a fragmented filesystem over time by continuously creating and deleting many variable-length files. Instead it creates segment files and appends to each one until it reaches 1GB (a configurable limit). Only when all messages in a segment expire will it delete the entire 1GB segment. This means that these 1GB sections of disk are often actually laid out as contiguous blocks.
It is a recommended best practice to keep these Kafka commit log files on a dedicated filesystem so they do not get fragmented by other apps reading and writing variable-length files into the same filesystem.
More importantly, most reading and writing to these segment files is sequential and goes through the OS page cache, which reduces disk I/O even further by caching the most often accessed pages in memory. This is why it is recommended to tune the kernel's swappiness to 1, to reduce the likelihood that these cached pages get swapped out of memory.
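To illustrate the append-only pattern described above, here is a small sketch (my own illustration, not Kafka's actual code; the file name just mimics Kafka's segment naming convention):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class SegmentAppender {
    public static void main(String[] args) throws IOException {
        // open (or create) a segment file in append mode: every write lands at
        // the current end of the file, so the OS sees a strictly sequential
        // write pattern instead of scattered seeks and in-place updates
        try (FileChannel segment = FileChannel.open(
                Paths.get("00000000000000000000.log"),
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            for (int i = 0; i < 1000; i++) {
                ByteBuffer record = ByteBuffer.wrap(("record-" + i + "\n").getBytes());
                segment.write(record); // append only; never rewrite earlier bytes
            }
        }
    }
}
```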

Can you transfer the journal files between instances of HornetQ?

Can the journal files from one instance of HornetQ be copied to and read by another instance?
You can do that as long as you copy the bindings journal, the message journal, and the JMS journal all together.
There are references to queue IDs from the message journal that are held in the bindings journal.
The JMS journal is not that important in this sense, but I would keep it as well.
I wouldn't do the copy while the journals are being written, as you may get files out of sequence. A journal is a sequential log, and performing these operations mid-write may lead to undesirable results.
You're safe to do it when the journals are not being written to (i.e., the server is idle or stopped).
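As an illustration only, copying the directories between stopped instances might look like the sketch below (the paths and directory names are assumptions based on a typical HornetQ data layout; adjust them to your own):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.*;
import java.util.stream.Stream;

public class JournalCopy {
    public static void main(String[] args) throws IOException {
        Path src = Paths.get("/opt/hornetq-a/data"); // hypothetical source data dir
        Path dst = Paths.get("/opt/hornetq-b/data"); // hypothetical target data dir
        // copy the bindings, message, and JMS journals together, as noted above
        for (String dir : new String[]{"bindings", "journal", "jms"}) {
            copyTree(src.resolve(dir), dst.resolve(dir));
        }
    }

    static void copyTree(Path from, Path to) throws IOException {
        try (Stream<Path> paths = Files.walk(from)) {
            paths.forEach(p -> {
                try {
                    Path target = to.resolve(from.relativize(p));
                    if (Files.isDirectory(p)) {
                        Files.createDirectories(target);
                    } else {
                        Files.copy(p, target, StandardCopyOption.REPLACE_EXISTING);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```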