Can the journal files from one instance of HornetQ be copied to and read by another instance?
You can do that as long as you copy bindings / journals and jms altogether.
There are references to queueIDs from message journal that are held into bindings journal.
JMS journal is not that important on this sense but I would keep it as well.
I wouldn't do the copy while the journals are being written as you may get files out of sequence. a Journal is a sequential log and doing these operations on the middle may take you to undesirable results.
You're safe to do it when the journal is not being written (either server idle or stopped).
Related
I am trying to understand how ActiveMQ Artemis manages the journal files (under data/journal) and when it creates new ones. I read the documentation, but it wasn't clear how the files are create. I have a broker.xml has simple settings (won't be able to share unfortunately). Here are a few:
journal min threads - 2
journal pool size - 50
file size is defaulted to 10MB.
ActiveMQ Artemis starts, and I see 2 files are already created under /data/journal. I am now running a request that posts a lot of messages in a very short time. These messages are being actively consumed. I am publishing a lot of messages but they are not accumulating as fast because the consumer is consuming them. However, this doesn't cause the files to grow to recreate the space issue.
As the message volume is going up I don't see the number of files going up. It goes up and stays at 12 files for a long time.
I can understand the message # is not sufficient trigger additional files if only the latest journal file is being written to. However, I see all the 11 files have updated timestamps making me think they are being rotated.
My paging directory is empty.
I am trying to understand why the journal is not growing despite the message volume.
From what I can tell, message consumption has reached a break-even point with message production so that messages are not accumulating on the broker therefore the journal is not actually growing in size. The journal files are simply being re-used as messages are being added & acknowledged.
In ActiveMQ 5.x when using kahadb for persistence all the files are managed in a single database. This can have serious consequences.
I have hundreds of queues that see millions of messages per day. If a consumer of a queue is temporarily stopped for maintenance reasons the queues continue to fill and empty, and the one whose consumer is suspended sees the messages accumulate. But on the disc it is different. Kahadb indeed marks the deleted (consumed) messages, but cannot free the place if a more present message is kept in the database. This is the case with those that accumulate in the suspended queue.
Very quickly the disk space is full.
To remedy this, you have to change the configuration and use mkahadb. In this case there is one database per queue and therefore on the disk only the suspended queue takes up space.
I am considering switching to Artemis. But the persistence has been completely redesigned. So what happens in terms of disk occupancy when suspending a consumer?
This question is pretty broad, but I'll take a crack at it...
By default ActiveMQ Artemis uses a file-based journal. The journal consists of a pool of files that can grow and shrink based on configuration (see journal-min-files and journal-pool-files in the documentation). The size of each file is also configurable (i.e. via journal-file-size).
An initial pool of files will be created when the broker starts and as messages are stored and the initial pool of files fills up then additional files will be created. As messages are consumed the pool can shrink through a process called "compaction" which is also configurable (see journal-compact-min-files and journal-compact-percentage in the documentation). As long as 1 record in a journal file is considered "live" (e.g. an unconsumed message) then the whole journal file will remain. However, you can tune the impact of this to fit your environment (e.g. by lowering the journal-file-size, making compaction more aggressive, etc.). To be clear, if compaction runs and there is a journal file with only 1 "live" record that means all the other journal files are "full" and at most you will only ever have 1 journal file like that.
Also, you can configure max-disk-usage to block producers from sending more messages once disk utilization reaches a certain point.
Ultimately, if a consumer becomes inactive (for whatever reason) then the messages that consumer was supposed to receive will accumulate in the queues (and potentially on disk). If you want to prevent messages from accumulating in the first place you could implement flow control or blocking for producers. However, even if they do accumulate the file-based journal should be able to grow and shrink as needed.
If I understood correctly there is no way to guarantee that only the payload is kept. (as with mkahadb)
But we can limit the size of the pages and fix their number.
Considering the very large number of queues I have to manage, I think the best is to divide this into a cluster. But I am worried. because when an application is in maintenace (and I have 10 000) the messages of the others cannot be erased because the messages accumulate in a queue. It is clear that whatever the configuration in a few seconds I will crash or stop.
I am surprised to see stop communication between two applications because two others no longer communicate with each other. This is a strong limitation compared to ActiveMQ.
this will limit the problem but not solve it.
with mkahadb if I have 2 queues A and B, that A receives a message every second and B receives 5000/s and the consumers of B consume them immediately. the queue B is always empty or almost and occupies very little disk. If the consumer of A is stopped. the queue A increases but the queue B does not occupy more disk.
With Artemis if I reduce the journal size to 5000. Every second a journal file is full and deleted. If A stops, there must be 1 message from A in the journal. We therefore keep 5000 messages on the disk every second. Although queue B is almost always empty. if I reduce the journal size to 500 I keep less messages but it still grows 500 times faster than with mkahaDB. And if I reduce the journal to 1 to get the same result as with mkahadb, but I force Artemis to handle millions of files which collapses the perf.
I have the impression that Artémis is not made to have very large numbers of queues contrary to ActiveMQ.
thank
I've been read a lot about MongoDB recently, but one topic I can't find any clear material on, is how data is written to the journal and oplog.
So this is what I understand of the process so far, please correct me where I'm wrong
A client connect to mongod and performs a write. The write is stored in the socket buffer
When Mongo is available (not sure what available means at this point), data is written to the journal?
The mongoDB docs then say that writes every 60 seconds are flushed from the journal onto disk. By this I can only assume this mean written to the primary and the oplog. If this is the case, how to writes appear earlier than the 60 seconds sync interval?
Some time later, secondaries suck data from the primary or their sync source and update their oplog and databases. It seems very vague about when exactly this happens and what delays it.
I'm also wondering if journaling was disabled (I understand that's a really bad idea), at what point does the oplog and database get updated?
Lastly I'm a bit stumpted at which points in this process, the write locks get created. Is this just when the database and oplog are updated or at other times too?
Thanks to anyone who can shed some light on this or point me to some reading material.
Simon
Here is what happens as far as I understand it. I simplified a bit, but it should make clear how it works.
A client connects to mongo. No writes done so far, and no connection torn down, because it really depends on the write concern what happens now.Let's assume that we go with the (by the time of this writing) default "acknowledged".
The client sends it's write operation. Here is where I am really not sure. Either after this step or the next one the acknowledgement is sent to the driver.
The write operation is run through the query optimizer. It is here where the acknowledgment is sent because with in an acknowledged write concern, you may be returned a duplicate key error. It is possible that this was checked in the last step. If I should bet, I'd say it is after this one.
The output of the query optimizer is then applied to the data in memory Actually to the data of the memory mapped datafiles, to the memory mapped oplog and to the journal's memory mapped files. Queries are answered from this memory mapped parts or the according data is mapped to memory for answering the query. The oplog is read from memory if present, too.
Every 100ms in general the journal is synced to disk. The precise value is determined by a number of factors, one of them being the journalCommitInterval configuration parameter. If you have a write concern of journaled, the driver will be notified now.
Every syncDelay seconds, the current state of the memory mapped files is synced to disk I think the journal is truncated to the entries which weren't applied to the data yet, but I am not too sure of that since that it should basically never happen that data in the journal isn't yet applied to the current data.
If you have read carefully, you noticed that the data is ready for the oplog as early as it has been run through the query optimizer and was applied to the files mapped into memory. When the oplog entry is pulled by one of the secondaries, it is immediately applied to it's data of the memory mapped files and synced in the disk the same way as on the primary.
Some things to note: As soon as the relatively small data is written to the journal, it is quite safe. If a node goes down between two syncs to the datafiles, both the datafiles and the oplog can be restored from their last state in the datafiles and the journal. In general, the maximum data loss you can have is the operations recorded into the log after the last commit, 50ms in median.
As for the locks. If you have written carefully, there aren't locks imposed on a database level when the data is synced to disk. Write locks may be created in order to assure that only one thread at any given point in time modifies a given document. There are other write locks possible , but in general, they should be rather rare.
Write locks on the filesystem layer are created once, though only implicitly, iirc. During application startup, a lock file is created in the root directory of the dbpath. Any other mongod instance will refuse to do any operation on those datafiles while a valid lock exists. And you shouldn't either ;)
Hope this helps.
Using MongoDB (via PyMongo) in the default "acknowledged" write concern mode, is it the case that if I have a line that writes to the DB (e.g. a mapReduce that outputs a new collection) followed by a line that reads from the DB, the read will always see the changes from the write?
Further, is the above true for all stricter write concerns than "acknowledged," i.e. "journaled" and "replica acknowledged," but not true in the case of "unacknowledged"?
If the write has been acknowledged, it should have been written to memory, thus any subsequent query should get the current data. This won't work if you have a replica set and allow reads from secondaries.
Journaled writes are written to the journal file on disk, which protects your data in case of power / hardware failures, etc. This shouldn't have an impact on consistency, which is covered as soon as the data is in memory.
Any replica configuration in the write concern will ensure that writes need to be acknowledged by the majority / all nodes in the replica set. This will only make a difference if you read from replicas or to protect your data against unreachable / dead servers.
For example in case of WiredTiger database engine, there'll be a cache of pages inside memory that are periodically written and read from disk, depending on memory pressure. And, in case of MMAPV1 storage engine, there would be a memory mapped address space that would correspond to pages on the disk. Now, the secondary structure that's called a journal. And a journal is a log of every single thing that the database processes - notice that the journal is also in memory.
When does the journal gets written to the disk?
When the app request something to the mongodb server via a TCP connection - and the server is gonna process the request. And it's going to write it into the memory pages. But they may not write to the disk for quite a while, depending on the memory pressure. It's also going to update request into the journal. By default, in the MongoDB driver, when we make a database request, we wait for the response. Say an acknowledged insert/update. But we don't wait for the journal to be written to the disk. The value that represents - whether we're going to wait for this write to be acknowledged by the server is called w.
w = 1
j = false
And by default, it's set to 1. 1 means, wait for this server to respond to the write. By default, j equals false, and j which stands for journal, represents whether or not we wait for this journal to be written to be written to the disk before we continue. So, what are the implications of these defaults? Well, the implications are that when we do an update/insert - we're really doing the operation in memory and not necessarily to the disk. This means, of course, it's very fast. And periodically (every few seconds) the journal gets written to the disk. It won't be long, but during this window of vulnerability when the data has been written into the server's memory into the pages, but the journal has not yet been persisted to the disk, if the server crashed, we could lose the data. We also have to realize that, as a programmer just because the write came back as good and it was written successfully to the memory. It may not ever persist to disk if the server subsequently crashes. And whether or not this is the problem depends on the application. For some applications, where there are lots of writes and logging small amount of data, we might find that it's very hard to even keep up with the data stream, if we wait for the journal to get written to the disk, because the disk is going to be 100 times, 1,000 times slower than memory for every single write. But for other applications, we may find that it's completely necessary for us to wait for this to be journaled and to know that it's been persisted to the disk before we continue. So, it's really upto us.
The w and j value together are called write concern. They can be set in the driver, at the collection level, database level or a client level.
1 : wait for the write to be acknowledged
0 : do not wait for the write to be acknowledged
TRUE : sync to journal
FALSE : do not sync to journal
There are also other values for w as well that also have some significance. With w=1 & j=true we could make sure that those writes have been persisted to disk. Now, if the writes have been written to the journal, then what happens is if the server crashes, then even though the pages may not be written back to disk yet, on recovery, the server can look at the journal on the disk - the mongod process and recreate all the writes that were not yet persisted to the pages. Because, they've been written to the journal. So, that's why this gives us a greater level of safety.
I had MSMQ fill up on me the other day (forgot to turn off journalling on a queue) and saw in the storage folder that there seems to be different types of .mp files.
Some were p*.mp and are 4096K, and others are l*.mp and are 8K.
What are the differences between these files?
From the plumber's mate:
All MSMQ messages are written to
memory mapped files in the Storage
directory:
R*.MQ files are for Express messages
P*.MQ files are for Persistent (Recoverable) messages, including
Transactional messages
J*.MQ files are for Journal messages
L*.MQ files are bitmap indexes to the Persistent and Journal files