MSMQ .mq files - What are the different types?

I had MSMQ fill up on me the other day (I forgot to turn off journalling on a queue) and saw in the storage folder that there seem to be different types of .mq files.
Some were p*.mq and are 4096K, and others are l*.mq and are 8K.
What are the differences between these files?

From the plumber's mate:
All MSMQ messages are written to memory mapped files in the Storage directory:
- R*.MQ files are for Express messages
- P*.MQ files are for Persistent (Recoverable) messages, including Transactional messages
- J*.MQ files are for Journal messages
- L*.MQ files are bitmap indexes to the Persistent and Journal files
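To see which message type is eating the disk, a quick scan of the Storage directory grouped by filename prefix will show it. A minimal Java sketch; the storage path below is the usual Windows default, but yours may differ:

    import java.io.File;
    import java.util.HashMap;
    import java.util.Map;

    public class MsmqStorageScan {
        public static void main(String[] args) {
            // Default MSMQ storage path; adjust for your machine.
            File storage = new File("C:\\Windows\\System32\\msmq\\storage");
            File[] files = storage.listFiles();
            if (files == null) {
                System.err.println("Storage directory not found");
                return;
            }
            Map<Character, Long> bytesByPrefix = new HashMap<>();
            for (File f : files) {
                if (f.isFile()) {
                    char prefix = Character.toUpperCase(f.getName().charAt(0));
                    bytesByPrefix.merge(prefix, f.length(), Long::sum);
                }
            }
            // R = express, P = persistent, J = journal, L = bitmap index
            bytesByPrefix.forEach((prefix, bytes) ->
                    System.out.printf("%c*: %,d bytes%n", prefix, bytes));
        }
    }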

Related

ActiveMQ Artemis data journal folder

I am trying to understand how ActiveMQ Artemis manages the journal files (under data/journal) and when it creates new ones. I read the documentation, but it wasn't clear how the files are created. My broker.xml has simple settings (I won't be able to share it, unfortunately). Here are a few:
- journal-min-files: 2
- journal-pool-files: 50
- journal-file-size: left at the default of 10MB
ActiveMQ Artemis starts, and I see 2 files already created under /data/journal. I then run a request that posts a lot of messages in a very short time. These messages are being actively consumed: I am publishing a lot of messages, but they are not accumulating because the consumer keeps up with them. However, this doesn't cause the files to grow enough to recreate the space issue.
As the message volume goes up, I don't see the number of files going up. It rises to 12 files and stays there for a long time.
I can understand that the message count alone is not sufficient to trigger additional files if only the latest journal file is being written to. However, I see that the other 11 files also have updated timestamps, which makes me think they are being rotated.
My paging directory is empty.
I am trying to understand why the journal is not growing despite the message volume.
From what I can tell, message consumption has reached a break-even point with message production, so messages are not accumulating on the broker and therefore the journal is not actually growing in size. The journal files are simply being re-used as messages are added and acknowledged.
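For what it's worth, the relationship between those three settings is easy to see in Artemis's embedded Java configuration API, which mirrors the broker.xml elements. A sketch with the values from the question, purely for illustration:

    import org.apache.activemq.artemis.core.config.Configuration;
    import org.apache.activemq.artemis.core.config.impl.ConfigurationImpl;
    import org.apache.activemq.artemis.core.server.embedded.EmbeddedActiveMQ;

    public class JournalConfigSketch {
        public static void main(String[] args) throws Exception {
            Configuration config = new ConfigurationImpl()
                    .setPersistenceEnabled(true)
                    .setJournalDirectory("data/journal")
                    // Files pre-created at startup (the 2 files seen initially).
                    .setJournalMinFiles(2)
                    // Reusable files kept in a pool instead of being deleted,
                    // which is why timestamps update while the count stays flat.
                    .setJournalPoolFiles(50)
                    // 10MB per journal file (the default).
                    .setJournalFileSize(10 * 1024 * 1024)
                    .setSecurityEnabled(false);
            config.addAcceptorConfiguration("in-vm", "vm://0");

            EmbeddedActiveMQ broker = new EmbeddedActiveMQ();
            broker.setConfiguration(config);
            broker.start();
        }
    }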

JBoss HornetQ/ActiveMQ Artemis saving queue message to file system

Can someone please help me understand the impact of saving HornetQ/ActiveMQ Artemis messages to the file system and bypassing the queue every time?
My message is more than 2GB, so I run into the "Maximum size 2GB exceeded" exception in HornetQ described here. I was therefore planning not to add the message to the queue, but to write it manually to disk, pass the path of the file in a header, and read the message back from the file. I don't know the performance impact, so I am asking: if I do this for all messages, even those under 2GB, will there be any performance impact?
Given the information you've provided, I don't think anybody but you can determine the "performance impact" of manually writing the file to disk vs. sending the file to the broker.
Generally speaking, you will save the time required to send the file to the broker in the first place, but you don't indicate how fast the hard drives are. If the hard drive on the broker is much faster than the hard drive on the client, then it may take longer overall to write the file to disk manually.
Also, if the network is slow between the clients and the broker but fast between the clients and the shared drive where you're going to write the file, then it may be faster overall to write the file to disk manually.
Ultimately it's going to be up to you to test the performance impact of your changes.
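If you do go the manual route, the usual shape is a "claim check": write the payload to a shared location and send only its path. A minimal JMS sketch against Artemis; the property name filePath, the queue name, the shared directory, and the broker URL are assumptions for illustration:

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory;

    public class ClaimCheckSend {
        public static void main(String[] args) throws Exception {
            // Write the large payload to a shared drive that both the
            // producer and consumer can reach (path is illustrative).
            Path payload = Paths.get("/mnt/shared/payloads/msg-0001.bin");
            Files.createDirectories(payload.getParent());
            Files.write(payload, new byte[]{ /* ... large content ... */ });

            ConnectionFactory cf =
                    new ActiveMQConnectionFactory("tcp://localhost:61616");
            try (Connection conn = cf.createConnection()) {
                Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
                Queue queue = session.createQueue("largeMessages");
                MessageProducer producer = session.createProducer(queue);

                // Send only the file path; the consumer reads the payload
                // back from disk using this "claim check".
                TextMessage msg = session.createTextMessage();
                msg.setStringProperty("filePath", payload.toString());
                producer.send(msg);
            }
        }
    }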

How to modify the configuration of Kafka to process large amounts of data

I am using kafka_2.10-0.10.0.1. I have two questions:
- I want to know how I can modify the default configuration of Kafka to process large amounts of data with good performance.
- Is it possible to configure Kafka to process records in memory without storing them on disk?
Thank you.
Is it possible to configure Kafka to process records in memory without storing them on disk?
No. Kafka is all about storing records reliably on disk and then reading them back quickly off disk. In fact, its documentation says:
As a result of taking storage seriously and allowing the clients to control their read position, you can think of Kafka as a kind of special purpose distributed filesystem dedicated to high-performance, low-latency commit log storage, replication, and propagation.
You can read more about its design here: https://kafka.apache.org/documentation/#design. The implementation section is also quite interesting: https://kafka.apache.org/documentation/#implementation.
That said, Kafka is also all about processing large amounts of data with good performance. In 2014 it could handle 2 million writes per second on three cheap machines: https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines. More links about performance (a producer-tuning sketch follows the list):
https://docs.confluent.io/current/kafka/deployment.html
https://www.confluent.io/blog/optimizing-apache-kafka-deployment/
https://community.hortonworks.com/articles/80813/kafka-best-practices-1.html
https://www.cloudera.com/documentation/kafka/latest/topics/kafka_performance.html
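As a starting point for the first question, much of the throughput tuning happens on the producer. A minimal sketch of commonly adjusted producer settings; the values are illustrative, not recommendations, so test against your own workload:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class TunedProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            // Batch more records per request (default is 16384 bytes).
            props.put("batch.size", "131072");
            // Wait up to 20ms to fill batches instead of sending immediately.
            props.put("linger.ms", "20");
            // Compress batches to move more data per disk/network write.
            props.put("compression.type", "snappy");
            // acks=1 trades some durability for throughput; use "all"
            // if you cannot afford to lose records.
            props.put("acks", "1");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("my-topic", "key", "value"));
            }
        }
    }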

How does Kafka guarantee sequential disk access?

I'm a newbie to Kafka. When I read the Kafka documentation, I saw that Kafka performs well because of sequential disk access.
But how is that possible? In Java (or any other language), if I use file I/O, the OS handles it, and I can't know whether it stores the files I write in contiguous sectors or scatters them across multiple sectors. So, in my opinion, Kafka cannot always claim that its disk access is sequential.
Am I right or not?
Kafka does not always access the disk sequentially, but it does some things that make it much more likely that disk access is sequential. All Kafka messages are stored in larger segment files (1GB each by default), and since Kafka messages are not deleted when consumed (unlike in other message brokers), Kafka does not end up creating a fragmented filesystem over time by continuously creating and deleting many variable-length files. Instead it creates segment files and appends to each one until it reaches 1GB (a configurable limit). Only when all messages in a segment expire will it delete the entire 1GB segment. This means that these 1GB sections of disk are often actually laid out as contiguous blocks. It is a recommended best practice to keep these Kafka commit log files on a dedicated filesystem so it does not get fragmented by other apps reading and writing variable-length files into the same filesystem.
More importantly, most reading and writing to these segment files is sequential and goes through the OS page cache, which reduces disk I/O even further by caching the most often accessed pages in memory. This is why it is recommended to tune the kernel's swappiness to 1, to reduce the likelihood that these cached pages get swapped out of memory.
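To make the append-only pattern concrete, here is a much-simplified sketch (not Kafka's actual code) of writing to a segment file opened in append mode; every write lands at the end of the file, which is what keeps the access pattern sequential:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    public class AppendOnlySegment {
        // Simplified stand-in for Kafka's 1GB segment size limit.
        private static final long SEGMENT_SIZE = 1024L * 1024 * 1024;

        public static void main(String[] args) throws IOException {
            try (FileChannel segment = FileChannel.open(
                    Paths.get("segment-00000000.log"),
                    StandardOpenOption.CREATE,
                    StandardOpenOption.WRITE,
                    StandardOpenOption.APPEND)) {
                ByteBuffer record = ByteBuffer.wrap(
                        "message-bytes".getBytes(StandardCharsets.UTF_8));
                // Appends always land at the end of the file, so the OS can
                // allocate blocks contiguously and the page cache absorbs
                // most of the I/O. When the size limit is reached, a new
                // segment file is started; old segments are deleted whole.
                if (segment.size() < SEGMENT_SIZE) {
                    segment.write(record);
                }
            }
        }
    }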

Can you transfer the journal files between instances of HornetQ?

Can the journal files from one instance of HornetQ be copied to and read by another instance?
You can do that as long as you copy the bindings journal, the message journal, and the JMS journal together.
There are references to queue IDs from the message journal that are held in the bindings journal.
The JMS journal is not that important in this sense, but I would keep it as well.
I wouldn't do the copy while the journals are being written, as you may get files out of sequence. A journal is a sequential log, and doing these operations in the middle may lead to undesirable results.
You're safe to do it when the journal is not being written (either server idle or stopped).
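For illustration, a minimal Java sketch of copying the three journals together while the source server is stopped; the paths and directory names are assumptions based on a typical HornetQ data layout, so check your configuration for the real locations:

    import java.io.IOException;
    import java.nio.file.DirectoryStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class CopyJournals {
        public static void main(String[] args) throws IOException {
            // Source and destination data directories (illustrative paths).
            Path src = Paths.get("/opt/hornetq-a/data");
            Path dst = Paths.get("/opt/hornetq-b/data");
            // Copy bindings, message journal, and JMS journal together,
            // and only while the source server is idle or stopped.
            for (String dir : new String[]{"bindings", "journal", "jms"}) {
                Path to = dst.resolve(dir);
                Files.createDirectories(to);
                try (DirectoryStream<Path> files =
                        Files.newDirectoryStream(src.resolve(dir))) {
                    for (Path f : files) {
                        Files.copy(f, to.resolve(f.getFileName()));
                    }
                }
            }
        }
    }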