MSMQ consuming large amounts of memory when processing messages with NServiceBus

I have two Windows services that use NServiceBus. One writes messages to the queue and the other reads from it and does some processing. All the queues are transactional, and the NServiceBus endpoints are configured as below.
Configure.With()
    .IsTransactional(true)
    .IsolationLevel(IsolationLevel.ReadCommitted)
    .MsmqTransport()
    .RunTimeoutManager()
    .UseInMemoryTimeoutPersister()
    .MsmqSubscriptionStorage()
    .DisableRavenInstall()
    .JsonSerializer();
The issue is that when a large number of messages (170,000+) are queued, the MSMQ service (mqsvc.exe) chews up quite a bit of memory (1.5 - 2.0 GB), and that memory doesn't get released for at least 5 - 6 hours. The average message size is around 5 - 10 KB, and it seems the more messages you queue, the more memory it uses. The memory consumption of the NServiceBus-based Windows services themselves stays within perfectly acceptable limits (50 - 100 MB) and does not increase no matter how many messages they process.
Any ideas on why MSMQ would use this much memory and take so long to release it?
Thanks heaps.

This is perfectly normal. MSMQ allocates storage in 4 MB blocks of memory which map to the files in the Storage folder. 170,000 messages at 5-10 KB each is 0.85-1.7 GB, so it's no surprise you're seeing that much virtual memory allocated. To reduce the overhead of deleting and creating files as messages arrive and are removed, the storage files are kept for 6 hours; after this period, the empty files are deleted. You can configure this, as discussed in my blog post:
Forcing MSMQ to clean up its storage files
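For reference, the cleanup interval is controlled by a registry value on the machine running the MSMQ service (a sketch, assuming the value described in the post; it is in milliseconds, and the Message Queuing service must be restarted to pick up a change):

Key:   HKLM\SOFTWARE\Microsoft\MSMQ\Parameters
Value: MessageCleanupInterval (DWORD) = 3600000    <- hypothetical: 1 hour instead of the default 21600000 (6 hours)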

On the off-chance it helps anyone: this post on Google Groups by MSMQ legend John Breakwell documents how to completely clean down all the messages in storage, which is sometimes desirable or necessary:
https://groups.google.com/d/msg/microsoft.public.msmq.performance/jByfXUwXFw8/i1hVP1WJpJgJ
I had 8 GB of files but no messages in any queues, and the MSMQ service would take around 2 hours just to enter the started state. Purging any queue would take tens of minutes and cause massive memory spikes, which did not then get released for days, if ever.
If you're ever in this situation, rather than re-installing Message Queuing you can just follow these steps (scripted after the list):
Stop Message Queuing service
Go to the MSMQ storage location (usually C:\Windows\System32\msmq\storage)
Delete ONLY the P*.MQ, J*.MQ, R*.MQ, and L*.MQ files
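In script form, the same cleanup looks roughly like this (a sketch; it assumes the default storage path shown above, an elevated prompt, and that you accept losing every message in storage):

net stop MSMQ
del /q C:\Windows\System32\msmq\storage\P*.MQ
del /q C:\Windows\System32\msmq\storage\J*.MQ
del /q C:\Windows\System32\msmq\storage\R*.MQ
del /q C:\Windows\System32\msmq\storage\L*.MQ
net start MSMQ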

Related

ActiveMQ Artemis data journal folder

I am trying to understand how ActiveMQ Artemis manages the journal files (under data/journal) and when it creates new ones. I read the documentation, but it wasn't clear how the files are created. My broker.xml has simple settings (unfortunately I won't be able to share it). Here are a few:
journal-min-files: 2
journal-pool-files: 50
journal-file-size: left at the default of 10MB
ActiveMQ Artemis starts, and I see 2 files already created under /data/journal. I then run a request that posts a lot of messages in a very short time, and these messages are being actively consumed. I am publishing a lot of messages, but they are not accumulating because the consumer keeps up, so this doesn't cause the files to grow and recreate the space issue I'm chasing.
As the message volume goes up I don't see the number of files going up; it rises to 12 files and stays there for a long time.
I could understand the message count not being sufficient to trigger additional files if only the latest journal file were being written to. However, I see that all of the files have updated timestamps, making me think they are being rotated.
My paging directory is empty.
I am trying to understand why the journal is not growing despite the message volume.
From what I can tell, message consumption has reached a break-even point with message production, so messages are not accumulating on the broker and therefore the journal is not actually growing in size. The journal files are simply being re-used as messages are added & acknowledged.
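For anyone wanting to experiment, these are the broker.xml elements involved (a sketch using the values from the question; the element names are from the Artemis persistence documentation):

<core xmlns="urn:activemq:core">
  <journal-min-files>2</journal-min-files>      <!-- files pre-created at startup -->
  <journal-pool-files>50</journal-pool-files>   <!-- files kept around for re-use -->
  <journal-file-size>10M</journal-file-size>    <!-- size of each journal file -->
</core>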

How does persistence work in ActiveMQ Artemis?

In ActiveMQ 5.x, when using KahaDB for persistence, all the files are managed in a single database. This can have serious consequences.
I have hundreds of queues that see millions of messages per day. If the consumer of a queue is temporarily stopped for maintenance, the queues continue to fill and empty, and the one whose consumer is suspended sees its messages accumulate. But on disk it is different: KahaDB does mark the deleted (consumed) messages, but it cannot free the space as long as a more recent message is kept in the database, which is the case with the messages accumulating in the suspended queue.
Very quickly the disk fills up.
To remedy this, you have to change the configuration and use mKahaDB. In that case there is one database per queue, and therefore on disk only the suspended queue takes up space.
I am considering switching to Artemis, but its persistence has been completely redesigned. So what happens in terms of disk occupancy when a consumer is suspended?
This question is pretty broad, but I'll take a crack at it...
By default ActiveMQ Artemis uses a file-based journal. The journal consists of a pool of files that can grow and shrink based on configuration (see journal-min-files and journal-pool-files in the documentation). The size of each file is also configurable (i.e. via journal-file-size).
An initial pool of files will be created when the broker starts and as messages are stored and the initial pool of files fills up then additional files will be created. As messages are consumed the pool can shrink through a process called "compaction" which is also configurable (see journal-compact-min-files and journal-compact-percentage in the documentation). As long as 1 record in a journal file is considered "live" (e.g. an unconsumed message) then the whole journal file will remain. However, you can tune the impact of this to fit your environment (e.g. by lowering the journal-file-size, making compaction more aggressive, etc.). To be clear, if compaction runs and there is a journal file with only 1 "live" record that means all the other journal files are "full" and at most you will only ever have 1 journal file like that.
Also, you can configure max-disk-usage to block producers from sending more messages once disk utilization reaches a certain point.
Ultimately, if a consumer becomes inactive (for whatever reason) then the messages that consumer was supposed to receive will accumulate in the queues (and potentially on disk). If you want to prevent messages from accumulating in the first place you could implement flow control or blocking for producers. However, even if they do accumulate the file-based journal should be able to grow and shrink as needed.
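As a concrete illustration of the tuning knobs mentioned above, the relevant broker.xml entries look roughly like this (a sketch; the values shown are the documented defaults, not recommendations):

<core xmlns="urn:activemq:core">
  <!-- compact once at least 10 files exist and no more than 30% of the data is live -->
  <journal-compact-min-files>10</journal-compact-min-files>
  <journal-compact-percentage>30</journal-compact-percentage>
  <!-- block producers once disk utilization reaches 90% -->
  <max-disk-usage>90</max-disk-usage>
</core>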
If I understood correctly, there is no way to guarantee that only the payload is kept (as with mKahaDB).
But we can limit the size of the journal files and fix their number.
Considering the very large number of queues I have to manage, I think the best approach is to divide this into a cluster. But I am worried, because when an application is down for maintenance (and I have 10,000 of them), the messages of the others cannot be erased, since messages keep accumulating in one queue. It is clear that, whatever the configuration, I will crash or grind to a halt within seconds.
I am surprised to see communication between two applications stop because two other applications no longer communicate with each other. This is a strong limitation compared to ActiveMQ.
This will limit the problem but not solve it.
With mKahaDB, if I have two queues A and B, where A receives one message per second and B receives 5,000/s whose consumers consume them immediately, queue B is always (almost) empty and occupies very little disk. If the consumer of A is stopped, queue A grows but queue B does not occupy more disk.
With Artemis, if I size the journal files at 5,000 messages, every second a journal file is filled and deleted. If A stops, there will be 1 message from A in each journal file, so we keep 5,000 messages on disk for every elapsed second, even though queue B is almost always empty. If I reduce the journal file size to 500 I keep fewer messages, but disk usage still grows 500 times faster than with mKahaDB. And if I reduce the journal to 1 message to get the same behaviour as mKahaDB, I force Artemis to handle millions of files, which collapses performance.
I have the impression that Artemis is not made for very large numbers of queues, contrary to ActiveMQ.
Thanks.

Mirth performance benchmark

We are using Mirth Connect for message transformation from HL7 to text, storing the transformed messages in an Azure SQL database. Our current performance is 45,000 messages per hour.
The machine configuration is 8 GB RAM and a 2-core CPU. The memory assigned to Mirth is -Xms = 6122 MB.
We don't have any idea what the performance parameters for Mirth could be with the above configuration. Does anyone have performance benchmarks for Mirth Connect?
I'd recommend looking into the Max Processing Threads option in version 3.4 and above. It's configurable in the Source Settings (Source tab). By default it's set to 1, which means only one message can process through the channel's main processing thread at any given time. This is important for certain interfaces where order of messages is paramount, but obviously it limits throughput.
Note that whatever client is sending your channel messages also needs to be reconfigured to send multiple messages in parallel. For example if you have a single-threaded process that is sending your channel messages via TCP/MLLP one after another in sequence, increasing the max processing threads isn't necessarily going to help because the client is still single-threaded. But, for example, if you stand up 10 clients all sending to your channel simultaneously, then you'll definitely reap the benefits of increasing the max processing threads.
If your source connector is a polling type, like a File Reader, you can still benefit from this by turning the Source Queue on and increasing the Max Processing Threads. When the source queue is enabled and you have multiple processing threads, multiple queue consumers are started and all read and process from the source queue at the same time.
Another thing to look at is destination queuing. In the Advanced (wrench icon) queue settings, there is a similar option to increase the number of Destination Queue Threads. By default when you have destination queuing enabled, there's just a single queue thread that processes messages in a FIFO sequence. Again, good for message order but hampers throughput.
If you do need messages to be ordered and want to maximize parallel throughput (AKA have your cake and eat it too), you can use the Thread Assignment Variable in conjunction with multiple destination Queue Threads. This allows you to preserve order among messages with the same unique identifier, while messages pertaining to different identifiers can process simultaneously. A common use-case is to use the patient MRN for this, so that all messages for a given patient are guaranteed to process in the order they were received, but messages longitudinally across different patients can process simultaneously.
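As a sketch of that MRN use-case (the variable name and HL7 path here are hypothetical): set the destination's Thread Assignment Variable to mrn, and populate it per message in a source transformer:

// Mirth source transformer (JavaScript): group messages by patient MRN
// so each patient's messages stay in order across the queue threads
channelMap.put('mrn', msg['PID']['PID.3']['PID.3.1'].toString());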
We are using an AWS EC2 c4.4xlarge instance to test a bare-bones proof-of-concept performance limit. We got about 50 msgs/sec without obvious bottlenecks on CPU/memory/network/disk IO/DB IO etc. We want to push the limits higher. Please share your observations if any.
We run the same process: Mirth -> Azure SQL Database. We're going through performance testing right now and have been stuck at 12 - 15 messages/second (43,000 - 54,000 per hour).
We've run tests on each channel and found this:
1 channel, source: file reader -> destination: Azure SQL DB was about 36k per hour
2 channels, source: file reader -> destination: Azure SQL DB was about 59k per hour
3 channels, source: file reader -> destination: Azure SQL DB was about 80k per hour
We've added multi-threading (2, 4, 8 threads) to both the source and destination on one channel with no performance increase. Mirth is running on 8 GB of memory and 2 cores with the heap size set to 2048 MB.
We are now going to run a few tests with Mirth on similar "hardware" to a c4.4xlarge, which in Azure is 16 cores and 32 GB of memory. There is 200 GB of SSD available as well.
Our goal is 100k messages per hour per channel.

JBoss ActiveMQ 6.1.0 queue message processing slows down after 10,000 messages

Below is the configuration:
2 JBoss application nodes
5 listeners on each application node with 50 threads each; the listeners support clustering and are set up active-active, so they run on both app nodes
The listener simply gets the message and logs the information into a database
50,000 messages are posted into ActiveMQ using JMeter.
Here is the observation on first execution:
Total 50000 messages are consumed in approx 22 mins.
first 0-10000 messages consumed in 1 min approx
10000-20000 messages consumed in 2 mins approx
20000-30000 messages consumed in 4 mins approx
30000-40000 messages consumed in 6 mins approx
40000-50000 messages consumed in 8 mins
So we see the message consumption time is increasing with increasing number of messages.
Second execution without restarting any of the servers:
50000 messages consumed in 53 mins approx!
But after deleting the data folder of ActiveMQ and restarting it, performance improves again, then degrades as more data enters the queue!
I tried multiple configurations in activemq.xml, but with no success...
Has anybody faced a similar issue and found a solution? Let me know. Thanks.
I've seen similar slowdowns in our production systems when pending message counts go high. If you're flooding the queues then the MQ process can't keep all the pending messages in memory, and has to go to disk to serve a message. Performance can fall off a cliff in these circumstances. Increase the memory given to the MQ server process.
It also looks as though the disk storage layout is not particularly efficient - perhaps each message is stored as a file in a single directory? This can make access time rise as traversing the directory takes longer.
50000 messages in > 20 mins seems very low performance.
The following configuration works well for me (these are just pointers; you may already have tried some of them, but see if they work for you):
1) Server and queue/topic policy entry
// server
server.setDedicatedTaskRunner(false);
// queue policy entry
policyEntry.setMemoryLimit(queueMemoryLimit); // 32mb
policyEntry.setOptimizedDispatch(true);
policyEntry.setLazyDispatch(true);
policyEntry.setReduceMemoryFootprint(true);
policyEntry.setProducerFlowControl(true);
policyEntry.setPendingQueuePolicy(new StorePendingQueueMessageStoragePolicy());
2) If you are using KahaDB for persistence, use the per-destination adapter (MultiKahaDBPersistenceAdapter); see the activemq.xml sketch after this list. This keeps the storage folders separate for each destination and reduces synchronization effort. Also, if you are not worried about abrupt server restarts (due to any technical reason), you can reduce the disk sync effort with
kahaDBPersistenceAdapter.setEnableJournalDiskSyncs(false);
3) Try increasing the memory usage, temp and storage disk usage limits at the server level (also shown in the sketch below).
4) If possible increase prefetchSize in prefetch policy. This will improve performance but also increases the memory footprint of consumers.
5) If possible use transactions in consumers. This will help to reduce the message acknowledgement handling and disk sync efforts by server.
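For points 2 and 3, the corresponding activemq.xml looks roughly like this (a sketch; the limits are hypothetical and should be sized for your hardware):

<broker xmlns="http://activemq.apache.org/schema/core">
  <persistenceAdapter>
    <mKahaDB directory="${activemq.data}/mkahadb">
      <filteredPersistenceAdapters>
        <!-- one KahaDB instance per destination (point 2) -->
        <filteredKahaDB perDestination="true">
          <persistenceAdapter>
            <!-- fewer disk syncs, at the cost of durability on abrupt restarts -->
            <kahaDB enableJournalDiskSyncs="false"/>
          </persistenceAdapter>
        </filteredKahaDB>
      </filteredPersistenceAdapters>
    </mKahaDB>
  </persistenceAdapter>
  <!-- broker-level memory/store/temp limits (point 3) -->
  <systemUsage>
    <systemUsage>
      <memoryUsage><memoryUsage limit="1 gb"/></memoryUsage>
      <storeUsage><storeUsage limit="50 gb"/></storeUsage>
      <tempUsage><tempUsage limit="10 gb"/></tempUsage>
    </systemUsage>
  </systemUsage>
</broker>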
Point 5 mentioned by @hemant1900 solved the problem :) Thanks.
5) If possible use transactions in consumers. This will help to reduce the message acknowledgement handling and disk sync efforts by server.
The problem was in my code. I had not used a transaction to persist the data in the consumer, which is bad programming anyway.. I know :(
But I didn't expect that it could cause this issue.
Now 50,000 messages are getting processed in less than 2 minutes.
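For anyone else hitting this, a minimal transacted JMS consumer looks roughly like the following (a sketch; the broker URL, queue name and batch size are placeholders):

import javax.jms.*;
import org.apache.activemq.ActiveMQConnectionFactory;

public class TransactedConsumer {
    public static void main(String[] args) throws JMSException {
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
        Connection connection = factory.createConnection();
        connection.start();
        // transacted session: message receipt and database work commit as one
        // unit, so the broker can batch acknowledgement handling and disk syncs
        Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
        MessageConsumer consumer = session.createConsumer(session.createQueue("test.queue"));
        Message message;
        int inBatch = 0;
        while ((message = consumer.receive(1000)) != null) {
            // ... log the message information into the database here ...
            if (++inBatch == 100) {   // commit every 100 messages
                session.commit();
                inBatch = 0;
            }
        }
        session.commit();             // flush the final partial batch
        connection.close();
    }
}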

How to increase number of messages that can be stored in MSMQ

We have a number of MSMQ queues throughout our system, both private and public. Sometimes a Windows service that reads from a queue will crash, and so messages build up in that queue. Once the queue gets to a certain size (maybe 60K messages), all queues on that server stop working, throwing errors about insufficient resources.
My question is, how are the queues really working behind the scenes, are they storing messages in RAM or on the hard drive? Does it run out of resources and crash when the server runs out of RAM? If it's using some allocated space on the hard drive, is there a way to increase the allowable size? If it's using RAM, can I simply add RAM to the servers and then that will increase the allowable size?
I need to make sure that when a service goes down, we can handle storing 100K or 200K messages in that queue while we work on fixing the service, as those messages are critical to our business.
Here is an article on MSDN that seems to address your question (as John points out below, this only applies to Windows 2000 and so should probably be ignored by most people): Resource management in MSMQ applications. Specifically:
For MSMQ 1.0 and MSMQ 2.0, the combined size of messages capable of being stored on one machine is not limited to the amount of RAM in the machine or the size of the hard disk, but to the amount of virtual address space provided to the MSMQ service by the operating system (this limitation has been lifted in MSMQ 3.0). Each process in an x86 machine is allotted a virtual 4 GB of addressable memory. 2GB is reserved for use in kernel mode and 2GB for user mode. The MSMQ Queue Manager operates in user mode and therefore has an addressable 2GB of virtual address space to work with. Each message's data is stored in RAM, which is backed up by the system's paging file or memory mapped files. MSMQ uses memory mapped files to store both express and recoverable messages. Since we are limited to 2GB of addressable memory, we are limited to 2GB worth of messages on a disk. When you take into account the memory utilized by MSMQ code and its internal data structures, as well as file allocation to store message files on disk, we end up with between 1.4GB and 1.6GB worth of messages that can be stored on disk.
Note: This limitation of 1.6GB can be raised to approximately 2.6GB by enabling 3GB tuning on the MSMQ Service. See Q171793 for more information on how to enable 3GB tuning.
Edit: the tuning link seems to be broken. I believe it should be pointing here.
In terms of later versions of MSMQ, John discusses the issue in a blog post.
Maximum number of messages
This one is not as simple to work out. From my Insufficient Resources post we know that each message needs 75 bytes of kernel memory for indexing so, for example, 2 million messages would require roughly 150 megabytes. It would seem, therefore, that all you need to do is add more RAM. After looking at a comparison of 32-bit and 64-bit memory architectures, though, you will quickly have to move to the 64-bit platform to take advantage of your investment, as 32-bit machines max out at 450 MB of paged pool memory regardless of the amount of RAM fitted.
But, again, if you are trying to work out what amount of RAM will generate the paged pool memory required to accommodate a billion MSMQ messages, your design spec is up for some serious reviewing.
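Applying that arithmetic to this question's target: 100,000 - 200,000 messages x 75 bytes is only about 7.5 - 15 MB of kernel memory for indexing, so the index overhead is negligible even against the 450 MB 32-bit paged-pool ceiling. The practical constraint is the combined size of the message bodies, which counts against the storage limits quoted above.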
Not sure about the in-depth answer, but on a surface level anyhow, a non-transactional queue stores messages in memory, whereas a transactional queue stores messages on disk.
UPDATE
As John states below, all messages are held on disk whether durable or non-durable queues are used.