How to increase the number of messages that can be stored in MSMQ

We have a number of MSMQ queues throughout our system, both private and public. Sometimes a Windows service that reads from a queue will crash, and so messages will build up in that queue. Once the queue gets to a certain size (maybe 60K messages), all queues on that server stop working and throw errors about insufficient resources.
My question is: how do the queues really work behind the scenes? Are they storing messages in RAM or on the hard drive? Does MSMQ run out of resources and crash when the server runs out of RAM? If it's using some allocated space on the hard drive, is there a way to increase the allowable size? If it's using RAM, can I simply add RAM to the servers to increase the allowable size?
I need to make sure that when a service goes down, we can handle storing 100K or 200K messages in that queue while we work on fixing the service, as those messages are critical to our business.

Here is an article on MSDN that seems to address your question (as John points out below, this only applies to Windows Server 2000 so should probably be ignored by most people): Resource management in MSMQ applications. Specifically:
For MSMQ 1.0 and MSMQ 2.0, the combined size of messages capable of being stored on one machine is not limited to the amount of RAM in the machine or the size of the hard disk, but to the amount of virtual address space provided to the MSMQ service by the operating system (this limitation has been lifted in MSMQ 3.0). Each process in an x86 machine is allotted a virtual 4 GB of addressable memory. 2GB is reserved for use in kernel mode and 2GB for user mode. The MSMQ Queue Manager operates in user mode and therefore has an addressable 2GB of virtual address space to work with. Each message's data is stored in RAM, which is backed up by the system's paging file or memory mapped files. MSMQ uses memory mapped files to store both express and recoverable messages. Since we are limited to 2GB of addressable memory, we are limited to 2GB worth of messages on a disk. When you take into account the memory utilized by MSMQ code and its internal data structures, as well as file allocation to store message files on disk, we end up with between 1.4GB and 1.6GB worth of messages that can be stored on disk.
Note: This limitation of 1.6GB can be raised to approximately 2.6GB by enabling 3GB tuning on the MSMQ Service. See Q171793 for more information on how to enable 3GB tuning.
Edit: the tuning link seems to be broken. I believe it should be pointing here.
In terms of later versions of MSMQ, John discusses the issue in a blog post.
Maximum number of messages
This one is not as simple to work out. From my Insufficient Resources post we know that each message needs 75 bytes of kernel memory for indexing so, for example, 2 million messages would require roughly 150 megabytes. It would seem, therefore, that all you need to do is add more RAM. After looking at a comparison of 32-bit and 64-bit memory architectures, though, you will quickly have to move to the 64-bit platform to take advantage of your investment as 32-bit machines max out at 450 MB of paged pool memory regardless of the amount of RAM fitted.
But, again, if you are trying to work out what amount of RAM will generate the paged pool memory required to accommodate a billion MSMQ messages, your design spec is up for some serious reviewing.
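To put numbers on the quoted figures, here is a rough sketch of the indexing arithmetic. It only uses the 75 bytes per message and the 450 MB paged pool ceiling mentioned above; treating a megabyte as 10^6 bytes and pretending the whole paged pool is available to MSMQ are my own simplifying assumptions, so the second result is an upper bound at best.

```python
# Figures taken from the quoted blog post.
BYTES_PER_MESSAGE_INDEX = 75
# Assumption: "450 MB" is read as decimal megabytes.
PAGED_POOL_LIMIT_32BIT = 450 * 10**6


def index_memory_bytes(message_count):
    """Kernel memory needed just to index this many messages."""
    return message_count * BYTES_PER_MESSAGE_INDEX


def max_messages(pool_bytes):
    """Upper bound on messages if the entire pool were used for indexing."""
    return pool_bytes // BYTES_PER_MESSAGE_INDEX


print(index_memory_bytes(2_000_000) / 10**6)  # 150.0 (MB, as in the quote)
print(index_memory_bytes(200_000) / 10**6)    # 15.0 (MB for the 200K case)
print(max_messages(PAGED_POOL_LIMIT_32BIT))   # 6000000
```

By this estimate, the 100K to 200K messages from the question need only a few megabytes of index memory, so the index alone is unlikely to be what runs out.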

Not sure about the in-depth answer, but on a surface level anyhow, a non-transactional queue stores messages in memory, whereas a transactional queue stores messages on disk.
UPDATE
As John states below, all messages are held on disk whether durable or non-durable queues are used.

Related

Memory Address Translation in OS

Is Memory address translation only useful when the total size of virtual memory
(summed over all processes) needs to be larger than physical memory?
Basically, the size of virtual memory depends on what you call "virtual memory". If you mean the virtual memory of one process, then virtual memory has the same size as physical memory. If you mean the combined virtual memory of all processes, then virtual memory can (technically) have an infinite size. This is because every process can have a whole address space. The virtual address space of one process cannot be bigger than physical memory because the processor has limited bits to address this memory. In modern long mode the processor has only 48 bits to address RAM at the byte level. This gives a very big amount of RAM but most systems will have 8GB to 32GB.
Technically, on an 8GB RAM computer, every process could have 8GB allocated. I say technically because eventually the computer will constantly be evicting page frames from RAM, and that will put so much overhead on the OS and on the computer that your system will freeze. In the end, the sum of the virtual memory of every process is limited by the capacity of your system (and OS) to have an efficient page swapping algorithm (and by your willingness to have a slow system).
Now to answer your question: paging (virtual memory) is also used to avoid fragmentation and to secure the system. With the old segmentation model, fragmentation was an issue because you had to run a complex algorithm to determine which part of memory a process gets. With paging, the smallest granularity of memory is 4KB. This makes everything much easier because a small process just gets a 4KB page and can work in that page the way it wants, while a bigger process gets several pages and can allocate more pages by doing a system call. There is still the issue of external fragmentation, but it is mostly due to the latency of accessing high memory vs low memory. Basically, paging solves the issue of external fragmentation because a process can get a page anywhere (where it's available) and it will not make a difference (except for high vs low memory). There is still the issue of internal fragmentation with paging.
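As a small worked example of the internal fragmentation mentioned above, this sketch computes how many 4KB pages an allocation needs and how many bytes in the last page go unused. The function names are mine and purely illustrative.

```python
PAGE_SIZE = 4096  # the 4KB granularity mentioned above


def pages_needed(size_bytes):
    """Number of whole pages required to hold size_bytes (rounds up)."""
    return -(-size_bytes // PAGE_SIZE)


def internal_fragmentation(size_bytes):
    """Bytes allocated to the process but not actually needed by it."""
    return pages_needed(size_bytes) * PAGE_SIZE - size_bytes


print(pages_needed(10_000), internal_fragmentation(10_000))  # 3 2288
print(pages_needed(4096), internal_fragmentation(4096))      # 1 0
print(pages_needed(1), internal_fragmentation(1))            # 1 4095
```

The waste is always less than one page per allocation, which is why it is usually accepted as the price of getting rid of external fragmentation.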
Paging also secures the system. With segmentation you had several levels of ring protection. With paging you have only user or supervisor. With segmentation, the memory is not well protected because one process can access the memory of another process in the same segment. With paging, there are 2 different protections. The first protection is the ring itself (user vs supervisor); the second is the page tables. The page tables isolate one process from another because memory accesses are translated to other positions in RAM. It is the job of the OS to fill the page tables properly so that one process doesn't have access to the physical memory allocated to another process. The user vs supervisor bit in the page tables prevents a process from accessing the kernel except via a system call interface (the instruction syscall in assembly for x86).
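Here is a toy model of the two protections described above. It assumes a single-level page table represented as a Python dict (real x86 page tables are multi-level and live in hardware-walked memory), and `PageFault` and `translate` are hypothetical names for illustration only.

```python
PAGE_SIZE = 4096


class PageFault(Exception):
    """Raised when a translation is missing or not permitted (toy model)."""


def translate(page_table, virtual_address, user_mode=True):
    """Translate a virtual address using a per-process page table.

    page_table maps page number -> (physical frame number, user_accessible).
    """
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    entry = page_table.get(page_number)
    if entry is None:
        raise PageFault("page is not mapped for this process")
    frame, user_accessible = entry
    if user_mode and not user_accessible:
        raise PageFault("supervisor-only page accessed from user mode")
    return frame * PAGE_SIZE + offset


# Two processes use the same virtual address but land in different frames.
process_a = {0: (5, True), 1: (9, False)}  # page 1 is supervisor-only
process_b = {0: (7, True)}

print(translate(process_a, 100))  # 20580 (frame 5)
print(translate(process_b, 100))  # 28772 (frame 7)
```

Because each process has its own table, process B has no virtual address that reaches frame 5 unless the OS deliberately maps one.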

If an application program runs in RAM, where does the operating system run?

I have read that the operating system is loaded into main memory (RAM) when the computer boots up. Also, application programs are loaded into main memory (RAM) for execution. How do both of these run simultaneously in main memory? Does the operating system stop its execution when an application program is running?
I don't know of a good overview of these areas, so I'll try to help.
Memory (RAM) can be visualized as a collection of lockers. Each locker can store something independently of all other lockers. Each locker has a number, so you can find a particular locker easily. In RAM, the locker is a byte that can store a value between zero and 255, and the locker number is an address. Better than a locker; you can open the byte at address zero, then the byte at address 1000000 instantly. You don't have to walk down a long hallway. That is what the R in RAM refers to: Random, as in Random Access Memory. Essentially every location takes the same amount of time to access.
Machines have a lot of RAM, on the order of billions of bytes. Even very big operating systems do not need all of RAM; if they require 50 million bytes, that is only 50 / 1000 or 5% of what is now considered a small system. That leaves 950 million bytes for programs to use. If every program was as big as the operating system, you could run 950/50 = 19 of them. There are tricks to permit running even more.
One of the fundamental jobs of the operating system is to provision resources like RAM to applications, and make sure that applications cannot snoop on or modify each other's RAM without prior arrangement. To do this, the operating system typically uses a trick where program addresses are indirectly translated to RAM addresses under control of the operating system. This way, all applications can think they have RAM at (say) address 4194304. The hardware that performs this trick is called an MMU (Memory Management Unit), and the details start to explode at this point.
Review:
RAM is a collection of places to store numbers, and each storage place has a unique address.
There is lots of RAM, so we just have to divvy it up between applications.
We can keep each application's RAM separate and secret from other applications.
The operating system only uses a relatively small amount of RAM.
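To make the "every application thinks it has RAM at address 4194304" idea concrete, here is a deliberately simplified sketch. It uses a plain base-plus-offset relocation rather than the page tables a real MMU uses, and `AddressSpace` is a hypothetical helper, not an operating system API.

```python
APP_START = 4194304  # the address from the text that every application believes it owns


class AddressSpace:
    """One application's view of RAM (toy model of address translation)."""

    def __init__(self, ram, physical_base, size):
        self.ram = ram
        self.physical_base = physical_base  # where this app really lives in RAM
        self.size = size                    # how many bytes the OS granted it

    def to_physical(self, program_address):
        offset = program_address - APP_START
        if not 0 <= offset < self.size:
            raise ValueError("address is outside this application's memory")
        return self.physical_base + offset

    def write(self, program_address, value):
        self.ram[self.to_physical(program_address)] = value

    def read(self, program_address):
        return self.ram[self.to_physical(program_address)]


ram = bytearray(2048)  # a very small "machine"
app_a = AddressSpace(ram, physical_base=0, size=1024)
app_b = AddressSpace(ram, physical_base=1024, size=1024)

app_a.write(APP_START, 11)
app_b.write(APP_START, 22)
print(app_a.read(APP_START), app_b.read(APP_START))  # 11 22
```

Both applications used the same program address, yet neither overwrote the other, and an address outside an application's range is rejected instead of reaching someone else's bytes.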

Multi core machine - cpu load metric

In a multi-core machine, what is the best metric for telling whether the CPU is loaded or not?
I have a web application that sends a POST request to an Apache CGI server. The CGI server loops over the POST data and launches a Perl process for each item in the loop. Since requests from clients end up hitting a single endpoint, I am concerned that I will create more processes than my server can handle. Hence I want to understand which system metric I should check before launching a new process from the loop.
Note: I have a 20 core machine.
The reason the answer isn't easy to find, is that it depends on the nature of your processes, and which system constraint is your limiting factor.
For CPU-intensive work, the metric to look at is load average. Load average is a measure of processes in a runnable state; very roughly, if LA is the same as the number of cores, then you're running your CPUs at maximum.
However, it's increasingly the case that CPU is not the limiting factor. You may have a finite amount of memory, and memory-hungry processes will consume it. 'Spare' memory is used for caching, so filling the whole lot up actually starts to slow things down (because you have a smaller cache). Going over the available memory will cause either swapping or the OOM killer to kick in.
But as you mention apache and web, then chances are pretty good that your network pipe is a limiting factor - controlling bandwidth from the local host is actually surprisingly hard.
And then there's disk IO - which may also be a factor - I think that's unlikely for a web server, because your outbound network will usually be a tighter limit.
It all depends what your processes are doing - if they're lightweight 'helpers' that are mostly idle, or heavyweight 'grinders' that all introduce noticeable load.
So the best answer I can give is a very vague estimate: if your processes are CPU intensive, cap them at 2 per core. If your processes are memory intensive, aim to consume about 50% of your system RAM. If your processes are IO intensive, aim to consume about 50% of your IO (either network or disk).
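A minimal sketch of the CPU part of that rule of thumb, assuming the caller tracks how many worker processes are currently running. The function names are hypothetical, and the 2-per-core cap is the vague estimate above, not a measured limit.

```python
def load_ratio(load_average, cores):
    """Load average relative to core count; about 1.0 means CPUs are roughly at maximum."""
    return load_average / cores


def max_cpu_bound_processes(cores, per_core=2):
    """Cap suggested above for CPU-intensive processes."""
    return cores * per_core


def may_launch(running_processes, cores, per_core=2):
    """True while another CPU-intensive process still fits under the cap."""
    return running_processes < max_cpu_bound_processes(cores, per_core)


print(max_cpu_bound_processes(20))  # 40 for the 20 core machine
print(may_launch(39, 20))           # True
print(may_launch(40, 20))           # False
print(load_ratio(10.0, 20))         # 0.5
```

On Unix-like systems the inputs can come from the standard library: `os.getloadavg()` returns the 1, 5 and 15 minute load averages and `os.cpu_count()` returns the number of logical CPUs. The same pattern applies to the memory and IO limits, but those need their own measurements.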

MSMQ consuming large amounts of memory when processing messages with NServiceBus

I have two Windows services which use NServiceBus. One writes messages to the queue and the other reads from it and does some processing. All the queues are transactional and the NServiceBus endpoints are configured as below.
.IsTransactional(true)
.IsolationLevel(IsolationLevel.ReadCommitted)
.MsmqTransport()
.RunTimeoutManager()
.UseInMemoryTimeoutPersister()
.MsmqSubscriptionStorage()
.DisableRavenInstall()
.JsonSerializer()
The issue is that when a large number of messages (170,000+) are queued, the MSMQ service (mqsvc.exe) chews up quite a bit of memory (1.5 - 2.0 GB), and that memory doesn't get released for at least 5 - 6 hours. The average message size is around 5 - 10 KB, and it seems like the more messages you queue, the more memory it uses. The memory consumption of the NServiceBus-based Windows services is within perfectly acceptable limits (50 - 100 MB) and does not increase no matter how many messages they process.
Any ideas on why MSMQ would use this much memory and takes quite long to release it?
Thanks heaps.
This is perfectly normal. MSMQ uses storage in 4MB blocks of memory which map to the files in the Storage folder. 170,000 messages at 5-10 KB each is 0.85-1.7 GB, so it's no surprise you're seeing so much virtual memory being allocated. To reduce the overhead of deleting and creating files as messages are removed or arrive, the storage files are kept for 6 hours. After this period, the empty files are deleted. You can configure this, as discussed in my blog post:
Forcing MSMQ to clean up its storage files
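The estimate in this answer can be reproduced with a few lines. This is only a sketch: it assumes 1 KB = 1000 bytes (to match the 0.85-1.7 GB figures), takes a 4MB block as 4 * 1024 * 1024 bytes, and ignores any per-message overhead, so the block counts are lower bounds.

```python
import math

KB = 1000                      # assumption: decimal kilobytes, as in the figures above
BLOCK_BYTES = 4 * 1024 * 1024  # assumption: a "4MB" storage block in bytes


def queued_bytes(message_count, avg_message_kb):
    """Total payload size of the queued messages."""
    return message_count * avg_message_kb * KB


def storage_blocks(total_bytes):
    """Minimum number of 4MB storage blocks needed to hold total_bytes."""
    return math.ceil(total_bytes / BLOCK_BYTES)


low = queued_bytes(170_000, 5)
high = queued_bytes(170_000, 10)
print(low / 10**9, high / 10**9)                # 0.85 1.7 (GB)
print(storage_blocks(low), storage_blocks(high))  # 203 406
```

So a backlog of this size maps to at least a couple of hundred storage files, which lines up with the 1.5 - 2.0 GB reported in the question.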
On the off-chance it will help anyone: this post on Google Groups by MSMQ legend John Breakwell documents how to clean out all the messages in storage completely, which is sometimes desirable or necessary.
https://groups.google.com/d/msg/microsoft.public.msmq.performance/jByfXUwXFw8/i1hVP1WJpJgJ
I had 8GB of files but no messages in any queues, and the msmq service would take around 2 hours just to enter the started state. Purging any queue would take 10s of minutes and cause massive memory spikes, which did not then get released for days, if ever.
If you're ever in this situation, rather than re-install message queuing you can just follow these steps:
Stop Message Queuing service
Go to the MSMQ storage location (usually C:\Windows\System32\msmq\storage)
Delete ONLY the P*.MQ, J*.MQ, R*.MQ, and L*.MQ files
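The file-selection step above can be scripted. The following is a cautious sketch, not a tested maintenance tool: it assumes the Message Queuing service has already been stopped by hand, the function names are hypothetical, and it defaults to a dry run that only lists what would be deleted.

```python
from pathlib import Path

# Only the four patterns named in the steps above; nothing else is touched.
PATTERNS = ("P*.MQ", "J*.MQ", "R*.MQ", "L*.MQ")


def find_storage_files(storage_dir):
    """Return the message storage files in storage_dir, sorted by name."""
    root = Path(storage_dir)
    matches = set()
    for pattern in PATTERNS:
        matches.update(p for p in root.glob(pattern) if p.is_file())
    return sorted(matches)


def delete_storage_files(storage_dir, dry_run=True):
    """List the matching files and delete them only when dry_run is False."""
    files = find_storage_files(storage_dir)
    if not dry_run:
        for path in files:
            path.unlink()
    return files
```

Calling `delete_storage_files(r"C:\Windows\System32\msmq\storage")` first and checking the returned list before repeating the call with `dry_run=False` keeps the "delete ONLY" rule easy to verify.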

Total memory consumption of the system

Is it correct to assume that the total memory consumption (virtual + physical) of a system is the sum of the "Memory Usage" and "VM Size" columns shown by Task Manager in Windows?
Read these posts by Mark Russinovich:
http://blogs.technet.com/markrussinovich/archive/2008/07/21/3092070.aspx
http://blogs.technet.com/markrussinovich/archive/2008/11/17/3155406.aspx
In modern Windows there really is no single truth about "Total Memory Consumption". It depends of course on the definition, but the real question is what you want to do with the answer.
Some processes like SQL-Server tend to use every byte of memory they can get their hands on, if you let them. The .NET CLR garbage collector monitors memory use and acts accordingly, trying to free more memory when it gets scarce.
So for instance you can have a system with 8 GB of physical memory, of which 90% is "used". How much of that memory is actually needed, is very hard to say. The same system may run on a 4 GB machine with no noticeable performance loss or any other issues.
If you want to explore some of the complexities of memory management under Windows, download "VMMap v2.0" from the former Sysinternals site. It shows very detailed memory usage per process and may aid you in your quest.
To quote from VMMap's Help:
VMMap categorizes memory into one of several types:
Image
The memory represents an executable file such as a .exe or .dll. The Details column shows the file's path.
Private
Private memory cannot be shared with other processes, is charged against the system commit limit, and typically contains application data.
Shareable
Shareable memory can be shared with other processes, is charged against the system commit limit and typically contains data shared between DLLs in different processes or inter-process communication messages. The Windows APIs refer to this type of memory as pagefile-backed sections.
Mapped File
The memory represents a file on disk and the Details column shows the file's path. Mapped files typically contain application data.
Heap
Heaps represent memory managed by the user-mode heap manager and, like Private memory, is charged against the system commit limit and contains application data.
Managed Heap
Managed heap represents memory that's allocated and used by the .NET garbage collector.
Stack
Stacks are memory used to store function parameters, local function variables and function invocation records for individual threads. Stacks are charged against the commit limit and typically grow on demand.
System
System memory is kernel-mode physical memory associated with the process. The vast majority of System memory consists of the process page tables.
Free
Free memory regions are spaces in the process address space that are not allocated.
Now you just need to define what types of memory you consider as "used", add these up for all processes, remove multiple duplicates and look at the number... There is a reason why in task manager or other tools, there is no single number labeled "Total Memory Consumption" :-)
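To show why the answer depends on your definition, here is a sketch of that "pick the types, add them up, remove duplicates" procedure. The region records and the `shared_id` field are entirely hypothetical; VMMap does not export data in this shape, and the sizes are made up for illustration.

```python
def total_used_kb(regions, counted_types):
    """Sum region sizes for the chosen types, counting each shared region once."""
    seen_shared = set()
    total = 0
    for region in regions:
        if region["type"] not in counted_types:
            continue
        shared_id = region.get("shared_id")
        if shared_id is not None:
            if shared_id in seen_shared:
                continue  # already counted via another process
            seen_shared.add(shared_id)
        total += region["size_kb"]
    return total


# Made-up regions from two processes that map the same DLL image.
regions = [
    {"process": "a.exe", "type": "Private", "size_kb": 300},
    {"process": "a.exe", "type": "Image", "size_kb": 120, "shared_id": "common.dll"},
    {"process": "b.exe", "type": "Private", "size_kb": 200},
    {"process": "b.exe", "type": "Image", "size_kb": 120, "shared_id": "common.dll"},
]

print(total_used_kb(regions, {"Private"}))           # 500
print(total_used_kb(regions, {"Private", "Image"}))  # 620
```

Changing the set of counted types changes the total, which is the point made above: there is no single correct number.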
No, physical memory and virtual memory may overlap. If a page of memory is in virtual memory and is then paged in to physical memory, the virtual memory is not necessarily freed; it may stay reserved for when the page gets paged out again.