NServiceBus with large XML messages - MSMQ

I have read that in true messaging you send an identifier on the bus instead of the payload itself. In our case, we have a lot of legacy apps/services that were designed to receive the message payload (XML), which is close to 4 MB (the MSMQ limit). Is there a way for NServiceBus to handle large payloads and persist messages automatically, or is there another workaround, so that the publisher/subscriber services don't have to worry about either the payload size or how to de/re-hydrate the payload?
Thank you in advance.

You could use the Message Sequence pattern. In NServiceBus, you would split the payload in the sender, wrap the chunks in a custom 'Sequence' IMessage, and then implement a saga at the other end to extract the chunks & reassemble. You would need to put some effort into error handling & timeouts.
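As a rough, language-neutral illustration of the split-and-reassemble idea (this is not NServiceBus code; `SequenceChunk`, `split_payload` and `Reassembler` are hypothetical names, and the in-memory dictionary only stands in for saga storage), a minimal Python sketch without the error handling and timeouts mentioned above might look like this:

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class SequenceChunk:
    """One piece of a larger payload (hypothetical message type)."""
    sequence_id: str  # shared by all chunks of one payload
    index: int        # position of this chunk, starting at 0
    total: int        # number of chunks in the sequence
    data: bytes


def split_payload(payload: bytes, chunk_size: int) -> list[SequenceChunk]:
    """Sender side: cut the payload into pieces that fit the transport limit."""
    sequence_id = uuid.uuid4().hex
    pieces = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)] or [b""]
    return [SequenceChunk(sequence_id, i, len(pieces), piece)
            for i, piece in enumerate(pieces)]


class Reassembler:
    """Receiver side: plays the role of the saga, keyed by sequence_id."""

    def __init__(self) -> None:
        self._pending: dict[str, dict[int, bytes]] = {}

    def accept(self, chunk: SequenceChunk) -> bytes | None:
        """Store a chunk; return the whole payload once all chunks are in."""
        parts = self._pending.setdefault(chunk.sequence_id, {})
        parts[chunk.index] = chunk.data  # a redelivered chunk overwrites itself
        if len(parts) < chunk.total:
            return None
        del self._pending[chunk.sequence_id]
        return b"".join(parts[i] for i in range(chunk.total))
```

Because each chunk carries its index and the total count, the receiver can cope with chunks arriving out of order. A real implementation would also need to expire sequences that never complete.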

You can always use the quick "fix" of compressing the messages.
A POCO serialized with the binary serializer can be compressed by a large margin. We saw messages of 20 MB compress down to 3.1 MB.
So if your messages are hovering around 4 MB, it might be simplest to write an IMessageSerializer that automatically compresses the message while it is on the wire.
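To give a feel for the idea (a sketch in Python rather than an actual IMessageSerializer; `pack` and `unpack` are made-up names), compression can be wrapped around whatever serialization you already have, so that callers never see the compressed form:

```python
import zlib


def pack(xml_text: str) -> bytes:
    """Serialize the XML to bytes and compress it before it goes on the wire."""
    return zlib.compress(xml_text.encode("utf-8"), 6)


def unpack(blob: bytes) -> str:
    """Reverse of pack(): decompress, then decode back to the XML text."""
    return zlib.decompress(blob).decode("utf-8")
```

XML tends to be repetitive, so it usually compresses well, but the ratio depends entirely on the data. Measure with your real messages before relying on this to stay under the limit.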

I'm not aware of any internal NServiceBus capability to associate extra data with a message out of band.
I think you're right on the mark - if the entire payload can't fit within the limit, then it's better to persist it elsewhere on your own and then pass an ID.
However, it may be possible for you to design a message structure such that a message could implement an IHasPayload interface (which would perhaps incorporate an ID and a Type?), and then your application logic could have a common method for getting the payload given an IHasPayload message.
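A minimal sketch of that idea in Python, under the assumption that the payload store is something you provide yourself (a database, a file share, etc.); `PayloadStore`, `PayloadReference` and `load_payload` are hypothetical names, and the dictionary merely stands in for real storage:

```python
import uuid
from dataclasses import dataclass


class PayloadStore:
    """Stand-in for external storage; here just an in-memory dictionary."""

    def __init__(self) -> None:
        self._items = {}

    def put(self, data: bytes) -> str:
        payload_id = uuid.uuid4().hex
        self._items[payload_id] = data
        return payload_id

    def get(self, payload_id: str) -> bytes:
        return self._items[payload_id]


@dataclass(frozen=True)
class PayloadReference:
    """What actually travels on the bus: an ID and a type, not the payload."""
    payload_id: str
    payload_type: str


def load_payload(message: PayloadReference, store: PayloadStore) -> bytes:
    """Common method every handler can use to get the payload for a message."""
    return store.get(message.payload_id)
```

The sender calls `store.put(...)` and publishes only the small `PayloadReference`; each subscriber calls `load_payload` when it needs the data.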


Sorting Service Bus Queue Messages

I was wondering if there is a way to attach metadata, or even multiple pieces of metadata, to a Service Bus queue message so that an application can later sort on it while the queue still maintains FIFO order.
So in short, what I want to do is:
Maintain FIFO (First In, First Out) ordering in the queue, but since the messages are inserted into the queue from different sources, I want to be able to sort messages by the source they came from, using metadata for example.
I know this is possible with Topics, where you can add a property to the message, but I am unsure whether it is possible to add multiple properties to a topic message.
I hope I have made clear what I am asking.
I assume you use the .NET API. In this case you can use the Properties dictionary to write and read your custom metadata:
BrokeredMessage message = new BrokeredMessage(body);
message.Properties.Add("Source", mySource);
You are free to add multiple properties too. This is the same for both Queues and Topics/Subscriptions.
I was wondering if there is a way to implement metadata or even multiple metadata to a service bus queue message to be used later on in an application to sort on but still maintaining FIFO in the queue.
To maintain FIFO in the queue, you'd have to use Message Sessions. Without message sessions you would not be able to maintain FIFO in the queue itself. You could set a custom property and use it in your application to sort messages after they have been received out of order, but you won't receive messages in FIFO order, as you asked in your original question.
If you drop the requirement of having order preserved on the queue, then the answer @Mikhail has provided will be suitable for in-process sorting based on custom properties. Just be aware that in-process sorting is not a trivial task.
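For the simplest case of in-process sorting, here is a sketch in Python (not the Service Bus API; `ReceivedMessage` and `group_by_source` are hypothetical, with `properties` standing in for the custom properties dictionary). It groups already-received messages by their "Source" property while keeping the order in which they were received within each source:

```python
from dataclasses import dataclass, field


@dataclass
class ReceivedMessage:
    body: str
    properties: dict = field(default_factory=dict)  # custom metadata


def group_by_source(messages):
    """Bucket messages per 'Source', preserving receive order in each bucket."""
    groups = {}
    for message in messages:
        source = message.properties.get("Source")
        groups.setdefault(source, []).append(message)
    return groups
```

Note that this only preserves the order in which the application received the messages. Whether that equals the order in which they were sent is exactly the queue-level FIFO question discussed above.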

Sending large files with Spray

I know very similar questions have been asked before. But I don't think the solutions I found on google/stackoverflow are suitable for me.
I started to write some web services with Scala/Spray, and it seems the best way to send large files without consuming large amounts of memory is to use stream marshalling. This way Spray will send HTTP chunks. Two questions:
Is it possible to send the file without using HTTP chunks and without reading the entire file into memory?
AFAIK akka.io only processes one write at a time, meaning it can buffer one write until it has been passed on to the O/S kernel in full. Would it be possible to tell Spray, for each HTTP response, the length of the content? Thereafter Spray would ask for new data (through akka messages) until the entire content length is completed. E.g., I indicate my content length is 100 bytes. Spray sends a message asking for data to my actor, and I provide 50 bytes. Once this data is passed on to the O/S, Spray sends another message asking for new data. I provide the remaining 50 bytes, and the response is then complete.
Is it possible to send the file without using HTTP chunks [on the wire]
Yes, you need to enable chunkless streaming. See http://spray.io/documentation/1.2.4/spray-routing/advanced-topics/response-streaming/
Chunkless streaming works regardless of whether you use the Stream marshaller or provide the response as MessageChunks yourself. See the below example.
without reading the entire file into memory
Yes, that should work if you supply data as a Stream[Array[Byte]] or Stream[ByteString].
[...] Thereafter Spray would ask for new data [...]
That's actually almost how it already works: if you manually provide the chunks, you can request a custom Ack message that will be delivered back to you when the spray-can layer is able to process the next part. See this example for how to stream from a spray route.
I indicate my content length is 100 bytes
A note upfront: In HTTP you don't strictly need to specify a content-length for responses, because a response body can be delimited by closing the connection, which is what spray does if chunkless streaming is enabled. However, if you don't want to close the connection (because you would lose this persistent connection), you can now specify an explicit Content-Length header in your ChunkedResponseStart message (see #802), which will prevent the closing of the connection.
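The ack-driven flow from the question (declare the length up front, hand over one part, wait until it has been accepted, hand over the next) can be modelled in a few lines. This is a plain Python sketch of the idea, not Spray or Akka code: `stream_with_known_length` is a made-up name, and `write` is assumed to return only once the lower layer has accepted the data, which plays the role of the Ack message.

```python
import io


def stream_with_known_length(source, content_length, chunk_size, write):
    """Send exactly content_length bytes, holding one chunk in memory at a time."""
    remaining = content_length
    while remaining > 0:
        chunk = source.read(min(chunk_size, remaining))
        if not chunk:
            raise IOError("source ended before the declared content length")
        write(chunk)  # returning from write() stands in for receiving the Ack
        remaining -= len(chunk)
```

With a declared length of 100 bytes and 50-byte parts, `write` is called twice, matching the example in the question:

```python
parts = []
stream_with_known_length(io.BytesIO(b"x" * 100), 100, 50, parts.append)
```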

Streaming Of ZeroMQ Events Back To Client

I have a use case whereby I wish to have a ZeroMQ Request/Reply socket 'stream' back results. Is this possible with multipart messages (i.e. the Reply socket streams the frames back before HasMore = false), or am I approaching this incorrectly?
The situation:
1) Client makes a query (Request) for some records
2) Server looks up the database for results and responds with the current large number of records (Reply), split into frames
3) Server must wait until a Server Side event is generated before the final Frame is sent (HasMore = false)
4) Client won't get the previous Frames until the Final Event has been generated and HasMore = false
Thanks for your help.
As far as I understand what you're aiming for, it sounds like what you have will work the way you expect. See here for more discussion on message frames. The salient points:
As you say, all of the frames will be sent to the client at one time; they will be stored on the server until HasMore is set to false.
One important thing to remember here, if it's a truly large amount of data, you must be able to fit the entire data set into memory, because it'll be stored in your server memory until the entire message with all frames is complete, and then it'll be received into memory before it's processed on the client side.
I assume primarily what you're looking for is a way to iteratively build up a message before you send it? And perhaps to be able to deal with the data on the client iteratively as well? Also you get a guarantee that you won't lose part of the data in the middle, you either get the whole message or lose the whole message (as opposed to instead sending each frame as a separate message). This is one of the primary use cases for frames, so you've done well.
The only thing I object to is using the word "stream", as that implies that the data is being sent to the client continuously as it's being processed on the server, and that's explicitly not what you're trying to do (nor is it possible with ZMQ message frames).
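To make the buffering behaviour concrete, here is a toy model in Python. It does not use any ZeroMQ binding; `MultipartChannel` is a hypothetical class that only mimics the semantics described above: frames accumulate on the sending side and the receiver sees nothing until the final frame (more = False) is sent, at which point the whole message arrives at once.

```python
class MultipartChannel:
    """Toy model of multipart delivery: all frames arrive together or not at all."""

    def __init__(self) -> None:
        self._buffer = []    # frames held on the sending side
        self.delivered = []  # complete messages visible to the receiver

    def send(self, frame: bytes, more: bool) -> None:
        self._buffer.append(frame)
        if not more:
            # Final frame: hand the whole message over in one piece.
            self.delivered.append(tuple(self._buffer))
            self._buffer = []
```

The model also shows the memory cost mentioned earlier: every frame sits in `_buffer` until the last one is sent.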

FIX - Determining message length?

In FIX, what determines a message's length? I have read that if a message exceeds its length it will be sent in fragments.
Splitting up large messages is not an implicit part of the FIX protocol.
Some counterparties may choose to split up data into multiple messages instead of sending giant messages, but they don't have to. In my experience, I've seen counterparties send ridiculously large messages.
If a message is split up, it's because the sending party chose to implement their system that way.
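On the "what determines the length" part: each FIX message declares its own length in the BodyLength field (tag 9), which counts the bytes after the BodyLength field's delimiter up to and including the delimiter just before the CheckSum field (tag 10). A small Python sketch of that framing, assuming tag=value encoding with the SOH delimiter (`build_fix_message` and `body_length_matches` are made-up helper names, and the sample fields are arbitrary):

```python
from __future__ import annotations

SOH = "\x01"  # FIX field delimiter


def build_fix_message(begin_string: str, fields: list[tuple[int, str]]) -> str:
    """Assemble a message, computing BodyLength (9) and CheckSum (10)."""
    body = "".join(f"{tag}={value}{SOH}" for tag, value in fields)
    head = f"8={begin_string}{SOH}9={len(body.encode('ascii'))}{SOH}"
    checksum = sum((head + body).encode("ascii")) % 256
    return f"{head}{body}10={checksum:03d}{SOH}"


def body_length_matches(message: str) -> bool:
    """Check that the declared BodyLength equals the actual body size."""
    raw = message.encode("ascii")
    end_of_8 = raw.index(b"\x01")                # delimiter after BeginString
    end_of_9 = raw.index(b"\x01", end_of_8 + 1)  # delimiter after BodyLength
    declared = int(raw[end_of_8 + 1:end_of_9].split(b"=")[1])
    before_10 = raw.rindex(b"\x0110=")           # delimiter before CheckSum
    return declared == before_10 - end_of_9
```

So the length is whatever the sender puts in the message. Nothing in this framing forces a large message to be fragmented.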

Replacing a message in a jms queue

I am using ActiveMQ to pass requests between different processes. In some cases, I have multiple duplicate messages (which are requests) in the queue. I would like to have only one. Is there a way to send a message so that it will replace an older message with similar attributes? If there isn't, is there a way to inspect the queue and check for a message with specific attributes (in which case I will not send the new message if an older one exists)?
Clarification (based on Dave's answer): I am actually trying to make sure that there aren't any duplicate messages on the queue, to reduce the amount of processing that happens whenever the consumer gets the message. Hence I would like either to replace a message or not to put it on the queue at all.
This sounds like an ideal use case for the Idempotent Consumer which removes duplicates from a queue or topic.
The following example shows how to do this with Apache Camel, which is the easiest way to implement any of the Enterprise Integration Patterns, particularly if you are using ActiveMQ, which comes with Camel integrated out of the box.
The only trick to this is making sure there's an easy way to calculate a unique ID expression on each message - such as pulling out an XPath from the document or using, as in the above example, some unique message header.
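Independently of Camel, the core of the pattern is small. The following is a language-neutral sketch in Python (not Camel's API; `IdempotentConsumer` and the `message_id` function are hypothetical, and the in-memory set stands in for whatever repository remembers the IDs already seen):

```python
class IdempotentConsumer:
    """Pass each message to the handler once per unique ID; drop repeats."""

    def __init__(self, handler, message_id) -> None:
        self._handler = handler        # the real processing logic
        self._message_id = message_id  # computes the unique ID of a message
        self._seen = set()             # IDs that have already been processed

    def on_message(self, message) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        key = self._message_id(message)
        if key in self._seen:
            return False
        self._seen.add(key)
        self._handler(message)
        return True
```

For example, `IdempotentConsumer(process, lambda m: m["requestId"])` would process only the first message carrying each `requestId`. An unbounded in-memory set grows forever, so a real deployment would need expiry or a persistent store.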
You could browse the queue and use selectors to identify the message. However, unless you have a small number of messages, this won't scale very well. Instead, your message should just be a pointer to a database record (or set of records). That way you can update the record, and whoever gets the message will then access the latest version of the record.