MSMQ - are messages deleted from the mq files once read?

Is it possible to read a queue message from a persistent mq file (e.g. p000001.mq) that has been processed and deleted, or is the message removed straight away?
The .mq files haven't shrunk after messages were deleted, but I don't appear to be able to open them in QueueExplorer.

"Is it possible to read a queue message from a persistent mq file that has been processed and deleted."
No. If you open the file in notepad then you should be able to see that the message data is still there but a flag will have been set so that MSMQ knows to make the message invisible.
MQ files do not shrink immediately as that impacts disk I/O performance.
MSMQ performs file cleanup at two points:
Service startup
After the MessageCleanupInterval (default 6 hours).
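If you want to check what interval a particular machine is using, a minimal sketch like the following should do it - assuming the value is stored under HKLM\SOFTWARE\Microsoft\MSMQ\Parameters as a millisecond count, which is worth verifying for your Windows version:

```python
# Sketch: read the MSMQ cleanup interval from the registry (Windows only).
# Assumes the value lives under HKLM\SOFTWARE\Microsoft\MSMQ\Parameters as
# MessageCleanupInterval in milliseconds; verify this on your OS version.
import winreg

MSMQ_PARAMS = r"SOFTWARE\Microsoft\MSMQ\Parameters"

with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, MSMQ_PARAMS) as key:
    try:
        interval_ms, _ = winreg.QueryValueEx(key, "MessageCleanupInterval")
        print(f"MessageCleanupInterval: {interval_ms} ms "
              f"({interval_ms / 3600000:.1f} hours)")
    except FileNotFoundError:
        # Value absent means MSMQ falls back to the default (6 hours).
        print("MessageCleanupInterval not set; default of 6 hours applies.")
```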

Related

Message count is not zero even after all messages are consumed and acknowledged

We have containerized ActiveMQ Artemis 2.16.0 and deployed it as a K8s deployment, used together with KEDA.
We use STOMP via the stomp.py Python module. The ACK mode is set to client-individual and consumerWindowSize = 0 on the connection. We acknowledge each message as soon as we read it.
The problem is that, sometimes, the message count in the web console does not become zero even after all the messages have actually been consumed and acknowledged. When I browse the queue, I don't see any messages in it. This is causing KEDA to spin up pods unnecessarily. Please refer to the screenshots attached to the JIRA for this issue.
I fixed the issue in my application code. My requirement was that each queue listener should consume only one message and then exit gracefully. So, as soon as I sent the ACK for the consumed message, I disconnected the connection instead of waiting out the sleep duration before disconnecting.
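For anyone hitting the same thing, a minimal sketch of that consume-ack-disconnect pattern with stomp.py might look like the following; the queue name, credentials, stomp.py 8.x listener signature, and the consumer-window-size header are assumptions to adapt to your setup:

```python
# Sketch: consume exactly one message, ACK it, then disconnect immediately.
# Assumes stomp.py 8.x (listener callbacks receive a Frame) and an Artemis
# broker on localhost:61613; destination and credentials are placeholders.
import stomp

QUEUE = "/queue/work"          # hypothetical queue name
SUB_ID = "1"

class OneShotListener(stomp.ConnectionListener):
    def __init__(self, conn):
        self.conn = conn

    def on_message(self, frame):
        process(frame.body)                               # your handling logic
        # client-individual mode: ACK just this message...
        self.conn.ack(frame.headers["message-id"], SUB_ID)
        # ...and disconnect right away instead of sleeping first, so the
        # broker's consumer count and message count settle promptly.
        self.conn.disconnect()

def process(body):
    print("processed:", body)

conn = stomp.Connection([("localhost", 61613)])
conn.set_listener("", OneShotListener(conn))
conn.connect("user", "password", wait=True)
# consumer-window-size: 0 is intended to disable client-side buffering.
conn.subscribe(destination=QUEUE, id=SUB_ID, ack="client-individual",
               headers={"consumer-window-size": 0})
```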
Thanks, Justin, for spending time on this.

Kafka Connect: Automatically terminating after processing all data

I want to backup and restore a huge amount of data in a Kafka topic to various destinations (file, another topic, S3, ...) using Kafka Connect. However, it runs in a streaming mode and hence never terminates. But in my scenario it should exit automatically after processing all data that is currently in the topic (it is ensured in my context that all producers are shut down before the backup starts).
Is there any option/parameter so that a Kafka Connect connector automatically terminates after all current data has been processed and, e.g., stored in a file?
AFAIK there is no such option. You can create a "watchdog" that checks the lag on your Kafka Connect group.id and, once the lag has been processed (i.e. it reaches 0), shuts the process down.
This is how we do it at our company: every 3-6 hours we start a consumer that works through the accumulated lag, creates the file, and then terminates. The file is then uploaded to the other destination.
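A minimal sketch of such a watchdog using kafka-python, comparing the group's committed offsets against the end offsets, could look like this (the group.id, bootstrap servers, and polling interval are placeholders):

```python
# Sketch: poll consumer-group lag and exit once it reaches zero, at which
# point an external script can shut Kafka Connect down.
import time
from kafka import KafkaAdminClient, KafkaConsumer

BOOTSTRAP = "localhost:9092"
GROUP_ID = "connect-my-sink-connector"    # hypothetical Connect group.id

admin = KafkaAdminClient(bootstrap_servers=BOOTSTRAP)
consumer = KafkaConsumer(bootstrap_servers=BOOTSTRAP)

def total_lag():
    # Committed offsets for the group, per partition.
    committed = admin.list_consumer_group_offsets(GROUP_ID)
    if not committed:
        return None                        # group has not committed anything yet
    # Latest offsets for the same partitions.
    end_offsets = consumer.end_offsets(list(committed))
    return sum(end_offsets[tp] - meta.offset for tp, meta in committed.items())

while True:
    lag = total_lag()
    print("current lag:", lag)
    if lag == 0:
        break                              # all current data processed -> stop Connect here
    time.sleep(30)
```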

Ingesting a log file into HDFS using Flume while it is being written

What is the best way to ingest a log file into HDFS while it is being written? I am trying to configure Apache Flume with sources that can also offer me data reliability. I tried "exec" and later also looked at "spooldir", but the following documentation at flume.apache.org has given me doubts about my approach:
Exec Source:
One of the most commonly requested features is the use case like "tail -F file_name", where an application writes to a log file on disk and Flume tails the file, sending each line as an event. While this is possible, there's an obvious problem; what happens if the channel fills up and Flume can't send an event? Flume has no way of indicating to the application writing the log file that it needs to retain the log or that the event hasn't been sent for some reason. Your application can never guarantee data has been received when using a unidirectional asynchronous interface such as ExecSource!
Spooling Directory Source:
Unlike the Exec source, the "spooldir" source is reliable and will not miss data, even if Flume is restarted or killed. In exchange for this reliability, only immutable files must be dropped into the spooling directory. If a file is written to after being placed into the spooling directory, Flume will print an error to its log file and stop processing.
Is anything better available that I can use to ensure Flume will not miss any events and also reads in real time?
I would recommend using the Spooling Directory Source because of its reliability. A workaround for the immutability requirement is to compose the files in a second directory and, once they reach a certain size (in bytes or number of log entries), move them to the spooling directory.
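A minimal sketch of that rotation step could look like the following; the paths and thresholds are placeholders, and the idle-time check is an extra safeguard of mine rather than part of the original suggestion:

```python
# Sketch: move "composed" log files into Flume's spooling directory once they
# reach a size threshold, so only immutable files ever land in the spooldir.
# Keeping both directories on the same filesystem makes os.rename atomic,
# so Flume never sees a partially written file.
import os
import time

COMPOSE_DIR = "/var/log/compose"     # where the application writes/rotates logs
SPOOL_DIR = "/var/flume/spool"       # Flume spooling directory source dir
MAX_BYTES = 64 * 1024 * 1024         # size at which a file counts as "full"
QUIET_SECS = 30                      # safeguard: file must also be idle

while True:
    now = time.time()
    for name in os.listdir(COMPOSE_DIR):
        path = os.path.join(COMPOSE_DIR, name)
        st = os.stat(path)
        # Move only files that are big enough and no longer being appended to.
        if st.st_size >= MAX_BYTES and now - st.st_mtime >= QUIET_SECS:
            os.rename(path, os.path.join(SPOOL_DIR, name))
    time.sleep(10)
```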

How does hornetq persist messages?

We are in the process of planning our server machine switch. While we are doing the switch, we need to be able to continue to receive traffic and save the JMS messages that are generated.
Is it possible to move the persisted message queue from one JBoss 7.1.1/HornetQ to another?
HornetQ uses a set of binary journal files to store the messages in the queues.
You can use export journal / export data... or you can use bridges to transfer data.
You should find the relevant information in the documentation on hornetq.org.
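If you go the file-copy route, the rough shape (with both servers stopped) is simply to move HornetQ's persistence directories across; the directory names below assume a default JBoss AS 7.1 standalone layout and should be checked against your messaging configuration:

```python
# Sketch: copy HornetQ's persistence directories from the old server's data
# dir to the new one while BOTH servers are stopped. Directory names assume
# a default JBoss AS 7.1 standalone layout; verify them in your config.
import shutil
from pathlib import Path

OLD_DATA = Path("/opt/jboss-old/standalone/data")   # placeholder paths
NEW_DATA = Path("/opt/jboss-new/standalone/data")

# Default HornetQ storage directories (bindings, journal, large messages, paging).
for subdir in ("messagingbindings", "messagingjournal",
               "messaginglargemessages", "messagingpaging"):
    src = OLD_DATA / subdir
    if src.exists():
        shutil.copytree(src, NEW_DATA / subdir, dirs_exist_ok=True)
        print("copied", subdir)
```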

Msmq disaster recovery

I am trying to use a synchronization tool (Double-Take) to synchronize the MSMQ storage folder "C:\Windows\System32\msmq\storage"
from one server to another.
The problem is that once the files are moved to the second server, the Message Queuing service can't be started.
I found that if I exclude the *.MQ files the synchronization works fine, but in that case I will be losing the transactional messages.
Does anybody have a solution for keeping the transactional messages?
Thank you
MSMQ uses multiple files in the storage directory for transactional messages. Any attempt to copy the storage directory while MSMQ is working on transactional messages is likely to result in files that are not in sync with each other. The only guaranteed way to do this is to stop the MSMQ service first. This is how MQBKUP.EXE works, for example.
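If you script that yourself, a minimal sketch of the stop/copy/start sequence might look like this; the "MSMQ" service name, the use of robocopy, and the destination path are assumptions to adapt:

```python
# Sketch: stop Message Queuing, copy the whole storage folder (including the
# *.MQ transaction log files) to the replica, then restart the service.
# Run with administrative rights.
import subprocess

STORAGE = r"C:\Windows\System32\msmq\storage"
DEST = r"\\backup-server\msmq\storage"        # hypothetical replica target

subprocess.run(["net", "stop", "MSMQ"], check=True)
try:
    # /MIR keeps the replica identical; robocopy exit codes below 8 mean success.
    result = subprocess.run(["robocopy", STORAGE, DEST, "/MIR"])
    if result.returncode >= 8:
        raise RuntimeError(f"robocopy failed with code {result.returncode}")
finally:
    subprocess.run(["net", "start", "MSMQ"], check=True)
```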
Cheers
John