How does HornetQ persist messages?

We are in the process of planning our server machine switch. While we are doing the switch, we need to be able to continue to receive traffic and save the JMS messages that are generated.
Is it possible to move the persisted message queue from one JBoss 7.1.1/HornetQ server to another?

HornetQ uses a set of binary journal files to store the messages in the queues.

You can use the export journal / export data tools, or you can use bridges to transfer the data; a bridge sketch follows below.
You should find the relevant information in the documentation at hornetq.org.
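As a rough sketch (the connector, queue, and host names here are placeholders, and the exact schema should be checked against the HornetQ/JBoss AS 7.1.1 docs), a core bridge in the old server's messaging subsystem in standalone.xml could look like this; it drains each local queue and forwards the messages to the new server:

<subsystem xmlns="urn:jboss:domain:messaging:1.1">
  <hornetq-server>
    <connectors>
      <!-- Points at the new server's netty acceptor (HornetQ's default port is 5445) -->
      <netty-connector name="new-server-connector" socket-binding="messaging-new-server"/>
    </connectors>
    <bridges>
      <!-- One bridge per queue to migrate; forwards to the same address on the new server -->
      <bridge name="migration-bridge">
        <queue-name>jms.queue.MyQueue</queue-name>
        <forwarding-address>jms.queue.MyQueue</forwarding-address>
        <static-connectors>
          <connector-ref>new-server-connector</connector-ref>
        </static-connectors>
      </bridge>
    </bridges>
  </hornetq-server>
</subsystem>

<!-- In the socket-binding-group: where the new server lives -->
<outbound-socket-binding name="messaging-new-server">
  <remote-destination host="new-server.example.com" port="5445"/>
</outbound-socket-binding>

Once the bridge has drained the old journal, you can switch traffic over and retire the old server.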


Kafka and IIDR CDC

I am trying to build a CDC pipeline using: DB2 -> IBM CDC -> Kafka,
and I am trying to figure out the right way to set this up.
Here is what I have tried so far:
1. Set up a 3-node Kafka cluster on Linux, on premises.
2. Installed the IIDR CDC software on Linux, on premises, using the setup-iidr-11.4.0.1-5085-linux-x86.bin file. The CDC instance is up and running.
The online documentation suggests installing the IIDR Management Console to configure the source datastore, the CDC server, and the Kafka subscription that make up the pipeline.
Currently I do not have the Management Console installed.
A few questions on this:
1. Is there any alternative to the IBM CDC Management Console for setting up the Kafka-CDC pipeline?
2. How can I get the IIDR Management Console? And if we install it on our local Windows desktop and try to connect to CDC/Kafka on the remote Linux servers, will it work?
3. Is there any other method to set up the data ingestion from IIDR CDC to Kafka?
I am fairly new to CDC/IIDR, please help!
I own the development of the IIDR Kafka target for our CDC Replication product.
Management Console is the best way to set up the subscription initially. You can install it on a Windows box.
Technically, I believe you can use our scripting language, called CHCCLP, to set up a subscription as well, but I recommend using the GUI.
Here are links to our resources on the IIDR (CDC) Kafka target. Search for the "Kafka" section:
https://www.ibm.com/developerworks/community/wikis/home?lang=en#!/wiki/W8d78486eafb9_4a06_a482_7e7962f5ac59/page/IIDR%20Wiki
An example of setting up a subscription and replicating is shown in this video:
https://ibm.box.com/s/ur8jokg6tclsx5fcav5g86a3n57mqtd5
Management Console and Access Server can be obtained from IBM Fix Central.
I have installed MC/Access Server on my VM and on my personal Windows box to use them against my Linux VMs. You will need network connectivity, of course.
You can definitely follow up with our Support team and they'll be able to sort you out. We also have docs on MC in our Knowledge Center, starting here: https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.cdcdoc.mcadminguide.doc/concepts/overview_of_cdc.html
You'll find our Kafka target is very flexible: it comes with five different formats for writing data into Kafka, and you can choose to capture data in an audit format, or in the Kafka compaction-compatible format (key, with null for a delete).
Additionally, you can use the product to write several records to several different topics in several formats from a single insert operation. This is useful if some of your consumer apps want JSON and others want binary Avro. You can also use this to put all of the data on more secure topics, and write out just some of the data to topics that more people have access to.
We even have customers who encrypt columns in flight when replicating.
Finally, the product's transformations can be parallelized even if you choose to use only one producer to write out the data.
One more thing: we additionally provide a special consumer that restores database ACID semantics for data that has been written into Kafka and shredded across topics and partitions; it re-orders the data. We call it the transactionally consistent consumer. It provides operation order, bookmarks for restarting applications, and parallel but ordered, exactly-once, de-duplicated consumption of data.
From my talk at the Kafka Summit...
https://www.confluent.io/kafka-summit-sf18/a-solution-for-leveraging-kafka-to-provide-end-to-end-acid-transactions
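For contrast, here is what an ordinary consumer looks like with the standard Apache Kafka client API (broker addresses, group id, and topic name are made-up placeholders). A plain consumer only guarantees ordering within a single partition, which is exactly the gap the transactionally consistent consumer closes:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PlainCdcConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka1:9092,kafka2:9092,kafka3:9092");
        props.put("group.id", "cdc-readers");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Hypothetical topic; your CDC subscription determines the real topic name(s).
            consumer.subscribe(Collections.singletonList("cdc.customers"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Ordering holds only within record.partition(), not across the
                    // partitions/topics a source transaction may have been shredded into.
                    System.out.printf("partition=%d offset=%d key=%s value=%s%n",
                            record.partition(), record.offset(), record.key(), record.value());
                }
            }
        }
    }
}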

How to delete JMS Queue filestore

I'm using JBoss Messaging with JBoss 4.2.2.GA. I have a problem with a queue: when I restart the server, some persistent messages resume processing. Now I want to delete them, but I don't know where they are stored.
Use a non-persistent queue.
If you are using HornetQ, the data will be stored in the server/default/data/hornetq/*.hq files.

Ingesting a log file into HDFS using Flume while it is being written

What is the best way to ingest a log file into HDFS while it is still being written? I am trying to configure Apache Flume and am looking for sources that can offer data reliability as well. I tried "exec" and later also looked at "spooldir", but the following documentation at flume.apache.org cast doubt on my approach:
Exec Source:
One of the most commonly requested features is the use case like-
"tail -F file_name" where an application writes to a log file on disk and
Flume tails the file, sending each line as an event. While this is
possible, there’s an obvious problem; what happens if the channel
fills up and Flume can’t send an event? Flume has no way of indicating
to the application writing the log file, that it needs to retain the
log or that the event hasn’t been sent for some reason. Your
application can never guarantee data has been received when using a
unidirectional asynchronous interface such as ExecSource!
Spooling Directory Source:
Unlike the Exec source, "spooldir" source is reliable and will not
miss data, even if Flume is restarted or killed. In exchange for this
reliability, only immutable files must be dropped into the spooling
directory. If a file is written to after being placed into the
spooling directory, Flume will print an error to its log file and stop
processing.
Is anything better available that I can use to ensure Flume will not miss any events and also reads in real time?
I would recommend using the Spooling Directory Source because of its reliability. A workaround for the immutability requirement is to compose the files in a second directory and, once they reach a certain size (in bytes or number of log lines), move them to the spooling directory; see the sketch below.
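A minimal sketch of that workaround in Java (the directory names, size threshold, and 30-second quiescence check are all assumptions to adapt):

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class SpoolRotator {
    // Hypothetical rotation threshold; tune to your log volume.
    private static final long MAX_BYTES = 10L * 1024 * 1024;

    public static void main(String[] args) throws IOException, InterruptedException {
        Path staging = Paths.get("/var/log/app/staging"); // the application writes here
        Path spool = Paths.get("/var/log/app/spool");     // Flume's spooldir source watches here

        while (true) {
            try (DirectoryStream<Path> files = Files.newDirectoryStream(staging, "*.log")) {
                for (Path file : files) {
                    // Only move files the writer has already rolled/closed: moving a file
                    // that is still being appended to would break spooldir's immutability rule.
                    boolean quiet = System.currentTimeMillis()
                            - Files.getLastModifiedTime(file).toMillis() > 30_000;
                    if (quiet && Files.size(file) >= MAX_BYTES) {
                        // Same-filesystem rename, so Flume never sees a half-written file.
                        Files.move(file, spool.resolve(file.getFileName()),
                                StandardCopyOption.ATOMIC_MOVE);
                    }
                }
            }
            Thread.sleep(5_000); // poll the staging directory every 5 seconds
        }
    }
}

Both directories must sit on the same filesystem for the atomic move to work.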

NServiceBus and remote input queues with an MSMQ cluster

We would like to use the Publish / Subscribe abilities of NServiceBus with an MSMQ cluster. Let me explain in detail:
We have an SQL Server cluster that also hosts the MSMQ cluster. Besides SQL Server and MSMQ we cannot host any other application on this cluster. This means our subscriber is not allowed to run on the cluster.
We have multiple application servers hosting different types of applications (going from ASP.NET MVC to SharePoint Server 2010). The goal is to do a pub/sub between all these applications.
All messages going through the pub/sub are critical and have an important value to the business. That's why we don't want local queues on the application server, but we want to use MSMQ on the cluster (in case we lose one of the application servers, we don't risk losing the messages since they are safe on the cluster).
Now I was assuming I could simply do the following at the subscriber side:
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <configSections>
    <section name="MsmqTransportConfig" type="NServiceBus.Config.MsmqTransportConfig, NServiceBus.Core" />
    ....
  </configSections>
  <MsmqTransportConfig InputQueue="myqueue#server" ErrorQueue="myerrorqueue"
                       NumberOfWorkerThreads="1" MaxRetries="5" />
  ....
</configuration>
I'm assuming this used to be supported, given the documentation: http://docs.particular.net/nservicebus/messaging/publish-subscribe/
But this actually throws an exception:
Exception when starting endpoint, error has been logged. Reason:
'InputQueue' entry in 'MsmqTransportConfig' section is obsolete. By
default the queue name is taken from the class namespace where the
configuration is declared. To override it, use .DefineEndpointName()
with either a string parameter as queue name or Func<string> parameter
that returns queue name. In this instance, 'myqueue#server' is defined
as queue name.
Now, the exception clearly states I should use the DefineEndpointName method:
Configure.With()
    .DefaultBuilder()
    .DefineEndpointName("myqueue#server");
But this throws another exception, which is documented (input queues should be on the same machine):
Exception when starting endpoint, error has been logged. Reason: Input
queue must be on the same machine as this process.
How can I make sure that my messages are safe if I can't use MSMQ on my cluster?
Dispatcher!
Now I've also been looking into the dispatcher for a bit, and this doesn't seem to solve my issue either. I'm assuming the dispatcher also wouldn't be able to read messages from a remote input queue? And besides that, if the dispatcher dispatches messages to the workers and the workers go down, are my messages lost (even though they were not processed)?
Questions?
To summarize, these are the things I'm wondering with my scenario in NServiceBus:
I want my messages to be safe on the MSMQ cluster and use a remote input queue. Is this something I should or shouldn't do? Is it possible with NServiceBus?
Should I use a dispatcher in this case? Can it read from a remote input queue? (I cannot run the dispatcher on the cluster)
What if the dispatcher dispatches messages to the workers and one of the workers goes down? Do I lose the message(s) that were being processed?
Phill's comment is correct.
The thing is that you would get the type of fault tolerance you require practically by default if you set up a virtualized environment. In that case, the C drive backing the local queue of your processes is actually sitting on the VM image on your SAN.
You will need a MessageEndpointMappings section that you will use to point to the Publisher's input queue. This queue is used by your Subscriber to drop off subscription messages. The mapping will need to be QueueName@ClusterServerName (note that NServiceBus uses '@', not '#', as the queue/machine separator). Be sure to use the cluster name and not a node name. The Subscriber's input queue will be used to receive messages from the Publisher, and that queue is local, so you don't need the @servername part. A sketch of such a mapping follows.
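A minimal sketch of that mapping (assembly, queue, and cluster names are placeholders; this is the app.config style used by NServiceBus 3.x-era versions):

<configSections>
  <section name="UnicastBusConfig" type="NServiceBus.Config.UnicastBusConfig, NServiceBus.Core" />
</configSections>

<UnicastBusConfig>
  <MessageEndpointMappings>
    <!-- Subscription messages for events in MyCompany.Messages are dropped off
         at the Publisher's input queue on the MSMQ cluster (cluster name, not node name) -->
    <add Messages="MyCompany.Messages" Endpoint="publisherInputQueue@MyMsmqCluster" />
  </MessageEndpointMappings>
</UnicastBusConfig>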
There are two levels of failure: one is that the transport is down (say, MSMQ), and the other is that the endpoint is down (the Windows service). In the event that the endpoint is down, the transport will handle persisting the messages to disk. A redundant network storage device may be in order.
In the event that the transport is down, assuming it is MSMQ, the messages will back up on the Publisher's side. You therefore have to account for the size and number of messages to calculate how long you can let messages back up. Since the Publisher is clustered, you can be assured that the messages will arrive eventually, assuming you planned your disk space appropriately.

Msmq disaster recovery

I am trying to use a synchronization tool (Double-Take) to replicate the MSMQ storage folder "C:\Windows\System32\msmq\storage" from one server to another.
The problem is that once the files are moved to the second server, the Message Queuing service cannot be started.
I found that if I exclude the *.MQ files the synchronization works fine, but in that case I lose the transactional messages.
Does anybody have a solution that keeps the transactional messages?
Thank you
MSMQ uses multiple files in the storage directory for transactional messages. Any attempt to copy the storage directory while MSMQ is working on transactional messages is likely to result in files that are not in sync with each other. The only guaranteed way to do this is to stop the MSMQ service first. This is how MQBKUP.EXE works, for example; see the sketch below.
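For illustration, the stop-copy-start sequence might look like this on the source node (the target share is a placeholder; in an ongoing Double-Take setup, the replication job would take the place of the copy step):

net stop MSMQ
robocopy C:\Windows\System32\msmq\storage \\TARGETSERVER\msmq-storage /MIR
net start MSMQ

Because the service is stopped, the *.MQ files and the rest of the storage directory are copied in a mutually consistent state.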
Cheers
John