HornetQ concurrent session usage warning

With regard to this post, I am using HornetQ 2.4.0 embedded, through the HornetQ core API. I use it to queue messages received via a web service call. Another thread dequeues the messages synchronously and processes them. When enqueuing, I sometimes get the warning "WARN: HQ212051: Invalid concurrent session usage". Does this apply to my embedded usage, where I am not using the JMS API? I'm running in embedded Jetty and don't have a container. If I do need to guarantee single-threaded access, I would have to build my own pool or use a thread-local session, correct? (I would rather not synchronize access to the shared objects, for performance reasons.)
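A minimal sketch of the thread-local option mentioned in the question, assuming the session object is simply not safe to share between threads. `FakeSession` is a hypothetical stand-in, not the HornetQ `ClientSession` class; in real code the factory would create a session from the application's session factory, and each thread's session would still have to be closed at shutdown.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class PerThreadSessions {

    /** Hypothetical stand-in for a session; NOT the HornetQ ClientSession class. */
    static final class FakeSession {
        private static final AtomicInteger NEXT_ID = new AtomicInteger();
        final int id = NEXT_ID.incrementAndGet();
    }

    private final ThreadLocal<FakeSession> holder;

    PerThreadSessions(Supplier<FakeSession> factory) {
        // The factory runs lazily, once per thread, on that thread's first current().
        this.holder = ThreadLocal.withInitial(factory);
    }

    /** Returns the calling thread's own session, creating it on first use. */
    FakeSession current() {
        return holder.get();
    }

    /** Returns true if a second thread receives a different session than the caller. */
    boolean otherThreadGetsOwnSession() {
        FakeSession mine = current();
        FakeSession[] theirs = new FakeSession[1];
        Thread worker = new Thread(() -> theirs[0] = current());
        worker.start();
        try {
            worker.join(); // join() makes the worker's write to theirs[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return theirs[0] != null && theirs[0] != mine;
    }

    public static void main(String[] args) {
        PerThreadSessions sessions = new PerThreadSessions(FakeSession::new);
        System.out.println("same thread reuses session: " + (sessions.current() == sessions.current()));
        System.out.println("other thread gets its own: " + sessions.otherThreadGetsOwnSession());
    }
}
```

With a web container's worker threads this creates one session per request thread, so no locking is needed on the enqueue path. The cost is one open session per thread for as long as that thread lives.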


Use Kafka in AWS EBS microservice docker environment to avoid losing user requests and handle more concurrent hits

Currently, I deploy my microservices, which are written in Scala and Akka, in an AWS EBS microservice Docker environment. If any one of the microservice containers crashes and restarts, we lose the user requests it was handling and the service returns no response for them. My current architecture can handle up to 1000 concurrent requests without any issues. To avoid losing requests, I am planning to store and retrieve all requests and responses using Kafka.
So I want to use Kafka to manage the requests and responses of all my web services, and add a separate service or web socket that processes the requests and stores the responses back in Kafka. That way, if my core processing container crashes or restarts, no request or response is lost: it will start reading the requests from Kafka again and process them.
All the web services will store each request in the relevant Kafka topic, read the response from the relevant response topic, and return it as the API response. I have found the following library to use Kafka in Scala web services.
Please check the attached architecture diagram, which I plan to use to efficiently handle a large number of concurrent requests from client apps. Is this a good approach? Do I need to change anything in my architecture?
I created this architecture after doing more research on Kafka and Docker-based microservices. Please let me know if anything is wrong with it.
This is Kafka's bread and butter, so I don't think you're going to run into any architectural issues with this. Just be aware that Kafka comes with a fairly large amount of operational overhead. A good resource for getting started is Kafka: The Definitive Guide, made available by Confluent (https://www.confluent.io/wp-content/uploads/confluent-kafka-definitive-guide-complete.pdf). It goes into detail on a lot of common operational issues that the documentation doesn't mention.
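One detail the request-topic/response-topic design needs is a way to match each response to the HTTP call that is waiting for it. Below is a minimal sketch of that matching step using a correlation id. It is plain in-memory Java with no Kafka client involved, and the class and method names (`PendingReplies`, `register`, `complete`) are hypothetical.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

final class PendingReplies {
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    /** Called by the web service just before it publishes a request with this id. */
    CompletableFuture<String> register(String correlationId) {
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(correlationId, future);
        return future;
    }

    /** Called by the response-topic consumer; false for unknown or already answered ids. */
    boolean complete(String correlationId, String response) {
        CompletableFuture<String> future = pending.remove(correlationId);
        return future != null && future.complete(response);
    }

    /** Number of requests still waiting for a response. */
    int size() {
        return pending.size();
    }

    public static void main(String[] args) {
        PendingReplies replies = new PendingReplies();
        String id = UUID.randomUUID().toString();
        CompletableFuture<String> reply = replies.register(id);
        // A real service would publish the request (tagged with id) to the request topic here.
        replies.complete(id, "processed");
        System.out.println(reply.join());
    }
}
```

Note that this map lives in the memory of the web service instance. If that instance crashes, the request is still safe in Kafka, but the waiting HTTP call is gone, so the client has to retry or fetch the result later.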

Kafka Custom Authorizer

I use Kafka with a custom authorizer.
From the custom authorizer, I call a microservice for authorization. It works fine for a while, then starts throwing the following exception in the logs and the whole cluster becomes unresponsive. The exception keeps coming until I restart the cluster. Without the custom authorizer, however, the whole cluster runs fine for months without any issues. Is there a bug in this Kafka version, or is something wrong with the custom authorizer?
TRACE [ReplicaFetcherThread-0-39], Issuing to broker 1 of fetch request kafka.server.ReplicaFetcherThread$FetchRequest#8c63320 (kafka.server.ReplicaFetcherThread)
[2017-06-30 08:29:17,473] TRACE [ReplicaFetcherThread-2-1], Issuing to broker 1 of fetch request kafka.server.ReplicaFetcherThread$FetchRequest#67a143a (kafka.server.ReplicaFetcherThread)
[2017-06-30 08:29:17,473] WARN [ReplicaFetcherThread-3-1], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest#12d29e06 (kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to <HOST:PORT> (id: 1 rack: null) failed
at kafka.utils.NetworkClientBlockingOps$.awaitReady$1(NetworkClientBlockingOps.scala:83)
at kafka.utils.NetworkClientBlockingOps$.blockingReady$extension(NetworkClientBlockingOps.scala:93)
at kafka.server.ReplicaFetcherThread.sendRequest(ReplicaFetcherThread.scala:248)
at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:238)
at kafka.server.ReplicaFetcherThread.fetch(ReplicaFetcherThread.scala:42)
at kafka.server.AbstractFetcherThread.processFetchRequest(AbstractFetcherThread.scala:118)
at kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:103)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
My custom authorizer uses the microservice to check authorization and caches the data in a Guava cache with an expiry time of 10 minutes.
I suggest taking a thread dump to see what all the threads are doing.
Just a guess here, given there isn't much info to go on.
If you have a single cache instance, what could be happening is that once the cache expires, all requests start hitting the microservice for authorization info and, since this adds latency, the thread pool gets exhausted. A thread dump can tell you how many threads are calling the microservice simultaneously.
If this is indeed the problem, one option you could consider is a separate cache per thread (using a thread-local variable). That way each thread's cache expires at its own time and doesn't cause the other threads to hit the microservice at exactly the same moment.
Another, and IMO better, way is to remove the blocking calls to the microservice from the authorize code path completely. Instead of a fall-through cache, keep the cache always up to date by updating it from a separate background thread. This way no latency is ever added to the authorize calls.
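A minimal sketch of the background-refresh idea, assuming the full set of permissions is small enough to load in one call. The loader stands in for the microservice call, and the class name, the `principal|resource` key format, and the refresh period are all hypothetical. Nothing here uses the Kafka `Authorizer` interface; only the caching part is shown.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class RefreshingAclCache implements AutoCloseable {
    private final Supplier<Set<String>> loader;
    private final ScheduledExecutorService scheduler;
    // Readers only ever see a complete, immutable snapshot.
    private volatile Set<String> allowed;

    RefreshingAclCache(Supplier<Set<String>> loader, long period, TimeUnit unit) {
        this.loader = loader;
        this.allowed = snapshot(loader.get()); // one blocking load at startup
        this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "acl-refresh");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleWithFixedDelay(this::refresh, period, period, unit);
    }

    private static Set<String> snapshot(Set<String> source) {
        return Collections.unmodifiableSet(new HashSet<>(source));
    }

    /** Runs on the background thread; a failed load keeps the previous snapshot. */
    void refresh() {
        try {
            allowed = snapshot(loader.get());
        } catch (RuntimeException e) {
            // Keep serving the last good snapshot; a real authorizer would log this.
        }
    }

    /** Hot path: a memory lookup only, never a remote call. */
    boolean isAllowed(String principal, String resource) {
        return allowed.contains(principal + "|" + resource);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    public static void main(String[] args) {
        Set<String> source = ConcurrentHashMap.newKeySet();
        source.add("alice|orders");
        try (RefreshingAclCache cache = new RefreshingAclCache(() -> source, 10, TimeUnit.MINUTES)) {
            System.out.println(cache.isAllowed("alice", "orders"));
            source.add("bob|orders");
            cache.refresh(); // normally triggered by the scheduler
            System.out.println(cache.isAllowed("bob", "orders"));
        }
    }
}
```

The trade-off is staleness: a permission that is granted or revoked only takes effect after the next refresh, and a microservice outage means serving the last good snapshot instead of blocking request threads.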

Can I use Kafka queue in my Rest WEBSERVICE

I have a REST-based application deployed on a server (Tomcat).
Each request that reaches the server takes 1 second to serve. My issue is that the server sometimes receives more requests than it is capable of serving, which makes it unresponsive. I was thinking that I could store the requests in a queue, so that the server can pull requests from it and serve them, which would handle the peak-time issue.
Can Kafka be helpful for this? If yes, any pointers on where I can start?
You can use Kafka for this (or any other messaging system, e.g. ActiveMQ or RabbitMQ).
When the web service receives a request, it adds the request (with all the details required to process it) to a Kafka queue, using a Kafka message producer.
A separate service, using a Kafka consumer, reads from the topic (queue) and processes the request.
If the client needs to be told when the request has been processed, the server can push that information to the client over a WebSocket (or the client can poll for the request status; however, this needs a request status endpoint and will put load on that endpoint).
Apache Kafka would be helpful in your case. A Kafka broker will let you absorb a peak of requests. The requests will be stored in a queue, as you mentioned, and be handled by your server at its own speed.
As you are using Tomcat, I guess you developed your server in Java. Apache Kafka provides a Java API which is quite easy to use.
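To illustrate how a queue decouples accepting requests from serving them, here is a minimal in-memory sketch. It uses a bounded `BlockingQueue` as a stand-in for a Kafka topic, so unlike Kafka it is neither durable nor shared between processes. The class name and the capacity are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class RequestBuffer {
    private final BlockingQueue<String> queue;

    RequestBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Web tier: returns false when the buffer is full, so the caller can reject quickly. */
    boolean accept(String request) {
        return queue.offer(request);
    }

    /** Worker: removes up to max waiting requests, in arrival order. */
    List<String> drain(int max) {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch, max);
        return batch;
    }

    /** Number of requests waiting to be served. */
    int backlog() {
        return queue.size();
    }

    public static void main(String[] args) {
        RequestBuffer buffer = new RequestBuffer(100);
        for (int i = 1; i <= 5; i++) {
            buffer.accept("request-" + i); // a burst arrives faster than it can be served
        }
        System.out.println("worker takes: " + buffer.drain(2));
        System.out.println("still waiting: " + buffer.backlog());
    }
}
```

The web tier only pays the cost of an enqueue, while the worker pulls requests at the rate it can sustain. The callers then need another way to get their result, such as the WebSocket push or status polling described above.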

Weblogic 12c - Suddenly JMS Server stopped processing messages

We are facing a strange issue in our web service application.
It has 6 WebLogic managed instances: 4 (m01, m02, m04, m05) handle web service requests and post the messages to JMS queues, and 2 (m03, m06) are JMS instances with MDB components that actually process the messages from the queues.
We have observed that one of the JMS instances (M06) stops processing messages all of a sudden, without any errors in the application or server logs. We observed that the connection factory is not responding. This also causes hogging threads in the service instances while posting messages to and searching messages in the JMS queues. We are not able to see any issue in the thread dumps either.
In addition, when we try to stop the M06 instance it does not go down; eventually we had to kill the instance process and start the instance again to resolve the issue. It then works fine for a few days before the issue resurfaces.
We are using WebLogic 12c.
Has anyone faced this kind of issue before, or does anyone have an idea what could have gone wrong? Your input is greatly appreciated.
If I were you, I would start by creating an error queue, to get rid of any "poisoned" messages. More information can be found here: http://middlewaremagic.com/weblogic/?p=4670. Then check the error queue and the content of the messages there.
Secondly, try turning off the mentioned instance (M06) altogether. If the bottleneck or errors do not appear on some other node, check the M06 instance configuration and compare it with the other nodes; the issue will then be somewhere there.

NServiceBus and remote input queues with an MSMQ cluster

We would like to use the Publish / Subscribe abilities of NServiceBus with an MSMQ cluster. Let me explain in detail:
We have a SQL Server cluster that also hosts the MSMQ cluster. Besides SQL Server and MSMQ, we cannot host any other application on this cluster. This means our subscriber is not allowed to run on the cluster.
We have multiple application servers hosting different types of applications (going from ASP.NET MVC to SharePoint Server 2010). The goal is to do a pub/sub between all these applications.
All messages going through the pub/sub are critical and have an important value to the business. That's why we don't want local queues on the application server, but we want to use MSMQ on the cluster (in case we lose one of the application servers, we don't risk losing the messages since they are safe on the cluster).
Now I was assuming I could simply do the following at the subscriber side:
<?xml version="1.0" encoding="utf-8"?>
<section name="MsmqTransportConfig" type="NServiceBus.Config.MsmqTransportConfig, NServiceBus.Core" />
<MsmqTransportConfig InputQueue="myqueue#server" ErrorQueue="myerrorqueue"
NumberOfWorkerThreads="1" MaxRetries="5" />
I'm assuming this used to be supported, judging by the documentation: http://docs.particular.net/nservicebus/messaging/publish-subscribe/
But this actually throws an exception:
Exception when starting endpoint, error has been logged. Reason:
'InputQueue' entry in 'MsmqTransportConfig' section is obsolete. By
default the queue name is taken from the class namespace where the
configuration is declared. To override it, use .DefineEndpointName()
with either a string parameter as queue name or Func parameter
that returns queue name. In this instance, 'myqueue#server' is defined
as queue name.
Now, the exception clearly states I should use the DefineEndpointName method:
But this throws another exception, which is documented (input queues should be on the same machine):
Exception when starting endpoint, error has been logged. Reason: Input
queue must be on the same machine as this process.
How can I make sure that my messages are safe if I can't use MSMQ on my cluster?
I've also been looking into the dispatcher for a bit, and it doesn't seem to solve my issue either. I'm assuming the dispatcher wouldn't be able to get messages from a remote input queue either? And besides that, if the dispatcher dispatches messages to the workers and the workers go down, are my messages lost (even though they were not processed)?
To summarize, these are the things I'm wondering with my scenario in NServiceBus:
I want my messages to be safe on the MSMQ cluster and to use a remote input queue. Is this something I should or shouldn't do? Is it possible with NServiceBus?
Should I use a dispatcher in this case? Can it read from a remote input queue? (I cannot run the dispatcher on the cluster.)
What if the dispatcher dispatches messages to the workers and one of the workers goes down? Do I lose the message(s) that were being processed?
Phill's comment is correct.
The thing is that you would get the type of fault tolerance you require practically by default if you set up a virtualized environment. In that case, the C drive backing the local queue of your processes is actually sitting on the VM image on your SAN.
You will need a MessageEndpointMappings section that you will use to point to the Publisher's input queue. This queue is used by your Subscriber to drop off subscription messages. This will need to be QueueName#ClusterServerName. Be sure to use the cluster name and not a node name. The Subscriber's input queue will be used to receive messages from the Publisher and that will be local, so you don't need the #servername.
There are 2 levels of failure: one is that the transport is down (say MSMQ) and the other is that the endpoint is down (the Windows Service). In the event that the endpoint is down, the transport will handle persisting the messages to disk. A redundant network storage device may be in order.
In the event that the transport is down, assuming it is MSMQ, the messages will back up on the Publisher side of things. You therefore have to account for the size and number of messages to calculate how long messages can be allowed to back up. Since the Publisher is clustered, you can be assured that the messages will arrive eventually, assuming you planned your disk capacity appropriately.
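The disk planning mentioned above comes down to simple arithmetic: message rate times average message size times the outage you want to survive. A rough sketch follows; the figures in `main` are made-up example numbers, and per-message storage overhead in MSMQ is ignored.

```java
final class BacklogSizing {

    /** Bytes needed to hold every message published during an outage of the given length. */
    static long requiredBytes(long messagesPerSecond, long avgMessageBytes, long outageSeconds) {
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(Math.multiplyExact(messagesPerSecond, avgMessageBytes), outageSeconds);
    }

    /** Whole seconds of outage that the given amount of free disk space can absorb. */
    static long sustainableOutageSeconds(long availableBytes, long messagesPerSecond, long avgMessageBytes) {
        return availableBytes / Math.multiplyExact(messagesPerSecond, avgMessageBytes);
    }

    public static void main(String[] args) {
        // Hypothetical load: 50 messages per second of 4 KiB each, one hour of outage.
        System.out.println("bytes needed: " + requiredBytes(50, 4096, 3600));
        // Hypothetical disk budget: 1 GB reserved for the backlog.
        System.out.println("seconds covered: " + sustainableOutageSeconds(1_000_000_000L, 50, 4096));
    }
}
```

Plugging in the real publish rate and message size gives a lower bound on the disk space to reserve on the Publisher side for the longest outage you intend to tolerate.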