Apache Curator: no leader is selected intermittently

I am using the Apache Curator Leader Election recipe (https://curator.apache.org/curator-recipes/leader-election.html) in my application.
Zookeeper version : 3.5.7
Curator : 4.0.1
Below is the sequence of steps:
1. Whenever a Tomcat server instance starts up, I create a single CuratorFramework instance (one per Tomcat server) and start it:
CuratorFramework client = CuratorFrameworkFactory.newClient(connectionString, retryPolicy);
client.start();
if(!client.blockUntilConnected(10, TimeUnit.MINUTES)){
LOGGER.error("Zookeeper connection could not establish!");
throw new RuntimeException("Zookeeper connection could not establish");
}
2. Create an instance of LSAdapter and start it:
LSAdapter adapter = new LSAdapter(client, <some_metadata>);
adapter.start();
Below is my LSAdapter class :
public class LSAdapter extends LeaderSelectorListenerAdapter implements Closeable {
//<Class instance variables defined>
public LSAdapter(CuratorFramework client, <some_metadata>) {
leaderSelector = new LeaderSelector(client, <path_to_be_used_for_leader_election>, this);
leaderSelector.autoRequeue();
}
public void start() throws IOException {
leaderSelector.start();
}
@Override
public void close() throws IOException {
leaderSelector.close();
}
@Override
public void takeLeadership(CuratorFramework client) throws Exception {
final int waitSeconds = (int) (5 * Math.random()) + 1;
LOGGER.info(name + " is now the leader. Waiting " + waitSeconds + " seconds...");
LOGGER.debug(name + " has been leader " + leaderCount.getAndIncrement() + " time(s) before.");
while (true) {
try {
Thread.sleep(TimeUnit.SECONDS.toMillis(waitSeconds));
//do leader tasks
} catch (InterruptedException e) {
LOGGER.error(name + " was interrupted.");
//cleanup
Thread.currentThread().interrupt();
} finally {
}
}
}
}
3. When the server instance shuts down, close the LSAdapter instance (which the application is using) and close the CuratorFramework client that was created:
CloseableUtils.closeQuietly(lsAdapter);
curatorFrameworkClient.close();
The issue I am facing is that at times, when a server is restarted, no leader gets elected. I checked that by tracing the log inside takeLeadership(). I have two Tomcat server instances running the above code and connecting to the same ZooKeeper quorum. Most of the time one of the instances becomes leader, but when this issue happens, both of them remain followers. Please suggest what I am doing wrong.

As I answered on Curator's Jira, you are swallowing the interrupted exception. When you get InterruptedException you must exit your takeLeadership(). In your code example, you are merely resetting the interrupted state and continuing the loop - this will cause an infinite loop of interrupted exceptions, btw. After calling Thread.currentThread().interrupt(); you should exit the while loop.
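The fix described above can be sketched without any Curator types. This is only an illustration of the loop shape, assuming the leader work is a periodic task: runUntilInterrupted is a hypothetical stand-in for the body of takeLeadership(), and the class and helper names are made up for this example.

```java
import java.util.concurrent.CountDownLatch;

class LeadershipLoopSketch {
    // Hypothetical stand-in for the body of takeLeadership(); no Curator types are used.
    static int runUntilInterrupted(CountDownLatch started) {
        int rounds = 0;
        started.countDown();
        while (true) {
            try {
                Thread.sleep(50);
                rounds++; // leader tasks would go here
            } catch (InterruptedException e) {
                // Restore the interrupt flag, then leave the method so leadership is released.
                Thread.currentThread().interrupt();
                return rounds;
            }
        }
    }

    // Starts the loop on a worker thread, interrupts it, and reports whether it terminated.
    static boolean exitsAfterInterrupt() {
        CountDownLatch started = new CountDownLatch(1);
        Thread worker = new Thread(() -> runUntilInterrupted(started));
        worker.start();
        try {
            started.await();
            worker.interrupt();
            worker.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("exited after interrupt: " + exitsAfterInterrupt());
    }
}
```

The only difference from the loop in the question is the return after the interrupt flag is restored; without it, every later Thread.sleep() call throws immediately and the method never returns.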

Flink Kafka producer sends duplicate messages in exactly-once mode after checkpoint restore

I am writing a test case for Flink two-phase commit; an overview is below.
"sink kafka" is an exactly-once Kafka producer. "sink step" is a MySQL sink that extends the two-phase commit sink function. "sink compare" is also a MySQL sink that extends the two-phase commit sink function, and it occasionally throws an exception to simulate a failed checkpoint.
When a checkpoint fails and the job is restored, the MySQL two-phase commit works fine, but the Kafka consumer reads from the offset of the last successful checkpoint and the Kafka producer produces messages again, even though it had already produced them before the checkpoint failed.
How can I avoid duplicate messages in this case?
Thanks for the help.
env:
flink 1.9.1
java 1.8
kafka 2.11
kafka producer code:
dataStreamReduce.addSink(new FlinkKafkaProducer<>(
"flink_output",
new KafkaSerializationSchema<Tuple4<String, String, String, Long>>() {
@Override
public ProducerRecord<byte[], byte[]> serialize(Tuple4<String, String, String, Long> element, @Nullable Long timestamp) {
UUID uuid = UUID.randomUUID();
JSONObject jsonObject = new JSONObject();
jsonObject.put("uuid", uuid.toString());
jsonObject.put("key1", element.f0);
jsonObject.put("key2", element.f1);
jsonObject.put("key3", element.f2);
jsonObject.put("indicate", element.f3);
return new ProducerRecord<>("flink_output", jsonObject.toJSONString().getBytes(StandardCharsets.UTF_8));
}
},
kafkaProps,
FlinkKafkaProducer.Semantic.EXACTLY_ONCE
)).name("sink kafka");
checkpoint settings:
StreamExecutionEnvironment executionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment();
executionEnvironment.enableCheckpointing(10000);
executionEnvironment.getCheckpointConfig().setTolerableCheckpointFailureNumber(0);
executionEnvironment.getCheckpointConfig().setPreferCheckpointForRecovery(true);
mysql sink:
dataStreamReduce.addSink(
new TwoPhaseCommitSinkFunction<Tuple4<String, String, String, Long>,
Connection, Void>
(new KryoSerializer<>(Connection.class, new ExecutionConfig()), VoidSerializer.INSTANCE) {
int count = 0;
Connection connection;
@Override
protected void invoke(Connection transaction, Tuple4<String, String, String, Long> value, Context context) throws Exception {
if (count > 10) {
throw new Exception("compare test exception.");
}
PreparedStatement ps = transaction.prepareStatement(
" insert into test_two_step_compare(slot_time, key1, key2, key3, indicate) " +
" values(?, ?, ?, ?, ?) " +
" ON DUPLICATE KEY UPDATE indicate = indicate + values(indicate) "
);
ps.setString(1, context.timestamp().toString());
ps.setString(2, value.f0);
ps.setString(3, value.f1);
ps.setString(4, value.f2); // key3 comes from f2, as in the Kafka serializer above
ps.setLong(5, value.f3);
ps.execute();
ps.close();
count += 1;
}
@Override
protected Connection beginTransaction() throws Exception {
LOGGER.error("compare in begin transaction");
try {
if (connection.isClosed()) {
throw new Exception("mysql connection closed");
}
}catch (Exception e) {
LOGGER.error("mysql connection is error: " + e.toString());
LOGGER.error("reconnect mysql connection");
String jdbcURI = "jdbc:mysql://";
Class.forName("com.mysql.jdbc.Driver");
Connection connection = DriverManager.getConnection(jdbcURI);
connection.setAutoCommit(false);
this.connection = connection;
}
return this.connection;
}
@Override
protected void preCommit(Connection transaction) throws Exception {
LOGGER.error("compare in pre Commit");
}
@Override
protected void commit(Connection transaction) {
LOGGER.error("compare in commit");
try {
transaction.commit();
} catch (Exception e) {
LOGGER.error("compare Commit error: " + e.toString());
}
}
@Override
protected void abort(Connection transaction) {
LOGGER.error("compare in abort");
try {
transaction.rollback();
} catch (Exception e) {
LOGGER.error("compare abort error." + e.toString());
}
}
@Override
protected void recoverAndCommit(Connection transaction) {
super.recoverAndCommit(transaction);
LOGGER.error("compare in recover And Commit");
}
@Override
protected void recoverAndAbort(Connection transaction) {
super.recoverAndAbort(transaction);
LOGGER.error("compare in recover And Abort");
}
})
.setParallelism(1).name("sink compare");
I'm not quite sure I understand the question correctly:
When checkpoint is failed and restore, I find mysql two step commit will work fine, but kafka producer will read offset from last success and produce message even he was done it before this checkpoint failed.
Kafka producer is not reading any data. So, I'm assuming your whole pipeline rereads old offsets and produces duplicates. If so, you need to understand how Flink ensures exactly once.
1. Periodic checkpoints are created to have a consistent state in case of failure.
2. These checkpoints contain the offset of the last successfully read record at the time of the checkpoint.
3. Upon recovery Flink will reread all records from the offset stored in the last successful checkpoint. Thus, the same records will be replayed as have been generated in between last checkpoint and failure.
4. The replayed records will restore the state right before the failure.
5. It will produce duplicate outputs originating from the replayed input records.
6. It is the responsibility of the sinks to ensure that no duplicates are effectively written to the target system.
For the last point, there are two options:
- Only output data when a checkpoint has been written, such that no effective duplicates can ever appear in the target. This naive approach is very universal (independent of the sink) but adds the checkpointing interval to the latency.
- Let the sink deduplicate the output.
The latter option is used for the Kafka sink. It uses Kafka transactions for letting it deduplicate data. To avoid duplicates on consumer side, you need to ensure it's not reading uncommitted data as mentioned in the documentation. Also make sure your transaction timeout is large enough that it doesn't discard data between failure and recovery.
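The two settings mentioned above can be sketched as plain property maps. This is a minimal illustration, not the asker's configuration: isolation.level and transaction.timeout.ms are standard Kafka client settings, while the class name, the helper methods, and the 300-second recovery allowance are assumptions made for this example.

```java
import java.util.Properties;

class ExactlyOnceKafkaSettings {
    // Consumer side: only return records from committed transactions.
    static Properties consumerProps() {
        Properties props = new Properties();
        props.setProperty("isolation.level", "read_committed");
        return props;
    }

    // Producer side: let a transaction outlive one checkpoint interval plus
    // the time the job is expected to need to recover after a failure.
    static Properties producerProps(long checkpointIntervalMs, long expectedRecoveryMs) {
        Properties props = new Properties();
        long timeoutMs = checkpointIntervalMs + expectedRecoveryMs;
        props.setProperty("transaction.timeout.ms", Long.toString(timeoutMs));
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps());
        // 10 s checkpoint interval as in the question; 300 s recovery time is an assumed value.
        System.out.println(producerProps(10_000L, 300_000L));
    }
}
```

The producer properties would be passed as the kafkaProps argument of the FlinkKafkaProducer in the question. Note that the broker caps the producer value with its own transaction.max.timeout.ms setting, so the two have to be chosen together.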

Netty Client Connect with Server, but server does not fire channelActive/Registered

I have the following architecture in use:
- [Client] - The enduser connecting to our service.
- [GameServer] - The game server on which the game is running.
- [GameLobby] - A server that is responsible for matching Clients with a GameServer.
If, for example, 4 Clients want to play a game and get matched through a GameLobby, then the first time all of these connections succeed.
However, when they decide to rematch, one of the Clients will not connect properly.
The connections between all the Clients and the GameServer are made simultaneously.
Clients that rematch first close their current connection to the GameServer and head into the lobby again.
The new connection appears to succeed, and no errors are thrown. Even the ChannelFuture shows that the client connection was made properly; the following values show that the client thinks the connection is fine:
- ChannelFuture.isSuccess() = True
- ChannelFuture.isDone() = True
- ChannelFuture.cause() = Null
- ChannelFuture.isCancelled() = False
- Channel.isOpen() = True
- Channel.isActive() = True
- Channel.isRegistered() = True
- Channel.isWritable() = True
Thus, according to the Client, the connection was made properly. However, on the GameServer, channelRegistered/channelActive in the SimpleChannelInboundHandler is never called for that specific Client, only for the other 3 Clients.
All 4 Clients, the GameServer, and the Lobby are running on the same IP address.
Since it only happens when (re)connecting to the GameServer, I thought that it had to do with not properly closing the connection. Currently this is done through:
try {
group.shutdownGracefully();
channel.closeFuture().sync();
} catch (InterruptedException e) {
e.printStackTrace();
}
On the GameServer, channelUnregistered is called, so this is working and the connection is destroyed.
I have tried adding listeners to the ChannelFuture of the malfunctioning channel connection; however, according to the ChannelFuture everything works, which is not the case.
I also tried adding ChannelOptions to allow more Clients to be queued at the server.
GameServer
The GameServer server is initialized as follows:
// Create the bootstrap to make this act like a server.
ServerBootstrap serverBootstrap = new ServerBootstrap();
serverBootstrap.group(bossGroup)
.channel(NioServerSocketChannel.class)
.childHandler(new ChannelInitialisation(new ClientInputReader(gameThread)))
.option(ChannelOption.SO_BACKLOG, 1000)
.childOption(ChannelOption.SO_KEEPALIVE, true)
.childOption(ChannelOption.TCP_NODELAY, true);
bossGroup.execute(gameThread); // Executing the thread that handles all games on this GameServer.
// Launch the server with the specific port.
serverBootstrap.bind(port).sync();
The GameServer ClientInputReader
@ChannelHandler.Sharable
public class ClientInputReader extends SimpleChannelInboundHandler<Packet> {
private ServerMainThread serverMainThread;
public ClientInputReader(ServerMainThread serverMainThread) {
this.serverMainThread = serverMainThread;
}
@Override
public void channelRegistered(ChannelHandlerContext ctx) throws Exception {
System.out.println("[Connection: " + ctx.channel().id() + "] Channel registered");
super.channelRegistered(ctx);
}
@Override
protected void channelRead0(ChannelHandlerContext ctx, Packet packet) {
// Packet handling
}
}
The malfunctioning connection does not trigger any method of the SimpleChannelInboundHandler, not even exceptionCaught.
The GameServer ChannelInitialisation
public class ChannelInitialisation extends ChannelInitializer<SocketChannel> {
private SimpleChannelInboundHandler channelInputReader;
public ChannelInitialisation(SimpleChannelInboundHandler channelInputReader) {
this.channelInputReader = channelInputReader;
}
@Override
protected void initChannel(SocketChannel ch) throws Exception {
ChannelPipeline pipeline = ch.pipeline();
// every packet is prefixed with the amount of bytes that will follow
pipeline.addLast(new LengthFieldBasedFrameDecoder(Integer.MAX_VALUE, 0, 4, 0, 4));
pipeline.addLast(new LengthFieldPrepender(4));
pipeline.addLast(new PacketEncoder(), new PacketDecoder(), channelInputReader);
}
}
Client
Client creating a GameServer connection:
// Configure the client.
group = new NioEventLoopGroup();
Bootstrap b = new Bootstrap();
b.group(group)
.channel(NioSocketChannel.class)
.option(ChannelOption.TCP_NODELAY, true)
.handler(new ChannelInitialisation(channelHandler));
// Start the client.
channel = b.connect(address, port).await().channel();
/* At this point, the client thinks that the connection was successful, as the channel is active, open, registered and writable... */
ClientInitialisation:
public class ChannelInitialisation extends ChannelInitializer<SocketChannel> {
private SimpleChannelInboundHandler<Packet> channelHandler;
ChannelInitialisation(SimpleChannelInboundHandler<Packet> channelHandler) {
this.channelHandler = channelHandler;
}
@Override
public void initChannel(SocketChannel ch) throws Exception {
// prefix messages by the length
ch.pipeline().addLast(new LengthFieldBasedFrameDecoder(Integer.MAX_VALUE, 0, 4, 0, 4));
ch.pipeline().addLast(new LengthFieldPrepender(4));
// our encoder, decoder and handler
ch.pipeline().addLast(new PacketEncoder(), new PacketDecoder(), channelHandler);
}
}
ClientHandler:
public class ClientPacketHandler extends SimpleChannelInboundHandler<Packet> {
@Override
public void channelActive(ChannelHandlerContext ctx) throws Exception {
super.channelActive(ctx);
System.out.println("Channel active: " + ctx.channel().id());
ctx.channel().writeAndFlush(new PacketSetupClientToGameServer());
System.out.println("Sending setup packet to the GameServer: " + ctx.channel().id());
// This is successfully called, as the client thinks the connection was properly made.
}
@Override
protected void channelRead0(ChannelHandlerContext ctx, Packet packet) {
// Reading packets.
}
}
I expect the Client to connect properly to the server, since the other Clients connect properly and this Client could previously connect just fine.
TL;DR: When multiple Clients try to create a new match, there is a possibility that one or more Clients will not connect properly to the server after the previous connection was closed.
For anyone who struggles with this issue in some way or another:
I used a workaround that allows me to continue, even though there is still a bug inside the Netty framework (as far as I am concerned). The workaround is quite simple: create a connection pool.
My solution uses a maximum of five connections inside the connection pool. If one of the connections gets no reply from the GameServer, it is not that big of a deal, since there are four others that have a high chance of succeeding. I know this is a bad workaround, but I could not find any information on this issue. It works and adds a maximum delay of 5 seconds (each retry takes a second).
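The retry idea behind this workaround can be sketched roughly as follows. This is not the poster's code and uses no Netty types: attemptSucceeds is a hypothetical callback standing in for "open a connection and wait up to a second for the GameServer's reply", and the class and method names are invented for the illustration.

```java
import java.util.function.IntPredicate;

class ConnectRetrySketch {
    // Tries up to maxAttempts connections in turn and returns the index of the
    // first one that gets a reply, or -1 if none of them does.
    static int firstWorkingAttempt(int maxAttempts, IntPredicate attemptSucceeds) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (attemptSucceeds.test(attempt)) {
                return attempt;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Simulate the first two connections getting no reply from the server.
        int winner = firstWorkingAttempt(5, attempt -> attempt >= 2);
        System.out.println("first working connection: " + winner);
    }
}
```

With five attempts and a one-second wait per attempt, the worst case matches the 5-second delay mentioned above.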

Kafka Transactional Producer

I am using Kafka 2 and I was going through the following link.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging
Below is my sample code for Transactional producer.
My code:
public void runProducer(final int sendMessageCount) throws Exception {
final Producer<Long, String> producer = createProducer();
producer.initTransactions();
final long time = System.currentTimeMillis();
try {
producer.beginTransaction();
for (long index = time; index < (time + sendMessageCount); index++) {
final ProducerRecord<Long, String> record =
new ProducerRecord<>(TOPIC, index,
"Test " + index);
// send returns Future
producer.send(record).get();
}
producer.commitTransaction();
}
catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
e.printStackTrace();
// We can't recover from these exceptions, so our only option is to close the producer (done in finally) and exit.
}
catch (final KafkaException e) {
e.printStackTrace();
// For all other exceptions, just abort the transaction and try again.
producer.abortTransaction();
}
finally {
// close() waits for outstanding sends to complete, so a separate flush() is not needed.
producer.close();
}
}
Questions:
1. Do we need to call endTransaction after commitTransaction?
2. Do we need to call sendOffsetsToTransaction? What happens if I don't include it?
3. How does it work when we deploy the same code to multiple servers with the same transactionId? Do we need a separate transactionId for each instance? Say machine1 crashes after beginTransaction() and after sending a few records: how does machine2, with the same transactionId, recover?
4. Machine1 is using transactionId "test" and crashes after beginTransaction() and after producing a few records. When the same instance comes back up, how does it resume the same transaction? It will actually start again from initTransactions() and beginTransaction().
5. How does it work for a topic that was previously written without transactions and is now written with transactions? If I start a new consumer group with isolation.level=read_committed, will it read the messages that were written before transactions were used? Will a consumer with isolation.level=read_uncommitted see the messages that were aborted by a transaction?

jboss-eap-6 HA singleton deploying multiple web archives in standalone configuration

I am able to deploy my EAR and WARs in my standalone cluster. Two of my WARs are for the HA singleton. Soon after starting the first standalone jboss-eap-6, I start the second. When all my applications have deployed successfully, I open JConsole and notice that one of my singleton WARs is running on the first jboss-eap-6 and the second singleton WAR is running on the second jboss-eap-6. Also, in JConsole only one jboss-eap-6 is reported as primary.
My question is: is there some way in the jboss-eap-6 standalone.xml to force only one jboss-eap-6 to run the singleton HA WARs? Or would I have to package the WARs into an EAR?
I don't think there is anything in standalone.xml that would change the behaviour of a WAR. In any case, you should be using standalone-ha.xml for a cluster with HA singletons deployed.
The JBoss High Availability Singleton architecture changed significantly between JBoss EAP 5 and 6.
Under JBoss EAP 5 you just placed your deployable object in the special deploy-hasingleton deployment folder. Under JBoss EAP 6 your classes need to implement a JBoss service layer, specifically org.jboss.msc.service.Service along with an org.jboss.msc.service.ServiceActivator. It is the implementation of these service classes that controls the instantiation and management of your HA singleton. I have not tried deploying an HA singleton as a WAR, and I have some doubts, because I suspect the dependent service classes may not be available in the web container.
The ServiceActivator is responsible for managing the lifecycle of the Service. The ServiceActivator implementation class needs to be listed in the file META-INF/services/org.jboss.msc.service.ServiceActivator for JBoss to activate it during startup/deployment.
Example:
Create a Service Activator
public abstract class SingletonActivator<T extends Serializable> implements ServiceActivator {
@Override
public SingletonService<String> instantiateSingleton() {
return new SingletonService<String>();
}
public ServiceName getServiceName() {
return ServiceName.JBOSS.append("my", "ha", "singleton");
}
/**
* Activated by the Service Activator
*
* @param service
* @param serviceName
* - the Singleton Service Name that is registered in the JBOSS cluster
*/
@Override
public final void activate(ServiceActivatorContext context) {
SingletonService<T> service = instantiateSingleton();
SingletonService<T> singleton = new SingletonService<T>(service, getServiceName());
/*
* The NamePreference is a combination of the node name (-Djboss.node.name) and the name of
* the configured cache "singleton". If there is more than 1 node, it is possible to add more than
* one name and the election will use the first available node in that list.
*/
// e.g. singleton.setElectionPolicy(new PreferredSingletonElectionPolicy(new SimpleSingletonElectionPolicy(), new NamePreference("node1/singleton")));
// or singleton.setElectionPolicy(new PreferredSingletonElectionPolicy(new SimpleSingletonElectionPolicy(), new NamePreference("node1/singleton"), new
// NamePreference("node2/singleton")));
singleton.build(new DelegatingServiceContainer(context.getServiceTarget(), context.getServiceRegistry())).setInitialMode(ServiceController.Mode.ACTIVE).install();
}
}
Create A HA Singleton Service Class that is solely responsible for looking up and invoking your EJB containing your business logic
public class SingletonService<T> implements Service<T> {
protected ScheduledExecutorService deployDelayThread = null;
/**
* The node we are running on
*/
protected String nodeName;
/**
* A flag whether the service is started (or scheduled to be started)
*/
protected final AtomicBoolean started = new AtomicBoolean(false);
/**
* Container life cycle call upon activation. This will construct the singleton instance in this JVM and start the Timer.
*/
@Override
public final void start(StartContext context) throws StartException {
this.nodeName = System.getProperty("jboss.node.name");
logger.info("Starting service '" + this.getClass().getName() + "' on node " + nodeName);
if (!started.compareAndSet(false, true)) {
throw new StartException("The service " + this.getClass().getName() + " is still started!");
}
// MSC does not allow this thread to be blocked so we let the service know that the start is asynchronous and the result will be advised later.
// We delay the actual deployment of the Singleton for a few seconds to allow time for a HASingleton Election to be held and won by one of the instances.
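// If the winner is not this instance (prior to deployment) then stop(Context) is invoked which sets started to false and the deployment does not occur.
// context.asynchronous();
// Create the scheduler here: the field is initialised to null, so scheduling on it directly would throw a NullPointerException.
deployDelayThread = Executors.newSingleThreadScheduledExecutor();
deployDelayThread.schedule(new StartSingletonAsync(context), 10, TimeUnit.SECONDS);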
context.complete();
}
/** Runs after the delay scheduled in start() (10 s), giving time for the HA singleton election to be held and won */
private class StartSingletonAsync implements Runnable {
private StartSingletonAsync(StartContext context) {
}
@Override
public void run() {
try {
startSingletonBean();
} catch (StartException e) {
logger.info("Start Exception", e);
}
// be nice to the garbage collector, we don't need this any more
deployDelayThread.shutdown();
deployDelayThread = null;
}
}
private void startSingletonBean() throws StartException {
try {
if (!started.get()) {
throw new StartException("Aborted due to service stopping");
}
// Start your EJB
InitialContext ic = new InitialContext();
bean = (JmxMBean) ic.lookup(getJndiName()); // lookup() returns Object, so cast as in stop()
bean.startHaSingleton();
logger.info("*** Master Only: HASingleton service " + getJndiName() + " started on master:" + nodeName);
if (!bean.isRunning()) {
logger.error("ERROR Bean should be running");
}
} catch (NamingException e) {
throwStartException(e);
}
}
private void throwStartException(Exception e) throws StartException {
String message = "Could not initialize HASingleton" + getJndiName() + " on " + nodeName;
logger.error(message, e);
throw new StartException(message, e);
}
/**
* Container life cycle call upon deactivation
*/
@Override
public final void stop(StopContext context) {
if (deployDelayThread != null) {
deployDelayThread.shutdownNow();
}
if (!started.compareAndSet(true, false) || bean == null) {
logger.warn("The service '" + this.getClass().getName() + "' is not active!");
} else {
try {
InitialContext ic = new InitialContext();
bean = (JmxMBean) ic.lookup(getJndiName());
bean.stopHaSingleton();
logger.info("*** Master Only: HASingleton service " + getJndiName() + " stopped on master:" + nodeName);
} catch (EJBException e) {
// Note: all these exceptions are already logged by JBoss
} catch (NamingException e) {
logger.error("Could not stop HASingleton service " + getJndiName() + " on " + nodeName, e);
}
logger.info("MASTER ONLY HASingleton service '" + this.getClass().getName() + "' Stopped on node " + nodeName);
}
}
private String getJndiName() {
return "java:global/path/to/your/singleton/ejb";
}
}
Finally, list your Activator class in META-INF/services/org.jboss.msc.service.ServiceActivator:
com.mycompany.singletons.SingletonActivator
You may also need to add dependencies to the manifest META-INF/MANIFEST.MF file inside your jar as follows: Dependencies: org.jboss.msc, org.jboss.as.clustering.singleton, org.jboss.as.server
There is a more extensive implementation guide available from Redhat at https://access.redhat.com/documentation/en-US/JBoss_Enterprise_Application_Platform/6.4/html/Development_Guide/Implement_an_HA_Singleton.html. You may need to create a Redhat account to access this. There is also a quickstart example in the JBoss distribution.
So after further evaluation, the 2 singletons eventually would merge together after a few minutes, thus creating 1 intended singleton.

How to dispatch incoming NetSocket handlers into different event loop threads?

I'm trying to use Vert.x to implement a TCP server that accepts incoming connections and then handles the different sockets. Since each socket can be handled independently, the handlers belonging to different sockets are supposed to run concurrently in different event loop threads.
According to the Vert.x documentation:
Standard verticles are assigned an event loop thread when they are created and the start method is called with that event loop. When you call any other methods that takes a handler on a core API from an event loop then Vert.x will guarantee that those handlers, when called, will be executed on the same event loop.
I expected this code snippet to print different thread names:
Vertx vertx = Vertx.vertx(); // The number of event loop threads is 2*core.
vertx.createNetServer().connectHandler(socket -> {
vertx.deployVerticle(new AbstractVerticle() {
@Override
public void start() throws Exception {
socket.handler(buffer -> {
log.trace(socket.toString() + ": Socket Message");
socket.close();
});
}
});
}).listen(port);
But unfortunately, all handlers ran on the same thread.
23:59:42.359 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@253fa4f2: Socket Message
23:59:42.364 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@465f1533: Socket Message
23:59:42.365 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@5ab8dac: Socket Message
23:59:42.366 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@5fc72993: Socket Message
23:59:42.367 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@38ee66d7: Socket Message
23:59:42.368 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@6a60a74: Socket Message
23:59:42.369 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@5f3921e1: Socket Message
23:59:42.370 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@39d41024: Socket Message
... more than 100+ lines ...
A counterexample is this echo server written in Boost.Asio: its handlers run in different event loop threads if a thread pool is used to execute io_service::run().
So, my question is how to run these handlers concurrently?
Actually, you are doing something entirely different from what you intend.
Each time you receive a connection on your socket, you deploy a new verticle (a new actor).
The simplest way to prove that:
Vertx vertx = Vertx.vertx(); // The number of event loop threads is 2*core.
vertx.createHttpServer().requestHandler(request -> {
vertx.deployVerticle(new AbstractVerticle() {
String uuid = UUID.randomUUID().toString(); // Some random unique number
@Override
public void start() throws Exception {
request.response().end(uuid + " " + Thread.currentThread().getName());
}
});
}).listen(8888);
vertx.setPeriodic(1000, r -> {
System.out.println(vertx.deploymentIDs().size()); // Print verticles count every second
});
I'm using an HTTP server just because it's easier to check in a browser.
As wrong as this approach may be, you'll still see that the requests are handled on different threads:
fe931b18-89cc-4c6a-9d6a-8565bb1f1c12 vert.x-eventloop-thread-9
277330da-4df8-4e91-bd8f-82c0f62156d0 vert.x-eventloop-thread-11
bbd3207c-80a4-41d8-9be5-b40727badc84 vert.x-eventloop-thread-13
Now to how you should do it:
// We create 10 workers
for (int i = 0; i < 10; i++) {
vertx.deployVerticle(new AbstractVerticle() {
@Override
public void start() {
vertx.eventBus().consumer("processMessage", (request) -> {
// Do something smart
// Reply
request.reply("I'm on thread " + Thread.currentThread().getName());
});
}
});
}
// This is your handler
vertx.createHttpServer().requestHandler(request -> {
// Only one server, that should dispatch events to workers as quickly as possible
vertx.eventBus().send("processMessage", null, (response) -> {
if (response.succeeded()) {
request.response().end("Request :" + response.result().body().toString());
}
// Handle errors
});
}).listen(8888);
vertx.setPeriodic(1000, r -> {
System.out.println(vertx.deploymentIDs().size()); // Notice that number of workers doesn't change
});
It's not possible to determine which event loop Vert.x will assign to each of your verticles without more details (number of cores of your test machines for example).
Anyway, it is not a good idea to deploy a verticle per incoming connection. Verticles are units of deployment in Vert.x. You would typically create one per "functionality".
Back to your use case, the purpose of event driven programming is precisely to avoid using a thread per connection. You can handle a lot of concurrent connections with a single event loop. If you have multiple cores on your machine then you can deploy multiple instances of your verticle to use them all (1 event loop per core).
int processors = Runtime.getRuntime().availableProcessors();
Vertx vertx = Vertx.vertx();
vertx.deployVerticle(TCPServerVerticle.class.getName(), new DeploymentOptions().setInstances(processors));
public class TCPServerVerticle extends AbstractVerticle {
@Override
public void start(Future<Void> startFuture) throws Exception {
vertx.createNetServer().connectHandler(socket -> {
socket.handler(buffer -> {
log.trace(socket.toString() + ": Socket Message");
socket.close();
});
}).listen(port, ar -> {
if (ar.succeeded()) {
startFuture.complete();
} else {
startFuture.fail(ar.cause());
}
});
}
}
With Vert.x TCP server sharing, the connect handlers are called in a round-robin fashion across the deployed instances.
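As a toy model of what round-robin dispatch means here (this is not Vert.x code and makes no claim about how Vert.x implements it), each new connection can be pictured as being handed to the next verticle instance in turn. The class name and pick() method are invented for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class RoundRobinSketch {
    private final int instances;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSketch(int instances) {
        this.instances = instances;
    }

    // Returns the index of the instance that handles the next incoming connection.
    int pick() {
        return Math.floorMod(next.getAndIncrement(), instances);
    }

    public static void main(String[] args) {
        RoundRobinSketch dispatcher = new RoundRobinSketch(3);
        for (int conn = 0; conn < 6; conn++) {
            System.out.println("connection " + conn + " -> instance " + dispatcher.pick());
        }
    }
}
```

Each instance keeps its own event loop, so connections assigned to different instances are handled concurrently, while all handlers for one connection stay on the same thread.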