Netty client connects to server, but server never fires channelActive/channelRegistered

I have the following architecture in use:
- [Client] - The enduser connecting to our service.
- [GameServer] - The game server on which the game is running.
- [GameLobby] - A server that is responsible for matching Clients with a GameServer.
If, for example, 4 Clients want to play a game and get matched through the GameLobby, then the first time all of these connections succeed properly.
However, when they decide to rematch, one of the Clients will not connect properly.
The connections between the Clients and the GameServer are all made simultaneously.
A Client that rematches first removes its current connection with the GameServer and then heads into the lobby again.
This connection succeeds and no errors are thrown. Even the ChannelFuture indicates that the client connection was made properly; the following values show that the client considers the connection correct:
- ChannelFuture.isSuccess() = True
- ChannelFuture.isDone() = True
- ChannelFuture.cause() = Null
- ChannelFuture.isCancelled() = False
- Channel.isOpen() = True
- Channel.isActive() = True
- Channel.isRegistered() = True
- Channel.isWritable() = True
Thus, according to the Client, the connection was made properly. However, on the GameServer the SimpleChannelInboundHandler never has channelRegistered/channelActive called for that specific Client, only for the other 3 Clients.
All 4 Clients, the GameServer, and the GameLobby run on the same IP address.
Since it only happens when reconnecting to the GameServer, I suspected it had to do with not closing the previous connection properly. Currently the connection is closed like this:
try {
    group.shutdownGracefully();
    channel.closeFuture().sync();
} catch (InterruptedException e) {
    e.printStackTrace();
}
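For reference, a common shutdown order is the reverse of the above: actively close the channel first, wait for it, and only then shut down the event loop group (and wait for that too) before reconnecting. A minimal sketch using the same channel/group fields as above:

channel.close().sync();            // actively close the channel and wait for it
group.shutdownGracefully().sync(); // then release the event loop threads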
On the GameServer, channelUnregistered is called, so this part works and the connection is destroyed.
I have tried adding listeners to the ChannelFuture of the malfunctioning connection, but according to the ChannelFuture everything works, which is not the case.
I have also tried adding ChannelOptions to allow more Clients to be queued by the server.
GameServer
The GameServer is initialized as follows:
// Create the bootstrap to make this act like a server.
ServerBootstrap serverBootstrap = new ServerBootstrap();
serverBootstrap.group(bossGroup)
        .channel(NioServerSocketChannel.class)
        .childHandler(new ChannelInitialisation(new ClientInputReader(gameThread)))
        .option(ChannelOption.SO_BACKLOG, 1000)
        .childOption(ChannelOption.SO_KEEPALIVE, true)
        .childOption(ChannelOption.TCP_NODELAY, true);

bossGroup.execute(gameThread); // Execute the thread that handles all games on this GameServer.

// Launch the server with the specific port.
serverBootstrap.bind(port).sync();
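For reference, Netty's conventional server setup uses two event loop groups, one that only accepts connections and one that handles the accepted channels, and keeps long-running application work off both. A sketch of that variant, reusing the handler classes above (the thread name is illustrative):

EventLoopGroup bossGroup = new NioEventLoopGroup(1);  // accepts incoming connections
EventLoopGroup workerGroup = new NioEventLoopGroup(); // handles I/O for accepted channels

ServerBootstrap serverBootstrap = new ServerBootstrap();
serverBootstrap.group(bossGroup, workerGroup)
        .channel(NioServerSocketChannel.class)
        .childHandler(new ChannelInitialisation(new ClientInputReader(gameThread)))
        .option(ChannelOption.SO_BACKLOG, 1000)
        .childOption(ChannelOption.SO_KEEPALIVE, true)
        .childOption(ChannelOption.TCP_NODELAY, true);

// Run the game loop on a dedicated thread rather than on an event loop,
// so it cannot tie up a thread that Netty needs for accepting/reading.
new Thread(gameThread, "game-main").start();

serverBootstrap.bind(port).sync();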
The GameServer ClientInputReader
@ChannelHandler.Sharable
public class ClientInputReader extends SimpleChannelInboundHandler<Packet> {

    private ServerMainThread serverMainThread;

    public ClientInputReader(ServerMainThread serverMainThread) {
        this.serverMainThread = serverMainThread;
    }

    @Override
    public void channelRegistered(ChannelHandlerContext ctx) throws Exception {
        System.out.println("[Connection: " + ctx.channel().id() + "] Channel registered");
        super.channelRegistered(ctx);
    }

    @Override
    protected void channelRead0(ChannelHandlerContext ctx, Packet packet) {
        // Packet handling
    }
}
The malfunctioning connection does not trigger anything in the SimpleChannelInboundHandler, not even exceptionCaught.
The GameServer ChannelInitialisation
public class ChannelInitialisation extends ChannelInitializer<SocketChannel> {

    private SimpleChannelInboundHandler channelInputReader;

    public ChannelInitialisation(SimpleChannelInboundHandler channelInputReader) {
        this.channelInputReader = channelInputReader;
    }

    @Override
    protected void initChannel(SocketChannel ch) throws Exception {
        ChannelPipeline pipeline = ch.pipeline();
        // Every packet is prefixed with a 4-byte field holding the number of bytes that follow;
        // the decoder reads that field at offset 0 and strips it before passing the frame on.
        pipeline.addLast(new LengthFieldBasedFrameDecoder(Integer.MAX_VALUE, 0, 4, 0, 4));
        pipeline.addLast(new LengthFieldPrepender(4));
        pipeline.addLast(new PacketEncoder(), new PacketDecoder(), channelInputReader);
    }
}
Client
Client creating a GameServer connection:
// Configure the client.
group = new NioEventLoopGroup();
Bootstrap b = new Bootstrap();
b.group(group)
        .channel(NioSocketChannel.class)
        .option(ChannelOption.TCP_NODELAY, true)
        .handler(new ChannelInitialisation(channelHandler));

// Start the client.
channel = b.connect(address, port).await().channel();
/* At this point the client thinks that the connection was successful, as the channel is active, open, registered and writable... */
ClientInitialisation:
public class ChannelInitialisation extends ChannelInitializer<SocketChannel> {

    private SimpleChannelInboundHandler<Packet> channelHandler;

    ChannelInitialisation(SimpleChannelInboundHandler<Packet> channelHandler) {
        this.channelHandler = channelHandler;
    }

    @Override
    public void initChannel(SocketChannel ch) throws Exception {
        // Prefix messages by the length.
        ch.pipeline().addLast(new LengthFieldBasedFrameDecoder(Integer.MAX_VALUE, 0, 4, 0, 4));
        ch.pipeline().addLast(new LengthFieldPrepender(4));
        // Our encoder, decoder and handler.
        ch.pipeline().addLast(new PacketEncoder(), new PacketDecoder(), channelHandler);
    }
}
ClientHandler:
public class ClientPacketHandler extends SimpleChannelInboundHandler<Packet> {

    @Override
    public void channelActive(ChannelHandlerContext ctx) throws Exception {
        super.channelActive(ctx);
        System.out.println("Channel active: " + ctx.channel().id());
        ctx.channel().writeAndFlush(new PacketSetupClientToGameServer());
        System.out.println("Sending setup packet to the GameServer: " + ctx.channel().id());
        // This is successfully called, as the client thinks the connection was properly made.
    }

    @Override
    protected void channelRead0(ChannelHandlerContext ctx, Packet packet) {
        // Reading packets.
    }
}
I expect the Client to connect properly to the server, since the other Clients connect properly and this Client could previously connect just fine.
TL;DR: When multiple Clients try to create a new match, there is a chance that one or more of them will not connect properly to the server after the previous connection was closed.

For anyone who struggles with this issue in some way or another:
I found a workaround that lets me continue even though, as far as I can tell, there is still a bug inside the Netty framework. The workaround is quite simple: create a connection pool.
My solution uses a maximum of five connections inside the connection pool. If one of the connections gets no reply from the GameServer, it is not that big of a deal, since there are four others that have a high chance of succeeding. I know this is a bad workaround, but I could not find any information on this issue. It works and only adds a maximum delay of 5 seconds (each retry takes a second).
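A minimal sketch of that retry loop, assuming a Bootstrap configured as in the question (the waitForServerReply helper is hypothetical; it stands for whatever mechanism confirms that the GameServer acknowledged the setup packet within the timeout):

Channel connectWithRetry(Bootstrap bootstrap, String address, int port) throws InterruptedException {
    for (int attempt = 0; attempt < 5; attempt++) {
        Channel channel = bootstrap.connect(address, port).sync().channel();
        if (waitForServerReply(channel, 1000)) { // hypothetical: true once the server replied
            return channel;                      // the server really registered us
        }
        channel.close().sync();                  // silent connection; discard and retry
    }
    throw new IllegalStateException("GameServer never acknowledged the connection");
}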

Related

Vertx - threads are stuck while sending response back to client

I'm using vertx-4.2.6 to build a proxy service that takes requests from clients (for example: browsers, standalone apps, etc.), invokes a single third-party server, gets the response, and sends that same response back to the client that initiated the request.
In this process I'm using a shared WebClient across multiple requests. I get the response from the third party quickly (mostly in milliseconds), but sometimes the response is not returned to the client and gets stuck at ctx.end(response).
Whenever I restart my proxy server it serves requests without any issues for a while, but as time goes on (say, by end of day) clients see a 503 error (service unavailable) for new requests. I'm using one MainVerticle with 10 instances, and I'm not using any worker threads.
Below is the pseudo code:
MainVerticle
DeploymentOptions depOptions = new DeploymentOptions();
depOptions.setConfig(config);
depOptions.setInstances(10);
vertx.deployVerticle(MainVerticle.class.getName(), depOptions);
.....
router.route("/api/v1/*")
      .handler(new HttpRequestHandler(vertx));
HttpRequestHandler
public class HttpRequestHandler implements Handler<RoutingContext> {

    private final Logger LOGGER = LogManager.getLogger(HttpRequestHandler.class);
    private WebClient webClient;

    public HttpRequestHandler(Vertx vertx) {
        this.webClient = createWebClient(vertx);
    }

    private WebClient createWebClient(Vertx vertx) {
        WebClientOptions options = new WebClientOptions();
        options.setConnectTimeout(30000);
        return WebClient.create(vertx, options);
    }

    @Override
    public void handle(RoutingContext ctx) {
        ctx.request().bodyHandler(bh -> {
            ctx.request().headers().remove("Host");
            StopWatch sw = StopWatch.createStarted();
            LOGGER.info("invoking CL end point with the given request details...");
            /*
             * Invoking the actual target
             */
            webClient.request(ctx.request().method(), target_port, target_host, "someURL")
                     .timeout(5000)
                     .sendBuffer(bh)
                     .onSuccess(clResponse -> {
                         LOGGER.info("CL response statuscode: {}, headers: {}", clResponse.statusCode(), clResponse.headers());
                         LOGGER.trace("response body from CL: {}", clResponse.body());
                         sw.stop();
                         LOGGER.info("Timetaken: {}ms", sw.getTime()); // prints in milliseconds
                         LOGGER.info("sending response back to client...."); // stuck here
                         /*
                          * Prepare the final response and return it to the client.
                          */
                         ctx.response().setStatusCode(clResponse.statusCode());
                         ctx.response().headers().addAll(clResponse.headers());
                         if (clResponse.body() != null) {
                             ctx.response().end(clResponse.body());
                         } else {
                             ctx.response().end();
                         }
                         LOGGER.info("response SENT back to client...!!"); // this log is missing for certain requests; the client gets 503 - service unavailable after 5 seconds
                     })
                     .onFailure(err -> {
                         LOGGER.error("Failed while invoking CL server:", err);
                         sw.stop();
                         if (err.getCause() instanceof java.net.ConnectException) {
                             connectionRefused(ctx);
                         } else {
                             invalidResponse(ctx);
                         }
                     });
        });
    }
}
I suspect the issue might be due to the shared WebClient, but I'm not sure. I'm new to Vert.x and I have no clue what's going wrong. Please suggest whether there are any options that can be set on WebClientOptions to avoid this issue.

Vertx event bus slow consuming issue

We have a non-clustered Vert.x application, and we use the event bus to communicate internally between verticles.
Verticle A consumes from the bus, performs an HTTP request, and sends the response back through the bus.
Verticle B just asks Verticle A to perform that HTTP request.
The problem appears when a "high" request volume is produced by Verticle B. The consumer then starts receiving the events more and more slowly (presumably because they are queuing up in the event bus). At 8 requests/second the bus takes up to 3-4 seconds to consume an event. When the requests/second increase, it can take more than 30 seconds, so the bus timeout is triggered.
The thing is, Verticle A performs the HTTP operation really quickly (~200 ms), so I don't really understand why the requests get stuck in the bus.
We've tried several solutions, but none of them worked:
Deploy multiple instances of Verticle A as workers
Use vertx.executeBlocking() to perform the HTTP request
The only thing that worked was commenting out the HTTP request and returning a mock object through the bus. But again, the HTTP request takes no more than 200 ms, so it shouldn't be blocking the bus.
Additional information: we use an autogenerated REST client based on Retrofit + OkHttpClient. Due to company policy we cannot use the Vert.x WebClient, so I didn't try that solution.
EXAMPLE
This is a really simplified version of our code so you can check if I'm missing something.
VERTICLE A
// Instantiated in Verticle A
public class EmailSender {

    private final Vertx vertx;
    private final EmailApiClient emailApiClient;

    public EmailSender(Vertx vertx) {
        this.vertx = vertx;
        emailApiClient = ClientFactory.createEmailApiClient();
    }

    public void start() {
        vertx.eventBus().consumer("sendEmail", this::sendEmail);
    }

    public void sendEmail(Message<EmailRequest> message) {
        EmailRequest emailRequest = message.body();
        emailApiClient.sendEmail(emailRequest).subscribe(
            response -> {
                if (response.code() == 200) {
                    EmailResponse emailResponse = response.body();
                    message.reply(emailResponse);
                } else {
                    message.fail(500, "Error sending email");
                }
            });
    }
}
VERTICLE B
// Instantiated in Verticle B
public class EmailCommunications {

    private final Vertx vertx;

    public EmailCommunications(Vertx vertx) {
        this.vertx = vertx;
    }

    public Single<EmailResponse> sendEmail(EmailRequest emailRequest) {
        SingleSubject<EmailResponse> emailSent = SingleSubject.create();
        vertx.eventBus().request(
            "sendEmail",
            emailRequest,
            busResult -> {
                if (busResult.succeeded()) {
                    emailSent.onSuccess(busResult.result().body());
                } else {
                    emailSent.onError(busResult.cause());
                }
            }
        );
        return emailSent;
    }
}
We fixed the issue by changing our OkHttpClient configuration so HTTP requests won't get stuck. OkHttp's Dispatcher defaults to 64 concurrent requests overall but only 5 per host, so under load against a single host the remaining requests were queuing inside OkHttp:
default void configureOkHttpClient(OkHttpClient.Builder okHttpClientBuilder) {
    ConnectionPool connectionPool = new ConnectionPool(40, 5, TimeUnit.MINUTES);
    Dispatcher dispatcher = new Dispatcher();
    dispatcher.setMaxRequestsPerHost(200);
    dispatcher.setMaxRequests(200);
    okHttpClientBuilder
            .readTimeout(60, TimeUnit.SECONDS)
            .retryOnConnectionFailure(true)
            .connectionPool(connectionPool)
            .dispatcher(dispatcher);
}
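Called from a class implementing that interface, the wiring might look like this (a sketch; the factory shape is assumed):

OkHttpClient.Builder builder = new OkHttpClient.Builder();
configureOkHttpClient(builder);              // the hook shown above
OkHttpClient okHttpClient = builder.build(); // hand this client to Retrofit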

UWP - StreamSocket connection error for some connections

We have 2 UWP apps. One app shares data with the other through a StreamSocket: the server app sends data to the client app. There will be 30-40 or more devices running the client app and connecting to the server's socket to receive data.
When we test with one client app, all the data sharing happens without any issue. But when we started testing with about 10 devices running the client app, sometimes some apps don't receive data, and an error appears saying: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.
In general the data gets shared to most of the devices, but a few fail sometimes, at random. What could be the reason for this? Is there a limit on the number of connections to a socket with a given IP and port when using StreamSocket?
Here are some parts of our code. Please let me know what we have to correct here to avoid getting that error.
Server side
public async Task StartServer(string serverIp, string serverPort)
{
    try
    {
        HostName serverAddress = new HostName(serverIp);

        // Create a StreamSocketListener to start listening for TCP connections.
        StreamSocketListener socketListener = new StreamSocketListener();

        // Hook up an event handler to call when connections are received.
        socketListener.ConnectionReceived += SocketListener_ConnectionReceived;

        // Start listening for incoming TCP connections on the specified port.
        await socketListener.BindEndpointAsync(serverAddress, serverPort);
    }
    catch (Exception e)
    {
    }
}

private async void SocketListener_ConnectionReceived(StreamSocketListener sender, StreamSocketListenerConnectionReceivedEventArgs args)
{
    try
    {
        await Task.Run(() => ShareFile(args.Socket));
    }
    catch (Exception e)
    {
    }
}
Client side
public async Task ServerConnect(string serverIP, string serverPort)
{
    try
    {
        HostName serverAddress = new HostName(serverIP);
        StreamSocket socket = new StreamSocket();
        socket.Control.KeepAlive = false;

        // Connect to the server.
        await socket.ConnectAsync(serverAddress, serverPort, SocketProtectionLevel.PlainSocket);
    }
    catch (Exception e)
    {
    }
}
I would also like to get these points clarified:
- What is the difference between BindServiceNameAsync and BindEndpointAsync? Most examples seem to use the first one. When should we use the second one?
- If we call sender.Dispose(); in SocketListener_ConnectionReceived, will that affect the other clients trying to join the same socket?
- In the ShareFile() function, if we close args.Socket after sending data, can the socket be closed before the client has actually read the data on its side?

How to dispatch incoming NetSocket handlers into different event loop threads?

I'm trying to use Vert.x to implement a TCP server that accepts incoming connections and then handles the individual sockets. Since each socket can be handled independently, the handlers belonging to different sockets are supposed to run in different event loop threads concurrently.
According to Vert.x document,
Standard verticles are assigned an event loop thread when they are created and the start method is called with that event loop. When you call any other methods that takes a handler on a core API from an event loop then Vert.x will guarantee that those handlers, when called, will be executed on the same event loop.
I expected this code snippet to print different thread names:
Vertx vertx = Vertx.vertx(); // The number of event loop threads is 2 * cores.
vertx.createNetServer().connectHandler(socket -> {
    vertx.deployVerticle(new AbstractVerticle() {
        @Override
        public void start() throws Exception {
            socket.handler(buffer -> {
                log.trace(socket.toString() + ": Socket Message");
                socket.close();
            });
        }
    });
}).listen(port);
But unfortunately, all the handlers ran on the same thread:
23:59:42.359 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@253fa4f2: Socket Message
23:59:42.364 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@465f1533: Socket Message
23:59:42.365 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@5ab8dac: Socket Message
23:59:42.366 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@5fc72993: Socket Message
23:59:42.367 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@38ee66d7: Socket Message
23:59:42.368 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@6a60a74: Socket Message
23:59:42.369 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@5f3921e1: Socket Message
23:59:42.370 [vert.x-eventloop-thread-1] TRACE Server - io.vertx.core.net.impl.NetSocketImpl@39d41024: Socket Message
... more than 100 lines like this ...
A contrasting example is this echo server written in Boost.Asio: there, the handlers run in different event loop threads when a thread pool executes io_service::run().
So, my question is how to run these handlers concurrently?
Actually, you are doing something entirely different from what you intend. Each time you receive a connection on your socket, you launch a new actor (verticle). The simplest way to prove that:
Vertx vertx = Vertx.vertx(); // The number of event loop threads is 2 * cores.
vertx.createHttpServer().requestHandler(request -> {
    vertx.deployVerticle(new AbstractVerticle() {
        String uuid = UUID.randomUUID().toString(); // Some random unique number
        @Override
        public void start() throws Exception {
            request.response().end(uuid + " " + Thread.currentThread().getName());
        }
    });
}).listen(8888);

vertx.setPeriodic(1000, r -> {
    System.out.println(vertx.deploymentIDs().size()); // Print verticle count every second
});
I'm using an HTTP server here just because it's easier to check in a browser.
As wrong as this approach may be, you'll see that you do receive different threads:
fe931b18-89cc-4c6a-9d6a-8565bb1f1c12 vert.x-eventloop-thread-9
277330da-4df8-4e91-bd8f-82c0f62156d0 vert.x-eventloop-thread-11
bbd3207c-80a4-41d8-9be5-b40727badc84 vert.x-eventloop-thread-13
Now to how you should do it:
// We create 10 workers
for (int i = 0; i < 10; i++) {
    vertx.deployVerticle(new AbstractVerticle() {
        @Override
        public void start() {
            vertx.eventBus().consumer("processMessage", (request) -> {
                // Do something smart
                // Reply
                request.reply("I'm on thread " + Thread.currentThread().getName());
            });
        }
    });
}

// This is your handler
vertx.createHttpServer().requestHandler(request -> {
    // Only one server, which should dispatch events to workers as quickly as possible
    vertx.eventBus().send("processMessage", null, (response) -> {
        if (response.succeeded()) {
            request.response().end("Request :" + response.result().body().toString());
        }
        // Handle errors
    });
}).listen(8888);

vertx.setPeriodic(1000, r -> {
    System.out.println(vertx.deploymentIDs().size()); // Notice that the number of workers doesn't change
});
It's not possible to determine which event loop Vert.x will assign to each of your verticles without more details (number of cores of your test machines for example).
Anyway, it is not a good idea to deploy a verticle per incoming connection. Verticles are units of deployment in Vert.x. You would typically create one per "functionality".
Back to your use case, the purpose of event driven programming is precisely to avoid using a thread per connection. You can handle a lot of concurrent connections with a single event loop. If you have multiple cores on your machine then you can deploy multiple instances of your verticle to use them all (1 event loop per core).
int processors = Runtime.getRuntime().availableProcessors();
Vertx vertx = Vertx.vertx();
vertx.deployVerticle(TCPServerVerticle.class.getName(), new DeploymentOptions().setInstances(processors));

public class TCPServerVerticle extends AbstractVerticle {

    @Override
    public void start(Future<Void> startFuture) throws Exception {
        vertx.createNetServer().connectHandler(socket -> {
            socket.handler(buffer -> {
                log.trace(socket.toString() + ": Socket Message");
                socket.close();
            });
        }).listen(port, ar -> {
            if (ar.succeeded()) {
                startFuture.complete();
            } else {
                startFuture.fail(ar.cause());
            }
        });
    }
}
With Vert.x TCP server sharing, the connect handlers are called in a round-robin fashion.

Design choice for automatically reconnecting socket client

I'm working on a Windows Forms application in C#. I'm using a socket client that connects to a server asynchronously. I would like the socket to immediately try to reconnect to the server if the connection is broken for any reason. What is the best design for this problem? Should I build a thread that continuously checks whether the connection is lost and tries to reconnect?
Here is the code of my XcomClient class, which handles the socket communication:
public void StartConnecting()
{
    socketClient.BeginConnect(this.remoteEP, new AsyncCallback(ConnectCallback), this.socketClient);
}

private void ConnectCallback(IAsyncResult ar)
{
    try
    {
        // Retrieve the socket from the state object.
        Socket client = (Socket)ar.AsyncState;

        // Complete the connection.
        client.EndConnect(ar);

        // Signal that the connection has been made.
        connectDone.Set();

        StartReceiving();
        NotifyClientStatusSubscribers(true);
    }
    catch (Exception e)
    {
        if (!this.socketClient.Connected)
            StartConnecting();
        else
        {
        }
    }
}

public void StartReceiving()
{
    StateObject state = new StateObject();
    state.workSocket = this.socketClient;
    socketClient.BeginReceive(state.buffer, 0, StateObject.BufferSize, 0, new AsyncCallback(OnDataReceived), state);
}

private void OnDataReceived(IAsyncResult ar)
{
    try
    {
        StateObject state = (StateObject)ar.AsyncState;
        Socket client = state.workSocket;

        // Read data from the remote device.
        int iReadBytes = client.EndReceive(ar);
        if (iReadBytes > 0)
        {
            byte[] bytesReceived = new byte[iReadBytes];
            Buffer.BlockCopy(state.buffer, 0, bytesReceived, 0, iReadBytes);
            this.responseList.Enqueue(bytesReceived);
            StartReceiving();
            receiveDone.Set();
        }
        else
        {
            NotifyClientStatusSubscribers(false);
        }
    }
    catch (SocketException e)
    {
        NotifyClientStatusSubscribers(false);
    }
}
Currently I try to detect a disconnection by checking the number of bytes received or by catching a socket exception.
If your application only receives data on a socket, then in most cases, you will never detect a broken connection. If you don't receive any data for a long time, you don't know if it's because the connection is broken or if the other end simply hasn't sent any data. You will, of course, detect (as EOF on the socket) connections closed by the other end in the normal fashion despite this.
In order to detect a broken connection, you need a keepalive. You need to either:
- make the other end guarantee that it will send data on a set schedule, so you can time out and close the connection if you don't get it, or
- send a probe to the other end once in a while. In this case the OS will take care of noticing a broken connection, and you will get an error reading the socket if it's broken, either promptly (connection reset by peer) or eventually (connection timed out).
Either way, you need a timer. Whether you implement the timer as an event in an event loop or as a thread that sleeps is up to you and the best solution probably depends on how the rest of your application is structured. If you have a main thread that runs an event loop then it's probably best to hook in to that.
You can also enable the TCP keepalives option on the socket, but an application-layer keepalive is generally considered more robust.
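The mechanics differ per platform, but as an illustration, here is roughly what an application-layer keepalive looks like with Netty's IdleStateHandler (Java, to match the Netty code earlier on this page; PacketKeepAlive is a hypothetical heartbeat message type):

// In the ChannelInitializer (imports from io.netty.handler.timeout):
pipeline.addLast(new IdleStateHandler(60, 30, 0)); // read-idle 60s, write-idle 30s
pipeline.addLast(new ChannelDuplexHandler() {
    @Override
    public void userEventTriggered(ChannelHandlerContext ctx, Object evt) throws Exception {
        if (evt instanceof IdleStateEvent) {
            IdleState state = ((IdleStateEvent) evt).state();
            if (state == IdleState.READER_IDLE) {
                ctx.close();                              // peer silent too long: treat as broken
            } else if (state == IdleState.WRITER_IDLE) {
                ctx.writeAndFlush(new PacketKeepAlive()); // hypothetical heartbeat packet
            }
        } else {
            super.userEventTriggered(ctx, evt);
        }
    }
});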