Ktor server app keeps increasing open connections - HttpClient

Hi, I recently deployed a Ktor server project on a server as the REST API backend for my app. I'm using Netty and running it as a system service. The Ktor server runs on port 7171, and whenever I check the connections on that port, the count keeps increasing. I'm checking with this command:
ss -ant | grep :7171 | wc -l
After one day the connection count is 20k+ and the server crashes; nothing works.
I think some connections are being kept open. In the logs I don't get any errors except a few like "connection reset by peer".
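One way to narrow this down (a diagnostic suggestion, not part of the original setup) is to split the count by TCP state; a growing pile of CLOSE_WAIT sockets means the peers hung up but the server never closed its side:
ss -ant state established '( sport = :7171 )' | wc -l
ss -ant state close-wait '( sport = :7171 )' | wc -l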
I'm also using HttpClient with the Apache engine, and to cache a list of data I store it in a companion object so it isn't fetched from the database on every request.
I reviewed the code, and those two things are my only doubts.
These are my Gradle dependencies:
implementation("org.jetbrains.kotlin:kotlin-stdlib-jdk8:$kotlin_version")
implementation("io.ktor:ktor-server-netty:$ktor_version")
implementation("io.ktor:ktor-client-apache:$ktor_version")
implementation("io.ktor:ktor-client-logging-native:$ktor_version")
implementation("io.ktor:ktor-gson:$ktor_version")
implementation("ch.qos.logback:logback-classic:$logback_version")
implementation("io.ktor:ktor-metrics:$ktor_version")
implementation("io.ktor:ktor-server-core:$ktor_version")
implementation("io.ktor:ktor-server-sessions:$ktor_version")
implementation("io.ktor:ktor-auth-jwt:$ktor_version")
implementation("org.jooq:jooq")
jooqGeneratorRuntime("mysql:mysql-connector-java:8.0.19")
implementation("mysql:mysql-connector-java:8.0.19")
implementation(group = "com.zaxxer", name = "HikariCP", version = "3.4.2")
implementation("io.sentry:sentry:1.7.30")
implementation("software.amazon.awssdk:s3:2.8.7")
Currently I have about 7k users, and the maximum number of concurrent users is 450.
Please guide me on how I can investigate and figure out the problem.
Here is the HttpClient code:
suspend fun post(
    url: String,
    params: Map<String, String> = emptyMap(),
    headersMap: Map<String, String> = emptyMap()
): Result<String> {
    val httpClient = getHttpClient()
    return kotlin.runCatching {
        httpClient.post<String>(url) {
            body = MultiPartFormDataContent(
                formData {
                    params.forEach {
                        append(it.key, it.value)
                    }
                }
            )
            if (headersMap.isNotEmpty()) {
                headersMap.forEach { (key, value) ->
                    header(key, value)
                }
            }
        }.also {
            httpClient.close()
        }
    }.onFailure { httpClient.close() }
}
private fun getHttpClient(): HttpClient {
    return HttpClient(Apache) {
        install(HttpTimeout) {
            requestTimeoutMillis = 60000
        }
        engine {
            customizeClient {
                sslContext = SSLContextBuilder.create().loadTrustMaterial(object : TrustStrategy {
                    override fun isTrusted(chain: Array<out X509Certificate>?, authType: String?): Boolean {
                        return true
                    }
                }).build()
                setSSLHostnameVerifier(NoopHostnameVerifier())
            }
        }
    }
}
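One likely contributor: post() builds a brand-new HttpClient (and with it a fresh Apache connection pool and I/O reactor) on every call, and any close() that is skipped or races leaves those resources alive. A common pattern is a single shared client for the application's lifetime. This is a minimal sketch of that idea, not a confirmed fix; SharedHttpClient is a hypothetical name, and the TLS customization from getHttpClient() is omitted for brevity:
object SharedHttpClient {
    // Created once, reused for every request; never closed per call.
    val instance: HttpClient by lazy {
        HttpClient(Apache) {
            install(HttpTimeout) {
                requestTimeoutMillis = 60000
            }
        }
    }
}
With this, post() would use SharedHttpClient.instance and drop both httpClient.close() calls.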
Also, please check my API response headers; shouldn't the keep-alive have some expiry?

Related

Vertx - threads are stuck while sending response back to client

I'm using vertx-4.2.6 to build a proxy service that takes requests from clients (e.g., a browser or standalone apps), invokes a single third-party server, gets the response, and sends that same response back to the client that initiated the request.
In this process I'm using a shared WebClient across multiple requests. I get the response from the third party quickly (mostly in milliseconds), but sometimes the response is not returned to the client and gets stuck at ctx.end(response).
Whenever I restart my proxy server it serves requests without any issues for a while, but as time goes on (say, by end of day) clients see a 503 error (service unavailable) for new requests. I'm using one MainVerticle with 10 instances and no worker threads.
Below is the pseudo code:
MainVerticle
DeploymentOptions depOptions = new DeploymentOptions();
depOptions.setConfig(config);
depOptions.setInstances(10);
vertx.deployVerticle(MainVerticle.class.getName(), depOptions);
.....
router.route("/api/v1/*")
    .handler(new HttpRequestHandler(vertx));
HttpRequestHandler
public class HttpRequestHandler implements Handler<RoutingContext> {
    private final Logger LOGGER = LogManager.getLogger(HttpRequestHandler.class);
    private WebClient webClient;

    public HttpRequestHandler(Vertx vertx) {
        this.webClient = createWebClient(vertx);
    }

    private WebClient createWebClient(Vertx vertx) {
        WebClientOptions options = new WebClientOptions();
        options.setConnectTimeout(30000);
        WebClient webClient = WebClient.create(vertx, options);
        return webClient;
    }

    @Override
    public void handle(RoutingContext ctx) {
        ctx.request().bodyHandler(bh -> {
            ctx.request().headers().remove("Host");
            StopWatch sw = StopWatch.createStarted();
            LOGGER.info("invoking CL end point with the given request details...");
            /*
             * Invoking actual target
             */
            webClient.request(ctx.request().method(), target_port, target_host, "someURL")
                .timeout(5000)
                .sendBuffer(bh)
                .onSuccess(clResponse -> {
                    LOGGER.info("CL response statuscode: {}, headers: {}", clResponse.statusCode(), clResponse.headers());
                    LOGGER.trace("response body from CL: {}", clResponse.body());
                    sw.stop();
                    LOGGER.info("Timetaken: {}ms", sw.getTime()); // prints in milliseconds
                    LOGGER.info("sending response back to client...."); // stuck here
                    /*
                     * prepare the final response and return to client..
                     */
                    ctx.response().setStatusCode(clResponse.statusCode());
                    ctx.response().headers().addAll(clResponse.headers());
                    if (clResponse.body() != null) {
                        ctx.response().end(clResponse.body());
                    } else {
                        ctx.response().end();
                    }
                    LOGGER.info("response SENT back to client...!!"); // not logged for certain requests; those clients get 503 - service unavailable after 5 seconds
                })
                .onFailure(err -> {
                    LOGGER.error("Failed while invoking CL server:", err);
                    sw.stop();
                    if (err.getCause() instanceof java.net.ConnectException) {
                        connectionRefused(ctx);
                    } else {
                        invalidResponse(ctx);
                    }
                });
        });
    }
}
I suspect the issue might be due to the shared WebClient, but I'm not sure. I'm new to Vert.x and I'm not getting any clue what's going wrong. Please suggest if there are any options to be set on WebClientOptions to avoid this issue.
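For what it's worth, the pool settings live on WebClientOptions (inherited from HttpClientOptions), and the default pool allows only 5 concurrent connections per destination, so a handful of stuck requests can starve everything queued behind them. A hedged sketch of the knobs involved, written in Kotlin against the same Vert.x API (the values are illustrative assumptions, not recommendations):
// Assumption: the 503s come from the WebClient connection pool filling
// up with connections that never complete (default maxPoolSize is 5).
val options = WebClientOptions()
options.setConnectTimeout(30000) // milliseconds, as in the original code
options.setMaxPoolSize(64)       // more concurrent connections per destination
options.setIdleTimeout(60)       // seconds; recycle pooled connections that sit idle
val webClient = WebClient.create(vertx, options)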

Netty starts channels but does not read from them in Kubernetes

netty-all:4.1.48.Final
I am having a cryptic issue with Netty that seems to show up only in Kubernetes. I have a clone of the project running on a cloud instance with fewer resources that does not have this issue. Both projects receive the same amount of traffic (I resend the same traffic from a third provider to both Netty servers).
In Kubernetes, every time a channel is opened (I send a message) I increment my session counter. Every time the channel reads data, I increment a read counter. I send data on every connection, so I would expect at least one read for every session (more if the data were long enough), but never fewer. Instead, the counters drift apart rather smoothly until the number of reads settles at around half the number of opened sessions.
Is there any way for me to diagnose this issue? I have included the barebones Netty server I am using (with the configuration, including an idle timer). Am I blocking Netty resources?
class Server {
    private val bossGroup = NioEventLoopGroup()
    private val workerGroup = NioEventLoopGroup()

    fun start() {
        ServerBootstrap()
            .group(bossGroup, workerGroup)
            .option(ChannelOption.SO_REUSEADDR, true)
            .option(ChannelOption.AUTO_CLOSE, false)
            .channel(NioServerSocketChannel::class.java)
            .option(ChannelOption.SO_KEEPALIVE, true)
            .option(ChannelOption.TCP_NODELAY, true)
            .childHandler(object : ChannelInitializer<SocketChannel>() {
                override fun initChannel(channel: SocketChannel) {
                    val idleTimeTrigger = 1
                    val idleStateHandler = IdleStateHandler(0, 0, idleTimeTrigger)
                    channel
                        .pipeline()
                        .addLast("idleStateHandler", idleStateHandler)
                        .addLast(Session(idleTimeTrigger))
                }
            })
            .bind(8888)
            .sync()
            .channel()
            .closeFuture()
            .sync()
    }
}
class Session(
    private val idleTimeTrigger: Int,
) : ChannelInboundHandlerAdapter() {

    // session counter
    val idleTimeout = 10
    var idleTickCounter = 0L

    override fun channelRead(ctx: ChannelHandlerContext, msg: Any) {
        // read counter is less than session counter... HUH????
        this.idleTickCounter = 0
        try {
            val data = (msg as ByteBuf).toString(CharsetUtil.UTF_8)
            // ... do my stuff ..
            // output counter is less than session counter
        } finally {
            ReferenceCountUtil.release(msg)
        }
    }

    override fun userEventTriggered(ctx: ChannelHandlerContext, evt: Any) {
        this.idleTickCounter++
        val idleTime = idleTimeTrigger * idleTickCounter
        if (idleTime > idleTimeout) {
            // idle timeout counter is always 0
            ctx.close()
        }
        super.userEventTriggered(ctx, evt)
    }

    override fun exceptionCaught(ctx: ChannelHandlerContext, cause: Throwable) {
        // error counter is always 0
        ctx.close()
    }
}
The output is passed to a RabbitMQ AMQP client and sent to a queue. I don't know if this is relevant (with regard to resource usage), but the AMQP client uses Jetty.
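One low-cost way to diagnose this (a sketch of an addition, not part of the original code) is to put Netty's built-in LoggingHandler at the head of the pipeline. It logs every channel event (REGISTERED, ACTIVE, READ, READ COMPLETE, INACTIVE), which shows whether the "missing" reads ever reach the pipeline at all or whether those channels go active and then sit silent:
// Inside initChannel, in front of the existing handlers
// (imports: io.netty.handler.logging.LoggingHandler, io.netty.handler.logging.LogLevel)
channel
    .pipeline()
    .addLast("wireLogger", LoggingHandler(LogLevel.INFO))
    .addLast("idleStateHandler", idleStateHandler)
    .addLast(Session(idleTimeTrigger))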

Why does my Spring WebFlux controller return data on first request only?

I am working on a web application where the user's connection times out after a specific time (say 20 seconds). For long-running requests I have to return a default message ("your request is under process") and then send an email to the user with the actual result.
I couldn't do this with Spring Web because I didn't know how to specify a timeout in the controller (with customized messages per request) while still letting other requests come through and be processed. That's why I used Spring WebFlux, which has a timeout operator for both Mono and Flux types.
To make the requested process run on a different thread, I used Sinks: one to receive requests and one to publish the results. My problem is that the response sink only ever returns one result, and subsequent calls to the URL return an empty response. For example, the first call to /reactive/getUser/123456789 returns the user object, but subsequent calls return empty.
I'm not sure if the problem is with the Sink I have used or with how I am getting data from it. In the sample code I have used responseSink.asFlux().next(), but I have also tried .single(), .toMono(), and .take(1), to no avail; I get the same result.
@RequestMapping("/reactive")
@RestController
class SampleController @Autowired constructor(private val externalService: ExternalService) {

    private val requestSink = Sinks.many().multicast().onBackpressureBuffer<String>()
    private val responseSink = Sinks.many().multicast().onBackpressureBuffer<AppUser>()

    init {
        requestSink.asFlux()
            .map { phoneNumber -> externalService.findByIdOrNull(phoneNumber) }
            .doOnNext {
                if (it != null) {
                    responseSink.tryEmitNext(it)
                } else {
                    responseSink.tryEmitError(Throwable("didn't find a value for that phone number"))
                }
            }
            .subscribe()
    }

    @GetMapping("/getUser/{phoneNumber}")
    fun getUser(@PathVariable phoneNumber: String): Mono<String> {
        requestSink.tryEmitNext(phoneNumber)
        return responseSink.asFlux()
            .next()
            .map { it.toString() }
            .timeout(Duration.ofSeconds(20), Mono.just("processing your request"))
    }
}
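For comparison, the shared-sink indirection can be avoided entirely by building a per-request Mono, so each caller subscribes to its own pipeline and the one-result problem disappears. A hedged sketch (getUser2 is a hypothetical name; it assumes externalService.findByIdOrNull is a blocking call and that a null result should produce a not-found message rather than an empty body):
// import reactor.core.scheduler.Schedulers
@GetMapping("/getUser2/{phoneNumber}")
fun getUser2(@PathVariable phoneNumber: String): Mono<String> =
    Mono.fromCallable { externalService.findByIdOrNull(phoneNumber) } // a null return completes the Mono empty
        .subscribeOn(Schedulers.boundedElastic()) // keep the blocking lookup off event-loop threads
        .map { it.toString() }
        .switchIfEmpty(Mono.just("didn't find a value for that phone number"))
        .timeout(Duration.ofSeconds(20), Mono.just("processing your request"))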

MassTransit 3: How to send a message explicitly to the error queue

I'm using MassTransit with Reactive Extensions to stream messages from the queue in batches. Since the behaviour isn't the same as a normal consumer, I need to be able to send a message to the error queue if it fails x number of times.
I've looked through the MassTransit source code and posted on the Google group, and can't find an answer.
Is this available on the ConsumeContext interface? Or is this even possible?
Here is my code. I've removed some of it to make it simpler.
_busControl = Bus.Factory.CreateUsingRabbitMq(cfg =>
{
    var host = cfg.Host(new Uri("rabbitmq://localhost/"), h =>
    {
        h.Username("guest");
        h.Password("guest");
    });
    cfg.UseInMemoryScheduler();
    cfg.ReceiveEndpoint(host, "customer_update_queue", e =>
    {
        var _observer = new ObservableObserver<ConsumeContext<Customer>>();
        _observer.Buffer(TimeSpan.FromMilliseconds(1000)).Subscribe(OnNext);
        e.Observer(_observer);
    });
});
private void OnNext(IList<ConsumeContext<Customer>> messages)
{
    foreach (var consumeContext in messages)
    {
        Console.WriteLine("Content: " + consumeContext.Message.Content);
        if (consumeContext.Message.RetryCount > 3)
        {
            // I want to be able to send to the error queue
            consumeContext.SendToErrorQueue();
        }
    }
}
I've found a workaround by mixing the RabbitMQ client with MassTransit. Since I can't throw an exception when using an Observable, no error queue is created automatically, so I create it manually using the RabbitMQ client as below.
ConnectionFactory factory = new ConnectionFactory();
factory.HostName = "localhost";
factory.UserName = "guest";
factory.Password = "guest";

using (IConnection connection = factory.CreateConnection())
{
    using (IModel model = connection.CreateModel())
    {
        string exchangeName = "customer_update_queue_error";
        string queueName = "customer_update_queue_error";
        string routingKey = "";
        model.ExchangeDeclare(exchangeName, ExchangeType.Fanout);
        model.QueueDeclare(queueName, false, false, false, null);
        model.QueueBind(queueName, exchangeName, routingKey);
    }
}
The send part sends the message directly to the error queue if it fails x number of times, like so:
consumeContext.Send(new Uri("rabbitmq://localhost/customer_update_queue_error"), consumeContext.Message);
Hopefully the batch feature will be implemented soon and I can use that instead.
https://github.com/MassTransit/MassTransit/issues/800

When submitting a Spark Streaming receiver, how do I specify the host without "failing through"?

I want to create a server socket to listen on a host whose IP and hostname I know ahead of time (and it shows up with that hostname in the YARN node list). But I can't seem to get it to listen on that host without letting it fail an arbitrary number of times beforehand.
There's a Flume receiver that has the sort of host-specific functionality I'm looking for:
FlumeUtils.createStream(streamingContext, [chosen machine's hostname], [chosen port])
My receiver code:
class TCPServerReceiver(hostname: String, port: Int)
    extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) with Logging {

  def onStart() {
    // Start the thread that receives data over a connection
    new Thread("Socket Receiver") {
      override def run() { receive() }
    }.start()
  }

  def onStop() {
  }

  private def receive() {
    /* This is where the job fails until it happens to start on the correct host */
    val server = new ServerSocket(port, 50, InetAddress.getByName(hostname))
    var userInput: String = null
    while (true) {
      try {
        val s = server.accept()
        val in = new BufferedReader(new InputStreamReader(s.getInputStream()))
        userInput = in.readLine()
        while (!isStopped && userInput != null) {
          store(userInput)
          userInput = in.readLine()
        }
      } catch {
        case e: java.net.ConnectException =>
          restart("Error connecting to " + port, e)
        case t: Throwable =>
          restart("Error receiving data", t)
      }
    }
  }
}
And then to test it while it's running:
echo 'this is a test' | nc <hostname> <port>
This all works when I run it as a local client, but when it's submitted to a YARN cluster, the logs show it trying to run in containers on different hosts, and all of them fail because the hostname doesn't match that of the container:
java.net.BindException: Cannot assign requested address
Eventually (after several minutes) it does create the socket once the receiver happens to start on the correct host, so the above code does work, but it takes a substantial amount of "boot time" and I'm worried that adding more nodes will make it take even longer!
Is there a way to ensure that this receiver starts on the correct host on the first try?
The custom TCPServerReceiver implementation should also override:
def preferredLocation: Option[String]
Override this to specify a preferred location (hostname). In this case, something like:
override def preferredLocation = Some(hostname)