Akka HTTP Streaming server kills connection after last message from queue

I have a pretty simple app that consists of a Kafka consumer sitting behind an Akka HTTP streaming server. Upon receiving a request, the server starts up a new consumer for the specified user and begins reading messages from a queue:
def consumer(consumerGroup: String, from: Int) = {
  val topicsAndDate = Subscriptions.assignmentOffsetsForTimes(partitions.map(_ -> (System.currentTimeMillis() - from)): _*)
  Consumer.plainSource[String, GenericRecord](consumerSettings.withGroupId(consumerGroup), topicsAndDate)
    .map(record => record.timestamp() -> messageFormat.from(record.value()))
    .map {
      //convert to json
    }
}

def routes: Route = Route.seal(
  pathSingleSlash {
    complete(HttpEntity(ContentTypes.`text/html(UTF-8)`, "Say hello to akka-http"))
  } ~
  path("stream") {
    //some logic to validate user
    log.info("Received request from {} with 'from'={}", user, from)
    complete(consumer(user, from))
  }
)

startServer("", 8080)
The service works fine until the consumer has reached the latest message on the queue. Sixty seconds after this latest message has been returned, the connection to the server is killed every time. I want to keep the connection alive as the queue is populated with more messages every couple of minutes.
I have tried various config options, but none gives the desired outcome. My current config looks like this:
akka {
  http {
    client {
      idle-timeout = 300s
    }
    server {
      idle-timeout = 600s
      linger-timeout = 15 min
    }
    host-connection-pool {
      max-retries = 30
      max-connections = 20
      max-open-requests = 32
      connecting-timeout = 60s
      client {
        idle-timeout = 300s
      }
    }
  }
}
I have also tried using the server.websocket.periodic-keep-alive-max-idle = 1 second setting, but it doesn't seem to make any difference.
Let me know if I need to supply any more relevant info.
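One thing that may be worth trying, assuming the 60-second cutoff comes from the server-side idle timeout (which closes connections that carry no data): inject a periodic heartbeat element with the Akka Streams `keepAlive` operator so the response never goes quiet. The sketch below assumes the stream elements are already strings; `KeepAliveSketch`, `withHeartbeat`, the 30-second interval, and the newline heartbeat are hypothetical choices, not part of the original code.

```scala
import akka.NotUsed
import akka.stream.scaladsl.Source
import scala.concurrent.duration._

object KeepAliveSketch {
  // Emits `heartbeat` whenever upstream has produced nothing for `maxIdle`,
  // so the connection keeps carrying data while the queue is quiet.
  // The default values are illustrative assumptions.
  def withHeartbeat(
      messages: Source[String, NotUsed],
      maxIdle: FiniteDuration = 30.seconds,
      heartbeat: String = "\n"
  ): Source[String, NotUsed] =
    messages.keepAlive(maxIdle, () => heartbeat)
}
```

This could be applied to the source returned by `consumer` once its elements are serialized, provided the client tolerates the extra heartbeat element.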


http4s shutdown takes 30 seconds?

I'm learning http4s and trying out the basic example from the documentation, and I've noticed something weird. Simply starting and stopping the server works fine, but if any requests are sent, a graceful shutdown takes about 30 seconds (during which new incoming requests are still processed and responded to).
This is the code:
object Main extends IOApp.Simple {
val helloWorldService = HttpRoutes.of[IO] {
case GET -> Root / "hello" / name =>
Ok(s"Hello, $name.")
def server[F[_] : Async : Network]: EmberServerBuilder[F] = {
def run: IO[Unit] = {
.use(_ => IO.never)
This happens on both the stable (0.23.16) and dev (1.0.0-M37) versions.
Turns out the cause was the browser/Postman keeping the connection alive. Simply closing Postman after the request closed the connection and the shutdown was immediate.
EmberServerBuilder also has a .withShutdownTimeout setting that controls how long the shutdown waits for open connections to be closed.
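A minimal sketch of that setting, assuming http4s 0.23.x with the Ember server on the classpath. The object name `ShutdownTimeoutExample` and the one-second value are illustrative choices, not taken from the original code.

```scala
import cats.effect.{IO, IOApp}
import org.http4s.HttpRoutes
import org.http4s.dsl.io._
import org.http4s.ember.server.EmberServerBuilder
import org.http4s.implicits._
import scala.concurrent.duration._

object ShutdownTimeoutExample extends IOApp.Simple {
  val helloWorldService = HttpRoutes.of[IO] {
    case GET -> Root / "hello" / name => Ok(s"Hello, $name.")
  }

  def run: IO[Unit] =
    EmberServerBuilder
      .default[IO]
      .withHttpApp(helloWorldService.orNotFound)
      // Wait at most one second for open connections during shutdown
      // (an arbitrary value chosen for illustration).
      .withShutdownTimeout(1.second)
      .build
      .use(_ => IO.never)
}
```

A short timeout trades graceful draining of kept-alive connections for a faster exit, which is usually acceptable during local development.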

Netty starts channels but does not read from them in kubernetes

I am having a cryptic issue with Netty that seems to only show up in Kubernetes. I have a clone of the project running on a cloud instance with fewer resources that does not have this issue. Both projects receive the same amount of traffic (I am resending the same traffic from a third provider to both Netty servers).
In Kubernetes, every time a channel is opened (I send a message) I increment my session counter. Every time the channel reads data, I increment a read counter. I am sending data every time, so I would expect to see at the very least one read for every session (more if the data were long enough), but not fewer. The counters drift apart rather smoothly until the number of reads settles at around half the number of opened sessions.
Is there any way for me to diagnose this issue? I have written out the barebones Netty server I am using (with the configuration, including an idle timer). Am I blocking Netty resources?
class Server {
private val bossGroup = NioEventLoopGroup()
private val workerGroup = NioEventLoopGroup()
fun start() {
.group(bossGroup, workerGroup)
.option(ChannelOption.SO_REUSEADDR, true)
.option(ChannelOption.AUTO_CLOSE, false)
.option(ChannelOption.SO_KEEPALIVE, true)
.option(ChannelOption.TCP_NODELAY, true)
.childHandler(object : ChannelInitializer<SocketChannel>() {
override fun initChannel(channel: SocketChannel) {
val idleTimeTrigger = 1
val idleStateHandler = IdleStateHandler(0, 0, idleTimeTrigger)
.addLast("idleStateHandler", idleStateHandler)
class Session(
private val idleTimeTrigger: Int,
) : ChannelInboundHandlerAdapter() {
// session counter
val idleTimeout = 10
var idleTickCounter = 0L
override fun channelRead(ctx: ChannelHandlerContext, msg: Any) {
// read counter is less than session counter... HUH????
this.idleTickCounter = 0
try {
val data = (msg as ByteBuf).toString(CharsetUtil.UTF_8)
// ... do my stuff ..
// output counter is less than session counter
} finally {
override fun userEventTriggered(ctx: ChannelHandlerContext, evt: Any) {
val idleTime = idleTimeTrigger * idleTickCounter
if (idleTime > idleTimeout) {
// idle timeout counter is always 0
super.userEventTriggered(ctx, evt)
override fun exceptionCaught(ctx: ChannelHandlerContext, cause: Throwable) {
// error counter is always 0
The output is passed to a Rabbit AMQP client and sent to a queue. I don't know if this is relevant (with regard to resource usage), but the AMQP client uses Jetty.

Where's the bottleneck when I wait for a Kafka message then return a value in Actix Web?

I am trying to communicate between 2 microservices written in Rust and Node.js using Kafka.
I'm using actix-web as web framework and rdkafka as Kafka client for Rust. On the Node.js side, it queries stuff from the database and returns it as JSON to the Rust server via Kafka.
The flow:
Request -> Actix Web -> Kafka -> Node -> Kafka -> Actix Web -> Response
The logic is as follows: a request hits an endpoint on Actix Web, which creates a message requesting something from another microservice, waits until that service sends a reply (matched by Kafka message key), and returns the reply to the user as an HTTP response.
I got it to work, but the performance is very slow (I am stress-testing with wrk).
I'm not sure why it is slow, but while digging in I found the following: if I add a 5-second delay on the Node.js side and send 2 requests to actix-web one second apart, they respond after 5 and 10 seconds respectively.
The benchmark is around 3k requests per second, using the following command:
wrk http://localhost:8080 -d 20s -t 2 -c 200
This makes me guess that something might be blocking the thread for each request.
Here is the source code and the repo:
use std::{
use actix_web::{
use futures::TryStreamExt;
use tokio::time::sleep;
use num_cpus;
use rand::{
use rdkafka::{
const TOPIC: &'static str = "exp-queue_general-5";
pub struct AppState {
pub producer: Arc<FutureProducer>,
pub receiver: flume::Receiver<String>
fn generate_key() -> String {
async fn landing(state: Data<AppState>) -> String {
let key = generate_key();
let t1 = Instant::now();
let producer = &state.producer;
let receiver = &state.receiver;
FutureRecord::to(&format!("{}-forth", TOPIC))
.payload("Hello From Rust"),
.expect("Unable to send message");
println!("Producer take {} ms", t1.elapsed().as_millis());
let t2 = Instant::now();
let value = receiver
println!("Receiver take {} ms", t2.elapsed().as_millis());
println!("Process take {} ms\n", t1.elapsed().as_millis());
async fn heartbeat() -> &'static str {
// ? Concurrency delay check
async fn main() -> std::io::Result<()> {
// ? Assume that the whole node is just Rust instance
let mut cpus = num_cpus::get() / 2 - 1;
if cpus < 1 {
cpus = 1;
println!("Cpus {}", cpus);
let producer: FutureProducer = ClientConfig::new()
.set("bootstrap.servers", "localhost:9092")
.set("linger.ms", "25")
.set("queue.buffering.max.messages", "1000000")
.set("queue.buffering.max.ms", "25")
.set("compression.type", "lz4")
.set("retries", "40000")
.set("retries", "0")
.set("message.timeout.ms", "8000")
.expect("Kafka config");
let (tx, rx) = flume::unbounded::<String>();
rt::spawn(async move {
let consumer: StreamConsumer = ClientConfig::new()
.set("bootstrap.servers", "localhost:9092")
.set("group.id", &format!("{}-back", TOPIC))
.set("queued.min.messages", "200000")
.set("fetch.error.backoff.ms", "250")
.set("socket.blocking.max.ms", "500")
.expect("Kafka config");
.subscribe(&vec![format!("{}-back", TOPIC).as_ref()])
.expect("Can't subscribe");
|message| {
let txx = tx.clone();
async move {
let result = String::from_utf8_lossy(
.unwrap_or("Error serializing".as_bytes())
txx.send(result).expect("Tx not sending");
.expect("Error reading stream");
let state = AppState {
producer: Arc::new(producer),
receiver: rx
HttpServer::new(move || {
I found some solved issues on GitHub that recommended using actors instead, which I also tried in a separate branch.
This performs worse than the main branch, at around 200-300 requests per second.
I don't know where the bottleneck is or what is blocking the requests.

Akka Streams for server streaming (gRPC, Scala)

I am new to Akka Streams and gRPC. I am trying to build an endpoint where the client sends a single request and the server sends multiple responses.
This is my protobuf:
syntax = "proto3";

option java_multiple_files = true;
option java_package = "customer.service.proto";

service CustomerService {
  rpc CreateCustomer(CustomerRequest) returns (stream CustomerResponse) {}
}

message CustomerRequest {
  string customerId = 1;
  string customerName = 2;
}

message CustomerResponse {
  enum Status {
    No_Customer = 0;
    Creating_Customer = 1;
    Customer_Created = 2;
  }
  string customerId = 1;
  Status status = 2;
}
The intended flow is: the client sends a customer request; the server first checks and responds with No_Customer, then sends Creating_Customer, and finally sends Customer_Created.
I have no idea where to start with the implementation. I have looked for hours but am still clueless, and I would be very thankful if anyone could point me in the right direction.
The place to start is the Akka gRPC documentation and, in particular, the service WalkThrough. It is pretty straightforward to get the samples working in a clean project.
The relevant server sample method is this:
override def itKeepsReplying(in: HelloRequest): Source[HelloReply, NotUsed] = {
  println(s"sayHello to ${in.name} with stream of chars...")
  Source(s"Hello, ${in.name}".toList).map(character => HelloReply(character.toString))
}
The problem is now to create a Source that returns the right results, but that depends on how you are planning to implement the server so it is difficult to answer. Check the Akka Streams documentation for various options.
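As a rough illustration of such a Source for the protobuf in the question, the sketch below emits the three statuses in order. It assumes Akka Streams is on the classpath and uses hand-written stand-ins (`Status`, `CustomerRequest`, `CustomerResponse`, `CustomerServiceSketch`) in place of the classes Akka gRPC would generate; a real server would derive each status from actual work rather than a fixed list.

```scala
import akka.NotUsed
import akka.stream.scaladsl.Source

// Hand-written stand-ins for the generated protobuf classes (hypothetical
// shapes; the real generated code has more members).
sealed trait Status
object Status {
  case object No_Customer extends Status
  case object Creating_Customer extends Status
  case object Customer_Created extends Status
}
final case class CustomerRequest(customerId: String, customerName: String)
final case class CustomerResponse(customerId: String, status: Status)

object CustomerServiceSketch {
  // Emits one response per status, in order, for the requested customer.
  def createCustomer(in: CustomerRequest): Source[CustomerResponse, NotUsed] =
    Source(List[Status](Status.No_Customer, Status.Creating_Customer, Status.Customer_Created))
      .map(status => CustomerResponse(in.customerId, status))
}
```

If each step depends on asynchronous work, the fixed list could be replaced by stages such as `mapAsync` that complete as the work finishes.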
The client code is simpler, just call runForeach on the Source that gets returned by CreateCustomer as in the sample:
def runStreamingReplyExample(): Unit = {
  val responseStream = client.itKeepsReplying(HelloRequest("Alice"))
  val done: Future[Done] =
    responseStream.runForeach(reply => println(s"got streaming reply: ${reply.message}"))
  done.onComplete {
    case Success(_) =>
      println("streamingReply done")
    case Failure(e) =>
      println(s"Error streamingReply: $e")
  }
}

waiting for ws future response in play framework

I am trying to build a service that grabs some pages from another web service, processes the content, and returns the results to users. I am using Play 2.2.3 with Scala.
val aas = WS.url("http://localhost/").withRequestTimeout(1000).withQueryString(("mid", mid), ("t", txt)).get
val result = aas.map {
response =>
(response.json \ "status").asOpt[Int].map {
st => status = st
(response.json \ "msg").asOpt[String].map {
txt => msg = txt
val rs1 = Await.result(result, 5 seconds)
if (rs1.isDefined) {
The problem is that the service waits 5 seconds to return "good" even if the WS request takes 100 ms. I also cannot set the Await time to 100 ms, because the other web service I am requesting may take between 100 ms and 1 second to respond.
My question is: is there a way to process and serve the results as soon as they are ready instead of waiting a fixed amount of time?
@wingedsubmariner already provided the answer. Since there is no code example, I will just post what it should be:
def wb = Action.async{ request =>
val aas = WS.url("http://localhost/").withRequestTimeout(1000).get
aas.map(response =>{
Now you don't need to wait for the WS call to respond before deciding what to do. You can just tell Play what to do when the response arrives.
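The same idea can be shown with plain Scala futures, independent of Play. In this sketch, `FutureMapSketch`, `fetchStatus`, and `describe` are hypothetical names, and the 100 ms sleep merely stands in for the remote call.

```scala
import scala.concurrent.{ExecutionContext, Future}

object FutureMapSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical stand-in for the WS call: completes after about 100 ms.
  def fetchStatus(): Future[Int] = Future {
    Thread.sleep(100)
    200
  }

  // Transforms the result without blocking: the function passed to map
  // runs as soon as the future completes, however long that takes.
  def describe(): Future[String] =
    fetchStatus().map(status => if (status == 200) "good" else "bad")
}
```

In a Play controller, a future like the one returned by `describe()` would be mapped to a Result and returned from `Action.async`, so no `Await` is needed.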