SIGTERM signal arrives at Kuma first and stops all active application connections immediately - apache-kafka

We have applications that work with Kafka (MSK). We noticed that once a pod starts to shut down (during autoscaling or a deployment), the app container loses all active connections: the SIGTERM signal causes Kuma to close all connections immediately, which causes data loss due to unfinished sessions (which don't get closed gracefully) on the app side, and after that we receive connection errors to the Kafka brokers.
Does anyone have an idea how to make Kuma wait some time once it gets the SIGTERM signal, to let the sessions close gracefully?
Or maybe a way to let the app know about the shutdown before Kuma does?
Or any other idea?

This is a known issue that is being fixed in the upcoming 1.7 release: https://github.com/kumahq/kuma/pull/4229
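Until that release is available, one generic Kubernetes-level workaround is a preStop sleep on the proxy container, so that its SIGTERM is delayed long enough for the app to close its Kafka sessions. This is only a sketch, and only applicable if you can set a lifecycle hook on the sidecar container (e.g. with manual sidecar injection); the container name kuma-sidecar and the delay values below are assumptions, not taken from the question:
spec:
  terminationGracePeriodSeconds: 60      # must be longer than the preStop delay
  containers:
    - name: kuma-sidecar                 # assumed name of the injected proxy container
      lifecycle:
        preStop:
          exec:
            # the kubelet runs this hook before sending SIGTERM to this container
            command: ["sh", "-c", "sleep 30"]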

Related

.NET Core / Kubernetes - SIGTERM, clean shutdown

I'm trying to verify that shutdown is completing cleanly on Kubernetes, with a .NET Core 2.0 app.
I have an app which can run in two "modes" - one using ASP.NET Core and one as a kind of worker process. Both use Console logger output and JSON output (which ends up in Elasticsearch via a Filebeat sidecar container), and both indicate startup and shutdown progress.
Additionally, I have console output which writes directly to stdout when a SIGTERM or Ctrl-C is received and shutdown begins.
Locally, the app works flawlessly - I get the direct console output, then the logger output flowing to stdout on Ctrl+C (on Windows).
My experiment scenario:
App deployed to GCS k8s cluster (using helm, though I imagine that doesn't make a difference)
Using kubectl logs -f to stream logs from the specific container
Killing the pod from GCS cloud console site, or deleting the resources via helm delete
Dockerfile is FROM microsoft/dotnet:2.1-aspnetcore-runtime and has ENTRYPOINT ["dotnet", "MyAppHere.dll"], so not wrapped in a bash process or anything
Not specifying a terminationGracePeriodSeconds, so I guess it defaults to 30 seconds
Observing output returned
Results:
The API pod log streaming showed just the immediate console output, "[SIGTERM] Stop signal received", not the other Console logger output about the shutdown process
The worker pod log streaming showed a little more - the same console output and some Console logger output about the shutdown process
The JSON logs didn't seem to pick up any of the shutdown log output
My conclusions:
I don't know if Kubernetes is allowing the process to complete before terminating it, or just issuing SIGTERM and then killing things very quickly. I think it should be waiting, but then why is there no complete console logger output?
I don't know if the console output gets cut off from the stdout log stream at some point before the process finally terminates.
I would guess that the JSON output doesn't come through to Elasticsearch because Filebeat running in the sidecar terminates even if there's still outstanding content in its files to send
I would like to know:
Can anyone advise on points 1 and 2 above?
Any ideas for a way to allow a little extra time or leeway for the sidecar to send its data up, like a pod container termination order, a delay on shutdown for that container, etc.?
SIGTERM does indeed signal termination. The less obvious part is that when the SIGTERM handler returns, everything is considered finished.
The fix is to not return from the SIGTERM handler until the app has finished shutting down - for example, by using a ManualResetEvent and Wait()ing on it in the handler.
I've started to look into this for my own purposes and have come across your question over a year after it was posted... This is a bit late, but have you tried GraceTerm?
There is an associated NuGet package for this.
From the description...
Graceterm middleware provides implementation to ensure graceful shutdown of AspNet Core applications. The basic concept is: After application received a SIGTERM (a signal asking it to terminate), Graceterm will hold it alive till all pending requests are completed or a timeout occur.
I haven't personally tried this yet, but it does look promising.
Try adding STOPSIGNAL SIGINT to your Dockerfile.
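For example, with the Dockerfile described in the question above (the image and DLL name are the ones from that question), that would look like this:
FROM microsoft/dotnet:2.1-aspnetcore-runtime
# ask the container runtime to stop this container with SIGINT instead of the default SIGTERM
STOPSIGNAL SIGINT
ENTRYPOINT ["dotnet", "MyAppHere.dll"]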

Getting error no such device or address on kubernetes pods

I have some .NET Core applications running as microservices in GKE (Google Kubernetes Engine).
Usually everything works fine, but sometimes, if my microservice isn't in use, something happens that shuts my application down (the same behavior as Ctrl+C in a terminal).
I know this is Kubernetes behavior, but if I send a request to an application that is not running, my first request returns the error "No such Device or Address" or a timeout error.
I will post some logs and setups:
The key to what's happening is this logged error:
TNS: Connect timeout occured ---> OracleInternal.Network....
Since your application is not being used, the Oracle database just shuts down its idle connection. To solve this problem, you can do a few things:
Handle the disconnection inside your application to just reconnect.
Define a livenessProbe to restart the pod automatically once the application is down.
Make your application do something with the connection from time to time -> this can be done with a probe too.
Configure your Oracle database not to close idle connections.
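For the livenessProbe option, a minimal sketch could look like the snippet below, assuming the service exposes an HTTP health endpoint (the /health path and port 80 are illustrative, not taken from the original setup):
livenessProbe:
  httpGet:
    path: /health        # assumed health endpoint of the microservice
    port: 80
  initialDelaySeconds: 30
  periodSeconds: 15      # regular probing also keeps the app doing periodic work
  failureThreshold: 3    # restart the container after 3 consecutive failures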

Spark Driver died, but did not kill the application

I have a streaming job which fails due to a network call timeout. The application keeps retrying for some time; if in the meantime I kill the driver, the application does not die, and I have to kill the application manually through the UI.
My question is:
Does this happen because the network connection is formed on a different thread and does not let the application die?

CoordinatedShutdown timeout on Akka cluster application

We have an Akka cluster application (sharding some actors). Sometimes, when we deploy and our application should be shut down, we see logs like this:
Coordinated shutdown phase [cluster-sharding-shutdown-region] timed out after 10000 milliseconds
This happens on the first deploy after more than 2 days since the last deploy (on Mondays, for example). We ask the Akka node to leave the cluster with the JMX helper, and we have the following code too:
actorSystem.registerOnTermination {
  logger.error("Gracefully shutdown of node")
  System.exit(0)
}
So when this error happens, the node eventually leaves the cluster (or at least it closes the JMX entry point for managing the Akka cluster), but the process doesn't finish and the log "Gracefully shutdown of node" doesn't appear. When this happens we need to shut down the Java process manually (we handle this with supervisor) and redeploy.
I know the timeout can be tuned through config, but what are the implications of increasing this timeout? Why does coordinated shutdown sometimes time out? And what happens when it does time out?
Any clue would be appreciated :D
Thank you
What happens after the timeout? Quoting from the Akka documentation:
If tasks are not completed within a configured timeout (see reference.conf) the next phase will be started anyway. It is possible to configure recover=off for a phase to abort the rest of the shutdown process if a task fails or is not completed within the timeout.
Why might the shutdown time out? Quite possibly you have a deadlock somewhere; in that case, increasing the timeout wouldn't help. It may also very well be that you simply need more time for shutdown; then you must increase the timeout.
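For example, raising the timeout of the phase named in the logged message would look roughly like this in application.conf (a sketch; check the reference.conf of your Akka version for the exact phase names and defaults):
akka.coordinated-shutdown.phases {
  cluster-sharding-shutdown-region {
    # the default is 10 s, which matches the 10000 milliseconds in the logged timeout
    timeout = 30 s
  }
}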
But the following could be more related to your problem:
By default, the JVM is not forcefully stopped (it will be stopped if all non-daemon threads have been terminated). To enable a hard System.exit as a final action you can configure:
akka.coordinated-shutdown.exit-jvm = on
So you can turn this on, which should take care of the "shut down the Java process manually" step.
Nevertheless, the hard question is to find out why the shutdown times out in the first place. I guess with the above trick you can survive for some time, but you'd better spend some time to find the actual cause.
We used to face this problem (one of the coordinated shutdown phases timing out) for a short-lived application.
The use case where we faced this:
The application joins an existing Akka cluster
Does some work
Leaves the cluster
But at step 3 the status of the member was still Joining or WeaklyUp, and if you look at the task added for PhaseClusterLeave, it removes a member from the cluster only if its status is Up.
Snippet from ClusterDaemon.scala, which is invoked when the ClusterLeave phase runs:
def leaving(address: Address): Unit = {
  // only try to update if the node is available (in the member ring)
  if (latestGossip.members.exists(m ⇒ m.address == address && m.status == Up)) {
    val newMembers = latestGossip.members map { m ⇒ if (m.address == address) m.copy(status = Leaving) else m } // mark node as LEAVING
    val newGossip = latestGossip copy (members = newMembers)
    updateLatestGossip(newGossip)
    logInfo("Marked address [{}] as [{}]", address, Leaving)
    publishMembershipState()
    // immediate gossip to speed up the leaving process
    gossip()
  }
}
To solve this problem, we ended up writing our own CoordinatedShutdown, which you can refer to here: CswCoordinatedShutdown.scala
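For reference, the general mechanism for hooking your own work into a shutdown phase looks roughly like this (a minimal sketch, not the CswCoordinatedShutdown linked above; the object and task names are illustrative):
import akka.Done
import akka.actor.{ActorSystem, CoordinatedShutdown}
import scala.concurrent.Future

object ShutdownHooks {
  def register(system: ActorSystem): Unit = {
    // CoordinatedShutdown waits for the returned Future (up to the phase timeout)
    // before moving on to the next phase.
    CoordinatedShutdown(system).addTask(CoordinatedShutdown.PhaseClusterLeave, "log-cluster-leave") { () =>
      system.log.info("Leaving the cluster as part of coordinated shutdown")
      Future.successful(Done)
    }
  }
}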

Fabric Network - what happens when a downed peer connects back to the network?

I recently deployed a Fabric network using docker-compose and was trying to simulate a downed peer. Essentially, this is what happens:
4 peers are brought online using docker-compose, running a Fabric network
1 peer, i.e. the 4th peer, goes down (done via the docker stop command)
Invoke transactions are sent to the root peer, which is verified by querying the peers (excluding the downed peer) after some time
The downed peer is brought back up with docker start. Query transactions run fine on the always-on peers but fail on the newly woken-up peer.
Why isn't the 4th peer synchronizing the blockchain once it's up? Is there a step to be taken to ensure it does, or is it discarded as a rogue peer?
This might be due to the expected behavior of PBFT (assuming you are using it). As explained in issue 933:
I think what you're seeing is normal PBFT behavior: 2f+1 replicas are making progress, and f replicas are lagging slightly behind, and catch up occasionally.
If you shut down another peer, you should observe that the one you originally shut off and restarted will now participate fully, and the network will continue to make progress. As long as the network is making progress, and the participating nodes share a correct prefix, you're all good. The reason for f replicas lagging behind is that those f may be acting byzantine and progress deliberately slowly. You cannot tell a difference between a slower correct replica, and a deliberately slower byzantine replica. Therefore we cannot wait for the last f stragglers. They will be left behind and sync up occasionally. If it turns out that some other replica is crashed, the network will stop making progress until one correct straggler catches up, and then the network will progress normally.
Hyperledger Fabric v0.6 does not support adding peers dynamically. I am not sure about HF v1.0.