docker swarm - connections from wildfly to postgres randomly hang

I'm experiencing a weird problem when deploying a docker stack (compose file).
I have a three node docker swarm - master and two workers.
All machines are CentOS 7.5 with kernel 3.10.0 and docker 18.03.1-ce.
Most things run on the master, one of which is a wildfly (v9.x) application server.
On one of the workers is a postgres database.
After deploying the stack things work normally, but after a while (or maybe after a specific action in the web app) requests start to hang.
Running netstat -ntp inside the wildfly container shows 52 bytes stuck in the Send-Q:
tcp 0 52 10.0.0.72:59338 10.0.0.37:5432 ESTABLISHED -
On the postgres side the connection is also in ESTABLISHED state, but the send and receive queues are 0.
It's always exactly 52 bytes. I read somewhere that ACK packets with timestamps are also 52 bytes. Is there any way I can verify that?
We have the following sysctl tunables set:
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_timestamps = 0
The first three were needed because of this.
All services in the stack are connected to the same default network that docker creates.
Now, if I move the postgres service onto the same host as the wildfly service, the problem doesn't seem to surface. Likewise, if I declare a separate network for postgres and attach it only to the services that need the database (and to the database itself), the problem also doesn't seem to show up.
Has anyone come across a similar issue? Can anyone provide any pointers on how I can debug the problem further?

Turns out this is a known issue with pooled connections in swarm when services run on different nodes.
Basically the workaround is to set the above tunables and enable TCP keepalive on the socket. See here and here for more details.
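For illustration, "enable TCP keepalive on the socket" means setting SO_KEEPALIVE on the client side; the net.ipv4.tcp_keepalive_* sysctls above only take effect on sockets that have it enabled. Below is a minimal Python sketch of what that looks like on Linux; it is not WildFly-specific, and the address is simply the one from the netstat output in the question:

import socket

# Sketch only: the tcp_keepalive_* sysctls apply to a connection
# once SO_KEEPALIVE is enabled on its socket.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

# Optional per-socket overrides (Linux-specific constants) mirroring the sysctls above:
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 600)  # idle time before the first probe
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 60)  # interval between probes
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)     # failed probes before the connection is dropped

sock.connect(("10.0.0.37", 5432))  # address taken from the netstat output in the question

With a libpq-based client such as psycopg2 the same thing can be requested through the keepalives, keepalives_idle, keepalives_interval and keepalives_count connection parameters; for a WildFly datasource, the PostgreSQL JDBC driver's tcpKeepAlive connection property serves the same purpose.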

Related

Utilizing multiple database pool connections within multiple gunicorn workers

I am using a Flask server which initialises the app by creating 10 connections in a psycopg2 connection pool (using Postgres). My Flask server receives 40 requests every second.
Every request uses 1 connection and takes approximately 5 seconds in the database. If a connection is not found in the connection pool, new connections are created.
There is a limit of 150 maximum database connections on the PostgreSQL server. However, I am facing challenges in specifying the maximum number of connections in the connection pool. For the pool initialization, I use:
app.config['pool'] = psycopg2.pool.ThreadedConnectionPool(
    10, 145,
    host=config["HOST"],
    database=config["DATABASE"],
    user=config["USER"],
    password=config["PASSWORD"]
)
I know it may not be possible to share connections between multiple workers. What is the best practice to utilize these 150 connections across multiple workers?
For reference, my tech stack is Flask + PostgreSQL (on Azure). For deployment, I use gunicorn and nginx with Flask.
Following is my gunicorn command:
gunicorn --bind 0.0.0.0:8000 --worker-class=gevent --worker-connections=1000 --workers=3 --timeout=1000 manage:app
The easiest solution is to change the worker-class from gevent to sync or possibly gthread.
It is also worth paying attention to this note straight from the gunicorn documentation: "For full greenlet support applications might need to be adapted. When using, e.g., Gevent and Psycopg it makes sense to ensure psycogreen is installed and setup." (https://docs.gunicorn.org/en/stable/design.html)
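If the gevent worker class is kept instead, the psycogreen setup mentioned in that quote can be done from a gunicorn configuration file. A rough sketch, assuming psycogreen is installed and the file is named gunicorn.conf.py (both are assumptions, not part of the original setup):

# gunicorn.conf.py - sketch assuming the gevent worker class is kept
from psycogreen.gevent import patch_psycopg

bind = "0.0.0.0:8000"
workers = 3
worker_class = "gevent"
worker_connections = 1000
timeout = 1000

def post_fork(server, worker):
    # Patch psycopg2 to cooperate with gevent greenlets before any
    # database connections are opened in this worker process.
    patch_psycopg()
    worker.log.info("psycopg2 patched for gevent")

Whichever worker class is used, each of the 3 worker processes builds its own pool, so the maxconn of 145 in the question should be reduced to roughly 150 // 3 (minus a small margin for other clients) to stay under the server-side limit.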

Dynamic port mapping for ECS tasks

I want to run a socket program in AWS ECS with the client and server in one task definition. I am able to run it when I use awsvpc network mode and connect to the server on localhost every time. This is good since I don't need to know the IP address of the server. The issue is that the server has to start on some port, and if I run 10 of these tasks only 3 tasks (= the number of running instances) run at a time. This is clearly because 10 tasks cannot open the same port. I could manually check for open ports before starting the server and somehow write the result to a docker shared volume where the client can read it and connect, but this seems complicated and leaves my server with unnecessary code. For Services there is dynamic port mapping via an Application Load Balancer, but there isn't anything for simply running tasks.
How can I run multiple socket programs without having to manage the port number in AWS ECS?
If you're using awsvpc mode, each task will get its own ENI and there shouldn't be any port conflict. But each instance type has a limited number of ENIs available. You can increase that by enabling ENI trunking, which, however, is supported by only a handful of instance types:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/container-instance-eni.html#eni-trunking-supported-instance-types
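As a side note, the ENI trunking opt-in is done at the account (or IAM identity) level. A minimal boto3 sketch, where the region is an assumption and only container instances registered after the change are affected:

import boto3

# Sketch: opt the calling IAM identity in to awsvpcTrunking so that supported
# instance types attach a trunk ENI and allow more task ENIs per instance.
ecs = boto3.client("ecs", region_name="us-east-1")
response = ecs.put_account_setting(name="awsvpcTrunking", value="enabled")
print(response["setting"])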

Amazon EC2 Elastic Load Balancer TCP disconnect after couple of hours

I am testing the reliability of TCP connections using Amazon Elastic Load Balancer compared to not using the Load Balancer to see if it has any impact.
I have set up a small Elastic Load Balancer on Amazon EC2 us-east zones with 8 t2.micro instances, using an auto scaling group without a policy and set to 8 min/max instances.
Each instance runs a simple TCP server that accepts connections on port 8017 and relays some data to the clients, coming from another remote server located in my network. The same data is sent to all clients.
For the purpose of the test, the servers running on the micro instances are only sending 1 byte of data every 60 seconds (to be sure the connections don't time out).
I connected multiple clients from various outside networks using the ELB DNS name provided, and after maybe 6-24 hours, I always stop receiving data and eventually the connections all die.
All clients stop around the same time, even though they are on different networks/ISPs. Each "client" application holds about 10 TCP connections, and they all stop receiving data.
All server instances look fine after this happens; they still send data.
To do further testing and rule out a problem in the TCP server code, I also have external clients connected directly to the public IP of a single instance, without the ELB; in that case the data doesn't stop and the connections are not lost (so far).
The Load balancer Idle Timeout is set to 900 seconds.
The Cross-Zone load balancing is enabled and I am using the following zones: us-east-1e, us-east-1b, us-east-1c, us-east-1d
I read the documentation and searched everywhere to see if this is a known behaviour, but I couldn't find any clear answer or confirmation of others having the same issue; it seems clear it is happening in my case though.
My question: Is this a known/expected behaviour for TCP load balancer? Otherwise, any idea what could be the problem in my setup?

Ganglia Web - Hosts Up and Hosts Down Issue

I have set up Ganglia (Ganglia Core 3.6.0 and Ganglia Web 3.5.10) to monitor my cluster.
When gmond is restarted on a machine, metrics from all the other gmond machines also stop, i.e. I am not able to see metrics getting published from other machines in Ganglia Web. I can also see Hosts up going to 0 and Hosts down going to 13 (the total number of machines). As time goes on, Hosts up comes back to 13.
Am I missing something? Can someone help me?
If it's always the same machine, it is probably the gmond 'end-point'. The gmetad daemon queries only one gmond (no redundancy); if it goes down, everybody seems to go down.
If there is redundancy (e.g. more than one host in a data source), you can expect some lag when the first one goes down, because of the number of TCP queries that have to time out first.

jboss clustering GMS, join

I have JBoss 5.1.0.
We have somehow configured JBoss to use clustering, but in fact we do not use clustering while developing or testing. In order to launch the project I have to type the following:
./run.sh -c all -g uniqueclustername -b 0.0.0.0 -Djboss.messaging.ServerPeerID=1 -Djboss.service.binding.set=ports-01
But while JBoss is starting I can see something like this in the console:
17:24:45,149 WARN [GMS] join(172.24.224.7:60519) sent to 172.24.224.2:61247 timed out (after 3000 ms), retrying
17:24:48,170 WARN [GMS] join(172.24.224.7:60519) sent to 172.24.224.2:61247 timed out (after 3000 ms), retrying
17:24:51,172 WARN [GMS] join(172.24.224.7:60519)
Here 172.24.224.7 is my local IP, while 172.24.224.2 is the IP of another developer in our room (and JBoss is stopped there).
So it tries to join the other node or something (I'm not very familiar with how JBoss acts in clusters), and as a result the application does not start.
What might the problem be? How can I avoid this joining?
You can probably fix this by specifying
-Djgroups.udp.ip_ttl=0
in your startup. This sets the IP time-to-live on the JGroups packets to zero, so they never get anywhere, and the cluster will never form. We use this in dev here to stop the various developer machines from forming a cluster. There's no need to specify a unique cluster name.
I'm assuming you need to do clustering in production, is that right? Could you just use the default configuration instead of all? This would remove the clustering stuff altogether.
While setting up the server, keeping the host name as localhost and using --host=localhost instead of an IP address will solve the problem. That makes the server start in non-clustered mode.