HAProxy Stats UI "Current Sessions" incorrect

I'm running HAProxy 1.5.18 in front of a MySQL Percona XtraDB Cluster, with everything set up as per the guidelines on the Percona website.
I'm seeing that the "Current Sessions" statistic is not updating as I would expect for a backend that has gone down and then come back up again.
It's therefore pretty confusing to get an accurate picture of which backend MySQL node is taking all the traffic.
Here's the frontend/backend config I'm using:
frontend pxc_frontend
    mode tcp
    bind *:6033
    timeout client 28800s
    default_backend pxc_backend

backend pxc_backend
    mode tcp
    timeout connect 10s
    timeout server 28800s
    balance leastconn
    option httpchk
    stick-table type ip size 1
    stick on dst
    server percona-node1 10.99.1.21:3306 check port 9200 inter 1000 rise 3 fall 3
    server percona-node2 10.99.1.22:3306 backup check port 9200 inter 1000 rise 3 fall 3
    server percona-node3 10.99.1.23:3306 backup check port 9200 inter 1000 rise 3 fall 3
Here's what I've tried:
1) Start my application up - this makes 50 connections to the DB (via HAProxy), hence the "current sessions" stat in the HAProxy UI shows 50 for the backend that is active (percona-node1 in my case). I verify this using netstat to check the number of connections between HAProxy and the backend MySQL node.
2) I then shut down the backend MySQL node holding all the connections (percona-node1) and let HAProxy fail the connections over to the next backend in the list (percona-node2). I verify using netstat that HAProxy has 0 connections to the old backend (obviously) and now has 50 connections to the new backend. The "current sessions" stat in the HAProxy UI shows 50 for the new backend but typically shows a number <50 for the old backend.
3) I then bring the old backend MySQL node (percona-node1) back up again. I verify again using netstat that HAProxy has 0 connections to the newly restarted backend and keeps its 50 connections to percona-node2. The "current sessions" stat in the HAProxy UI shows the same non-zero number for percona-node1 as before and 50 for percona-node2. I would expect it to show 0 for percona-node1 and 50 for percona-node2.
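For reference, I can also read the same per-server "current sessions" counter (scur) from the stats socket instead of the UI. Assuming a stats socket is enabled in the global section (e.g. stats socket /var/run/haproxy.sock, not shown above), something like this should show the same numbers:

    # show per-server stats; scur (current sessions) is the 5th CSV column
    echo "show stat" | socat stdio /var/run/haproxy.sock | cut -d, -f1,2,5 | grep pxc_backend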
So does the current sessions stat not get cleared for a node that has gone down and then come back up again?
Thanks in advance for your wisdom.

Related

In JMeter: SEVERE: No buffer space available (maximum connections reached?): connect exception

In JMeter I am testing 100,000 concurrent MQTT users with a ramp-up of 10,000 and a loop count of 1.
The library I am using for MQTT in JMeter is https://github.com/emqx/mqtt-jmeter . But I am getting
SEVERE: No buffer space available (maximum connections reached?): connect exception after reaching 64378 connections.
Specification:
OS: Windows 10
Ram : 64 GB
CPU : i7
Configuration in registry editor:
This is due to Windows having too many active client connections.
The default number of ephemeral TCP ports is 5000. Sometimes this number may be insufficient if the machine has too many active client connections; in that case the ephemeral TCP ports are all used up and no more can be allocated to a new client connection request, resulting in the error message above (for a Java application).
You should specify TCP/IP settings by editing the following registry values in the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters registry subkey:
MaxUserPort
Specifies the maximum port number for ephemeral TCP ports.
TcpNumConnections
Specifies the maximum number of concurrent connections that TCP can open. This value significantly affects the number of concurrent osh.exe processes that are allowed. If the value for TcpNumConnections is too low, Windows cannot assign TCP ports to stages in parallel jobs, and the parallel jobs cannot run.
These keys are not added to the registry by default.
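For illustration, the edits could be made from an elevated command prompt roughly like this (the values shown are the documented maxima; pick values appropriate to your machine and reboot afterwards):

    :: raise the upper bound of the ephemeral port range (default 5000, maximum 65534)
    reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v MaxUserPort /t REG_DWORD /d 65534 /f

    :: maximum number of concurrent connections TCP can open (maximum 16777214)
    reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v TcpNumConnections /t REG_DWORD /d 16777214 /f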
Follow the link to "Configuring the Windows registry: Specifying TCP/IP settings" and make the necessary edits.
Hope this will help.

How to scale the total number of connections with pgpool load balancing?

I have 3 PostgreSQL databases (one master and two slaves) behind pgpool; each database can handle 200 connections, and I want to be able to get 600 active connections through pgpool.
My problem is that if I set pgpool with 600 child processes, it can open all 600 connections against a single database (the master, for example, if every connection makes a write query), but with 200 child processes I only use about 70 connections on each database.
So is there a way to configure pgpool so that the load balancing scales with the number of databases?
Thanks.
Having 600 connections available on each DB is not an ideal solution; I would really look into the application before setting such a high connection value.
That said, the load-balancing spread of pgpool can be improved by giving each backend an equal backend_weight parameter, so that SQL queries get distributed evenly among the PostgreSQL nodes.
pgpool also manages its database connection pool using the num_init_children and max_pool parameters.
The num_init_children parameter controls how many pgpool child processes are spawned; each child connects to the PostgreSQL backends.
num_init_children is also the number of concurrent clients allowed to connect to pgpool.
pgpool roughly tries to make max_pool * num_init_children connections to each PostgreSQL backend.
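To make that concrete, here is a minimal pgpool.conf sketch under those assumptions (hostnames are placeholders and the values are only illustrative):

    # pgpool.conf (illustrative)
    num_init_children = 200          # concurrent client sessions pgpool will accept
    max_pool = 1                     # cached backend connections per child, per user/database pair

    # equal weights so read queries are spread evenly across the nodes
    backend_hostname0 = 'pg-master'
    backend_port0 = 5432
    backend_weight0 = 1

    backend_hostname1 = 'pg-slave1'
    backend_port1 = 5432
    backend_weight1 = 1

    backend_hostname2 = 'pg-slave2'
    backend_port2 = 5432
    backend_weight2 = 1

With these numbers pgpool opens at most max_pool * num_init_children = 200 connections per backend, which stays within each database's 200-connection limit.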

Amazon EC2 Elastic Load Balancer TCP disconnect after couple of hours

I am testing the reliability of TCP connections using an Amazon Elastic Load Balancer compared to not using the Load Balancer, to see if it has any impact.
I have set up a small Elastic Load Balancer on Amazon EC2 (us-east) with 8 t2.micro instances, using an auto scaling group without a policy and with min/max set to 8 instances.
Each instance runs a simple TCP server that accepts connections on port 8017 and relays to the clients some data coming from another remote server located on my network. The same data is sent to all clients.
For the purposes of the test, the servers running on the micro instances only send 1 byte of data every 60 seconds (to make sure the connections don't time out).
I connected multiple clients from various outside networks using the ELB DNS name provided, and after maybe 6-24 hours, I always stop receiving data and eventually the connections all die.
All clients stop around the same time, even though they are on different networks/ISPs. Each "client" application makes about 10 TCP connections, and they all stop receiving data.
All server instances look fine after this happens; they are still sending data.
To test further and rule out a problem in the TCP server code, I also have external clients connected directly to the public IP of a single instance, without the ELB, and in that case the data doesn't stop and the connections are not lost (so far).
The Load balancer Idle Timeout is set to 900 seconds.
The Cross-Zone load balancing is enabled and I am using the following zones: us-east-1e, us-east-1b, us-east-1c, us-east-1d
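For reference, I double-checked the idle timeout from the AWS CLI for the classic ELB (the load balancer name below is a placeholder):

    # show current attributes, including ConnectionSettings.IdleTimeout
    aws elb describe-load-balancer-attributes --load-balancer-name my-elb

    # set the idle timeout to 900 seconds
    aws elb modify-load-balancer-attributes --load-balancer-name my-elb \
      --load-balancer-attributes "{\"ConnectionSettings\":{\"IdleTimeout\":900}}"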
I read the documentation and searched everywhere to see if this is a known behaviour, but I couldn't find a clear answer or confirmation from others having the same issue; it seems clear, though, that it is happening in my case.
My question: is this known/expected behaviour for a TCP load balancer? If not, any idea what could be the problem in my setup?

HAProxy continues to route sessions to a backend marked as down

I'm using HAProxy 1.5.0 in front of a 3-node MariaDB cluster.
HAProxy checks, via a custom query exposed through an xinetd service, that each DB node has a synced status.
When the check fails for some reason (the node, for instance, gets desynced or becomes a donor), the corresponding backend in HAProxy is marked down, but I can still see active sessions on it in the HAProxy statistics console, and queries in the DB process list (this is possible because the MariaDB service is still up and accepts queries, even though the cluster status is not synced).
I was wondering why HAProxy does not close the active connections when a backend goes down and dispatch them to the other active backends.
I get the expected behaviour when the MariaDB service is fully stopped on a given node (no session possible).
Is there a specific option to enable this? option redispatch seemed promising, but it applies when connections are closed (not my case), and it's already active in my config.
Thanks for your help.
Here are the settings we're using to get the same behavior:
default-server port 9200 [snip] on-marked-down shutdown-sessions
The on-marked-down shutdown-sessions option tells HAProxy to close all connections to a backend server when it is marked as down.
Of course, you can add it to every individual server if you're not using a default-server directive :)
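For example, applied to an individual server line (the server name and address below are just placeholders), it would look something like:

    server db01 10.0.0.1:3306 check port 9200 inter 1000 rise 3 fall 3 on-marked-down shutdown-sessions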

How to get the current ZooKeeper cluster's member server list

I want to get the member server list and each server's type (leader or observer) in my Java application.
I also want to get the dead servers.
Is there any way to do that? I read the documentation but didn't find one.
It would be nice if there were a built-in answer for this without resorting to JMX. If you are on one of the ZooKeeper nodes, you can read the zoo.cfg file to get the list of servers (dead and alive) and then "stat" each one individually to see whether it's alive and what its status is (note the "Mode" attribute in a successful response). E.g.:
$ echo stat | nc 127.0.0.1 2181
Zookeeper version: 3.4.5--1, built on 06/10/2013 17:26 GMT
Clients:
/127.0.0.1:54752[1](queued=0,recved=215524,sent=215524)
/127.0.0.1:59298[0](queued=0,recved=1,sent=0)
Latency min/avg/max: 0/0/6
Received: 5596
Sent: 5596
Connections: 2
Outstanding: 0
Zxid: 0x10000010f
Mode: leader
Node count: 54
Note that "stat" does not show you the other members of the zookeeper ensemble--it only shows you the connected clients.
ZooKeeper exposes this information over JMX.
It can also be queried by sending the "stat" command over a direct connection to port 2181.
For an example of how to do that from python see:
https://github.com/apache/zookeeper/blob/765cedb5c65526384011ea958e59938fc7493168/src/contrib/huebrowser/zkui/src/zkui/stats.py