In JMeter: SEVERE: No buffer space available (maximum connections reached?): connect exception

In JMeter I am testing 100000 concurrent MQTT users with a ramp-up of 10000 and a loop count of 1.
The library I am using for MQTT in JMeter is https://github.com/emqx/mqtt-jmeter . But I am getting
SEVERE: No buffer space available (maximum connections reached?): connect exception after reaching 64378 connections.
Specification:
OS: Windows 10
RAM: 64 GB
CPU: i7
Configuration in registry editor:

This is due to Windows having too many active client connections.
The default number of ephemeral TCP ports is 5000. This can be insufficient when the machine has too many active client connections: once all ephemeral TCP ports are used up, no port can be allocated to a new client connection request, which results in the error message above (for a Java application).
You should specify TCP/IP settings by editing the following registry values in the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters registry subkey:
MaxUserPort
Specifies the maximum port number for ephemeral TCP ports.
TcpNumConnections
Specifies the maximum number of concurrent connections that TCP can open. This value significantly affects the number of concurrent osh.exe processes that are allowed. If the value for TcpNumConnections is too low, Windows cannot assign TCP ports to stages in parallel jobs, and the parallel jobs cannot run.
These keys are not added to the registry by default.
Follow the link "Configuring the Windows registry: Specifying TCP/IP settings" and make the necessary edits.
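For illustration (the numbers here are example settings, not part of the original guidance): the two values can also be added from an elevated command prompt instead of regedit, e.g. MaxUserPort at its documented maximum of 65534; a reboot is required for the changes to take effect.

rem add MaxUserPort and TcpNumConnections under the Tcpip\Parameters subkey (example values)
reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v MaxUserPort /t REG_DWORD /d 65534 /f
reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v TcpNumConnections /t REG_DWORD /d 16777214 /f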
Hope this helps.

Related

Dynamic port mapping for ECS tasks

I want to run a socket program in AWS ECS with the client and server in one task definition. I am able to run it when I use awsvpc network mode and connect to the server on localhost every time. This is good, as I don't need to know the IP address of the server. The issue is that the server has to start on some port, and if I run 10 of these tasks only 3 tasks (= the number of running instances) run at a time. This is clearly because 10 tasks cannot open the same port. I could manually check for open ports before starting the server and somehow write the port to a Docker shared volume where the client can read it and connect, but this seems complicated and adds unnecessary code to my server. For Services there is dynamic port mapping by using an Application Load Balancer, but there isn't anything for simply running tasks.
How can I run multiple socket programs without having to manage the port number in AWS ECS?
If you're using awsvpc mode, each task gets its own ENI and there shouldn't be any port conflict. However, each instance type has a limited number of ENIs available. You can increase that by enabling ENI trunking, which, however, is supported by only a handful of instance types:
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/container-instance-eni.html#eni-trunking-supported-instance-types
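A minimal sketch of the opt-in, assuming the AWS CLI is configured for the account and region that runs the cluster; it only affects container instances launched after the setting is enabled:

# opt the account into ENI trunking (applies to newly launched container instances of supported types)
aws ecs put-account-setting --name awsvpcTrunking --value enabled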

HAProxy Stats UI "Current Sessions" incorrect

I'm running HAProxy 1.5.18 to front a MySQL Percona XtraDB Cluster, with everything set up as per the guidelines on the Percona website.
I'm seeing that the "Current Sessions" statistic is not updating as I would expect for a backend that has gone down and then come back up again.
It's therefore pretty confusing to get an accurate picture of which backend MySQL node is taking all the traffic.
Here's the frontend/backend config I'm using:
frontend pxc_frontend
    mode tcp
    bind *:6033
    timeout client 28800s
    default_backend pxc_backend

backend pxc_backend
    mode tcp
    timeout connect 10s
    timeout server 28800s
    balance leastconn
    option httpchk
    stick-table type ip size 1
    stick on dst
    server percona-node1 10.99.1.21:3306 check port 9200 inter 1000 rise 3 fall 3
    server percona-node2 10.99.1.22:3306 backup check port 9200 inter 1000 rise 3 fall 3
    server percona-node3 10.99.1.23:3306 backup check port 9200 inter 1000 rise 3 fall 3
Here's what I've tried:
1) Start my application up - this makes 50 connections to the DB (via HAProxy), hence the "current sessions" stat in the HAProxy UI shows 50 for the active backend (percona-node1 in my case). I verify this using netstat to check the number of connections between HAProxy and the backend MySQL node.
2) I then shut down the backend MySQL node that holds all the connections (percona-node1) and let HAProxy fail the connections over to the next backend in the list (percona-node2). I verify using netstat that HAProxy has 0 connections to the old backend (obviously) and now has 50 connections to the new backend. The "current sessions" stat in the HAProxy UI shows 50 for the new backend but typically shows a number <50 for the old backend.
3) I then bring the old backend MySQL node back up again (percona-node1). I verify again using netstat that HAProxy has 0 connections to the newly restarted backend and maintains its 50 connections to percona-node2. The "current sessions" stat in the HAProxy UI shows the same non-zero number for percona-node1 as before and 50 for percona-node2. I would expect it to show 0 for percona-node1 and 50 for percona-node2.
So does the "current sessions" stat not get cleared for a node that has gone down and then come back up again?
Thanks in advance for your wisdom.
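For reference, the same counters can also be read outside the UI, assuming a stats socket is configured in the global section of haproxy.cfg (not shown in the config above); scur is the current-sessions column:

# requires e.g. "stats socket /var/run/haproxy.sock level admin" in the global section
echo "show stat" | socat stdio unix-connect:/var/run/haproxy.sock | cut -d, -f1,2,5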

docker swarm - connections from wildfly to postgres randomly hang

I'm experiencing a weird problem when deploying a docker stack (compose file).
I have a three node docker swarm - master and two workers.
All machines are CentOS 7.5 with kernel 3.10.0 and docker 18.03.1-ce.
Most things run on the master, one of which is a wildfly (v9.x) application server.
On one of the workers is a postgres database.
After deploying the stack things work normally, but after a while (or maybe after a specific action in the web app) requests start to hang.
Running netstat -ntp inside the wildfly container shows 52 bytes stuck in the Send-Q:
tcp 0 52 10.0.0.72:59338 10.0.0.37:5432 ESTABLISHED -
On the postgres side the connection is also in ESTABLISHED state, but the send and receive queues are 0.
It's always exactly 52 bytes. I read somewhere that ACK packets with timestamps are also 52 bytes. Is there any way I can verify that?
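(One way to check, assuming tcpdump is available inside the wildfly container: capture the stuck flow and look at the on-wire segment lengths and TCP options; the interface name here is only an example.)

# print lengths and TCP options of segments to the postgres backend
tcpdump -nn -v -i eth0 tcp port 5432 and host 10.0.0.37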
We have the following sysctl tunables set:
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_timestamps = 0
The first three were needed because of this.
All services in the stack are connected to the same default network that docker creates.
Now, if I move the postgres service onto the same host as the wildfly service, the problem doesn't seem to surface. Likewise, if I declare a separate network for postgres and attach it only to the services that need the database (and the database itself), the problem also doesn't show up.
Has anyone come across a similar issue? Can anyone provide any pointers on how I can debug the problem further?
It turns out this is a known issue with pooled connections in swarm when services are on different nodes.
Basically, the workaround is to set the above tunables + enable TCP keepalive on the socket. See here and here for more details.
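A minimal sketch of the "keepalive on the socket" part for this particular stack, assuming the application talks to the database through the standard PostgreSQL JDBC driver (pgjdbc exposes a tcpKeepAlive connection property; the host and database names below are made up):

<!-- WildFly datasource fragment: only the URL parameter matters here -->
<connection-url>jdbc:postgresql://postgres:5432/appdb?tcpKeepAlive=true</connection-url>

With keepalive enabled, idle pooled connections generate periodic probe traffic governed by the tcp_keepalive_* values above, instead of sitting silent long enough for the overlay network's connection tracking to expire them.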

Linux Kernel parameters which can be tuned when TCP backlog exceeds in WebSphere MQ server

We are facing an issue where the TCP backlog exceeds the default value (100) on our MQ server (v7.5) running on Linux (Red Hat) during bursts of connection requests to the MQ server. ListenerBacklog is configured as 100 in qm.ini, which is the default listener backlog value (maximum pending connection requests) for Linux. Whenever there is a burst of connections and the TCP backlog is exceeded, the queue manager stops functioning and resumes only when the queue manager/server is restarted.
So we are looking at whether there are Linux kernel attributes related to socket tuning that can increase the TCP backlog at the network layer without harming the queue manager. Will increasing the values below in /etc/sysctl.conf help resolve this issue or improve the performance of the queue manager?
net.ipv4.tcp_max_syn_backlog = 4096
net.core.somaxconn = 1024
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
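For illustration only (the figures mirror the values above and are not a recommendation): after adding those lines, the kernel settings are applied with sysctl, and the queue manager's own backlog has to be raised as well, since the backlog a listener actually gets is the lower of its configured ListenerBacklog and net.core.somaxconn.

# apply the new /etc/sysctl.conf values without a reboot
sysctl -p

# qm.ini, TCP stanza: raise the listener backlog so it is no longer capped at 100
TCP:
   ListenerBacklog=1024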

TCP connection - Bare Metal vs libvirt - VM

I performed a TCP transfer between two directly connected bare-metal machines and observed 90% of the link capacity as throughput.
When performing the same TCP transfer from a bare-metal sender to a VM receiver, I observe 20% of the link capacity as throughput.
The VM is set to the maximum core and RAM allocation of the node. I also tried core affinity settings.
Can anybody suggest a reason for this variation, such as the core allocation of the VM relative to the node?
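In case it helps to isolate the variables, a sketch of how the two paths could be compared with a synthetic load generator (iperf3 is an assumption; the post does not say which tool produced the 90%/20% figures), including pinning the receiver to one core to test the affinity theory:

# on the receiver (bare metal or VM): pin the server process to core 2 (example core id)
taskset -c 2 iperf3 -s

# on the bare-metal sender: 30-second run with one stream, then with 4 parallel streams
iperf3 -c <receiver-ip> -t 30
iperf3 -c <receiver-ip> -t 30 -P 4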