What happens when a TCP keepalive probe fails on an established socket?

To be clearer, I am wondering what TCP state a socket will be in when its keepalive probes have failed too many times.
For example, right now I have the following line from netstat -anop:
tcp 0 0 10.10.10.10:12345 11.11.11.11:56789 ESTABLISHED 12345/process keepalive (7200.00/0/0)
Let's say host 11.11.11.11 suddenly loses power forever, so host 10.10.10.10's keepalive probes will eventually detect the broken connection. When 10.10.10.10 detects the broken connection, what state will the socket be in, as shown by netstat?

The connection will be reset and the line will disappear from netstat.
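For reference, this is how keepalive is typically configured programmatically. A minimal sketch (Linux-specific TCP_KEEP* options; the idle/interval/count values are illustrative assumptions, not taken from the question) of enabling keepalive on a connected socket so a dead peer is detected much sooner than the 7200-second default visible in the netstat timer above:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Enable keepalive probing on an already-connected socket 'fd'.
void enable_keepalive(int fd) {
    int on = 1;
    setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on);       // turn probing on

    int idle = 60;      // seconds of idle time before the first probe
    int interval = 5;   // seconds between unanswered probes
    int count = 3;      // unanswered probes before the connection is declared dead
    setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,  &idle,     sizeof idle);
    setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof interval);
    setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,   &count,    sizeof count);
}

Once the last probe goes unanswered, the kernel aborts the connection: the next read() or write() on the socket fails with ETIMEDOUT, and the ESTABLISHED line disappears from netstat, as described above.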

Related

Connection refused when deploying in the cloud on Kubernetes

I am deploying to Kubernetes in the cloud and I'm trying to call another container inside the same pod through an API.
I am using localhost, but I also tried 127.0.0.1 as well as the container's name.
2022/11/04 15:50:47 dial tcp [::1]:4245: connect: connection refused
2022/11/04 15:50:47 Successfully processed file.json file
2022/11/04 15:50:47 Get "http://localhost:4245/api/admin/projects/default": dial tcp [::1]:4245: connect: connection refused
panic: Get "http://localhost:4245/api/admin/projects/default": dial tcp [::1]:4245: connect: connection refused
goroutine 1 [running]:
log.Panic({0xc000119dc8?, 0xc000166000?, 0x6aaaea?})
/opt/app-root/src/sdk/go1.19.2/src/log/log.go:388 +0x65
main.StatusServer({0xc000020570?, 0x30?}, {0x0, 0x0})
/build/script.go:197 +0x1ee
main.ProcessData({0xc000020041, 0x15}, {0x0, 0x0}, {0xc00002000f?, 0x43ce05?})
/build/script.go:291 +0xa6
main.main()
/build/script.go:443 +0xc5
Any idea if I can call the container like that?
Getting "connection refused" means you reached localhost and it decided to refuse the connection.
This is most likely because nothing is listening on the port.
If it were a firewall issue, the request would time out.
You can check listening ports with a command like:
netstat -an
If it is not installed, you can try it from the worker node where the pod is running.
Another method of testing is to use
curl http://127.0.0.1:4245
This will probably result in the same connection refused error.
Are you really sure the container is running in the same pod?
Please check your deployment and service.
If you can't find the failure, please come back with more information so it can be analysed.
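To illustrate the refused-versus-timeout distinction above, here is a minimal sketch (plain POSIX sockets in C++; port 4245 is the port from the question and is assumed to have nothing listening on it). A closed port on a reachable host answers the SYN with a RST, so connect() fails immediately with ECONNREFUSED, whereas a firewall that silently drops packets leaves connect() hanging until it times out.

#include <arpa/inet.h>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main() {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(4245);                       // port from the question
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

    if (connect(fd, (sockaddr*)&addr, sizeof addr) < 0)
        // With nothing listening this prints "Connection refused" right away;
        // a firewall dropping packets would instead hang and end in ETIMEDOUT.
        printf("connect failed: %s\n", strerror(errno));
    close(fd);
    return 0;
}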

Can't connect to PostgreSQL with a specific external IP

I can connect without issue to my DigitalOcean Ubuntu 20 LTS VM instance that has PostgreSQL 14 installed, but I'm trying to make it more secure so that only specific IPs can connect to the database.
I heard the way to do this is to modify the /etc/postgresql/14/main/postgresql.conf file.
When I have this line, I can connect to my database without issue.
listen_addresses='0.0.0.0'
However, if I modify this line with:
listen_addresses='123.123.123.123'
I get this DataGrip error message: [08001] Connection to 111.111.111.111:12345 refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
111.111.111.111:12345 is my (fake) VM's IP and port that I already set up.
123.123.123.123 is my (fake) computer's external IP that I get from here or here
Any suggestions? Is there a log I can search from that will give me a better understanding of what is going on?
Also to note, with listen_addresses='0.0.0.0', running ss -ptl gives the following output:
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 4096 127.0.0.53%lo:domain 0.0.0.0:*
LISTEN 0 128 0.0.0.0:ssh 0.0.0.0:*
LISTEN 0 244 0.0.0.0:12345 0.0.0.0:*
LISTEN 0 128 [::]:ssh [::]:*
With listen_addresses='123.123.123.123', running ss -ptl gives the following output:
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 4096 127.0.0.53%lo:domain 0.0.0.0:*
LISTEN 0 128 0.0.0.0:ssh 0.0.0.0:*
LISTEN 0 128 [::]:ssh [::]:*
Documentation that I used so far:
https://www.postgresql.org/docs/current/runtime-config-connection.html
https://www.postgresql.org/docs/current/auth-pg-hba-conf.html
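For context on the underlying socket behaviour: listen_addresses names the local interface address the server binds to (it is not a client allow-list), and an address that is not configured on the machine cannot be bound, which is consistent with the second ss -ptl output where nothing is listening on port 12345 any more. A minimal sketch (plain POSIX sockets in C++; the IP and port are the placeholders from the question) of that bind() behaviour:

#include <arpa/inet.h>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main() {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(12345);
    inet_pton(AF_INET, "123.123.123.123", &addr.sin_addr);   // not a local interface address

    if (bind(fd, (sockaddr*)&addr, sizeof addr) < 0)
        // Typically fails with EADDRNOTAVAIL ("Cannot assign requested address"),
        // so no listening socket ever appears on the port.
        printf("bind failed: %s\n", strerror(errno));
    close(fd);
    return 0;
}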

How to detect if keepalive is enabled on a TCP socket in AIX and Solaris?

I am working on a solution where I am enabling the keepalive option on a TCP socket. On Linux I am able to see whether keepalive is enabled using netstat:
netstat -o -p |grep processid
The output is as follows:
$ netstat -o
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State Timer
tcp 0 0 himanshu-laptop.l:46096 sjc-not16.sjc.dropb:www ESTABLISHED off (0.00/0/0)
tcp 38 0 himanshu-laptop.l:40156 v-d-1a.sjc.dropbo:https CLOSE_WAIT off (0.00/0/0)
tcp 38 0 himanshu-laptop.l:54501 v-client-5a.sjc.d:https CLOSE_WAIT off (0.00/0/0)
In the command output I see the Timer field, which shows either off or keepalive.
But I am not able to get this on AIX and Solaris.
How can I get this information on AIX and Solaris?
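For what it's worth, one portable check is from inside the process that owns the socket: SO_KEEPALIVE is a standard SOL_SOCKET option, so getsockopt() can report whether it is enabled on AIX and Solaris just as on Linux. A minimal sketch (the char* cast is only there for older headers that declare the option value as char*):

#include <sys/socket.h>

// Returns true if SO_KEEPALIVE is enabled on 'fd', false if it is disabled
// or the query fails (e.g. 'fd' is not a socket).
bool keepalive_enabled(int fd) {
    int on = 0;
    socklen_t len = sizeof on;
    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, (char*)&on, &len) < 0)
        return false;
    return on != 0;
}

This only covers the on/off flag; inspecting another process's sockets from the outside is OS-specific and not covered here.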

Too many connections on ZooKeeper server

Environment: HDP 2.6.4
Ambari – 2.6.1
3 ZooKeeper servers
23.1.35.185 is the IP of the first ZooKeeper server
hi all,
On the first ZooKeeper server it seems that, even after being closed by the client, the connection to ZooKeeper is not actually getting closed,
which causes the maximum number of client connections from a host to be reached; we have maxClientCnxns set to 60 in the ZooKeeper config.
As a result, when a new application comes along and tries to create a connection, it fails.
For example, when the connections look like this:
echo stat | nc 23.1.35.185 2181
Latency min/avg/max: 0/71/399
Received: 3031 Sent: 2407
Connections: 67
Outstanding: 622
Zxid: 0x130000004d
Mode: follower
Node count: 3730
But after some time, when connections reach ~70, we see:
echo stat | nc 23.1.35.185 2181
Ncat: Connection reset by peer.
And we can also see many connections in CLOSE_WAIT:
java 58936 zookeeper 60u IPv6 381963738 0t0 TCP Zookeper_server.sys54.com:eforward->zookeper_server.sys54.com:44983 (CLOSE_WAIT)
From the ZooKeeper log:
2018-12-26 02:50:46,382 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#193] - Too many connections from /23.1.35.185 - max is 60
In Ambari we can also see:
Connection failed: [Errno 104] Connection reset by peer to zookeper_server.sys54.com.:2181
I must say that this is not happening on ZooKeeper servers 2 and 3.
NOTE - if we increase maxClientCnxns to 300, it does not help, because after some time we get more than 300 connections (CLOSE_WAIT) and then we see in the log:
2018-12-26 02:50:49,375 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#193] - Too many connections from /23.1.35.187 - max is 300
So, any hint as to why the connections are stuck in CLOSE_WAIT?
CLOSE_WAIT means that the local end of the connection has received a FIN from the other end, but the OS is waiting for the program at the local end to actually close its connection.
The problem is that your program running on the local machine is not closing the socket. It is not a TCP tuning issue. A connection can (and quite correctly) stay in CLOSE_WAIT forever while the program holds the connection open.
Once the local program closes the socket, the OS can send the FIN to the remote end, which transitions you to LAST_ACK while you wait for the ACK of the FIN. Once that is received, the connection is finished and drops from the connection table (if your end is in CLOSE_WAIT, you do not end up in the TIME_WAIT state).
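To make that concrete, here is a minimal sketch (plain POSIX sockets in C++ on loopback, error handling omitted) in which the peer sends its FIN but the local program never calls close(), so the accepted socket sits in CLOSE_WAIT exactly like the lsof line above:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdio>

int main() {
    // Listener on an ephemeral loopback port.
    int srv = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                              // let the kernel pick a port
    bind(srv, (sockaddr*)&addr, sizeof addr);
    listen(srv, 1);
    socklen_t len = sizeof addr;
    getsockname(srv, (sockaddr*)&addr, &len);       // learn the chosen port

    // "Client" end in the same process: connect, then close (this sends the FIN).
    int cli = socket(AF_INET, SOCK_STREAM, 0);
    connect(cli, (sockaddr*)&addr, sizeof addr);
    int conn = accept(srv, nullptr, nullptr);
    close(cli);                                     // peer FIN arrives on 'conn'

    // 'conn' is never closed here, so it stays in CLOSE_WAIT.
    // While this sleeps, `ss -tan` (or netstat) shows the stuck socket.
    printf("accepted fd %d left open; check ss/netstat now\n", conn);
    sleep(60);
    close(conn);                                    // only now does it leave CLOSE_WAIT
    return 0;
}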
There is a kernel-level property to reuse connections and reduce the TIME_WAIT time.
I suggest you follow this tutorial: http://www.linuxbrigade.com/reduce-time_wait-socket-connections/
This should probably solve your problem.

Netstat output with boost::Asio

I have created an asio server with an acceptor:
m_acceptor(m_ios, asio::ip::tcp::endpoint(asio::ip::address_v4::any(), port_num))
where the port number is 3333.
At this point, the netstat -antup command shows:
tcp 0 0 0.0.0.0:3333 0.0.0.0:* LISTEN 26566/./test
So, I believe this means that the local address 0.0.0.0:3333 is ready to listen for any connection on port 3333.
After this, I start the client, which creates an endpoint with IP 127.0.0.1 and port 3333.
After this, the netstat output is:
tcp 0 0 0.0.0.0:3333 0.0.0.0:* LISTEN 26566/./test
tcp 0 0 127.0.0.1:3333 127.0.0.1:46675 ESTABLISHED 26566/./test
tcp 0 0 127.0.0.1:46675 127.0.0.1:3333 ESTABLISHED 26685/./test
Process 26566 is master process
Process 26685 is slave process
What I do not understand is what the port 46675 in the address shown above means. It definitely represents the client side, but where was this port number allocated to the client from?
Does this mean that the client has connected to port 3333, but the port from which it itself connects is 46675?
Does this mean that the client has connected to port 3333, but the port from which it itself connects is 46675?
Basically. It describes the client endpoint. This is BSD/POSIX sockets jargon.
What I do not understand is what the port 46675 in the address shown above means. It definitely represents the client side, but where was this port number allocated to the client from?
It gets automatically chosen (by the TCP stack, usually in the kernel) from the local port range. E.g. on Linux you can manipulate that range (if you have permission):
sudo sysctl -w net.ipv4.ip_local_port_range="60000 61000"
(Warning: don't do this unless you know what you're doing). See also https://en.wikipedia.org/wiki/Ephemeral_port
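For completeness, asio exposes both ends of a connected socket, so a client can print the ephemeral port the kernel chose for it. A minimal sketch (independent of the question's code; it assumes a server is already listening on 127.0.0.1:3333 and uses boost::asio::ip::make_address, available since Boost 1.66):

#include <boost/asio.hpp>
#include <iostream>

int main() {
    namespace ip = boost::asio::ip;
    boost::asio::io_context ios;

    ip::tcp::socket sock(ios);
    // No local port is specified, so the kernel assigns one from the
    // ephemeral range when the connection is made.
    sock.connect(ip::tcp::endpoint(ip::make_address("127.0.0.1"), 3333));

    std::cout << "local  (ephemeral) endpoint: " << sock.local_endpoint()  << '\n';
    std::cout << "remote (server)    endpoint: " << sock.remote_endpoint() << '\n';
    // The local endpoint's port is the 46675-style number netstat shows
    // for the client side of the connection.
    return 0;
}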