Unexpected connection close in HttpClient

I am facing a problem with HttpClient (version 4.5.2) in a multi-threaded web application. Normally, when a connection request arrives, a connection is leased from the pool, used, and finally released back to the pool to serve future requests, as the following log excerpt for the connection with id 673890 shows:
15 Feb 2017 018:25:54:115 p-1-thread-121 DEBUG PoolingHttpClientConnectionManager:249 - Connection request: [route: {}->http://127.0.0.1:8080][total kept alive: 51; route allocated: 4 of 100; total allocated: 92 of 500]
15 Feb 2017 018:25:54:116 p-1-thread-121 DEBUG PoolingHttpClientConnectionManager:282 - Connection leased: [id: 673890][route: {}->http://127.0.0.1:8080][total kept alive: 51; route allocated: 4 of 100; total allocated: 92 of 500]
15 Feb 2017 018:25:54:116 p-1-thread-121 DEBUG DefaultManagedHttpClientConnection:90 - http-outgoing-673890: set socket timeout to 9000
15 Feb 2017 018:25:54:120 p-1-thread-121 DEBUG PoolingHttpClientConnectionManager:314 - Connection [id: 673890][route: {}->http://127.0.0.1:8080] can be kept alive for 10.0 seconds
15 Feb 2017 018:25:54:121 p-1-thread-121 DEBUG PoolingHttpClientConnectionManager:320 - Connection released: [id: 673890][route: {}->http://127.0.0.1:8080][total kept alive: 55; route allocated: 4 of 100; total allocated: 92 of 500]
After the mentioned connection (id 673890) has been used several times in the normal way described above, I notice the following:
15 Feb 2017 018:25:54:130 p-1-thread-126 DEBUG PoolingHttpClientConnectionManager:249 - Connection request: [route: {}->http://127.0.0.1:8080][total kept alive: 55; route allocated: 4 of 100; total allocated: 92 of 500]
15 Feb 2017 018:25:54:130 p-1-thread-126 DEBUG PoolingHttpClientConnectionManager:282 - Connection leased: [id: 673890][route: {}->http://127.0.0.1:8080][total kept alive: 54; route allocated: 4 of 100; total allocated: 92 of 500]
15 Feb 2017 018:25:54:131 p-1-thread-126 DEBUG DefaultManagedHttpClientConnection:90 - http-outgoing-673890: set socket timeout to 9000
15 Feb 2017 018:25:54:133 p-1-thread-126 DEBUG DefaultManagedHttpClientConnection:81 - http-outgoing-673890: Close connection
15 Feb 2017 018:25:54:133 p-1-thread-126 DEBUG PoolingHttpClientConnectionManager:320 - Connection released: [id: 673890][route: {}->http://127.0.0.1:8080][total kept alive: 55; route allocated: 3 of 100; total allocated: 91 of 500]
The log shows that the connection is requested, leased, used, closed, and then released back to the pool. So my question is: why is the connection closed? And why is it released back to the pool after closing?
I know that the connection could be closed by the server, but that is a different situation. In that case the connection is leased from the pool, determined to be stale, and a new connection is established and used instead; the log I presented above shows different behavior.
I am aware of two reasons for a connection being closed in HttpClient: first, it is closed for being idle once its keep-alive time has expired; second, it is closed by the server, which makes the pooled connection stale. Is there any other reason for connections to be closed?
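For reference, both of those mechanisms can be tuned when the client is built. A minimal sketch (values are illustrative, matching the pool sizes visible in the log):

import java.util.concurrent.TimeUnit;

import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

public class PoolSetup {
    public static void main(String[] args) {
        PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
        cm.setMaxTotal(500);            // "total allocated: ... of 500" in the log
        cm.setDefaultMaxPerRoute(100);  // "route allocated: ... of 100"
        // Re-validate a connection that sat idle for 1s before reusing it,
        // so connections closed by the server are detected as stale.
        cm.setValidateAfterInactivity(1000);

        CloseableHttpClient client = HttpClients.custom()
                .setConnectionManager(cm)
                // A background thread evicts connections idle longer than the keep-alive.
                .evictIdleConnections(10, TimeUnit.SECONDS)
                .build();
    }
}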

Based on Oleg Kalnichevski's reply on the HttpClient mailing list, and the checks I made, it turned out that the problem was caused by a 'Connection: close' header sent by the other side. Another cause that can lead to the same situation is the use of non-persistent HTTP/1.0 connections.
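One way to confirm this is to log the Connection header of every response before the pool decides whether the connection can be reused. A minimal sketch (the URL is a placeholder):

import org.apache.http.Header;
import org.apache.http.HttpHeaders;
import org.apache.http.HttpResponseInterceptor;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class ConnectionCloseCheck {
    public static void main(String[] args) throws Exception {
        try (CloseableHttpClient client = HttpClients.custom()
                // Inspect every response; if the peer sends "Connection: close",
                // HttpClient must close the socket instead of returning it for reuse.
                .addInterceptorLast((HttpResponseInterceptor) (response, context) -> {
                    Header h = response.getFirstHeader(HttpHeaders.CONNECTION);
                    if (h != null && "close".equalsIgnoreCase(h.getValue())) {
                        System.out.println("Peer requested connection close");
                    }
                })
                .build()) {
            try (CloseableHttpResponse rsp = client.execute(new HttpGet("http://127.0.0.1:8080/"))) {
                EntityUtils.consume(rsp.getEntity());
            }
        }
    }
}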

Related

Cannot access Apache server from foreign IP address

I have a site running on my computer using Apache 2.4, which I can easily access by using my local IPv4 address and the respective port 80. Port 80 is bound to port 22*** using portmap.io, which is configured with OpenVPN/TCP on my computer. I have allowed access to the Apache HTTP server and Apache Server Monitor through the firewall. I have also increased the KeepAlive timeout in the Apache server to 600s and raised max connections. I have Listen 80 and Listen 22***, and ServerName set to http://awm-22***.portmap.host:22*** in my httpd.conf file. You can look at the Apache handler configuration for more options.
I am using PHP as the backend language.
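Put together, the relevant httpd.conf directives look roughly like this (a sketch; the hostname and one port stay masked as in the rest of this question):

# httpd.conf (sketch of the directives described above)
Listen 80
Listen 22***                              # masked as above; the bound port is 22470
ServerName awm-22***.portmap.host:22***
KeepAliveTimeout 600                      # the increased KeepAlive timeout (600s)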
Since port 80 is bound to port 22470, whenever I try to access my site from another device (which uses the same Wi-Fi network as the computer running the server) using the local IPv4 address of that computer and port 80, i.e.
192.168.*.*:80, the browser is automatically redirected to 192.168.*.*:22*** and I can access my site with no difficulty. Access log in the Apache server:
192.168.**.** - - [15/Dec/2022:10:08:02 +0530] "GET /abc%20xyz%20klm/ HTTP/1.1" 200 12049
192.168.**.** - - [15/Dec/2022:10:08:02 +0530] "GET /SPR/b/get_captcha.php?rand=29842778 HTTP/1.1" 200 4057
But when I try to access the same site from another device (again on the same Wi-Fi network) using the IPv4 address that OpenVPN assigned to the computer running the server, 10.9.*.*, and port 80, i.e. 10.9.*.*:80, the browser shows a TOOK TOO LONG TO RESPOND error. The browser's address bar, however, contains the following:
http://10.9.**.**4:22470/abc%20xyz%20klm/
Then why is it not loading the page? There is no entry in the Apache access log.
When I try to access the same site from another device (again on the same Wi-Fi network) using the URL provided to me by portmap.io, i.e. http://awm-22***.portmap.host:22***/,
the "server takes too long to respond" error is shown in the browser.
Access log in the Apache server:
10.9.0.1 - - [15/Dec/2022:10:21:33 +0530] "GET / HTTP/1.0" 302 -
10.9.0.1 - - [15/Dec/2022:10:21:34 +0530] "GET /abc%20xyz%20klm HTTP/1.0" 301 256
OpenVpn Log:
Thu Dec 15 10:32:30 2022 SIGHUP[hard,] received, process restarting
Thu Dec 15 10:32:30 2022 --cipher is not set. Previous OpenVPN version defaulted to BF-CBC as fallback when cipher negotiation failed in this case. If you need this fallback please add '--data-ciphers-fallback BF-CBC' to your configuration and/or add BF-CBC to --data-ciphers.
Thu Dec 15 10:32:30 2022 OpenVPN 2.5.7 [git:release/2.5/3d792ae9557b959e] Windows-MSVC [SSL (OpenSSL)] [LZO] [LZ4] [PKCS11] [AEAD] built on Oct 28 2022
Thu Dec 15 10:32:30 2022 Windows version 10.0 (Windows 10 or greater) 64bit
Thu Dec 15 10:32:30 2022 library versions: OpenSSL 1.1.1o 3 May 2022, LZO 2.10
Thu Dec 15 10:32:35 2022 TCP/UDP: Preserving recently used remote address: [AF_INET]193.161.193.99:1194
Thu Dec 15 10:32:35 2022 Attempting to establish TCP connection with [AF_INET]193.161.193.99:1194 [nonblock]
Thu Dec 15 10:32:35 2022 TCP connection established with [AF_INET]193.161.193.99:1194
Thu Dec 15 10:32:35 2022 TCP_CLIENT link local: (not bound)
Thu Dec 15 10:32:35 2022 TCP_CLIENT link remote: [AF_INET]193.161.193.99:1194
Thu Dec 15 10:32:41 2022 [193.161.193.99] Peer Connection Initiated with [AF_INET]193.161.193.99:1194
Thu Dec 15 10:32:42 2022 WARNING: INSECURE cipher (BF-CBC) with block size less than 128 bit (64 bit). This allows attacks like SWEET32. Mitigate by using a --cipher with a larger block size (e.g. AES-256-CBC). Support for these insecure ciphers will be removed in OpenVPN 2.7.
Thu Dec 15 10:32:42 2022 WARNING: INSECURE cipher (BF-CBC) with block size less than 128 bit (64 bit). This allows attacks like SWEET32. Mitigate by using a --cipher with a larger block size (e.g. AES-256-CBC). Support for these insecure ciphers will be removed in OpenVPN 2.7.
Thu Dec 15 10:32:42 2022 WARNING: cipher with small block size in use, reducing reneg-bytes to 64MB to mitigate SWEET32 attacks.
Thu Dec 15 10:32:42 2022 open_tun
Thu Dec 15 10:32:42 2022 tap-windows6 device [OpenVPN TAP-Windows6] opened
Thu Dec 15 10:32:42 2022 Notified TAP-Windows driver to set a DHCP IP/netmask of 10.9.**.234/255.255.255.252 on interface {798F492A-574C-4BC6-87C5-A62C6D058EC1} [DHCP-serv: 10.9.**.233, lease-time: 31536000]
Thu Dec 15 10:32:42 2022 Successful ARP Flush on interface [12] {798F492A-574C-4BC6-87C5-A62C6D058EC1}
Thu Dec 15 10:32:42 2022 IPv4 MTU set to 1500 on interface 12 using service
These are my firewall rules (screenshots): the inbound rule for port 80, the outbound rules, and firewall monitoring for the Domain, Private, and Public profiles.
What is causing the problem? Any solution will be of great help. Thanks in advance.

Why does Puma mess up incoming requests? (timed out worker)

Problem
I have a Rails 7 app deployed on render.com, and it doesn't get a lot of traffic (maybe once per day). However, when a few requests do come in, everything seems to be running fine for a moment until Puma seems to barf. The incoming requests are from Twilio for a voice call, and the call eventually errors with "We're sorry, an application error has occurred. Goodbye". Something about a "timed out" worker happens, then the worker boots, and whammo! a flood of "Completed 2XX OK" and "Kredis Connected to shared" lines comes crashing through like they've been pent up the entire time. THEN, nearly a day later, without any outside requests coming in, several log lines about Out-of-sync worker list, no 78 worker come through. My Puma config file is unchanged from the one that ships with Rails.
Questions
Where might I go look for the offending code? What tools could help me decipher why a Puma worker is timing out? Could it have something to do with how I'm using Redis via Kredis in my app?
Workaround
To get around this issue, I've started to occasionally redeploy my latest commit and that seems to help. I'm not certain, but it seems like inactivity causes Puma to become discombobulated.
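For reference, the 60-second check-in that appears in the log below corresponds to Puma's worker_timeout setting. A sketch of the relevant config/puma.rb lines (illustrative values, not my exact file):

# config/puma.rb (sketch; values illustrative)
workers ENV.fetch("WEB_CONCURRENCY") { 4 }
threads 5, 5
# The master process terminates any worker that fails to check in within
# this many seconds (60 by default), producing the "Terminating timed out
# worker" lines below. Raising it only hides the symptom; the worker itself
# is blocked and unable to heartbeat.
worker_timeout 60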
Log output
Here's what the offending lines in my log file look like:
... a few requests that complete 200 OK ...
Sep 13 05:53:15 PM [70] ! Terminating timed out worker (worker failed to check in within 60 seconds): 90
... a couple more normal log lines and then ...
Sep 13 05:53:16 PM [70] - Worker 3 (PID: 134) booted in 0.04s, phase: 0
... some more normal log lines and then ...
Sep 13 05:53:16 PM I, [2022-09-13T22:53:16.593713 #74] INFO -- : [595ad8e5-fa3a-45a3-8c5b-a506e6c94b69] Completed 204 No Content in 110ms (Allocations: 13681)
Sep 13 05:53:16 PM I, [2022-09-13T22:53:16.425579 #86] INFO -- : [f1a64c71-8048-4032-8bf6-2e68aa1fa7ba] Completed 204 No Content in 2ms (Allocations: 541)
Sep 13 05:53:16 PM I, [2022-09-13T22:53:16.595408 #86] INFO -- : [68d19bd9-2286-4f75-a982-5fa3e864d6ac] Completed 200 OK in 105ms (Views: 0.2ms | Allocations: 1592)
Sep 13 05:53:16 PM I, [2022-09-13T22:53:16.614951 #76] INFO -- : [e883350f-9a26-4d3d-8f1c-4853285aa71a] Kredis (10.6ms) Connected to shared
Sep 13 05:53:16 PM I, [2022-09-13T22:53:16.615787 #76] INFO -- : [fbcd8730-1514-4af5-9332-0bdf0c89fc2d] Kredis (17.2ms) Connected to shared
Sep 13 05:53:16 PM I, [2022-09-13T22:53:16.705926 #86] INFO -- : [1f67a177-38f2-4bf5-bd03-1c59a3edb3a4] Kredis (224.1ms) Connected to shared
Sep 13 05:53:16 PM I, [2022-09-13T22:53:16.958386 #76] INFO -- : [e883350f-9a26-4d3d-8f1c-4853285aa71a] Completed 200 OK in 472ms (ActiveRecord: 213.1ms | Allocations: 32402)
Sep 13 05:53:17 PM I, [2022-09-13T22:53:17.034211 #86] INFO -- : [1f67a177-38f2-4bf5-bd03-1c59a3edb3a4] Completed 200 OK in 606ms (ActiveRecord: 256.6ms | Allocations: 17832)
Sep 13 05:53:17 PM I, [2022-09-13T22:53:17.136231 #76] INFO -- : [fbcd8730-1514-4af5-9332-0bdf0c89fc2d] Completed 200 OK in 654ms (ActiveRecord: 88.0ms | Allocations: 37385)
... literally a day later without any other activity ...
Sep 14 05:02:29 AM [69] ! Terminating timed out worker (worker failed to check in within 60 seconds): 78
Sep 14 05:02:31 AM [69] ! Out-of-sync worker list, no 78 worker
Sep 14 05:02:31 AM [69] ! Out-of-sync worker list, no 78 worker
Sep 14 05:02:31 AM [69] ! Out-of-sync worker list, no 78 worker
Sep 14 05:02:31 AM [69] ! Out-of-sync worker list, no 78 worker
Sep 14 05:02:31 AM [69] ! Out-of-sync worker list, no 78 worker
Sep 14 05:02:31 AM [69] ! Out-of-sync worker list, no 78 worker
Sep 14 05:02:31 AM [69] - Worker 1 (PID: 132) booted in 0.03s, phase: 0

How should I limit the slow disconnect of thousands of TCP connections?

I have a Go program running on CentOS that usually has around 5k TCP clients connected. Every once in a while this number goes up to around 15k for about an hour, and everything is still fine.
The program has a slow-shutdown mode in which it stops accepting new clients and slowly disconnects all currently connected clients over the course of 20 minutes. During these slow-shutdown periods, if the machine has 15k clients, I sometimes get:
[Wed Oct 31 21:28:23 2018] net_ratelimit: 482 callbacks suppressed
[Wed Oct 31 21:28:23 2018] TCP: too many orphaned sockets
[Wed Oct 31 21:28:23 2018] TCP: too many orphaned sockets
[Wed Oct 31 21:28:23 2018] TCP: too many orphaned sockets
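For context, the slow-shutdown logic is roughly the following (a simplified Go sketch, not the actual program; in particular, SetLinger(0) is just one possible mitigation, not something the real code necessarily does):

package shutdown

import (
	"net"
	"sync"
	"time"
)

type server struct {
	mu      sync.Mutex
	ln      net.Listener
	clients map[net.Conn]struct{}
}

// slowShutdown stops accepting new clients, then spreads the closes of the
// existing clients evenly over the window (e.g. 20 minutes).
func (s *server) slowShutdown(window time.Duration) {
	s.ln.Close() // stop accepting new clients

	s.mu.Lock()
	conns := make([]net.Conn, 0, len(s.clients))
	for c := range s.clients {
		conns = append(conns, c)
	}
	s.mu.Unlock()

	if len(conns) == 0 {
		return
	}
	interval := window / time.Duration(len(conns))
	for _, c := range conns {
		// SetLinger(0) makes Close send a RST and free the socket at once,
		// so it never lingers in the orphan/FIN-WAIT states the kernel counts.
		if tc, ok := c.(*net.TCPConn); ok {
			tc.SetLinger(0)
		}
		c.Close()
		time.Sleep(interval)
	}
}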
I have tried adding:
echo "net.ipv4.tcp_max_syn_backlog=5000" >> /etc/sysctl.conf
echo "net.ipv4.tcp_fin_timeout=10" >> /etc/sysctl.conf
echo "net.ipv4.tcp_tw_recycle=1" >> /etc/sysctl.conf
echo "net.ipv4.tcp_tw_reuse=1" >> /etc/sysctl.conf
sysctl -f /etc/sysctl.conf
These values are applied; I can see them set to their correct new values. A typical sockstat looks like:
cat /proc/net/sockstat
sockets: used 31682
TCP: inuse 17286 orphan 5 tw 3874 alloc 31453 mem 15731
UDP: inuse 8 mem 3
UDPLITE: inuse 0
RAW: inuse 0
FRAG: inuse 0 memory 0
Any ideas how to stop the "too many orphaned sockets" error and the crash? Should I increase the 20-minute slow-shutdown period to 40 minutes? Increase tcp_mem? Thanks!
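For what it's worth, the knob that directly corresponds to the kernel message seems to be net.ipv4.tcp_max_orphans, which none of the settings above touch. A sketch with illustrative values (loaded with sysctl --system):

# /etc/sysctl.d/99-tcp.conf (sketch; values are illustrative, size them to the machine's RAM)
# "TCP: too many orphaned sockets" is printed when sockets closed by the
# application but still being torn down by the kernel exceed this limit.
net.ipv4.tcp_max_orphans = 65536
# tcp_mem is measured in pages (min, pressure, max), not bytes.
net.ipv4.tcp_mem = 786432 1048576 1572864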

lmtpd: failed to mmap file /var/lib/imap/deliver.db.NEW (in reply to end of DATA command)

Good day!
After installing and running Kolab, mail was delivered instantly. But after a few days, mail to local recipients started being delivered with a delay. The messages are delivered eventually, but the delay may be several hours. An example of a message's path:
root@myhost:~# cat /var/log/mail.log | grep 7AA7935B1FC
Jan 12 11:31:03 myhost postfix/smtpd[19494]: 7AA7935B1FC:
client=localhost[127.0.0.1]
Jan 12 11:31:05 myhost postfix/cleanup[19492]: 7AA7935B1FC:
message-id=<20160112093103.7AA7935B1FC@mail.myhost.com>
Jan 12 11:31:05 myhost postfix/qmgr[7021]: 7AA7935B1FC:
from=<noreply@myhost.com>, size=1279, nrcpt=3 (queue active)
Jan 12 11:31:05 myhost lmtpunix[19631]: Delivered:
<20160112093103.7AA7935B1FC@mail.myhost.com> to mailbox:
myhost.com!user.user1
Jan 12 11:31:06 myhost postfix/lmtp[19617]: 7AA7935B1FC: to=<user1@myhost.com>, relay=mail.myhost.com[/var/lib/imap/socket/lmtp], delay=2.6, delays=2/0.01/0/0.59, dsn=4.3.0, status=deferred (host
mail.myhost.com[/var/lib/imap/socket/lmtp] said: 421 4.3.0 lmtpd:
failed to mmap /var/lib/imap/deliver.db.NEW file (in reply to end of
DATA command))
Jan 12 11:31:06 myhost postfix/lmtp[19617]: 7AA7935B1FC: to=<user2@myhost.com>, relay=mail.myhost.com[/var/lib/imap/socket/lmtp], delay=2.7, delays=2/0.01/0/0.68, dsn=4.4.2, status=deferred (lost connection with mail.myhost.com[/var/lib/imap/socket/lmtp] while sending end of data
-- message may be sent more than once)
Jan 12 11:31:07 myhost postfix/lmtp[19617]: 7AA7935B1FC: to=<user3@myhost.com>, relay=mail.myhost.com[/var/lib/imap/socket/lmtp], delay=2.7, delays=2/0.01/0/0.68, dsn=4.4.2, status=deferred (lost connection with mail.myhost.com[/var/lib/imap/socket/lmtp] while sending end of data
-- message may be sent more than once)
Currently mailq shows a variety of messages in the queue. An example of one of them:
7BBDF35B123 6162 Tue Jan 12 13:19:24 user@rambler.ru (delivery temporarily suspended: lost connection with mail.myhost.com[/var/lib/imap/socket/lmtp] while sending end of data -- message may be sent more than once) user4@myhost.com
-- 11667 Kbytes in 327 Requests.
I think that the main reason is the one described here:
lmtp: failed to mmap /var/lib/imap/deliver.db.NEW file
But, unfortunately, I have not been able to find a solution.
The problem was solved according to this recommendation: http://lists.kolab.org/pipermail/users-de/2015-May/001998.html
Stop the cyrus-imap and postfix services
Delete the files deliver.db and deliver.db.NEW in the directory /var/lib/imap/
Start the services; the file deliver.db is created automatically
Re-run the queue: postsuper -r ALL
Some of the messages were then delivered from the queue again.
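In shell form, the recovery looks roughly like this (the service names are assumptions and may differ between distributions):

service postfix stop
service cyrus-imapd stop
rm /var/lib/imap/deliver.db /var/lib/imap/deliver.db.NEW
service cyrus-imapd start    # deliver.db is recreated on startup
service postfix start
postsuper -r ALL             # requeue all deferred messages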
Proposed cause: after the services were installed and started on the new server, users mass-imported messages in *.eml format, downloaded from the previous mail system. Perhaps these actions somehow overflowed the index files.
P.S.: Unfortunately, the solution was only temporary: the situation described above recurs periodically :(

Facebook OpenGraph API calls do not work at all times

Every now and then we are unable to connect to the Facebook OpenGraph API.
Our .NET client returns the error below:
System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 66.220.146.100:443 at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress) at System.Net.Sockets.Socket.InternalConnect(EndPoint remoteEP) at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Int32 timeout, Exception& exception)
Please note that this is not specific to any particular API call.
Even pinging graph.facebook.com times out randomly.
Below is some traceroute info:
1 <1 ms <1 ms <1 ms ip-97-74-87-252.ip.secureserver.net [97.74.87.252]
2 <1 ms <1 ms <1 ms ip-208-109-113-173.ip.secureserver.net [208.109.113.173]
3 1 ms <1 ms <1 ms ip-97-74-252-21.ip.secureserver.net [97.74.252.21]
4 1 ms <1 ms <1 ms ip-97-74-252-21.ip.secureserver.net [97.74.252.21]
5 9 ms 1 ms 1 ms 206-15-85-9.static.twtelecom.net [206.15.85.9]
6 13 ms 13 ms 44 ms lax2-pr2-xe-1-3-0-0.us.twtelecom.net [66.192.241.218]
7 26 ms 28 ms 27 ms cr1.la2ca.ip.att.net [12.122.104.14]
8 26 ms 27 ms 27 ms cr82.sj2ca.ip.att.net [12.122.1.146]
9 23 ms 23 ms 23 ms 12.122.128.201
10 22 ms 22 ms 22 ms 12.249.231.26
11 22 ms 22 ms 22 ms ae1.bb02.sjc1.tfbnw.net [204.15.21.164]
12 23 ms 23 ms 23 ms ae5.dr02.snc4.tfbnw.net [204.15.21.169]
13 24 ms 23 ms 23 ms eth-17-2.csw02b.snc4.tfbnw.net [74.119.76.105]
14 * * * Request timed out.
15 * * * Request timed out.
16 * * * Request timed out.
17 * * * Request timed out.
18 * * * Request timed out.
Steps to reproduce:
Try pinging graph.facebook.com.
The ping requests will time out for certain IPs.