Why does Perl's Crypt::SSLeay timeout on Intel Mac OS X machines? - perl

A have a Perl cron job that recently started having its HTTPS connections start failing with an error of "500 SSL read timeout". I've tracked that the error is being thrown as part of an alarm in Crypt::SSLeay, but I don't know if this is simply something taking too long to respond.
So far, I've adjusted the timeout from the default 30 seconds to 10 minutes and it still times out. I've moved the script to other machines, and those on Intel Mac OS X systems all time out, while those under Linux, or on PPC Mac OS X systems run fine, so I don't think it's changes on the network or remote server.
When the process started having problems does not coincide with any software updates or reboots on the machine, and I've contacted the server I'm connecting to, and everyone claims that they haven't changed anything.
Does anyone have recommendations on trying to debug HTTPS, or have you ever seen this behavior and give recommendations on something I might've overlooked at that could've caused this problem?

The problem seems to be specific to OS X and related directly to OpenSSL, so not unique to perl. It possibly has to do with one of the latest security updates from Apple (2010-001).
I'm having the same issue with:
python httplib (uploads over ~64k produce 'The read operation timed out' error). Smaller uploads over SSL work. Uploads of all sizes over HTTP work.
curl over HTTPS. curl times out. Same curl command from Linux works fine with both HTTP and HTTPS. curl on OS X over HTTP also works fine.
I found a few places online that cover similar issues across different programming languages / software. I can only post one...
https://blog.torproject.org/blog/apple-broke-openssl-which-breaks-tor-os-x

Related

Local web server on windows stopped being reachable by devices on the same network

I use a local Python web server on my Windows machine. It’s simple, but good enough while in the static web page development stage. I just run it with something like this on my WSL command line:
python3 -m http.server
I can also access it on mobile devices on the same network, by going to my local address, e.g.: http://192.168.1.12:8000. All was good, until suddenly I could no longer access it on external devices, I got a “server not responding” type of message. Also, I could clearly see that when I refreshed the page on my phone, there was no GET request on the logs.
Immediately I tested on the local machine, and it was still working fine. This obviously smelled like a Firewall. In Linux, I’d know what to do, but it’s the first time I had to deal with this on Windows. This is what I’ve tried, without resolving the connection problem:
I opened the Event Viewer but could not see any obvious logs to check
I stopped the server (CTRL+C) and started it again on another port (5000). The Windows Firewall message popped up again asking for permission for Python3 to access the “Public network” and the “Private network”. Normally I just tick the “private network” but this time I checked both, as a troubleshooting step, in case my Wi-Fi was incorrectly being considered “public”.
I went to Windows Firewall and temporarily shut it down on the private network.
I installed and tried running nmap on the WSL, but it failed to run and prompted me to install the Windows version instead.
I installed and ran the Windows version of nmap but it told me that port 5000 was open.
What is the recommended way to troubleshoot and fix this issue?
Still suspecting the firewall, I tried something new, I switched off the “public network” firewall. I tested on my mobile and the page loaded as normal again! I immediately turned the firewall back on. Tested the page on my mobile once more, still fine. So, the solution was to toggle the public network firewall. I would make it more generic and toggle all firewall categories on Windows. And of course, I would make sure that the firewall stays on, this was a very quick operation.
I thought I’d put this here rather than ServerFault or SuperUser as it could potentially be more useful to developers, and it took a precious hour of my time. I still don’t know why it stopped working on its own in the first place. Better troubleshooting steps or suggestions are welcome, but I probably won’t be able to verify it as I don’t know how to purposely induce the issue.
Another solution that worked another time, was to delete all instances of Python 3.8 from the list of allowed apps (I don't know why Windows shows the same app multiple times) then (re)start the Python server and allow it through when the Firewall question pops up again.
In windows firewall you may have 4 options to configure your local web server when you are creating new Inbound connections rule.
1 Program
2 Port
3 Predefined
4 Custom
Try to use port only in "TCP protocol" and the custom port.
Allow connection.
Select: all checks: domain, private and public.
Enter a name.
Thats all.

TCP retransmission on RST - Different socket behaviour on Windows and Linux?

Summary:
I am guessing that the issue here is something to do with how Windows and Linux handle TCP connections, or sockets, but I have no idea what it is. I'm initiating a TCP connection to a piece of custom hardware that someone else has developed and I am trying to understand its behaviour. In doing so, I've created a .Net core 2.2 application; run on a Windows system, I can initiate the connection successfully, but on Linux (latest Raspbian), I cannot.
It appears that it may be because Linux systems do not try to retry/retransmit a SYN after a RST, whereas Windows ones do - and this behaviour seems key to how this peculiar piece of hardware works..
Background:
We have a black box piece of hardware that can be controlled and queried over a network, by using a manufacturer-supplied Windows application. Data is unencrypted and requires no authentication to connect to it and the application has some other issues. Ultimately, we want to be able to relay data from it to another system, so we decided to make our own application.
I've spent quite a long time trying to understand the packet format and have created a library, which targets .net core 2.2, that can be used to successfully communicate with this kit. In doing so, I discovered that the device seems to require a kind of "request to connect" command to be sent, via UDP. Straight afterwards, I am able to initiate a TCP connection on port 16000, although the first TCP attempt always results in a RST,ACK being returned - so a second attempt needs to be made.
What I've developed works absolutely fine on both Windows (x86) and Linux (Raspberry Pi/ARM) systems and I can send and receive data. However, when run on the Raspbian system, there seems to be problems when initiating the TCP connection. I could have sworn that we had it working absolutely fine on a previous build, but none of the previous commits seem to work - so it may well be a system/kernel update that has changed something.
The issue:
When initiating a TCP connection to this device, it will - straight away - reset the connection. It does this even with the manufacturer-supplied software, which itself then immediately re-attempts the connection again and it succeeds; so this kind of reset-once-then-it-works-the-second-time behaviour in itself isn't a "problem" that I have any control over.
What I am trying to understand is why a Windows system immediately re-attempts the connection through a retransmission...
..but the Linux system just gives up after one attempt (this is the end of the packet capture..)
To prove it is not an application-specific issue, I've tried using ncat/netcat on both the Windows system and the Raspbian system, as well as a Kali system on a separate laptop to prove it isn't an ARM/Raspberry issue. Since the UDP "request" hasn't been sent, the connection will never succeed anyway, but this simply demonstrates different behaviour between the OSes.
Linux versions look pretty much the same as above, whereby they send a single packet that gets reset - whereas the Windows attempt demonstrates the multiple retransmissions..
So, does anyone have any answer for this behaviour difference? I am guessing it isn't a .net core specific issue, but is there any way I can set socket options to attempt a retransmission? Or can it be set at the OS level with systemctl commands or something? I did try and see if there are any SocketOptionNames, in .net, that look like they'd control attempts/retries, as this answer had me wonder, but no luck so far.
If anyone has any suggestions as to how to better align this behaviour across platforms, or can explain the reason for this difference is at all, I would very much appreciate it!
Nice find! According to this, Windows´ TCP will retry a connection if it receives a RST/ACK from the remote host after sending a SYN:
... Upon receiving the ACK/RST client from the target host, the client determines that there is indeed no service listening there. In the Microsoft Winsock implementation of TCP, a pending connection will keep attempting to issue SYN packets until a maximum retry value is reached (set in the registry, this value defaults to 3 extra times)...
The value used to limit those retries is set in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\TcpMaxConnectRetransmissions according to the same article. At least in Win10 Pro it doesn´t seem to be present by default.
Although this is a conveniece for Windows machines, an application still should determine its own criteria for handling a failed connect attempt IMO (i. e number of attempts, timeouts etc).
Anyhow, as I said, surprising fact! Living and learning I guess ...
Cristian.

Fiddler blocks my Internet access when it starts to capture web traffic

I have Windows 8.1 installed on my computer and regularly use Fiddler to capture web traffic.
Recently, however, when I open Fiddler and
it strats to capture web traffic, my Internet connectivity dies.
The error I get when I open IE is "the proxy server isn't responding."
In Chrome, I get "Could not connect to proxy server" with the following error: "Error code: ERR_PROXY_CONNECTION_FAILED."
Fiddler doesn't even capture any of the requests going out. The weird thing is that Fiddler was working ok just some days ago and nothing was recently installed on my system.
Searching the Internet for 5 hours, trying everything, and no effective response.
This also had no effect: http://www.telerik.com/blogs/fiddler-and-internet-explorer-11-on-windows-8-1​
It seems that the proxy server created by Fiddler is simply not attending to any traffic.
If I close Fiddler or disable the capture mode my internet come back to normal.
Uninstall and reinstall Fiddler does not solve the problem, neither restart Windows.​
This question has some similarities with my problem, but as I said, none of the answers worked for me.
Why is Fiddler having this problem, and how can I fix it?
99% of the time, this is caused by running a 3rd-party firewall which is blocking access to Fiddler.
1% of the time, this is caused by plugging a Windows Phone device into your PC over USB. The Windows Phone team steals Fiddler's default port (8888) from it.
Running Help > Troubleshoot and updating your question with its output may help.

Missed connections using select() on many non-blocking connecting TCP sockets on windows XP

I have a small portable tool which connects to around 150 servers at diverse locations to get a quick status check from them. It is important to get the status for all servers back to the user relatively quickly so the tool connects to the servers in parallel using non-blocking connect, and uses select() to determine when each socket is ready. The use of select() is fairly straightforward, and the tool is failure mature now and works well on Linux. It runs on windows xp, but connections to the vast majority of the servers out there do not complete. The tool staggers the calls to connect to avoid creating what looks like a SYN flood. It connects to one server about 100 msecs. I also have a check in place to ensure FD_SETSIZE is not violated. I have anecdotal evidence from someone else that the behaviour is better on later windows versions, but have not been able to verify.
I have used WinDump to verify that the syn packets are being sent, and I can see ack packets coming back, but select() keeps returning zero, and the code simply can't connect to most of the servers that do exist, and I can connect to just fine with the same code on Linux.
Has anyone seen and or solved any similar issues with many non-blocking connects and select on Windows XP?
After another day or so of digging I seem to have found the answer. On windows XP SP2 there is a limit of 10 concurrent connecting sockets system wide. If 10 or more half open connections exist, a System event is logged noting that the limit has been reached, and new connecting sockets are throttled silently. The System Event number is 4226.
I have fixed my code by adding version checks for Windows XP, and throttled to less than 10 connections on those systems. So far I have no reports of other versions being affected.

perl-cgi script slowness

I have a perl-cgi script which is log to cisco devices and run a command. so recently move the script from old solaris server to the newer more powerful vm server. now the script is very slow to log to the device actually got timeout. I am not expert on perl and I don't know how I can troubleshoot.regarding of the network there is no issue detected upon my test. as I said the server and the network at least 10 x faster than the old one. any suggestion? thank in advance.
The script was probably written using blocking sockets. The fact that it moved has probably slowed down the connection between the cisco devices and the server running the CGI. I would check your network path first. If this is still a concern you should write this to fork() a child process, use non-blocking socket techniques or write a CLI app. This doesn't sound like something that is well suited to run as a CGI.