I want to simulate test cross-platform connection failures / timeouts, starting with blocking connect()s:
#!/usr/bin/python3
import socket
s = socket.socket()
endpoint = ('localhost', 28813)
s.bind((endpoint))
# listen for connections, accept 0 connections kept waiting (backlog)
# all other connect()s should block indefinitely
s.listen(0)
for i in range(1,1000):
c = socket.socket()
c.connect(endpoint)
# print number of successfully connected sockets
print(i)
On Linux, it prints "1" and hangs indefinitely (i.e. the behavior I want).
On Windows (Server 2012), it prints "1" and aborts with a ConnectionRefusedError.
On macOS, it prints all numbers from 1 to 128 and then hangs indefinitely.
Thus, I could accept the macOS ignores the backlog parameter and just connect enough sockets for clients to block on new connections.
How can I get Windows to also block connect() attempts?
On Windows, the SO_CONDITIONAL_ACCEPT socket option allows the application to have the incoming connections wait until it's accept()ed. The constant (SO_CONDITIONAL_ACCEPT=0x3002) isn't exposed in the Python module, but can be supplied manually:
s.bind(endpoint)
s.setsockopt(socket.SOL_SOCKET, 0x3002, 1)
s.listen(0)
It's so effective that even the first connect is kept waiting.
On macOS, backlog=0 is reset to backlog=SOMAXCONN, backlog=1 keeps all connections except the first waiting.
I have written a program in golang to make request about 2000qps to different remote ip with local port randomly selected by linux, and close request immediately after connection established, but still encounter bind: address already in use error periodically
what I have done:
net.ipv4.ip_local_port_range is 15000-65535
net.ipv4.tcp_tw_recycle=1 net.ipv4.tcp_tw_reuse=1 net.ipv4.tcp_fin_timeout=30
above is sockstat:
sockets: used 1200 TCP: inuse 2302 orphan 1603 tw 40940 alloc 2325 mem 201
I don't figure it out why this error still there with kernel selecting available local port,will kernel return a port in use ?
This is a good answer from 2012:
https://serverfault.com/questions/342741/what-are-the-ramifications-of-setting-tcp-tw-recycle-reuse-to-1#434669
As of 2018, tcp_tw_recycle exists only in the sysctl binary, is otherwise gone from the kernel:
https://github.com/torvalds/linux/search?utf8=%E2%9C%93&q=tcp_tw_recycle&type=
tcp_tw_reuse is still in use as described in the above answer:
https://github.com/torvalds/linux/blob/master/net/ipv4/tcp_ipv4.c#L128
However, while a TCP_TIMEWAIT_LEN is in use:
https://github.com/torvalds/linux/search?utf8=%E2%9C%93&q=TCP_TIMEWAIT_LEN&type=
the value is hardcoded:
https://github.com/torvalds/linux/blob/master/include/net/tcp.h#L120
and tcp_fin_timeout refers to a different state:
https://github.com/torvalds/linux/blob/master/Documentation/networking/ip-sysctl.txt#L294
One can relatively safely change the local port range to 1025-65535.
For kicks, if there were a situation where this client was talking to servers and network under my control, I would build a new kernel with a not-to-spec TCP_TIMEWAIT_LEN, and perhaps also fiddle with tcp_max_tw_buckets:
https://github.com/torvalds/linux/blob/master/Documentation/networking/ip-sysctl.txt#L379
But doing so in other circumstances- if this client is behind a NAT and talking to common public servers- will likely be disruptive.
I have a stupid problem with scotty web app and mongodb service starting in the right order.
I use systemd to start mongodb first and then the scotty web app. It does not work for some reason. The app errors out with connect: does not exist (Connection refused) from the mongodb driver meaning that the connection is not ready.
So my question. How can I test the connection availability say three times with 0.5s interval and only then error out?
This is the application main function
main :: IO ()
main = do
pool <- createPool (runIOE $ connect $ host "127.0.0.1") close 1 300 5
clearSessions pool
let r = \x -> runReaderT x pool
scottyT 3000 r r basal
basal :: ScottyD ()
basal = do
middleware $ staticPolicy (noDots >-> addBase "static")
notFound $ runSession
routes
Although the app service is ordered after mongodb service the connection to mongodb is still unavailable during the app start up. So I get the above mentioned error.
This is the systemd service file to avoid questions regarding the correct service ordering.
[Unit]
Description=Basal Web Application
Requires=mongodb.service
After=mongodb.service iptables.service network-online.target
[Service]
User=http
Group=http
WorkingDirectory=/srv/http/basal/
ExecStart=/srv/http/basal/bin/basal
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
I don't know why connection to mongodb is not available given the correct service order.
So I want to probe connection availability withing haskell code three times with 0.5s delay and then error out. How can I do it?
Thanks.
I guess from the functions you're using that you're using something like mongoDB 1.5.0.
Here, connect returns something in the IOE monad, which is an alias for ErrorTIOErrorIO.
So the best approach is to use the retrying mechanisms ErrorT offers. As it's an instance of MonadPlus, we can just use mplus if we don't care about checking for the specific error:
retryConnect :: Int -> Int -> Host -> IOE Pipe
retryConnect retries delayInMicroseconds host
| retries > 0 =
connect host `mplus`
(liftIO (threadDelay delayInMicroseconds) >>
retryConnect (retries - 1) delayInMicroseconds host)
| otherwise = connect host
(threadDelay comes from Control.Concurrent).
Then replace connect with retryConnect 2 500000 and it'll retry twice after the first failure with a 500,000 microsecond gap (i.e. 0.5s).
If you do want to check for a specific error, then use catchError instead and inspect the error to decide whether to swallow it or rethrow it.
I'm currently building an environment for deploying a web.application.
The Web.Application uses Enyim.Caching.
There looks to be an issues with the sockets
I'm unfamiliar with membase server, if there is any additional information that I can include in this post please ask...
Any suggestions on what I can check would be greatly appreciated:
Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Pool has been inited for 127.0.0.1:11212 with 10 sockets`
Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Acquiring stream from pool. 127.0.0.1:11212`
Enyim.Caching.Memcached.PooledSocket - Socket 86101442-5fc2-4169-bba2-9f25f1647254 was reset
Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Socket was reset. 86101442-5fc2-4169-bba2-f25f1647254
Enyim.Caching.Memcached.MemcachedNode - System.IO.IOException: Failed to read from the socket '127.0.0.1:11212'. Error: ?
at Enyim.Caching.Memcached.PooledSocket.BasicNetworkStream.Read(Byte[] buffer, Int32 offset, Int32 count) in d:\d\repo\EnyimMemcached\Enyim.Caching\Memcached\BasicNetworkStream.cs:line 92
at System.IO.BufferedStream.ReadByte()
at Enyim.Caching.Memcached.PooledSocket.ReadByte() in
Enyim uses port 11211 by default. It looks like you are trying 11212 instead, try changing to 11211.
I'm currently testing extreme condition on a piece of code written with Erlang.
I have implemented learnyousomeerlang.com's technique of supervisor to have multiple accept capability.
Here the code slightly modified to handle SSL connections of the supervisor:
-module(mymodule).
-behaviour(supervisor).
-export([start/0, start_socket/0]).
-define(SSL_OPTIONS, [{active, true},
{mode, list},
{reuseaddr, true},
{cacertfile, "./ssl_key/server/gd_bundle.crt"},
{certfile, "./ssl_key/server/cert.pem"},
{keyfile, "./ssl_key/server/key.pem"},
{password, "********"}
]).
-export([init/1]).
start_link() ->
application:start(crypto),
crypto:start(),
application:start(public_key),
application:start(ssl),
supervisor:start_link({local, ?MODULE}, ?MODULE, []).
init([]) ->
{ok, LSocket} = ssl:listen(4242, ?SSL_OPTIONS),
spawn_link(fun empty_listeners/0),
{ok, {{simple_one_for_one, 60, 3600},
[{socket,
{mymodule_serv, start_link, [LSocket]}, % pass the socket!
temporary, 1000, worker, [mymodule_serv]}
]}}.
empty_listeners() ->
[start_socket() || _ <- lists:seq(1,100)],
ok.
start_socket() ->
supervisor:start_child(?MODULE, []).
Here's the code for gen_server which will represent every client connecting :
-module(mymodule_serv).
-behaviour(gen_server).
-export([start_link/1]).
-export([init/1, handle_call/3, handle_cast/2, terminate/2, code_change/3, handle_info/2]).
start_link(Socket) ->
gen_server:start_link(?MODULE, Socket, []).
init(Socket) ->
gen_server:cast(self(), accept),
{ok, #client{socket=Socket, pid=self()}}.
handle_call(_E, _From, Client) ->
{noreply, Client}.
handle_cast(accept, C = #client{socket=ListenSocket}) ->
{ok, AcceptSocket} = ssl:transport_accept(ListenSocket),
mymodule:start_socket(),
ssl:ssl_accept(AcceptSocket),
ssl:setopts(AcceptSocket, [{active, true}, {mode, list}]),
{noreply, C#client{socket=AcceptSocket, state=connecting}}.
[...]
I have the ability to launch close to 10.000 connections at once from multiple server.
While it will take 10 second to a ssl accepting bit of C++ code to accept all of them (which don't even have multiple accept pending), in Erlang this is quite different. It will accept at most 20 connections a second (according to netstat info, whilst C++ accept more like 1K connection per seconds)
While the 10K connections are awaiting for acceptance, I'm manually trying to connect as well.
openssl s_client -ssl3 -ign_eof -connect myserver.com:4242
3 cases happen when I do :
Connection simply timeout
Connection will connect after waiting for it 30 sec. at least
Connection will occur almost directly
When I try connecting manually with 2 consoles, the first done handshaking will not always be the first which tried to connect... Which I found particular.
The server configuration is :
2 x Intel® Xeon® E5620
8x 2.4GHz
24 Go RAM
I'm starting the Erlang shell with :
$erl +S 8:8
EDIT 1:
I have even tried to accept the connection with gen_tcp, and upgrading afterwards the connection to a SSL one. Still the same issue, it won't accept more than 10 connections a second... Is ssl:ssl_accept is doing this ? does it lock anything that would prevent Erlang to scale this ?
EDIT 2:
After looking around on other SSL server created in erlang, it seems that they use some kind of driver for SSL/TLS connection, my examples are RabbitMQ and EjabberD.
Nowhere there is ssl:ssl_accept in their Erlang code, I haven't investigate a lot, but it seems they have created their own driver in order to upgrade the TCP Socket to a SSL/TLS one.
Is that because there is an issue with Erlang's module SSL ? Does anyone know why they are using custom driver for SSL/TLS ?
Any thoughts on this ?
Actually it was not the SSL accept or handshake that was slowing the whole thing.
We found on the erlang question list that it was the backlog.
Backlog is set to 5 by default. I have set it to SOMAXCONN and everything works fine now !