Unable to accept connections on a socket when creating sockets on a remote node via RPC in Erlang

I am struggling to identify the reason for gen_tcp:accept always returning an {error, closed} response.
Essentially, I have a supervisor that creates a listening socket:
gen_tcp:listen(8081, [binary, {packet, 0}, {active, false}, {reuseaddr, true}]),
This socket is then passed to a child, which is an implementation of the gen_server behaviour. The child then accepts connections on the socket.
accept(ListeningSocket, {ok, Socket}) ->
    spawn(fun() -> loop(Socket) end),
    accept(ListeningSocket);
accept(_ListeningSocket, {error, Error}) ->
    io:format("Unable to listen on socket: ~p.~n", [Error]),
    gen_server:call(self(), stop).

accept(ListeningSocket) ->
    accept(ListeningSocket, gen_tcp:accept(ListeningSocket)).
loop(Socket) ->
    case gen_tcp:recv(Socket, 0) of
        {ok, Data} ->
            io:format("~p~n", [Data]),
            process_request(Data),
            gen_tcp:send(Socket, Data),
            loop(Socket);
        {error, closed} -> ok
    end.
I load the supervisor and gen_server BEAM binaries locally, and load them on another node (which runs on the same machine) via an RPC call to code:load_binary.
Next, I execute the supervisor via an RPC call, which in turn starts the server. In this scenario, gen_tcp:accept always returns {error, closed}.
If I run the supervisor and server while logged in to a node shell, the server can accept connections without issue. This includes using 'remsh' to connect to the very remote node that failed to accept connections when I had previously RPCed it to start the server.
I seem to be able to replicate the issue by using the shell alone:
[Terminal 1]: erl -sname node -setcookie abc -distributed -noshell
[Terminal 2]: erl -sname rpc -setcookie abc, then:
net_adm:ping('node@verne').
{ok, ListeningSocket} = rpc:call('node@verne', gen_tcp, listen, [8081, [binary, {packet, 0}, {active, true}, {reuseaddr, true}]]).
rpc:call('node@verne', gen_tcp, accept, [ListeningSocket]).
The response to the final RPC is {error, closed}.
Could this be something to do with socket/port ownership?
In case it is of help, there are no clients waiting to connect, and I don't set timeouts anywhere.

Each rpc:call starts a new process on the target node to handle the request. In your final example, your first call creates a listen socket within such a process, and when that process dies at the end of the rpc call, the socket is closed. Your second rpc call to attempt an accept therefore fails due to the already-closed listen socket.
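For illustration, here is a minimal sketch of one way around this (assuming the module containing the question's accept/1 loop is already loaded on the target node, as described above): spawn a single long-lived process on that node which both listens and accepts, so the listen socket's owner never exits between calls:
%% Sketch only: listen and accept inside one long-lived remote process,
%% so the listen socket is not closed when an rpc call returns.
start_remote(Node) ->
    spawn(Node, fun() ->
        {ok, LSock} = gen_tcp:listen(8081, [binary, {packet, 0},
                                            {active, false},
                                            {reuseaddr, true}]),
        accept(LSock)   %% the accept/1 loop from the question
    end).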
Your design seems unusual in several ways. For example, it's not normal to have supervisors opening sockets. You also say the child is a gen_server yet you show a manual recv loop, which if run within a gen_server would block it. You might instead explain what you're trying to accomplish and request help on coming up with a design to meet your goals.
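As a rough sketch of a more conventional shape (a hypothetical module, not your code): the gen_server defers the accept out of init/1 via a cast to itself, then receives data as messages with {active, once}, so it never blocks in recv:
-module(acceptor_sketch).  %% hypothetical name
-behaviour(gen_server).
-export([start_link/1, init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link(LSock) ->
    gen_server:start_link(?MODULE, LSock, []).

init(LSock) ->
    gen_server:cast(self(), accept),  %% defer the blocking accept out of init/1
    {ok, LSock}.

handle_cast(accept, LSock) ->
    {ok, Sock} = gen_tcp:accept(LSock),
    ok = inet:setopts(Sock, [{active, once}]),
    {noreply, Sock}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_info({tcp, Sock, Data}, Sock) ->
    gen_tcp:send(Sock, Data),                   %% echo, as in the question
    ok = inet:setopts(Sock, [{active, once}]),
    {noreply, Sock};
handle_info({tcp_closed, Sock}, Sock) ->
    {stop, normal, Sock}.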

No output from erlang tracer

I have a module my_api with a function handle/2 that is a callback for Cowboy's request handling. When I make an HTTP request like this:
curl http://localhost/test
the function is called, and it's working correctly, because I get a response in the terminal.
But in another terminal I attach to my application with remsh and try to trace calls to that function with the dbg module, like this:
dbg:tracer().
dbg:tp(my_api, handle, 2, []).
dbg:p(all, c).
I expected that when I then made an HTTP request to my API from another terminal, the function my_api:handle/2 would be called and I would get some info about the call (at least the function arguments) in the terminal attached to the node, but I get nothing there. What am I missing?
When you call dbg:tracer/0, a tracer of type process is started with a message handler that sends all trace messages to the user I/O device. Your remote shell's group leader is independent of the user I/O device, so your shell doesn't receive the output sent to user.
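You can see the distinction from within a remsh: output written explicitly to the user device lands on the server node's console rather than in your remote shell:
%% In a remsh to the server node:
io:format("hello~n", []).        %% printed in your remote shell (group leader)
io:format(user, "hello~n", []).  %% printed on the server node's console instead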
One approach that lets you see trace output is to set up a trace port on the server and a trace client on a separate node. If you want traces from node foo, first remsh to it:
$ erl -sname bar -remsh foo
Then set up a trace port. Here, we set up a TCP/IP trace port on port 50000 (use any port you like, as long as it's available to you):
1> dbg:tracer(port, dbg:trace_port(ip, 50000)).
Next, set up the trace parameters as you did before:
2> dbg:tp(my_api, handle, 2, []).
{ok, ...}
3> dbg:p(all, c).
{ok, ...}
Then exit the remsh, and start a node without remsh:
$ erl -sname bar
On this node, start a TCP/IP trace client attached to port 50000:
1> dbg:trace_client(ip, {"localhost", 50000}).
This shell will now receive dbg trace messages from foo. Here, we used "localhost" as the hostname since this node is running on the same host as the server node, but you'll need to use a different hostname if your client is running on a separate host.
Another approach, which is easier but relies on an undocumented function and so might break in the future, is to remsh to the node to be traced as you originally did but then use dbg:tracer/2 to send dbg output to your remote shell's group leader:
1> dbg:tracer(process, {fun dbg:dhandler/2, group_leader()}).
{ok, ...}
2> dbg:tp(my_api, handle, 2, []).
{ok, ...}
3> dbg:p(all, c).
{ok, ...}
Since this relies on the dbg:dhandler/2 function, which is exported but undocumented, there's no guarantee it will always work.
Lastly, since you're tracing all processes, please pay attention to the potential problems described in the dbg man page, and always be sure to call dbg:stop_clear() when you're finished tracing.

Writing a parallel TCP server in Erlang

"Programming Erlang Software for a Concurrent World" says to write a parallel TCP server do like this:
start_parallel_server() ->
    {ok, Listen} = gen_tcp:listen(...),
    spawn(fun() -> par_connect(Listen) end).

par_connect(Listen) ->
    {ok, Socket} = gen_tcp:accept(Listen),
    spawn(fun() -> par_connect(Listen) end),
    loop(Socket).

loop(...) -> %% handle request here
When start_parallel_server finishes its work, it will close the listen socket. Shouldn't we add something like timer:sleep(infinity) at the end of it?
If you run start_parallel_server() from the shell, the shell process will own the listening socket, so the socket will stay open as long as that shell process is alive. Note that the shell process dies on exceptions, and a new shell process is respawned, which can cause confusion.
But if you, for example, spawn a new process that in turn calls start_parallel_server(), you will need a sleep in that spawned process to keep it alive, as in the sketch below.
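A minimal sketch of that (the wrapper name start/0 is hypothetical):
start() ->
    spawn(fun() ->
        start_parallel_server(),  %% this process now owns the listen socket
        timer:sleep(infinity)     %% keep it alive so the socket stays open
    end).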
Moreover, for real-world applications, https://github.com/extend/ranch is more suitable.

Lua Socket cannot be properly stopped by Ctrl+C

I have a standalone Lua script that uses LuaSocket to connect to a server via TCP/IP. It uses the receive call to get data from that server. It works; however, when I try to stop it with Ctrl+C, one of two scenarios plays out:
- If there is currently no traffic and receive is waiting, Ctrl+C has no effect. The program continues to run and has to be terminated with kill.
- If there is traffic, the program exits with the printout below, with the socket still open and the server not accepting another connection:
lua: luaSocketTest.lua:15: interrupted!
stack traceback:
[C]: in function 'receive'
luaSocketTest.lua:15: in function 'doWork'
luaSocketTest.lua:22: in main chunk
[C]: ?
I tried using pcall to solve the second scenario, without success: pcall doesn't return, and the process still throws the error.
Sample of my program is below:
local socket = require("socket")
local ip = "localhost"
local port = 5003

function doWork ()
    print("Starting socket: "..ip..":"..port)
    client = assert(socket.connect(ip, port))
    print("Socket Accepted")
    client:send("TEST TEST")
    while 1 do
        local byte, err = client:receive(1)
        if not err then
            print(byte)
        end
    end
end

while 1 do
    local status = pcall(doWork())
    print("EXITED PCALL WITH STATUS: "..tostring(status))
    if not status then client:close() end
end
This would be quite a change, but you could employ lua-ev. It allows you to add signal handlers, which is exactly what is required to react to Ctrl+C.
local socket = require'socket'
local ev = require'ev'

-- make connect and send in blocking mode
local client = socket.connect(ip, port)
client:send('TEST TEST')

-- make the client non-blocking
client:settimeout(0)

ev.IO.new(function()
    repeat
        local data, err, part = client:receive(10000)
        print('received', data or part)
    until err
end, client:getfd(), ev.READ):start(ev.Loop.default)

local SIGINT = 2
ev.Signal.new(function()
    print('SIGINT received')
end, SIGINT):start(ev.Loop.default)

ev.Loop.default:loop()

Erlang accepting SSL connections is really slow (compared to C++)

I'm currently testing extreme conditions on a piece of code written in Erlang.
I have implemented learnyousomeerlang.com's technique of using a supervisor to get multiple-accept capability.
Here is the supervisor code, slightly modified to handle SSL connections:
-module(mymodule).
-behaviour(supervisor).
-export([start_link/0, start_socket/0]).
-export([init/1]).

-define(SSL_OPTIONS, [{active, true},
                      {mode, list},
                      {reuseaddr, true},
                      {cacertfile, "./ssl_key/server/gd_bundle.crt"},
                      {certfile, "./ssl_key/server/cert.pem"},
                      {keyfile, "./ssl_key/server/key.pem"},
                      {password, "********"}
                     ]).

start_link() ->
    application:start(crypto),
    application:start(public_key),
    application:start(ssl),
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    {ok, LSocket} = ssl:listen(4242, ?SSL_OPTIONS),
    spawn_link(fun empty_listeners/0),
    {ok, {{simple_one_for_one, 60, 3600},
          [{socket,
            {mymodule_serv, start_link, [LSocket]}, % pass the socket!
            temporary, 1000, worker, [mymodule_serv]}
          ]}}.

empty_listeners() ->
    [start_socket() || _ <- lists:seq(1, 100)],
    ok.

start_socket() ->
    supervisor:start_child(?MODULE, []).
Here's the code for the gen_server that will represent every connecting client:
-module(mymodule_serv).
-behaviour(gen_server).
-export([start_link/1]).
-export([init/1, handle_call/3, handle_cast/2, terminate/2, code_change/3, handle_info/2]).

start_link(Socket) ->
    gen_server:start_link(?MODULE, Socket, []).

init(Socket) ->
    gen_server:cast(self(), accept),
    {ok, #client{socket=Socket, pid=self()}}.

handle_call(_E, _From, Client) ->
    {noreply, Client}.

handle_cast(accept, C = #client{socket=ListenSocket}) ->
    {ok, AcceptSocket} = ssl:transport_accept(ListenSocket),
    mymodule:start_socket(),
    ssl:ssl_accept(AcceptSocket),
    ssl:setopts(AcceptSocket, [{active, true}, {mode, list}]),
    {noreply, C#client{socket=AcceptSocket, state=connecting}}.
[...]
I have the ability to launch close to 10,000 connections at once from multiple servers.
While it takes about 10 seconds for an SSL-accepting bit of C++ code to accept all of them (and it doesn't even have multiple accepts pending), in Erlang this is quite different. It accepts at most 20 connections a second (according to netstat, whereas the C++ code accepts more like 1K connections per second).
While the 10K connections are awaiting acceptance, I'm manually trying to connect as well:
openssl s_client -ssl3 -ign_eof -connect myserver.com:4242
Three things can happen when I do:
The connection simply times out
The connection succeeds after waiting at least 30 seconds
The connection occurs almost immediately
When I try connecting manually from 2 consoles, the first to finish handshaking is not always the first that tried to connect... which I find peculiar.
The server configuration is :
2 x Intel® Xeon® E5620
8x 2.4GHz
24 GB RAM
I'm starting the Erlang shell with :
$ erl +S 8:8
EDIT 1:
I have even tried accepting the connection with gen_tcp and upgrading the connection to SSL afterwards. Still the same issue: it won't accept more than 10 connections a second... Is ssl:ssl_accept doing this? Does it lock anything that would prevent Erlang from scaling?
EDIT 2:
After looking at other SSL servers written in Erlang, it seems that they use some kind of driver for the SSL/TLS connection; my examples are RabbitMQ and ejabberd.
ssl:ssl_accept appears nowhere in their Erlang code. I haven't investigated much, but it seems they have created their own drivers to upgrade the TCP socket to an SSL/TLS one.
Is that because there is an issue with Erlang's ssl module? Does anyone know why they use custom drivers for SSL/TLS?
Any thoughts on this ?
Actually, it was not the SSL accept or handshake that was slowing the whole thing down.
We found out on the erlang-questions mailing list that it was the backlog.
The backlog is set to 5 by default. I have set it to SOMAXCONN and everything works fine now!
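For reference, a minimal sketch of passing the backlog option to the listen call from the question (1024 here merely stands in for the OS-specific value of SOMAXCONN):
%% {backlog, N} is an inet option accepted by ssl:listen/2; the default is 5.
{ok, LSocket} = ssl:listen(4242, [{backlog, 1024} | ?SSL_OPTIONS]).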

Erlang catch disconnect client

I have a TCP server written in Erlang, and a command handler. If a client connects to my server and then closes the connection, how can I catch the network disconnect?
I presume you are using vanilla gen_tcp to implement your server.
In that case, the acceptor process (the process you pass the socket to) will receive a {tcp_closed, Socket} message when the socket is closed from the client end.
Here is sample code from the Erlang gen_tcp documentation:
start(LPort) ->
    case gen_tcp:listen(LPort, [{active, false}, {packet, 2}]) of
        {ok, ListenSock} ->
            spawn(fun() -> server(ListenSock) end);
        {error, Reason} ->
            {error, Reason}
    end.

server(LS) ->
    case gen_tcp:accept(LS) of
        {ok, S} ->
            loop(S),
            server(LS);
        Other ->
            io:format("accept returned ~w - goodbye!~n", [Other]),
            ok
    end.

loop(S) ->
    inet:setopts(S, [{active, once}]),
    receive
        {tcp, S, Data} ->
            Answer = do_something_with(Data),
            gen_tcp:send(S, Answer),
            loop(S);
        {tcp_closed, S} ->
            io:format("Socket ~w closed [~w]~n", [S, self()]),
            ok
    end.
Are you using a separate linked process to handle commands from each client?
If so, you can consider trapping exits in the main process, as in the sketch below.
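A minimal sketch of that idea (handle_client/0 is a hypothetical per-client handler):
%% Trap exits so the death of a linked per-client process arrives as a
%% message instead of killing the main process.
main_loop() ->
    process_flag(trap_exit, true),
    Pid = spawn_link(fun() -> handle_client() end),
    receive
        {'EXIT', Pid, Reason} ->
            io:format("client handler ~p exited: ~p~n", [Pid, Reason])
    end.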