Perl and server-client sockets

Background/Context
I have two scripts: a server-side script that can handle multiple clients, and a client-side script that connects to the server. Any message that a client sends to the server is copied/echoed to all the other connected clients.
Where I'm stuck.
This afternoon, I have been grasping at thin air searching for a thorough explanation, with examples, covering everything there is to know about Perl and TCP sockets. A surprisingly large number of results from Google still list articles from 2007-2012. It appears that originally there was the 'Socket' module, and over time IO::Socket was added, then IO::Select. But the Perldoc pages don't cover or reference everything in one place, or provide sufficient cross-referencing links. I gather that most of the raw calls in Socket have an equivalent in IO::Socket. And it's possible (recommended? yes/no?) to do a functional call on the socket if something isn't available via the OO modules...
Problem 1. The far-side/peer has disconnected / the socket is no longer ESTABLISHED?
I have been trying everything I ran across today, including IO::Select with calls to can_read and has_exception, but the output from these shows no difference regardless of whether the socket is up or down. I confirmed from netstat output that the non-blocking socket is torn down instantly by the OS (macOS).
Problem 2. Is there data available to read?
For my previous perl client scripts, I have rolled my own method of using sysread (https://perldoc.perl.org/functions/sysread.html) , but today I noticed that recv is listed within the synopsis on this page near the top https://perldoc.perl.org/IO/Socket.html , but there is no mention of the recv method in the detailed info below...
From other C and Java doco pages, I gather there is a convention of returning undef, 0, >0, and on some implementations -1 when doing the equivalent of sysread. Is there an official perl spec someone can link me to that describes what Perl has implemented? Is sysread or recv the 'right' way to be reading from TCP sockets in the first place?
I haven't provided my code here because I'm asking from a 'best-practices' point of view, what is the 'right' way to do client-server communication? Is polling even the right way to begin with? Is there an event-driven method that I've somehow missed..
My sincere apologies if what I've asked for is already available, but google keeps giving me the same old result pages and derivative blogs/articles that I've already read.
Many thanks in advance.

And it's possible (recommended? yes/no?) to do a functional call on the socket if something isn't available via the OO modules...
I'm not sure which functional calls you refer to which are not available in IO::Socket. But in general IO::Socket objects are also normal file handles. This means you can do things like $server->accept but also accept($server).
Problem 1. The far-side/peer has disconnected / the socket is no longer ESTABLISHED?
This problem is not specific to Perl but how select and the socket API work in general. Perl does not add its own behavior in this regard. In general: If the peer has closed the connection then select will show that the socket is available for read and if one does a read on the socket it will return no data and no error - which means that no more data are available to read from the peer since the peer has properly closed its side of the connection (connection close is not considered an error but normal behavior). Note that it is possible within TCP to still send data to the peer even if the peer has indicated that it will not send any more data.
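The behaviour described above is not Perl-specific, so it can be sketched in any language that exposes select. The following is a minimal Python illustration, assuming a Unix-like system; it uses socket.socketpair() (a local stream socket pair) as a stand-in for a TCP connection, and the function name demo_half_close is made up for this example.

```python
import select
import socket


def demo_half_close():
    """Show what the reader sees after the peer shuts down its sending side."""
    a, b = socket.socketpair()          # stand-in for a connected TCP pair
    try:
        a.shutdown(socket.SHUT_WR)      # a announces: no more data from me
        readable, _, _ = select.select([b], [], [], 1.0)
        data = b.recv(4096)             # empty result: orderly close, not an error
        b.sendall(b"still here")        # the other direction remains usable
        reply = a.recv(4096)
        return (b in readable, data, reply)
    finally:
        a.close()
        b.close()
```

The socket is reported as readable, the read returns zero bytes, and data can still be sent towards the side that closed. In Perl the corresponding signal is sysread returning 0.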
Problem 2. Is there data available to read?
sysread and recv differ in the same way that read and recv/recvmsg differ in the underlying libc. Specifically, recv accepts flags which, for example, allow peeking into the data available in the system's socket buffer without consuming it. See the documentation for more information.
I would recommend using sysread instead of recv, since the behavior of sysread can be redefined when tying a file handle while the behavior of recv cannot. Tying the file handle is done, for example, by IO::Socket::SSL, so that what gets returned is the decrypted data from the SSL socket rather than the raw data from the underlying OS socket.
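To illustrate the peeking flag mentioned above, here is a small Python sketch (Python's socket.recv wraps the same libc call and accepts the same MSG_PEEK flag). The helper name peek_then_read is hypothetical, and a Unix socketpair is assumed as a stand-in for a TCP connection.

```python
import socket


def peek_then_read(sock, n):
    """Look at up to n buffered bytes without consuming them, then read them."""
    peeked = sock.recv(n, socket.MSG_PEEK)   # data stays in the socket buffer
    consumed = sock.recv(n)                  # this call actually removes it
    return peeked, consumed


a, b = socket.socketpair()
a.sendall(b"hello")
result = peek_then_read(b, 5)
a.close()
b.close()
```

Both calls return the same bytes, because the first one left the kernel buffer untouched. A plain read (sysread in Perl) has no equivalent of this flag.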
From other C and Java doco pages, I gather there is a convention of returning undef, 0, >0, and on some implementations -1 when doing the equivalent of sysread. Is there an official perl spec someone can link me to that describes what Perl has implemented?
The behavior of sysread is well documented. To cite from what you get when using perldoc -f sysread:
... Returns the number of bytes
actually read, 0 at end of file, or undef if there was an error
(in the latter case $! is also set).
Apart from that, you state your problem as "Is there data available to read?" but then you only talk about sysread and recv and not about how to check whether data is available before calling these functions. I assume that you are using select (or IO::Select, which is just a wrapper) to do this. While can_read of IO::Select can be used to get this information in most cases, it only reports on the underlying OS socket. With plain sockets this is enough, but when using SSL, for example, there is some internal buffering done in the SSL stack, and can_read might return false even though there is still data available to read in that buffer. See Common Usage Errors: Polling of SSL sockets on how to handle this properly.
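The three outcomes of a read on a non-blocking socket can be sketched as follows. This is a Python illustration rather than Perl, and the function name read_once and its labels are invented for the example. The rough mapping to sysread is: a positive count corresponds to "data", 0 to "eof", and undef with $! set to EAGAIN/EWOULDBLOCK to "again". Other errors surface as exceptions in Python, where Perl returns undef and sets $!.

```python
import socket


def read_once(sock, max_bytes=4096):
    """Classify one read on a non-blocking socket as data, eof or again."""
    try:
        chunk = sock.recv(max_bytes)
    except BlockingIOError:
        return ("again", b"")     # nothing buffered right now; connection still open
    if chunk == b"":
        return ("eof", b"")       # peer closed its side of the connection
    return ("data", chunk)
```

A caller would normally wait in select until the socket is readable, call such a function, and remove the socket from the select set once it reports "eof".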

Related

How do I get useful data from a UDP socket using GNAT.Sockets in Ada?

Summary:
I am writing a server in Ada that should listen and reply to messages received over UDP. I am using the GNAT.Sockets library and have created a socket and bound it to a port. However, I am not sure how to listen for and receive messages on the socket. The Listen_Socket function is for TCP sockets and it seems that using Stream with UDP sockets is not recommended. I have seen the receive_socket and receive_vector procedures as alternatives, but I am not sure how to use them or how to convert the output to a usable format.
More details:
I am writing a server that should reply to messages that it gets over UDP. A minimal example of what I have so far would look like this:
with GNAT.Sockets; use GNAT.Sockets;
procedure udp is
   sock   : Socket_Type;
   family : Family_Type := Family_Inet;
   port   : Port_Type := 12345;
   addr   : Sock_Addr_Type (family);
begin
   Create_Socket (sock, family, Socket_Datagram);
   addr.Addr := Any_Inet_Addr;
   addr.Port := port;
   Bind_Socket (sock, addr);
   -- Listen_Socket (sock); -- A TCP thing, not for UDP.
   -- now what?
end UDP;
For a TCP socket, I can listen, accept, then use the Stream function to get a nice way to read the data (as in 'Read and 'Input). While the Stream function still exists, I have found an archive of a ten year old comp.lang.ada thread in which multiple people say not to use streams with UDP.
Looking in g-socket.ads, I do see alternatives: the receive_socket and receive_vector procedures. However, the output of the former is a Stream_Element_Array (with an offset indicating the length), and the latter has something similar, just with some kind of length associated with each Stream_Element.
According to https://stackoverflow.com/a/40045312/7105391, the way to change these types into a stream, is to not get them in the first place, and instead get a stream, which is not particularly helpful here.
In a GitHub gist I found, Unchecked_Conversion is used to turn the arrays into strings and vice versa, but given that the reference manual (13.13.1) says that type Stream_Element is mod <implementation-defined>;, I'm not entirely comfortable using that approach.
All in all, I'm pretty confused about how I'm supposed to do this. I'm even more confused about the lack of examples online, as this should be a pretty basic thing to do.

Can you _close(), _read(), and _write() a socket on windows?

MSDN says closesocket() is the function to use. However, I couldn't help wondering if _close() will work also?
MSDN appears to say no in their description of SOCKET type: (http://msdn.microsoft.com/en-us/windows/ms740516(v=vs.80)):
In Winsock applications, a socket descriptor is not a file descriptor and must be used with the Winsock functions.
and more specifically from its note on renamed socket functions:
Sockets are represented by standard file descriptors in Berkeley Sockets, so the close function can be used to close sockets as well as regular files. While nothing in Windows Sockets prevents an implementation from using regular file handles to identify sockets, nothing requires it either. On Windows, sockets must be closed by using the closesocket routine. On Windows, using the close function to close a socket is incorrect and the effects of doing so are undefined by this specification.
However, even despite the above, some of the Windows file functions might work with sockets in practice:
Given that ReadFile and WriteFile work on sockets, I suspect _read and _write, for instance, might also work with sockets as well as file handles.
MSDN's overview of socket handles states:
A socket handle can optionally be a file handle in Windows Sockets 2. A socket handle from a Winsock provider can be used with other non-Winsock functions such as ReadFile, WriteFile, ReadFileEx, and WriteFileEx.
The short answer is no. Socket handles on Windows are not file handles as they are on Unix. There's special support such that the low-level Win32 APIs, ReadFile and WriteFile, can work with a socket handle. But that's likely where it ends.
With regard to _open_osfhandle: yes, that will possibly work in a very limited sense, but there are good reasons why you shouldn't do this. Most of the following I inferred just by browsing the sources of open, read, write, close, and open_osfhandle in the CRT sources (which come with Visual Studio).
There's a lot of buffering that goes on with the CRT read/write calls. Any attempt to mix read/write with recv/send will be going into undefined behavior.
Performance. Just look at the source of read() and write() as seen in the CRT sources. A lot of wrapper code to eventually call ReadFile and WriteFile, which in turn have to forward to the actual socket API.
Socket error codes may not bubble to the file API as you think. Remember socket API errors get returned through WSAGetLastError. Win32 file IO calls get bubbled up through GetLastError. So if your call to write() hits a socket error, it might try to map the return value via GetLastError, which still returns success.
close() won't properly close the socket handle since it maps to CloseHandle, not closesocket.
More insightful than MSDN: [http://tangentsoft.net/wskfaq/articles/bsd-compatibility.html]
No they are definitely different in Windows 3.x, 9x:
'In Windows 3.1 and 95 Windows sockets and file handles are
completely distinct. In Windows NT, however, it appears they may usually be one and the same.'
And no, you can't use them together on Windows NT - at least not by default:
'The Visual C++ RTL emulates POSIX functions, except that they’re named
with a leading underscore: for example, _read() instead of read(). The
_read() function uses ReadFile() internally, so you’d think it would work with sockets. The problem is, the first argument is an
RTL-specific handle, not an operating system file handle. If you pass
a socket handle to _read() or _write(), the RTL will realize that it
isn’t an RTL handle and the call will fail.
'Fortunately, there is a bridge function in Visual C++’s RTL:
_open_osfhandle(). (If you’re not using Visual C++, you’ll have to check its RTL source for a similar function.) I’ve not tried it, but
it appears to take an operating system file handle (including socket
handles) and return a handle you can use with the POSIX emulation
functions in the RTL. I’m told that this will work with sanely-coded
non-Microsoft Winsock stacks, but since I haven’t tried it, you should
if you want to support these alternate stacks.'

Lua sockets - Asynchronous Events

In current lua sockets implementation, I see that we have to install a timer that calls back periodically so that we check in a non blocking API to see if we have received anything.
This is all well and good; however, in the UDP case, if the sender is sending a lot of data, do we risk losing some of it? Say another device sends a 2MB photo via UDP and we check socket receive every 100 ms. At 2 Mbit/s, the underlying system must store 200 Kbit before our call queries the underlying TCP stack.
Is there a way to get an event fired when we receive the data on the particular socket instead of the polling we have to do now?
There are various ways of handling this issue; which one you select depends on how much work you want to do.*
But first, you should clarify (to yourself) whether you are dealing with UDP or TCP; there is no "underlying TCP stack" for UDP sockets. Also, UDP is the wrong protocol for sending complete pieces of data such as a text or a photo; it is an unreliable protocol, so you aren't guaranteed to receive every packet unless you're using a managed socket library (such as ENet).
Lua51/LuaJIT + LuaSocket
Polling is the only method.
Blocking: call socket.select with no time argument and wait for the socket to be readable.
Non-blocking: call socket.select with a timeout argument of 0, and use sock:settimeout(0) on the socket you're reading from.
Then simply call these repeatedly.
I would suggest using a coroutine scheduler for the non-blocking version, to allow other parts of the program to continue executing without causing too much delay.
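The non-blocking variant can be sketched as below. This is written in Python instead of Lua, since Python's select.select and non-blocking sockets behave like socket.select with a timeout of 0 and sock:settimeout(0). The helper name drain_datagrams is hypothetical. It reads everything already queued on each poll, so that a slow polling interval is less likely to overflow the OS receive buffer.

```python
import select
import socket


def drain_datagrams(sock, max_size=65535):
    """Collect every datagram already queued on a UDP socket, without blocking."""
    received = []
    while True:
        readable, _, _ = select.select([sock], [], [], 0)   # timeout 0: poll only
        if not readable:
            return received
        try:
            data, _addr = sock.recvfrom(max_size)
        except BlockingIOError:
            return received        # raced with another reader; nothing left
        received.append(data)
```

A timer callback or coroutine scheduler would call this once per tick and process the returned list.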
Lua51/LuaJIT + LuaSocket + Lua Lanes (Recommended)
Same as the above method, but the socket exists in another lane (a lightweight Lua state in another thread) made using Lua Lanes (latest source). This allows you to instantly read the data from the socket and into a buffer. Then, you use a linda to send the data to the main thread for processing.
This is probably the best solution to your problem.
I've made a simple example of this, available here. It relies on Lua Lanes 3.4.0 (GitHub repo) and a patched LuaSocket 2.0.2 (source, patch, blog post re' patch)
The results are promising, though you should definitely refactor my example code if you derive from it.
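The same shape (a dedicated reader that hands data to the main thread through a queue) can be sketched in Python with threading and queue.Queue standing in for a lane and a linda. This is only an analogy and not the linked example; start_reader and the stop event are names made up for this sketch.

```python
import queue
import socket
import threading


def start_reader(sock, out_queue, stop, max_size=65535):
    """Read datagrams on a background thread and hand them to out_queue."""
    sock.settimeout(0.2)              # wake up regularly to check the stop flag

    def run():
        while not stop.is_set():
            try:
                data = sock.recv(max_size)
            except socket.timeout:
                continue              # no datagram yet; check stop and retry
            except OSError:
                break                 # socket was closed underneath us
            out_queue.put(data)

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread
```

The main thread then calls out_queue.get() (or get_nowait() from its own loop) whenever it is ready to process data, while the reader keeps emptying the socket buffer in the meantime.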
LuaJIT + OS-specific sockets
If you're a little masochistic, you can try implementing a socket library from scratch. LuaJIT's FFI library makes this possible from pure Lua. Lua Lanes would be useful for this as well.
For Windows, I suggest taking a look at William Adam's blog. He's had some very interesting adventures with LuaJIT and Windows development. As for Linux and the rest, look at tutorials for C or the source of LuaSocket and translate them to LuaJIT FFI operations.
(LuaJIT supports callbacks if the API requires it; however, there is a significant performance cost compared to polling from Lua to C.)
LuaJIT + ENet
ENet is a great library. It provides the perfect mix between TCP and UDP: reliable when desired, unreliable otherwise. It also abstracts operating system specific details, much like LuaSocket does. You can use the Lua API to bind it, or directly access it via LuaJIT's FFI (recommended).
* Pun unintentional.
I use lua-ev https://github.com/brimworks/lua-ev for all IO-multiplexing stuff.
It is very easy to use and fits into Lua (and its functions) like a charm. It is based on select/poll/epoll or kqueue and performs very well too.
local ev = require'ev'
local loop = ev.Loop.default
local udp_sock -- your udp socket instance
udp_sock:settimeout(0) -- make non blocking
local udp_receive_io = ev.IO.new(function(io, loop)
  local chunk, err = udp_sock:receive(4096)
  if chunk and not err then
    -- process data
  end
end, udp_sock:getfd(), ev.READ)
udp_receive_io:start(loop)
loop:loop() -- blocks forever
In my opinion Lua+luasocket+lua-ev is just a dream team for building efficient and robust networking applications (for embedded devices/environments). There are more powerful tools out there! But if your resources are limited, Lua is a good choice!
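For comparison, the same callback-per-socket pattern can be sketched with Python's standard selectors module, which also sits on top of select/poll/epoll/kqueue. This is an analogy to the lua-ev snippet, not a translation of its API; watch_udp and run_until_idle are hypothetical helper names.

```python
import selectors
import socket


def watch_udp(selector, sock, on_datagram):
    """Register sock so that on_datagram(data, addr) runs when it is readable."""
    sock.setblocking(False)

    def on_readable(s):
        try:
            data, addr = s.recvfrom(4096)
        except BlockingIOError:
            return                    # spurious wakeup; nothing to read
        on_datagram(data, addr)

    selector.register(sock, selectors.EVENT_READ, on_readable)


def run_until_idle(selector, idle_timeout=0.1):
    """Dispatch callbacks until no socket becomes ready within idle_timeout."""
    while True:
        events = selector.select(timeout=idle_timeout)
        if not events:
            return
        for key, _mask in events:
            key.data(key.fileobj)     # key.data is the registered callback
```

A real application would loop forever instead of stopping when idle; the idle timeout is only there so the sketch terminates.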
Lua is inherently single-threaded; there is no such thing as an "event". There is no way to interrupt executing Lua code. So while you could rig something up that looked like an event, you'd only ever get one if you called a function that polled which events were available.
Generally, if you're trying to use Lua for this kind of low-level work, you're using the wrong tool. You should be using C or something to access this sort of data, then pass it along to Lua when it's ready.
You are probably using a non-blocking select() to "poll" sockets for any new data available. Luasocket doesn't provide any other interface to see if there is new data available (as far as I know), but if you are concerned that it's taking too much time when you are doing this 10 times per second, consider writing a simplified version that only checks one socket you need and avoids creating and throwing away Lua tables. If that's not an option, consider passing nil to select() instead of {} for those lists you don't need to read and pass static tables instead of temporary ones:
local rset = {socket}
... later
...select(rset, nil, 0)
instead of
...select({socket}, {}, 0)

How to set a timeout in connect/send ? ( as400 iseries v5r4, rpg )

From this rpg socket tutorial we created a socket client in rpg that calls a java server socket
The problem is that the connect()/send() operations block, and we have a requirement that if the connect/send cannot be done within, say, a second, we just log it and finish.
If I set the socket to non-blocking mode (I think with fcntl), we do not fully understand how to proceed, and we can't find any useful documentation with examples for it.
I think that if I connect on a non-blocking socket, I have to call select(..., timeout), which tells us whether the connect succeeded and whether we are able to send(bytes). But if we send(bytes) afterwards, as it is now a non-blocking socket (which will return immediately after the call), how do I know that send() actually sent the bytes to the server before closing the socket?
I can fall back to have the client socket in AS400 as a Java or C procedure, but I really want to just keep it in a simple RPG program.
Would somebody help me understand how to do that please ?
Thanks !
In my opinion, that RPG tutorial you mention has a slight defect. What I believe is causing your confusion is the following section's code:
...
Consequently, we typically call the
send() API like this:
D miscdata S 25A
D rc S 10I 0
C eval miscdata = 'The data to send goes here'
C eval rc = send(s: %addr(miscdata): 25: 0)
c if rc < 25
C* for some reason we weren't able to send all 25 bytes!
C endif
...
If you read the documentation of send(), you will see that the return value does not indicate an error if it is greater than -1, yet the code above treats a short count as if an error had occurred. A return value smaller than the requested length only means that part of the buffer was accepted; you then call send() again for the remainder. The sum of the return values must eventually equal the size of the buffer, assuming that you keep moving the pointer into the buffer to reflect what has already been sent. Look here in Beej's Guide to Network Programming. You might also like to look at Richard Stevens' book UNIX Network Programming, Volume 1 for really detailed explanations.
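The "keep calling send() and advance the pointer" loop looks like this as a Python sketch rather than RPG; the structure carries over directly. The name send_all is made up here. On a blocking socket, errors are raised as exceptions, which corresponds to send() returning -1 in C or RPG.

```python
import socket


def send_all(sock, payload):
    """Call send() repeatedly until every byte of payload has been accepted."""
    view = memoryview(payload)
    total = 0
    while total < len(view):
        sent = sock.send(view[total:])   # may accept fewer bytes than offered
        total += sent                    # advance past what was accepted
    return total
```

Python already ships this loop as sock.sendall(); it is spelled out here only to show the handling of the return value.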
As to the problem of determining whether the last send before close() actually sent the data: the paragraph above explains how to determine what portion of the data was accepted. Calling close() will still attempt to deliver all unsent data, unless SO_LINGER is enabled with a zero timeout, in which case the unsent data is discarded.
fcntl() is used to control blocking, while setsockopt() is used to set SO_LINGER.
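As a rough illustration of these two calls, here is a Python sketch that uses the same underlying system calls on a Unix-like system. The helper names are invented, and the two-int layout of struct linger is an assumption that holds on Linux and the BSDs. In everyday Python one would call sock.setblocking(False) instead of using fcntl directly; the direct form is shown because it mirrors the C and RPG APIs.

```python
import fcntl
import os
import socket
import struct


def set_nonblocking(sock):
    """Set O_NONBLOCK on the descriptor, as fcntl(fd, F_SETFL, ...) does in C."""
    flags = fcntl.fcntl(sock.fileno(), fcntl.F_GETFL)
    fcntl.fcntl(sock.fileno(), fcntl.F_SETFL, flags | os.O_NONBLOCK)


def set_linger(sock, seconds):
    """Enable SO_LINGER; struct linger is assumed to be { int l_onoff; int l_linger; }."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, seconds))


def get_linger(sock):
    """Return (l_onoff, l_linger) as currently set on the socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize("ii"))
    return struct.unpack("ii", raw)
```

With a positive linger time, close() blocks for up to that many seconds while unsent data is delivered; with a linger time of 0, close() discards unsent data and resets the connection.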
The abstraction of network communications being used is BSD sockets. There are some slight differences in implementations across OS's but it is generally quite homogeneous. This means that one can generally use documentation written for other OS's for the broad overview. Most of the time.
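For the original requirement (give up if connect() has not completed within about a second), the usual BSD-sockets sequence is: make the socket non-blocking, call connect(), wait in select() for writability with a timeout, then read SO_ERROR to learn whether the connect succeeded. Here is a Python sketch of that sequence, assuming a Unix-like system; connect_with_timeout is a made-up name, and an RPG version would call the same APIs in the same order.

```python
import errno
import os
import select
import socket


def connect_with_timeout(address, timeout=1.0):
    """Return a connected TCP socket, or raise OSError / TimeoutError."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    try:
        rc = sock.connect_ex(address)            # returns at once
        if rc not in (0, errno.EINPROGRESS):
            raise OSError(rc, os.strerror(rc))
        # The socket becomes writable when the connect attempt has finished,
        # whether it succeeded or failed.
        _, writable, _ = select.select([], [sock], [], timeout)
        if not writable:
            raise TimeoutError("connect did not finish within %.1fs" % timeout)
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
        if err != 0:
            raise OSError(err, os.strerror(err))
        sock.setblocking(True)                   # back to blocking for send()
        return sock
    except Exception:
        sock.close()
        raise
```

Switching back to blocking mode after the connect keeps the subsequent send() logic simple; a send timeout can then be handled separately if needed.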

Socket Read In Multi-Threaded Application Returns Zero Bytes or EINTR (104)

I have been a C coder for a while now, neither a newbie nor an expert. I have a daemonized application in C on PPC Linux. I use PHP's socket_connect as a client to connect to this service locally. The server uses epoll to multiplex connections over a Unix socket. A user-submitted string is parsed for certain characters/words using strstr() and, if they are found, the server spawns 4 joinable threads that contact different websites simultaneously. In each thread I use socket, connect, write and read to interact with the web servers via TCP on port 80. All connections and writes seem successful. Reads from the web server sockets fail, however, in one of two ways: (A) all 3 threads seem to hang, and only one thread returns -1 with errno set to 104. The responding thread takes about 10 minutes - an eternity :-(. *I read somewhere that 104 (is it EINTR?) in the network context suggests that 'the connection was reset by peer'; or (B) 3 threads return 0 bytes, and only 1 of the 4 threads actually returns some data. Aren't socket read/write thread-safe? I use thread-safe (and reentrant) libc functions such as strtok_r, gethostbyname_r, etc.
*I doubt that the said webhosts are actually resetting the connection, because when I run a single-threaded standalone (everything else equal) all things works perfectly right, but of course in series not parallel.
There's a second problem too (oops): I can't write back to the client that connects to my epoll-ed Unix socket. My daemon application hangs and hogs more than 100% CPU forever, yet nothing is written to the client's end. I am sure the client (a very typical PHP socket application) hasn't closed the connection when this happens, and no errors are detected either. Any ideas?
I cannot figure out what is wrong, even with Valgrind, GDB or much logging. Kindly help where you can.
Yes, read/write are thread-safe. But beware of gethostbyname() and getservbyname() if you're using them - they return pointers to static data, and may not be thread-safe.
errno 104 is ECONNRESET (not EINTR). Use strerror or perror to get the textual error message (like 'Connection reset by peer') for a particular errno code.
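A quick way to check what a numeric errno means on a given machine is to look it up by name. Here is a small Python sketch; describe_errno is a hypothetical helper, and Python's os.strerror wraps the same strerror mentioned above. The numeric values are platform-specific; on most Linux architectures ECONNRESET is 104 and EINTR is 4.

```python
import errno
import os


def describe_errno(code):
    """Return the symbolic name, number and message for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return "%s (%d): %s" % (name, code, os.strerror(code))


reset_text = describe_errno(errno.ECONNRESET)
intr_text = describe_errno(errno.EINTR)
```

In C, the equivalent is to log strerror(errno) immediately after the failing call, before any other library call can overwrite errno.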
The best way to figure out what's going wrong is often to do very detailed logging - log the results of every operation, plus details like the IP address/port connecting to, the number of bytes read/written, the thread id, and so forth. And, of course, make sure your logging code is thread-safe :-)
Getting an ECONNRESET after 10 minutes sounds like the result of your connection timing out. Either the web server isn't sending the data or your app isn't receiving it.
To test the former, hook up a program like Wireshark to the local loopback device and look for traffic to and from the port you are using.
For the latter, take a look at the epoll man page. It describes a scenario where using edge-triggered events can result in a lockup: there is still data in the buffer, but no new data comes in, so no new event is triggered.
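The edge-triggered scenario can be reproduced in a few lines. The sketch below uses Python's select.epoll (Linux only), which wraps the same epoll calls as the C API; edge_trigger_demo and drain are made-up names, and a local socketpair stands in for the real connection. It reads only part of the buffered data after the first event and then shows that a second poll reports nothing, even though bytes remain. The usual remedy is to read until the socket would block.

```python
import select
import socket


def drain(sock, chunk_size=4096):
    """Read from a non-blocking socket until it would block; return (data, eof)."""
    parts = []
    while True:
        try:
            chunk = sock.recv(chunk_size)
        except BlockingIOError:
            return b"".join(parts), False    # buffer empty: safe to wait again
        if not chunk:
            return b"".join(parts), True     # peer closed the connection
        parts.append(chunk)


def edge_trigger_demo():
    a, b = socket.socketpair()
    b.setblocking(False)
    ep = select.epoll()
    try:
        ep.register(b.fileno(), select.EPOLLIN | select.EPOLLET)
        a.sendall(b"0123456789")
        first = ep.poll(1.0)      # one notification for the newly arrived data
        b.recv(4)                 # read only part of it
        second = ep.poll(0)       # no new data arrived, so no new event
        rest, eof = drain(b)      # the remaining bytes were there all along
        return len(first), len(second), rest, eof
    finally:
        ep.close()
        a.close()
        b.close()
```

With level-triggered epoll (no EPOLLET) the second poll would report the socket again, which is why partial reads are harmless there.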