I presume that (x)inetd works like most daemons, in that when it accept()s a connection on a port it's monitoring, a socket is created. Somehow, though, before it fork()s and exec()s the target service program, it manages to convert this bidirectional socket into the three standard I/O descriptors STDIN, STDOUT, and STDERR.
I've been asking around how to defeat this behavior without rolling my own version of inetd, with no success. It occurred to me to ask the question in this manner, because it might lend a clue as to how I might then re-"join" file descriptors 0, 1, and 2 into a single socket.
The reason for my wanting to have the service program communicate bidirectionally over a single socket is that it needs to use an I/O infrastructure that is based on a single, bidirectional socket.
Incoming data to the inetd-spawned server is mapped to stdin, and anything the handler writes via fprintf(stdout) or fprintf(stderr) is sent back to the client, combined into one stream.
The catch is that the telnet client can't distinguish the stdout and stderr streams. If you need them kept separate, use ssh instead of telnet, which provides that option.
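The mechanism behind this mapping is ordinarily just dup2(): before fork()/exec(), the daemon duplicates the connected socket onto descriptors 0, 1, and 2, so in the spawned service all three descriptors already refer to the same bidirectional socket. A minimal sketch of both sides (function names are mine, not inetd's):

    /* What an inetd-style daemon typically does with the socket
       returned by accept(), before exec'ing the service. */
    #include <unistd.h>

    void wire_stdio_to_socket(int conn)
    {
        dup2(conn, STDIN_FILENO);    /* fd 0 */
        dup2(conn, STDOUT_FILENO);   /* fd 1 */
        dup2(conn, STDERR_FILENO);   /* fd 2 */
        if (conn > STDERR_FILENO)
            close(conn);             /* original fd no longer needed */
    }

    /* In the spawned service there is nothing to "re-join": fd 0
       already is the bidirectional socket, so hand it (or a dup of
       it) to infrastructure that expects a single socket. */
    int recover_socket(void)
    {
        return dup(STDIN_FILENO);
    }

So rather than defeating the behavior, the service can simply pass descriptor 0 to its single-socket I/O infrastructure.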
I have a simple socket implementation which uses the standard low-level Berkeley socket functions (bind, listen, accept, read, etc).
This socket listens on a port, let's say X.
Now what I'm trying to achieve is to make Simple-WebSocket-Server also listen on this port X.
Of course this is not possible by nature - I know.
My intention is this: in my simple socket implementation I would detect whether the connected client (after accepting) is my own client or a websocket one; if I find the client to be a websocket one, I would pass the whole thing to this library so it behaves as if it had accepted the client itself.
Ideally I would hand over the socket's fd, along with the first bytes my socket read before recognizing a websocket request.
I'm a bit stuck on what would be the best way to do this, but I certainly don't want to reimplement the whole websocket stack.
The tricky thing here is that the Simple-WebSocket-Server does its own accept, so I don't think there is a way to hand over to it an fd together with an array of "first bytes".
Some approaches I could think of:
modify the Simple-WebSocket-Server so that instead of closing a non-WS client or timing-out, it makes a call to your library
instead use something like websocketpp to create your own websocket server, and then select between the two servers (I did something similar for one of my own projects, where I had to detect the socket protocol from the first byte and then select an appropriate protocol handler; see the wampcc protocol selector, and the sketch after this list)
or, you could have the Simple-WebSocket-Server listen on a different port Y; you also listen on X, and if you detect a websocket client on X, you create an internal pair of queues, open a connection to localhost:Y, and proceed to transfer bytes between the pair of sockets; this way you don't need to modify the Simple-WebSocket-Server code.
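For the first-byte detection mentioned above, here is a minimal sketch, under the assumption that your own protocol never begins with an HTTP method; recv() with MSG_PEEK inspects the first bytes without consuming them, which sidesteps the problem of handing over already-read bytes entirely:

    /* Protocol detection on a freshly accept()ed TCP socket. A
       WebSocket connection always starts with an HTTP upgrade
       request, so the stream begins with a method such as "GET ". */
    #include <string.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    enum proto { PROTO_WEBSOCKET, PROTO_CUSTOM, PROTO_UNKNOWN };

    enum proto detect_protocol(int fd)
    {
        char buf[4];
        /* MSG_PEEK leaves the bytes in the kernel buffer, so the
           websocket library's own read of the handshake still works. */
        ssize_t n = recv(fd, buf, sizeof buf, MSG_PEEK);
        if (n < 4)
            return PROTO_UNKNOWN;    /* not enough data yet: retry */
        if (memcmp(buf, "GET ", 4) == 0)
            return PROTO_WEBSOCKET;  /* HTTP upgrade handshake */
        return PROTO_CUSTOM;         /* your own client */
    }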
Background/Context
I have two scripts: a server-side script that can handle multiple clients, and a client-side script that connects to the server. Any client that sends a message to the server has that message copied/echoed to all the other connected clients.
Where I'm stuck.
This afternoon, I have been grasping at thin air searching for a thorough explanation, with examples, covering all there is to know about Perl and TCP sockets. A surprisingly large number of results from Google still list articles from 2007-2012. It appears that originally there was the 'Socket' module, and over time IO::Socket was added, then IO::Select. But the perldoc pages don't cover or reference everything in one place, or provide sufficient cross-referencing links. I gather that most of the raw calls in Socket have an equivalent in IO::Socket. And it's possible (recommended? yes/no?) to do a functional call on the socket if something isn't available via the OO modules...
Problem 1. The far-side/peer has disconnected / the socket is no longer ESTABLISHED?
I have been trying everything I ran across today, including IO::Select with calls to can_read and has_exception, but the outputs from these show no differences regardless of whether the socket is up or down - I confirmed from netstat output that the non-blocking socket is torn down instantly by the OS (macOS).
Problem 2. Is there data available to read?
For my previous Perl client scripts, I have rolled my own method using sysread (https://perldoc.perl.org/functions/sysread.html), but today I noticed that recv is listed within the synopsis near the top of https://perldoc.perl.org/IO/Socket.html, yet there is no mention of the recv method in the detailed info below...
From other C and Java doco pages, I gather there is a convention of returning undef, 0, >0, and on some implementations -1 when doing the equivalent of sysread. Is there an official perl spec someone can link me to that describes what Perl has implemented? Is sysread or recv the 'right' way to be reading from TCP sockets in the first place?
I haven't provided my code here because I'm asking from a 'best-practices' point of view: what is the 'right' way to do client-server communication? Is polling even the right way to begin with? Is there an event-driven method that I've somehow missed?
My sincere apologies if what I've asked for is already available, but google keeps giving me the same old result pages and derivative blogs/articles that I've already read.
Many thanks in advance.
And it's possible (recommended? yes/no?) to do a functional call on the socket if something isn't available via the OO modules...
I'm not sure which functional calls you're referring to that are not available in IO::Socket. But in general, IO::Socket objects are also normal file handles. This means you can do things like $server->accept but also accept($server).
Problem 1. The far-side/peer has disconnected / the socket is no longer ESTABLISHED?
This problem is not specific to Perl but is how select and the socket API work in general; Perl does not add its own behavior in this regard. If the peer has closed the connection, select will show that the socket is available for read, and a read on the socket will then return no data and no error. This means that no more data are available to read from the peer, since the peer has properly closed its side of the connection (connection close is not considered an error but normal behavior). Note that within TCP it is still possible to send data to the peer even after the peer has indicated that it will not send any more data.
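Since the behavior is identical across languages, here is a minimal C sketch of that sequence (the Perl select/sysread calls map directly onto these); `fd` is assumed to be a connected TCP socket:

    #include <stdio.h>
    #include <sys/select.h>
    #include <unistd.h>

    void wait_and_read(int fd)
    {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);

        /* A peer close also marks the socket readable. */
        if (select(fd + 1, &rfds, NULL, NULL, NULL) > 0) {
            char buf[4096];
            ssize_t n = read(fd, buf, sizeof buf);
            if (n > 0)
                printf("got %zd bytes\n", n);
            else if (n == 0)
                printf("peer closed its side: EOF, not an error\n");
            else
                perror("read");   /* a real error, e.g. ECONNRESET */
        }
    }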
Problem 2. Is there data available to read?
sysread and recv differ in the same way that read and recv/recvmsg differ in the underlying libc. Specifically, recv can take flags which, for example, allow peeking at data available in the system's socket buffer without consuming it. See the documentation for more information.
I would recommend using sysread instead of recv, since the behavior of sysread can be redefined when tying a file handle, while the behavior of recv cannot. Tying the file handle is done, for example, by IO::Socket::SSL, so that what you get back is not the raw data from the underlying OS socket but the decrypted data from the SSL socket.
From other C and Java doco pages, I gather there is a convention of returning undef, 0, >0, and on some implementations -1 when doing the equivalent of sysread. Is there an official perl spec someone can link me to that describes what Perl has implemented?
The behavior of sysread is well documented. To cite from what you get when using perldoc -f sysread:
... Returns the number of bytes
actually read, 0 at end of file, or undef if there was an error
(in the latter case $! is also set).
Apart from that, you state your problem as Is there data available to read?, but then you only talk about sysread and recv and not about how to check whether data is available before calling these functions. I assume that you are using select (or IO::Select, which is just a wrapper) to do this. While can_read of IO::Select can be used to get this information in most cases, it only reports on the underlying OS socket. With plain sockets this is enough, but when using SSL, for example, there is some internal buffering done in the SSL stack, and can_read might return false even though there are still data available to read in the buffer. See Common Usage Errors: Polling of SSL sockets on how to handle this properly.
Say one part of a program writes some stuff to a socket and another part of the same program reads stuff from that same socket. If an external tool writes to that very same socket, how would I differentiate who wrote what to the socket (using the part that reads it)? Would using a named pipe work?
If you are talking about TCP, the situation you describe is impossible, because connections are 1-to-1. If you mean UDP, you can get the sender's address from the recvfrom() or recvmsg() functions.
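For illustration, a minimal sketch of reading one datagram together with its sender, assuming a bound IPv4 UDP socket `fd`:

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    void read_one_datagram(int fd)
    {
        char buf[1500];
        struct sockaddr_in peer;
        socklen_t peerlen = sizeof peer;

        /* recvfrom() fills in the sender's address per datagram. */
        ssize_t n = recvfrom(fd, buf, sizeof buf, 0,
                             (struct sockaddr *)&peer, &peerlen);
        if (n >= 0) {
            char addr[INET_ADDRSTRLEN];
            inet_ntop(AF_INET, &peer.sin_addr, addr, sizeof addr);
            printf("%zd bytes from %s:%u\n", n, addr,
                   ntohs(peer.sin_port));
        }
    }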
I'm writing a Unix domain socket server for Linux.
A peculiarity of Unix domain sockets I quickly found out is that, while creating a listening Unix socket creates the matching filesystem entry, closing the socket doesn't remove it. Moreover, until the filesystem entry is removed manually, it's not possible to bind() a socket to the same path again : bind() fails with EADDRINUSE if the path it is given already exists in the filesystem.
As a consequence, the socket's filesystem entry needs to be unlink()'ed on server shutdown to avoid getting EADDRINUSE on server restart. However, this cannot always be done (e.g. after a server crash). Most FAQs, forum posts, and Q&A websites I found advise, as a workaround, to unlink() the socket prior to calling bind(). In that case however, it becomes desirable to know whether a process is bound to the socket before unlink()'ing it.
Indeed, unlink()'ing a Unix socket while a process is still bound to it and then re-creating the listening socket doesn't raise any error. As a result, however, the old server process is still running but unreachable : the old listening socket is "masked" by the new one. This behavior has to be avoided.
Ideally, using Unix domain sockets, the socket API should have exposed the same "mutual exclusion" behavior that is exposed when binding TCP or UDP sockets: "I want to bind socket S to address A; if a process is already bound to this address, just complain!" Unfortunately this is not the case...
Is there a way to enforce this "mutual exclusion" behavior? Or, given a filesystem path, is there a way to know, via the socket API, whether any process on the system has a Unix domain socket bound to this path? Should I use a synchronization primitive external to the socket API (flock(), ...)? Or am I missing something?
Thanks for your suggestions.
Note : Linux's abstract namespace Unix sockets seem to solve this issue, as there is no filesystem entry to unlink(). However, the server I'm writing aims to be generic : it must be robust against both types of Unix domain sockets, as I am not responsible for choosing listening addresses.
I know I am very late to the party and that this was answered a long time ago but I just encountered this searching for something else and I have an alternate proposal.
When you encounter the EADDRINUSE return from bind(), you can enter an error-checking routine that connects to the socket. If the connection succeeds, there is a running process that is at least alive enough to be holding the listening socket open. This strikes me as the simplest and most portable way of achieving what you want. It has drawbacks in that the server that created the UDS in the first place may actually still be running but "stuck" somehow, unable to do an accept(), so this solution certainly isn't fool-proof, but it is a step in the right direction, I think.
If the connect() fails, go ahead and unlink() the endpoint and try the bind() again.
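A sketch of that probe-then-unlink sequence, assuming the server is the only legitimate owner of the path (note there is still a small race between the probe and the unlink):

    #include <errno.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    /* Returns a bound fd ready for listen(), or -1; EADDRINUSE
       means a live server is already accepting on this path. */
    int bind_unix_or_reclaim(const char *path)
    {
        struct sockaddr_un sa;
        memset(&sa, 0, sizeof sa);
        sa.sun_family = AF_UNIX;
        strncpy(sa.sun_path, path, sizeof sa.sun_path - 1);

        int fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;
        if (bind(fd, (struct sockaddr *)&sa, sizeof sa) == 0)
            return fd;                       /* clean path */
        if (errno != EADDRINUSE) {
            close(fd);
            return -1;
        }

        /* EADDRINUSE: probe the endpoint with connect(). */
        int probe = socket(AF_UNIX, SOCK_STREAM, 0);
        if (probe >= 0 &&
            connect(probe, (struct sockaddr *)&sa, sizeof sa) == 0) {
            close(probe);
            close(fd);
            errno = EADDRINUSE;              /* live server found */
            return -1;
        }
        if (probe >= 0)
            close(probe);

        unlink(path);                        /* stale socket: reclaim */
        if (bind(fd, (struct sockaddr *)&sa, sizeof sa) == 0)
            return fd;
        close(fd);
        return -1;
    }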
I don't think there is much to be done beyond things you have already considered. You seem to have researched it well.
There are ways to determine whether some process is bound to a given Unix socket path (obviously lsof and netstat do it), but they are complicated and system-dependent enough that I question whether they are worth the effort to deal with the problems you raise.
You are really raising two problems - dealing with name collisions with other applications and dealing with previous instances of your own app.
By definition, multiple instances of your program should not be trying to bind to the same path, so that probably means you only want one instance to run at a time. If that's the case you can just use the standard pid-file lock technique so two instances don't run simultaneously (a sketch follows). You shouldn't be unlinking the existing socket, or even running, if you can't get the lock. This takes care of the server crash scenario as well. If you can get the lock then you know you can unlink the existing socket path before binding.
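A minimal sketch of that pid-file lock, using flock() so the lock evaporates when the process dies, crash included (the path and function name are illustrative):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/file.h>
    #include <unistd.h>

    /* Returns the lock fd (keep it open for the process lifetime),
       or -1 if another instance already holds the lock. */
    int acquire_pid_lock(const char *path)   /* e.g. "/run/myd.pid" */
    {
        int fd = open(path, O_RDWR | O_CREAT, 0644);
        if (fd < 0)
            return -1;
        /* Non-blocking exclusive lock: fails if already held. */
        if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
            close(fd);                /* another instance is running */
            return -1;
        }
        char buf[32];
        int len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());
        (void)ftruncate(fd, 0);       /* record our PID for tooling */
        (void)write(fd, buf, len);
        return fd;
    }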
There is not much you can do, AFAIK, to control other programs creating collisions. File permissions aren't perfect, but if the option is available to you, you could put your app in its own user/group. If there is an existing socket path and you don't own it, then don't unlink it; put out an error message and let the user or sysadmin sort it out. Using a config file to make the path easily changeable - and available to clients - might work. Beyond that you almost have to go to some kind of discovery service, which seems like massive overkill unless this is a really critical application.
On the whole you can take some comfort that this doesn't actually happen often.
Assuming you only have one server program that opens that socket.
Then what about this (a code sketch follows the list):
Exclusively create a file that contains the PID of the server process (maybe also the path of the socket)
If you succeed, then write your PID (and socket path) there and continue creating the socket.
If you fail, the socket was (most likely) created before, but the server may be dead. Therefore read the PID from the existing file and check whether such a process still exists (e.g. using kill with signal 0):
If a process exists, it may be the server process, or it may be an unrelated process
(More steps may be needed here)
If no such process exists, remove the file and begin trying to create it exclusively.
Whenever the process terminates, remove the file after having closed (and removed) the socket.
If you place both the socket and the lock file in a volatile filesystem (/tmp in older ages, /run in modern times), then a reboot will most likely clear old sockets and lock files automatically.
Unless administrators like to play with kill -9, you could also establish a signal handler that tries to remove the lock file when receiving fatal signals.
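A sketch of the exclusive-create scheme from the list above, including the kill(pid, 0) liveness probe; paths are illustrative and the "(More steps may be needed here)" caveat stands, e.g. the PID could have been recycled by an unrelated process:

    #include <errno.h>
    #include <fcntl.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Returns 0 once the lock file is exclusively ours, -1 if a
       process recorded in it still appears to exist. */
    int try_lock_file(const char *path)   /* e.g. "/run/myd.lock" */
    {
        for (;;) {
            int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
            if (fd >= 0) {
                char buf[32];
                int len = snprintf(buf, sizeof buf, "%ld\n",
                                   (long)getpid());
                (void)write(fd, buf, len);
                close(fd);
                return 0;                  /* lock acquired */
            }
            if (errno != EEXIST)
                return -1;

            /* Lock file exists: read the PID, probe the process. */
            long pid = 0;
            FILE *f = fopen(path, "r");
            if (f) {
                if (fscanf(f, "%ld", &pid) != 1)
                    pid = 0;
                fclose(f);
            }
            if (pid > 0 &&
                (kill((pid_t)pid, 0) == 0 || errno == EPERM))
                return -1;                 /* some process still lives */

            unlink(path);                  /* stale: remove and retry */
        }
    }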
I've been a C coder for a while now - neither a newbie nor an expert. I have a certain daemonized application in C on a PPC Linux. I use PHP's socket_connect as a client to connect to this service locally. The server uses epoll for multiplexing connections via a Unix socket. A user-submitted string is parsed for certain characters/words using strstr() and, if found, spawns 4 joinable threads to different websites simultaneously. I use socket, connect, write and read to interact with the said webservers via TCP on their port 80 in each thread. All connections and writes seem successful. Reads from the webserver sockets fail, however, with either (A) 3 threads seeming to hang, and only one thread returning -1 with errno set to 104 - the responding thread takes like 10 minutes, an eternity:-( (*I read somewhere that 104 (is it EINTR?) in the network context suggests 'the connection was reset by peer'); or (B) 0 bytes from 3 threads, and only 1 of the 4 threads actually returning some data. Isn't socket read/write thread-safe? I use thread-safe (and reentrant) libc functions such as strtok_r, gethostbyname_r, etc.
*I doubt that the said webhosts are actually resetting the connection, because when I run a single-threaded standalone version (everything else equal), all things work perfectly - but of course in series, not parallel.
There's a second problem too (oops): I can't write back to the client that connects to my epoll-ed Unix socket. My daemon application will hang and hog CPU at over 100% forever, yet nothing is written to the client's end. I'm sure the client (a very typical PHP socket application) hasn't closed the connection whenever this is happening - no error(s) detected either. Any ideas?
I cannot figure out what is wrong, even with Valgrind, GDB, or copious logging. Kindly help where you can.
Yes, read/write are thread-safe. But beware of gethostbyname() and getservbyname() if you're using them - they return pointers to static data, and may not be thread-safe.
errno 104 is ECONNRESET (not EINTR). Use strerror or perror to get the textual error message (like 'Connection reset by peer') for a particular errno code.
The best way to figure out what's going wrong is often to do very detailed logging - log the result of every operation, plus details like the IP address/port you are connecting to, the number of bytes read/written, the thread id, and so forth. And, of course, make sure your logging code is thread-safe :-)
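As a starting point, a minimal sketch of such a thread-safe logging helper, using the POSIX (XSI) strerror_r variant (the pthread_self() cast is illustrative and Linux-specific):

    #include <errno.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>

    void log_errno(const char *op)
    {
        char msg[256];
        /* strerror_r is the thread-safe form of strerror. */
        if (strerror_r(errno, msg, sizeof msg) != 0)
            snprintf(msg, sizeof msg, "unknown error");
        fprintf(stderr, "[thread %lu] %s failed: %s (errno %d)\n",
                (unsigned long)pthread_self(), op, msg, errno);
    }

    /* Usage: if (read(fd, buf, len) < 0) log_errno("read"); */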
Getting an ECONNRESET after 10 minutes sounds like the result of your connection timing out. Either the web server isn't sending the data or your app isn't receiving it.
To test the former, hook up a program like Wireshark to the local loopback device and look for traffic to and from the port you are using.
For the latter, take a look at the epoll() man page. It mentions a scenario where using edge-triggered events can result in a lockup, because there is still data in the buffer but no new data arrives, so no new event is triggered.
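The usual cure, per the man page's recommendation for edge-triggered mode, is to make each socket non-blocking and drain it until read() reports EAGAIN, so no data is left behind when the edge event is consumed. A minimal sketch, assuming `fd` was registered with EPOLLIN | EPOLLET:

    #include <errno.h>
    #include <unistd.h>

    void on_readable(int fd)
    {
        char buf[4096];
        for (;;) {
            ssize_t n = read(fd, buf, sizeof buf);
            if (n > 0)
                continue;     /* handle_data(buf, n); -- hypothetical */
            if (n == 0) {
                close(fd);    /* peer closed the connection */
                return;
            }
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return;       /* drained: wait for the next edge */
            close(fd);        /* real error */
            return;
        }
    }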