I am looking through some out-of-date code which uses getaddrinfo and getnameinfo to determine host name information and then falls back to gethostname and gethostbyname if getnameinfo fails.
Now, this seems wrong to me. I am trying to understand the intent of the code so that I can make a recommendation. I don't want to repost the entire code here because it is long and complicated, but I'll try to summarize:
As far as I can tell, the point of this code is to generate a string which can be used by another process to connect to a listening socket. This seems to be not just for local processes, but also for remote hosts to connect back to this computer.
So the code in question is basically doing the following (roughly sketched in code after the list):
getaddrinfo(node = NULL, service = port, hints.ai_flags = AI_PASSIVE, ai); -- this gets a list of possible arguments for socket() that can be used with bind().
go through the list of results and create a socket.
the first time a socket is successfully created, that addrinfo is selected as the "used" one.
for the ai_addr of the selected addrinfo, call getnameinfo() to get the associated host name.
if this fails, call gethostname(), then look up gethostbyname() on the result.
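To make the discussion concrete, here is a rough sketch in C of that flow; this is my own reconstruction, not the original code, and details such as the port string, the buffer handling, and the NI_NAMEREQD flag are assumptions on my part:

#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

static int announce_name_for_listener(const char *portstr,
                                      char *host, socklen_t hostlen)
{
    struct addrinfo hints, *res, *ai;
    int fd = -1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family   = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags    = AI_PASSIVE;          /* wildcard address, for bind() */

    if (getaddrinfo(NULL, portstr, &hints, &res) != 0)
        return -1;

    /* Walk the results; the first addrinfo for which socket() succeeds "wins". */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd != -1)
            break;
    }
    if (ai == NULL) {
        freeaddrinfo(res);
        return -1;
    }

    /* Try to turn the selected (wildcard) address back into a host name... */
    if (getnameinfo(ai->ai_addr, ai->ai_addrlen,
                    host, hostlen, NULL, 0, NI_NAMEREQD) != 0) {
        /* ...and fall back to gethostname() + gethostbyname() when that fails. */
        if (gethostname(host, hostlen) == 0) {
            struct hostent *he = gethostbyname(host);
            if (he != NULL && he->h_name != NULL) {
                strncpy(host, he->h_name, hostlen - 1);
                host[hostlen - 1] = '\0';
            }
        }
    }

    freeaddrinfo(res);
    close(fd);   /* the real code presumably keeps fd around for bind()/listen() */
    return 0;
}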
There are a few reasons I think this is wrong, but I want to verify my logic. Firstly, from some experiments it seems that getnameinfo() pretty much always fails here. I suppose that's because the input address is essentially unknown: since this is a listening socket rather than a destination, getaddrinfo() with AI_PASSIVE hands back the wildcard address, which doesn't map to a meaningful name. Then, calling gethostname() and passing the result to gethostbyname() pretty much always returns the same name that gethostname() gave by itself. In other words, it's just verifying the local host name, which seems pointless to me. It's also problematic because that name isn't necessarily even usable by remote hosts, is it?
Somehow I suspect that the whole idea of trying to determine your own host name on the subnet isn't that useful; rather, you'd have to send a message to another host and see which address they see it coming from. (Unfortunately that doesn't make sense in this context, since I don't know of any other peers at this level of the program.) For instance, the local host could have more than one NIC and therefore multiple IP addresses, so trying to determine a single host-address pair is nonsensical. (Is the correct resolution simply to bind() and listen on all of the addrinfo results simultaneously?)
I also noticed that names can be resolved just by passing them to getaddrinfo() with the AI_CANONNAME flag set, which would make the getnameinfo() step redundant. However, I guess this isn't done here because they are trying to determine some kind of unbiased view of the host name without supplying it a priori. Of course, that fails, and they end up using gethostname() anyway! I also tried supplying "localhost" to getaddrinfo(): under Linux it reports the host name in ai_canonname, but on OS X it just returns "localhost", so that isn't much use either, given that this is supposed to be cross-platform.
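For what it's worth, the AI_CANONNAME experiment described above looks roughly like this; this is only an illustrative sketch (here resolving the result of gethostname(), which is my own choice, not necessarily what the legacy code does):

#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char name[256];
    struct addrinfo hints, *res;

    /* Ask the resolver for the canonical name of whatever gethostname() reports. */
    if (gethostname(name, sizeof(name)) != 0)
        return 1;

    memset(&hints, 0, sizeof(hints));
    hints.ai_flags = AI_CANONNAME;

    if (getaddrinfo(name, NULL, &hints, &res) != 0)
        return 1;

    /* As noted above, what ends up in ai_canonname differs between platforms. */
    printf("canonical name: %s\n",
           res->ai_canonname ? res->ai_canonname : "(none)");
    freeaddrinfo(res);
    return 0;
}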
I guess to summarize, my question is: what is the correct way, if one exists, to get a local host name that can be announced to subnet peers in modern socket programming? I am leaning towards replacing this code with simply returning the result of gethostname(), but I'm wondering if there's a more appropriate solution using modern calls like getaddrinfo().
If the answer is that there's no way to do this, I'll just have to use gethostname() anyway, since I must return something here or it would break the API.
If I read this correctly, you just want a non-localhost socket address that is likely to work for creating a local socket and that a remote host can connect back to.
I have a function I wrote called "GetBestAddressForSocketBind" that you can reference. You can get it from my GitHub project page here. You may need to reference some of the code in the parent directory.
The code essentially just uses getifaddrs to enumerate adapters and picks the first one that is "up", is not a loopback/local address, and has an IP address of the desired address family (AF_INET or AF_INET6).
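For reference, the gist of that approach is something like the following sketch; this is illustrative code of mine, not the code from the linked project, and the function name is made up:

#include <ifaddrs.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Return the first address that is up, not loopback, and of the requested family. */
int get_first_usable_addr(int family, struct sockaddr_storage *out)
{
    struct ifaddrs *ifa_list, *ifa;

    if (getifaddrs(&ifa_list) == -1)
        return -1;

    for (ifa = ifa_list; ifa != NULL; ifa = ifa->ifa_next) {
        if (ifa->ifa_addr == NULL)
            continue;
        if (ifa->ifa_addr->sa_family != family)
            continue;
        if (!(ifa->ifa_flags & IFF_UP) || (ifa->ifa_flags & IFF_LOOPBACK))
            continue;

        /* Found a candidate: copy out the address and stop. */
        memcpy(out, ifa->ifa_addr,
               family == AF_INET ? sizeof(struct sockaddr_in)
                                 : sizeof(struct sockaddr_in6));
        freeifaddrs(ifa_list);
        return 0;
    }

    freeifaddrs(ifa_list);
    return -1;   /* no suitable interface found */
}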
Hope this helps.
I think you should look at Ulrich Drepper's article about IPv6 programming. It is relatively short and may answer some of your concerns; I found it really useful. I'm posting this link because it is very difficult to answer your question(s) without (at least) pseudo-code.
Related
This question is related to a question about getting a free port in Haskell, where I included a getFreePort function that retrieved the first available port. This function works on a Windows system, but when I try it on my Linux box it fails randomly (the free port is reported as busy).
I've modified the function to try to re-bind to the "free" address, and it fails at random:
import Network.Socket

getFreePort :: IO Integer
getFreePort = do
    -- Bind to an ephemeral port on all interfaces and ask which port we got.
    sock <- socket AF_INET Stream defaultProtocol
    bind sock (SockAddrInet aNY_PORT iNADDR_ANY)
    port <- socketPort sock
    close sock
    print "Trying to rebind to the sock"
    -- Immediately try to bind that "free" port again, this time on 127.0.0.1.
    sock <- socket AF_INET Stream defaultProtocol
    bind sock (SockAddrInet port 0x0100007f)  -- 0x0100007f is 127.0.0.1
    port <- socketPort sock
    close sock
    return (toInteger port)
I understand that there is a race condition where another process could acquire that port in between, but isn't that unlikely?
As a general remark, the pattern of "check whether a resource is available and, if so, take it" is often an anti-pattern. Whenever you do that, you run the risk that another process takes the resource after the check but before you actually acquire it yourself.
The only information you have after such a check is that the resource was not in use at that particular point in time. It may or may not help you guess the port's state in the future, but it is in no way binding at any later time. You cannot assume that because the resource was free at time t it will still be free at t+dt, even if dt is very small. Asking quickly perhaps makes it a bit more likely that it is still free, but that's all it is: a somewhat higher probability.
You should just try to acquire the resource and handle failure appropriately. The only time you can be sure a port was really free is when you have just successfully opened it; then you know it was indeed free. As soon as you close it, all bets are off again.
I don't think you can ever safely check if a port is free in one process and then assume it still is free in another process. That does not make sense. It does not even make sense within the same process!
At the very least you would have to design a protocol that would go back and forth:
here's a port that was just free, try that
nope, it's taken now
ok, here's another one
nope, it's taken now
ok, here's another one
yep, got it, thanks
But that is pretty silly to begin with. The process that needs the port should just open it, and only once it actually has the port open, and not before, should it communicate the port number to the other party.
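In C terms, that pattern is simply "bind to port 0, listen, then ask the kernel which port it picked with getsockname()". A minimal sketch, not tied to the asker's Haskell code:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return 1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = 0;                       /* 0 = let the kernel pick a free port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1 ||
        listen(fd, SOMAXCONN) == -1 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) == -1) {
        close(fd);
        return 1;
    }

    /* The socket is open and listening; only now announce the port number. */
    printf("listening on port %u\n", ntohs(addr.sin_port));

    /* ... accept() connections here, and close(fd) only when done ... */
    close(fd);
    return 0;
}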
I am working on a Kontron board with the VxWorks Cert 6.6 RTOS on it.
My current goal is to bring up the network interface on it and set a static IP address and netmask for that interface.
As of now, I know of two ways to achieve my requirement (setting a static IP address):
by providing the ead bootline parameter, specifying the IP address and netmask for the boot interface;
by using the ifLib network interface library.
I cannot opt for the first method because, after deployment, the end user cannot change the ead bootline parameter every time a new IP address needs to be set, recompile the whole bootrom, and redeploy it to the device.
The second method is not available to me because the VxWorks Cert package I am using does not come with the ifLib library.
So neither of the above methods is feasible in my case. I have also considered a third method: changing the bootline parameter at runtime and rebooting. But this is not feasible either, mainly because the Kontron board I am using has no NVRAM, so a bootline parameter changed at runtime does not survive power-off/on cycles and the new desired IP address would not be set.
Can anyone suggest a method to achieve this? Any reference links would be very helpful. Thank you in advance for your help.
On the server side, I know the address information of the connected peer can be obtained from the accept() call, but I see some people using getpeername() to get that address after accepting the connection.
Is there any difference in result?
I saw the following link.
So, I think there is no difference but I just wanted to make sure of this.
There is no difference, just the overhead of an extra function call if you use getpeername() instead of taking the address directly from accept().
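A quick sketch of the two equivalent ways, for a socket that is already bound and listening (the helper name is just illustrative):

#include <sys/socket.h>

int accept_and_check(int listen_fd)
{
    struct sockaddr_storage from_accept, from_getpeername;
    socklen_t len = sizeof(from_accept);

    /* accept() hands back the peer address directly... */
    int conn = accept(listen_fd, (struct sockaddr *)&from_accept, &len);
    if (conn == -1)
        return -1;

    /* ...and getpeername() on the accepted socket returns the same address. */
    len = sizeof(from_getpeername);
    if (getpeername(conn, (struct sockaddr *)&from_getpeername, &len) == -1) {
        /* handle the error */
    }

    return conn;
}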
I'm writing a Unix domain socket server for Linux.
A peculiarity of Unix domain sockets I quickly found out is that, while creating a listening Unix socket creates the matching filesystem entry, closing the socket doesn't remove it. Moreover, until the filesystem entry is removed manually, it's not possible to bind() a socket to the same path again: bind() fails with EADDRINUSE if the path it is given already exists in the filesystem.
As a consequence, the socket's filesystem entry needs to be unlink()'ed on server shutdown to avoid getting EADDRINUSE on server restart. However, this cannot always be done (e.g. after a server crash). Most FAQs, forum posts, and Q&A websites I found only advise, as a workaround, to unlink() the socket prior to calling bind(). In that case, however, it becomes desirable to know whether a process is bound to this socket before unlink()'ing it.
Indeed, unlink()'ing a Unix socket while a process is still bound to it and then re-creating the listening socket doesn't raise any error. As a result, however, the old server process is still running but unreachable: the old listening socket is "masked" by the new one. This behavior has to be avoided.
Ideally, using Unix domain sockets, the socket API should have exposed the same "mutual exclusion" behavior that is exposed when binding TCP or UDP sockets: "I want to bind socket S to address A; if a process is already bound to this address, just complain!" Unfortunately this is not the case...
Is there a way to enforce this "mutual exclusion" behavior? Or, given a filesystem path, is there a way to know, via the socket API, whether any process on the system has a Unix domain socket bound to this path? Should I use a synchronization primitive external to the socket API (flock(), ...)? Or am I missing something?
Thanks for your suggestions.
Note: Linux's abstract-namespace Unix sockets seem to solve this issue, as there is no filesystem entry to unlink(). However, the server I'm writing aims to be generic: it must cope with both types of Unix domain sockets, as I am not responsible for choosing the listening addresses.
I know I am very late to the party and that this was answered a long time ago, but I just ran into this while searching for something else, and I have an alternative proposal.
When you encounter the EADDRINUSE return from bind(), you can enter an error-checking routine that connects to the socket. If the connection succeeds, there is a running process that is at least alive enough to be listening on it. This strikes me as being the simplest and most portable way of achieving what you want. It has drawbacks in that the server that created the UDS in the first place may actually still be running but "stuck" somehow and unable to do an accept(), so this solution certainly isn't fool-proof, but it is a step in the right direction, I think.
If the connect() fails, then go ahead and unlink() the endpoint and try the bind() again.
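In code, that idea looks roughly like the following hypothetical helper; it is only a sketch of the logic above, not a hardened implementation:

#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

int bind_unix_or_reclaim(int fd, const char *path)
{
    struct sockaddr_un sa;

    memset(&sa, 0, sizeof(sa));
    sa.sun_family = AF_UNIX;
    strncpy(sa.sun_path, path, sizeof(sa.sun_path) - 1);

    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) == 0)
        return 0;
    if (errno != EADDRINUSE)
        return -1;

    /* The path exists; probe it to see whether a live server is accepting there. */
    int probe = socket(AF_UNIX, SOCK_STREAM, 0);
    if (probe == -1)
        return -1;
    if (connect(probe, (struct sockaddr *)&sa, sizeof(sa)) == 0) {
        close(probe);
        errno = EADDRINUSE;   /* someone really is listening on that path */
        return -1;
    }
    close(probe);

    /* Nobody answered: assume a stale socket file and reclaim the path. */
    if (unlink(path) == -1 && errno != ENOENT)
        return -1;
    return bind(fd, (struct sockaddr *)&sa, sizeof(sa));
}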
I don't think there is much to be done beyond things you have already considered. You seem to have researched it well.
There are ways to determine whether any process has a socket bound to a given Unix socket path (obviously lsof and netstat do it), but they are complicated and system-dependent enough that I question whether they are worth the effort given the problems you raise.
You are really raising two problems: dealing with name collisions with other applications, and dealing with previous instances of your own app.
By definition, multiple instances of your program should not be trying to bind to the same path, so that probably means you only want one instance running at a time. If that's the case, you can just use the standard PID-file/lock-file technique so two instances don't run simultaneously. You shouldn't be unlinking the existing socket, or even running at all, if you can't get the lock. This takes care of the server-crash scenario as well. If you can get the lock, then you know you can unlink the existing socket path before binding.
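A minimal sketch of that lock-file idea, assuming a lock-file path of your choosing (the names here are illustrative, not a drop-in implementation):

#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

/* Returns the lock fd on success (keep it open for the process lifetime),
 * or -1 if another instance already holds the lock. */
int take_instance_lock(const char *lockpath)
{
    char buf[32];
    int fd = open(lockpath, O_RDWR | O_CREAT, 0644);
    if (fd == -1)
        return -1;

    if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
        close(fd);            /* another instance is already running */
        return -1;
    }

    /* Record our PID for diagnostics; the flock itself is the real lock,
     * and it is released automatically when this process dies. */
    int n = snprintf(buf, sizeof(buf), "%ld\n", (long)getpid());
    if (ftruncate(fd, 0) == 0 && write(fd, buf, n) != n) {
        /* writing the PID is best-effort; the flock is what matters */
    }
    return fd;
}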
There is not much you can do, as far as I know, to stop other programs from creating collisions. File permissions aren't perfect, but if the option is available to you, you could run your app under its own user/group. If there is an existing socket path and you don't own it, then don't unlink it; put out an error message and let the user or sysadmin sort it out. Using a config file to make the path easily changeable, and available to clients, might work. Beyond that you almost have to go to some kind of discovery service, which seems like massive overkill unless this is a really critical application.
On the whole you can take some comfort that this doesn't actually happen often.
Assuming you only have one server program that opens that socket, then what about this (sketched in code after the steps below):
Exclusively create a file that contains the PID of the server process (and perhaps also the path of the socket).
If you succeed, write your PID (and socket path) there and continue creating the socket.
If you fail, the file most likely already exists from an earlier run, but that server may be dead. Therefore read the PID from the existing file and check whether such a process still exists (e.g. using kill() with signal 0):
If a process exists, it may be the server process, or it may be an unrelated process
(More steps may be needed here)
If no such process exists, remove the file and begin trying to create it exclusively.
Whenever the process terminates, remove the file after having closed (and removed) the socket.
If you place both the socket and the lock file in a volatile filesystem (/tmp in older times, /run on modern systems), then a reboot will most likely clear old sockets and lock files automatically.
Unless administrators like to play with kill -9, you could also establish a signal handler that tries to remove the lock file when receiving fatal signals.
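Put into code, the exclusive-create-and-check idea might look like this; it is a sketch under the assumptions above, with deliberately simplistic handling of the "unrelated process with the same PID" case:

#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int acquire_pid_file(const char *lockpath)
{
    char buf[32];

    for (;;) {
        /* Step 1: try to create the PID file exclusively. */
        int fd = open(lockpath, O_WRONLY | O_CREAT | O_EXCL, 0644);
        if (fd != -1) {
            int n = snprintf(buf, sizeof(buf), "%ld\n", (long)getpid());
            if (write(fd, buf, n) != n) { /* best effort */ }
            close(fd);
            return 0;         /* we own the file; go create the socket */
        }
        if (errno != EEXIST)
            return -1;

        /* Step 2: the file exists; read the recorded PID. */
        fd = open(lockpath, O_RDONLY);
        if (fd == -1)
            return -1;
        ssize_t r = read(fd, buf, sizeof(buf) - 1);
        close(fd);
        if (r <= 0)
            return -1;
        buf[r] = '\0';

        /* Step 3: is that process still alive? (kill with signal 0) */
        pid_t pid = (pid_t)atol(buf);
        if (pid > 0 && kill(pid, 0) == 0) {
            errno = EEXIST;   /* a process with that PID is still running */
            return -1;
        }

        /* Step 4: stale file; remove it and retry the exclusive create. */
        if (unlink(lockpath) == -1 && errno != ENOENT)
            return -1;
    }
}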
I have a legacy source file that describes the protocol to be used for RPC, in a file with the .x extension, which is fed to rpcgen to generate the necessary stub files for the protocol. However, currently in the generated stub files the RPC client is free to connect from (or listen on) any port, because in the generated file I see the following:
transp = svctcp_create(RPC_ANYSOCK, 0, 0);
I am a newbie to RPC and related things, but I am trying to modify it anyway. Since I know that the server listens on a particular port, I deduced that the above line is what causes the client to connect from an arbitrary port. Now I roughly know how to fix it: I would have to try to open sockets on ports in the given range until I succeed, and then pass that socket as the first argument to svctcp_create().
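If modifying the generated code turns out to be acceptable after all, the brute-force approach described above would look roughly like this; svctcp_create() is the standard Sun RPC call, while the helper name and range handling are just my illustration:

#include <netinet/in.h>
#include <rpc/rpc.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

SVCXPRT *create_transport_in_range(unsigned int lo, unsigned int hi)
{
    for (unsigned int port = lo; port <= hi; port++) {
        int sock = socket(AF_INET, SOCK_STREAM, 0);
        if (sock == -1)
            return NULL;

        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons((unsigned short)port);

        /* If the port is free, hand the bound socket to the RPC layer. */
        if (bind(sock, (struct sockaddr *)&addr, sizeof(addr)) == 0)
            return svctcp_create(sock, 0, 0);

        close(sock);          /* port busy; try the next one in the range */
    }
    return NULL;              /* nothing free in the requested range */
}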
However, that change would have to be made in the rpcgen-generated files, which does not make me very comfortable. I would like to modify the ".x" file so as to do it once and for all. Can anybody help me with this?
Thanks,
Sunil
Why do you need to restrict the local ports to a range? There is no support for this at any layer of the TCP networking APIs. Client port ranges are sometimes specified as firewall rules by netadmins who are unaware of the implementation infeasibility, and who think they are adding security, about which they are also mistaken. What's the reason in your case?