How do I determine whether an open socket is a TCP socket or a Unix domain socket?

My code is passed an open socket. This socket could be either a TCP socket (AF_INET) or a Unix Domain Socket (AF_UNIX).
Depending on the domain of the socket, it will need to be handled differently. In particular, if the socket is bound to an address then I might want to accept incoming connections in a different way.
What is the best way to determine whether the socket I have been passed is a unix domain socket or a TCP socket? The solution would need to work on OS X and Linux at least.
getsockopt appears to allow getting the type of the socket (e.g. SOCK_STREAM etc) but not the domain.
getsockname will return a zero length for unix domain sockets on OSX, but this is officially a bug and the Linux behaviour is different.

The struct sockaddr returned by getsockname has an sa_family member (it is the first member on Linux; on OS X it follows sa_len). Just test it against the symbolic constants such as AF_INET and AF_UNIX. The bug on OS X lets you assume the Unix domain when the returned address structure is zero bytes long; for other platforms and domains, just check the returned structure.
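The check described above can be sketched roughly as follows. The helper name socket_family is made up for illustration, and treating a zero-length result as AF_UNIX is an assumption taken from the OS X behaviour mentioned in the question, not something the API guarantees.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: returns the address family of fd
 * (AF_INET, AF_INET6, AF_UNIX, ...), or -1 if getsockname() fails. */
int socket_family(int fd)
{
    struct sockaddr_storage ss;   /* large enough for any address family */
    socklen_t len = sizeof ss;

    memset(&ss, 0, sizeof ss);
    if (getsockname(fd, (struct sockaddr *)&ss, &len) == -1)
        return -1;

    /* Assumption: a zero-length address is the OS X quirk for
     * Unix domain sockets described in the question. */
    if (len == 0)
        return AF_UNIX;

    return ss.ss_family;
}
```

A caller would then branch on the result, e.g. compare it with AF_UNIX before deciding how to accept connections. On Linux only, getsockopt with SOL_SOCKET and SO_DOMAIN should report the same value without going through the address.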

getsockname() is the only cross-platform socket API to query a socket for its locally bound address, and thus its address family.
On Windows, at least, you can use getsockopt(SOL_SOCKET, SO_PROTOCOL_INFO) to retrieve a WSAPROTOCOL_INFO struct, which has an iAddressFamily field. Maybe there are similar platform-specific APIs on other OSes.

Related

Explain line "s = socket(res->ai_family, res->ai_socktype, res->ai_protocol)"

int s;
struct addrinfo hints, *res;
getaddrinfo("www.example.com", "http", &hints, &res);
s = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
Please explain the last line of code
A socket is a very generic means of communication.
It can handle communication between local processes, between your process and some internal aspects of your system's kernel (events...), across the network...
Even on the network, there are many protocol families and many protocols.
That's why, when we create a socket (with the socket() call on your last line), we have to provide several parameters in order to select the right properties of the required socket.
man 2 socket mainly explains the first parameter (domain or protocol family) but the other parameters are explained in subsequent pages since they depend on the choice made with this first parameter.
Note that once a socket is obtained with the socket() call, you may need to provide many other settings by other system calls, depending on your intention (bind() for a server, connect() for a client... many settings exist).
In your example, it seems that you want to reach an HTTP server named www.example.com.
You could have hardcoded the fact that such a server is reached with the AF_INET protocol family (for IPv4, or AF_INET6 for IPv6) through a TCP connection (type SOCK_STREAM, protocol 0). The getaddrinfo() function can instead supply all of these details, along with others needed by subsequent system calls (for example the IP address and port number for a later connect() call).
All of this information is stored in the members of the returned struct addrinfo.
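To make that concrete, here is a minimal sketch of the same call sequence with the hints structure initialized and the result list walked. The helper name make_stream_socket and its family_out parameter are hypothetical; asking for AF_UNSPEC and SOCK_STREAM is an assumption about what the original snippet intended.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolves host/service and returns a socket created
 * from the first result the kernel accepts, or -1 on failure.
 * The chosen address family is stored in *family_out when it is non-NULL. */
int make_stream_socket(const char *host, const char *service, int *family_out)
{
    struct addrinfo hints, *res, *p;
    int s = -1;

    memset(&hints, 0, sizeof hints);  /* hints must not be left uninitialized */
    hints.ai_family = AF_UNSPEC;      /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP-style stream socket */

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    /* Each result carries the three arguments socket() needs. */
    for (p = res; p != NULL; p = p->ai_next) {
        s = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (s != -1) {
            if (family_out != NULL)
                *family_out = p->ai_family;
            break;
        }
    }

    freeaddrinfo(res);
    return s;
}
```

The same struct addrinfo entry also holds ai_addr and ai_addrlen, which is what you would pass to connect() afterwards.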

Why use htons() to specify protocol when creating packet socket?

To create a packet socket, the following socket() call is used (socket type and protocol may be different):
socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL))
And to create a stream socket, the following call is used:
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)
My question is why use htons() to specify protocol when creating a packet socket and not when creating socket of AF_INET or AF_INET6 family? Why not use
socket(AF_INET, SOCK_XXX, htons(IPPROTO_XXX))
to create a STREAM or DATAGRAM socket, as is done when creating a packet socket, or vice versa? What is different about the use of the protocol argument in the two calls to socket(), given that both create sockets, one a packet socket and the other a socket at the TCP level?
First, like most other network parameters that are passed to the kernel (IP addresses, ports, etc), the parameters are passed in their "on-the-wire" format so that underlying software doesn't need to manipulate them before comparing/copying/transmitting/etc. (For comparison, consider that AF_PACKET and SOCK_RAW are parameters to the kernel itself -- hence "native format" is appropriate -- while the ETH_P_xxx value is generally for "comparison with incoming packets"; it just so happens that ETH_P_ALL is a special signal value saying 'capture everything'.)
Second, interpretation of the protocol is potentially different by address family. A different address family could choose to interpret the protocol in whatever form made sense for it. It just so happens that Ethernet and IP have always used big-endian (and were important/ubiquitous enough that big-endian came to be called network order).
Third, the protocol number in the AF_INET world (i.e. Internet Protocol) only occupies a single byte so it doesn't make sense to specify a byte-ordering.
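A small sketch of what htons() actually does to a 16-bit EtherType may help here. The helper name wire_bytes is invented for illustration; it only exposes the in-memory bytes of the converted value and creates no socket.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: stores the in-memory bytes of htons(ethertype)
 * in out[0] and out[1], i.e. the order they would appear on the wire. */
void wire_bytes(uint16_t ethertype, unsigned char out[2])
{
    uint16_t net = htons(ethertype);  /* host order -> network (big-endian) order */
    memcpy(out, &net, sizeof net);
}
```

For the IPv4 EtherType 0x0800 this yields 0x08 followed by 0x00 on any host, which is the byte sequence found in an Ethernet header. A value such as IPPROTO_TCP (6) fits in one byte, so there is nothing to reorder.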

What is the level parameter in getsockopt?

I got the following link: SOL_SOCKET in getsockopt()
But it is really confusing for me. One replied that the SOL_SOCKET means the socket layer. What is the socket layer? Are there any other options available for that parameter?
What happens if we pass the SOL_SOCKET parameter and what does the SOL stand for?
I am using UNIX.
The "socket layer" refers to the socket abstraction of the operating system. Options at that level can be set independently of the type of socket you are handling. In practice, you may only be interested in TCP/IP sockets, but there are also UDP/IP sockets, Unix domain sockets, and others. The options related to SOL_SOCKET can be applied to any of them. The list provided in the answer to the other question has some of them; the manual page for sockets has even more, under the "Socket options" section.
SOL_SOCKET is a constant for the "protocol number" associated with that level. For other protocols or levels, you can use getprotoent to obtain the protocol number from its name, or check the manual of the protocol. For example, the manual page of IP describes the constants for the protocol numbers of IP (IPPROTO_IP), TCP (IPPROTO_TCP) and UDP (IPPROTO_UDP), while the manual page of Unix sockets says that, for historical reasons, its protocol options must be set using SOL_SOCKET too. Moreover, you can find the list of supported protocols for your system in /etc/protocols. And, of course, the options supported by each of the protocols are in their manuals: IP, TCP, UDP, Unix sockets...
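As a rough illustration of the two kinds of level, the sketch below reads one socket-level option (SO_TYPE) and one TCP-level option (TCP_NODELAY). The helper names are made up; the option constants are the standard ones from the socket and TCP manual pages.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Socket-level option: valid for any kind of socket.
 * Returns SOCK_STREAM, SOCK_DGRAM, ... or -1 on error. */
int get_socket_type(int fd)
{
    int type = 0;
    socklen_t len = sizeof type;

    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) == -1)
        return -1;
    return type;
}

/* TCP-level option: only meaningful on a TCP socket.
 * Returns 1 if TCP_NODELAY is set, 0 if not, -1 on error. */
int get_tcp_nodelay(int fd)
{
    int flag = 0;
    socklen_t len = sizeof flag;

    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) == -1)
        return -1;
    return flag != 0;
}
```

The only difference between the two calls is the level argument, which tells the kernel which layer should interpret the option name.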

Why does the client bind its socket in connectionless communication but not in connection-oriented communication?

I was brushing up my socket programming knowledge and came across a doubt.
First let me explain my understanding of sockets.
Socket binding associates the socket with port.
Socket binding helps kernel to identify the process to whom it should forward the incoming packet.
In connection oriented communication socket establishment is as below
At server side
socket()-->bind()-->listen()-->accept().....
client side is
socket()-->connect-->......
My question is why the client need not bind its socket. When the client sends a request, the response has to come back to its socket, and the kernel has to forward it to the client's process. Isn't binding needed for that to happen? If not, how does the kernel know which process should receive the response packet?
Also, in connectionless communication the client calls bind on its socket. Why is it needed here?
My question is why client need not bind to a socket.
Because the kernel does the bind automatically when you call connect(), if you haven't bound the socket yourself.
Also in connectionless client call bind socket. Why is it needed here?
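Because a socket that isn't bound to an IP address:port can't send or receive anything; it has no path to the outside world. The bind does not have to be explicit, though: if a UDP socket is still unbound, the kernel binds it to an ephemeral port on the first sendto(). An explicit bind() is needed only when the client must use a particular local address or port.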
You always open a socket first. This is the path through the kernel. The connect call for say TCP happens after the socket is made.
Look at TCP versus UDP clients.
TCP
s = socket(options....)
connect(s, server_address)
send(s, data)
UDP
s = socket(options....)
sendto(s, data, server_address)
bind("0.0.0.0", 0) (all interfaces, any port) is implicit if you call connect(...) or listen(...) without an explicit bind(...).
All sockets must be bound to a local port even when connectionless so that bi-directional communication is possible (even if you're not going to do so).
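The implicit bind is easy to observe. The sketch below is only an illustration with invented helper names: it reads the local port with getsockname() and calls connect() on a UDP socket, which transmits nothing but makes the kernel choose a local address and port. Using the loopback address is an assumption made so the example needs no network.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Returns the local port of an AF_INET socket (0 while unbound),
 * or -1 on error. */
int local_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    memset(&addr, 0, sizeof addr);
    if (getsockname(fd, (struct sockaddr *)&addr, &len) == -1)
        return -1;
    return ntohs(addr.sin_port);
}

/* connect() on a UDP socket sends no packet; it records the peer and
 * triggers the implicit bind. Returns 0 on success, -1 on error. */
int udp_connect_loopback(int fd, unsigned short port)
{
    struct sockaddr_in peer;

    memset(&peer, 0, sizeof peer);
    peer.sin_family = AF_INET;
    peer.sin_port = htons(port);
    peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    return connect(fd, (struct sockaddr *)&peer, sizeof peer);
}
```

Calling local_port() on a fresh UDP socket reports 0; after udp_connect_loopback() it reports the ephemeral port the kernel picked, even though bind() was never called.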

TCP: is it possible to bind a socket and then /both/ connect from it and accept from it (both client and server roles)?

is it possible in any common platform - say, in Windows - to write a servient process that creates a socket, binds it to exactly one local "address:port" (fixed), and then:
use it to listen for incoming connections (on the specified port)
while at the same time
use it as a client socket to connect to some other servient (having source port identical to the one it exposes to others) ?
that is (sorry for the syntax abuse):
mySocket=socket(); mySocket.bind(myaddress, 3000);
mySocket.connectTo(neighbour, whateverport); // and present to others as port 3000
mySocket.listen(); // and it listens on 3000
mySocket.accept();
?
iirc, it's not even possible/advisable to try, even in the case an API wouldn't complain, but maybe it's me that is playing too much by the book... so I thought of asking you
thanks a lot!
No, a socket cannot be used for both listening and connecting at the same time. connect() will return a WSAEINVAL error if listen() was already called, and listen() will return a WSAEISCONN error if connect() was already called. You need to use separate sockets.
And if you could, there'd be all kinds of trouble cropping up. For example, if select() returns that the socket is readable, do you do a recv() or an accept()? You want 2 sockets to play those two roles.
What advantage is there in one socket? For example, if you were hoping to do a blocking read until something interesting happens (incoming connection, incoming data), there are alternatives. In that example, you'd use select() to block on two sockets at once. The result from select() tells you which socket is ready. That tells you if you want to accept() a new connection from the one socket or recv() new data from the other socket.