Why use htons() to specify protocol when creating packet socket?

To create a packet socket, the following socket() call is used (the socket type and protocol may differ):
socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL))
And to create a stream socket, the following call is used:
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)
My question is: why use htons() to specify the protocol when creating a packet socket, but not when creating a socket of the AF_INET or AF_INET6 family? Why not use
socket(AF_INET, SOCK_XXX, htons(IPPROTO_XXX))
to create a stream or datagram socket, as is done for a packet socket, or vice versa? What is different about how the protocol is used in the two socket() calls, given that both create sockets, one a packet socket and the other a socket at the TCP level?

First, like most other network parameters that are passed to the kernel (IP addresses, ports, etc), the parameters are passed in their "on-the-wire" format so that underlying software doesn't need to manipulate them before comparing/copying/transmitting/etc. (For comparison, consider that AF_PACKET and SOCK_RAW are parameters to the kernel itself -- hence "native format" is appropriate -- while the ETH_P_xxx value is generally for "comparison with incoming packets"; it just so happens that ETH_P_ALL is a special signal value saying 'capture everything'.)
Second, interpretation of the protocol is potentially different by address family. A different address family could choose to interpret the protocol in whatever form made sense for it. It just so happens that Ethernet and IP have always used big-endian (and were important/ubiquitous enough that big-endian came to be called network order).
Third, the protocol number in the AF_INET world (i.e. Internet Protocol) only occupies a single byte so it doesn't make sense to specify a byte-ordering.
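To make the byte-order point concrete, here is a minimal Python sketch (Python's socket module exposes the same htons() conversion). The helper name to_wire_order is made up for illustration, and ETH_P_ALL is hardcoded to the value from linux/if_ether.h rather than taken from a library constant.

```python
import socket
import sys

ETH_P_ALL = 0x0003  # value from linux/if_ether.h; hardcoded here as an assumption


def to_wire_order(value):
    """Hypothetical helper: return a 16-bit value in network (big-endian) order."""
    return socket.htons(value)


wire = to_wire_order(ETH_P_ALL)

# On a little-endian host the two bytes are swapped; on a big-endian host
# htons() leaves the value unchanged.
expected = 0x0300 if sys.byteorder == "little" else 0x0003
print(hex(wire), wire == expected)

# IP protocol numbers fit in a single byte, so there is nothing to swap.
print(int(socket.IPPROTO_TCP), socket.IPPROTO_TCP < 256)
```

The EtherType is a 16-bit field, so its byte order matters; the IP protocol number is an 8-bit field, so it has no byte order to convert.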

Related

Does linux socket "send" function do a hton conversion?

When using a TCP/IP socket, e.g. socket(AF_INET, SOCK_STREAM, 0) (TCP in this case), do the calls to send and recv perform a byte-order conversion automatically?
At the TCP level, byte ordering only applies to the IPs and ports in the TCP/IP headers, which are established when connect()/accept() are called. When working with instances of sockaddr_in... structs, the user is responsible for handling byte conversions to/from network byte order as needed.
send()/recv() simply deal with a socket handle and a raw byte array, so there are no byte order issues when calling them. However, if the byte array has data that contains multi-byte integers in it, those have to be handled separately by the user as needed.
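As a rough illustration of the point about payload data, the Python sketch below sends a 32-bit integer over a local socket pair. The helpers send_u32, recv_u32 and recv_exact are invented for this example; the conversion is done explicitly with struct.pack("!I", ...), where "!" means network byte order, because send()/recv() only move bytes.

```python
import socket
import struct


def recv_exact(sock, count):
    """Hypothetical helper: read exactly `count` bytes or raise."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("peer closed before all bytes arrived")
        data += chunk
    return data


def send_u32(sock, value):
    # The application converts to network byte order; send() will not.
    sock.sendall(struct.pack("!I", value))


def recv_u32(sock):
    # The application converts back from network byte order; recv() will not.
    return struct.unpack("!I", recv_exact(sock, 4))[0]


a, b = socket.socketpair()
send_u32(a, 0x01020304)
result = recv_u32(b)

# Raw bytes pass through untouched, in exactly the order they were sent.
a.sendall(b"\x01\x02\x03\x04")
raw = recv_exact(b, 4)

a.close()
b.close()
print(hex(result), raw)
```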

Explain line "s = socket(res->ai_family, res->ai_socktype, res->ai_protocol)"

int s;
struct addrinfo hints, *res;
memset(&hints, 0, sizeof hints); /* hints must be initialized before the lookup */
getaddrinfo("www.example.com", "http", &hints, &res);
s = socket(res->ai_family, res->ai_socktype, res->ai_protocol);
Please explain the last line of code
A socket is a very generic means of communication.
It can handle communication between local processes, between your process and some internal parts of your system's kernel (events...), over the network...
Even for network communication, there exist many protocol families and many protocols.
That's why, when we create a socket (with the socket() call on your last line), we have to provide several parameters in order to select the right properties of the required socket.
man 2 socket mainly explains the first parameter (domain or protocol family) but the other parameters are explained in subsequent pages since they depend on the choice made with this first parameter.
Note that once a socket is obtained with the socket() call, you may need to provide many other settings by other system calls, depending on your intention (bind() for a server, connect() for a client... many settings exist).
In your example, it seems that you want to reach an HTTP server named www.example.com.
You could have hardcoded the fact that such a server can be reached with the AF_INET protocol family (for IPv4, or AF_INET6 for IPv6) through a TCP connection (type SOCK_STREAM, protocol 0), but the getaddrinfo() function can provide all these details, plus some others to be used in subsequent system calls (for example, the IP address and port number to be specified in a subsequent connect() call).
All this information stands in the members of the returned struct addrinfo.
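The same flow can be sketched in Python, where getaddrinfo() returns tuples instead of a struct addrinfo list. The helper name open_socket_for is made up, and the numeric address 127.0.0.1 stands in for www.example.com so that the example needs no DNS lookup; both are assumptions for illustration.

```python
import socket


def open_socket_for(host, service):
    """Hypothetical helper: create (but do not connect) a socket that
    matches the first getaddrinfo() result."""
    results = socket.getaddrinfo(host, service, type=socket.SOCK_STREAM)
    # Each result carries the same fields as struct addrinfo:
    # ai_family, ai_socktype, ai_protocol, ai_canonname, ai_addr.
    family, socktype, proto, _canonname, sockaddr = results[0]
    return socket.socket(family, socktype, proto), sockaddr


s, sockaddr = open_socket_for("127.0.0.1", 80)
info = (s.family, s.type, sockaddr)
s.close()
print(info)
```

The returned sockaddr is what would be passed to a subsequent connect() call.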

What is the goal of using GGP(0x0003) as a protocol parameter in socket()

I have started writing a packet sniffer, and I searched for the correct parameters to pass to socket() in order to capture packets with their Ethernet header.
I noticed that in this tutorial, in order to receive the Ethernet header, they changed this line:
s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_TCP)
To this line:
s = socket.socket( socket.AF_PACKET , socket.SOCK_RAW , socket.ntohs(0x0003))
And my questions are:
I understood from this link that AF_INET with a raw socket won't give me the Ethernet header. Why is that?
Why did he also change IPPROTO_TCP to ntohs(0x0003), which I know as the GGP protocol? As far as I understand, the third parameter states the protocol that the socket will receive. If the protocol parameter is GGP, then the socket will look for packets that have GGP as their internet layer protocol, won't it? Then why pass GGP and not TCP or IP? After all, almost every PDU has IP and/or TCP/UDP as its data protocols. Does the third parameter matter for my packet sniffer?
Following on from the second question, I don't think I understand the purpose of the third parameter. If it is IPPROTO_TCP, will the socket capture only packets with TCP in the network layer (and not UDP, for example)? And if I pass IPPROTO_IP, will the socket capture packets with IP as their internet layer protocol, without checking the other layers' protocols (that is, the socket doesn't care which protocol is used for the network layer, only that IP is the internet layer protocol)?
Thanks, and sorry for the grammar mistakes (English isn't my first language).
If you check linux/if_ether.h you will see
#define ETH_P_ALL 0x0003 /* Every packet (be careful!!!) */
So the value of ETH_P_ALL is 0x0003. The authors of this tutorial use 0x0003 instead of ETH_P_ALL because in some systems when used in python a "not defined" error occurs.
The raw socket feature can be set up at different layers of the network stack, in order to allow the kernel to perform some of the work for you at lower levels (e.g. crafting the Ethernet frame).
The change to the GGP protocol might make sense on the website where you found the example, but it is ugly to do so; getprotoent() should be used rather than magic numbers.
Yes, you can tweak (filter) how the packet capture happens. If you want to capture all packets, use ETH_P_ALL:
When protocol is set to htons(ETH_P_ALL) then all protocols are received.
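For illustration, here is a minimal sketch of how this might look in a Python sniffer. The names open_sniffer and parse_ethernet_header are invented for this example, and ETH_P_ALL is defined by hand from linux/if_ether.h. Opening the socket needs Linux and root (or CAP_NET_RAW), so the sketch only defines that function and exercises the header parsing on a hand-built sample frame.

```python
import socket
import struct

ETH_P_ALL = 0x0003  # from linux/if_ether.h: "every packet", not GGP in this context
ETH_HEADER = struct.Struct("!6s6sH")  # dst MAC, src MAC, EtherType (network order)


def open_sniffer():
    """Hypothetical helper. Linux only, requires root or CAP_NET_RAW.
    The protocol is passed in network byte order, hence htons()."""
    return socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.htons(ETH_P_ALL))


def format_mac(raw):
    return ":".join("{:02x}".format(byte) for byte in raw)


def parse_ethernet_header(frame):
    """Split the first 14 bytes of a frame into (dst, src, ethertype)."""
    dst, src, ethertype = ETH_HEADER.unpack_from(frame)
    return format_mac(dst), format_mac(src), ethertype


# Hand-built sample frame: broadcast destination, made-up source, EtherType 0x0800 (IPv4).
sample = bytes.fromhex("ffffffffffff" "001122334455" "0800") + b"payload"
header = parse_ethernet_header(sample)
print(header)
```

With a real socket, each frame returned by open_sniffer().recvfrom(65535)[0] would start with this 14-byte header, which an AF_INET raw socket never delivers.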

How do I determine whether open socket is TCP or unix domain socket?

My code is passed an open socket. This socket could be either a TCP socket (AF_INET) or a Unix Domain Socket (AF_UNIX).
Depending on the domain of the socket, it will need to be handled differently. In particular, if the socket is bound to an address, then I might want to accept incoming connections in a different way.
What is the best way to determine whether the socket I have been passed is a unix domain socket or a TCP socket? The solution would need to work on OS X and Linux at least.
getsockopt appears to allow getting the type of the socket (e.g. SOCK_STREAM etc) but not the domain.
getsockname will return a zero length for unix domain sockets on OSX, but this is officially a bug and the Linux behaviour is different.
The first member of the struct sockaddr returned by getsockname is sa_family; just test that against the symbolic constants. The bug on OSX lets you assume the unix domain when the returned address structure is zero bytes; for other platforms and domains, just check the returned structure.
getsockname() is the only cross-platform socket API to query a socket for its locally bound address, and thus its address family.
On Windows, at least, you can use getsockopt(SOL_SOCKET, SO_PROTOCOL_INFO) to retrieve a WSAPROTOCOL_INFO struct, which has an iAddressFamily field. Maybe there are similar platform-specific APIs on other OSes.
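A Python analogue of the getsockname approach, as a sketch only: when a socket object is built from an existing descriptor with fileno= and no family is given, Python 3.7+ detects the family from the descriptor. The helper name socket_family is made up, and the example assumes a Unix-like system where AF_UNIX is available; it has only been reasoned through for Linux.

```python
import os
import socket


def socket_family(fd):
    """Hypothetical helper: report the address family of an open socket fd.
    A duplicate descriptor is wrapped so that closing the probe object
    does not close the caller's socket."""
    probe = socket.socket(fileno=os.dup(fd))
    try:
        return probe.family
    finally:
        probe.close()


tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp.bind(("127.0.0.1", 0))
left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

tcp_family = socket_family(tcp.fileno())
unix_family = socket_family(left.fileno())

for s in (tcp, left, right):
    s.close()
print(tcp_family, unix_family)
```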

what is the protocol parameter in winsock's socket function for?

The winsock function socket expects as its third parameter the protocol, which is usually IPPROTO_TCP for socket type SOCK_STREAM and IPPROTO_UDP for socket type SOCK_DGRAM. When I pass 0 as the protocol parameter, TCP and UDP work as expected.
SOCKET s = socket(AF_INET, SOCK_DGRAM, 0);
// s is a valid socket
What is the IPPROTO_IP protocol parameter value meant for? If it's only intended to be used with SOCK_RAW, why is there this kind of redundancy?
socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
What does the protocol parameter actually specify? Since I can just use another value, it looks as if it's unimportant.
I want to send UDP packets (including broadcasts) from a PC with more than one network card to a specific Ethernet segment. While IP routing normally selects the network card (and source address), I would like to specify the adapter(s), and I am thinking about raw sockets or some other means to achieve this. Perhaps IPPROTO_IP may help in this case.
I think the documentation for socket (which can be found here: http://msdn.microsoft.com/en-us/library/ms740506(VS.85).aspx) is pretty clear on what the value is for and why passing 0 is fine if you don't care.
A situation where you might want to pass something different is if you wanted to set up a socket for an unusual connection type, such as Bluetooth, or if you wanted to create a PGM reliable multicast socket, etc.
Your second question is unrelated to raw sockets or the protocol parameters. What you need to do is simply bind your socket to the address of the local interface that you want to use; so rather than binding to INADDR_ANY and allowing the stack to decide for you, you tell it which interface to use.
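The bind-to-a-local-address idea can be sketched in Python as follows. The helper name udp_socket_bound_to is made up, and the loopback address 127.0.0.1 stands in for the address of whichever adapter you want to use, so that the example runs on any machine.

```python
import socket


def udp_socket_bound_to(local_ip):
    """Hypothetical helper: create a UDP socket (protocol 0 lets the stack
    pick UDP for SOCK_DGRAM) bound to one local interface address."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, 0)
    s.bind((local_ip, 0))  # port 0: let the OS choose a free port
    return s


receiver = udp_socket_bound_to("127.0.0.1")
sender = udp_socket_bound_to("127.0.0.1")  # replace with the chosen adapter's address
receiver.settimeout(2.0)

sender.sendto(b"hello", receiver.getsockname())
data, source = receiver.recvfrom(1024)
sender_addr = sender.getsockname()

sender.close()
receiver.close()
print(data, source)
```

The datagram arrives with the bound address as its source, rather than one picked by the stack as with INADDR_ANY. For broadcasts, the SO_BROADCAST option would also have to be enabled, which this sketch leaves out.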