System call SELECT and TESTMASK - sockets

I'm currently studying operating systems and I came across a question that I can't quite figure out.
I have this piece of pseudo-code:
for (;;) {
    mask = testmask;                      /* restore the full set; select() overwrites it */
    select(MAXSOCKS, &mask, 0, 0, 0);     /* block until a socket is readable */
    if (FD_ISSET(strmfd, &mask)) {        /* connection pending on the stream socket */
        clilen = sizeof(cliaddr);
        newfd = accept(strmfd, (struct sockaddr *)&cliaddr, &clilen);
        echo(newfd);
        close(newfd);
    }
    if (FD_ISSET(dgrmfd, &mask))          /* datagram waiting */
        echo(dgrmfd);
}
Note: consider MAXSOCKS to be defined as whatever (it doesn't matter here), strmfd to be a stream socket, dgrmfd a datagram socket, and clilen the size of the client address;
echo(newfd) is just a function that echoes whatever is in the socket.
So my question is:
What are testmask and mask for, and how are they used here?
I know that select blocks the process until some socket is ready for reading or writing, or has an exceptional condition pending.

The simplest way to get an answer to your question is just to read the select(2) manual page, which says:
int select(int nfds,
fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
struct timeval *timeout);
or a quite similar declaration. So "testmask" is an fd_set. select() takes these masks as input and refills them on output, leaving only the descriptors that are ready, unless an error is signaled.
In all Unix systems I've seen, fd_set is a bit array of up to FD_SETSIZE bits, and it can be hard to extend it to a larger value, so a descriptor greater than or equal to FD_SETSIZE can't be select()ed without target-specific hacking. That is why poll() is preferred for a large descriptor set. (And platform-specific mechanisms such as kqueue (*BSD, Mac OS X), epoll (Linux) and /dev/poll (Solaris) are better when only a relatively small subset of all descriptors is ready on each cycle; for Linux, the crossover point where poll() is still more efficient than an epoll-based scheme was around 30% of descriptors ready per cycle, measured on systems about three years ago.)
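For comparison, a minimal poll()-based wait on a single socket looks like this (a sketch; sockfd stands for any already-open socket descriptor):
#include <poll.h>

struct pollfd pfd;
pfd.fd = sockfd;             /* no FD_SETSIZE limit on the descriptor value */
pfd.events = POLLIN;         /* interested in readability */

int n = poll(&pfd, 1, -1);   /* timeout of -1 means wait indefinitely */
if (n > 0 && (pfd.revents & POLLIN)) {
    /* sockfd has data to read */
}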
On Windows, fd_set is an array of socket handles (integers) with no limit on their values, but with a limited total count (AFAIR, 64 by default).
Using only the standard macros to deal with fd_set gives code that is portable between Unix and Windows, but that code can be inefficient on each platform in its own way.
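To make the testmask/mask idea concrete, here is a minimal sketch of that pattern (names mirror the question's pseudo-code; error handling omitted): the program keeps a master set (testmask) that never changes, and hands select() a working copy (mask) on every iteration, because select() overwrites its argument and leaves only the ready descriptors set:
fd_set testmask, mask;

FD_ZERO(&testmask);
FD_SET(strmfd, &testmask);       /* descriptors we always want to watch */
FD_SET(dgrmfd, &testmask);

for (;;) {
    mask = testmask;             /* fresh working copy for this iteration */
    if (select(MAXSOCKS, &mask, 0, 0, 0) < 0)
        break;                   /* error handling omitted in this sketch */
    /* now FD_ISSET(fd, &mask) is true only for descriptors that are ready */
}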

Actually, the proper way to copy a select mask is to use the FD_COPY macro (where it is available).
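For instance (a sketch; FD_COPY is a BSD/Mac OS X extension rather than POSIX, so plain structure assignment is the portable fallback):
#ifdef FD_COPY
    FD_COPY(&testmask, &mask);   /* copy the master set into the working set */
#else
    mask = testmask;             /* fd_set is a plain struct; assignment copies it */
#endif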
I have just posted an article about the difference between select, poll and epoll here: http://www.ulduzsoft.com/2014/01/select-poll-epoll-practical-difference-for-system-architects/ - I suggest you take a look at it, as it covers many of the things you asked about, and there's a chance it covers something you missed.

Related

New to socket programming, questions regarding "select()"

We're currently starting to work with sockets in my degree programme.
I have a couple of questions regarding polling sockets for input
using the select() function.
int select( int nfds,
fd_set *readfds,
fd_set *writefds,
fd_set *exceptfds,
struct timeval *timeout);
We give select "nfds" param, which would normally would
be the maximum sockets number we would like to monitor. How can i watch only one specific socket instead of the range of 0 to nfds_val sockets ?
What are the file descriptor objects that we use? What is their purpose,
and why can't we just point "select" to the relevant socket structure?
I've read around the forum regarding the blocking and non-blocking modes of select, but I couldn't understand the meaning or use of each, nor how to implement them; I'd be glad if someone could explain.
Last but not least (only for the time being :D) - when binding a sockaddr_in to a socket number, why does one need to cast it to sockaddr * rather than leave it as sockaddr_in *?
I mean, apart from the fact that the bind function expects that pointer type ;)
Would appreciate some of the experts' answers here :)
Thank you guys and have a great week!
We give select "nfds" param, which would normally would be the maximum sockets number we would like to monitor. How can i watch only one specific socket instead of the range of 0 to nfds_val sockets ?
Edit. (Sorry, the previous text here was wrong.) Just pass your socket descriptor + 1. I'm pretty sure that doesn't mean the OS will check all the descriptors in the [0, 1, ... descriptor] range; only the descriptors actually set in the fd_set matter.
What are the file descriptor objects that we use? What is their purpose, and why can't we just point "select" to the relevant socket structure?
File descriptors are usually small integer values given to the user by the OS. The OS uses descriptors to track physical and logical resources: one file descriptor means the OS has given you something file-like to control. Since Berkeley sockets have read and write operations defined, they are file-like, and socket objects are essentially plain file descriptors.
Answering "why can't we just point select to the relevant socket structure?": we actually can. What exactly to pass to select depends on the OS and the language. In C you place your socket descriptor (most probably a plain int value) into an fd_set, and the fd_set is then passed to select.
Edit.
A tiny example for Linux:
#include <sys/select.h>
#include <errno.h>

fd_set set;
int result;

FD_ZERO(&set);                 /* start with an empty set */
FD_SET(socket_fd, &set);       /* watch socket_fd for readability */

/* wait until socket_fd is ready for reading (NULL timeout means block) */
result = select(socket_fd + 1, &set, NULL, NULL, NULL);
if (result == -1) report_error(errno);
Docs.
Windows has similar code.
I've read around the forum regarding the blocking and non-blocking modes of select, but I couldn't understand the meaning or use of each, nor how to implement them; I'd be glad if someone could explain.
A blocking operation makes your thread wait until it is done; that describes 99% of the functions you use. If there are sockets ready for some IO, a blocking select will return immediately. If there are no such sockets, the thread will wait for one. A non-blocking select (one called with a zero timeout) won't wait in the latter case; it returns immediately with a return value of 0, meaning no descriptors are ready.
As an example, try to implement a single-threaded server that can work with multiple clients at once, including long operations such as file transfers happening simultaneously. You definitely don't want to use blocking socket operations in that case.
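Here is a small sketch of the two calls side by side (illustrative only; fds is assumed to be an fd_set already filled in with FD_SET, and maxfd the highest descriptor placed in it; in real code, refill fds before each call, since select() modifies it):
/* Variant 1 - blocking: a NULL timeout makes select() sleep
   until at least one descriptor in fds becomes ready */
int ready = select(maxfd + 1, &fds, NULL, NULL, NULL);

/* Variant 2 - polling: a zero timeout makes select() return at once;
   a return value of 0 means nothing was ready at this instant */
struct timeval tv = { 0, 0 };
int ready_now = select(maxfd + 1, &fds, NULL, NULL, &tv);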
Last but not least (only for the time being :D) - when binding a sockaddr_in to a socket number, why does one need to cast it to sockaddr * rather than leave it as sockaddr_in *? I mean, apart from the fact that the bind function expects that pointer type ;)
Probably due to historical reasons, but I'm not sure. And there seems to be a fine answer on SO already.
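For completeness, the cast in question looks like this (a sketch with illustrative values; sockfd is assumed to be a freshly created TCP socket). bind() is declared to take the generic struct sockaddr *, so every concrete address type such as sockaddr_in gets cast to it:
#include <arpa/inet.h>    /* htons, htonl */
#include <netinet/in.h>   /* struct sockaddr_in, INADDR_ANY */
#include <stdio.h>        /* perror */
#include <string.h>       /* memset */
#include <sys/socket.h>   /* bind, struct sockaddr */

struct sockaddr_in addr;
memset(&addr, 0, sizeof(addr));
addr.sin_family = AF_INET;
addr.sin_port = htons(8080);               /* illustrative port number */
addr.sin_addr.s_addr = htonl(INADDR_ANY);

/* bind() only knows about the generic sockaddr type, hence the cast */
if (bind(sockfd, (struct sockaddr *)&addr, sizeof(addr)) == -1)
    perror("bind");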

C socket programming recv size

I am a newbie in socket programming (in C), so maybe this question is a little bit stupid. In C socket programming, how should I determine the size of the buffer for recv()/read()? In many cases we don't know the size of the data sent using send()/write(). Thanks a lot!
how should I determine the size of the buffer for recv()/read()?
Ideally one shouldn't look at these buffer sizes at all and should keep to the classic TCP model: keep reading bytes while bytes are available.
If you are asking this question for things like "how big should the buffer I receive into be?", the simple answer is to pick a size and just pass that. If there's more data, you can read again.
Back to your original question, different stacks give you different APIs. For example on some Unixes you have things like SIOCINQ and FIONREAD. These give you the amount of data the kernel has in its receive buffer, waiting for you to copy it out.
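As an illustration, the FIONREAD query mentioned above might look like this (a sketch; sockfd is any connected socket, and the value only reflects what has already arrived, not what the peer will eventually send):
#include <sys/ioctl.h>

int pending = 0;
if (ioctl(sockfd, FIONREAD, &pending) == 0) {
    /* 'pending' bytes are sitting in the kernel's receive buffer right now */
}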
If you don't really know how many bytes to expect, use a large buffer and pass a large buffer size to recv/read. These functions return how many bytes were put into the buffer. Then you can deal with that data, for example by printing it.
But keep in mind that data is often either sent in chunks of known size or prefixed with a message size in the first bytes, so the receiving side can tell how many bytes it should read.
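As a sketch of that second approach, assume a hypothetical framing of a 4-byte length prefix in network byte order followed by the payload; the helper below keeps calling recv() because a single call may return fewer bytes than requested:
#include <arpa/inet.h>     /* ntohl */
#include <stdint.h>
#include <sys/socket.h>    /* recv */

/* read exactly 'len' bytes, looping over short reads */
static int recv_all(int fd, void *buf, size_t len) {
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(fd, (char *)buf + got, len - got, 0);
        if (n <= 0)
            return -1;     /* error, or the peer closed the connection */
        got += (size_t)n;
    }
    return 0;
}

/* usage, inside your receive path:
 *     uint32_t netlen;
 *     if (recv_all(sockfd, &netlen, sizeof(netlen)) == 0) {
 *         uint32_t msglen = ntohl(netlen);
 *         ... allocate msglen bytes and call recv_all() again for the payload ...
 *     }
 */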

Anyone had any experience with *.pcap manipulation libs?

I'm using the SharpPcap + PacketDotNet libraries to process some .pcap files and came across a bug in the way the timestamps are calculated.
Take this Timeval property, which is something along these lines:
PosixTimeval Timeval
{
DateTime Date;
ulong Seconds;
ulong MicroSeconds;
}
The problem is as follows: suppose you have a trace open in Wireshark in which one of the packets has a timestamp of "0.002". Once you open it within one of your programs, the library retrieves the packet and its Timeval is set up such that Seconds = 0 and MicroSeconds = 002 = 2. This is done under the hood, so there is no way to avoid it as far as I can tell.
My question is whether that problem is common to other libraries that manipulate the pcap file format (and maybe all of them?), which I think are built around the same collection of C/C++ functions, or whether it is a problem only with the ones I'm using.
I'm the author of sharppcap/packet.net.
What is the bug that you are seeing with the timestamp values? The conversion you mentioned seems correct. 0.002 seconds is 2 milliseconds.
The timestamp values should be the full unix timeval when the packet was captured. Certainly a timeval of 0.002 doesn't make sense as an absolute time, only a relative one.
I'll add a unit test to sharppcap to validate the packet timeval if there isn't one already.

Using kqueue to poll for exceptional conditions

I'm modifying an app in order to replace its use of select() with kqueue. select() allows one to poll for exceptional conditions:
int select(int nfds,
fd_set *restrict readfds,
fd_set *restrict writefds,
fd_set *restrict errorfds, <---- this thing here
struct timeval *restrict timeout
);
After reading the kqueue documentation it looks like there's no way to do that. There's EVFILT_READ and EVFILT_WRITE but nothing along the lines of EVFILT_ERROR/EVFILT_EXCEPTIONAL. Is it possible to poll for exceptional conditions, and if so, how?
There is no general notion of an "exceptional state" on FreeBSD. To quote man 2 select:
The only exceptional condition detectable is out-of-band data received on a socket.
So your question boils down to “How can I detect OOB-data on a socket with kqueue”, which I, to be honest, cannot answer without doing some research.
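For reference, the read-readiness half of such a port (the part that replaces FD_SET plus select) looks roughly like this with kqueue; this is a sketch of the generic pattern only, not of OOB detection itself, which would need the OOB-related flags described in the platform's kevent documentation:
#include <sys/event.h>
#include <sys/time.h>
#include <sys/types.h>

int kq = kqueue();

/* register interest in read readiness on sockfd */
struct kevent change;
EV_SET(&change, sockfd, EVFILT_READ, EV_ADD, 0, 0, NULL);

/* wait for one event; a NULL timeout blocks, like select() with no timeout */
struct kevent event;
int n = kevent(kq, &change, 1, &event, 1, NULL);
if (n > 0 && event.filter == EVFILT_READ) {
    /* sockfd is readable; event.data holds the number of bytes available */
}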

How to use the select() function in socket programming?

The prototype is:
int select (int nfds,
fd_set *read-fds,
fd_set *write-fds,
fd_set *except-fds,
struct timeval *timeout);
I've been struggling to understand this function for quite some time. My question is: if it checks all the file descriptors from 0 to nfds-1, and modifies read-fds, write-fds and except-fds on return, why do I need to use FD_SET to add file descriptors to the set at the beginning? It will check all the file descriptors anyway, or not?
It won't check from 0 to nfds-1. The first argument just provides an upper bound on how large, numerically, the file descriptors used are. This is because the set itself might be represented as a bitvector, without a way to know how many bits are actually used. Specifying this as a separate argument helps select() avoid checking file descriptors that are not in use.
Also, a descriptor that is not in e.g. the read set when you call select() is not being checked at all, so it cannot appear in the set when the call returns, either.
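To make that concrete, here is a minimal sketch with two hypothetical descriptors, sock_a and sock_b; only descriptors placed in the set with FD_SET are examined, and nfds is derived from the highest of them:
fd_set readfds;
FD_ZERO(&readfds);                /* start with an empty set */
FD_SET(sock_a, &readfds);         /* only these two descriptors will be checked */
FD_SET(sock_b, &readfds);

int nfds = (sock_a > sock_b ? sock_a : sock_b) + 1;    /* highest fd + 1 */

int ready = select(nfds, &readfds, NULL, NULL, NULL);
/* on return, FD_ISSET(sock_a, &readfds) tells you whether sock_a is readable */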
I once had the same doubt as you. You can look at the following question and its answers:
Query on Select System Call