How to use the select() function in socket programming?

The prototype is:
int select(int nfds,
           fd_set *readfds,
           fd_set *writefds,
           fd_set *exceptfds,
           struct timeval *timeout);
I've been struggling to understand this function for quite some time. My question is: if it checks all the file descriptors from 0 to nfds-1, and modifies readfds, writefds and exceptfds on return, why do I need to use FD_SET to add file descriptors to the sets at the beginning? Won't it check all the file descriptors anyway?

It won't check from 0 to nfds-1. The first argument just provides an upper bound on how large, numerically, the file descriptors used are. This is because the set itself might be represented as a bitvector, without a way to know how many bits are actually used. Specifying this as a separate argument helps select() avoid checking file descriptors that are not in use.
Also, a descriptor that is not in e.g. the read set when you call select() is not being checked at all, so it cannot appear in the set when the call returns, either.

I once had the same doubt. You can look at the following question and its answers:
Query on Select System Call

Linux not answered to the point yet

ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
int (*open) (struct inode *, struct file *);
Where does this struct file * point to? If we pass our own pointer variables as struct file * and struct inode *, what are they pointing to? What is going on? I can find the declarations in the file_operations struct, and the definitions of the same functions in the driver program. But in the driver program, the struct file * and struct inode * arguments seem to come out of nowhere. Where do those argument variables come from? Or can we pass whatever variables we like?
If it's the latter, how does an arbitrary variable we pass serve the purpose?
I think the person who first raised this question still needs an answer to it.
Rather than making it complex, can someone explain it simply if you know?
read() and open() are userspace functions which operate on file descriptors. When a user runs an application that uses these functions, the kernel translates the call and fills in the necessary information for the driver that instantiated the file. In effect the kernel "redirects" the userspace call, invoking the driver's read() and open() with the proper parameters filled in.
I'd recommend reading about driver file operations in LDD3, Chapter 3.

"nfds" parameter in select() in sockets programming

I never really got the idea behind this parameter: what is it good for? I also noticed this parameter is ignored in WinSock2; why is that?
Do Unix systems use this parameter or do they ignore it as well?
Windows' implementation of select() uses linked lists internally, so it doesn't need to use the nfds parameter for anything.
On other OSes, however, the fd_set struct is implemented as an array of bits (one bit per descriptor). For example, here is how it is declared (in sys/_types/_fd_def.h) under MacOS/X:
typedef struct fd_set {
    __int32_t fds_bits[__DARWIN_howmany(__DARWIN_FD_SETSIZE, __DARWIN_NFDBITS)];
} fd_set;
... and in order to do the right thing, the select() call will have to loop over the bits in the array to see what they contain. By supplying select() with the nfds parameter, we tell the select() implementation that it only needs to iterate over the first (nfds) bits of the array, rather than always having to iterate over the entire array on every call. This allows select() to be more efficient than it would otherwise be.

New to socket programming, questions regarding "select()"

Currently in my degree we're starting to work with sockets.
I have a couple of questions regarding polling sockets for input,
using the select() function.
int select(int nfds,
           fd_set *readfds,
           fd_set *writefds,
           fd_set *exceptfds,
           struct timeval *timeout);
We give select an "nfds" param, which would normally be the maximum socket number we would like to monitor. How can I watch only one specific socket instead of the range of 0 to nfds_val sockets?
What are the file descriptor objects that we use? What is their purpose,
and why can't we just point select to the relevant socket structure?
I've read over the forum regarding the blocking and non-blocking modes of select, but couldn't understand the meaning or use of each, nor how to implement them; I'd be glad if someone could explain.
Last but not least (only for the time being :D ) - when binding a sockaddr_in to a socket descriptor, why does one need to cast to sockaddr * and not leave it as sockaddr_in *?
I mean, except for the fact that the bind function expects that pointer type ;)
Would appreciate some of the experts answers here :)
Thank you guys and have a great week!
We give select an "nfds" param, which would normally be the maximum socket number we would like to monitor. How can I watch only one specific socket instead of the range of 0 to nfds_val sockets?
Edit (sorry, the previous text here was wrong): just provide your socket descriptor + 1. The OS won't poll every descriptor in the [0, descriptor] range; nfds only bounds how far into the fd_set bit array select() scans, and only descriptors actually set in the fd_set are checked.
What are the file descriptors objects that we use? what is their purpose, and why can't we just point "select" to the relevant socket structure?
File descriptors are usually integer values given to the user by the OS. The OS uses descriptors to track physical and logical resources: a file descriptor means the OS has given you something file-like to control. Since Berkeley sockets have read and write operations defined, they are file-like, and socket objects are essentially plain file descriptors.
Answering why can't we just point "select" to the relevant socket structure? - we actually can. What exactly to pass to select depends on the OS and language. In C you place your socket descriptor (most probably a plain int value) into an fd_set, which is then passed to select.
Edit.
A tiny example for Linux:
fd_set set;
FD_ZERO(&set);
FD_SET(socket_fd, &set);

// check if socket_fd is ready for reading
int result = select(socket_fd + 1, &set, NULL, NULL, NULL);
if (result == -1)
    report_error(errno);
Windows has similar code.
I've read over the forum regarding the blocking and non-blocking modes of select, but couldn't understand the meaning or use of each, nor how to implement them; I'd be glad if someone could explain.
A blocking operation makes your thread wait until it is done; that describes 99% of the functions you use. If there are sockets ready for some IO, a blocking select will return immediately. If there are none, the thread will wait for them. A non-blocking select (one called with a zero timeout), in the latter case, won't wait: it returns 0 immediately to indicate that nothing is ready.
As an example, try to implement single threaded server that is capable of working with multiple clients, including long operations like file transfer happening simultaneously. You definitely don't want to use blocking socket operations in this case.
Last but not least (only for the time being :D ) - when binding a sockaddr_in to a socket descriptor, why does one need to cast to sockaddr * and not leave it as sockaddr_in *? I mean, except for the fact that the bind function expects that pointer type ;)
Probably due to historical reasons: struct sockaddr * serves as the generic "any address family" pointer type that IPv4, IPv6 and Unix-domain addresses are all cast to (the sockets API predates void * in C). And there seems to be a fine answer on SO already.

System call SELECT and TESTMASK

I'm currently studying OSes and I came across a question that I can't quite figure out.
I have this piece of pseudo-code:
for (;;) {
    mask = testmask;
    select(MAXSOCKS, &mask, 0, 0, 0);
    if (FD_ISSET(strmfd, &mask)) {
        clilen = sizeof(cliaddr);
        newfd = accept(strmfd, (struct sockaddr *)&cliaddr, &clilen);
        echo(newfd);
        close(newfd);
    }
    if (FD_ISSET(dgrmfd, &mask))
        echo(dgrmfd);
}
Note: consider MAXSOCKS to be defined as whatever (it doesn't matter here), strmfd to be a stream socket, dgrmfd a datagram socket, and clilen the size of the client address;
echo(newfd) is just a function that echoes what's in the socket.
So my question is:
what are testmask and mask for, and how are they used here?
I know that select blocks the process until some socket is available for read/write, or an exception occurs.
The simplest way to get answer to your question is just to read select(2) manual page, which says:
int select(int nfds,
           fd_set *readfds, fd_set *writefds, fd_set *exceptfds,
           struct timeval *timeout);
or a quite similar declaration. So "testmask" is an fd_set. select() takes these masks on input and refills them on output, unless an error is signaled.
In all Unix systems I've seen, fd_set is a bit array of up to FD_SETSIZE bits, and it can be hard to extend it to a larger value, so a descriptor greater than or equal to FD_SETSIZE can't be select()ed without target-specific hacking. poll() is therefore preferred for a large descriptor set. (And platform-specific mechanisms such as kqueue (*BSD, MacOSX), epoll (Linux) and /dev/poll (Solaris) are better still when only a relatively small subset of all descriptors is ready on each cycle; for Linux, the break-even point at which poll() becomes more efficient than an epoll-based scheme was around 30% of descriptors ready per cycle, measured on systems ~3 years ago.)
In Windows systems, fd_set is an array of socket numbers (integers) with unlimited values but a limited total count (64 by default, AFAIR).
Using only the standard macros to deal with fd_set gives code that is portable between Unix and Windows, though it can be inefficient on both, in different ways.
Actually the proper way to copy the select mask is to use the FD_COPY macro.
I have just posted an article about the difference between select, poll and epoll here: http://www.ulduzsoft.com/2014/01/select-poll-epoll-practical-difference-for-system-architects/ - I suggest you take a look at it, as it covers many of the things you asked, and there's a chance it covers something you missed.

the function select() for multi client socket programming

I have a simple question about select(), the I/O multiplexing function for socket programming.
When the select function executes, it's said that it modifies the fd set it examines,
so we need to reset the set again every time
(fd_set read_fds, for example).
But why is that?
Why does the select function clear the uninteresting file descriptors from its fd sets?
What changes does the select function make to the original fd set?
Thanks.
Everything I found, in books or elsewhere on the web, says
'we need to' reset it on every loop iteration, but doesn't explain why.
Why does the select function clears the uninteresting file descriptors
on its fd sets?
Because after select() returns, you're (presumably) going to want to query (via FD_ISSET()) which socket(s) are now ready-for-read (or ready-for-write).
So when select() returns, it modifies the fd_set objects so that the only bits still set inside them are the bits representing sockets that are now ready. If it did not do that, FD_ISSET() would have no way to know which sockets are ready for use and which are not.