potential for file id collision in C when doing pthread network io - sockets

I have an app in C that listens on a port, creates a pthread for each connection, and goes back to listening. The thread function reads from the socket, writes a response, waits 1/10th of a second, then calls shutdown() and close() followed by pthread_exit(). This can happen very rapidly, possibly resulting in hundreds of threads at the same time. My question is: can the system reuse a file id before I do the final close()? I'm concerned about the possibility of the socket closing prematurely for some reason. On the listening side the file id cannot be reused until I do the close() call even if the underlying connection is long gone, right? I'm fairly sure that this is how it works but I can't confirm.

On the listening side the file id cannot be reused until I do the
close() call even if the underlying connection is long gone, right?
Yes, this is correct - the file descriptor is not released for re-use until it has been passed to close() (or is an FD_CLOEXEC file descriptor being closed automatically at execve()).
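A small sketch of how this could be checked experimentally. The helper name fd_reuse_demo is made up for illustration, and the observation that the third socket gets the first one's number relies on the usual "lowest free descriptor" allocation, which holds here only because no other thread opens or closes descriptors in between:

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open two sockets, close the first, open a third.
 * The descriptor numbers are reported through the out parameters.
 * Returns 0 on success, -1 if any socket() call fails. */
int fd_reuse_demo(int *first, int *second, int *third)
{
    *first = socket(AF_INET, SOCK_STREAM, 0);
    if (*first < 0)
        return -1;

    /* *first is still open, so the kernel cannot hand out its number again. */
    *second = socket(AF_INET, SOCK_STREAM, 0);
    if (*second < 0) {
        close(*first);
        return -1;
    }

    /* Only now does the number held by *first become available for reuse. */
    close(*first);

    *third = socket(AF_INET, SOCK_STREAM, 0);
    if (*third < 0) {
        close(*second);
        return -1;
    }

    close(*second);
    close(*third);
    return 0;
}
```

In a multithreaded server the practical consequence is the reverse hazard: once one thread has called close(), any other thread still holding the old number may end up operating on a different connection that was given the same number.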

All the threads try to enter the critical region at the same time. If you don't use a semaphore, mutex, or monitor, they will probably end up using the same id, and even the data you read from the byte stream may be corrupted. I advise you to use a semaphore, mutex, or monitor, and to read up on the dining philosophers problem, because this is a very common situation. Good luck; I hope this gives you a clue about your problem.

Related

What's the read logic when I call recvfrom() function in C/C++

I wrote a C++ program that creates a socket and binds it to receive ICMP/UDP packets. The code I wrote is as follows:
while(true){
recvfrom(sockId, rePack, sizeof(rePack), 0, (struct sockaddr *)&raddr, (socklen_t *)&len);
processPakcet(recv_size);
}
So I used an endless while loop to receive messages continually, but I'm worried about the following two questions:
1. How long will a message be kept in the receive queue (or in the NIC queue)?
I'm worried that if it takes too long to process the first message, I might miss the second one. So how soon do I need to read again after the previous read?
2. How do I prevent reading duplicate messages?
That is, does the receive queue know about me? When my thread has finished reading the first message, will the queue automatically give me the second one? Or, put differently, once I have read the first message, is it deleted from the queue so that no one can receive it again?
Additionally, I think the while(true) approach is not good; could anyone give me a better suggestion, please? (I heard something about a polling model.)
First, you should always check the return value from recvfrom. It's unlikely the recvfrom will fail, but if it does (for example, if you later implement signal handling, it might fail with EINTR) you will be processing undefined data. Also, of course, the return value tells you the size of the packet you received.
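As a minimal sketch of that advice (the wrapper name recv_datagram is hypothetical, not part of the original program), the call can be wrapped so that EINTR is retried and every other failure is reported to the caller:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical wrapper: receive one datagram, retrying if a signal
 * interrupts the call. Returns the datagram size (which may be 0), or -1
 * with errno set on any failure other than EINTR. `from` and `fromlen`
 * may both be NULL if the sender address is not needed. */
ssize_t recv_datagram(int sock, void *buf, size_t cap,
                      struct sockaddr *from, socklen_t *fromlen)
{
    /* fromlen is a value-result argument, so remember the caller's capacity
     * and restore it before every retry. */
    socklen_t addr_cap = fromlen ? *fromlen : 0;

    for (;;) {
        if (fromlen)
            *fromlen = addr_cap;
        ssize_t n = recvfrom(sock, buf, cap, 0, from, fromlen);
        if (n >= 0)
            return n;          /* size of the packet actually received */
        if (errno != EINTR)
            return -1;         /* real error: do not touch buf */
    }
}
```

The loop in the question would then call the processing function only when the returned size is not negative, and pass that size along.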
For question 1, the actual answer is operating system-dependent. However, most operating systems will buffer some number of packets for you. The OS interrupt handler that handles the incoming packet will never be copying it directly into your application level buffer, so it will always go into an OS buffer first. The OS has previously noted your interest in it (by virtue of creating the socket and binding it you expressed interest), so it will then place a pointer to the buffer onto a queue associated with your socket.
A different part of the OS code will then (after the interrupt handler has completed) copy the data from the OS buffer into your application memory, free the OS buffer, and return to your program from the recvfrom system call. If additional packets come in, either before or after you have started processing the first one, they'll be placed on the queue too.
That queue is not infinite of course. It's likely that you can configure how many packets (or how much buffer space) can be reserved, either at a system-wide level (think sysctl-type settings in linux), or at the individual socket level (setsockopt / ioctl).
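For the per-socket case, a sketch using the standard SO_RCVBUF option might look like the following. The helper name set_recv_buffer is invented here; the kernel is free to round, double, or cap the requested value (on Linux the system-wide limit is net.core.rmem_max), which is why the sketch reads the value back:

```c
#include <sys/socket.h>

/* Hypothetical helper: request a receive buffer of `bytes` for `sock` and
 * return the size the kernel reports afterwards, or -1 on failure. */
int set_recv_buffer(int sock, int bytes)
{
    if (setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof bytes) != 0)
        return -1;

    int granted = 0;
    socklen_t len = sizeof granted;
    if (getsockopt(sock, SOL_SOCKET, SO_RCVBUF, &granted, &len) != 0)
        return -1;

    return granted;   /* may differ from the requested value */
}
```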
If, when you call recvfrom, there are already queued packets on the socket, the system call handler will not block your process, instead it will simply copy from the OS buffer of the next queued packet into your buffer, release the OS buffer, and return immediately. As long as you can process incoming packets roughly as fast as they arrive or faster, you should not lose any. (However, note that if another system is generating packets at a very high rate, it's likely that the OS memory reserved will be exhausted at some point, after which the OS will simply discard packets that exceed its resource reservation.)
For question 2, you will receive no duplicate messages (unless something upstream of your machine is actually duplicating them). Once a queued message is copied into your buffer, it's released before returning to you. That message is gone forever.
(Note that it's possible that some other process has also created a socket expressing interest in the same packets. That process would also get a copy of the packet data, which is typically handled internal to the operating system by reference counting rather than by actually duplicating the OS buffers, although that detail is invisible to applications. In any case, once all interested processes have received the packet, it will be discarded.)
There's really nothing at all wrong with a while (true) loop; it's a very common control structure for long-running server-type programs. If your program has nothing else it needs to be doing in the meantime, a while (true) loop that blocks in recvfrom is the simplest and hence clearest way to implement it.
(You could use a select(2) or poll(2) call to wait. This allows you to handle waiting for any one of multiple file descriptors at the same time, or to periodically "time out" and go do something else, say, but again if you have nothing else you might need to be doing in the meantime, that is introducing needless complication.)
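If the program does need to wake up periodically, a rough sketch with poll(2) could look like this. The function name recv_or_timeout and the -2 return convention are assumptions made for the example only:

```c
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: wait up to timeout_ms for a packet on `sock`.
 * Returns the number of bytes received, -1 on error (errno is set, and may
 * be EINTR), or -2 if nothing arrived before the timeout expired. */
ssize_t recv_or_timeout(int sock, void *buf, size_t cap, int timeout_ms)
{
    struct pollfd pfd = { .fd = sock, .events = POLLIN };

    int rc = poll(&pfd, 1, timeout_ms);
    if (rc == 0)
        return -2;   /* timed out: the caller can go do something else */
    if (rc < 0)
        return -1;   /* poll itself failed */

    /* Something is queued (or an error is pending), so recv will not block. */
    return recv(sock, buf, cap, 0);
}
```

Adding more entries to the pollfd array is how the same call would wait on several descriptors at once.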

Shutdown Persistent TCP Con. (C multithreaded server)

I'm designing a multi-threaded server with a thread pool. This system is designed to use persistent TCP connections, as clients will maintain connections close to 24/7. The problem I run into is how to manage shutdowns. Currently, a connection comes in through "accept(listen_fd....)" and gets assigned to a work order struct. This struct is dumped onto the work queue, and is picked up by a thread. From this point on, this thread is devoted to the current connection. My code inside the thread is:
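/* Function which runs in a thread to handle a request */
void *
handle_req(void *in)
{
    ssize_t n;
    char ch;    /* renamed from "read" so it does not shadow read(2) */

    /* Convert the input to a workorder_ptr */
    workorder_t *workorder_ptr = (workorder_t *)in;

    /* The inner parentheses matter: without them n receives the result of
       the comparison rather than the byte count. Stop on 0 (peer closed)
       and on -1 (error). */
    while (!serv_shutdown
           && (n = recv(workorder_ptr->sock_fd, &ch, 1, 0)) > 0)
    {
        printf("Read a character: %c\n", ch);
    }
    printf("Peer has shutdown.\n");

    /* Close the socket and free the workorder memory */
    close(workorder_ptr->sock_fd);
    free(workorder_ptr);
    return NULL;
}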
This simply listens to the socket and echoes the characters indefinitely, and it operates correctly when the client terminates the connection. You see the "!serv_shutdown" part in the while loop: this is my attempt to get the thread to break out of its loop on a shutdown signal. When a SIGINT is caught, the global variable is set to 1. Unfortunately, the program is currently blocking on the recv statement, and won't check this flag until another character is read. I want to avoid that, since it could be an arbitrary amount of time before another character is sent on this connection.
Also, I read on another post here that it's better to use "select" than "accept" to wait on a socket connection, but I didn't quite understand. Would you do a select to wait, and then do an accept right after that? I'm not sure how select creates a socket connection. I ask this, because if my understanding of select is cleared up, maybe it applies to the question I am asking?
Also also, how do I detect the case where a connection simply times out?
Thanks!
EDIT
I think I may have finally found a solution, after further digging:
Wake up thread blocked on accept() call
Basically, I could create a global pipe and have each thread do a select on its own socket_fd as well as this global pipe. Then, when a signal is caught, I'll just write something to the pipe. All threads should be woken, no?
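A minimal sketch of that idea, assuming a pipe created once at startup whose read end (called wake_fd here, a made-up name) is shared by all worker threads. Because select() is level-triggered, every thread sees the pipe as readable for as long as nobody reads the byte back out, so a single write from the signal handler (write() is async-signal-safe) can wake all of them:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: block until either the connection has data or the
 * shared shutdown pipe has been written to.
 * Returns 1 if sock_fd is readable, 0 if shutdown was requested,
 * -1 on error. Both descriptors must be below FD_SETSIZE. */
int wait_for_data_or_shutdown(int sock_fd, int wake_fd)
{
    int maxfd = sock_fd > wake_fd ? sock_fd : wake_fd;
    fd_set rfds;

    for (;;) {
        /* select() modifies the set, so rebuild it on every attempt. */
        FD_ZERO(&rfds);
        FD_SET(sock_fd, &rfds);
        FD_SET(wake_fd, &rfds);

        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) >= 0)
            break;
        if (errno != EINTR)   /* the signal itself may interrupt us */
            return -1;
    }

    /* Shutdown takes priority. The byte is deliberately left in the pipe
     * so that the other threads see it too. */
    if (FD_ISSET(wake_fd, &rfds))
        return 0;
    return 1;
}
```

The worker loop would call this before each recv() and leave the loop when it returns 0. With hundreds of connections the FD_SETSIZE limit of select() becomes a concern, in which case poll() is the usual substitute.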
Well, on FreeBSD, Mac OS X and maybe some other systems there is the kevent() call, which lets you listen for a broad range of system events, including incoming connection requests and data arriving on a socket.
It will solve all of your problems in a neat way, but it's not portable. There are libraries such as libevent and libev that wrap OS-specific functionality like kevent() on the BSDs, epoll on Linux and so on. Maybe one of them would help you.
You can use the recv() primitive. If it returns 0, that means the peer has closed the connection.
More information: http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html#recvman

An IOCP documentation interpretation question - buffer ownership ambiguity

Since I'm not a native English speaker I might be missing something so maybe someone here knows better than me.
Taken from WSASend's documentation at MSDN:
lpBuffers [in]
A pointer to an array of WSABUF
structures. Each WSABUF structure
contains a pointer to a buffer and the
length, in bytes, of the buffer. For a
Winsock application, once the WSASend
function is called, the system owns
these buffers and the application may
not access them. This array must
remain valid for the duration of the
send operation.
Ok, can you see the bold text? That's the unclear spot!
I can think of two translations for this line (might be something else, you name it):
Translation 1 - "buffers" refers to the OVERLAPPED structure that I pass this function when calling it. I may reuse the object again only when getting a completion notification about it.
Translation 2 - "buffers" refer to the actual buffers, those with the data I'm sending. If the WSABUF object points to one buffer, then I cannot touch this buffer until the operation is complete.
Can anyone tell what's the right interpretation to that line?
And..... If the answer is the second one - how would you resolve it?
Because to me it implies that for each and every buffer I send I must retain a copy of it on the sender side, thus having MANY "pending" buffers (of different sizes) in a high-traffic application, which is really going to hurt "scalability".
Statement 1:
In addition to the paragraph above (the "And...."), I thought that IOCP copies the data to be sent into its own buffer and sends from there, unless you set SO_SNDBUF to zero.
Statement 2:
I use stack-allocated buffers (you know, something like char cBuff[1024]; in the function body). If the answer to the main question is the second translation (i.e. buffers must stay as they are until the send is complete), then... that really screws things up big-time! Can you think of a way to resolve it? (I know, I asked this in other words above.)
The answer is that the overlapped structure and the data buffer itself cannot be reused or released until the completion for the operation occurs.
This is because the operation completes asynchronously: even if the data is eventually copied into operating-system-owned buffers in the TCP/IP stack, that may not happen until some time in the future, and the write completion is what tells you it has happened. Note that write completions may be delayed for a surprising amount of time if you're sending without explicit flow control and relying on the TCP stack to do flow control for you (see here: some OVERLAPS using WSASend not returning in a timely manner using GetQueuedCompletionStatus?) ...
You can't use stack allocated buffers unless you place an event in the overlapped structure and block on it until the async operation completes; there's not a lot of point in doing that as you add complexity over a normal blocking call and you don't gain a great deal by issuing the call async and then waiting on it.
In my IOCP server framework (which you can get for free from here) I use dynamically allocated buffers which include the OVERLAPPED structure and which are reference counted. This means that the cleanup (in my case they're returned to a pool for reuse) happens when the completion occurs and the reference is released. It also means that you can choose to continue to use the buffer after the operation and the cleanup is still simple.
See also here: I/O Completion Port, How to free Per Socket Context and Per I/O Context?

Socket Read In Multi-Threaded Application Returns Zero Bytes or EINTR (104)

I've been a C coder for a while now, neither a newbie nor an expert. I have a daemonized application in C on PPC Linux. I use PHP's socket_connect as a client to connect to this service locally. The server uses epoll to multiplex connections on a Unix socket. A user-submitted string is parsed for certain characters/words using strstr() and, if they are found, the server spawns 4 joinable threads that contact different websites simultaneously. In each thread I use socket, connect, write and read to interact with the web servers via TCP on port 80. All connections and writes seem successful. Reads from the web server sockets fail, however, in one of two ways: (A) 3 threads seem to hang, and only one thread returns -1 with errno set to 104. The responding thread takes about 10 minutes, an eternity :-(. *I read somewhere that 104 (is it EINTR?) in the network context means 'the connection was reset by peer'; or (B) 3 threads read 0 bytes, and only 1 of the 4 threads actually returns some data. Aren't socket reads/writes thread-safe? I use thread-safe (and reentrant) libc functions such as strtok_r, gethostbyname_r, etc.
*I doubt that the web hosts are actually resetting the connection, because when I run a single-threaded standalone version (everything else equal) everything works perfectly, though of course in series rather than in parallel.
There's a second problem too (oops): I can't write back to the client that connects to my epoll-ed Unix socket. My daemon hangs and hogs more than 100% CPU forever, yet nothing is written to the client's end. I'm sure the client (a very typical PHP socket application) hasn't closed the connection when this happens, and no errors are detected either. Any ideas?
I cannot figure out what is wrong, even with Valgrind, GDB or a lot of logging. Kindly help where you can.
Yes, read/write are thread-safe. But beware of gethostbyname() and getservbyname() if you're using them - they return pointers to static data, and may not be thread-safe.
errno 104 is ECONNRESET on Linux (not EINTR). Use strerror or perror to get the textual error message (like 'Connection reset by peer') for a particular errno code.
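A small sketch of such a lookup, folded into a formatting helper so the result can go to whatever logging the daemon already has. The name format_errno is made up for this example. strerror() is not required to be thread-safe on every platform, so a threaded program may prefer strerror_r(); plain strerror() is used here to keep the sketch short:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "<what>: <message> (errno N)" into `out`.
 * Returns the value of snprintf (the length the full text needs). */
int format_errno(char *out, size_t cap, const char *what, int err)
{
    return snprintf(out, cap, "%s: %s (errno %d)", what, strerror(err), err);
}
```

A call site would save errno immediately after the failing read() and pass that saved value in, since later library calls can overwrite errno.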
The best way to figure out what's going wrong is often to do very detailed logging - log the results of every operation, plus details like the IP address/port connecting to, the number of bytes read/written, the thread id, and so forth. And, of course, make sure your logging code is thread-safe :-)
Getting an ECONNRESET after 10 minutes sounds like the result of your connection timing out. Either the web server isn't sending the data or your app isn't receiving it.
To test the former, hook up a program like Wireshark to the local loopback device and look for traffic to and from the port you are using.
For the latter, take a look at the epoll man page. It mentions a scenario where using edge-triggered events can result in a lockup: there is still data in the buffer, but no new data comes in, so no new event is triggered.
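The usual way to avoid that lockup with edge-triggered events is to make the descriptor non-blocking and keep reading until the kernel reports EAGAIN. A sketch follows; the helper names set_nonblocking and drain_fd are hypothetical, and real code would process each chunk where the comment indicates:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: after an edge-triggered readiness event, read until
 * nothing is left. Returns the total number of bytes consumed, or -1 on a
 * real error. *eof is set to 1 if the peer closed the connection. */
ssize_t drain_fd(int fd, int *eof)
{
    char buf[4096];
    ssize_t total = 0;

    *eof = 0;
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0) {
            total += n;        /* process buf[0..n) here */
            continue;
        }
        if (n == 0) {
            *eof = 1;          /* orderly close by the peer */
            return total;
        }
        if (errno == EINTR)
            continue;
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;      /* buffer is empty: safe to wait for the next event */
        return -1;
    }
}
```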

What causes the ENOTCONN error?

I'm currently maintaining some web server software and I need to perform a lot of I/O operations. The read(), write(), close() and shutdown() calls, when used on a socket, may sometimes raise an ENOTCONN error. What exactly does this error mean? What are the conditions that would trigger it? I can never seem to reproduce it locally but there are users who can.
Right now I just ignore ENOTCONN when raised by close() and shutdown() because it seems harmless, but I'm not entirely sure.
EDIT:
I am absolutely sure that the connect() call succeeded. I check for its return value.
ENOTCONN is most often raised by close() and shutdown(). I've only very rarely seen a read() and write() raising ENOTCONN.
If you are sure that nothing on your side of the TCP connection is closing the connection, then it sounds to me like the remote side is closing the connection.
ENOTCONN, as others have pointed out, simply means that the socket is not connected. This doesn't necessarily mean that connect failed. The socket may well have been connected previously, it just wasn't at the time of the call that resulted in ENOTCONN.
This differs from:
ECONNRESET: the other end of the connection sent a TCP reset packet. This can happen if the other end is refusing a connection, or doesn't acknowledge that it is already connected, among other things.
ETIMEDOUT: this generally applies only to connect. This can happen if the connection attempt is not successful within a system-dependent amount of time.
EPIPE can sometimes be returned by some socket-related system calls under conditions that are more or less the same as ENOTCONN. For example, on some systems, EPIPE and ENOTCONN are synonymous when returned by send.
While it's not unusual for shutdown to return ENOTCONN, since this function is supposed to tear down the TCP connection, I would be surprised to see close return ENOTCONN. It really should never do that.
Finally, as dwc mentioned, EBADF shouldn't apply in your scenario unless you are attempting some operation on a file descriptor that has already been closed. Having a socket get disconnected (i.e. the TCP connection has broken) is not the same as closing the file descriptor associated with that socket.
It's because, at the moment you shut down the socket, there is data in the socket's buffer waiting to be delivered to the remote party, which has already closed or shut down its receiving socket.
I don't fully understand how sockets work; I am rather a noob, and I've failed to even find the files where this "shutdown" function is implemented. But seeing that there's practically no user manual for the whole sockets thing, I started trying all the possibilities until I got the error in a "controlled" environment. It could be something else, but after much trying these are the explanations I settled for:
If you sent data after the remote side closed the connection, when you shutdown(), you get the error.
If you sent data before the remote side closed the connection but it didn't get received() on the other end, you can shutdown() once, the next time you try to shutdown(), you get the error.
If you didn't send any data, you can shutdown all the times you want, as long as the remote side doesn't shutdown(); once the remote side has shutdown(), if you try to shutdown() and the socket was already shutdown(), you get the error.
I believe ENOTCONN is returned, because shutdown() is not supposed to return ECONNRESET or other more accurate errors.
It is wrong to assume that the other side “just” closed the connection. At the TCP level, the other side can only half-close a connection (or abort it). The connection is ordinarily fully closed once both sides have done a shutdown() (or close()). If both sides do that, shutdown() actually succeeds for both of them!
The problem is that shutdown() did not succeed in an ordinary (half-)close of the connection, neither as the first side to close it nor as the second. Of the errors listed in the POSIX docs for shutdown(), ENOTCONN is the least inappropriate, because the others indicate problems with the arguments passed to shutdown() (or a lack of local resources to handle the request).
So what happened? These days, a NAT device somewhere between the two parties involved might have dropped the association and sends out RESET packets as a reaction. Reset connections are so common for IPv4, that you will get them anywhere in your code, even masked as ENOTCONN in shutdown().
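A coding bug might also be the reason. On a non-blocking socket, for example, connect() usually returns before the connection is established (typically -1 with errno set to EINPROGRESS), so code that treats the socket as connected at that point can run into ENOTCONN later.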
Transport endpoint is not connected
The socket is associated with a connection-oriented protocol and has not been connected. This is usually a programming flaw.
From: http://www.wlug.org.nz/ENOTCONN
If you're sure you've connected properly in the first place, ENOTCONN is most likely to be caused by either the fd being closed on your end (perhaps in another thread?) while you're in the middle of a request, or by the connection dropping while you're in the middle of the request.
In any case, it means that the socket is not connected. Go ahead and clean up that socket. It's dead. No problem calling close() or shutdown() on it.
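Put together, the cleanup could be sketched as below. The function name teardown_socket is invented for the example; it treats ENOTCONN from shutdown() as "already disconnected" and still releases the descriptor with close():

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: shut down and close a socket, ignoring ENOTCONN.
 * Returns 0 if the descriptor was released cleanly, -1 otherwise
 * (errno then holds the first unexpected error). */
int teardown_socket(int fd)
{
    if (shutdown(fd, SHUT_RDWR) != 0 && errno != ENOTCONN) {
        /* Something other than "not connected" (e.g. EBADF, ENOTSOCK):
         * still try to release the descriptor, but report the failure. */
        int saved = errno;
        close(fd);
        errno = saved;
        return -1;
    }

    /* ENOTCONN only says the connection is already gone; the descriptor
     * itself stays allocated until close() is called. */
    return close(fd);
}
```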