How can a single-threaded NGINX handle so many connections? - sockets

NGINX uses epoll notifications to know whether there is any data to read on a socket.
Let's assume:
There are two requests to the server.
nginx is notified about these two requests and starts to:
receive the first request
parse its headers
check the boundary (body size)
send the first request to the upstream server
nginx is single-threaded and can do only one operation at a time.
But what happens with the second request?
1. Does nginx receive the second request while parsing the first one?
2. Or does it begin to handle the second request after finishing the first?
3. Or something else that I don't understand?
If 1. is correct, then I don't understand how it is possible within a single thread.
If 2. is correct, then how can nginx be so fast? It would handle all incoming requests sequentially, so only ONE request could be handled at any given time.
Please help me to understand.

Nginx is not a single-threaded application. It does not start a thread for each connection; instead it starts several worker processes at startup. The nginx architecture is well described in the
Actually, a single-threaded non-blocking application is the most efficient design for single-processor hardware. When there is only one CPU and the application is completely non-blocking, the application can fully utilize the CPU. A non-blocking application is one that never calls a function that might wait for an event. All I/O operations are asynchronous. That means the application does not simply call read() on a socket, because the call might wait until data is available. Instead, it uses a notification mechanism that tells it when data is available, so it can call read() without the risk that the call will wait. An ideal non-blocking application therefore needs only one thread per CPU in the system. Because nginx uses non-blocking calls, processing in additional threads brings no benefit: there would be no free CPU to execute them.
The actual receiving of data from the network card into a buffer is done in the kernel when the network card issues an interrupt. nginx then gets the request in a buffer and processes it. There is no point in starting to process another request until the current request is done or until processing it requires an action that might block (for example, a disk read).
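To make the idea concrete, here is a minimal sketch of a single-threaded event loop. It is written in Python with the standard selectors module (which uses epoll on Linux) and is not nginx's actual code. The names run_event_loop and demo, and the use of socketpair in place of real network connections, are assumptions made for illustration only.

```python
import selectors
import socket


def run_event_loop(sel):
    """Serve every registered socket from one thread until all are closed."""
    handled = []
    while sel.get_map():
        # The only place this thread waits: the kernel reports ready sockets.
        for key, _events in sel.select():
            conn = key.fileobj
            data = conn.recv(4096)  # socket is readable, so this returns at once
            if data:
                handled.append((key.data, data))
                conn.send(data.upper())  # tiny stand-in for a real response
            else:
                # An empty read means the peer finished sending: clean up.
                sel.unregister(conn)
                conn.close()
    return handled


def demo():
    sel = selectors.DefaultSelector()
    clients = []
    for name in ("first", "second"):
        client, server_side = socket.socketpair()
        server_side.setblocking(False)
        sel.register(server_side, selectors.EVENT_READ, data=name)
        clients.append(client)
    # Both "requests" are already queued in kernel buffers before the loop runs.
    clients[0].sendall(b"get /a")
    clients[1].sendall(b"get /b")
    for client in clients:
        client.shutdown(socket.SHUT_WR)
    handled = run_event_loop(sel)
    replies = [client.recv(4096) for client in clients]
    for client in clients:
        client.close()
    sel.close()
    return handled, replies


if __name__ == "__main__":
    print(demo())
```

The second request is not lost while the first is being handled: the kernel keeps its bytes in the socket buffer, and the loop picks it up on the next ready notification.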


Making a HTTP API server asynchronous with Future, how does it make it non-blocking?

I am trying to write a HTTP API server which does basic CRUD operation on a specific resource. It talks to an external db server to do the operations.
Future support in Scala is pretty good, and futures are used for all non-blocking computation. I have used futures in many places where we wrap an operation in a future and move on; when the value is eventually available, the callback is triggered.
In the context of an HTTP API server, it is possible to implement non-blocking asynchronous calls, but a GET or a POST call still blocks the main thread, right?
When a GET request is made, a success 200 means the data is written to the db successfully and not lost. Until the data is written to the server, the thread that was created is still blocking until the final acknowledgement has been received from the database that the insert is successful, right?
The main thread (created when the HTTP request was received) could delegate and get a Future back, but is it still blocked until onSuccess is triggered, which happens when the value is available, meaning the db call was successful?
I am failing to understand how an HTTP server could be designed to maximize efficiency: what happens when a few hundred requests hit a specific endpoint, and how is that dealt with? I've been told that Slick takes the best approach.
Could someone explain a successful HTTP request lifecycle with futures and without futures, assuming there are 100 db connection threads?
When a GET request is made, a success 200 means the data is written to
the db successfully and not lost. Until the data is written to the
server, the thread that was created is still blocking until the final
acknowledgement has been received from the database that the insert is
successful right?
The thread that was created for the specific request need not be blocked at all. When you start an HTTP server, you always have the "main" thread running and waiting for requests to come in. Once a request arrives, it is usually offloaded to a thread taken from a thread pool (or ExecutionContext). The thread serving the request doesn't need to block on anything; it only needs to register a callback which says "once this future completes, please complete this request with a success or failure indication". In the meantime, the client socket is still waiting for a response from your server; nothing has been returned yet. If, for example, we're on Linux and using epoll, we pass the kernel a list of file descriptors to monitor for incoming data, and we get a notification back once that data becomes available.
We get this for free when running on top of the JVM, because of how java.nio is implemented on Linux.
The main thread (created when http request was received) could delegate
and get a Future back, but is it still blocked until the onSuccess is
triggered, which happens when the value is available, which means
the db call was successful.
The main thread usually won't be blocked, as it is what is in charge of accepting new incoming connections. If you think about it logically, if the main thread blocked until your request completed, we could only serve one request at a time, and who wants a server which can only handle a single request at a time?
In order to be able to accept multiple requests, it never processes the route on the thread in which it accepts the connection; it always delegates that work to a background thread.
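The delegation described above can be sketched as follows. This is Python's concurrent.futures rather than Scala's Future, used only as an analogy: add_done_callback plays the role of onComplete/onSuccess. The function handle_requests, the fake_db_insert stand-in, and the db_ready event are hypothetical names introduced for the demo, not part of any real server or database driver.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_requests(rows):
    """Submit one fake DB insert per request and answer each via a callback."""
    db_ready = threading.Event()  # lets the demo control when the "DB" answers
    responses = []
    lock = threading.Lock()

    def fake_db_insert(row):
        # Hypothetical stand-in for a real database call.
        db_ready.wait()
        return f"inserted {row}"

    def respond(future):
        # Invoked when the future completes, here on a worker thread.
        with lock:
            responses.append((200, future.result()))

    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = []
        for row in rows:
            future = pool.submit(fake_db_insert, row)
            future.add_done_callback(respond)  # register and move on
            futures.append(future)
        # The accepting thread got here without waiting for any insert.
        pending = sum(not f.done() for f in futures)
        db_ready.set()  # now let the simulated database acknowledge
    return pending, responses


if __name__ == "__main__":
    print(handle_requests(["a", "b", "c"]))
```

The accepting thread submits all three requests while every insert is still pending; the 200 responses are produced later by the callbacks, not by the thread that accepted the requests.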
In general, there are many ways of doing efficient IO in both Linux and Windows. The former has epoll while the latter has IO completion ports. For more on how epoll works internally, see
First off, there has to be something blocking the final main thread for it to keep running. But it's no different than having a threadpool and joining to it. I'm not exactly sure what you're asking here, since I think we both agree that using threads/concurrency is better than a single threaded operation.
Future is easy and efficient because it abstracts all the thread handling away from you. By default, all new futures run in the global implicit ExecutionContext, which is just a default thread pool. Once you kick off a Future, a thread from that pool runs it while your program's execution continues. There are also convenient constructs for working directly with the result of a future. For example, you can map and flatMap over futures, and once the future completes, your transformation runs.
It's not like single threaded languages where a single future will actually block the entire execution if you have a blocking call.
When you're comparing efficiency, what are you comparing it to?
In general, "non-blocking" may mean different things in different contexts: non-blocking = asynchronous (your second question) and non-blocking = non-blocking IO (your first question). The second question is a bit simpler (it addresses a more traditional, well-known aspect), so let's start with it.
The main thread (created when http request was received) could delegate and get a Future back, but is it still blocked until the onSuccess is triggered, which happens when the value is available, which means the db call was successful.
It is not blocked, because the Future runs on a different thread, so your main thread and the thread executing your db call logic run concurrently (the main thread is still able to handle other requests while the db call code of a previous request is executing).
When a GET request is made, a success 200 means the data is written to the db successfully and not lost. Until the data is written to the server, the thread that was created is still blocking until the final acknowledgement has been received from the database that the insert is successful right?
This aspect is about IO. The thread making the DB call (network IO) is not necessarily blocked. It is blocked in the old "thread per request" model, where you need to create another thread for another DB request. Nowadays, however, non-blocking IO has become popular. You can search for more details, but in general it allows you to use one thread for several IO operations.
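As a rough illustration of "one thread for several IO operations", here is a small Python asyncio sketch. The coroutine fake_db_call is a hypothetical stand-in, with asyncio.sleep simulating the time spent waiting on the network; a real application would use an asynchronous database driver instead.

```python
import asyncio
import threading


async def fake_db_call(name, delay):
    """Hypothetical non-blocking DB call; the sleep simulates network wait."""
    await asyncio.sleep(delay)  # yields to the event loop instead of blocking
    return name, threading.get_ident()


async def main():
    # Three "DB calls" are in flight at once, all driven by the same thread.
    return await asyncio.gather(
        fake_db_call("a", 0.05),
        fake_db_call("b", 0.05),
        fake_db_call("c", 0.05),
    )


def run_demo():
    return asyncio.run(main())


if __name__ == "__main__":
    for name, thread_id in run_demo():
        print(name, thread_id)
```

Every call reports the same thread identifier: while one call is waiting, the event loop uses the thread to make progress on the others.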

Semaphore error logged in mobicents sip servlet

We have an application written against Mobicents SIP Servlets. It currently uses v2.1.547, but I have also tested against v3.1.633 and noted the same behavior.
Our application works as a B2BUA: we have an incoming SIP call, and we also have an outbound SIP call being placed to an MRF which is executing VXML. These two SIP calls are associated with a single SipApplicationSession, which is the concurrency model we have configured.
The scenario which recreates this 100% of the time is as follows:
inbound call placed to our application (call is not answered)
outbound call placed to MRF
inbound call hangs up
application attempts to terminate the SipSession associated with the outbound call
I am seeing this being logged:
2015-12-17 09:53:56,771 WARN [SipApplicationSessionImpl] (MSS-Executor-Thread-14) Failed to acquire session semaphore java.util.concurrent.Semaphore#55fcc0cb[Permits = 0] for 30 secs. We will unlock the semaphore no matter what because the transaction is about to timeout. THIS MIGHT ALSO BE CONCURRENCY CONTROL RISK. app Session is5faf5a3a-6a83-4f23-a30a-57d3eff3281c;SipController
I am willing to believe somehow our application might be triggering this behavior but I can't see how at the moment. I would have thought acquiring/releasing the Semaphore was all internal to the implementation so it should ensure something doesn't acquire the Semaphore and never release it?
Any pointers on how to get to the bottom of this would be appreciated, as I said it is 100% repeatable so getting logs etc is all possible.
It's hard to tell without seeing any logs or application code showing how you access and schedule messages to be sent. But if you use the same SipApplicationSession in an asynchronous manner, you may want to use our vendor-specific asynchronous API (org.mobicents.javax.servlet.sip.SipApplicationSessionAsynchronousWork), which will guarantee that access to the SipApplicationSession is serialized and avoid any concurrency issues.

Winsock: Can I call the send function at the same time for different sockets?

Let's say I have a server with many clients connected via TCP. I have a socket for every client, and I have a sending and a receiving thread for every client. Is it safe and possible to call the send function from several threads at the same time, given that no two threads call send on the same socket?
If it's safe and OK, can I stream data to clients simultaneously without one send call blocking the send calls for other clients?
Thank you very much for answers.
Yes it is possible and thread-safe. You could have tested it, or worked out for yourself that IS, IIS, SQL Server etc. wouldn't work very well if it wasn't.
Assuming this is Windows from the tag of "Winsock".
This design (having a send/receive thread for every single connected client), overall, is not going to scale. Hopefully you are aware of that and you know that you have an extremely limited number of clients (even then, I wouldn't write it this way).
You don't need to have a thread pair for every single client.
You can serve tons of clients with a single thread using non-blocking IO and read/write ready notifications (either with select() or one of the varieties of Overlapped IO such as completion routines or completion ports). If you use completion ports you can set a pool of threads to handle socket IO and queue the work for your own worker thread or threads/threadpool.
Yes, you can send to and receive from many sockets at once from different threads, but you shouldn't need those extra threads because you shouldn't be making blocking calls to send/recv at all. When you make a non-blocking call, the amount that can be written immediately is written and the function returns. You then note how much was sent and ask for a notification when the socket is next writable.
I think you might want to consider a different approach as this isn't simple stuff; if you're using .Net you might get by with building this with TcpListener or HttpListener (both of which use completion ports for you), though be aware that you can't easily disable Nagle's algorithm with those so if you need interactivity (think of the auto-complete on Google's search page) then you probably won't get the performance you want.

serving large file using select, epoll or kqueue

Nginx uses epoll or other multiplexing techniques (such as select) to handle multiple clients, i.e. it does not spawn a new thread for every request, unlike Apache.
I tried to replicate the same in my own test program using select. I could accept connections from multiple clients by creating a non-blocking socket and using select to decide which client to serve. My program simply echoes their data back to them. It works fine for small data transfers (a few bytes per client).
The problem occurs when I need to send a large file over a connection to a client. Since I have only one thread to serve all clients, I cannot resume serving other clients until I have finished reading the file and writing it to the socket.
Is there a known solution to this problem, or is it best to create a thread for every such request?
When using select, you should not send the whole file at once. If you are using, e.g., sendfile on a blocking socket, it will block until the whole file has been sent. Instead, use a small buffer and send a little data at a time to each client. Then use select to identify when the socket is ready to be written to again, and send some more until all data has been sent. This will allow you to handle multiple clients concurrently.
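The chunked approach can be sketched like this in Python, whose select module wraps the same select() call. The names serve, read_all and demo, the CHUNK size, and the in-memory payloads standing in for files are all assumptions for illustration. The reader threads only play the role of remote clients; the serving side runs in a single thread.

```python
import select
import socket
import threading

CHUNK = 4096  # small buffer: at most this much is sent per socket per turn


def serve(files):
    """files maps a non-blocking socket to the bytes that client should get."""
    progress = {sock: 0 for sock in files}  # bytes already sent per socket
    while progress:
        # Ask which sockets can accept more data right now.
        _, writable, _ = select.select([], list(progress), [])
        for sock in writable:
            offset = progress[sock]
            try:
                sent = sock.send(files[sock][offset:offset + CHUNK])
            except BlockingIOError:
                continue  # buffer filled up after all; retry on a later turn
            progress[sock] = offset + sent  # send may accept only part of it
            if progress[sock] == len(files[sock]):
                del progress[sock]
                sock.close()


def read_all(sock, received, name):
    """Client side of the demo: read until the server closes the connection."""
    chunks = []
    while True:
        data = sock.recv(65536)
        if not data:
            break
        chunks.append(data)
    sock.close()
    received[name] = b"".join(chunks)


def demo():
    payloads = {"big": b"x" * 1_000_000, "small": b"hello"}
    files, received, readers = {}, {}, []
    for name, data in payloads.items():
        server_side, client = socket.socketpair()
        server_side.setblocking(False)
        files[server_side] = data
        reader = threading.Thread(target=read_all, args=(client, received, name))
        reader.start()
        readers.append(reader)
    serve(files)
    for reader in readers:
        reader.join()
    return payloads, received


if __name__ == "__main__":
    sent, got = demo()
    print({name: len(data) for name, data in got.items()})
```

Because each turn sends at most one small chunk per socket, the client receiving the small payload is served without waiting for the large transfer to finish.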
The simplest approach is to create a thread per request, but it's certainly not the most scalable approach. I think at this time basically all high-performance web servers use various asynchronous approaches built on things like epoll (Linux), kqueue (BSD), or IOCP (Windows).
Since you don't provide any information about your performance requirements, and since all the non-threaded approaches require restructuring your application to use these often-complex asynchronous techniques (as described in the C10K article and others found from there), for now your best bet is just to use the threaded approach.
Please update your question with concrete requirements for performance and other relevant data if you need more.
For background this may be useful reading
I think you are using your callback to handle a single connection. That is not how it was designed. Your callback has to handle all of the however-many-thousand connections you are planning to serve, i.e. from the file descriptor number you get as a parameter, you have to know (by reading the global variables) what to do with that client: read(), send(), or whatever else.

What are blocking and non-blocking web servers, and what is the difference between them?

I have seen many web frameworks provide a non-blocking web server; I just want to know what that means.
A blocking web server is similar to a phone call: you need to wait on the line to get a response before you can continue. A non-blocking web server is like an SMS service: you send your request by SMS, do other things, and react when you receive an SMS back!
Using a blocking socket, execution will wait (ie. "block") until the full socket operation has taken place. So, you can process any results/responses in your code immediately after. These are also called synchronous sockets.
A non-blocking socket operation will allow execution to resume immediately and you can handle the server's response with a callback or event. These are called asynchronous sockets.
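The difference is easy to observe on a single socket. The following Python sketch is an illustration under assumptions (a local socketpair instead of a real server, and the hypothetical function name demo): a non-blocking recv with no data pending returns control immediately by raising BlockingIOError, where a blocking socket would wait.

```python
import select
import socket


def demo():
    sender, receiver = socket.socketpair()
    receiver.setblocking(False)

    # Nothing has been sent yet. A blocking socket would wait here;
    # a non-blocking one reports immediately that it would have to wait.
    try:
        receiver.recv(1024)
        first_attempt = "data"
    except BlockingIOError:
        first_attempt = "would block"

    sender.sendall(b"pong")
    # Wait for the readiness event (with a timeout), then read without waiting.
    readable, _, _ = select.select([receiver], [], [], 1.0)
    data = receiver.recv(1024) if readable else b""

    sender.close()
    receiver.close()
    return first_attempt, data


if __name__ == "__main__":
    print(demo())
```

In a real server the program would do other work between the failed attempt and the readiness event instead of going straight to select.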
Non-blocking generally means event driven, multiplexing all activity via an event driven system in a single thread, as opposed to using multiple threads.