How does an application server handle multiple requests to save data into a table - JBoss

I have created a web application in JSF, and it has a button.
When the button is clicked, a request goes to the server side and executes the function below to save the data into a table; I am using MyBatis for this.
public void save(A a) {
    SqlSession session = null;
    try {
        session = SqlConnection.getInstance().openSession();
        TestMapper testMap = session.getMapper(TestMapper.class);
        testMap.insert(a);
        session.commit();
    } catch (Exception e) {
        // an empty catch block would silently swallow insert failures
        e.printStackTrace();
    } finally {
        if (session != null) {   // guard against openSession() itself failing
            session.close();
        }
    }
}
Now I have deployed this application on a JBoss (WildFly) application server.
As per my understanding, when multiple users access the application
by hitting the URL, the application server creates a thread for each user request.
For example, if 4 clients make requests, then 4 threads will be created, say t1, t2, t3 and t4.
If all 4 users hit the save button at the same time, how will the save method be executed? Will t1 enter the method and execute the insert statement to insert data into the table first, followed by t2, t3 and t4, or will all 4 threads execute the insert simultaneously?

To bring some context, I will first describe two possible approaches to handling requests. The example here is HTTP, but these approaches do not depend on the protocol used: the important point is that requests come from the network and their execution requires some IO (access to the filesystem, a database, or network calls to other systems). Note that the following description contains some simplifications.
These two approaches are:
synchronous
asynchronous
In general, processing a typical HTTP request that involves DB access requires at least four IO operations:
the request handler needs to read the request data from the client socket
the request handler needs to write the request to the socket connected to the DB
the request handler needs to read the response from the DB socket
the request handler needs to write the response to the client socket
Let's see how this is done for both cases.
Synchronous
In this approach the server has a pool (think of it as a collection) of threads that are ready to serve a request.
When a request comes in, the server borrows a thread from the pool and executes the request handler on that thread.
When the request handler needs to do an IO operation, it initiates the operation and then waits for its completion. By wait I mean that the thread's execution is blocked until the IO operation completes and the data (for example, the response with the results of the SQL query) is available.
In this case concurrency, that is, processing requests for multiple clients simultaneously, is achieved by having a number of threads in the pool. IO operations are much slower than the CPU, so most of the time a thread processing a request is blocked on an IO operation, and the CPU cores can execute stages of request processing for other clients.
Note that because IO operations are so slow, the thread pool used for handling HTTP requests is usually fairly large. The documentation for the synchronous request processing subsystem used in WildFly suggests about 10 threads per CPU core as a reasonable value.
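To make the synchronous model concrete, here is a minimal sketch in plain Java (the handler and helper names are invented for the example; this is not WildFly's actual code): a fixed pool of threads where each handler simply blocks while waiting on IO.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SyncServerSketch {
    // the pool is sized generously because handlers spend most of their time blocked on IO
    private static final ExecutorService POOL =
            Executors.newFixedThreadPool(10 * Runtime.getRuntime().availableProcessors());

    // called once per incoming request; 'request' stands in for the data from the client socket
    static void onRequest(String request) {
        POOL.submit(() -> {
            String query = parse(request);       // CPU work
            String rows = queryDatabase(query);  // the thread BLOCKS here waiting for the DB
            respond(render(rows));               // and blocks again writing to the client
        });
    }

    static String parse(String request) { return "SELECT ..."; }
    static String queryDatabase(String query) { return "rows"; } // stands in for blocking JDBC
    static String render(String rows) { return "HTTP/1.1 200 OK\n\n" + rows; }
    static void respond(String response) { System.out.println(response); }
}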
Asynchronous
In this case the IO is handled differently. There is a small number of threads handling IO. They all work the same way, so I'll describe one of them.
Such a thread runs a loop which basically waits for events, and every time an event happens it calls the handler for that event.
The first such event is a new request. When request processing starts, the request handler is invoked from the loop run by one of the IO threads. The first thing the request handler does is try to read the request from the client socket: it initiates the IO operation on the client socket and returns control to the caller. That means the thread is released and can process another event.
Another event happens when the IO operation reading from the client socket has some data available. In this case the loop invokes the handler at the point where it previously returned control after initiating the IO; that is, the handler resumes at the next step, which processes the input data (for example, parses HTTP parameters) and initiates a new IO operation (in this case a request to the DB socket). And again the handler releases the thread so it can handle other events (like the completion of IO operations that are part of other clients' request processing).
Given that IO operations are slow compared to the speed of the CPU itself, one thread handling IO can process a lot of requests concurrently.
Note: it is important that the request handler code never uses any blocking operation (like blocking IO), because that would steal the IO thread and prevent other requests from proceeding.
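For a concrete picture of the event loop, here is a minimal single-threaded version in plain Java NIO (an illustration of the pattern only, not the actual WildFly implementation): one thread waits for events on a Selector and handles each one with non-blocking reads.

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.util.Iterator;

public class EventLoopSketch {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(8080));
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {                      // the event loop: one thread, many connections
            selector.select();              // wait for the next batch of IO events
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {   // event: a new client connection
                    SocketChannel client = server.accept();
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) { // event: client data available; read() never blocks
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buf = ByteBuffer.allocate(1024);
                    if (client.read(buf) == -1) {
                        client.close();
                    }
                    // a real handler would parse the request here and initiate the
                    // next non-blocking IO step (for example, toward the DB socket)
                }
            }
        }
    }
}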
JSF and MyBatis
In the case of JSF and MyBatis, the synchronous approach is used. JSF uses a servlet to handle requests from the UI, and servlets are handled by the synchronous processors in WildFly. JDBC, which MyBatis uses to communicate with the DB, also uses synchronous IO, so threads are used to execute requests concurrently.
Congestion
All of the above is written with the assumption that there are no other sources of congestion. By congestion I mean a limitation on the ability of a certain component of the system to execute things in parallel.
For example, imagine that a database is configured to allow only one client connection at a time (this is not a reasonable configuration; I'm using it only to demonstrate the idea). In this case, even if multiple threads can execute the code of the save method in parallel, all but one will be blocked at the moment they try to open a connection to the database.
Another similar example is the SQLite database. It only allows one client to write to the DB at a time. So when thread A tries to execute an insert, it will be blocked if there is another thread B already executing an insert. Only after B's commit can A proceed with its insert. The time A waits depends on how long B's request takes and on the number of other threads waiting to write to the same DB.
In practice, if you are using an RDBMS that scales better (like PostgreSQL, MySQL or Oracle), you will not hit this problem with a small number of connections. But it may become a problem when there is a large number of concurrent requests and either the DB limits the number of client connections or a connection pool limits them on the application side. In that case, when there are already many connections to the database, new clients will wait until existing requests finish and connections are released.
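To illustrate how a connection pool becomes the congestion point, here is a toy Java sketch (the Connection class is a stand-in, not a real pool implementation): with a pool of size N, at most N threads execute inserts at once; the rest block in take() until a connection is returned.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PoolSketch {
    private final BlockingQueue<Connection> pool;

    PoolSketch(int size) {
        pool = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) pool.add(new Connection());
    }

    void save() throws InterruptedException {
        Connection c = pool.take();   // blocks when all connections are in use,
        try {                         // no matter how many request threads exist
            c.insert();
        } finally {
            pool.add(c);              // return the connection so a waiting thread can proceed
        }
    }

    static class Connection {
        void insert() { /* stands in for the actual INSERT over JDBC */ }
    }
}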

Related

Vertx JDBC limiting number of queries

I am working on a microservice developed with the Vert.x framework. One of the services receives hundreds (400 on average) of events per second from the event bus, which are written to an MSSQL DB. Queries are executed using JDBCPool, currently configured with a maximum of 40 connections. C3P0 is used for connection pooling.
My problem is that sometimes the pool gets exhausted, lots of statements are left waiting to be executed, and this makes the whole application unresponsive. If I increase the pool size, the DB exhibits slowness, which affects other services as well. So I am planning to write events to a queue, poll events from the queue and then write them to the DB; this way I can control the number of connections to the DB by increasing/decreasing the number of poller instances.
Current design
Source system -> Event bus -> Async IO -> DB
Proposed design
Source system -> Event bus -> Queue -> Polling -> DB
To keep it simple, I am trying to replace the async DB IO part with a kind of synchronous flow.
Code
void poll() {
    // poll queue with 100ms timeout
    // after getting an event from the queue, call the dao
    dao.insert(event)
       .onSuccess(statValObj -> {
           poll();
       });
}
The above looks like recursion, so would it impact the Vert.x event loop?
Can the connection pool size limit the number of queries/operations without freezing the entire call stack?
Vertx version - 4.2.0

Play Framework + JDBC + Futures

Assuming I obtain a JDBC connection through injection, like so:
class SqlQuery @Inject()(db: Database) extends Controller { /* .... */ }
And that the pool of connections is large enough, for example 100. Is it possible to create a Future to avoid blocking when running the SQL statement (similar to Slick futures)? Or does the fact that the number of connections in the pool is large mean that the SQL statement will not block?
Using futures is not synonymous with non-blocking. Futures allow you to execute code on another thread, or some type of executor, in general. However, the code you execute can still block.
JDBC is a blocking API. This means that when you execute a query through JDBC, the calling thread is blocked while it waits for a response from the database. Another term for this would be synchronous. A non-blocking or asynchronous API would accept a response asynchronously, freeing the calling thread from actively waiting for it. Reactive Slick uses its own driver to accept responses from a database in an asynchronous manner, which means the calling thread can be freed as soon as the query is dispatched to the database.
The difference between the two is this:
Imagine your application has a database connection pool of size 100, and a fixed thread pool of size 10. Then, let's say you wrap all of your JDBC calls in futures. Let's also say that your SqlQuery controller has a method that makes several JDBC calls at the same time. All of these queries will be run in parallel, until the thread pool is exhausted, which means you would only be able to run 10 queries at the same time at any given moment. While the calling thread would not be blocked by the JDBC calls, the threads executing them would. With enough queries running in parallel, the thread pool would become exhausted and it would no longer matter how many connections were in the pool. You could deal with this by making your thread pool larger, or using a fork join pool that expands as needed, but this could incur performance costs due to the creation of new threads and context switching. After all, your CPU is limited.
Using an asynchronous database driver like Reactive Slick would not block your limited pool of threads, and you would be able to run as many queries concurrently as you had connections in the pool (100 in this example). Saving threads from being blocked means saving CPU time that would otherwise be spent just waiting for responses, which means you can use it to continue handling other requests, etc.
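To make the first scenario concrete, here is a small sketch in plain Java (CompletableFuture standing in for Scala Futures; the names and the stubbed query are illustrative): with a fixed pool of 10 threads, at most 10 blocking queries are in flight at once, no matter how many DB connections exist.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BlockingInFutures {
    // only 10 queries can be in flight at once, even with 100 DB connections available
    static final ExecutorService QUERY_POOL = Executors.newFixedThreadPool(10);

    static CompletableFuture<String> runQuery(String sql) {
        return CompletableFuture.supplyAsync(() -> {
            // a real implementation would call JDBC here; the pool thread
            // blocks for the full duration of the round trip to the database
            return executeBlockingJdbc(sql);
        }, QUERY_POOL);
    }

    static String executeBlockingJdbc(String sql) {
        return "result"; // placeholder for stmt.executeQuery(sql)
    }
}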

async call back using scala and play2 or spray

I have a systems design challenge that I would like to get some community feedback on.
Basic system structure:
[Client] ---HTTP-POST--> [REST Service] ---> [Queue] ---> [Processors]
[Client] POSTs json to [REST Service] for processing.
Based on the request, the [REST Service] sends data to various queues, to be picked up by various processors written in different languages and running in different processes.
Work is parallelized in each processor but can still take up to 30 seconds to process. The time to process is a function of the complexity of the data and cannot be sped up.
The result cannot be streamed back to the client as it is completed, because there is a final post-processing step that can only be completed once all the sub-steps are completed.
Key challenge: Once the post processing is complete, the client either needs to:
be sent the results after the client has been waiting
be notified asynchronously that the job is completed and passed an id with which to request the final result
Design requirements
I don't want to block the [REST Service]. It needs to take the incoming request, route the data to the appropriate queues for processing in other processes, and then be immediately available for the next incoming request.
Normally I would have used actors and/or futures/promises so the [REST Service] is not blocked while waiting for background workers to complete. The challenge here is that the workers doing the background work are running in separate processes/VMs and written in various technology stacks. In order to pass these messages between heterogeneous systems and to ensure integrity of the request lifetime, a durable queue is being used (not in-memory message passing or RPC).
Final point of consideration: in order to scale, there are load-balanced sets of [REST Services] and [Processors] in respective pools. Therefore, since the messages from the [REST Service] to the [Processor] need to be sent asynchronously via a queue (and everything is running in separate processes), there is no way to correlate the work done in a background [Processor] back to its original calling [REST Service] instance in order to return the final processed data in a promise or actor message and finally pass the response back to the original client.
So, the question is: how do I make this correlation? Once all the background processing is completed, I need to get the result back to the client, either via a long-awaited response or a notification (I do not want to use something like UrbanAirship, as most of the clients are browsers or other services).
I hope this is clear, if not, please ask for clarification.
Edit: Possible solution - thoughts?
I think I can pass a spray RequestContext to any actor, which can then respond back to the client (it does not have to be the original actor that received the HTTP request). If this is true, can I cache the RequestContext and then use it later to asynchronously send the response to the appropriate client when the processing is completed?
Well, it's not the best because it requires more work from your Client, but it sounds like you want to implement a webhook. So,
[Client] --- POST--> [REST Service] ---> [Calculations] ---> POST [Client]
[Client] --- GET
For explanation:
The Client sends a POST request to your service. Your service then does whatever processing is necessary. Upon completion, your service sends an HTTP POST to a URL that the Client has already registered. With that POST data, the Client then has the information it needs to do a GET request for the completed data.
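A minimal sketch of the notification leg in Java (the callback URL and jobId are assumptions for the example): once post-processing finishes, the service POSTs the job id to the URL the Client registered, and the Client then GETs the final result.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class WebhookNotifier {
    static final HttpClient HTTP = HttpClient.newHttpClient();

    // called by the post-processing step; callbackUrl was registered
    // by the Client when it submitted the job
    static void notifyClient(String callbackUrl, String jobId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(callbackUrl))
                .header("Content-Type", "application/json")
                // the Client uses this id in its follow-up GET for the result
                .POST(HttpRequest.BodyPublishers.ofString("{\"jobId\":\"" + jobId + "\"}"))
                .build();
        HTTP.send(request, HttpResponse.BodyHandlers.ofString());
    }
}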

Should I use IOCPs or overlapped WSASend/Receive?

I am investigating the options for asynchronous socket I/O on Windows. There is obviously more than one option: I can use WSASend... with an overlapped structure providing either a completion callback or an event, or I could use IOCPs and the (new) thread pool. From what I usually read, the latter option is the recommended one.
However, it is not clear to me why I should use IOCPs if the completion routine suffices for my goal: tell the socket to send this block of data and inform me when it is done.
I understand that the IOCP stuff in combination with CreateThreadpoolIo etc. uses the OS thread pool. However, doesn't "normal" overlapped I/O also have to use separate threads? So what is the difference/disadvantage? Is my callback called by an I/O thread, and does it block other work?
Thanks in advance,
Christoph
You can use either but, for servers, IOCP with the 'completion queue' will have better performance in general, because it can use multiple client<>server threads, either with CreateThreadpoolIo or some user-space thread pool. Obviously, in this case, dedicated handler threads are usual.
Overlapped completion-routine I/O is more useful for clients, IMHO. The completion routine is fired by an Asynchronous Procedure Call that is queued to the thread that initiated the I/O request (WSASend, WSARecv). This implies that that thread must be in a position to process the APC, and typically this means a while(true) loop around some 'blahEx()' call. This can be useful because it's fairly easy to wait on a blocking queue, or other inter-thread signal, that allows the thread to be supplied with data to send, and the completion routine is always handled by that thread. This I/O mechanism leaves the 'hEvent' OVL parameter free to use - ideal for passing a comms buffer object pointer into the completion routine.
Overlapped I/O using an actual synchronization event/semaphore/whatever for the overlapped hEvent parameter should be avoided.
The Windows IOCP documentation recommends no more than one thread per available core per completion port. Hyperthreading doubles the number of cores. Since using IOCPs results in an application that is, for all practical purposes, event-driven, the use of thread pools adds unnecessary processing to the scheduler.
If you think about it you'll understand why: an event should be serviced in its entirety (or placed in some queue after initial processing) as quickly as possible. Suppose five events are queued to an IOCP on a 4-core computer. If there are eight threads associated with the IOCP, you run the risk of the scheduler interrupting one event to begin servicing another using a different thread, which is inefficient. It can be dangerous too if the interrupted thread was inside a critical section. With four threads you can process four events simultaneously, and as soon as one event has been completed you can start on the last remaining event in the IOCP queue.
Of course, you may have thread pools for non-IOCP related processing.
EDIT________________
The socket (file handles work fine too) is associated with an IOCP. The completion routine waits on the IOCP. As soon as a requested read from or write to the socket completes, the OS - via the IOCP - releases the completion routine waiting on the IOCP and returns with the additional information you provided when you called the read or write (I usually pass a pointer to a control block). So the completion routine immediately "knows" where to find the information pertinent to the completion.
If you passed information referring to a control block (or similar), then that control block (probably) needs to keep track of which operation has completed so it knows what to do next. The IOCP itself neither knows nor cares.
If you're writing a server attached to the internet, the server would issue a read to wait for client input. That input may arrive a millisecond or a week later, and when it does the IOCP releases the completion routine, which analyzes the input. Typically it responds with a write containing the data requested in the input and then waits on the IOCP. When the write completes, the IOCP again releases the completion routine, which sees that the write has completed, (typically) issues a new read, and a new cycle starts.
So an IOCP-based application typically consumes very little (or no) CPU until the moment a completion occurs, at which point the completion routine goes full tilt until it has finished processing, issues a new I/O request and again waits on the completion port. Apart from the IOCP timeout (which can be used to signal housekeeping or such), all I/O-related work occurs in the OS.
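The read -> process -> write -> wait cycle described above can be sketched in Java, whose NIO.2 asynchronous channels are built on IOCP on Windows (a loose analogy to illustrate the flow, not the Win32 API itself; the echo logic is made up for the example):

import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;

public class CompletionCycle {
    // issue a read; the handler fires when the client's data arrives,
    // responds with a write, and the write's handler issues the next read
    static void startCycle(AsynchronousSocketChannel client) {
        ByteBuffer buf = ByteBuffer.allocate(4096);
        client.read(buf, buf, new CompletionHandler<Integer, ByteBuffer>() {
            @Override
            public void completed(Integer bytesRead, ByteBuffer b) {
                if (bytesRead == -1) return;      // client closed the connection
                b.flip();
                client.write(b, null, new CompletionHandler<Integer, Void>() {
                    @Override
                    public void completed(Integer bytesWritten, Void v) {
                        startCycle(client);       // new cycle: wait for the next input
                    }
                    @Override
                    public void failed(Throwable exc, Void v) { /* handle the error */ }
                });
            }
            @Override
            public void failed(Throwable exc, ByteBuffer b) { /* handle the error */ }
        });
    }
}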
To further complicate (or simplify) things, it is not necessary that sockets be serviced using the WSA routines; the Win32 functions ReadFile and WriteFile work just fine.

How does I/O work in Akka?

How does the actor model (in Akka) work when you need to perform I/O (i.e. a database operation)?
It is my understanding that a blocking operation will throw an exception (and essentially ruin all concurrency, due to the evented nature of Netty, which Akka uses). Hence I would have to use a Future or something similar - however, I don't understand the concurrency model.
Can one actor be processing multiple messages simultaneously?
If an actor makes a blocking call in a future (i.e. future.get()), does that block only the current actor's execution, or will it prevent execution on all actors until the blocking call has completed?
If it blocks all execution, how does using a future assist concurrency (i.e. wouldn't invoking blocking calls in a future still amount to creating an actor and executing the blocking call)?
What is the best way to deal with a multi-staged process (i.e. read from the database; call a blocking web service; read from the database; write to the database) where each step is dependent on the last?
The basic context is this:
I'm using a WebSocket server which will maintain thousands of sessions.
Each session has some state (i.e. authentication details, etc.).
The JavaScript client will send a JSON-RPC message to the server, which will pass it to the appropriate session actor, which will execute it and return a result.
Execution of the RPC call will involve some I/O and blocking calls.
There will be a large number of concurrent requests (each user will be making a significant number of requests over the WebSocket connection, and there will be a lot of users).
Is there a better way to achieve this?
Blocking operations do not throw exceptions in Akka. You can make blocking calls from an Actor (though you probably want to minimize them, but that's another story).
No, one actor instance cannot.
It will not block any other actors. You can influence this by using a specific Dispatcher. Futures use the default dispatcher (normally the global event-driven one), so they run on a thread in a pool. You can choose which dispatcher you want to use for your actors (per actor, or for all). I guess if you really wanted to create a problem you might be able to pass exactly the same (thread-based) dispatcher to both futures and actors, but that would take some intent on your part. And if you have a huge number of futures blocking indefinitely and the executor service has been configured with a fixed number of threads, you could exhaust the executor service. So, a lot of 'ifs'. An f.get blocks only if the Future has not completed yet, and it blocks only the 'current thread' of the Actor from which you call it (if you call it from an Actor, which is not necessary, by the way).
You do not necessarily have to block: you can use a callback instead of f.get. You can even compose Futures without blocking. Check out Viktor's talk 'The Promising Future of Akka' for more details: http://skillsmatter.com/podcast/scala/talk-by-viktor-klang
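The same idea in plain Java, with CompletableFuture standing in for Akka's Futures (just an analogy; queryDb is a placeholder):

import java.util.concurrent.CompletableFuture;

public class CallbackVsGet {
    public static void main(String[] args) throws Exception {
        // callback style: register what to do with the result and move on;
        // no thread sits waiting for the value
        CompletableFuture.supplyAsync(CallbackVsGet::queryDb)
                .thenAccept(rows -> System.out.println("got " + rows));

        // blocking style: get() parks the calling thread until the result is ready
        String rows = CompletableFuture.supplyAsync(CallbackVsGet::queryDb).get();
        System.out.println(rows);
    }

    static String queryDb() { return "rows"; } // placeholder for real IO
}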
I would use async communication between the steps (if the steps are meaningful processes of their own): use an actor for every step, where each actor sends a one-way message to the next, possibly also one-way messages to some other actor that supervises the process without blocking. This way you can create chains of actors, of which you can make many; in front of them you can put a load-balancing actor, so that if one actor blocks in one chain, another of the same type in a different chain might not. That would also work for your 'context' question: pass the workload off to local actors and chain them up behind a load-balancing actor.
As for Netty (and I assume you mean Remote Actors, because that is the only thing Netty is used for in Akka), pass your work off as soon as possible to a local actor or a future (with a callback) if you are worried about timing or about preventing Netty from doing its job in some way.
Blocking operations will generally not throw exceptions, but waiting on a future (for example by using the !! or !!! send methods) can throw a timeout exception. That's why you should stick with fire-and-forget as much as possible, use a meaningful timeout value, and prefer callbacks when possible.
An Akka actor cannot process several messages simultaneously, but you can play with the throughput value via the config file. The actor will then process several messages in a row (i.e. its receive method will be called several times sequentially) if its message queue is not empty: http://akka.io/docs/akka/1.1.3/scala/dispatchers.html#id5
Blocking operations inside an actor will not "block" all actors, but if you share threads among actors (the recommended usage), one of the dispatcher's threads will be blocked until the operation resumes. So try composing futures as much as possible, and beware of the timeout value.
3 and 4. I agree with Raymond's answers.
What Raymond and paradigmatic said, but also: if you want to avoid starving the thread pool, you should wrap any blocking operations in scala.concurrent.blocking.
It's of course best to avoid blocking operations, but sometimes you need to use a library that blocks. If you wrap said code in blocking, it lets the execution context know that you may be blocking this thread, so it can allocate another one if needed.
The problem is worse than paradigmatic describes, since if you have several blocking operations you may end up blocking all threads in the thread pool and having no free threads. You could end up in deadlock if all your threads are blocked on something that won't happen until another actor/future gets scheduled.
Here's an example:
import scala.concurrent.blocking
...
Future {
  // mark the slow IO so the execution context can spawn a replacement thread
  val image = blocking { load_image_from_potentially_slow_media() }
  val enhanced = image.enhance()
  blocking {
    if (oracle.queryBetter(image, enhanced)) {
      write_new_image(enhanced)
    }
  }
  enhanced
}
Documentation is here.