Call to REST service with InvokeHTTP is executed multiple times

I am invoking a REST service with the InvokeHTTP processor.
It is a query service (GET) that returns an authorization token needed by the other services in the flow, so I need it to be executed only once.
However, the processor performs the query many times, and once it passes the flowfile to the next processor the flow becomes very slow and never finishes.
How do I get the call to the REST service with InvokeHTTP to execute only once?

Related

Is MongoDB Realm function scalable?

I am pretty new to MongoDB.
I have a scenario in which a system may invoke functions many times simultaneously.
I have gone through the MongoDB Atlas Functions documentation and didn't find anything about scalability or concurrency issues.
Can a single function be invoked multiple times in parallel?
For example, if three different requests try to invoke the same function, will all three be handled one by one or in parallel?
You can call the functions concurrently, provided the workload adheres to the App Services' limitation of 5000 concurrent requests. So, to address your point: if 3 different services try to invoke the same function at a time, they will be handled in parallel.
Additionally, you can use HTTPS Endpoints to expose a Function and trigger it through an endpoint call.

Handling multiple requests with same body - REST API

Let's say I have a microservice which just registers a user in the database, and we expose it to our client. I want to understand the best way of handling the following scenario:
What if the user sends multiple requests in parallel (say 10 requests within 1 second) with the same request body? Should I keep the requests in a queue, register the user for the very first request and deny the other 9? Or should I classify the requests by comparing their bodies, so that each distinct request body is processed once and the duplicates are rejected? Or what is the best thing I can do to handle this scenario?
One more thing I would like to understand, is it recommended to have rate-limiting (say n requests per minute) on a global API level or micro-service level?
Thanks in advance!
The best way is to use an idempotent call. Instead of exposing an endpoint like this:
POST /users + payload
Expose an endpoint like this:
PUT /user/ID + payload
Let the caller generate the ID, and require it to be a UUID. With a UUID it does not matter who generates it. This way, if the caller invokes your endpoint multiple times, the first call creates the user and the following calls just update the user with the same payload, which means they change nothing. At least you won't generate duplicates.
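As a rough illustration of the idempotent PUT described above, here is a minimal Java sketch. The class name IdempotentUserStore, the Result enum and the in-memory map are all assumptions made for the example; a real service would rely on a database with a unique constraint on the ID column.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the user table (not a real persistence layer).
class IdempotentUserStore {
    enum Result { CREATED, UNCHANGED, UPDATED }

    private final Map<UUID, String> users = new ConcurrentHashMap<>();

    // Models PUT /user/{id}: the caller supplies the id, so repeating the call
    // with the same payload cannot create a second user.
    Result put(UUID id, String payload) {
        Result[] result = new Result[1];
        // compute() runs atomically per key, so parallel retries are serialized.
        users.compute(id, (key, existing) -> {
            if (existing == null) {
                result[0] = Result.CREATED;
            } else if (existing.equals(payload)) {
                result[0] = Result.UNCHANGED;
            } else {
                result[0] = Result.UPDATED;
            }
            return payload;
        });
        return result[0];
    }

    int size() {
        return users.size();
    }

    public static void main(String[] args) {
        IdempotentUserStore store = new IdempotentUserStore();
        UUID id = UUID.randomUUID(); // generated once by the caller, reused on every retry
        for (int i = 0; i < 10; i++) {
            System.out.println(store.put(id, "{\"name\":\"alice\"}"));
        }
        System.out.println("users stored: " + store.size());
    }
}
```

Ten calls with the same ID and payload leave a single stored user: the first call creates it and the rest are no-ops.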
It's always good practice to protect your services with rate limiting. Set it at the API level. If you define it at the microservice level and run N instances, you effectively allow N times the rate, because the requests are distributed across the instances.
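To make the "N times the rate" point concrete, here is a small Java sketch of a fixed-window limiter. The class FixedWindowRateLimiter, its parameters and the injectable clock are assumptions for illustration only; production setups usually delegate this to an API gateway or a shared store.

```java
import java.util.function.LongSupplier;

// Hypothetical fixed-window limiter: at most maxRequests per windowMillis.
class FixedWindowRateLimiter {
    private final int maxRequests;
    private final long windowMillis;
    private final LongSupplier clock; // injectable so the behaviour can be tested
    private long windowStart;
    private int count;

    FixedWindowRateLimiter(int maxRequests, long windowMillis, LongSupplier clock) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    // Returns true if the request is admitted in the current window.
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        if (now - windowStart >= windowMillis) {
            windowStart = now; // a new window begins, reset the counter
            count = 0;
        }
        if (count < maxRequests) {
            count++;
            return true;
        }
        return false;
    }

    static int countAccepted(FixedWindowRateLimiter limiter, int requests) {
        int accepted = 0;
        for (int i = 0; i < requests; i++) {
            if (limiter.tryAcquire()) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // One limiter shared at the API level: 10 requests hit the same counter.
        FixedWindowRateLimiter shared = new FixedWindowRateLimiter(3, 60_000, System::currentTimeMillis);
        System.out.println("shared limiter accepted: " + countAccepted(shared, 10));

        // One limiter per service instance: the same 10 requests are split across two counters.
        FixedWindowRateLimiter a = new FixedWindowRateLimiter(3, 60_000, System::currentTimeMillis);
        FixedWindowRateLimiter b = new FixedWindowRateLimiter(3, 60_000, System::currentTimeMillis);
        System.out.println("per-instance limiters accepted: " + (countAccepted(a, 5) + countAccepted(b, 5)));
    }
}
```

With a limit of 3 per minute, the shared limiter admits 3 of the 10 requests, while the two per-instance limiters together admit 6.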

How does an application server handle multiple requests to save data into a table

I have created a web application in JSF, and it has a button.
When the button is clicked, the request goes to the server side and executes the function below to save the data in a table. I am using MyBatis for this.
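public void save(A a)
{
    // SqlSession is Closeable, so try-with-resources closes it even if insert() throws,
    // and there is no null session to close if openSession() itself fails.
    try (SqlSession session = SqlConnection.getInstance().openSession()) {
        TestMapper testmap = session.getMapper(TestMapper.class);
        testmap.insert(a);
        session.commit();
    }
    // No empty catch block: a failed insert propagates to the caller instead of being silently ignored.
}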
Now I have deployed this application on a JBoss (WildFly) application server.
As I understand it, when multiple users access the application
by hitting the URL, the application server creates a thread for each user request.
For example, if 4 clients make requests, then 4 threads are created: t1, t2, t3 and t4.
If all 4 users hit the save button at the same time, how will the save method be executed? Will t1 enter the method and execute the insert statement
first, followed by t2, t3 and t4 one after another, or will all 4 threads execute the insert and write the data simultaneously?
To give some context, I will first describe two possible approaches to handling requests. The example here is HTTP, but the approaches do not depend on the protocol used; what matters is that requests arrive over the network and that executing them requires some IO (access to the filesystem or a database, or network calls to other systems). Note that the following description is somewhat simplified.
These two approaches are:
synchronous
asynchronous
In general, processing a typical HTTP request that involves DB access requires at least four IO operations:
request handler needs to read the request data from the client socket
request handler needs to write request to the socket connected to the DB
request handler needs to read response from the DB socket
request handler needs to write the response to the client socket
Let's see how this is done for both cases.
Synchronous
In this approach the server has a pool (think a collection) of threads that are ready to serve a request.
When a request comes in, the server borrows a thread from the pool and executes a request handler in that thread.
When the request handler needs to perform an IO operation, it initiates the operation and then waits for its completion. By wait I mean that the thread's execution is blocked until the IO operation completes and the data (for example, the response with the results of the SQL query) is available.
In this case concurrency, that is, processing requests for multiple clients simultaneously, is achieved by having a number of threads in the pool. IO operations are much slower than the CPU, so most of the time a thread processing a request is blocked on an IO operation, and the CPU cores can execute stages of request processing for other clients.
Note that because IO operations are slow, the thread pool used for handling HTTP requests is usually fairly large. The documentation for the synchronous request processing subsystem used in WildFly mentions about 10 threads per CPU core as a reasonable value.
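The synchronous model above can be sketched in plain Java as follows. This is only an illustration, not how WildFly is implemented: the class BlockingPoolDemo, the handle method and the 200 ms sleep standing in for a blocking DB call are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BlockingPoolDemo {
    // Hypothetical request handler: the sleep stands in for a blocking DB round trip.
    static String handle(int requestId) throws InterruptedException {
        Thread.sleep(200);
        return "request " + requestId + " handled by " + Thread.currentThread().getName();
    }

    // Serves the given number of requests on a fixed pool, one thread per in-flight request.
    static List<String> serve(int requests, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 1; i <= requests; i++) {
                final int id = i;
                futures.add(pool.submit(() -> handle(id)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // wait for each handler to finish
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        List<String> results = serve(4, 4);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        results.forEach(System.out::println);
        System.out.println("elapsed ms: " + elapsedMs);
    }
}
```

With four requests and four pool threads, each request runs on its own thread, so the total time should be close to one simulated IO wait rather than four.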
Asynchronous
In this case the IO is handled differently. There is a small number of threads handling IO. They all work the same way, so I'll describe one of them.
Such a thread runs a loop which basically waits for events, and every time an event happens it calls the handler for that event.
The first such event is a new request. When request processing starts, the request handler is invoked from the loop run by one of the IO threads. The first thing the request handler does is try to read the request from the client socket. So the handler initiates the IO operation on the client socket and returns control to the caller. That means that the thread is released and can process another event.
Another event happens when the IO operation that reads from the client socket has some data available. In this case the loop invokes the handler at the point where it returned control to the loop after initiating the IO; that is, the handler is resumed at the next step, which processes the input data (for example, parses the HTTP parameters) and initiates a new IO operation (in this case a request to the DB socket). Again the handler releases the thread so it can handle other events (such as the completion of IO operations that are part of processing other clients' requests).
Given that IO operations are slow compared to the speed of the CPU itself, one thread handling IO can process a lot of requests concurrently.
Note: it is important that the request handler code never uses any blocking operation (like blocking IO), because that would tie up the IO thread and prevent other requests from proceeding.
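The event loop described above can be approximated with a small Java sketch. It is a simplification under stated assumptions: the class EventLoopDemo is hypothetical, a single-threaded executor plays the role of the IO thread, and a scheduled timer stands in for the operating system reporting that an IO operation has completed.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class EventLoopDemo {
    static List<String> serve(int requests) throws Exception {
        // The single "io-loop" thread runs every handler step.
        ExecutorService loop = Executors.newSingleThreadExecutor(r -> new Thread(r, "io-loop"));
        // Assumption: a timer simulates IO completion notifications.
        ScheduledExecutorService fakeIo = Executors.newSingleThreadScheduledExecutor();
        List<String> log = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(requests);
        try {
            for (int i = 1; i <= requests; i++) {
                final int id = i;
                loop.execute(() -> {
                    log.add("start " + id + " on " + Thread.currentThread().getName());
                    // Initiate the IO and return at once, releasing the loop thread.
                    fakeIo.schedule(() -> loop.execute(() -> {
                        // The completion event resumes the handler on the loop thread.
                        log.add("finish " + id + " on " + Thread.currentThread().getName());
                        done.countDown();
                    }), 100, TimeUnit.MILLISECONDS);
                });
            }
            if (!done.await(5, TimeUnit.SECONDS)) {
                throw new TimeoutException("requests did not complete");
            }
            return log;
        } finally {
            loop.shutdown();
            fakeIo.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        serve(3).forEach(System.out::println);
    }
}
```

Every log entry is produced on the same io-loop thread, yet several requests are in flight at once because no handler step ever blocks while waiting for IO.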
JSF and Mybatis
In the case of JSF and MyBatis, the synchronous approach is used. JSF uses a servlet to handle requests from the UI, and servlets are handled by the synchronous processors in WildFly. JDBC, which MyBatis uses to communicate with the DB, also uses synchronous IO, so threads are used to execute requests concurrently.
Congestion
All of the above assumes that there are no other sources of congestion. By congestion I mean a limitation on the ability of a certain component of the system to execute things in parallel.
For example, imagine that a database is configured to allow only one client connection at a time (this is not a reasonable configuration; I'm using it only to demonstrate the idea). In this case, even if multiple threads can execute the code of the save method in parallel, all but one will be blocked at the moment they try to open the connection to the database.
Another similar example is SQLite. It allows only one client to write to the DB at a time. So when thread A tries to execute an insert, it is blocked if there is another thread B already executing an insert. Only after thread B commits can thread A proceed with its insert. How long A waits depends on the time B takes to execute its request and on the number of other threads waiting to write to the same DB.
In practice, if you are using an RDBMS that scales better (like PostgreSQL, MySQL or Oracle), you will not hit this problem with a small number of connections. But it may become a problem when there is a large number of concurrent requests and either the DB limits the number of client connections or a connection pool limits the number of connections on the application side. In this case, if there are already many connections to the database, new clients will wait until existing requests finish and their connections are closed or returned to the pool.
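The effect of such a connection limit can be imitated with a semaphore. The sketch below is an assumption-laden model, not real database code: the class ConnectionLimitDemo is hypothetical, the Semaphore stands in for the connection limit, and the 50 ms sleep stands in for the insert.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

class ConnectionLimitDemo {
    // Returns the highest number of "inserts" that were running at the same moment.
    static int maxConcurrentInserts(int threads, int maxConnections) throws Exception {
        Semaphore connections = new Semaphore(maxConnections); // stand-in for the connection limit
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch done = new CountDownLatch(threads);
        try {
            for (int i = 0; i < threads; i++) {
                pool.execute(() -> {
                    try {
                        connections.acquire(); // blocks while all connections are in use
                        try {
                            int now = active.incrementAndGet();
                            peak.accumulateAndGet(now, Math::max);
                            Thread.sleep(50); // simulated insert
                            active.decrementAndGet();
                        } finally {
                            connections.release();
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        done.countDown();
                    }
                });
            }
            done.await();
            return peak.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("peak with 1 connection: " + maxConcurrentInserts(4, 1));
        System.out.println("peak with 4 connections: " + maxConcurrentInserts(4, 4));
    }
}
```

With a single permit the four threads still all run the save logic, but their inserts are forced to happen one after another, which is the congestion described above.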

Gatling synchronous Http request/response chain

I have implemented a chain of executions in which each execution sends an HTTP request to the server and checks that the response status is 2XX. I need to implement a synchronous model in which the next execution in the chain is triggered only when the previous execution is successful, i.e. the response status is 2XX.
Below is a snapshot of the execution chain.
feed(postcodeFeeder).
exec(Seq(LocateStock.locateStockExecution, ReserveStock.reserveStockExecution, CancelOrder.cancelStockExecution,
ReserveStock.reserveStockExecution, ConfirmOrder.confirmStockExecution, CancelOrder.cancelStockExecution))
Since Gatling has an asynchronous IO model, what I am currently observing is that the HTTP requests are sent to the server asynchronously by a number of users, and there is no real dependency between the executions with respect to a single user.
I also want to know: for a given actor/user, if an execution in the chain fails its check, does the user still proceed with the next execution in the chain?
"there is no real dependency between the executions with respect to a single user"
No, you are wrong. Except when using "resources", requests are sequential for a given user. If you want to stop the flow for a given user when it encounters an error, you can use exitBlockOnFail.
Gatling does not consider a failed response from the previous request before firing the next one in the chain. You may need to wrap the entire block in exitBlockOnFail{} to stop Gatling from firing the next request.

Are batched requests sent over Neo4j REST API executed in parallel

If I use Neo4j REST Batch endpoint, are the requests in the same batch executed in parallel? I suspect not, because how else would one request be able to refer to another in the same batch? But I haven't been able to find any documentation that clearly states one way or the other, and I am trying to make a recommendation to someone else about the performance of REST batches vs. transactional Cypher.
They are not executed in parallel but sequentially.
They are streamed, however, if you send the header X-Stream: true