Concurrent requests: what are the best practices? - postgresql

I have a Play! Framework 2.3 project hosted on Heroku with the Postgres Addon.
It handles requests from mobile applications (Post a message).
For several reasons, I end up with duplicate rows (the same message stored twice) in the database:
the app might send the request twice within a short time (less than 10 ms)
I have multiple dynos that handle requests in parallel
This happens even though I check that the message does not exist before writing to the DB. So I guess the first one has still not been written when the second one arrives.
I also tried writing a message footprint to memcache before handling the request (after form validation), but I still sometimes get duplicate messages.
The solutions I found are:
add a unique constraint on some database field (such as a client-generated message timestamp?)
regularly check for and remove duplicates
As I have no means of updating the mobile apps, I will script a regular check for duplicates.
Any other ideas?
What are the best practices for handling such concurrent requests?
Attachment: my pseudocode
public static Result submit() {
    User user = MySecured.getCurrentUser(ctx());
    final Form<Message> filledForm = form(Message.class).bindFromRequest();
    .... Some database pre-verification
    if (filledForm.hasErrors()) {
        ObjectNode error = Json.newObject();
        error.put("error", filledForm.errorsAsJson());
        return ok(error);
    } else {
        if (Cache.get(KEY_LOCK_FLASH_WRITING + filledForm.data().get("mail")) != null) {
            return internalServerError();
        }
        // Verify this flash hasn't already been handled (requests can come twice from the client)
        Message sameMessage = Message.findSame(filledForm.get().mail, filledForm.get().message);
        if (sameMessage != null) {
            Logger.info("[Submit] message already exists" + sameMessage.id);
            ObjectNode jsonResult = Json.newObject();
            .... Processing a result ... no matter this does not happen.
            return ok(jsonResult);
        }
        final Message flash = filledForm.get();
        Cache.set(KEY_LOCK_FLASH_WRITING + flash.mail, "");
        ... some fields initializations like flash.author = new Author();
        ... Then some promises
        return ok();
    }
}
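The check-then-insert sequence in the pseudocode above is not atomic, which is why two requests can both pass the findSame check before either has written its row. A unique constraint closes that window because the database itself rejects the second insert. The following is a minimal in-memory sketch of that idea in plain Java, not Play or Postgres code: ConcurrentHashMap.putIfAbsent stands in for the unique index, and MessageStore, insertIfAbsent, simulateDuplicateSubmits and the key format are hypothetical names chosen for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical in-memory stand-in for a table with a unique index on (mail, message).
class MessageStore {
    private final ConcurrentHashMap<String, String> rows = new ConcurrentHashMap<>();

    // Atomic "insert unless the key exists"; returns true only for the caller that created the row.
    boolean insertIfAbsent(String mail, String message) {
        String key = mail + "|" + message; // assumed composite key, for illustration only
        return rows.putIfAbsent(key, message) == null;
    }

    int size() {
        return rows.size();
    }

    // Fires the same submission from several threads at once and counts the accepted inserts.
    static int simulateDuplicateSubmits(MessageStore store, int requests) {
        ExecutorService pool = Executors.newFixedThreadPool(requests);
        CountDownLatch start = new CountDownLatch(1);
        AtomicInteger accepted = new AtomicInteger();
        for (int i = 0; i < requests; i++) {
            pool.submit(() -> {
                try {
                    start.await(); // line all workers up so they race on the insert
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (store.insertIfAbsent("a@example.com", "hello")) {
                    accepted.incrementAndGet();
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return accepted.get();
    }

    public static void main(String[] args) {
        MessageStore store = new MessageStore();
        System.out.println("accepted inserts: " + simulateDuplicateSubmits(store, 8));
    }
}
```

However many identical submissions race, only one is accepted. With Postgres, the corresponding approach would be a unique index on the fields that identify a message, combined with either catching the unique-violation error or, from PostgreSQL 9.5 on, INSERT ... ON CONFLICT DO NOTHING. Which columns make a message unique is an assumption here; the pseudocode suggests mail and message, since the apps cannot be changed to send a new identifier.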

Related

How to batch requests to the same URL without causing memory leaks

I have a system that processes images. Essentially, I provide an ID to it, and it fetches a source image, and then it begins performing transformations on it to resize and reformat it.
This system gets quite a bit of usage, and one of the things that I've noticed is that I tend to get many requests for the same ID simultaneously, but in different requests to the webserver.
What I'd like to do is "batch" these requests. For example, if there's 5 simultaneous requests for the image "user-upload.png", I'd like there to be only one HTTP request to fetch the source image.
I'm using NestJS with default scopes for my service, so the service is shared between requests. Requests to fetch the image are done with the HttpModule, which is using axios internally.
I only care about simultaneous requests. Once the request finishes, it will be cached, and that prevents new requests from hitting the HTTP url.
I've thought about doing something like this (Pseudocode):
@Provider()
class ImageFetcher {
  // Store in-flight requests as a map between id:promise
  inFlightRequests = { }
  fetchImage(id: string) {
    if (this.inFlightRequests[id]) {
      return this.inFlightRequests[id]
    }
    this.inFlightRequests[id] = new Promise(async (resolve, reject) => {
      const { data } = await this.httpService.get('/images' + id)
      // error handling omitted here
      resolve(data)
      delete this.inFlightRequests[id]
    })
    return this.inFlightRequests[id]
  }
}
The most obvious issue I see is the potential for a memory leak. This is solvable with more custom code, but I thought I'd see if anyone has any suggestions for doing this without writing more code.
In particular, I've also thought about using an axios interceptor, but I'm not entirely sure how to handle that properly. Any pointers here would be really appreciated.
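The leak concern comes from entries that are never removed when a request fails. As a language-neutral illustration of the pattern (written in Java rather than TypeScript, and not using NestJS or axios), here is a minimal sketch that removes the map entry on both success and failure. InFlightCoalescer, its loader function and pendingCount are hypothetical names; the loader stands in for the HTTP fetch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper: shares one pending load among all callers asking for the same key.
class InFlightCoalescer<K, V> {
    private final ConcurrentHashMap<K, CompletableFuture<V>> inFlight = new ConcurrentHashMap<>();
    private final Function<K, CompletableFuture<V>> loader;

    InFlightCoalescer(Function<K, CompletableFuture<V>> loader) {
        this.loader = loader;
    }

    CompletableFuture<V> get(K key) {
        CompletableFuture<V> shared = new CompletableFuture<>();
        CompletableFuture<V> existing = inFlight.putIfAbsent(key, shared);
        if (existing != null) {
            return existing; // another caller already started this load
        }
        // Remove the entry on success AND on failure, so nothing lingers in the map.
        shared.whenComplete((value, error) -> inFlight.remove(key, shared));
        try {
            loader.apply(key).whenComplete((value, error) -> {
                if (error != null) {
                    shared.completeExceptionally(error);
                } else {
                    shared.complete(value);
                }
            });
        } catch (RuntimeException e) {
            shared.completeExceptionally(e); // the loader threw before returning a future
        }
        return shared;
    }

    int pendingCount() {
        return inFlight.size();
    }

    public static void main(String[] args) {
        CompletableFuture<String> source = new CompletableFuture<>();
        int[] calls = {0};
        InFlightCoalescer<String, String> fetcher =
                new InFlightCoalescer<String, String>(id -> { calls[0]++; return source; });
        CompletableFuture<String> a = fetcher.get("user-upload.png");
        CompletableFuture<String> b = fetcher.get("user-upload.png");
        source.complete("image-bytes");
        System.out.println(calls[0] + " load, same future: " + (a == b)
                + ", pending: " + fetcher.pendingCount());
    }
}
```

The same shape carries over to the promise-based version: register the cleanup in a handler that runs for both outcomes (for a JavaScript Promise, finally), rather than only after resolve.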

Not receiving latest data using Npgsql LISTEN/NOTIFY

I'm using a .NET Core app with a PostgreSQL database (via Npgsql) combined with SignalR to receive real-time data and the latest data entries. However, I am not receiving the latest entry, and sometimes the Clients.All.SendAsync method sends more than one entry to the client. Here is my code:
Hub method that sends new data to client:
public async Task SendForexAsync(string name)
{
    var product = GetForex(name);
    await Clients.All.SendAsync("CurrentData", product);
    using (var conn = new NpgsqlConnection(ApplicationDbContext.GetConnectionString()))
    {
        conn.Open();
        var cmd = new NpgsqlCommand("LISTEN new_forex", conn).ExecuteNonQuery();
        conn.Notification += async (o, e) =>
        {
            var newProduct = GetForex(name);
            await Clients.All.SendAsync("NewData", newProduct);
        };
        while (true)
        {
            await conn.WaitAsync();
        }
    }
}
Console app that periodically polls for new data from an API:
var addedStocksDJI = FetchNewStocks("DJI");
if (addedStocksAAPL > 0 || addedStocksDJI > 0)
{
    using (var conn = new NpgsqlConnection(ApplicationDbContext.GetConnectionString()))
    {
        conn.Open();
        var cmd = new NpgsqlCommand("NOTIFY new_stocks", conn).ExecuteNonQuery();
    }
}
The rest of the app's code is most definitely correct, because I was receiving new and correct data before I tried implementing LISTEN/NOTIFY. But now I get one (or more) newProduct entries on my client, and each one is the "old" product: the latest entries are not queried and sent via SignalR, only the old ones. When I refresh the page manually, though, the new data is displayed correctly.
I believe it has something to do with a single connection being open so I constantly receive only the "old" set of data, but even if that is the case, I am unable to figure out why I sometimes get more than one packet of data, even though I am only trying to send one, and I am calling NOTIFY only once.
I figured it out. Hopefully this will help someone else who gets stuck with this in the future!
The issue was that I was obtaining my dbContext via .NET Core's dependency injection in my Hub class, which created the context only once per instance of that class, and therefore once per page or WebSocket transaction. I assume that is why I was unable to get the latest data: the dbContext was "old" and unaware of the changes.
I fixed the problem by creating the dbContext in a using block inside my methods: twice in my SendForexAsync method (once for each call to the GetForex function), as well as in the GetForex function itself. That way, a dbContext is created and disposed of immediately, so the next time I query the database through GetForex (when a notification arrives because of the NOTIFY from the console app), a new dbContext instance is created, which can see the new data.

Is the following code with Vert.x really reactive?

Do I have a wrong understanding of "reactive", or is something wrong in my example?
I wrote a small code sample in Vert.x: in a REST service, I read data from MongoDB and return it as JSON.
...........
    Router router = Router.router(vertx);
    router.route().handler(BodyHandler.create());
    router.get("/gliders").handler(this::listAll);
    vertx.createHttpServer().requestHandler(router::accept).listen(8080);
}
private void listAll(RoutingContext routingContext) {
    mongoClient.find("gliders", new JsonObject(), results -> {
        List<JsonObject> objects = results.result();
        /* is this non blocking?!
           mongoClient.find returns immediately, but the REST client only
           gets results after mongo has delivered all results
        */
        List<Glider> gliders = objects.stream()
            .map(res -> {
                Glider g = new Glider();
                g.setName(res.getString("name"));
                g.setPrice(res.getString("price"));
                return g;
            })
            .collect(Collectors.toList());
        routingContext.response()
            .putHeader("content-type", "application/json; charset=utf-8")
            .end(Json.encodePrettily(gliders));
    });
}
OK, it's not blocking; I could compute something else while waiting for mongo.
But somehow I thought "reactive" meant that the REST client would already receive the first chunks of the mongo results while mongo is still busy finding the rest (HTTP streaming). As written, though, the callback is only invoked once mongo has found all results.
Reactive is not the same as streaming. Reactive is a concept built around data flows: your application reacts to events, e.g. data returned from MongoDB. You can implement streaming on top of it by asking the mongo client to start pumping data as soon as it arrives from the network. Conversely, a blocking API can also stream, by blocking the application while it waits for data and then passing the items one by one to a consumer.
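To make the distinction concrete, here is a minimal plain-Java sketch that uses no Vert.x or MongoDB API. FakeStore, findAll, findStreaming and trace are hypothetical names, and the callbacks are invoked synchronously to keep the example short. Both methods are callback-driven; only the second one hands records to the caller as they become available.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical data source; it does not use any Vert.x or MongoDB API.
class FakeStore {
    private final List<String> records;

    FakeStore(List<String> records) {
        this.records = records;
    }

    // Callback style as in the question: the handler fires once, with the complete result.
    void findAll(Consumer<List<String>> onResult) {
        onResult.accept(new ArrayList<>(records));
    }

    // Streaming style: the handler fires once per record, then onEnd signals completion.
    void findStreaming(Consumer<String> onRecord, Runnable onEnd) {
        for (String record : records) {
            onRecord.accept(record);
        }
        onEnd.run();
    }

    // Records the sequence of events a client of each style would observe.
    static List<String> trace(FakeStore store, boolean streaming) {
        List<String> events = new ArrayList<>();
        if (streaming) {
            store.findStreaming(r -> events.add("chunk:" + r), () -> events.add("end"));
        } else {
            store.findAll(all -> events.add("response:" + String.join(",", all)));
        }
        return events;
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore(Arrays.asList("a", "b", "c"));
        System.out.println(trace(store, false)); // one event carrying everything
        System.out.println(trace(store, true));  // one event per record, then "end"
    }
}
```

In an HTTP handler, the streaming variant would write each record to a chunked response as it arrives and end the response in the completion callback, instead of encoding one list at the end.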

"ERROR: 57014: canceling statement due to user request" Npgsql

I am having this phantom problem in my application where one in every 5 requests on a specific page (in an ASP.NET MVC application) throws this error:
Npgsql.NpgsqlException: ERROR: 57014: canceling statement due to user request
at Npgsql.NpgsqlState.<ProcessBackendResponses>d__0.MoveNext()
at Npgsql.ForwardsOnlyDataReader.GetNextResponseObject(Boolean cleanup)
at Npgsql.ForwardsOnlyDataReader.GetNextRow(Boolean clearPending)
at Npgsql.ForwardsOnlyDataReader.Read()
at Npgsql.NpgsqlCommand.GetReader(CommandBehavior cb)
...
On the npgsql github page I found the following bug report: 615
It says there:
Regardless of what exactly is happening with Dapper, there's
definitely a race condition when cancelling commands. Part of this is
by design, because of PostgreSQL: cancel requests are totally
"asynchronous" (they're delivered via an unrelated socket, not as part
of the connection to be cancelled), and you can't restrict the
cancellation to take effect only on a specific command. In other
words, if you want to cancel command A, by the time your cancellation
is delivered command B may already be in progress and it will be
cancelled instead.
Although they have made "changes to hopefully make cancellations much safer" in Npgsql 3.0.2, my current code is incompatible with this version because of the migration required, as described here.
My current workaround (stupid): I have commented out the line in Dapper that says command.Cancel(); and the problem seems to be gone.
if (reader != null)
{
    if (!reader.IsClosed && command != null)
    {
        //command.Cancel();
    }
    reader.Dispose();
    reader = null;
}
Is there a better solution to the problem? And secondly, what am I losing with the current fix (apart from having to remember the change every time I update Dapper)?
Configuration:
.NET 4.5,
Npgsql 2.2.5,
PostgreSQL 9.3
I found why my code didn't dispose the reader, resulting in calling command.Cancel(). This only happens with QueryMultiple method when not every refcursor is read.
Changing the code from:
using (var multipleResults = connection.QueryMultiple("schema.getuserbysocialsecurity", new { socialSecurityNumber }))
{
    var client = multipleResults.Read<Client>().SingleOrDefault();
    if (client != null)
    {
        client.Address = multipleResults.Read<Address>().Single();
    }
    return client;
}
To:
using (var multipleResults = connection.QueryMultiple("schema.getuserbysocialsecurity", new { socialSecurityNumber }))
{
    var client = multipleResults.Read<Client>().SingleOrDefault();
    var address = multipleResults.Read<Address>().SingleOrDefault();
    if (client != null)
    {
        client.Address = address;
    }
    return client;
}
This fixed the issue and now the reader is properly disposed and command.Cancel() is not invoked.
Hope this helps anyone else!
UPDATE
The Npgsql docs for version 2.2 state:
Npgsql is able to ask the server to cancel commands in progress. To do
this, call the NpgsqlCommand’s Cancel method. Note that another thread
must handle the request as the main thread will be blocked waiting for
command to finish. Also, the main thread will raise an exception as a
result of user cancellation. (The error code is 57014.)
I have also posted an issue on the Dapper github page.

Waiting for more than one event (using GWT)

I want to fetch two XML documents from the server and resume processing when both have arrived. Can I fetch them in parallel, or do I have to refrain from issuing the second request until the first has completed?
You can fetch them in parallel, but keep in mind that browsers have a limit on the number of parallel requests, see http://www.browserscope.org/?category=network (choose "Major Versions" in the dropdown on the top left to see more versions). Note especially, that IE < 8 has a limit of 2 connections per hostname!
If you still want to do this, then note that the responses can arrive in any order. So you'll have to implement something that will keep track of the requests/responses (a counter or something more sophisticated), so that you'll know when all responses you need have arrived.
The best solution is often to send just one request that asks for both XML documents, and the server returns them both at once in one response.
Make both requests, then check when either one completes whether the other is done, and continue if it is.
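private String responseOne;
private String responseTwo;
public void startRequests() {
    makeAsyncRequestOne(new AsyncCallback<String>() {
        @Override
        public void onSuccess(String response) {
            // Plain field access: inside the anonymous class, "this" is the callback itself
            responseOne = response;
            if (responseTwo != null) {
                proceed();
            }
        }
        @Override
        public void onFailure(Throwable caught) {
            // error handling omitted
        }
    });
    makeAsyncRequestTwo(new AsyncCallback<String>() {
        @Override
        public void onSuccess(String response) {
            responseTwo = response;
            if (responseOne != null) {
                proceed();
            }
        }
        @Override
        public void onFailure(Throwable caught) {
            // error handling omitted
        }
    });
}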
As Chris points out, this may hit a ceiling on maximum concurrent requests to the same hostname, so if you have lots of requests to send at once, you could keep a queue of requests and call the next one in proceed() until the queue is exhausted.
But if you plan on having a lot of concurrent requests, you probably need to redesign your service anyway, to batch operations together.
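The counter mentioned in the first answer can be sketched as a small helper. This is a minimal illustration under the assumption that all callbacks run on a single thread, as they do in GWT code compiled to JavaScript; ResponseJoin and arrived are hypothetical names, not part of GWT.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: runs a continuation once a fixed number of responses have arrived.
// Not thread-safe; it assumes all callbacks run on one thread.
class ResponseJoin {
    private int remaining;
    private final Runnable onAllDone;

    ResponseJoin(int expected, Runnable onAllDone) {
        this.remaining = expected;
        this.onAllDone = onAllDone;
    }

    // Call once from each request's success callback, in whatever order the responses arrive.
    void arrived() {
        if (remaining <= 0) {
            return; // ignore unexpected extra calls
        }
        remaining--;
        if (remaining == 0) {
            onAllDone.run();
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        ResponseJoin join = new ResponseJoin(2, () -> log.add("proceed"));
        join.arrived(); // e.g. the second document's response arrives first
        log.add("still waiting");
        join.arrived(); // the other response arrives
        System.out.println(log);
    }
}
```

Compared with checking the other response for null, a counter also works when a response can legitimately be null or empty, and it extends to more than two requests without extra fields.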