Vert.x MySQL client: asynchronous and synchronous calls - vert.x

I need help with the Vert.x MySQL client v4.1.0 Java API. I have tried the two snippets below, but I get inconsistent results with number 2: it returns null even when there is a matching database record. Is it not a blocking call?
/* 1. asynchronous */
mysql.query(sql).execute(asr -> {
    if (asr.succeeded()) {
        RowSet<Row> rowset = asr.result();
    } else {
        // Log and handle error
    }
});
/* 2. ??? synchronous */
RowSet<Row> rowset = mysql.query(sql).execute().result();

Both the .reactivex and the raw Vert.x MySQL client library variants are asynchronous by design. The basic way to handle SQL query results is a Handler-based method:
io.vertx.sqlclient.Query#execute(Handler<AsyncResult<T>> handler)
Execute the query.
Vert.x also offers a Future-based variant of the same call. Note that this is Vert.x's own io.vertx.core.Future, not java.util.concurrent.Future. Its signature is:
io.vertx.sqlclient.Query#execute()
Like execute(Handler) but returns a Future of the asynchronous result
The reactive (RxJava) variant of the client library also provides an rxified method that returns the query result as a Single:
io.vertx.reactivex.sqlclient.Query#rxExecute()
Execute the query.
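With the above considerations in mind, when calling:
RowSet<Row> rowset = mysql.query(sql).execute().result();
you are starting the SQL query and immediately reading the Future's current value, without waiting for it. Future#result() does not block: it returns null until the Future has completed (and also if the operation failed). That is why you see nulls intermittently, depending on whether the query happened to finish in time. Attach a handler with onSuccess/onFailure or onComplete, or compose the Future, instead of calling result() right away.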
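Vert.x's Future.result() is a non-blocking peek: it yields null while the operation is still pending. The sketch below illustrates that behaviour with the JDK's CompletableFuture as a stand-in (an assumption made so the snippet runs without Vert.x on the classpath); getNow(null) plays the role of result(), and the class and method names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;

// JDK analogy only: CompletableFuture stands in for a Vert.x Future here.
class FutureResultDemo {

    // Non-blocking peek: returns the value if already present, otherwise null.
    static String peek(CompletableFuture<String> future) {
        return future.getNow(null);
    }

    public static void main(String[] args) {
        CompletableFuture<String> pending = new CompletableFuture<>();

        // The "query" has not finished yet, so peeking does not wait.
        System.out.println("before completion: " + peek(pending));

        // React to the result instead of peeking at it.
        pending.thenAccept(row -> System.out.println("callback got: " + row));

        // Simulate the database reply arriving later.
        pending.complete("row-1");
        System.out.println("after completion: " + peek(pending));
    }
}
```

With the real client, the equivalent of the callback line would be a handler registered on the Future returned by execute(), which is the pattern shown in snippet 1 of the question.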

Related

Understanding Signal and Query in cadence

Query - Query is to expose this internal state to the external world. A query is exposed as an asynchronous callback that is invoked by external entities.
What do you mean by asynchronous callback?
The docs also say a Query has two limitations: 1) it should not mutate the state of a Workflow, and 2) it must not perform any blocking operation.
@Override
public String queryGreeting() {
    greeting = "val";
    return greeting;
}
But I did mutate the variable in the query method, and the value does change.
Is it just a convention that we should not write mutating or blocking code inside a query method?
I didn't see any difference between a query and a signal. Is it that a query method can be called even after the workflow has completed, whereas a signal cannot?
Is my understanding correct?
The query should not mutate workflow variables. This is going to break workflow recovery.
The signal can mutate any workflow data as well as invoke blocking operations like activities.
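Why a mutating query breaks recovery can be shown with a toy model. The sketch below is not the Cadence API; it only assumes the general idea that workflow state is rebuilt by replaying a recorded history, that signals are recorded there, and that queries are not. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, not the Cadence API: state is rebuilt by replaying recorded signals.
class ReplayDemo {
    private final List<String> history = new ArrayList<>(); // recorded signal events
    private String greeting = "hello";                       // in-memory workflow state

    // A signal is recorded in the history, so its effect survives recovery.
    void signalGreeting(String value) {
        history.add(value);
        greeting = value;
    }

    // A badly behaved query: it mutates state, but nothing is recorded.
    String mutatingQuery() {
        greeting = "val";
        return greeting;
    }

    String greeting() {
        return greeting;
    }

    // Recovery on another worker: start fresh and replay the history.
    ReplayDemo recover() {
        ReplayDemo fresh = new ReplayDemo();
        for (String value : history) {
            fresh.signalGreeting(value);
        }
        return fresh;
    }

    public static void main(String[] args) {
        ReplayDemo workflow = new ReplayDemo();
        workflow.signalGreeting("hi");
        workflow.mutatingQuery();
        System.out.println("live:      " + workflow.greeting());
        System.out.println("recovered: " + workflow.recover().greeting());
    }
}
```

In this model the live instance and the recovered instance disagree after the query ran, which is the kind of divergence the rule against mutating state in queries is meant to prevent.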

Spring Boot controller preventing multiple inserts upon quick successive requests in mongodb

I have a REST API that calculates something upon a request and, if the same request is made again, returns the result from a cache, which consists of documents saved in MongoDB. To know whether two requests are the same, I hash some relevant fields of the request. But when the same request is made in quick succession, duplicate documents appear in MongoDB, which later results in an "IncorrectResultSizeDataAccessException" when I try to read them.
To solve it I tried to synchronize on hash value in following controller method (tried to cut out non relevant parts):
@PostMapping(
    path = "/{myPath}",
    consumes = {MediaType.APPLICATION_JSON_UTF8_VALUE},
    produces = {MediaType.APPLICATION_JSON_UTF8_VALUE})
@Async("asyncExecutor")
public CompletableFuture<ResponseEntity<?>> retrieveAndCache( ... a,b,c,d various request parameters) {
    //perform some validations on request...
    //hash relevant request parameters
    int hash = Objects.hash(a, b, c, d);
    synchronized (Integer.toString(hash).intern()) {
        Optional<Result> resultOpt = cacheService.findByHash(hash);
        if (resultOpt.isPresent()) {
            return CompletableFuture.completedFuture(ResponseEntity.status(HttpStatus.OK).body(resultOpt.get().getResult()));
        } else {
            Result result = ...//perform requests to external services and do some calculations...
            cacheService.save(result);
            return CompletableFuture.completedFuture(ResponseEntity.status(HttpStatus.OK).body(result));
        }
    }
}
//cacheService methods
@Transactional
public Optional<Result> findByHash(int hash) {
    return repository.findByHash(hash); //this is the part that throws the error
}
I am sure that no hash collision is occurring; it's just that when the same request is performed in quick succession, duplicate records occur. To my understanding, this shouldn't happen as long as I have only one running instance of my Spring Boot application. Do you see any reason other than multiple instances running in production?
You should check the settings of your MongoDB client.
If one thread calls the cacheService.save(result) method, and after that method returns, releases the lock, then another thread calls cacheService.findByHash(hash), it's still possible that it will not find the record that you just saved.
It's possible that e.g. the save method returns as soon as the saved object is in the transaction log, but not fully processed yet. Or the save is processed on the primary node, but the findByHash is executed on the secondary node, where it's not replicated yet.
You could use WriteConcern.MAJORITY, but I'm not 100% sure if it covers everything.
Even better is to let MongoDB do the locking by using findAndModify with FindAndModifyOptions.upsert(true), and forget about the lock in your java code.
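The idea of delegating "find or create" to a single atomic operation, instead of a separate lookup and save guarded by a Java lock, can be sketched in-process. The example below is only an analogy: it swaps MongoDB's upsert for ConcurrentHashMap.computeIfAbsent, and the class and method names are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory analogy of an atomic upsert; this is not MongoDB or Spring Data code.
class UpsertAnalogy {
    private final ConcurrentHashMap<Integer, String> store = new ConcurrentHashMap<>();
    final AtomicInteger computations = new AtomicInteger();

    // One atomic step replaces the separate "find" and "save" calls.
    String findOrCompute(int hash) {
        return store.computeIfAbsent(hash, h -> {
            computations.incrementAndGet(); // stands in for the expensive calculation
            return "result-for-" + h;
        });
    }

    public static void main(String[] args) throws InterruptedException {
        UpsertAnalogy cache = new UpsertAnalogy();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> cache.findOrCompute(42));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        // The mapping function runs at most once per key, even under contention.
        System.out.println("computations: " + cache.computations.get());
    }
}
```

Unlike this in-memory map, a database-side upsert also holds when several application instances are running, which is the case the question is worried about.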

Vertx to mongoDB connections

I'm working on a Java/Vert.x project where the backend is MongoDB (I have worked with Elixir/Erlang for some time and am quite new to Vert.x, but I believe it's the best fit). Basically, I have an HTTP API handled by some HttpServerVerticles, which need to store data in (or retrieve data from) MongoDB and send the appropriate reply to the API caller. I'm looking for the right pattern to implement the queries and the handling of the replies.
From the official guide and some tutorials, I see that for a relational JDBC database, it is necessary to define a dedicated verticle that will handle queries asynchronously. This was my first try with the mongo client but it introduces a lot of boilerplate.
On the other hand, the Mongo client documentation says that it is "completely non-blocking" and that it has its own connection pool. Does that mean that we can safely (from the Vert.x event loop's point of view) define and use the Mongo client directly in the HTTP verticle?
Is there any alternative pattern?
Versions : vertx:3.5.4 / mongodb:4.0.3
It works like this: the Mongo connection pool, just like an SQL database pool, is synchronous and blocking in its nature, but it is wrapped in a non-blocking Vert.x API.
So, instead of a normal blocking way of
JsonObject obj = mongo.get( someQuery )
you have rather a non-blocking call out of the box:
mongo.findOne( 'collectionName', someQuery ){ AsyncResult<JsonObject> res ->
    JsonObject obj = res.result()
    doStuff( obj )
}
That means you can safely use it directly on the event loop in any type of verticle, without reinventing the asynchronous wheel over and over again.
At our client we use mongodb-driver-rx. Vertx has support for RX (vertx-rx-java) and it fits pretty well on mongodb-driver-rx.
For more information see:
https://mongodb.github.io/mongo-java-driver-rx/
https://vertx.io/docs/vertx-rx/java/
https://github.com/vert-x3/vertx-examples/blob/master/rxjava-2-examples/src/main/java/io/vertx/example/reactivex/database/mongo/Client.java

MongoDB and ExpressJS: trying to understand code

function _allUsers(callback){
    var db = connect.get();
    db.collection("users").find({}).toArray(function(err,data){
        if(err){
            callback(err);
        }else{
            callback(null,data);
        }
    });
}
I am trying to understand this code. I have been looking around the web, but I find the explanations kind of difficult to understand (I am new to the MEAN stack), so my questions are:
What does the collection method do? I am not sure, but is the string "users" just the name of our collection with all users?
Why do we have to use a callback in this situation? (I find callbacks very confusing.)
And why do we have to give the toArray function an anonymous function?
Instead of toArray, could I use the pretty() method without any anonymous function as a parameter?
The MEAN stack is a bundle of software for building applications written entirely in JavaScript. This means you can use JavaScript everywhere, from your database to your back end and front end.
MEAN is an acronym formed from the first letter of each program in the stack: MongoDB, Express.js, AngularJS and Node.js.
1. MongoDB is a NoSQL database which uses BSON (similar to JSON) to store so-called documents. Think of a document as a single entity or row in a traditional database. These documents are stored in collections, which can be compared to tables.
So the answer to your first question is: it opens up the users collection, which grants access to all the user documents.
2. Node.js is asynchronous by design. This allows Node.js to perform a lot of operations while running on a single thread*. Because Node.js is single-threaded, we need a way to write non-blocking code, meaning we can start an operation, proceed with executing other code, and come back whenever that operation is finished.
In your case we request access to the users collection, and this takes some time. To allow other parts of our application to continue processing, we use a callback. When we have access to our collection, our callback is executed and we can perform whatever operation we wanted to do when we first requested access.
*Node.js actually runs on multiple threads, but a developer never has to worry about multithreading; Node.js does that for us.
3. This is exactly what the previous point is about.
The .toArray() method returns an array that contains all the documents from a cursor. The method iterates the cursor completely, loading all the documents into RAM and exhausting the cursor.
.toArray() can be an expensive operation, because it has to fetch every document. Since we do not want to wait until .toArray() is finished, but want to proceed with the rest of our code, we give it a callback so that we can come back to our collection processing whenever it's ready.
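4. Calling toArray() without a callback does not turn it into a blocking call. In versions of the Node.js driver that support promises, it returns a Promise instead, so `users` below would hold a Promise rather than an array:
var users = db.collection("users").find({}).toArray();
To use the documents you would still need .then(...) or await. Truly blocking here would stall your code entirely, and there is never a good reason to do that.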
Disclaimer: I left out or oversimplified details in this explanation for ease of understanding.
db.collection('users') returns the users collection instance.
We use a callback because the call is asynchronous.
The anonymous function passed to toArray is its callback.
Whether you can call it without an anonymous function as a parameter depends on the library in use.
Express.js code is asynchronous, so we need callbacks or Promises.
You can think of the collection as a table in MySQL. A collection consists of documents (rows/items/records in MySQL). Your example accesses the users collection and finds all documents (records) in it.
About the callbacks: Node.js/Express are commonly callback-oriented. This is the pattern they use, and most code follows it, because it is asynchronous. If you need to be sure that some snippet is executed right after some other snippet, you have to use a callback (or a promise).
Calling toArray() depends on what your callback expects. You can skip calling this method if the callback expects the cursor returned by the find() method. All of that depends on your callback.
You can use a non-anonymous (named) function too, but you have to keep the asynchronous logic in mind and continue using callbacks/promises. You can read more about callbacks and promises in this Quora article.
Here you can find more about the find() method.

Wrap Slick queries in Futures

I am trying to query a MySQL database asynchronously using Slick. The following code template, which I use to query about 90k rows in a for comprehension, seems to be working initially, but the program consumes several gigabytes of RAM and fails without warning after around 200 queries.
import scala.slick.jdbc.{StaticQuery => Q}
def doQuery(): Future[List[String]] = future {
  val q = "select name from person"
  db withSession {
    Q.query[String](q).list
  }
}
I have tried setting up connections both using the fromURL method and also using a c3p0 connection pool. My question is: Is this the way to do asynchronous calls to the database?
Async is still an open issue for Slick.
You could try using Iterables and stream data instead of storing it in memory with a solution similar to this: Treating an SQL ResultSet like a Scala Stream
Although please omit the .toStream call at the end. It will cache the data in memory, while Iterable will not.
If you want an async version of iterable you could look into Observables.
It turns out that this is a non-issue (actually a bug in my code, which opened a new database connection for each query). In my experience, you can wrap DB queries in Futures as shown above and compose them later with Scala Async or Rx, as shown here. All that is required for good performance is a large thread pool (twice the number of CPUs in my case) and an equally large connection pool.
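The pattern described above, blocking queries wrapped in futures and run on a bounded pool, can be sketched in Java as follows. This is a minimal illustration rather than Slick code: blockingQuery is a hypothetical stand-in for the real database call, and the pool size of twice the CPU count simply mirrors the figure mentioned in the answer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BlockingQueryPool {

    // Hypothetical stand-in for a blocking JDBC/Slick query.
    static List<String> blockingQuery(int id) {
        return Collections.singletonList("name-" + id);
    }

    // Runs all queries on a fixed-size pool and collects the results in submission order.
    static List<String> runAll(int queries, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<CompletableFuture<List<String>>> futures = new ArrayList<>();
            for (int i = 0; i < queries; i++) {
                final int id = i;
                futures.add(CompletableFuture.supplyAsync(() -> blockingQuery(id), pool));
            }
            List<String> names = new ArrayList<>();
            for (CompletableFuture<List<String>> future : futures) {
                names.addAll(future.join()); // wait for each query in turn
            }
            return names;
        } finally {
            pool.shutdown(); // release the worker threads
        }
    }

    public static void main(String[] args) {
        int poolSize = 2 * Runtime.getRuntime().availableProcessors();
        System.out.println("rows fetched: " + runAll(200, poolSize).size());
    }
}
```

Because the pool is bounded, at most poolSize queries are in flight at once, so a connection pool of the same size is never oversubscribed.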
Slick 3 (Reactive Slick) looks like it might address this.