MongoDb and expressJS trying to undertand Code - mongodb

function _allUsers(callback){
var db = connect.get();
db.collection("users").find({}).toArray(function(err,data){
if(err){
callback(err);
}else{
callback(null,data);
}
});
}
I am trying to understand this code, I have been looking around the web but I find the explanations kinda defficult to understand ( I am new at Mean stack), so my questions are:
What does the Collection method do? I am not sure but the string "users" is it just the name of our collection with all users?
Why do we have to use a callback in this situation? (I find callbacks very confusing).
And why do we have to give toArray function, an annonymous function?
Instead of toArray could I use pretty method() without any annonymous function as a parameter?

MEAN Stack is a software bundle of software programs supporting applications written in all javascript. This means you can use javascript from your database, to your back-end and front-end.
MEAN actually stands for the first characters of each software program included in the stack. MongoDB, Expressjs, AngularJS and NodeJS.
1
MongoDB is a NoSQL database which uses BSON (similar to JSON) to store so called documents. Look at a document as if it is a single entity or row in a traditional database. These entities (or rows) are stored in collections (a collection of documents) which can be compared to tables.
So the answer to your 1st question is opens up the users collection, which grants access to all the user documents.
2
NodeJS is asynchronous by design. This allows NodeJS to perform a lot of operations while running on a single thread*. Because NodeJS is single-threaded we need a way to write our code non-blocking meaning we can start an operation, proceed with executing other code and come back whenever that operation is finished.
In your case we request access to the users collection, this takes some time. In order to allow other parts of our application to continue processing we use a callback. When we have access to our collection, our callback is executed and we can perform whatever operation we wanted to do when we first requested access.
*NodeJS actually runs on multiple threads but a developer never has to worry about multithreading, NodeJS does that for us.'
3
This is exactly what the previous point is about.
The .toArray() method returns an array that contains all the documents from a cursor. The method iterates completely the cursor, loading all the documents into RAM and exhausting the cursor. Source
.toArray() is a computionally intensive operation. Since we do not want to wait untill .toArray() is finished but proceed processing the rest of our code, we give it a callback so that we can come back to our collection processing whenever it's ready.
4
From what I can read from the docs I guess you could indeed write blocking code and do it this way:
var users = db.collection("users").find({}).toArray();
This however will block your code entirely. There is never a good reason to do this.
Disclaimer: I left out or oversimplified details in this explanation for ease of understanding.

db.collection('users') this will return the users collection instance
we are using callback for asynchronous
the annonymous function in toArray is its callback
this is dependent on the library in use..
without any annonymous function as a parameter
expressjs is an asynchronous programming, we need callback || Promises

You can think of the collection as of table in MySQL. A collection consists of documents (rows/items/records in MySQL). Your example calls the Users collection and finds all documents (records) in it.
About the callbacks - NodeJS/Express are commonly callbacks-oriented. This is the pattern they use and most of the code is using it, because it is asynchronous. If you need to be sure that some snippet is executed right after some other snippet, you have to use callback (or promise).
Calling toArray() depends on what your callback expects. You can skip calling this method if the callback expects the Query object returned by the find() method. All that depends on your callback.
You can use non-anonymous function, too, but you have to have in mind the asynchronous logic and continue using callbacks/promises. You can read more about callbacks and promises in this Quora's article.
Here you can find more about the find() method.

Related

Can I assume that all databases will return a promise?

I am designing a set of unit and integration tests with a friend of mine, and we had a doubt. We know the answer, at least, what is more likely to be true. However, we would like to hear your thoughts.
We are designing a test for MongoDB, and we expect that we should receive a promise, after asking to save a document. So far so good.
What about if we change the database??? can we assume for sure that all databases when queried will return a promise??
I guess regarding the _id, it depends from database to database, we are using _id from MongoDB for testing reasons.
We are using the following test using in Jest
create://this method do exist on service, however, at the moment of testing, it is empty, just a placeholder
jest.fn().mockImplementation((cat: CreateCatDto) =>
Promise.resolve({ _id: 'a uuid', ...cat })
)
The idea is to design backends that do not depend on the database, but for testing and developing reasons, we are using MongoDB and PostgreSQL.
Keep in mind the term promise doesn't exist as a concept for all databases, and to provide a conclusive answer to your question for all databases is not possible.
That being said, if by promise you mean the primary key or identity (in general database terms) after inserting new data, then the answer is no, there is no guarantee, not even on PostgreSQL that'll you be able to do that. It's possible for tables, even in PostgreSQL, to exist without those constraints.
Otherwise, if by promise, you mean specifically the concept in a procedural or functional language like JavaScript (as your example code in your update indicates), then yes, you should always receive a promise object if your application code is utilizing asynchronous calls appropriately.
But that would be regardless of what the asynchronous call was to whether it's a database directly (and regardless of what database system that was), an API endpoint, or another piece of application code. Also, in that case, your question (or any follow up questions) would be better suited for StackOverflow.com.
Can I assume that all databases return a promise?
No. Most if not all database wire protocols are synchronous, meaning the client blocks until it gets a response. Even the databases that expose some sort of RESTful APIs are synchronous, because HTTP.
Some client-side drivers may wrap this synchronous logic and exhibit asynchronous behaviour by returning something like JavaScrpt Promises or Java Futures, but it is entirely up to the driver implementation you choose to use.

CQRS - How to handle if a command requires data from db (query)

I am trying to wrap my head around the best way to approach this problem.
I am importing a file that contains bunch of users so I created a handler called
ImportUsersCommandHandler and my command is ImportUsersCommand that has List<User> as one of the parameters.
In the handler, for each user that I need to import I have to make sure that the UserType is valid, this is where the confusion comes in. I need to do a query against the database, to get list of all possible user types and than for each user I am importing, I want to verify that the user type id in the import matches one that is in the db.
I have 3 options.
Create a query GetUserTypesQuery and get the rest of this and then pass it on to the ImportUsersCommand as a list and verify inside the command handler
Call the GetUserTypesQuery from the command itself and not pass it (command calling another query)
Do not create a GetUsersTypeQuery and just do the query results within the command (still a query but no query/handler involved)
I feel like all these are dirty solutions and not the correct way to apply CQRS.
I agree option 1 sounds the best but would maybe suggest adding a pre handler to validate your input?
So ImportUsersCommandHandler deals with importing you data (and only that) and add a handler that runs before that validates (in your example, checks the user types and maybe other stuff) and bails out of it does not pass. So it queries the db, checks the usertypes and does whatever it needs to if it fails. Otherwise it just passes down to your business handler (ImportUsersCommandHandler).
I am used to using Mediatr in NET Core and this pattern works well (this is what we do) so sorry if this does not fit with your environment/setup!

How to update data with GraphQL

I am studying graphql.
I can retrieve data from my mongo database with queries, I can create data with mutations.
But how I can modify existing data?
I am a bit lost here...
I have to create a new mutation?
Yes, every mutation describes a specific action that can be done to a bit of data. GraphQL is not like REST - it doesn't specify any standard CRUD-type actions.
When you are writing a mutation to update some data, you have two options. Let's explain them in the context of a todo item that has a completed status, and a text field:
Write mutations that represent semantic actions - markTodoCompleted, updateTodoText, etc.
Write a generic mutation that just sets any properties passed it, you could call it updateTodo.
I prefer the first approach, because it makes it more clear what the client is doing when it calls a certain mutation. In the second approach, you need to be careful to validate the values to be set to make sure someone can't set some invalid combination.
In short, you need to define your own mutations to update data.

MongoDB - is this efficient?

I have a collection users in mongodb. I am using the node/mongodb API.
In my node.js if I go:
var users = db.collection("users");
and then
users.findOne({ ... })
Is this idiomatic and efficient?
I want to avoid loading all users into application memory.
In Mongo, collection.find() returns a cursor, which is an abstract representation of the results that match your query. It doesn't return your results over the wire until you iterate over the cursor by calling cursor.next(). Before iterating over it, you can, for instance, call cursor.limit(x) to limit the number of results that it is allowed to return.
collection.findOne() is effectively just a shortcut version of collection.find().limit(1).next(). So the cursor is never allowed to return more than the one result.
As already explained, the collection object itself is a facade allowing access to the collection, but which doesn't hold any actual documents in memory.
findOne() is very efficient, and it is indeed idiomatic, though IME it's more used in dev/debugging than in real application code. It's very useful in the CLI, but how often does a real application need to just grab any one document from a collection? The only case I can think of is when a given query can only ever return one document (i.e. an effective primary key).
Further reading:
Cursors,
Read Operations
Yes, that should only load one user into memory.
The collection object is exactly that, it doesn't return all users when you create a new collection object, only when you either use a findOne or you iterate a cursor from the return of a find.

is this a bad habit when using a database call?

i'm using tornado for its simplicity, and am using it with Pymongo, so because i hear always about asynchronous calls, to serve lot of clients, then, i was asking, what is really an asynchronous calls to a database, so this code for example:
for example, suppose a page where a user have 4 areas where to search, so the result will be a 4 results.
A = calls the database to search for an element a.
B = calls the database to search for an element b.
C = calls the database to search for an element c.
D = calls the database to search for an element d.
then render a pages where a user will see the results (a,b,c,d)
so, this will be a killer for the server, since he must stay for all the 4 requests to finish, or do it serve the first result and then wait even if the database calls are blocking and make a bucket where he joins all the results to be served to the client? or the split of the 4 operations must be done with asynchronous database library (like Motor or Asyncmongo)?
Every call to PyMongo will block Tornado's IOLoop and prevent further processing of any client HTTP request until the PyMongo method completes.
http://api.mongodb.org/python/current/faq.html#does-pymongo-support-asynchronous-frameworks-like-gevent-tornado-or-twisted