How to make Mongoose model.insertMany insert documents with numerical and ordered ids?

I have this route in the backend express server:
router.route('/fillInformationAssetsSeverityEvaluation').post((req, res) => {
  informationAssetsSeverityEvaluationRow.remove({}, (err) => {
    if (err)
      console.log(err);
    else
      // res.json("informationAssets Collection has been dropped!");
      res.json('information Assets Severity Evaluation data has been received on the server side');
    informationAssetsSeverityEvaluationRow.insertMany([req.body[0]], {
      multi: true
    }).then(documentsInserted => {
      console.log('[req.body[0]]: ', [req.body[0]]);
      console.log('documentsInserted: ', documentsInserted);
      console.log('You have successfully inserted ', documentsInserted.length, ' documents in the informationAssetsSeverityEvaluation collection');
    });
  });
});
For the sake of simplicity, I am inserting only one document:
[req.body[0]]:
{ REF: 'REFSHIT',
  confFin: 'A',
  confRep: 'A' }
But in the real application, I am inserting multiple documents similar to that.
This console.log:
console.log('documentsInserted: ', documentsInserted);
logs:
documentsInserted: [ { _id: 5d3453afc302d718e4870b53,
    REF: 'REFSHIT',
    confFin: 'A',
    confRep: 'A' } ]
As you can see, the _id is automatically generated:
_id: 5d3453afc302d718e4870b53
What I would like is for the ids of the different documents to be numerically ordered, i.e.:
Document 0 would have id 0
Document 1 would have id 1
Document 2 would have id 2
And so on and so forth.
After doing some research, I found out that I can do this by setting the id manually inside the insertMany objects (sketched below). However, since I receive the document objects from the request body, this is not a viable solution.
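For illustration, that manual approach would amount to something like this (a rough sketch; ids derived from array position would collide across separate requests, which is exactly why it doesn't work for me):

// Sketch of the manual approach: assign sequential _ids by array position.
// Only safe for a single batch; a second request would reuse the same ids.
const docsWithIds = req.body.map((doc, index) => ({ _id: index, ...doc }));
informationAssetsSeverityEvaluationRow.insertMany(docsWithIds)
  .then(documentsInserted => console.log(documentsInserted.length, 'documents inserted'));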
Any help?

Finally, after trying four modules and a couple of days of effort for something that should be native to MongoDB, I have found a simple solution. I hope it helps someone.
1/ Install mongoose-plugin-autoinc
2/
import mongoose from 'mongoose';
import { autoIncrement } from 'mongoose-plugin-autoinc';

const connection = mongoose.createConnection("mongodb://localhost/myDatabase");

const BookSchema = new mongoose.Schema({
  author: { type: mongoose.Schema.Types.ObjectId, ref: 'Author' },
  title: String,
  genre: String,
  publishDate: Date
});

BookSchema.plugin(autoIncrement, 'Book');
const Book = connection.model('Book', BookSchema);
3/ In my case I have the models defined in models.js and the connection defined in server.js, so I had to write this:
BookSchema.plugin(autoIncrement, 'Book');
in models.js
and instead of
const Book = connection.model('Book', BookSchema);
I have:
module.exports = {
  informationAssetsRow: mongoose.model('informationAssetsRow', informationAssetsRow),
};
And in server.js:
const {
  informationAssetsRow,
} = require('./models/models')
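For what it's worth, once the plugin is registered on the schema, saved documents get sequential numeric _ids. A minimal sketch of the behaviour (model name taken from my models.js above; the exact numbering depends on the plugin's options, which default to starting at 0):

// Sketch: with autoIncrement registered on the schema, each saved
// document receives the next numeric _id instead of an ObjectId.
new informationAssetsRow({ REF: 'REF1' }).save()
  .then(doc => console.log(doc._id)); // 0 for the first document, then 1, 2, ...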

Related

MongoDB query with 300k documents takes more than 30 seconds

OK, as said in the title, I have a performance issue where I need to get all documents from a collection, but it takes too long. The players collection contains around 300k small documents, and the query in the service goes like this:
async getAllPlayers() {
  const players = await this.playersCollection.find({}, { projection: { playerId: 1, name: 1, surname: 1, shirtNumber: 1, position: 1 } }).toArray();
  return players;
}
The overall size is 6.4 MB. I'm using the Fastify adapter, fastify-compress and the MongoDB native driver. If I remove the projection, it takes almost a minute.
Any idea how to improve this?
The best time I get is 8 seconds, where fast-json-stringify gives me a boost of more than 10 seconds over 300k records:
'use strict'

// run fresh mongo
// docker run --name temp --rm -p 27017:27017 mongo

const fastify = require('fastify')({ logger: true })
const fjs = require('fast-json-stringify')

const toString = fjs({
  type: 'object',
  properties: {
    playerId: { type: 'integer' },
    name: { type: 'string' },
    surname: { type: 'string' },
    shirtNumber: { type: 'integer' },
  }
})

fastify.register(require('fastify-mongodb'), {
  forceClose: true,
  url: 'mongodb://localhost/mydb'
})

fastify.get('/', (request, reply) => {
  const dataStream = fastify.mongo.db.collection('foo')
    .find({}, {
      limit: 300000,
      projection: { playerId: 1, name: 1, surname: 1, shirtNumber: 1, position: 1 }
    })
    .stream({
      transform(doc) {
        return toString(doc) + '\n'
      }
    })

  reply.type('application/jsonl')
  reply.send(dataStream)
})

fastify.get('/insert', async (request, reply) => {
  const collection = fastify.mongo.db.collection('foo')
  const batch = collection.initializeOrderedBulkOp()

  for (let i = 0; i < 300000; i++) {
    const player = {
      playerId: i,
      name: `Name ${i}`,
      surname: `surname ${i}`,
      shirtNumber: i
    }
    batch.insert(player)
  }

  const { result } = await batch.execute()
  return result
})

fastify.listen(8080)
In any case, you should consider:
paginating your output (see the sketch below), or
pushing the data into a bucket (like S3) and returning to the client a URL to download the file directly; this will speed up the process a lot and spare your Node.js process from streaming this data.
Note that compression in Node.js is a heavy process, so it slows the response down a lot. An nginx proxy can add it by default without the need to implement it in your business-logic server.
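If you go the pagination route, a range-based query on an indexed field scales better than skip/limit. A rough sketch under these assumptions: the numeric playerId field from the insert route above, the same fastify-mongodb setup, and a made-up route name and page size:

// Hypothetical paginated route: clients pass the last playerId they saw
// and receive the next page. Range queries on an indexed field avoid
// the growing cost of large skips.
fastify.get('/players', async (request) => {
  const lastId = Number(request.query.lastId ?? -1)
  const pageSize = 1000
  return fastify.mongo.db.collection('foo')
    .find({ playerId: { $gt: lastId } }, {
      projection: { playerId: 1, name: 1, surname: 1, shirtNumber: 1 }
    })
    .sort({ playerId: 1 })
    .limit(pageSize)
    .toArray()
})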

Get all documents from an authenticated user (relation OneToMany)

I am learning to use MongoDB and Express.js by building a REST API that I will use with React.
I have always chosen MySQL for managing my databases, but MongoDB is not relational and it is still difficult for me to understand.
An example of what I want to do
Let's say that I have created a blog and want to get all the articles from a user logged in with an account.
All these operations are managed with a REST API and MongoDB.
How do I create a OneToMany relationship between articles and a user?
With MySQL I just had to specify a user_id key for each article in an article table.
But with MongoDB, how do I create this, especially for a user who is logged in with an account, so that only a logged-in user can view his articles?
EDIT
I have tried something; it works, but I don't know if it's the right approach.
Context:
I made a REST API with NodeJS and ExpressJS.
The API will allow a user to organize their applications to facilitate the search for a job.
A user must create an account and log in to take advantage of all of the application's features, so no information is publicly available.
For registration and authentication of a user, I use Passport.js, connect-mongo and express-session.
To start, the MongoDB User model:
const userSchema = mongoose.Schema({
  name: {
    type: String
  },
  email: {
    type: String,
    required: true,
    unique: true
  },
  email_is_verified: {
    type: Boolean,
    default: false
  },
  password: {
    type: String,
  },
  referral_code: {
    type: String,
    default: function() {
      let hash = 0;
      for (let i = 0; i < this.email.length; i++) {
        hash = this.email.charCodeAt(i) + ((hash << 5) - hash);
      }
      let res = (hash & 0x00ffffff).toString(16).toUpperCase();
      return "00000".substring(0, 6 - res.length) + res;
    }
  },
  referred_by: {
    type: String,
    default: null
  },
  third_party_auth: [ThirdPartyProviderSchema],
  date: {
    type: Date,
    default: Date.now
  }
},
{ strict: false }
);

module.exports = mongoose.model('Users', userSchema);
The Apply model represents an application for a job; for now there is only a title.
To create the OneToMany relationship, I add a user field which refers to my User model:
const applySchema = mongoose.Schema({
  title: { type: String, required: true },
  user: {
    type: mongoose.Schema.Types.ObjectId,
    ref: "User"
  }
})

module.exports = mongoose.model('Apply', applySchema);
I created a controller for managing a user's applies. The function to retrieve all applies gets the user id from the session:
exports.getAllApplies = (req, res, next) => {
  res.locals.currentUser = req.user;
  const userId = res.locals.currentUser.id;
  Apply.find({ user: userId })
    .then(applies => res.status(200).json({ message: 'success', applies: applies }))
    .catch(error => res.status(400).json({ error: error, message: 'Failed' }));
}
The function allowing a user to view a single apply:
exports.getOneApply = (req, res, next) => {
  res.locals.currentUser = req.user;
  const userId = res.locals.currentUser.id;
  Apply.findOne({ _id: req.params.id, user: userId })
    .then(apply => res.status(200).json({ message: `Apply with id ${apply._id} success`, apply: apply }))
    .catch(error => res.status(500).json({ error: error, message: 'Failed' }));
}
The routes of my API; I add an auth middleware to allow requests only from an authenticated user (the middleware itself is sketched after the routes):
const express = require('express');
const router = express.Router();
const auth = require('../middleware/auth');
const applyCtrl = require('../controllers/apply');
router.get('/', auth, applyCtrl.getAllApplies);
router.get('/:id', auth, applyCtrl.getOneApply);
module.exports = router;
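The auth middleware essentially just rejects requests that don't have an authenticated session; a simplified sketch (assuming Passport's req.isAuthenticated(), since Passport populates req.user for a logged-in session):

// Simplified sketch of the auth middleware: reject requests
// that have no authenticated session user.
module.exports = (req, res, next) => {
  if (req.isAuthenticated && req.isAuthenticated()) {
    return next();
  }
  return res.status(401).json({ message: 'Unauthorized' });
};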
I apologize for the length of the post, if you have any questions, I would be happy to answer them.
Thank you in advance for your help and your answers.

How to connect PostgreSQL with GraphQL [duplicate]

GraphQL has mutations, Postgres has INSERT; GraphQL has queries, Postgres has SELECTs; etc. I haven't found an example showing how you could use both in a project, for example passing all the queries from the front end (React, Relay) in GraphQL, but actually storing the data in Postgres.
Does anyone know what Facebook is using as DB and how it's connected with GraphQL?
Is the only option of storing data in Postgres right now to build custom "adapters" that take the GraphQL query and convert it into SQL?
GraphQL is database agnostic, so you can use whatever you normally use to interact with the database, and use the query or mutation's resolve method to call a function you've defined that will get/add something to the database.
Without Relay
Here is an example of a mutation using the promise-based Knex SQL query builder, first without Relay to get a feel for the concept. I'm going to assume that you have created a userType in your GraphQL schema that has three fields, id, username, and created (all required), and that you have a getUser function already defined which queries the database and returns a user object. In the database I also have a password column, but since I don't want that queried I leave it out of my userType.
// db.js
// take a user object and use knex to add it to the database, then return the newly
// created user from the db.
const addUser = (user) => (
  knex('users')
    .returning('id') // returns [id]
    .insert({
      username: user.username,
      password: yourPasswordHashFunction(user.password),
      created: Math.floor(Date.now() / 1000), // Unix time in seconds
    })
    .then((id) => (getUser(id[0])))
    .catch((error) => (
      console.log(error)
    ))
);
// schema.js
// the resolve function receives the query inputs as args, then you can call
// your addUser function using them
const mutationType = new GraphQLObjectType({
  name: 'Mutation',
  description: 'Functions to add things to the database.',
  fields: () => ({
    addUser: {
      type: userType,
      args: {
        username: {
          type: new GraphQLNonNull(GraphQLString),
        },
        password: {
          type: new GraphQLNonNull(GraphQLString),
        },
      },
      resolve: (_, args) => (
        addUser({
          username: args.username,
          password: args.password,
        })
      ),
    },
  }),
});
Since Postgres creates the id for me and I calculate the created timestamp, I don't need them in my mutation query.
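For reference, calling this mutation would look something like this (a sketch using graphql-js directly, assuming you've assembled these types into a GraphQLSchema named schema; the example credentials are made up):

// Sketch: executing the addUser mutation with graphql-js.
const { graphql } = require('graphql');

const mutation = `
  mutation {
    addUser(username: "janedoe", password: "hunter2") {
      id
      username
      created
    }
  }
`;

graphql(schema, mutation).then((result) => console.log(result));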
The Relay Way
Using the helpers in graphql-relay and sticking pretty close to the Relay Starter Kit helped me, because it was a lot to take in all at once. Relay requires you to set up your schema in a specific way so that it can work properly, but the idea is the same: use your functions to fetch from or add to the database in the resolve methods.
One important caveat is that the Relay way expects that the object returned from getUser is an instance of a class User, so you'll have to modify getUser to accommodate that.
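For example, getUser might be adapted along these lines (a sketch, assuming a plain User class and the Knex setup from above):

// Sketch: wrap the database row in a User instance so nodeDefinitions
// can resolve the returned object back to its GraphQL type.
class User {}

const getUser = (id) => (
  knex('users')
    .where({ id })
    .first()
    .then((row) => Object.assign(new User(), row))
);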
The final example using Relay (fromGlobalId, globalIdField, mutationWithClientMutationId, and nodeDefinitions are all from graphql-relay):
/**
* We get the node interface and field from the Relay library.
*
* The first method defines the way we resolve an ID to its object.
* The second defines the way we resolve an object to its GraphQL type.
*
* All your types will implement this nodeInterface
*/
const { nodeInterface, nodeField } = nodeDefinitions(
  (globalId) => {
    const { type, id } = fromGlobalId(globalId);
    if (type === 'User') {
      return getUser(id);
    }
    return null;
  },
  (obj) => {
    if (obj instanceof User) {
      return userType;
    }
    return null;
  }
);
// a globalId is just a base64 encoding of the database id and the type
const userType = new GraphQLObjectType({
  name: 'User',
  description: 'A user.',
  fields: () => ({
    id: globalIdField('User'),
    username: {
      type: new GraphQLNonNull(GraphQLString),
      description: 'The username the user has selected.',
    },
    created: {
      type: GraphQLInt,
      description: 'The Unix timestamp in seconds of when the user was created.',
    },
  }),
  interfaces: [nodeInterface],
});
// The "payload" is the data that will be returned from the mutation
const userMutation = mutationWithClientMutationId({
name: 'AddUser',
inputFields: {
username: {
type: GraphQLString,
},
password: {
type: new GraphQLNonNull(GraphQLString),
},
},
outputFields: {
user: {
type: userType,
resolve: (payload) => getUser(payload.userId),
},
},
mutateAndGetPayload: ({ username, password }) =>
addUser(
{ username, password }
).then((user) => ({ userId: user.id })), // passed to resolve in outputFields
});
const mutationType = new GraphQLObjectType({
name: 'Mutation',
description: 'Functions to add things to the database.',
fields: () => ({
addUser: userMutation,
}),
});
const queryType = new GraphQLObjectType({
  name: 'Query',
  fields: () => ({
    node: nodeField,
    user: {
      type: userType,
      args: {
        id: {
          description: 'ID number of the user.',
          type: new GraphQLNonNull(GraphQLID),
        },
      },
      resolve: (root, args) => getUser(args.id),
    },
  }),
});
We address this problem in Join Monster, a library we recently open-sourced to automatically translate GraphQL queries to SQL based on your schema definitions.
This GraphQL Starter Kit can be used for experimenting with GraphQL.js and PostgreSQL:
https://github.com/kriasoft/graphql-starter-kit - Node.js, GraphQL.js, PostgreSQL, Babel, Flow
(disclaimer: I'm the author)
Have a look at graphql-sequelize for how to work with Postgres.
For mutations (create/update/delete) you can look at the examples in the relay repo for instance.
PostGraphile (https://www.graphile.org/postgraphile/) is open source.
Rapidly build highly customisable, lightning-fast GraphQL APIs
PostGraphile is an open-source tool to help you rapidly design and
serve a high-performance, secure, client-facing GraphQL API backed
primarily by your PostgreSQL database. Delight your customers with
incredible performance whilst maintaining full control over your data
and your database. Use our powerful plugin system to customise every
facet of your GraphQL API to your liking.
You can use an ORM like Sequelize if you're using JavaScript, or TypeORM if you're using TypeScript.
FB is probably using MongoDB or another NoSQL database on the backend. I've recently read a blog entry which explains how to connect to MongoDB. Basically, you need to build a graph model to match the data you already have in your DB, then write resolver functions to tell GraphQL how to behave when serving a query request.
See https://www.compose.io/articles/using-graphql-with-mongodb/
Have a look at SequelizeJS, which is a promise-based ORM that can work with a number of dialects: PostgreSQL, MySQL, SQLite and MSSQL.
The code below is pulled right from its example:
const Sequelize = require('sequelize');

const sequelize = new Sequelize('database', 'username', 'password', {
  host: 'localhost',
  dialect: 'mysql'|'sqlite'|'postgres'|'mssql',
  pool: {
    max: 5,
    min: 0,
    acquire: 30000,
    idle: 10000
  },
  // SQLite only
  storage: 'path/to/database.sqlite',
  // http://docs.sequelizejs.com/manual/tutorial/querying.html#operators
  operatorsAliases: false
});

const User = sequelize.define('user', {
  username: Sequelize.STRING,
  birthday: Sequelize.DATE
});

sequelize.sync()
  .then(() => User.create({
    username: 'janedoe',
    birthday: new Date(1980, 6, 20)
  }))
  .then(jane => {
    console.log(jane.toJSON());
  });
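To tie this back to GraphQL, a resolver can then call the Sequelize model directly; a sketch reusing the userType shape from the first answer (userType is assumed from there, not defined here):

// Sketch: a GraphQL query field backed by the Sequelize model above.
// findByPk is Sequelize v5+; older versions use findById.
const queryType = new GraphQLObjectType({
  name: 'Query',
  fields: () => ({
    user: {
      type: userType,
      args: { id: { type: new GraphQLNonNull(GraphQLID) } },
      resolve: (root, args) => User.findByPk(args.id), // returns a promise
    },
  }),
});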

Asynchronous Issues with JEST and MongoDB

I am getting inconsistent results with Jest when I try to remove items from a MongoDB collection using the beforeEach() hook.
My Mongoose schema and model are defined as:
// Define Mongoose wafer sort schema
const waferSchema = new mongoose.Schema({
  productType: {
    type: String,
    required: true,
    enum: ['A', 'B'],
  },
  updated: {
    type: Date,
    default: Date.now,
    index: true,
  },
  waferId: {
    type: String,
    required: true,
    trim: true,
    minlength: 7,
  },
  sublotId: {
    type: String,
    required: true,
    trim: true,
    minlength: 7,
  },
});

// Define unique key for the schema
waferSchema.index({ waferId: 1, sublotId: 1 }, { unique: true });

const Wafer = mongoose.model('Wafer', waferSchema);
module.exports.Wafer = Wafer;
My Jest tests:
describe('API: /WT', () => {
  // Happy Path for Posting Object
  let wtEntry = {};

  beforeEach(async () => {
    wtEntry = {
      productType: 'A',
      waferId: 'A01A001.3',
      sublotId: 'A01A001.1',
    };
    await Wafer.deleteMany({});
    // I also tried to pass in done and then call done() after the delete
  });

  describe('GET /:id', () => {
    it('Return Wafer Sort Entry with specified ID', async () => {
      // Create a new wafer Entry and Save it to the DB
      const wafer = new Wafer(wtEntry);
      await wafer.save();
      const res = await request(apiServer).get(`/WT/${wafer.id}`);
      expect(res.status).toBe(200);
      expect(res.body).toHaveProperty('productType', 'A');
      expect(res.body).toHaveProperty('waferId', 'A01A001.3');
      expect(res.body).toHaveProperty('sublotId', 'A01A001.1');
    });
  });
});
So the error I always get is related to duplicate keys when I run my tests more than once:
MongoError: E11000 duplicate key error collection: promis_tests.promiswts index: waferId_1_sublotId_1 dup key: { : "A01A001.3", : "A01A001.1" }
But I do not understand how I can get this duplicate key error if beforeEach() were firing properly. Am I trying to clear the collection improperly? I've tried passing a done callback to beforeEach and invoking it after the delete command. I've also tried running the delete in beforeAll(), afterEach(), and afterAll(), but I still get inconsistent results. I'm pretty stumped on this one. I might just remove the schema index altogether, but I would like to understand what is going on here with beforeEach(). Thanks in advance for any advice.
It might be because you are not actually using the promise API that Mongoose has to offer. By default, Mongoose functions like deleteMany() do not return a promise. You have to call .exec() at the end of the function chain to return one, e.g. await collection.deleteMany({}).exec(), so you are running into a race condition. deleteMany() also accepts a callback, so you could always wrap it in a promise. I would do something like this:
describe('API: /WT', () => {
  // Happy Path for Posting Object
  const wtEntry = {
    productType: 'A',
    waferId: 'A01A001.3',
    sublotId: 'A01A001.1',
  };

  beforeEach(async () => {
    await Wafer.deleteMany({}).exec();
  });

  describe('GET /:id', () => {
    it('Return Wafer Sort Entry with specified ID', async () => {
      expect.assertions(4);
      // Create a new wafer Entry and Save it to the DB
      const wafer = await Wafer.create(wtEntry);
      const res = await request(apiServer).get(`/WT/${wafer.id}`);
      expect(res.status).toBe(200);
      expect(res.body).toHaveProperty('productType', 'A');
      expect(res.body).toHaveProperty('waferId', 'A01A001.3');
      expect(res.body).toHaveProperty('sublotId', 'A01A001.1');
    });
  });
});
Also, always declare the expected number of assertions when testing asynchronous code:
https://jestjs.io/docs/en/asynchronous.html
You can read more about mongoose promises and query objects here
https://mongoosejs.com/docs/promises.html
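And if you'd rather use the callback form mentioned above, wrapping deleteMany in a promise is straightforward (a sketch):

// Sketch: wrapping the callback form of deleteMany in a promise;
// Jest waits for the returned promise before running the test.
beforeEach(() => new Promise((resolve, reject) => {
  Wafer.deleteMany({}, (err) => (err ? reject(err) : resolve()));
}));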
Without deleting the schema index, this seems to be the most reliable solution. I'm not 100% sure why it works better than await Wafer.deleteMany({}):
beforeEach((done) => {
  wtEntry = {
    productType: 'A',
    waferId: 'A01A001.3',
    sublotId: 'A01A001.1',
  };
  mongoose.connection.collections.promiswts.drop(() => {
    // Run the next test!
    done();
  });
});

Correct way to seed MongoDB with references via mongoose

I have three schemas, one which references two others:
userSchema
{ name: String }
postSchema
{ content: String }
commentSchema
{
  content: String,
  user: { ObjectID, ref: 'User' },
  post: { ObjectID, ref: 'Post' }
}
How can I seed this database in a sane, scalable way? Even using bluebird promises it quickly becomes a nightmare to write.
My attempt so far involves multiple nested promises and is very hard to maintain:
User
  .create([{ name: 'alice' }])
  .then(() => {
    return Post.create([{ content: 'foo' }])
  })
  .then(() => {
    User.find().then(users => {
      Post.find().then(posts => {
        // `users` isn't even *available* here!
        Comment.create({ content: 'bar', user: users[0], post: posts[0] })
      })
    })
  })
This is clearly not the correct way of doing this. What am I missing?
Not sure about bluebird, but the Node.js Promise.all should do the job:
Promise.all([
  User.create([{ name: 'alice' }]),
  Post.create([{ content: 'foo' }])
]).then(([users, posts]) => {
  const comments = [
    { content: 'bar', user: users[0], post: posts[0] }
  ];
  return Comment.create(comments);
})
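With async/await the same idea reads even more plainly (a sketch of the same seeding flow):

// Sketch: the same seeding flow with async/await.
async function seed() {
  const [users, posts] = await Promise.all([
    User.create([{ name: 'alice' }]),
    Post.create([{ content: 'foo' }]),
  ]);
  return Comment.create([{ content: 'bar', user: users[0], post: posts[0] }]);
}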
If you want to seed the database with automatic references, use Seedgoose.
This is the easiest seeder for you to use. You don't need to write any program files, only data files, and Seedgoose handles smart references for you. By the way, I'm the author and maintainer of this package.
Try this, it will work fine.
Note: Promise.all will make sure that both queries are executed properly and then return the results in an array: [Users, Posts].
If any error occurs during the execution of either query, it will be handled by the catch block of Promise.all.
let queryArray = [];
queryArray.push(User.create([{ name: 'alice' }]));
queryArray.push(Post.create([{ content: 'foo' }]));

Promise.all(queryArray).then(([Users, Posts]) => {
  const comments = [
    { content: 'bar', user: Users[0], post: Posts[0] }
  ];
  return Comment.create(comments);
}).catch(Error => {
  console.log("Error: ", Error);
})