Indexing has no effect on db.find() - mongodb

I've just started working with MongoDB, so my question may seem basic. I searched a lot before posting here; any help would be appreciated.
I also came across this Stack Overflow link, which advised applying .sort() to every query, but that would increase the query time.
So I tried to index my collection using .createIndexes({_id:-1}) to keep the data in descending order of creation time (newest to oldest). After that, when I used .find() to get the data in sorted order (newest to oldest), I didn't get the desired result; I still had to sort the data explicitly.
// connecting db
mongoose.connect(dbUrl, dbOptions);
const db = mongoose.connection;
// listener on db events
db.on('open', () => { console.log('DB SUCCESSFULLY CONNECTED !!'); });
db.on('error', console.error.bind(console, 'connection error:'));
// creating Schema for a person
const personSchema = new mongoose.Schema({
  name: String,
  age: Number
});
// creating model from person Schema
const person = mongoose.model('person', personSchema);
// Chronological Order of Insertion Of Data
// {name: "kush", age:22}
// {name: "clutch", age:22}
// {name: "lauv", age:22}
person.createIndexes({_id:-1}, (err) => {
  if (err) {
    console.log(err);
  }
});
person.find((err, persons) => {
  console.log(persons);
  // Output
  // [
  //   { _id: 6026eadd58a2b124d85b0f8d, name: 'kush', age: 22, __v: 0 },
  //   { _id: 6026facdf200f8261005f8e0, name: 'clutch', age: 22, __v: 0 },
  //   { _id: 6026facdf200f8261005f8e1, name: 'lauv', age: 22, __v: 0 }
  // ]
});
person.find().sort({_id:-1}).lean().limit(100).then((persons) => {
  console.log(persons);
  // Output
  // [
  //   { _id: 6026facdf200f8261005f8e1, name: 'lauv', age: 22, __v: 0 },
  //   { _id: 6026facdf200f8261005f8e0, name: 'clutch', age: 22, __v: 0 },
  //   { _id: 6026eadd58a2b124d85b0f8d, name: 'kush', age: 22, __v: 0 }
  // ]
});

Indexes are special data structures that MongoDB can use to run queries efficiently. When a query runs, MongoDB checks which of the available indexes would serve it best and then uses that index.
Creating an index with {_id:-1} creates an auxiliary data structure (the index) that is sorted newest first. It does not affect the order of the data that is stored.
To get the data in descending order (newest first), you have to add the sort explicitly to your query and make sure an index on _id is present so the sort can use it. The default _id index already qualifies, since MongoDB can traverse a single-field index in either direction.
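The difference between an index and the stored data can be illustrated with a small in-memory model. This is only a sketch in plain JavaScript, not MongoDB itself: the numeric _id values, the array standing in for the collection, and the helper names findAll and findSortedDesc are all made up for illustration.

```javascript
// In-memory stand-in for a collection, kept in insertion order.
const docs = [
  { _id: 1, name: 'kush' },
  { _id: 2, name: 'clutch' },
  { _id: 3, name: 'lauv' },
];

// "Creating an index" builds a separate, sorted list of keys.
// The docs array itself is not reordered.
const descendingIdIndex = docs.map((d) => d._id).sort((a, b) => b - a);

// A plain find(): no sort requested, so the stored order comes back.
const findAll = () => docs.slice();

// find().sort({_id: -1}): the sort is requested, and the index supplies the order.
const findSortedDesc = () =>
  descendingIdIndex.map((id) => docs.find((d) => d._id === id));

console.log(findAll().map((d) => d.name));        // [ 'kush', 'clutch', 'lauv' ]
console.log(findSortedDesc().map((d) => d.name)); // [ 'lauv', 'clutch', 'kush' ]
```

The index only helps once the query asks for an order; without .sort() nothing tells the database to consult it for ordering.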

Related

MongoDB - Get IDs of inserted and existing documents after "Insert if not exist" operation on multiple documents

I have to insert multiple documents if they don't already exist, but the important thing is that the result must give me the IDs of both the inserted and the already existing items.
I'm trying the following bulkWrite operation:
// external_id is a unique id other than the mongo _id
let items = [
  {external_id: 123, name: "John"},
  {external_id: 456, name: "Mike"},
  {external_id: 789, name: "Joseph"}
];
db.collection("my_collection")
  .bulkWrite(
    items.map((item) => {
      return {
        updateOne: {
          filter: { external_id: item.external_id },
          update: { $setOnInsert: item },
          upsert: true,
        },
      };
    })
  );
The problem is that the BulkWriteResult returns the _id only for the inserted items (in upsertedIds); for the existing items it returns only the nMatched count.
The other solution I have thought about is to (1) run a find over an array of ids, (2) check the results for the ones that already exist, and (3) insertMany the new ones:
let ids = [123, 456, 789];
let items = [
  {external_id: 123, name: "John"},
  {external_id: 456, name: "Mike"},
  {external_id: 789, name: "Joseph"}
];
// STEP 1: Find already existing items
db.collection("my_collection")
  .find({ external_id: { $in: ids } })
  .toArray(function (err, existingItems) {
    // If John already exists
    // existingItems = [{_id: ObjectId, external_id: 123, name: "John"}]
    // STEP 2: Check which items have to be created
    let itemsToBeCreated = items.filter((item) =>
      !existingItems.some((ex) => ex.external_id === item.external_id)
    );
    // STEP 3: Insert new items
    db.collection("my_collection")
      .insertMany(itemsToBeCreated, function (err, result) {
        // FINALLY HERE I GET ALL THE IDs OF THE EXISTING AND INSERTED ITEMS
      });
  });
With this solution I'm concerned about performance, because these operations are fired 100K times a day with 10 items each, and about 90% of the time the items are new. That is roughly 900K new items and 100K already existing ones per day.
I would like to know if there is a better way of achieving this.
Thanks in advance
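For reference, step 2 of the find-then-insert approach (deciding which items already exist) can be sketched without a database. This is a minimal illustration, not an answer to the performance question: partitionByExternalId is a hypothetical helper, the string _id is a placeholder for an ObjectId, and the shape of existingItems is assumed to match the comment in the question.

```javascript
// Hypothetical helper: split incoming items into those already stored
// and those that still have to be inserted, using a Map for lookups.
function partitionByExternalId(items, existingItems) {
  const existingById = new Map(existingItems.map((ex) => [ex.external_id, ex]));
  const existing = [];
  const toCreate = [];
  for (const item of items) {
    const found = existingById.get(item.external_id);
    if (found) {
      existing.push(found); // carries the stored _id
    } else {
      toCreate.push(item);  // goes to insertMany
    }
  }
  return { existing, toCreate };
}

const incoming = [
  { external_id: 123, name: 'John' },
  { external_id: 456, name: 'Mike' },
  { external_id: 789, name: 'Joseph' },
];
// Assumed result of the find() in step 1: only John exists so far.
const alreadyStored = [{ _id: 'id-of-john', external_id: 123, name: 'John' }];

const { existing, toCreate } = partitionByExternalId(incoming, alreadyStored);
console.log(existing.map((e) => e._id));         // [ 'id-of-john' ]
console.log(toCreate.map((i) => i.external_id)); // [ 456, 789 ]
```

The Map lookup replaces the nested some() scan, which matters little for 10 items but keeps the step linear if batches grow.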

Mongo DB updateOne query calculation

I have this Schema:
const HeroSchema = new mongoose.Schema({
  name: {
    type: String,
    required: true
  },
  power_start: {
    type: Number,
    required: true,
    default: Math.floor(Math.random() * 10)
  },
  power_current: {
    type: Number,
    required: true,
    default: 0
  }
})
When a user clicks a button, I want to add a random amount of 'power' between 1 and 10 to 'power_current'. That is, the final value of 'power_current' should be the sum of 'power_start' + a random number + 'power_current'.
'power_start' gets a random number as its default.
I just started writing this and got stuck:
updateHeroPower: async ({id}) => {
  const res = await HeroSchema.updateOne({
    _id: id
  }, {
    $set: {
    }
  })
}
Thanks
For Mongo version 4.2+ you can use pipelined updates like so:
const res = await HeroSchema.updateOne({
  _id: id
},
[
  {
    $set: {
      // Math.floor(Math.random() * 10) + 1 yields an integer from 1 to 10
      power_current: {$sum: ["$power_current", "$power_start", Math.floor(Math.random() * 10) + 1]}
    }
  }
]
)
Older MongoDB versions cannot reference document fields from within the update object.
This means you'll have to split it into two calls: first read the document, and only then update it using the values you fetched.
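A rough sketch of the two-call approach follows. It assumes a Mongoose-style model exposing findById() and updateOne(); addRandomPower is a hypothetical helper name, and FakeHero is an in-memory stand-in used here only so the snippet runs without a database.

```javascript
// Hypothetical helper: read the document, compute the new value, write it back.
// `Hero` is assumed to behave like a Mongoose model.
async function addRandomPower(Hero, id) {
  // Call 1: read the current values.
  const hero = await Hero.findById(id);
  if (!hero) return null;
  const roll = Math.floor(Math.random() * 10) + 1; // integer from 1 to 10
  const next = hero.power_current + hero.power_start + roll;
  // Call 2: write the computed value back.
  await Hero.updateOne({ _id: id }, { $set: { power_current: next } });
  return next;
}

// In-memory stand-in for the model, for illustration only.
const store = { h1: { _id: 'h1', power_start: 3, power_current: 0 } };
const FakeHero = {
  findById: async (id) => store[id] || null,
  updateOne: async (filter, update) => {
    Object.assign(store[filter._id], update.$set);
  },
};

addRandomPower(FakeHero, 'h1').then((value) => {
  console.log(value === store.h1.power_current); // true
});
```

Unlike the single pipelined update, the read and the write here are separate operations, so two concurrent clicks could both read the same starting value.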
One thing to note: I'm not sure you actually want the logic you described, as summing power_current and power_start on every update will inflate the score. I think what you want is to add the random number to power_start if power_current does not exist yet, and otherwise add it to power_current, which already includes power_start after the first update.
If that assumption is correct, you can achieve this with $ifNull:
const res = await HeroSchema.updateOne({
  _id: id
},
[
  {
    $set: {
      // falls back to power_start only while power_current is null or missing
      power_current: {$sum: [{$ifNull: ["$power_current", "$power_start"]}, Math.floor(Math.random() * 10) + 1]}
    }
  }
]
)
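To see why the first rule inflates the score, the two rules can be compared in plain JavaScript. This is only an arithmetic sketch: sumAll and sumWithIfNull are made-up names, the roll is fixed at 4 instead of random so the numbers are reproducible, and both heroes start without a power_current field.

```javascript
// Rule 1: power_current + power_start + roll on every update.
const sumAll = (hero, roll) => (hero.power_current ?? 0) + hero.power_start + roll;

// Rule 2: the $ifNull variant, power_start is used only while power_current is missing.
const sumWithIfNull = (hero, roll) => (hero.power_current ?? hero.power_start) + roll;

let heroA = { power_start: 5 };
let heroB = { power_start: 5 };

for (const roll of [4, 4, 4]) {
  heroA = { ...heroA, power_current: sumAll(heroA, roll) };
  heroB = { ...heroB, power_current: sumWithIfNull(heroB, roll) };
}

console.log(heroA.power_current); // 27  (power_start counted three times)
console.log(heroB.power_current); // 17  (power_start counted once)
```

Note that the fallback only applies while power_current is missing or null; a document created with the schema default of 0 already has the field, so the fallback would never be used for it.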

Use GraphQL Query to get results form MongoDB after aggregation with mongoose

I have the following problem.
I have a MongoDB collection and a corresponding mongoose model, which looks like this:
export const ListItemSchema = new Schema<ListItemSchema>({
  title: { type: String, required: true },
  parentId: { type: Schema.Types.ObjectId, required: false },
});
export const TestSchema = new Schema<Test>(
  {
    title: { type: String, required: true },
    list: { type: [ListItemSchema], required: false },
  }
);
As you can see, my TestSchema holds an array of ListItems; TestSchema is also my collection in MongoDB.
Now I want to query only the ListItems of a Test with a specific ID.
That was not much of a problem, at least on the MongoDB side.
I use the MongoDB aggregation framework for this and call my aggregation inside a custom resolver.
Here is the code to get an array of only the list items from a specific TestModel:
const test = TestModel.aggregate([
  {$match: {_id: id}},
  {$unwind: "$list"},
  {
    $match: {
      "list.parentId": {$eq: null},
    },
  },
  {$replaceRoot: {newRoot: "$list"}},
]);
This is the result
[
  { _id: randomId, title: 't', parentId: null },
  { _id: randomId, title: 'x', parentId: null }
]
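What each pipeline stage contributes can be mimicked on an in-memory array. The following is a plain JavaScript sketch, not the aggregation framework: the sample documents, their string ids, and the helper name listRootItems are invented for illustration.

```javascript
// Sample data shaped like the Test documents (ids are made-up strings).
const tests = [
  {
    _id: 't1',
    title: 'first test',
    list: [
      { _id: 'a', title: 't', parentId: null },
      { _id: 'b', title: 'child', parentId: 'a' },
      { _id: 'c', title: 'x', parentId: null },
    ],
  },
  { _id: 't2', title: 'second test', list: [{ _id: 'd', title: 'y', parentId: null }] },
];

function listRootItems(docs, id) {
  return docs
    // $match: keep only the document with the requested _id
    .filter((doc) => doc._id === id)
    // $unwind: one output document per element of "list"
    .flatMap((doc) => (doc.list || []).map((item) => ({ ...doc, list: item })))
    // $match: keep elements whose parentId is null (or missing)
    .filter((doc) => doc.list.parentId == null)
    // $replaceRoot: promote the list element to be the whole document
    .map((doc) => doc.list);
}

console.log(listRootItems(tests, 't1').map((item) => item.title)); // [ 't', 'x' ]
```

After the last stage the output no longer has the shape of a Test document, only of its list items.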
The Query to trigger the resolver looks like this and is placed inside my Test Type Composer.
query getList {
  test(testId:"2f334575196fe042ea83afbf", parentId: null) {
    title
  }
}
So far so good, but of course my query will fail or produce a poor result, because GraphQL expects data shaped like the Test model but receives an array of list items instead.
So after a lot of typing, here is the question:
How do I have to change my query to receive the list array?
Do I have to adjust the query, or is it something with mongoose?
I'm really stuck at this point, so any help would be awesome!
Thanks in advance :)
I'm not sure if I understood your issue correctly.
In your GraphQL schema, try leaving the exclamation mark (!) off the field in the Query type.
Something like:
type Query {
  test: TestModel
}
instead of
type Query {
  test: TestModel!
}
Then you'll get the error message in the console but still be able to receive data in any form.

Pulling/deleting an item from a nested array

Note: it's a Meteor project.
My schema looks like this:
{
  _id: 'someid',
  nlu: {
    data: {
      synonyms: [
        {_id:'abc', value:'car', synonyms:['automobile']}
      ]
    }
  }
}
The schema is defined with simple-schema. Relevant parts:
'nlu.data.synonyms.$': Object,
'nlu.data.synonyms.$._id': {type: String, autoValue: ()=> uuidv4()},
'nlu.data.synonyms.$.value': {type:String, regEx:/.*\S.*/},
'nlu.data.synonyms.$.synonyms': {type: Array, minCount:1},
'nlu.data.synonyms.$.synonyms.$': {type:String, regEx:/.*\S.*/},
I am trying to remove {_id:'abc'}:
Projects.update({_id: 'someid'},
{$pull: {'nlu.data.synonyms' : {_id: 'abc'}}});
The query returns 1 (one doc was updated) but the item was not removed from the array. Any idea?
This is my insert query
db.test.insert({
  "_id": "someid",
  "nlu": {
    "data": {
      "synonyms": [
        {
          "_id": "abc"
        },
        {
          "_id": "def"
        },
        10,
        [ 5, { "_id": 5 } ]
      ]
    }
  }
})
And here is my update
db.test.update(
  {
    "_id": "someid",
    "nlu.data.synonyms._id": "abc"
  },
  {
    "$pull": {
      "nlu.data.synonyms": {
        "_id": "abc"
      }
    }
  }
)
The problem comes down to the autoValue parameter on your _id property.
This is a very powerful feature for generating values automatically in your schema. However, it prevented the pull because it always returned a value, indicating that this field should be set.
To make it aware of the pull, you can have it check whether an operator is present (as is the case for Mongo updates).
Your autoValue would then look like:
'nlu.data.synonyms.$._id': {type: String, autoValue: function () {
  if (this.operator) {
    this.unset();
    return;
  }
  return uuidv4();
}},
Edit: Note that the function here is not an arrow function; otherwise it loses the context that SimpleSchema binds to it.
It only returns a new uuid4 when no operator is present (as in insert operations). You can extend this further to your needs with the functionality SimpleSchema provides (see the documentation).
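The behaviour of such an autoValue function can be checked in isolation by calling it with a hand-made context. This is a sketch under assumptions: fakeContext is a hypothetical stand-in that only mimics the operator and unset() members used above, and Node's crypto.randomUUID is swapped in for uuidv4 so the snippet has no dependencies.

```javascript
const { randomUUID } = require('crypto');

// Same logic as the autoValue above; randomUUID stands in for uuidv4.
function idAutoValue() {
  if (this.operator) {
    this.unset();     // tell the schema not to set this field
    return undefined;
  }
  return randomUUID();
}

// Hypothetical stand-in for the context that gets bound as `this`.
function fakeContext(operator) {
  return {
    operator,
    unsetCalled: false,
    unset() { this.unsetCalled = true; },
  };
}

// Insert: no operator, so a fresh id is generated.
const insertCtx = fakeContext(null);
console.log(typeof idAutoValue.call(insertCtx)); // string

// Update with $pull: an operator is present, so no value is produced.
const pullCtx = fakeContext('$pull');
console.log(idAutoValue.call(pullCtx), pullCtx.unsetCalled); // undefined true
```

Because the function reads this.operator, it has to be a regular function; an arrow function would not receive the bound context.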
I just condensed my code into a reproducible example:
import uuidv4 from 'uuid/v4';
const Projects = new Mongo.Collection('PROJECTS')
const ProjectSchema = {
  nlu: Object,
  'nlu.data': Object,
  'nlu.data.synonyms': {
    type: Array,
  },
  'nlu.data.synonyms.$': {
    type: Object,
  },
  'nlu.data.synonyms.$._id': {type: String, autoValue: function () {
    if (this.operator) {
      this.unset();
      return;
    }
    return uuidv4();
  }},
  'nlu.data.synonyms.$.value': {type:String, regEx:/.*\S.*/},
  'nlu.data.synonyms.$.synonyms': {type: Array, minCount:1},
  'nlu.data.synonyms.$.synonyms.$': {type:String, regEx:/.*\S.*/},
};
Projects.attachSchema(ProjectSchema);
Meteor.startup(() => {
  const insertId = Projects.insert({
    nlu: {
      data: {
        synonyms: [
          {value:'car', synonyms:['automobile']},
        ]
      }
    }
  });
  Projects.update({_id: insertId}, {$pull: {'nlu.data.synonyms' : {value: 'car'}}});
  const afterUpdate = Projects.findOne(insertId);
  console.log(afterUpdate, afterUpdate.nlu.data.synonyms.length); // 0
});
Optional Alternative: Normalizing Collections
There is one additional note on optimization, however.
You can work around this auto-id generation issue by normalizing synonyms into their own collection, where the Mongo insert provides an id for you. I am not sure how unique this id is compared to uuidv4, but I have never faced id issues with it.
A setup could look like this:
const Synonyms = new Mongo.Collection('SYNONYMS');
const SynonymsSchema = {
  value: {type:String, regEx:/.*\S.*/},
  synonyms: {type: Array, minCount:1},
  'synonyms.$': {type:String, regEx:/.*\S.*/},
};
Synonyms.attachSchema(SynonymsSchema);
const Projects = new Mongo.Collection('PROJECTS')
const ProjectSchema = {
  nlu: Object,
  'nlu.data': Object,
  'nlu.data.synonyms': {
    type: Array,
  },
  'nlu.data.synonyms.$': {
    type: String,
  },
};
Projects.attachSchema(ProjectSchema);
Meteor.startup(() => {
  // just add this entry once
  if (Synonyms.find().count() === 0) {
    Synonyms.insert({
      value: 'car',
      synonyms: ['automobile']
    })
  }
  // get the id
  const carId = Synonyms.findOne()._id;
  const insertId = Projects.insert({
    nlu: {
      data: {
        synonyms: [carId] // push the _id as reference
      }
    }
  });
  // ['MG464i9PgyniuGHpn'] => reference to Synonyms document
  console.log(Projects.findOne(insertId).nlu.data.synonyms);
  Projects.update({_id: insertId}, {$pull: {'nlu.data.synonyms' : carId }}); // pull the reference
  const afterUpdate = Projects.findOne(insertId);
  console.log(afterUpdate, afterUpdate.nlu.data.synonyms.length);
});
I know this was not part of the question but I just wanted to point out that there are many benefits of normalizing complex document structures into separate collections:
no duplicate data
decouple data that is not intended to be bound (here: Synonyms could be also used independently from Projects)
update referred documents once, and all Projects will point to the latest version (since it's a reference)
finer publication/subscription handling => more control about what data flows over the wire
reduces complex auto and default value generation
changes in the referred collection's schema may have only few consequences for UI and functions that make use of the referrer's schema.
Of course, this also has disadvantages:
more collections to handle
more code to write (more code = more potential errors)
more tests to write (much more time to invest)
sometimes you need to denormalize back for this one case out of 100
you have to invest a lot of time in data schema design before starting to code

MongoDB compound index order of the fields

My collection has the following fields:
1) user
2) age
3) role
I have created a compound index ({ age: 1, user: 1 }). When I find documents with the criteria { age: { $gt: 21, $lt: 50 }, user: 'user124' }, the index is used properly (I checked with explain()), but when I change the order to { user: '124', age: { $gt: 21, $lt: 50 } }, the results and index usage are identical. So when I have a compound index on two fields, does the order in the criteria not matter?
This is correct, the order does not matter.
In fact, only arrays in the query are ordered; dictionaries (objects) are not.
http://json.org/
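A small sketch of why the key order in the criteria is irrelevant: every top-level condition has to hold, so evaluating them in a different order selects the same documents. The matches function below is a hypothetical, heavily simplified matcher (equality plus $gt and $lt only), and the sample documents are invented.

```javascript
// Hypothetical, simplified matcher: supports equality plus $gt / $lt only.
function matches(doc, criteria) {
  return Object.entries(criteria).every(([field, cond]) => {
    if (cond !== null && typeof cond === 'object') {
      if ('$gt' in cond && !(doc[field] > cond.$gt)) return false;
      if ('$lt' in cond && !(doc[field] < cond.$lt)) return false;
      return true;
    }
    return doc[field] === cond;
  });
}

const users = [
  { user: 'user124', age: 30, role: 'admin' },
  { user: 'user124', age: 60, role: 'guest' },
  { user: 'user7', age: 25, role: 'guest' },
];

// Same conditions, written in a different key order.
const ageFirst = { age: { $gt: 21, $lt: 50 }, user: 'user124' };
const userFirst = { user: 'user124', age: { $gt: 21, $lt: 50 } };

const byAgeFirst = users.filter((doc) => matches(doc, ageFirst));
const byUserFirst = users.filter((doc) => matches(doc, userFirst));

console.log(byAgeFirst.length, byUserFirst.length); // 1 1
```

This concerns only the order of keys inside the query document. The order of fields in the index definition itself ({ age: 1, user: 1 }) is a separate matter and does affect which queries the index can serve.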