MongoDB design/refactoring

I have a project that has (ta-daaa) scope-crept on me.
What started as a simple app to track calibrated tools (each tool has a yearly rotation cycle to check calibration) has turned into inventory tracking too.
So my current model has some required fields and an embedded doc of calibrations:
{
_id: ObjectId("51b0d94c3f72fb89c9000014"),
barcode: "H-131887",
calibrations: [
{
_id: ObjectId("51b0d94c3f72fb89c9000015"),
cal_date: ISODate("2013-07-03T16:04:57.893Z"),
cal_date_due: ISODate("2013-07-03T16:04:57.894Z"),
ats_in: ISODate("2013-06-01T16:04:57.895Z"),
ats_out: ISODate("2013-06-06T16:04:57.897Z")
},
{
_id: ObjectId("51b0e6053f72fbb27900001b"),
cal_date: ISODate("2013-06-13T00:00:00Z"),
cal_date_due: ISODate("2014-06-13T00:00:00Z"),
ats_in: ISODate("2013-06-06T00:00:00Z"),
ats_out: ISODate("2013-06-17T00:00:00Z"),
updated_at: ISODate("2013-07-09T14:44:31.113Z"),
created_at: ISODate("2013-06-06T19:41:57.770Z")
}
],
created_at: ISODate("2013-06-06T18:47:40.481Z"),
creator_id: ObjectId("5170547c791e4b1a16000001"),
description: "",
group: "engine",
location: "Cabinet 1",
maker: "MITUTOYO",
model: "2046S",
serial: "QEL228",
status: "In",
tool: "Dial Indicator",
updated_at: ISODate("2013-07-09T14:44:31.103Z")
}
What would be the best way to allow non-calibrated tools in this schema where Barcode/Serial are not required for those tools? Also, they won't have calibration dates, so my current table that lists the tool and most recent calibration date won't be happy returning nil calibrations...

It is unlikely that you will need to refactor your database schema.
MongoDB is designed to work with heterogeneous data. That means not all documents in the same collection need to have the same fields. It is no problem at all for MongoDB when some documents have fields, and even sub-documents, holding calibration information and others do not.
When a find query is not supposed to return documents without calibration information, you can add the condition calibrations: { $exists: true } so that only documents where the calibrations field exists are returned. But even a query like find({"calibrations.cal_date_due": {$lt: ISODate()}}) will not choke on documents that have no calibrations field and thus no calibrations.cal_date_due either. It will just skip those documents silently.
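For the listing table mentioned in the question, the missing array can also be handled on the application side. The following is a minimal sketch in plain JavaScript, assuming tool documents shaped like the one in the question. The helper name latestCalibration and the sample tools are hypothetical, and the "calibrations.0" filter is one possible way to also exclude tools whose calibrations array exists but is empty.

```javascript
// Hypothetical helper: returns the most recent calibration of a tool,
// or null when the tool has no calibrations (a non-calibrated tool).
function latestCalibration(tool) {
  const calibrations = tool.calibrations || [];
  if (calibrations.length === 0) return null;
  // Keep the entry with the greatest cal_date.
  return calibrations.reduce((latest, c) =>
    c.cal_date > latest.cal_date ? c : latest
  );
}

// Query filter matching only tools with at least one calibration:
// "calibrations.0" exists only when the array has a first element.
const calibratedOnly = { "calibrations.0": { $exists: true } };

// Hypothetical sample data: one calibrated tool, one plain inventory item.
const tools = [
  {
    tool: "Dial Indicator",
    calibrations: [
      { cal_date: new Date("2013-07-03") },
      { cal_date: new Date("2013-06-13") }
    ]
  },
  { tool: "Hammer" } // no barcode, serial, or calibrations
];

for (const t of tools) {
  const cal = latestCalibration(t);
  console.log(t.tool, cal ? cal.cal_date.toISOString().slice(0, 10) : "n/a");
}
// Dial Indicator 2013-07-03
// Hammer n/a
```

With this approach the table can render "n/a" for non-calibrated tools instead of failing on a nil value.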

Related

Mongoose findOne not working as expected on nested records

I've got a collection in MongoDB whose simplified version looks like this:
Dealers = [{
Id: 123,
Name: 'Someone',
Email: 'someone@somewhere.com',
Vehicles: [
{
Id: 1234,
Make: 'Honda',
Model: 'Civic'
},
{
Id: 2345,
Make: 'Ford',
Model: 'Focus'
},
{
Id: 3456,
Make: 'Ford',
Model: 'KA'
}
]
}]
And my Mongoose Model looks a bit like this:
const vehicle_model = mongoose.Schema({
Id: {
Type: Number
},
Email: {
Type: String
},
Vehicles: [{
Id: {
Type: Number
},
Make: {
Type: String
},
Model: {
Type: String
}
}]
})
Note the Ids are not MongoDB Ids, just distinct numbers.
I try doing something like this:
const response = await vehicle_model.findOne({ 'Id': 123, 'Vehicles.Id': 1234 })
But when I do:
console.log(response.Vehicles.length)
It returns all of the nested Vehicles records instead of the one I'm after.
What am I doing wrong?
Thanks.
This question is asked very frequently. Indeed someone asked a related question here just 18 minutes before this one.
When you query the database, you are requesting that it identify and return matching documents to the client. That is an entirely separate action from asking it to transform the shape of those documents before they are sent back to the client.
In MongoDB, the latter operation (transforming the shape of the document) is usually referred to as "Projection". Simple projections, specifically just returning a subset of the fields, can be done directly in find() (and similar) operations. Most drivers and the shell use the second argument to the method as the projection specification, see here in the documentation.
Your particular case is a little more complicated because you are looking to trim off some of the values in the array. There is a dedicated page in the documentation titled Project Fields to Return from Query which goes into more detail about different situations. Near the bottom is a section titled Project Specific Array Elements in the Returned Array which describes your situation more directly. That section describes the usage of the positional $ operator. You can use that as a starting place as follows:
db.collection.find({
"Id": 123,
"Vehicles.Id": 1234
},
{
"Vehicles.$": 1
})
Playground demonstration here.
If you need something more complex, then you would have to start exploring usage of the $elemMatch (projection) operator (not the query variant) or, as @nimrod serok mentions in the comments, using the $filter aggregation operator in an aggregation pipeline. The last option here is certainly the most expressive and flexible, but also the most verbose.
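As a rough illustration of the $filter route, here is a sketch of a pipeline that first selects the dealer and then trims the Vehicles array. It assumes the field names from the question. The function name buildVehiclePipeline is made up for this example.

```javascript
// Hypothetical helper: builds an aggregation pipeline that returns the
// dealer with only the matching vehicle(s) left in the Vehicles array.
function buildVehiclePipeline(dealerId, vehicleId) {
  return [
    // Stage 1: select the dealer document (same filter as the findOne call).
    { $match: { Id: dealerId, "Vehicles.Id": vehicleId } },
    // Stage 2: overwrite Vehicles with the elements that satisfy the condition.
    {
      $addFields: {
        Vehicles: {
          $filter: {
            input: "$Vehicles",
            as: "vehicle",
            cond: { $eq: ["$$vehicle.Id", vehicleId] }
          }
        }
      }
    }
  ];
}

const pipeline = buildVehiclePipeline(123, 1234);
console.log(JSON.stringify(pipeline, null, 2));
```

With Mongoose, the pipeline would be passed to the model's aggregate method, for example await vehicle_model.aggregate(pipeline). Unlike the positional $ projection, $filter keeps every array element that matches the condition, not just the first one.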

Sorting nested objects in MongoDB

So I have documents that follow this schema:
{
_id: String,
globalXP: {
xp: {
type: Number,
default: 0
},
level: {
type: Number,
default: 0
}
},
guilds: [{ _id: String, xp: Number, level: Number }]
}
So basically users have their own global XP and xp based on each guild they are in.
Now I want to make a leaderboard for all the users that have a certain guildID in their document.
What's the most efficient way to fetch all the user documents that have the guild _id in their guilds array and how do I sort them afterwards?
I know it might be messy as hell, but bear with me here.
If I've understood correctly, you only need this line of code:
var find = await model.find({"guilds._id":"your_guild_id"}).sort({"globalXP.level":-1})
This query will return all documents where the guilds array contains the specific _id, sorted by the player's global level.
This way the highest level is displayed first.
Here is an example of how the query works. Please check whether it works as you expected.
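If the leaderboard should rank users by the XP they earned in that particular guild rather than by their global level, one option is an aggregation pipeline. The following is only a sketch under that assumption. The function buildGuildLeaderboard and its limit parameter are hypothetical names.

```javascript
// Hypothetical helper: builds a pipeline that ranks the members of one
// guild by their XP in that guild.
function buildGuildLeaderboard(guildId, limit = 10) {
  return [
    // Keep only users who belong to the guild
    // (an index on "guilds._id" would normally help this stage).
    { $match: { "guilds._id": guildId } },
    // Produce one document per (user, guild) pair.
    { $unwind: "$guilds" },
    // Drop the pairs that belong to other guilds.
    { $match: { "guilds._id": guildId } },
    // Highest guild XP first.
    { $sort: { "guilds.xp": -1 } },
    { $limit: limit }
  ];
}

console.log(JSON.stringify(buildGuildLeaderboard("your_guild_id", 5)));
```

With Mongoose this could be run as await model.aggregate(buildGuildLeaderboard("your_guild_id")). After $unwind, each result carries a single guilds object, so the per-guild values are available as guilds.xp and guilds.level.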

MongoDB multiple schemas in one collection

I am new to mongo and have done a lot of reading and a proof of concept. There are many discussions about multiple collections or embedded documents. Isn't there another choice? Ignoring my relationalDB mind... Couldn't you put two different schemas in the same collection?
Crude example:
{
_id: 'f48a2dea-e6ec-490d-862a-bd1791e76d9e',
_owner: '7a147aad-e3fd-4e55-9fd5-e2cb48d31a83',
manufacturer: 'Porsche',
model: '911',
img: '<byte array>'
},{
_id: '821ca9b7-faa1-4516-a27e-aec79fcb89a9',
_owner: '46ade116-cd59-4d0c-a4d3-cd2e517a256c',
manufacturer: 'Nissan',
model: 'GT-R',
img: '<byte array>'
},{
_id: '87999e27-c98b-4cad-b444-75626f161840',
_owner: 'fba765c8-32dd-49ba-91d3-d361b40bf4a7',
manufacturer: 'BMW',
model: 'M3',
wiki:'http://en.wikipedia.org/wiki/Bmw_m3',
img: '<byte array>'
}
and a totally different schema in the same collection as well
{
_id: '7a147aad-e3fd-4e55-9fd5-e2cb48d31a83',
name: 'Keeley Bosco',
email: 'katlyn@jenkinsmaggio.net',
city: 'Lake Gladysberg',
mac: '08:fd:0b:cd:77:f7',
timestamp: '2015-04-25 13:57:36 +0700',
},{
_id: '46ade116-cd59-4d0c-a4d3-cd2e517a256c',
name: 'Rubye Jerde',
email: 'juvenal@johnston.name',
city: null,
mac: '90:4d:fa:42:63:a2',
timestamp: '2015-04-25 09:02:04 +0700',
},{
_id: 'fba765c8-32dd-49ba-91d3-d361b40bf4a7',
name: 'Miss Darian Breitenberg',
email: null,
city: null,
mac: 'f9:0e:d3:40:cb:e9',
timestamp: '2015-04-25 13:16:03 +0700',
}
(The reason I don't use an embedded document (in my real POC) is that a person may have 80000 "cars" and go over the 16MB limit).
Besides the aching desire to compartmentalize data, is there a downside here?
The reasoning for doing this may be so that we can correlate the records... I do see that 3.2 has join. The project is too new to know all of the business cases.
MongoDB does support different schemas within the same collection. However, as a good practice, it is better to stick to one schema, or to similar schemas, throughout the collection so that your application logic stays simpler.
In your case, yes, it is good that you didn't use an embedded document, considering the size of the sub-documents. I would suggest going for a normalized data model, which is a reasonable choice in this kind of situation.
Further you can refer here: https://docs.mongodb.com/master/core/data-model-design/
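To make the normalized option concrete: with cars and people kept in separate collections, the "join" the question mentions ($lookup, available since 3.2) can correlate them on demand. This is a sketch under assumptions, not a definitive design. The collection names cars and people and the function buildCarsWithOwner are hypothetical.

```javascript
// Hypothetical helper: pipeline to run on a "cars" collection that attaches
// the owning person to each car of a given manufacturer.
function buildCarsWithOwner(manufacturer) {
  return [
    { $match: { manufacturer: manufacturer } },
    {
      $lookup: {
        from: "people",          // assumed name of the owners collection
        localField: "_owner",    // field on the car document
        foreignField: "_id",     // field on the person document
        as: "owner"
      }
    },
    // $lookup yields an array; flatten it to a single owner document.
    // Note that cars without a matching owner are dropped by this stage.
    { $unwind: "$owner" }
  ];
}

console.log(JSON.stringify(buildCarsWithOwner("Porsche"), null, 2));
```

Looking up from the car to its owner keeps each result small, whereas attaching 80000 cars to one person document would run into the same 16MB document limit the question is trying to avoid.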

MongoDB schema design: reference by ID vs. reference by name?

With this simple example
(using shortened ObjectIds to make it easier to read)
Tag documents:
{
_id: ObjectId('0001'),
name: 'JavaScript',
// other data
},
{
_id: ObjectId('0002'),
name: 'MongoDB',
// other data
},
...
Assume that we need a separate tag collection, e.g. because we need to store some information on each tag.
If reference by ID:
// a book document
{
_id: ObjectId('9876'),
title: 'MEAN Web Development',
tags: [ObjectId('0001'), ObjectId('0002'), ...]
}
If reference by name:
{
_id: ObjectId('9876'),
title: 'MEAN Web Development',
tags: ['JavaScript', 'MongoDB', ...]
}
It's known that "reference by ID" is feasible.
My thinking is that with "reference by name", a query for a book's info only needs to look in the book collection: we know the tag names without a join ($lookup) operation, which should be faster.
If the app validates the tags before a book is created or modified, this should also be feasible, and faster.
I'm still not quite sure:
Is there any hidden downside to "reference by name"?
Will "reference by name" be slower for "finding all books with a given tag"? Maybe ObjectId is somehow special?
Thanks.
I would say it depends on what your use case is for tags. As you say, it will be more expensive to do a $lookup to retrieve tag names if you reference by ID. On the other hand, if you expect that tag names may change frequently, all documents in the book collection containing that tag will need to be updated on every change.
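The rename cost can be illustrated with a small in-memory sketch. The books array below is hypothetical sample data standing in for the book collection, and renameTag is a made-up helper. On a real collection, roughly the same effect would come from something like db.books.updateMany({tags: "JavaScript"}, {$set: {"tags.$": "JS"}}), where books is an assumed collection name.

```javascript
// Hypothetical stand-in for a book collection that references tags by name.
const books = [
  { title: "MEAN Web Development", tags: ["JavaScript", "MongoDB"] },
  { title: "Book B", tags: ["JavaScript"] },
  { title: "Book C", tags: ["MongoDB"] }
];

// Renames a tag in every book that carries it and returns how many
// book documents had to be modified.
function renameTag(collection, oldName, newName) {
  let modified = 0;
  for (const book of collection) {
    const i = book.tags.indexOf(oldName);
    if (i !== -1) {
      book.tags[i] = newName;
      modified += 1;
    }
  }
  return modified;
}

console.log(renameTag(books, "JavaScript", "JS")); // 2
```

With reference by ID, the same rename would touch only the single tag document, at the price of a $lookup (or a second query) whenever tag names are needed.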
The ObjectID is simply a 12-byte value, which is autogenerated by the driver if no _id is present in an inserted document. See the MongoDB docs for more info. The only "special behavior" is that _id has an index by default. An index will speed up lookups in general, but indexes can be created on any field, not just _id.
In fact, the _id does not need to be an ObjectID. It is perfectly legal to have documents with integer _id values for instance:
{
_id: 1,
name: 'Javascript'
},
{
_id: 2,
name: 'MongoDB'
},

Meteor Collection: find element in array

I have no experience with NoSQL, so if I only ask about the code, my question may be off the mark. Instead, let me explain my problem.
Suppose I have an e-store. I have catalogs
Catalogs = new Mongo.Collection('catalogs');
and products in those catalogs
Products = new Mongo.Collection('products');
Then people add their orders to a temporary collection
Order = new Mongo.Collection();
Then, people submit their comments, phone, etc and order. I save it to collection Operations:
Operations.insert({
phone: "phone",
comment: "comment",
etc: "etc",
savedOrder: Order //<- Array, right? Or Object will be better?
});
That works, but now I want stats for every product: in which operations has a given product been used? How can I search through my Operations and find every operation with that product?
Or is this approach bad? How do real pros do this in the real world?
If I understand correctly, here is a sample of the documents stored in your Operations collection:
{
clientRef: "john-001",
phone: "12345678",
other: "etc.",
savedOrder: {
"someMetadataAboutOrder": "...",
"lines" : [
{ qty: 1, itemRef: "XYZ001", unitPriceInCts: 1050, desc: "USB Pen Drive 8G" },
{ qty: 1, itemRef: "ABC002", unitPriceInCts: 19995, desc: "Entry level motherboard" },
]
}
},
{
clientRef: "paul-002",
phone: null,
other: "etc.",
savedOrder: {
"someMetadataAboutOrder": "...",
"lines" : [
{ qty: 3, itemRef: "XYZ001", unitPriceInCts: 950, desc: "USB Pen Drive 8G" },
]
}
},
Given that, to find all operations having item reference XYZ001 you simply have to query:
> db.operations.find({"savedOrder.lines.itemRef":"XYZ001"})
This will return the whole document. If instead you are only interested in the client reference (and operation _id), you will use a projection as an extra argument to find:
> db.operations.find({"savedOrder.lines.itemRef":"XYZ001"}, {"clientRef": 1})
{ "_id" : ObjectId("556f07b5d5f2fb3f94b8c179"), "clientRef" : "john-001" }
{ "_id" : ObjectId("556f07b5d5f2fb3f94b8c17a"), "clientRef" : "paul-002" }
If you need to perform multi-documents (incl. multi-embedded documents) operations, you should take a look at the aggregation framework:
For example, to calculate the total of an order:
> db.operations.aggregate([
{$match: { "_id" : ObjectId("556f07b5d5f2fb3f94b8c179") }},
{$unwind: "$savedOrder.lines" },
{$group: { _id: "$_id",
total: {$sum: {$multiply: ["$savedOrder.lines.qty",
"$savedOrder.lines.unitPriceInCts"]}}
}}
])
{ "_id" : ObjectId("556f07b5d5f2fb3f94b8c179"), "total" : 21045 }
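For the per-product stats asked about in the question, the same $unwind and $group idea can be keyed on the item reference instead of the operation. Below is a sketch assuming the sample documents above (trimmed to the relevant fields). The names perProductPipeline, statsByProduct, totalQty and lineCount are hypothetical. The plain JavaScript function mirrors what the pipeline is expected to compute so the result can be checked without a database.

```javascript
// Sample documents from above, reduced to the fields used here.
const operations = [
  { clientRef: "john-001", savedOrder: { lines: [
    { qty: 1, itemRef: "XYZ001", unitPriceInCts: 1050 },
    { qty: 1, itemRef: "ABC002", unitPriceInCts: 19995 }
  ] } },
  { clientRef: "paul-002", savedOrder: { lines: [
    { qty: 3, itemRef: "XYZ001", unitPriceInCts: 950 }
  ] } }
];

// Pipeline that could be passed to db.operations.aggregate(...).
const perProductPipeline = [
  { $unwind: "$savedOrder.lines" },
  { $group: {
    _id: "$savedOrder.lines.itemRef",
    totalQty: { $sum: "$savedOrder.lines.qty" },
    lineCount: { $sum: 1 }
  } }
];

// Plain JavaScript equivalent of the pipeline, for illustration only.
function statsByProduct(ops) {
  const stats = {};
  for (const op of ops) {
    for (const line of op.savedOrder.lines) {
      if (!stats[line.itemRef]) stats[line.itemRef] = { totalQty: 0, lineCount: 0 };
      stats[line.itemRef].totalQty += line.qty;
      stats[line.itemRef].lineCount += 1;
    }
  }
  return stats;
}

console.log(statsByProduct(operations));
// { XYZ001: { totalQty: 4, lineCount: 2 }, ABC002: { totalQty: 1, lineCount: 1 } }
```

Here lineCount counts order lines, which equals the number of operations only as long as an item appears at most once per order.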
I'm an eternal newbie, but since no answer has been posted, I'll give it a try.
First, start by installing Robomongo or similar software; it will let you look at your collections directly in MongoDB (by the way, Meteor's development MongoDB listens on port 3001 by default).
The way I deal with your kind of problem is by using the _id field. It is a field automatically generated by MongoDB, and you can safely use it as an ID for any item in your collections.
Your catalog collection should have a string-array field called product holding the _id of each item in your products collection. The same goes for operations: if an order is an array of product _ids, you can store that array in your savedOrder field. Feel free to add more fields to savedOrder if necessary, e.g. by making it an array of product objects with additional fields such as discount.
As for the query code, I assume you will find all you need on the web as soon as you figure out what your structure is.
For example, if you have a products array in your savedOrder field, you can pull it out like this:
Operations.find({_id: "your operation ID"}, {fields: {"savedOrder.products": 1}})
Basically, you ask for all the product _ids in a specific operation. If you have several savedOrders in a single operation, you can also specify the savedOrder _id, if you kept the one you had in your local collection.
Operations.find({_id: "your_operation_ID", "savedOrder._id": "your_savedOrder_ID"}, {fields: {"savedOrder.products": 1}})
PS: to the bad-ass coders here, if I'm doing it wrong, please tell me.
I found an answer :) Of course, this is no revelation for real professionals, but it is a big step for me. Maybe someone will find my experience useful. All the magic is in using the correct Mongo operators. Let's solve this problem in pseudocode.
We have a structure like this:
One document in Operations:
{
_id: <unique ID that Mongo creates for us>,
phone: "phone1",
comment: "comment1",
savedOrder: [
{
_id: <generated by Mongo as well>,
productId: <the product's ID from 'products', which we should save>,
name: "Banana",
quantity: 100
},
{
_id: <generated by Mongo>,
productId: <the ID of another ordered product>,
name: "apple",
quantity: 50
}
]
}
And if we want to know in which Operations the user took "banana", we should use the MongoDB operator $elemMatch (see the Mongo docs):
db.getCollection('operations').find({savedOrder: {$elemMatch: {productId: "f5mhs8c2pLnNNiC5v"}}});
Put simply, we get the documents whose saved order contains the product with the ID we are looking for. I don't know whether it is the best way, but it works for me :) Thank you!