MongoDB: Is a range query possible using multikeys?

var jd = {
type: "Person",
attributes: {
name: "John Doe",
age: 30
}
};
var pd = {
type: "Person",
attributes: {
name: "Penelope Doe",
age: 26
}
};
var ss = {
type: "Book",
attributes: {
name: "The Sword Of Shannara",
author: "Terry Brooks"
}
};
db.things.save(jd);
db.things.save(pd);
db.things.save(ss);
db.things.ensureIndex({attributes: 1})
db.things.find({"attributes.age": 30}) // => John Doe
db.things.find({"attributes.age": 30}).explain() // => BasicCursor... (don't want a scan)
db.things.find({"attributes.age": {$gte: 18}}) // John Doe, Penelope Doe (via a scan)
The goal is that all attributes be indexed and searchable via range queries and that the index actually be used (as opposed to a collection scan). There's no telling what attributes a document will have. I have read about multikeys but they seem only to work (by index) with exact-match queries.
Multikeys prefer this format for a document:
var pd = {
type: "Person",
attributes: [
{name: "Penelope Doe"},
{age: 26}
]
};
Is there a pattern whereby, with a single index, I can find items by attribute using a range?
EDIT:
In a schemaless DB it makes sense to have potentially a limitless array of types, yet a collection name practically implies some sort of type. But if we go to the extreme, we want to allow for any number of types within a collection (so that we don't have to define a collection for every conceivable custom type a user might imagine). Searching, therefore, by attributes (of any sort) with just a single deep index (that supports ranged queries) makes this sort of thing far more feasible. Seems to me a natural fit for a schemaless DB.
Opened a ticket if you wanna vote it up:
http://jira.mongodb.org/browse/SERVER-2675

Yes, range queries work with multikeys. However, multikeys are for arrays rather than embedded objects.
In the example above, try:
db.things.ensureIndex({"attributes.age": 1})

Range queries are possible using multikeys; however, expressing the query can be tricky.
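One common way to express this is to reshape the attributes into an array of key/value pairs, index the key and value together, and match both on the same element with $elemMatch. The sketch below is plain JavaScript that only mimics that matching logic in memory. The k and v field names and the matchesRange helper are assumptions made for illustration; they do not come from the question.

```javascript
// Documents reshaped so every attribute is a {k, v} pair in an array.
// The field names k and v are illustrative assumptions.
const things = [
  { type: "Person", attributes: [{ k: "name", v: "John Doe" }, { k: "age", v: 30 }] },
  { type: "Person", attributes: [{ k: "name", v: "Penelope Doe" }, { k: "age", v: 26 }] },
  { type: "Book", attributes: [{ k: "name", v: "The Sword Of Shannara" }, { k: "author", v: "Terry Brooks" }] }
];

// In the shell, the matching index and query would look roughly like:
//   db.things.createIndex({ "attributes.k": 1, "attributes.v": 1 })
//   db.things.find({ attributes: { $elemMatch: { k: "age", v: { $gte: 18 } } } })

// Hypothetical helper: true when a single array element has the given key
// and a value >= min, which is the condition $elemMatch expresses.
function matchesRange(doc, key, min) {
  return doc.attributes.some((a) => a.k === key && a.v >= min);
}

const adults = things.filter((d) => matchesRange(d, "age", 18));
const adultNames = adults.map((d) => d.attributes.find((a) => a.k === "name").v);
console.log(adultNames); // [ 'John Doe', 'Penelope Doe' ]
```

The point of $elemMatch here is that the key test and the range test must hold for the same array element; without it, one element could satisfy the key and a different one the range.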

Related

Mongoose findOne not working as expected on nested records

I've got a collection in MongoDB whose simplified version looks like this:
Dealers = [{
Id: 123,
Name: 'Someone',
Email: 'someone@somewhere.com',
Vehicles: [
{
Id: 1234,
Make: 'Honda',
Model: 'Civic'
},
{
Id: 2345,
Make: 'Ford',
Model: 'Focus'
},
{
Id: 3456,
Make: 'Ford',
Model: 'KA'
}
]
}]
And my Mongoose Model looks a bit like this:
const vehicle_model = mongoose.Schema({
Id: {
type: Number
},
Email: {
type: String
},
Vehicles: [{
Id: {
type: Number
},
Make: {
type: String
},
Model: {
type: String
}
}]
})
Note the Ids are not MongoDB Ids, just distinct numbers.
I try doing something like this:
const response = await vehicle_model.findOne({ 'Id': 123, 'Vehicles.Id': 1234 })
But when I do:
console.log(response.Vehicles.length)
It has returned all of the nested Vehicles records instead of the one I'm after.
What am I doing wrong?
Thanks.
This question is asked very frequently. Indeed someone asked a related question here just 18 minutes before this one.
When you query the database, you are requesting that it identify and return matching documents to the client. That is an entirely separate action from asking it to transform the shape of those documents before they are sent back to the client.
In MongoDB, the latter operation (transforming the shape of the document) is usually referred to as "Projection". Simple projections, specifically just returning a subset of the fields, can be done directly in find() (and similar) operations. Most drivers and the shell use the second argument to the method as the projection specification, see here in the documentation.
Your particular case is a little more complicated because you are looking to trim off some of the values in the array. There is a dedicated page in the documentation titled Project Fields to Return from Query which goes into more detail about different situations. Indeed near the bottom is a section titled Project Specific Array Elements in the Returned Array which describes your situation more directly. In it is where they describe usage of the positional $ operator. You can use that as a starting place as follows:
db.collection.find({
"Id": 123,
"Vehicles.Id": 1234
},
{
"Vehicles.$": 1
})
Playground demonstration here.
If you need something more complex, then you would have to start exploring usage of the $elemMatch (projection) operator (not the query variant) or, as @nimrod serok mentions in the comments, using the $filter aggregation operator in an aggregation pipeline. The last option here is certainly the most expressive and flexible, but also the most verbose.
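To make the distinction between matching a document and reshaping it concrete, here is a small in-memory sketch in plain JavaScript. It is not Mongoose or driver code; projectVehicles is a hypothetical helper that behaves roughly like a $filter-style projection, keeping only the array elements that satisfy a condition.

```javascript
// Sample dealer document, copied from the question.
const dealer = {
  Id: 123,
  Name: "Someone",
  Vehicles: [
    { Id: 1234, Make: "Honda", Model: "Civic" },
    { Id: 2345, Make: "Ford", Model: "Focus" },
    { Id: 3456, Make: "Ford", Model: "KA" }
  ]
};

// Hypothetical helper: return a copy of the document whose Vehicles array
// keeps only the elements that satisfy the predicate.
function projectVehicles(doc, predicate) {
  return { ...doc, Vehicles: doc.Vehicles.filter(predicate) };
}

const trimmed = projectVehicles(dealer, (v) => v.Id === 1234);
console.log(trimmed.Vehicles.length); // 1
console.log(dealer.Vehicles.length);  // 3, the matched document itself is unchanged
```

Note that the positional $ projection returns only the first matching element, whereas a filter like this one keeps every element that matches.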

MongoDB schema design: reference by ID vs. reference by name?

Take this simple example
(using short ObjectIds to make it easier to read)
Tag documents:
{
_id: ObjectId('0001'),
name: 'JavaScript',
// other data
},
{
_id: ObjectId('0002'),
name: 'MongoDB',
// other data
},
...
Assume that we need a separate tag collection, e.g. because we need to store some information about each tag.
If reference by ID:
// a book document
{
_id: ObjectId('9876'),
title: 'MEAN Web Development',
tags: [ObjectId('0001'), ObjectId('0002'), ...]
}
If reference by name:
{
_id: ObjectId('9876'),
title: 'MEAN Web Development',
tags: ['JavaScript', 'MongoDB', ...]
}
It's known that "reference by ID" is feasible.
I'm thinking that with "reference by name", a query for a book's info only needs to look in the book collection; we would know the tags' names without a join ($lookup) operation, which should be faster.
If the app checks the tags before creating or modifying a book, this should also be feasible, and faster.
I'm still not quite sure:
Is there any hidden drawback to "reference by name"?
Will "reference by name" be slower when "finding all books with a given tag"? Maybe ObjectId is somehow special?
Thanks.
I would say it depends on what your use case is for tags. As you say, it will be more expensive to do a $lookup to retrieve tag names if you reference by id. On the other hand, if you expect that tag names may change frequently, all documents in the book collection containing that tag will need to be updated on every change.
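The update cost of "reference by name" can be sketched in plain JavaScript. Only the first book comes from the question; the other two books and the renameTag helper are hypothetical, added just to show that a rename has to touch every book carrying the tag.

```javascript
// Books that reference tags by name. Only the first one is from the question.
const books = [
  { _id: "9876", title: "MEAN Web Development", tags: ["JavaScript", "MongoDB"] },
  { _id: "9877", title: "Hypothetical Second Book", tags: ["MongoDB"] },
  { _id: "9878", title: "Hypothetical Third Book", tags: ["JavaScript"] }
];

// Hypothetical helper: rename a tag everywhere and report how many book
// documents had to be rewritten.
function renameTag(bookList, oldName, newName) {
  let touched = 0;
  for (const book of bookList) {
    if (book.tags.includes(oldName)) {
      book.tags = book.tags.map((t) => (t === oldName ? newName : t));
      touched += 1;
    }
  }
  return touched;
}

const rewritten = renameTag(books, "MongoDB", "MongoDB Server");
console.log(rewritten); // 2
```

With "reference by ID", the same rename would be a single update to the tag document, at the price of a lookup whenever the names are needed.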
The ObjectID is simply a 12-byte value, which is autogenerated by the driver if no _id is present in an inserted document. See the MongoDB docs for more info. The only "special behavior" is the fact that _id has an index by default. An index will speed up lookups in general, but indexes can be created on any field, not just _id.
In fact, the _id does not need to be an ObjectID. It is perfectly legal to have documents with integer _id values for instance:
{
_id: 1,
name: 'Javascript'
},
{
_id: 2,
name: 'MongoDB'
},

Is it better to save id of a document in another document as ObjectId or String

Let's take a simple "bad" example: let's assume I have 2 collections, 'person' and 'address', and that in 'address' I want to store the '_id' of the person the address is associated with. Is there any benefit to storing this "referential key" as an ObjectId rather than a string in the 'address' collection?
I feel like storing them as strings should not hurt, but I have not worked with Mongo for very long and do not know if it will hurt down the road if I follow this pattern.
I read the post here : Store _Id as object or string in MongoDB?
It's said there that ObjectId is faster, and I assume that's true if you are fetching/updating by ObjectId in the parent collection (e.g. fetching/updating the 'person' collection using person._id as an ObjectId), but I couldn't find anything suggesting the same holds when searching by the string representation of the id in another collection (in our example, searching the 'address' collection by person._id as a string).
Your feedback is much appreciated.
Regardless of performance, you should store the "referential key" in the same format as the _id field that you are referring to. That means that if your referred document is:
{ _id: ObjectID("68746287..."), value: 'foo' }
then you'd refer to it as:
{ _id: ObjectID(…parent document id…), subDoc: ObjectID("68746287...") }
If the document that you're pointing to has a string as an ID, then it'd look like:
{ _id: "derick-address-1", value: 'foo' }
then you'd refer to it as:
{ _id: ObjectID(…parent document id…), subDoc: "derick-address-1" }
Besides that, because you're talking about persons and addresses, it might make more sense to not have them in two documents altogether, but instead embed the document:
{ _id: ObjectID(…parent document id…),
'name' : 'Derick',
'addresses' : [
{ 'type' : 'Home', 'street' : 'Victoria Road' },
{ 'type' : 'Work', 'street' : 'King William Street' },
]
}
As for using a string as the document id: in a Meteor collection, you can generate the document id either with Random.id() as a string or with Meteor.Collection.ObjectID() as an ObjectId.
In this discussion thread, Mongodb string id vs ObjectId, there is a good summary:
ObjectId Pros
it has an embedded timestamp in it.
it's the default Mongo _id type; ubiquitous
interoperability with other apps and drivers
ObjectId Cons
it's an object, and a little more difficult to manipulate in practice.
there will be times when you forget to wrap your string in new ObjectId()
it requires server side object creation to maintain _id uniqueness
- which makes generating them client-side by minimongo problematic
String Pros
developers can create domain specific _id topologies
String Cons
developer has to ensure uniqueness of _ids
findAndModify() and getNextSequence() queries may be invalidated
All of the information above is based on the Meteor framework. For MongoDB itself, it is better to use ObjectId; the reasons are in the question linked in your question.
Storing it as an ObjectId is beneficial. It is faster, as an ObjectId is 12 bytes, whereas the same id stored as a hex string takes 24 characters (at least 24 bytes).
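The 12 versus 24 figure can be checked with a quick sketch. It assumes Node.js (for the global Buffer) and uses an arbitrary 24-character hex value as a sample id; it measures only the raw value sizes and says nothing about the extra overhead MongoDB adds when storing a string.

```javascript
// A 24-character hex string, the usual textual form of an ObjectId.
// The value is an arbitrary sample, not a real document id.
const hexId = "556f07b5d5f2fb3f94b8c179";

// Size when kept as a string (one byte per character in UTF-8 here).
const stringBytes = Buffer.byteLength(hexId, "utf8");

// Size of the raw value once the hex digits are decoded into bytes.
const rawBytes = Buffer.from(hexId, "hex").length;

console.log(stringBytes, rawBytes); // 24 12
```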
Also, you should try to denormalize your collections so that you don't need two collections (the opposite of the RDBMS approach).
Something like this might be better in general:
{ _id : "1",
person : {
Name : "abc",
age: 20
},
address : {
street : "1st main",
city: "Bangalore",
country: "India"
}
}
But again, it depends on your use case. This might be not suitable sometimes.
Hope that helps! :)

Meteor Collection: find element in array

I have no experience with NoSQL, so I think that if I just ask about the code, my question may be wrong. Instead, let me explain my problem.
Suppose I have an e-store. I have catalogs
Catalogs = new Mongo.Collection('catalogs');
and products in those catalogs
Products = new Mongo.Collection('products');
Then people add their orders to a temporary collection
Order = new Mongo.Collection();
Then people submit their comments, phone, etc. and place the order. I save it to the Operations collection:
Operations.insert({
phone: "phone",
comment: "comment",
etc: "etc",
savedOrder: Order //<- Array, right? Or would an Object be better?
});
Nice, but now I want to get stats for every product: in which Operations was the product used? How can I search through my Operations and find every operation with that product?
Or is this approach bad? How do real pros do this in the real world?
If I understand correctly, here are sample documents as stored in your Operations collection:
{
clientRef: "john-001",
phone: "12345678",
other: "etc.",
savedOrder: {
"someMetadataAboutOrder": "...",
"lines" : [
{ qty: 1, itemRef: "XYZ001", unitPriceInCts: 1050, desc: "USB Pen Drive 8G" },
{ qty: 1, itemRef: "ABC002", unitPriceInCts: 19995, desc: "Entry level motherboard" },
]
}
},
{
clientRef: "paul-002",
phone: null,
other: "etc.",
savedOrder: {
"someMetadataAboutOrder": "...",
"lines" : [
{ qty: 3, itemRef: "XYZ001", unitPriceInCts: 950, desc: "USB Pen Drive 8G" },
]
}
},
Given that, to find all operations having item reference XYZ001 you simply have to query:
> db.operations.find({"savedOrder.lines.itemRef":"XYZ001"})
This will return the whole document. If instead you are only interested in the client reference (and operation _id), you will use a projection as an extra argument to find:
> db.operations.find({"savedOrder.lines.itemRef":"XYZ001"}, {"clientRef": 1})
{ "_id" : ObjectId("556f07b5d5f2fb3f94b8c179"), "clientRef" : "john-001" }
{ "_id" : ObjectId("556f07b5d5f2fb3f94b8c17a"), "clientRef" : "paul-002" }
If you need to perform multi-document (incl. multi-embedded-document) operations, you should take a look at the aggregation framework.
For example, to calculate the total of an order:
> db.operations.aggregate([
{$match: { "_id" : ObjectId("556f07b5d5f2fb3f94b8c179") }},
{$unwind: "$savedOrder.lines" },
{$group: { _id: "$_id",
total: {$sum: {$multiply: ["$savedOrder.lines.qty",
"$savedOrder.lines.unitPriceInCts"]}}
}}
])
{ "_id" : ObjectId("556f07b5d5f2fb3f94b8c179"), "total" : 21045 }
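As a cross-check of that pipeline, the same arithmetic can be done in plain JavaScript on the first sample document. This is only an in-memory sketch of what $unwind followed by $group with $sum and $multiply computes for one order; it does not talk to a database.

```javascript
// The first sample operation from above, reduced to its order lines.
const operation = {
  clientRef: "john-001",
  savedOrder: {
    lines: [
      { qty: 1, itemRef: "XYZ001", unitPriceInCts: 1050 },
      { qty: 1, itemRef: "ABC002", unitPriceInCts: 19995 }
    ]
  }
};

// Sum qty * unitPriceInCts over every line of the order.
const total = operation.savedOrder.lines.reduce(
  (sum, line) => sum + line.qty * line.unitPriceInCts,
  0
);
console.log(total); // 21045
```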
I'm an eternal newbie, but since no answer is posted, I'll give it a try.
First, start by installing Robomongo or similar software; it will allow you to look at your collections directly in MongoDB (btw, the default port is 3001).
The way I deal with your kind of problem is by using the _id field. It is a field automatically generated by MongoDB, and you can safely use it as an ID for any item in your collections.
Your catalog collection should have a string array field called product holding the _id of each item from your products collection. Same thing for the operations: if an order is an array of product _ids, you can store that array in your savedOrder field. Feel free to add more fields to savedOrder if necessary, e.g. make it an array of product objects with additional fields such as discount.
Concerning your queries code, I assume you will find all you need on the web as soon as you figure out what your structure is.
For example, if you have a products array in your savedOrder, you can pull it out like this:
Operations.find({_id: "your operation ID"}, {"savedOrder.products": 1})
Basically, you ask for all the product _ids in a specific operation. If you have several savedOrders in a single operation, you can also specify the savedOrder _id, if you kept the one from your local collection.
Operations.find({_id: "your_operation_ID", "savedOrder._id": "your_savedOrder_ID"}, {"savedOrder.products": 1})
ps: to bad-ass coders here, if I'm doing it wrong, please tell me.
I found an answer :) Of course, this is no revelation for real professionals, but it is a big step for me. Maybe someone will find my experience useful. All the magic is in using the correct Mongo operators. Let's solve this problem in pseudocode.
We have a structure like this:
Operations:
1. Operation: {
_id: <- Mongo creates this unique id for us
phone: "phone1",
comment: "comment1",
savedOrder: [
{
_id: <- and again
productId: <- we should save our product ID from 'products'
name: "Banana",
quantity: 100
},
{
_id:,
productId: <- another ID that we should save with the order
name: "apple",
quantity: 50
}
]
}
And if we want to know in which Operation a user took "banana", we should use the MongoDB operator $elemMatch (see the Mongo docs):
db.getCollection('operations').find({}, {savedOrder: {$elemMatch:{productId: "f5mhs8c2pLnNNiC5v"}}});
Put simply, we get the documents whose saved order has products with the id we want to find. I don't know if it is the best way, but it works for me :) Thank you!
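One thing worth keeping in mind: when $elemMatch is passed as the second argument (the projection) with an empty query, find() still returns every operation and only trims the savedOrder array. To restrict which operations come back, the condition belongs in the first argument. The in-memory sketch below shows that selection step in plain JavaScript. The sample ids, the second product id, and the assumption that "f5mhs8c2pLnNNiC5v" is the banana's productId are all made up for illustration.

```javascript
// Sample operations; the ids and the apple product id are made up.
const operations = [
  { _id: "op1", phone: "phone1", savedOrder: [
    { productId: "f5mhs8c2pLnNNiC5v", name: "Banana", quantity: 100 },
    { productId: "hypotheticalAppleId", name: "apple", quantity: 50 }
  ] },
  { _id: "op2", phone: "phone2", savedOrder: [
    { productId: "hypotheticalAppleId", name: "apple", quantity: 10 }
  ] }
];

// Hypothetical helper: keep only the operations where at least one
// order line refers to the given product.
function operationsWithProduct(ops, productId) {
  return ops.filter((op) => op.savedOrder.some((line) => line.productId === productId));
}

const bananaOps = operationsWithProduct(operations, "f5mhs8c2pLnNNiC5v");
console.log(bananaOps.map((op) => op._id)); // [ 'op1' ]
```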

MongoDB many-to-many search

These are my collections (many-to-many):
actors:
{
_id: 1,
name: "Name 1"
}
movies:
{
_id: 1,
name: "The Terminator",
production_year: 1984,
actors: [
{
actors_id: 1,
role_id : 1
},
{
actors_id: 2,
role_id : 1
}
]
}
I can't get a list of actors for a given movie.
It is not a problem when I have this:
{
_id: 1,
name: "The Terminator",
production_year: 1984,
actors: [1,2,3,4,5] // actor ids
}
var a = db.movies.findOne({name: "The Terminator"}).actors
db.actors.find({"_id":{$in:a}})
But how can I do it with the structure above?
If I do this: var a = db.movies.findOne({name: "The Terminator"}).actors
it returns me this:
[
{
actors_id: 1,
role_id : 1
},
{
actors_id: 2,
role_id : 1
}
]
How do I get only the actors_id values in an array, [1,2], so that I can get the names of the actors (with $in)?
Thanks,
Zoran
You don't. Within MongoDB you always query for documents so you have to make sure your schema is such that you can get all the information you need by querying for specific documents. There is no join/view like functionality in MongoDB.
Denormalization is usually the most appropriate choice in such cases. Your schema looks like it's designed for a traditional relational database and you will have to try and let go of some of the schema design principles that come with relational data.
Specifically for your example you could add the actor name to the embedded array so you have that information after querying for the movie.
Finally, consider whether you're using the right tool for what you need to do. Too often people think of MongoDB as a "fast MySQL", which is entirely wrong. Document databases are very different from RDBMSs and even k/v stores. If you have a lot of related data, use an RDBMS.
The variable a in var a = db.movies.findOne({name: "The Terminator"}).actors is an array of documents, so you'd have to turn it into an array of integers (ids).
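A minimal sketch of that conversion, in plain JavaScript. The array literal stands in for what the findOne call in the question returns; the actorIds name is just an illustrative choice.

```javascript
// Embedded actor references, as returned by
// db.movies.findOne({name: "The Terminator"}).actors in the question.
const a = [
  { actors_id: 1, role_id: 1 },
  { actors_id: 2, role_id: 1 }
];

// Pull out just the ids.
const actorIds = a.map((entry) => entry.actors_id);
console.log(actorIds); // [ 1, 2 ]

// The ids can then be used as before:
//   db.actors.find({ _id: { $in: actorIds } })
```

Array.prototype.map is also available in the mongo shell, so the same one-liner can be applied directly to the result of findOne there.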