Can I utilize indexes when querying by MongoDB subdocument without known field names?

I have a document structure like the following:
{
"_id": ...,
"name": "Document name",
"properties": {
"prop1": "something",
"2ndprop": "other_prop",
"other3": ["tag1", "tag2"]
}
}
I can't know the actual field names in properties subdocument (they are given by the application user), so I can't create indexes like properties.prop1. Neither can I know the structure of the field values, they can be single value, embedded document or array.
Is there any practical way to do performant queries to the collection with this kind of schema design?
One option that came to my mind is to add a new, indexed field to each document that lists the field names used in its properties subdocument.
{
"_id": ...,
"name": "Document name",
"properties": {
"prop1": "something",
"2ndprop": "other_prop",
"other3": ["tag1", "tag2"]
},
"property_fields": ["prop1", "2ndprop", "other3"]
}
Now I could first run query against property_fields field and after that let MongoDB scan through the found documents to see whether properties.prop1 contains the required value. This is definitely slower, but could be viable.

One way of dealing with this is to use a schema like the one below.
{
"name" : "Document name",
"properties" : [
{
"k" : "prop1",
"v" : "something"
},
{
"k" : "2ndprop",
"v" : "other_prop"
},
{
"k" : "other3",
"v" : "tag1"
},
{
"k" : "other3",
"v" : "tag2"
}
]
}
Then you can index "properties.k" and "properties.v", for example like this (createIndex replaces the deprecated ensureIndex):
db.foo.createIndex({"properties.k": 1, "properties.v": 1})
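Migrating existing documents to the k/v layout needs a small conversion step. The following is a minimal sketch in plain JavaScript, not part of the original answer: `toKeyValuePairs` is a hypothetical helper name, and it assumes array values should be split into one entry per element, as in the example document above.

```javascript
// Hypothetical helper: converts a free-form `properties` object into the
// [{k, v}, ...] layout. Array values become one entry per element, so an
// empty array produces no entries at all.
function toKeyValuePairs(properties) {
  const pairs = [];
  for (const [k, value] of Object.entries(properties)) {
    const values = Array.isArray(value) ? value : [value];
    for (const v of values) {
      pairs.push({ k, v });
    }
  }
  return pairs;
}

// Sample input taken from the question.
const pairs = toKeyValuePairs({
  prop1: "something",
  "2ndprop": "other_prop",
  other3: ["tag1", "tag2"],
});
```

When querying this layout, put both conditions inside $elemMatch, e.g. {properties: {$elemMatch: {k: "prop1", v: "something"}}}, so that k and v are matched against the same array element.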

Related

Mongodb query to return field value

I am trying to construct a Mongodb query to return a field value. My JSON looks like this:
"question" : "Global_Deployment",
"displayOrder" : 1,
"answerOptions" : {
"fieldId" : "1001",
"fieldType" : "radiobutton",
"fieldName" : "Global Deployment?",
"fieldLabel" : "Global Deployment?",
"helpText" : "Help will go here",
"emailTagFormControl" : "Global_Deployment?",
"source" : "custom",
"status" : "active",
"required" : "true",
"multiSelect" : "false",
"purgeFlag" : "false",
"enableAuditTrack" : "false",
"fields" : [],
"fieldValue" : "Yes",
"options" : [
{
"optionName" : "Yes"
},
{
"optionName" : "No"
}
],
"comments" : {
"commentId" : "C1001",
"commentDetails" : []
}
My query to reach the field with the fieldName "Global Deployment" is this:
db.getCollection('requests').find({"sections.questions.answerOptions.fieldName":"Global Deployment?"})
What I want to know is what to add to this query to return the value of "fieldValue", which is on a different line in the JSON. I am new to Mongodb. Any help would be greatly appreciated.
1) If several documents in the collection have "fieldName" : "Global Deployment?", .find() returns all of them, so you get a cursor over multiple documents and need to iterate over it to read answerOptions.fieldValue from each one. This can happen whenever "sections.questions.answerOptions.fieldName" is not unique. For example:
db.getCollection('requests').find({"sections.questions.answerOptions.fieldName":"Global Deployment?"}, {'sections.questions.answerOptions.fieldValue':1})
Output of find :
/* 1 */
[{
"_id" : ObjectId("5d4e19826e173840500f5674"),
"answerOptions" : {
"fieldValue" : "Yes"
}
},
/* 2 */
{
"_id" : ObjectId("5d4e19826e073840500f5674"),
"answerOptions" : {}
}]
If you only need documents that have a fieldValue, add an $exists condition:
db.getCollection('requests').find({"sections.questions.answerOptions.fieldName":"Global Deployment?", 'sections.questions.answerOptions.fieldValue':{$exists: true}}, {'sections.questions.answerOptions.fieldValue':1})
Now that you have a cursor over the matching documents, iterate through it to retrieve each value (see the MongoDB cursor tutorial).
2) If fieldName is unique across the collection, you can use .findOne(), which returns exactly one document (if several documents match, it returns the first one found):
db.getCollection('requests').findOne({"sections.questions.answerOptions.fieldName":"Global Deployment?"}, {'sections.questions.answerOptions.fieldValue':1})
Output of findOne :
{
"_id" : ObjectId("5d4e19826e173840500f5674"),
"answerOptions" : {
"fieldValue" : "Yes"
}
}
As you can see, .find({}, {}) takes two arguments. The second one is called the projection, and it is useful when you want only certain fields in the response. By default MongoDB returns the entire document, i.e. everything you posted in the question. Results come back as JSON-like objects, so you could also pick the required field out of the full result on the client, but if you don't need the whole document it is more network-efficient to request only the required fields with a projection.
You can specify a second condition, separated by a comma. Conditions can be combined with either $and or $or.
With the simple approach (conditions in the same query document are implicitly ANDed):
{"sections.questions.answerOptions.fieldName":"Global Deployment?","sections.questions.answerOptions.fieldValue":"Yes" }
By using $and method:
.find(
{
$and: [
{"sections.questions.answerOptions.fieldName":"Global Deployment?"},
{"sections.questions.answerOptions.fieldValue":"Yes"}
]
}
)
You can use $or the same way: just replace $and with $or.
Edit:
If you want to retrieve specific value (in your case fieldValue), query would be:
db.getCollection('requests').find({
"sections.questions.answerOptions.fieldName":"Global Deployment?"
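}).map(function(item){
// assumes sections and questions are embedded documents, not arrays
return item.sections.questions.answerOptions.fieldValue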
})
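The correct answer here is the .distinct() method (see the docs). Note that it is called on the collection, not on the cursor returned by .find(); the field to return comes first and the query filter second.
In your case try it like this:
db.getCollection('requests').distinct('sections.questions.answerOptions.fieldValue', {"sections.questions.answerOptions.fieldName":"Global Deployment?"});
That will return only the distinct values of that field, as an array.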
If you use findOne you can use dot notation.
For example, if we start with creating a collection to test using the following to get close to your sample:
db.stackOverflow.insertOne({
sections: {
questions: {
question: "Global_Deployment",
displayOrder: 1,
answerOptions: {
fieldId: "1001",
fieldType: "radiobutton",
fieldName: "Global Deployment?",
fieldLabel: "Global Deployment?",
helpText: "Help will go here",
emailTagFormControl: "Global_Deployment?",
source: "custom",
status: "active",
required: "true",
multiSelect: "false",
purgeFlag: "false",
enableAuditTrack: "false",
fields: [],
fieldValue: "Yes",
options: [
{
optionName: "Yes",
},
{
optionName: "No",
},
],
comments: {
commentId: "C1001",
commentDetails: [],
},
},
},
},
})
then, this query will return "Yes".
db.stackOverflow.findOne({}).sections.questions.answerOptions.fieldValue
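One caveat with chaining dot notation directly onto findOne: it returns null when nothing matches, and reading a property of null throws a TypeError. A minimal sketch of a defensive accessor in plain JavaScript follows; `getPath` is a hypothetical helper name, not a MongoDB API, and it assumes the path runs through embedded documents rather than arrays.

```javascript
// Hypothetical helper: reads a dotted path from a plain object and
// returns undefined instead of throwing when any step is missing.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((current, key) => (current == null ? undefined : current[key]), doc);
}

// Trimmed-down version of the sample document from the answer above.
const sampleDoc = {
  sections: { questions: { answerOptions: { fieldValue: "Yes" } } },
};

const found = getPath(sampleDoc, "sections.questions.answerOptions.fieldValue");
const missing = getPath(null, "sections.questions.answerOptions.fieldValue");
```

In the shell this would be used as getPath(db.stackOverflow.findOne({}), "sections.questions.answerOptions.fieldValue").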

How to move Embedded Fields out of their embedded document?

Here is an example of one of my JSON docs:
{
"_id": 1,
"SongId": 1,
"Details": {
"Artist": "Cyndi Lauper",
"Album": "She's So Unusual",
"ReleaseYear": 1983
},
"SongTitle": "Girls Just Want To Have Fun"
}
How would one write a query to move "Artist" and its value out of the "Details" document, leaving "Album" and "ReleaseYear" still embedded?
In addition to updating the name of a field, the $rename operator can be used to move fields out of (or into) embedded documents.
When working with fields in embedded documents you need to use dot notation to refer to the field name.
Assuming a collection name of discography, you could move your Details.Artist field using:
db.discography.update(
{_id: 1},
{$rename: { "Details.Artist": "Artist"}}
)
Example result:
> db.discography.findOne({_id: 1})
{
"_id" : 1,
"SongId" : 1,
"Details" : {
"Album" : "She's So Unusual",
"ReleaseYear" : 1983
},
"SongTitle" : "Girls Just Want To Have Fun",
"Artist" : "Cyndi Lauper"
}
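To make the effect of moving a field with dot notation concrete, here is a minimal plain-JavaScript sketch that performs the same move on an in-memory object. It only illustrates the before/after shape of one document; it is not how the server implements $rename, and `moveField` is a hypothetical helper name.

```javascript
// Hypothetical helper: moves the value at fromPath to toPath on a copy of doc.
// Returns the copy unchanged if fromPath does not exist.
function moveField(doc, fromPath, toPath) {
  const copy = JSON.parse(JSON.stringify(doc)); // deep copy, fine for JSON-like data
  const fromKeys = fromPath.split(".");
  const lastFrom = fromKeys.pop();
  let parent = copy;
  for (const key of fromKeys) {
    parent = parent == null ? undefined : parent[key];
  }
  if (parent == null || typeof parent !== "object" || !(lastFrom in parent)) {
    return copy; // nothing to move
  }
  const value = parent[lastFrom];
  delete parent[lastFrom];

  const toKeys = toPath.split(".");
  const lastTo = toKeys.pop();
  let target = copy;
  for (const key of toKeys) {
    // create intermediate embedded documents as needed
    if (target[key] == null || typeof target[key] !== "object") target[key] = {};
    target = target[key];
  }
  target[lastTo] = value;
  return copy;
}

const song = {
  _id: 1,
  SongId: 1,
  Details: { Artist: "Cyndi Lauper", Album: "She's So Unusual", ReleaseYear: 1983 },
  SongTitle: "Girls Just Want To Have Fun",
};
const moved = moveField(song, "Details.Artist", "Artist");
```

Here `moved` has the same shape as the example result above, while `song` itself is left untouched.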

How do I query a hash sub-object that is dynamic in mongodb?

I currently have a Question object and am not sure how to query it.
{ "title" : "Do you eat fast food?",
"answers" : [
{
"_id" : "506b422ff42c95000e00000d",
"title" : "Yes",
"trait_score_modifiers" : {
"hungry" : 1
}
},
{
"_id" : "506b422ff42c95000e00000e",
"title" : "No",
"trait_score_modifiers" : {
"not-hungry" : -1
}
}]
}
I am trying to find questions by what is in trait_score_modifiers (sometimes it exists, sometimes not).
I have the following, but it is not dynamic:
db.questions.find({"answers.trait_score_modifiers.not-hungry":{$exists: true}})
How could I do something like this?
db.questions.find({"answers.trait_score_modifiers.{}.size":{$gt: 0}})
You should modify the schema so you have consistent key names to query on. I ran into a similar problem using the aggregation framework, see question: Total values from all keys in subdocument
Something like this should work (not tested):
{
"title" : "Do you eat fast food?",
"answers" : [
{
"title" : "Yes",
"trait_score_modifiers" : [
{"dimension": "hungry", "value": 1}
]
},
{
"title" : "No",
"trait_score_modifiers" : [
{"dimension": "not-hungry", "value": -1}
]
}]
}
You can return all questions that have a dynamic dimension (e.g. "my new dimension") with:
db.questions.find({"answers.trait_score_modifiers.dimension": "my new dimension"})
Or limit the returned set to questions that have a specific value on that dimension (e.g. > 0):
db.questions.find({
"answers.trait_score_modifiers": {
"$elemMatch": {
"dimension": "my new dimension",
"value": {"$gt": 0}
}
}
})
Querying nested arrays can be a bit tricky, so be sure to read up on the documentation. In this case, $elemMatch is needed because otherwise the query can also return a document where one array element has the dimension my new dimension while the matching value belongs to a different array element.
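The difference between the two matching behaviours can be sketched in plain JavaScript on an in-memory array. This is only an illustration of the semantics, under the simplifying assumption of a single flat trait_score_modifiers array; the sample data is made up for the example.

```javascript
// Made-up sample: the wanted dimension has a negative value,
// and a different element has a positive value.
const modifiers = [
  { dimension: "my new dimension", value: -1 },
  { dimension: "hungry", value: 1 },
];
const wanted = "my new dimension";

// Two separate dotted conditions: each condition may be satisfied
// by a different array element.
const dottedMatch =
  modifiers.some((m) => m.dimension === wanted) &&
  modifiers.some((m) => m.value > 0);

// $elemMatch semantics: a single element must satisfy both conditions.
const elemMatch = modifiers.some(
  (m) => m.dimension === wanted && m.value > 0
);
```

With this data `dottedMatch` is true while `elemMatch` is false, which is exactly the false positive that $elemMatch avoids.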
You need an $elemMatch criterion in your query.
Refer to: http://docs.mongodb.org/manual/reference/projection/elemMatch/
Let me know if you need the query.

MongoDB return array values where key matches criteria

I'm trying to get the values from an array of objects for the keys that match certain criteria. For the objects in the array the keys will be longs and the values strings. Here's a sample MongoDB document:
"_id" : ObjectId("509eba6d84f30613b4aee1ca"),
"timestamps" : [
{
"1234" : "ABC"
},
{
"2345" : "DEF"
},
{
"3456" : "GHI"
},
{
"4567" : [
"JKL",
"ABC"
]
},
{
"5678" : "GHI"
}
],
"word" : "foo"
For example I'd like to retrieve the values of all "timestamps" entries where the key is less than 3000 (i.e. "ABC" and "DEF" in the above). I've only had luck in finding which documents in the collection have specific keys by using coll.find({"timestamps.4567":{$exists:true}}) but I get no results when trying things like coll.find({"timestamps":{$lt:3000}}) - I'm obviously missing something there that would check if timestamps' keys are less than 3000, not the value of timestamps itself.
Maybe I got it all wrong... but it looks like you need to alter the structure of your documents a bit:
"_id" : ObjectId("509eba6d84f30613b4aee1ca"),
"timestamps" : [
{
"key": "1234",
"val": "ABC"
},
{
"key": "2345",
"val": "DEF"
}
],
"word" : "foo"
and then you can query using elemMatch:
db.test.find({timestamps: {$elemMatch: {'key': {$gt: '1234'}}}})
Make sure you have an index on timestamps.key
HTH
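One thing to watch with the key/val layout: if the keys stay strings, range operators compare them lexicographically (so "10000" sorts before "3000"). A minimal sketch of the conversion in plain JavaScript follows, assuming the keys are stored as numbers instead; `toKeyValEntries` is a hypothetical helper name, and keys larger than 2^53 would lose precision as JavaScript numbers.

```javascript
// Hypothetical helper: converts [{ "1234": "ABC" }, ...] into
// [{ key: 1234, val: "ABC" }, ...] with numeric keys.
function toKeyValEntries(timestamps) {
  const entries = [];
  for (const entry of timestamps) {
    for (const [key, val] of Object.entries(entry)) {
      entries.push({ key: Number(key), val });
    }
  }
  return entries;
}

// Sample data from the question.
const original = [
  { "1234": "ABC" },
  { "2345": "DEF" },
  { "3456": "GHI" },
  { "4567": ["JKL", "ABC"] },
  { "5678": "GHI" },
];

const entries = toKeyValEntries(original);

// In-memory equivalent of "values whose key is less than 3000".
const below3000 = entries.filter((e) => e.key < 3000).map((e) => e.val);
```

With numeric keys, the matching server-side filter would be {timestamps: {$elemMatch: {key: {$lt: 3000}}}}.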

How to include different catalogs in a query in MongoDB?

Suppose I have different documents in different collections:
On cars:
{ "_id": 32534534, "color": "red", ... }
On houses:
{ "_id": 93867, "city": "Xanadu", ... }
How can I retrieve the corresponding document to the documents below, in people:
{ "name": "Alonso", "owns": [32534534], ... }
{ "name": "Kublai Khan", "owns": [93867], ... }
Can I use something like the code below?
(Note that I'm not specifying a catalog)
db.find({'_id': 93867})
If not, what would you suggest to achieve this effect?
I have just found this related question: MongoDB: cross-collection queries
Using DBRefs you can store links to documents outside your collection or even in another MongoDB database. You will have to fetch the references in separate queries, and different drivers handle this differently; for example, with the Python driver you can auto-dereference.
An example of yours in the js shell might look like:
> red_car = {"color": "red", "model": "Ford Perfect"}
{"color": "red", "model": "Ford Perfect"}
> db.cars.save(red_car)
> red_car
{
"color" : "red",
"model" : "Ford Perfect",
"_id" : ObjectId("4f041d96874e6f24e704f887")
}
> // Save as DBRef
> alonso = {"name": "Alonso", "owns": [new DBRef('cars', red_car._id)]}
{
"name" : "Alonso",
"owns" : [
{
"$ref" : "cars",
"$id" : ObjectId("4f041d96874e6f24e704f887")
}
]
}
> db.people.save(alonso)
As you can see DBRefs are a formal spec for referencing objects, that always contain the ObjectId but also can contain information on the database and the collection. In the above example you can see it stores the collection cars in the $ref field. Searching is trivial as you just do a query on the dbref:
> dbref = new DBRef('cars', red_car._id)
> red_car_owner = db.people.find({"owns": {$in: [dbref]}})[0]
> red_car_owner
{
"_id" : ObjectId("4f0448e3a1c5cd097fc36a65"),
"name" : "Alonso",
"owns" : [
{
"$ref" : "cars",
"$id" : ObjectId("4f0448d1a1c5cd097fc36a64")
}
]
}
Dereferencing can be done via the fetch() command in the shell:
> red_car_owner.owns[0].fetch()
{
"_id" : ObjectId("4f0448d1a1c5cd097fc36a64"),
"color" : "red",
"model" : "Ford Perfect"
}
However depending on your use case you may want to optimise this and write some code that iterates over the owns array and does as few find() queries as possible...
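One way to keep the number of find() calls down is to group the references by collection first, so that each collection needs only one $in query. A minimal sketch in plain JavaScript, assuming each reference is a plain object with $ref and $id fields as shown above; `groupIdsByCollection` is a hypothetical helper name, and the string ids are placeholders for real ObjectIds.

```javascript
// Hypothetical helper: groups DBRef-like objects by target collection.
// Returns an object mapping collection name -> array of ids.
function groupIdsByCollection(refs) {
  const grouped = {};
  for (const ref of refs) {
    const collection = ref.$ref;
    if (!grouped[collection]) grouped[collection] = [];
    grouped[collection].push(ref.$id);
  }
  return grouped;
}

// Placeholder ids for illustration only.
const owns = [
  { $ref: "cars", $id: "car-1" },
  { $ref: "houses", $id: "house-1" },
  { $ref: "cars", $id: "car-2" },
];

const idsByCollection = groupIdsByCollection(owns);
// Each group can then be fetched with a single query per collection,
// e.g. db.cars.find({ _id: { $in: idsByCollection.cars } }) in the shell.
```

This turns one query per owned item into one query per referenced collection.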
I don't think there is a way to query multiple collections at once. I would suggest storing the documents in the same collection with a type field, like below.
{ "_id": 32534534, "type": "car", "color": "red", ... }
{ "_id": 93867, "type": "house", "city": "Xanadu", ... }
You also need to restructure your people documents to include the type:
{ "name": "Alonso", "owns": {"ids": [32534534], "type": "car"}, ... }
{ "name": "Kublai Khan", "owns": {"ids": [93867], "type": "house"}, ... }
so now you can find the people who own the red car with
db.people.find({"owns.type": "car", "owns.ids": 32534534})
and houses with
db.people.find({"owns.type": "house", "owns.ids": 93867})