Is there a way to query for dict of arrays of array in MongoDB - mongodb

My MongoDB collection contains documents with the following structure:
{
"_id" : ObjectId("5889ce0d2e9bfa938c49208d"),
"filewise_word_freq" : {
"33236365" : [
[
"cluster",
4
],
[
"question",
2
],
[
"differ",
2
],[
"come",
1
]
],
"33204685" : [
[
"node",
6
],
[
"space",
4
],
[
"would",
3
],[
"templat",
1
]
]
},
"file_root" : "socialcast",
"main_cluster_name" : "node",
"most_common_words" : [
[
"node",
16
],
[
"cluster",
7
],
[
"n't",
3
]
]
}
I want to search for the value "node" inside the arrays of arrays stored under each file name (in my case "33236365", "33204685", and so on) in the dict filewise_word_freq.
If the value "node" is present in any of the arrays under a file name (here 33204685), the query should return that file name (33204685).
I tried the approach from another Stack Overflow answer, but it didn't work for my use case. On top of that, I don't know how to return only the file name rather than the entire document.
db.frequencydist.find({"file_root":'socialcast',"main_cluster_name":"node","filewise_word_freq":{$elemMatch:{$elemMatch:{$elemMatch:{$in:["node"]}}}}}).pretty()
It returned nothing.
Kindly help me.

The data model you have chosen makes this extremely difficult to query, or even to aggregate. I would suggest revising your document model. However, I think you can use $where:
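db.collection.find({"file_root": "socialcast", "main_cluster_name": "node",
  // $where evaluates this function on the server for every candidate document
  $where: function () {
    for (var i in this.filewise_word_freq) {
      for (var j in this.filewise_word_freq[i]) {
        // each entry is a [word, count] pair
        if (this.filewise_word_freq[i][j].indexOf("node") >= 0) { return true; }
      }
    }
    return false;
  }
})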
Yes, this returns the whole document, so you will need to filter out the file names in your application.
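Since the $where query returns whole documents, the file names still have to be extracted on the application side. A minimal sketch in plain JavaScript, assuming the document shape from the question; the helper name fileNamesContaining and the trimmed sample document are made up for illustration:

```javascript
// Hypothetical helper (not a MongoDB API): returns the keys of
// filewise_word_freq whose [word, count] pairs include `word`.
function fileNamesContaining(doc, word) {
  return Object.keys(doc.filewise_word_freq).filter((fileName) =>
    doc.filewise_word_freq[fileName].some((pair) => pair[0] === word)
  );
}

// Trimmed copy of the document from the question.
const sampleDoc = {
  file_root: "socialcast",
  filewise_word_freq: {
    "33236365": [["cluster", 4], ["question", 2]],
    "33204685": [["node", 6], ["space", 4]],
  },
};

console.log(fileNamesContaining(sampleDoc, "node")); // [ '33204685' ]
```

The same function can be applied to each document the query returns.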
You might also look at the map-reduce functionality, though that's not recommended.
Another option is stored functions, which run on the MongoDB server and are saved in a special collection (system.js).
Still, going back to the data model: do revise it if that's a possibility, perhaps to something like this:
{
"_id" : ObjectId("5889ce0d2e9bfa938c49208d"),
"filewise_word_freq" : [
{
"fileName":"33236365",
"word_counts" : {
"cluster":4,
"question":2,
"differ":2,
"come":1
}
},
{
"fileName":"33204685",
"word_counts" : {
"node":6,
"space":4,
"would":3,
"template":1
}
}
],
"file_root" : "socialcast",
"main_cluster_name" : "node",
"most_common_words" : [
{
"node":16
},
{
"cluster":7
},
{
"n't":3
}
]
}
Running aggregations on this model is a lot easier.
For this model, the aggregation would be something like:
db.collection.aggregate([
{$unwind : "$filewise_word_freq"},
{$match : {'filewise_word_freq.word_counts.node' : {$gte : 0}}},
{$group :{_id: 1, fileNames : {$addToSet : "$filewise_word_freq.fileName"}}},
{$project : {_id: 0, fileNames: 1}}
])
This gives you a single document with a single field, fileNames, listing all the matching file names:
{
fileNames : ["33204685"]
}
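If revising the model is an option, the conversion from the original shape can be sketched in plain JavaScript. This is only an illustration under the assumption that every entry is a [word, count] pair; toRevisedModel and the trimmed sample document are hypothetical names, not part of any MongoDB API:

```javascript
// Hypothetical migration helper: converts the original shape
// (file name -> list of [word, count] pairs) into the revised model.
function toRevisedModel(doc) {
  return {
    ...doc,
    filewise_word_freq: Object.entries(doc.filewise_word_freq).map(
      ([fileName, pairs]) => ({
        fileName,
        word_counts: Object.fromEntries(pairs),
      })
    ),
    most_common_words: doc.most_common_words.map(([word, count]) => ({
      [word]: count,
    })),
  };
}

// Trimmed copy of the document from the question.
const original = {
  file_root: "socialcast",
  main_cluster_name: "node",
  filewise_word_freq: {
    "33236365": [["cluster", 4], ["come", 1]],
    "33204685": [["node", 6], ["templat", 1]],
  },
  most_common_words: [["node", 16], ["cluster", 7]],
};

const revised = toRevisedModel(original);
// Same lookup the aggregation performs: files whose word_counts has "node".
const filesWithNode = revised.filewise_word_freq
  .filter((file) => "node" in file.word_counts)
  .map((file) => file.fileName);
console.log(filesWithNode); // [ '33204685' ]
```

Because words become field names in this model, words containing a dot or starting with $ may need special handling before they are stored.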

You can try something like this. It matches "node" as part of the query and returns filewise_word_freq.33204685 through the projection. Note that the file name has to be known in advance, since it is part of the field path.
db.collection.find({
"file_root": 'socialcast',
"main_cluster_name": "node",
"filewise_word_freq.33204685": {
$elemMatch: {
$elemMatch: {
$in: ["node"]
}
}
}
}, {
"filewise_word_freq.33204685": 1
}).pretty();
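Because the file name is part of the field path, the filter and projection have to be built per file name. One possible way to do that in JavaScript is with computed property names; buildFileQuery is a hypothetical helper, and the fixed file_root and main_cluster_name values are simply copied from the query above:

```javascript
// Hypothetical helper: builds the filter and projection for one file name.
function buildFileQuery(fileName, word) {
  const path = "filewise_word_freq." + fileName;
  return {
    filter: {
      file_root: "socialcast",
      main_cluster_name: "node",
      // outer $elemMatch: some [word, count] pair; inner: some element equals `word`
      [path]: { $elemMatch: { $elemMatch: { $in: [word] } } },
    },
    projection: { [path]: 1 },
  };
}

const fileQuery = buildFileQuery("33204685", "node");
console.log(Object.keys(fileQuery.projection)); // [ 'filewise_word_freq.33204685' ]
```

In the shell the result would be passed along as db.collection.find(fileQuery.filter, fileQuery.projection).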

Related

How to boost Mongodb search result based on Criteria given

I am working on a requirement where, if a user enters the acronym of a university (e.g. MIT), the query has to return the matching universities from the database. The JSON looks like this:
{
"_id" : ObjectId("5d68cdcac8acd826e6a386b2"),
"name" : "Massachusetts Institute of Technology",
"acronyms" : [
"MIT"
]
}
,
{
"_id" : ObjectId("5d68ce0bc8acd826e6a45b29"),
"name" : "Manukau Institute of Technology",
"acronyms" : [
"MIT"
]
}
The user might enter the name as well, so I have written an $or query for that:
db.getCollection('universityCollection').find(
{$or: [{"name":"MIT"},{"acronyms":"MIT"}]}
)
Now my requirement is that, if the user's input matches an acronym, those documents should be returned first, followed by the documents that match on name.
The current $or query does not return the results in that order.
Any pointers will help.
Please try the query below.
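db.getCollection('test').aggregate([
{ $match : { $or : [{ "name":"MIT" }, {"acronyms":"MIT" } ] } }
,{ "$project": {
"name": 1,
"acronyms": 1,
// sortOrder is true when "MIT" is one of the acronyms
"sortOrder": {
"$setIsSubset": [ ["MIT" ] , "$acronyms" ] }
}
}
// true sorts before false in descending order, so acronym matches come first
,{ "$sort": { "sortOrder": -1 } }
])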
If you are not familiar with MongoDB aggregation, see the links below.
https://docs.mongodb.com/manual/reference/method/db.collection.aggregate/
https://docs.mongodb.com/manual/reference/operator/aggregation/setIsSubset/
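To make the ordering idea concrete, here is a rough plain-JavaScript model of what the pipeline computes: keep documents that match on name or acronym, then put acronym matches first. The function rankAcronymFirst and the sample entry named "MIT" are invented for illustration and are not part of the original data:

```javascript
// Hypothetical in-memory model of the $match / $project / $sort pipeline.
function rankAcronymFirst(docs, input) {
  const hasAcronym = (doc) =>
    Array.isArray(doc.acronyms) && doc.acronyms.includes(input);
  return docs
    .filter((doc) => doc.name === input || hasAcronym(doc))
    // acronym matches (true) sort before name-only matches (false)
    .sort((a, b) => Number(hasAcronym(b)) - Number(hasAcronym(a)));
}

const universities = [
  { name: "MIT", acronyms: [] }, // made-up name-only match
  { name: "Massachusetts Institute of Technology", acronyms: ["MIT"] },
  { name: "Stanford University", acronyms: ["SU"] },
];

const ranked = rankAcronymFirst(universities, "MIT").map((doc) => doc.name);
console.log(ranked); // [ 'Massachusetts Institute of Technology', 'MIT' ]
```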

MongoDB: Querying for documents that may have some specific options

I'm quite new to mongodb and there is one thing I can't solve right now:
Let's pretend, you have the following document structure:
{
"_id": ObjectId("some object id"),
name: "valueName",
options: [
{idOption: "optionId", name: "optionName"},
{idOption: "optionId", name: "optionName"}
]
}
Each document can have multiple options, which are already classified.
I'm trying to get all the documents in the collection that have at least one of the multiple options that I pass in the query.
I tried the $elemMatch operator, something like this:
db.collectioName.find({"options.name": { $elemMatch: {"optName1","optName2"}}})
but it never shows me the matching documents.
Can someone help and show me what I'm doing wrong?
Thanks!
Given a collection which contains the following documents:
{
"_id" : ObjectId("5a023b8d027b5bd06add627a"),
"name" : "valueName",
"options" : [
{
"idOption" : "optionId",
"name" : "optName1"
},
{
"idOption" : "optionId",
"name" : "optName2"
}
]
}
{
"_id" : ObjectId("5a023b9e027b5bd06add627d"),
"name" : "valueName",
"options" : [
{
"idOption" : "optionId",
"name" : "optName3"
},
{
"idOption" : "optionId",
"name" : "optName4"
}
]
}
This query ...
db.collection.find({"options": { $elemMatch: {"name": {"$in": ["optName1"]}}}})
... will return the first document only.
While this query ...
db.collection.find({"options": { $elemMatch: {"name": {"$in": ["optName1", "optName3"]}}}})
...will return both documents.
The second example (I think) meets this requirement:
I'm trying to get all the documents in the collection that have at least one of the multiple options that I pass in the query.
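The matching rule behind $elemMatch combined with $in can be pictured as "some element of options has a name that is in the list". A small in-memory sketch of that rule, with the hypothetical function hasAnyOption standing in for the query and numeric _id values used only to keep the sample short:

```javascript
// Hypothetical model of { options: { $elemMatch: { name: { $in: names } } } }.
function hasAnyOption(docs, names) {
  return docs.filter((doc) =>
    doc.options.some((option) => names.includes(option.name))
  );
}

const optionDocs = [
  { _id: 1, options: [{ idOption: "optionId", name: "optName1" },
                      { idOption: "optionId", name: "optName2" }] },
  { _id: 2, options: [{ idOption: "optionId", name: "optName3" },
                      { idOption: "optionId", name: "optName4" }] },
];

console.log(hasAnyOption(optionDocs, ["optName1"]).map((d) => d._id)); // [ 1 ]
console.log(hasAnyOption(optionDocs, ["optName1", "optName3"]).map((d) => d._id)); // [ 1, 2 ]
```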

Is there a way to query an embedded document in an embedded document?

I have a weird mongodb document, but still need to query on it. Is it possible?
For example: I need every player within a certain radius.
{
"_id" : ObjectId("55d89c63c746230c200c528e"),
"speler_id" : 12,
"naam" : "Arjen Robben",
"seconds" : [
[
{
"locatie" : [
8.7173307286181370,
33.2784843816214250
],
"timestamp" : ISODate("1970-01-01T19:00:01.000Z")
},
{
"locatie" : [
-45.8853075448968970,
138.1526615469845800
],
"timestamp" : ISODate("1970-01-01T19:00:02.000Z")
},
{
"locatie" : [
80.5503710377444690,
10.0500048843973580
],
"timestamp" : ISODate("1970-01-01T19:00:03.000Z")
}
]
]
}
Well, you can always use $geoWithin with $center or $centerSphere (depending on whether these are global geometry coordinates or just a flat plane, for distance calculation purposes) after processing with $unwind in the aggregation framework:
db.collection.aggregate([
{ "$unwind": "$seconds" },
{ "$unwind": "$seconds" },
{ "$match": {
"seconds.locatie": {
"$geoWithin": {
"$centerSphere": [
[
8.7173307286181370,
33.2784843816214250
],
100
]
}
}
}}
])
Which on the presented data would return:
{
"_id" : ObjectId("55d89c63c746230c200c528e"),
"speler_id" : 12,
"naam" : "Arjen Robben",
"seconds" : {
"locatie" : [
8.717330728618137,
33.278484381621425
],
"timestamp" : ISODate("1970-01-01T19:00:01Z")
}
}
{
"_id" : ObjectId("55d89c63c746230c200c528e"),
"speler_id" : 12,
"naam" : "Arjen Robben",
"seconds" : {
"locatie" : [
80.55037103774447,
10.050004884397358
],
"timestamp" : ISODate("1970-01-01T19:00:03Z")
}
}
Since $geoWithin does not require a geospatial index, it is fine to use it at aggregation stages later than the initial match. The $centerSphere here defines a point to query from and a radius, expressed in radians, extending from that point. This is really just a geometry shortcut, as you can alternatively provide a GeoJSON polygon of your own definition.
But it's not really great, mostly because it cannot use an index and is therefore pretty much brute force over the whole collection. You also cannot do nice things like return the distance from the queried point, as you can with $geoNear.
So while you can do things like this, most geospatial queries with MongoDB are best served by keeping the location data at the top level of the document rather than embedded within arrays. Such modelling usually means having separate collection objects rather than embedded ones for the best results.
If you want an aggregated array in your response, it is better to build it in aggregation after the initial geospatial query is made.
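One way to picture the suggested remodelling is to split each player document into one document per location sample, with locatie at the top level. The sketch below does this in plain JavaScript; toLocationDocs and the idea of copying speler_id and naam onto each sample are assumptions for illustration, not something prescribed above:

```javascript
// Hypothetical helper: one output document per location sample,
// with "locatie" at the top level so it can be indexed.
function toLocationDocs(player) {
  return player.seconds.flat().map((sample) => ({
    speler_id: player.speler_id,
    naam: player.naam,
    locatie: sample.locatie,
    timestamp: sample.timestamp,
  }));
}

// Trimmed copy of the document from the question.
const player = {
  speler_id: 12,
  naam: "Arjen Robben",
  seconds: [[
    { locatie: [8.717330728618137, 33.278484381621425],
      timestamp: new Date("1970-01-01T19:00:01.000Z") },
    { locatie: [80.55037103774447, 10.050004884397358],
      timestamp: new Date("1970-01-01T19:00:03.000Z") },
  ]],
};

const locationDocs = toLocationDocs(player);
console.log(locationDocs.length); // 2
console.log(locationDocs[0].locatie); // [ 8.717330728618137, 33.278484381621425 ]
```

Documents of this shape, stored in their own collection, could then carry a geospatial index on locatie (for example a 2d index for legacy coordinate pairs), which is what makes indexed queries and $geoNear usable.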

Removing a (sub) array in MongoDB using $pull

So... I'm evaluating MongoDB for managing a bit of my JSON back end. I'm totally new to it and I had one problem that was just messy to do in code, so I thought, heck, let me check whether it's time to finally start using Mongo.
I have a data structure that is approximately like this:
[
{
"_id" : ObjectId("526f59ee82f2e293f9833c54"),
"humans" : [
{
"serviceUsers" : [
{
"foo1" : "bar2",
"foo2" : "bar3"
},
{
"foo1" : "baz2",
"foo2" : "baz3"
}
]
}
]
}
]
And now I want to remove any serviceUsers array elements that have "foo1" equal to "baz2" so that ideally I would end up with this:
[
{
"_id" : ObjectId("526f59ee82f2e293f9833c54"),
"humans" : [
{
"serviceUsers" : [
{
"foo1" : "bar2",
"foo2" : "bar3"
}
]
}
]
}
]
I figured that $pull was the place to start. And I tried a bunch of contortions. If I'm in collection mytests, I tried
db.mytests.update({"humans.serviceUsers.foo1":"baz2"}, {$pull:{"humans.serviceUsers" : {"foo1":"baz2"}}}, {multi: true})
Which to my admittedly naive eye, seems like it should follow the $pull syntax:
db.collection.update( { field: <query> }, { $pull: { field: <query> } } );
Mongo doesn't complain. But it doesn't change the collection in any way, either.
I also tried
db.mytests.update({}, {$pull:{"humans.serviceUsers" : {"foo1":"baz2"}}}, {multi: true})
Which also failed.
Any suggestions are greatly appreciated.
Since humans is also an array, you should use the positional $ operator to access the serviceUsers array of the matched humans element:
db.mytests.update({ "humans.serviceUsers.foo1" : "baz2" },
{ $pull: { "humans.$.serviceUsers" : { "foo1": "baz2" }}});
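Keep in mind that the positional $ operator addresses only the first humans element that matched the query. To check what the update is expected to produce, the intended result can be modelled in plain JavaScript; pullServiceUsers is a hypothetical helper that, unlike the positional update, filters every humans element:

```javascript
// Plain-JavaScript model of the intended result (not a MongoDB API):
// drop every serviceUsers entry for which `matches` returns true.
function pullServiceUsers(doc, matches) {
  return {
    ...doc,
    humans: doc.humans.map((human) => ({
      ...human,
      serviceUsers: human.serviceUsers.filter((user) => !matches(user)),
    })),
  };
}

const before = {
  humans: [
    {
      serviceUsers: [
        { foo1: "bar2", foo2: "bar3" },
        { foo1: "baz2", foo2: "baz3" },
      ],
    },
  ],
};

const after = pullServiceUsers(before, (user) => user.foo1 === "baz2");
console.log(JSON.stringify(after.humans[0].serviceUsers));
// [{"foo1":"bar2","foo2":"bar3"}]
```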

Mongodb array structure

There's something here I can't quite figure out.
When I attempt to query an object with several fields, I get no results. The object structure looks like this:
{
"_id" : ObjectId("4d8b55f017a7303b0b000000"),
"title" : "Apollo",
"body" : "A spaceflight mission to the moon",
"tags" : [ [ "moon", "space", "nasa", "mission" ] ]
}
This is my query:
db.test.find({ tags: { $all: ['moon', 'mission'] } })
However, I do get a result after creating a new object with a single field:
{
"_id" : ObjectId("4d8b9e5935037b3c8228709c"),
"tags" : [ "apple", "banana", "pear" ]
}
... with the same query as the one above.
['tags'] isn't nested inside any other array, so why does it not return my search queries? Please enlighten me.
Sincerely,
Why
Why are you using a nested array
"tags" : [ [ "moon", "space", "nasa", "mission" ] ]
here?
This does not make any sense.
db.test.find({ tags: { $all: [ ['moon', 'mission'] ] } })
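Why the extra level of nesting matters can be seen with a simplified model of $all for plain string values: every requested value has to be a direct element of the array field. The function containsAll below is only that simplified model, written for illustration, and not how the server implements the operator:

```javascript
// Simplified model of $all for scalar values: every wanted value must be
// a direct element of the array field.
function containsAll(arrayField, wanted) {
  return wanted.every((value) => arrayField.includes(value));
}

const nestedTags = [["moon", "space", "nasa", "mission"]];
const flatTags = nestedTags.flat(); // [ 'moon', 'space', 'nasa', 'mission' ]

// The only direct element of nestedTags is an array, so the strings are not found.
console.log(containsAll(nestedTags, ["moon", "mission"])); // false
console.log(containsAll(flatTags, ["moon", "mission"])); // true
```

Storing the tags as a flat array, as in the second object from the question, is what lets the original $all query match.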