So I have data that looks like this:
{
_id: 1,
ranking: 5,
tags: ['Good service', 'Clean room']
}
Each of these represents a review. There can be multiple reviews with a ranking of 5. The tags field can hold up to 4 different tags.
The 4 tags are: 'Good service', 'Good food', 'Clean room', 'Need improvement'
I want to write a MongoDB aggregate query that says: "for each ranking (1-5), give me the number of times each tag occurred."
So an example result might look like this, _id being the ranking:
[
  { _id: 5,
    totalCount: 5,
    tags: {
      goodService: 1,
      goodFood: 3,
      cleanRoom: 1,
      needImprovement: 0
    }
  },
  { _id: 4,
    totalCount: 7,
    tags: {
      goodService: 0,
      goodFood: 2,
      cleanRoom: 3,
      needImprovement: 0
    }
  },
  ...
]
I'm having trouble counting the occurrences of each tag. Any help would be appreciated.
You can try below aggregation.
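db.colname.aggregate([
  // Emit one document per (review, tag) pair
  {"$unwind":"$tags"},
  // Count the occurrences of each tag within each ranking
  {"$group":{
    "_id":{
      "ranking":"$ranking",
      "tags":"$tags"
    },
    "count":{"$sum":1}
  }},
  // Collect the per-tag counts under their ranking
  {"$group":{
    "_id":"$_id.ranking",
    "totalCount":{"$sum":"$count"},
    "tags":{"$push":{"tags":"$_id.tags","count":"$count"}}
  }}
])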
To get key-value pairs instead of an array, you can replace $push with $mergeObjects, which is available from version 3.6.
"tags":{"$mergeObjects":{"$arrayToObject":[[["$_id.tags","$count"]]]}}
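Note that neither form includes tags that never occurred for a ranking, while the desired output in the question shows them with a count of 0. One option is to fill those in on the client. Below is a minimal sketch in plain JavaScript, assuming the $push form of the result and assuming the four tags from the question are the complete list; ALL_TAGS, toKey and fillMissingTags are illustrative names, not part of MongoDB.

```javascript
// The four known tags from the question (assumed to be the complete list).
const ALL_TAGS = ['Good service', 'Good food', 'Clean room', 'Need improvement'];

// Hypothetical helper: 'Good service' -> 'goodService'.
function toKey(tag) {
  return tag
    .split(' ')
    .map((word, i) => {
      const lower = word.toLowerCase();
      return i === 0 ? lower : lower[0].toUpperCase() + lower.slice(1);
    })
    .join('');
}

// Hypothetical helper: takes one result document from the $push version of
// the pipeline and returns it with `tags` as an object that has a key for
// every known tag, defaulting to 0.
function fillMissingTags(group) {
  const tags = {};
  for (const tag of ALL_TAGS) tags[toKey(tag)] = 0;
  for (const entry of group.tags) tags[toKey(entry.tags)] = entry.count;
  return { _id: group._id, totalCount: group.totalCount, tags };
}
```

In the shell this could be applied with something like db.colname.aggregate([...]).toArray().map(fillMissingTags).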
I want to rename tags in our documents' tags array, e.g. change all tags a in the collection to c. The documents look something like this:
[ { _id: …, tags: ['a', 'b', 'c'] },
{ _id: …, tags: ['a', 'b'] },
{ _id: …, tags: ['b', 'c', 'd'] } ]
I need to keep tags unique. This means an update like this will not work, because the first document would end up containing tag c twice:
db.docs.update(
{ tags: 'a' },
{ $set: { 'tags.$': 'c' } }
)
So I tried this instead:
db.docs.update(
{ tags: 'a' },
{
$pull: { tags: 'a' },
$addToSet: { tags: 'c' }
}
)
But this gives a MongoError: Cannot update 'tags' and 'tags' at the same time.
Any chance of renaming the tags with one single update?
According to the official MongoDB documentation, there is no way of expressing a "replace" operation on a set of elements. So I guess there isn't a way to do this in a single update.
Update:
After some more investigation, I came across this document. If I understand it correctly, your query should look like this:
db.docs.update({
tags: 'a'
}, {
$set: { 'tags.$': 'c'}
})
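Here 'tags.$' selects the first element in the "tags" array that matches the query, so it replaces the first occurrence of 'a' with 'c'. As I understand it, your "tags" array does not contain duplicates, so the first match will be the only match. Note, however, that this does not prevent a duplicate 'c' when the array already contains 'c', which is the case the question is worried about.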
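If a single statement is not a hard requirement, the "Cannot update 'tags' and 'tags' at the same time" error can be sidestepped by issuing the two operators as separate updates. A minimal sketch, assuming `coll` is a mongo shell collection such as db.docs whose update() accepts a { multi: true } option; renameTag is an illustrative helper, not a MongoDB API.

```javascript
// Rename a tag in two steps so the array never ends up with a duplicate.
// `coll` is assumed to be a mongo shell collection such as db.docs.
function renameTag(coll, from, to) {
  // Step 1: add the new tag wherever the old one is present.
  // $addToSet leaves documents that already contain `to` unchanged.
  coll.update({ tags: from }, { $addToSet: { tags: to } }, { multi: true });
  // Step 2: remove the old tag from every document that still has it.
  coll.update({ tags: from }, { $pull: { tags: from } }, { multi: true });
}
```

Calling renameTag(db.docs, 'a', 'c') would run both updates. Between the two steps a document briefly holds both tags, and rerunning the function after an interruption should be harmless. On MongoDB 4.2 or later, an update that takes an aggregation pipeline with $setDifference and $setUnion may achieve the same in one statement, although the set operators do not guarantee element order.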
Suppose the following is the document:
{ 'a': {
'b': ['a', 'x', 'b'],
't': ['a', 'z', 'w', 't']
}
}
I want to be able to obtain the value associated with a key in the nested object. For example, in Python, I would do print(dict_name['a']['t']).
I have tried both find() and findOne() with each of the queries below:
db.my_collection.find({}, { 'a.t': 1 })
db.my_collection.find({ 'a.t': {$exists: true} })
but they haven't been returning the correct data.
How can I query for the document with 'a' as a key and then, from that document, obtain the value associated with 't', expecting ['a', 'z', 'w', 't'] to be returned?
How about this?
db.my_collection.aggregate([{"$project":{"_id":"$_id", "t":"$a.t"}}]);
On this test collection
{
"_id" : ObjectId("577ba92187630c1a06c4bcac"),
"a" : {
"b" : [
1,
2
],
"t" : [
2,
3
]
}
}
It gave me the following result:
{ "_id" : ObjectId("577ba92187630c1a06c4bcac"), "t" : [ 2, 3 ] }
You can do the following aggregation:
db.collection.aggregate([
{
$project: {
'_id': '$_id',
't': '$a.t'
}
}
])
This should give you what you are looking for.
See the documentation for $project. It basically assigns your 'a.t' array to a new field named 't' (this is what 't': '$a.t' means).
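For comparison, find() with the projection { 'a.t': 1 } from the question does return the array, only still nested under 'a', so the value can also be read off the returned document on the client. A minimal sketch in plain JavaScript; getByPath is a hypothetical helper and the _id value is made up for illustration.

```javascript
// Hypothetical helper: resolve a dot-notation path such as 'a.t'
// against a plain object, returning undefined if any step is missing.
function getByPath(doc, path) {
  return path.split('.').reduce(
    (value, key) => (value == null ? undefined : value[key]),
    doc
  );
}

// Shape returned by find({}, { 'a.t': 1 }) for the document in the question
// (the _id value here is made up for illustration).
const projected = { _id: 1, a: { t: ['a', 'z', 'w', 't'] } };

const t = getByPath(projected, 'a.t');
```

In the shell, the equivalent one-liner would be something like db.my_collection.findOne({}, { 'a.t': 1 }).a.t.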
Say Object embeds_many searched_items
Here is the document:
{"_id": { "$oid" : "5320028b6d756e1981460000" },
"searched_items": [
{
"_id": { "$oid" : "5320028b6d756e1981470000" },
"hotel_id": 127,
"room_info": [
{
"price": 10,
"amenity_ids": [
1,
2
]
},
{
"price": 160,
"amenity_ids": null
}
]
},
{
"_id": { "$oid" : "5320028b6d756e1981480000" },
"hotel_id": 161,
"room_info": [
{
"price": 400,
"amenity_ids": [4,5]
}
]
}
]
}
I want to find the "searched_items" having room_info.amenity_ids IN [2,3].
I've tried
object.searched_items.where('room_info.amenity_ids' => [2, 3])
object.searched_items.where('room_info.amenity_ids' => {'$in' => [2,3]})
with no luck.
Mongoid provides the elem_match method for searching within fields of type Array,
e.g.
class A
include Mongoid::Document
field :some_field, type: Array
end
A.create(some_field: [{id: 'a', name: 'b'}, {id: 'c', name: 'd'}])
A.elem_match(some_field: { :id.in => ["a", "c"] }) # => will return the object
Let me know if you have any other doubts.
update
class SearchedHotel
include Mongoid::Document
field :hotel_id, type: String
field :room_info, type: Array
end
SearchedHotel.create(hotel_id: "1", room_info: [{id: 1, amenity_ids: [1,2], price: 600},{id: 2, amenity_ids: [1,2,3], price: 1000}])
SearchedHotel.create(hotel_id: "2", room_info: [{id: 3, amenity_ids: [1,2], price: 600}])
SearchedHotel.elem_match(room_info: {:amenity_ids.in => [1,2]})
Mongoid::Criteria
selector: {"room_info"=>{"$elemMatch"=>{"amenity_ids"=>{"$in"=>[1, 2]}}}}
options: {}
class: SearchedHotel
embedded: false
And it returns both the records. Am I missing something from your question/requirement? If so, do let me know.
It's important to distinguish between top-level queries sent to the MongoDB server and
client-side operations on embedded-documents that are implemented by Mongoid.
This is the underlying confusion between the original question and the answer from @sandeep-kumar and associated comments.
The original question is all about the where clause on embedded documents after the query result has already been fetched.
The answer from @sandeep-kumar and its comments are all about top-level queries.
The following test covers both, showing how answers from @sandeep-kumar do work on the examples in your comments,
and also what does and does not work on your original question.
To summarize, Sandeep's answers do work for top-level queries.
Please review your code; if there are remaining problems, please post the exact Ruby code that demonstrates the problem.
For your original question, please note that "object" has already been fetched from MongoDB,
and that you can verify this by looking at the log/test.log file.
The subsequent "where" operations are all client-side execution by Mongoid.
Simple "where" clauses do work at the embedded document level.
Complex "where" clauses involving nested array values don't seem to work -
I didn't really expect Mongoid to reimplement '$in' on the client-side.
Knowing that the "object" already has the query result,
and that the association "searched_items" gives you convenient access to the embedded documents,
you can write Ruby code to select what you want as in the following test.
Hope that this helps.
test/unit/my_object_test.rb
require 'test_helper'
require 'pp'
class MyObjectTest < ActiveSupport::TestCase
def setup
MyObject.delete_all
A.delete_all
SearchedHotel.delete_all
end
test "original question with client-side where operation on embedded documents" do
doc = {"_id"=>{"$oid"=>"5320028b6d756e1981460000"}, "searched_items"=>[{"_id"=>{"$oid"=>"5320028b6d756e1981470000"}, "hotel_id"=>127, "room_info"=>[{"price"=>10, "amenity_ids"=>[1, 2]}, {"price"=>160, "amenity_ids"=>nil}]}, {"_id"=>{"$oid"=>"5320028b6d756e1981480000"}, "hotel_id"=>161, "room_info"=>[{"price"=>400, "amenity_ids"=>[4, 5]}]}]}
MyObject.create(doc)
puts
object = MyObject.first
<<-EOT.split("\n").each{|line| puts "#{line}:"; eval "pp #{line}"}
object.searched_items.where('hotel_id' => 127).to_a
object.searched_items.where(:hotel_id.in => [127,128]).to_a
object.searched_items.where('room_info.amenity_ids' => {'$in' => [2,3]}).to_a
object.searched_items.where('room_info.amenity_ids'.to_sym.in => [2,3]).to_a
object.searched_items.select{|searched_item| searched_item.room_info.any?{|room_info| room_info['amenity_ids'] && !(room_info['amenity_ids'] & [2,3]).empty?}}.to_a
EOT
end
test "A comment - top-level queries" do
A.create(some_field: [{id: 'a', name: 'b', tag_ids: [6,7,8]}, {id: 'c', name: 'd'}, tag_ids: [5,6,7]])
A.create(some_field: [{id: 'a', name: 'b', tag_ids: [1,2,3]}, {id: 'c', name: 'd'}, tag_ids: [2,3,4]])
puts
pp A.where('some_field.tag_ids'.to_sym.in => [2,3]).to_a
pp A.elem_match(some_field: { :tag_ids.in => [2,3,4] }).to_a
end
test "SearchedHotel comment - top-level query" do
s = <<-EOT
[#<SearchedHotel _id: 53253c246d756e49a7030000, hotel_id: \"1\", room_info: [{\"id\"=>1, \"amenity_ids\"=>[1, 2], \"price\"=>600}, {\"id\"=>2, \"amenity_ids\"=>[1, 2, 3], \"price\"=>1000}]>, #<SearchedHotel _id: 53253c246d756e49a7040000, hotel_id: \"2\", room_info: [{\"id\"=>3, \"amenity_ids\"=>[1, 2], \"price\"=>600}]>]
EOT
a = eval(s.gsub('#<SearchedHotel ', '{').gsub(/>,/, '},').gsub(/>\]/, '}]').gsub(/_id: \h+, /, ''))
SearchedHotel.create(a)
puts
<<-EOT.split("\n").each{|line| puts "#{line}:"; eval "pp #{line}"}
SearchedHotel.elem_match(room_info: {:amenity_ids.in => [1,2]}).to_a
EOT
end
end
$ ruby -Ilib -Itest test/unit/my_object_test.rb
Run options:
# Running tests:
[1/3] MyObjectTest#test_A_comment_-_top-level_queries
[#<A _id: 5359329d7f11ba034b000002, some_field: [{"id"=>"a", "name"=>"b", "tag_ids"=>[1, 2, 3]}, {"id"=>"c", "name"=>"d"}, {"tag_ids"=>[2, 3, 4]}]>]
[#<A _id: 5359329d7f11ba034b000002, some_field: [{"id"=>"a", "name"=>"b", "tag_ids"=>[1, 2, 3]}, {"id"=>"c", "name"=>"d"}, {"tag_ids"=>[2, 3, 4]}]>]
[2/3] MyObjectTest#test_SearchedHotel_comment_-_top-level_query
SearchedHotel.elem_match(room_info: {:amenity_ids.in => [1,2]}).to_a:
[#<SearchedHotel _id: 5359329d7f11ba034b000003, hotel_id: "1", room_info: [{"id"=>1, "amenity_ids"=>[1, 2], "price"=>600}, {"id"=>2, "amenity_ids"=>[1, 2, 3], "price"=>1000}]>,
#<SearchedHotel _id: 5359329d7f11ba034b000004, hotel_id: "2", room_info: [{"id"=>3, "amenity_ids"=>[1, 2], "price"=>600}]>]
[3/3] MyObjectTest#test_original_question_with_client-side_where_operation_on_embedded_documents
object.searched_items.where('hotel_id' => 127).to_a:
[#<SearchedItem _id: 5359329d7f11ba034b000006, hotel_id: 127, room_info: [{"price"=>10, "amenity_ids"=>[1, 2]}, {"price"=>160, "amenity_ids"=>nil}]>]
object.searched_items.where(:hotel_id.in => [127,128]).to_a:
[#<SearchedItem _id: 5359329d7f11ba034b000006, hotel_id: 127, room_info: [{"price"=>10, "amenity_ids"=>[1, 2]}, {"price"=>160, "amenity_ids"=>nil}]>]
object.searched_items.where('room_info.amenity_ids' => {'$in' => [2,3]}).to_a:
[]
object.searched_items.where('room_info.amenity_ids'.to_sym.in => [2,3]).to_a:
[]
object.searched_items.select{|searched_item| searched_item.room_info.any?{|room_info| room_info['amenity_ids'] && !(room_info['amenity_ids'] & [2,3]).empty?}}.to_a:
[#<SearchedItem _id: 5359329d7f11ba034b000006, hotel_id: 127, room_info: [{"price"=>10, "amenity_ids"=>[1, 2]}, {"price"=>160, "amenity_ids"=>nil}]>]
Finished tests in 0.089544s, 33.5031 tests/s, 0.0000 assertions/s.
3 tests, 0 assertions, 0 failures, 0 errors, 0 skips
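For readers following along in the mongo shell rather than Ruby, the client-side selection from the last line of the first test can be sketched in plain JavaScript as well. This assumes the document has already been fetched and has the shape shown in the question; itemsWithAnyAmenity is an illustrative name.

```javascript
// Illustrative helper: keep the searched_items that have at least one room
// whose amenity_ids shares an element with `wanted`.
// Rooms whose amenity_ids is null (or not an array) are skipped.
function itemsWithAnyAmenity(doc, wanted) {
  return doc.searched_items.filter(item =>
    item.room_info.some(room =>
      Array.isArray(room.amenity_ids) &&
      room.amenity_ids.some(id => wanted.includes(id))
    )
  );
}
```

As with the Ruby select, this runs entirely on the client and does not send a query to the server.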
Suppose I insert a set of documents, each with an array field. I would like to find all documents whose array field is a subset of a query array. For example, if I have the following documents,
collection.insert([
{
'name': 'one',
'array': ['a', 'b', 'c']
},
{
'name': 'two',
'array': ['b', 'c', 'd']
},
{
'name': 'three',
'array': ['b', 'c']
}
])
and I query collection.find({'array': {'$superset': ['a', 'b', 'c']}}), I would expect to see documents one and three, as ['a', 'b', 'c'] and ['b', 'c'] are both subsets of ['a', 'b', 'c']. In other words, I'd like to do the inverse of Mongo's $all query, which selects all documents such that the query array is a subset of the document's array field. Is this possible? And if so, how?
In MongoDB, for an array field:
"$in:[...]" means "intersection" or "any element in" (the field shares at least one element with the query array),
"$all:[...]" means "subset" or "contain" (the query array is a subset of the field),
"$elemMatch:{...}" means "any element match",
"$not:{$elemMatch:{$nin:[...]}}" means "superset" or "in" (the query array is a superset of the field).
There is a simple way to do this with the aggregation framework or with a find query.
The find query is simple, but you have to use the $elemMatch operator:
> db.collection.find({array:{$not:{$elemMatch:{$nin:['a','b','c']}}}}, {_id:0,name:1})
Note that this says we do not want to match any array that has an element which is none of 'a', 'b', or 'c'. I added a projection, which is optional, that returns only the name field of the resulting documents.
To do this within the context of aggregation, you can use $setIsSubset:
db.collection.aggregate([
// Project the original doc and a new field that indicates if array
// is a subset of ['a', 'b', 'c']
{$project: {
doc: '$$ROOT',
isSubset: {$setIsSubset: ['$array', ['a', 'b', 'c']]}
}},
// Filter on isSubset
{$match: {isSubset: true}},
// Project just the original docs
{$project: {_id: 0, doc: 1}}
])
Note that $setIsSubset was added in MongoDB 2.6.
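The double negation in the find query ("no element that is not in the list") can be easier to see in plain code. The following is a minimal in-memory sketch of the same predicate, run against the three documents from the question; isSubsetOf is an illustrative name and this does not talk to MongoDB.

```javascript
// True when every element of `array` is in `allowed`,
// i.e. there is no element that is "not in" the allowed list.
function isSubsetOf(array, allowed) {
  return !array.some(element => !allowed.includes(element));
}

const docs = [
  { name: 'one', array: ['a', 'b', 'c'] },
  { name: 'two', array: ['b', 'c', 'd'] },
  { name: 'three', array: ['b', 'c'] },
];

const matches = docs
  .filter(doc => isSubsetOf(doc.array, ['a', 'b', 'c']))
  .map(doc => doc.name);

console.log(matches); // [ 'one', 'three' ]
```

One consequence of this formulation is that an empty array counts as a subset. The $not form of the find query should also match documents that have no array field at all, whereas $setIsSubset expects both operands to be arrays, so the two approaches may differ on such documents.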