I have a problem in mongodb.
I want to create an aggregation whose result looks like this:
A 10
B 2
C 4
D 9
E 3
...
I have a field named words in my collection, and I want to group my records by the first character of that field.
I found a solution for SQL but not for MongoDB.
I would be very grateful for your help.
You don't show what the docs in your collection look like, but you can use the aggregate collection method to do this:
// Group by the first letter of the 'words' field of each doc in the 'test'
// collection while generating a count of the docs in each group.
db.test.aggregate([{$group: {_id: {$substr: ['$words', 0, 1]}, count: {$sum: 1}}}])
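To see what that $group stage computes, here is a minimal in-memory sketch in Python. It is only an illustration, not the server-side implementation: the sample documents and the helper name count_by_first_letter are made up, since the question does not show the real documents. With pymongo, the pipeline list below is what would be passed to collection.aggregate().

```python
from collections import Counter

# Hypothetical sample documents; the real collection's shape is not shown in the question.
docs = [
    {"words": "Apple"},
    {"words": "Avocado"},
    {"words": "Banana"},
    {"words": "Cherry"},
]

# The same pipeline written as a Python list, the form pymongo's aggregate() accepts.
pipeline = [
    {"$group": {"_id": {"$substr": ["$words", 0, 1]}, "count": {"$sum": 1}}}
]


def count_by_first_letter(documents):
    """Emulate the $group stage in memory: count documents per first character."""
    return Counter(doc["words"][:1] for doc in documents)


counts = count_by_first_letter(docs)
for letter in sorted(counts):
    print(letter, counts[letter])
```

The aggregation itself does not guarantee any output order, so append a {"$sort": {"_id": 1}} stage if the letters should come back alphabetically.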
I have data like this: DATA. I am trying to group by domain name, and I want the result to look like this:
[
{ "domain": "gmail_com_", "A": 3, "B": 5, "C": 3 },
............
]
Where A and B are the lengths of the lists for the matching domain names, and C is the number of duplicated IP addresses. But as you can see in the result, if a domain name is present at more than two different timestamps, it is only grouped using the first two. I want to group two by two with all the possibilities: in my example, facebook is present at 3 different timestamps, so we should get three different pairs. Can someone help me?
Thanks.
To get every possible pair of two discrete values from a series of documents, you will need to:
gather the values into an array
assign an index of some sort to identify each
duplicate the array
unwind both duplicates
eliminate pairs with the same index
An aggregation pipeline might look like so:
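db.collection.aggregate([
  // Gather every ip seen for each domain into one array.
  {$group: {
    _id: "$domain",
    list: {$push: "$ip"}
  }},
  // Attach a running index to each ip: a is the numbered list, c the counter.
  {$project: {
    numberedList: {
      $reduce: {
        input: "$list",
        initialValue: {a: [], c: 0},
        in: {
          a: {$concatArrays: ["$$value.a", [{ip: "$$this", idx: "$$value.c"}]]},
          c: {$add: ["$$value.c", 1]}
        }
      }
    }
  }},
  // Duplicate the numbered array (the .a field, not the whole {a, c} object).
  {$project: {
    left: "$numberedList.a",
    right: "$numberedList.a"
  }},
  {$unwind: "$left"},
  {$unwind: "$right"},
  // Drop pairs of an element with itself; use $lt instead of $ne to keep each unordered pair once.
  {$match: {$expr: {$ne: ["$left.idx", "$right.idx"]}}}
])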
This should leave you with the domain name in _id and a pair of results in left and right, which you can then process as needed.
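The pairing logic can also be sketched outside the database. The following Python example is an assumption-laden illustration rather than a translation of the pipeline: the IP values and the function name index_pairs are hypothetical, and itertools.permutations stands in for the "duplicate, unwind twice, drop equal indexes" steps.

```python
from itertools import permutations

# Hypothetical IPs already gathered for one domain (the $group/$push step).
ips = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def index_pairs(values):
    """Return every (left, right) pair whose positions differ."""
    numbered = list(enumerate(values))  # assign an index to each value
    # permutations() never pairs a position with itself, which plays the
    # role of the final $match on left.idx != right.idx.
    return [(left, right) for (_, left), (_, right) in permutations(numbered, 2)]


pairs = index_pairs(ips)
print(len(pairs))  # 3 values give 3 * 2 = 6 ordered pairs
```

If each pair should appear only once regardless of order (three pairs for three values), itertools.combinations does that; it corresponds to comparing the indexes with $lt rather than $ne.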
I have to return 2 documents from a single query. The first is the document whose value I give in the query, and the second is the one just before it in sorted order.
I am able to write both queries separately. The code below gives separate outputs.
db.collection.find({'_id':'value1'})
db.collection.find({'_id': {'$lt': 'value1'}}).sort({'_id':-1}).limit(1)
How can I combine them, so that when I execute the query from my application it returns both documents?
I also want to fetch only a specific key instead of the entire document.
You can use $lte instead of $lt and limit the result to 2; logically it is the same operation:
db.collection.find({ _id: { $lte: 'value1' } }, { _id: 1, yourKey: 1 }).sort({_id: -1}).limit(2)
EDIT: to get specific keys, specify them as the second argument of .find()
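A small in-memory sketch in Python shows what the combined query returns. The documents and the helper name target_and_previous are hypothetical; the function only mimics the filter, descending sort, and limit of 2, and is not how the server executes the query.

```python
# Hypothetical documents keyed by a string _id.
docs = [
    {"_id": "value0", "yourKey": "a"},
    {"_id": "value1", "yourKey": "b"},
    {"_id": "value2", "yourKey": "c"},
]


def target_and_previous(documents, target_id):
    """Mimic find({_id: {$lte: target}}).sort({_id: -1}).limit(2) in memory."""
    matching = [d for d in documents if d["_id"] <= target_id]
    matching.sort(key=lambda d: d["_id"], reverse=True)
    return matching[:2]


result = target_and_previous(docs, "value1")
print([d["_id"] for d in result])  # ['value1', 'value0']
```

One caveat: if no document has the given _id, the $lte form returns the two documents before it, so the caller may want to check that the first result's _id equals the requested value.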
I am using aggregation pipelines in pymongo to query data loaded from a JSON file.
I have one list, "sixcities", containing the 6 'cities' with the 'highest count' of book shops, i.e. the most book shops (it contains 6 documents returned by pymongo).
{'_id': 'city1', 'count': 84}
{'_id': 'city2', 'count': 65}
{'_id': 'city3', 'count': 61}
{'_id': 'city4', 'count': 59}
{'_id': 'city5', 'count': 84}
{'_id': 'city6', 'count': 64}
I have a second list, "travelcities", with the counts of travel book shops in each of the 'cities' (20+) in the JSON file (it contains 20+ documents returned by pymongo).
{'_id': 'city1', 'count': 42}...etc
Please note: this list holds cities that do not feature in the first list.
I would like to use these lists to calculate the ratios of travel book shops in the 6 highest count cities.
The common key will be 'city' as this appears in documents of both lists
i.e. in list 2 : city1: 42 divided by in list 1: city1: 84 = 0.5 ratio
I am unsure of how to do this in pymongo as the information is in mongo documents within a list.
I thought some kind of nested loop would work:
dict={}
for i in sixcities: #loop through the first list
    dict[i["_id"]]=i["count"]
    for i in travelcities: #loop through second list
        dict[i["_id"]]=i["count"]/(dict[i["_id"]]) #ratio
But I am getting the following result:
KeyError: 'city15'
This city does not appear in the first list as one of the 6 with the most bookshops, but it does appear in the second as containing a travel bookshop.
Any and all help is appreciated.
One of the problems in your code is that you are using the same variable 'i' in both the outer and the inner loop.
Consider this code, which for each city in the first list searches for it in the second list and then computes the ratio.
ratios = {}  # avoid naming it dict, which shadows the built-in type
for i in sixcities:  # loop through the first list
    ratios[i["_id"]] = i["count"]
    for j in travelcities:  # loop through the second list
        if j["_id"] == i["_id"]:
            ratios[i["_id"]] = j["count"] / ratios[i["_id"]]  # ratio
Note that if a city does not exist in the second list, its entry remains the count of that city from the first list. Handle this corner case in whatever way you want.
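An alternative sketch avoids the nested loop by first indexing the second list by city. The counts for city1 and city2 come from the question; the count for city15 and the function name travel_ratios are made up for illustration, and treating a missing city as 0 travel shops is an assumption you may want to change.

```python
sixcities = [
    {"_id": "city1", "count": 84},
    {"_id": "city2", "count": 65},
]
travelcities = [
    {"_id": "city1", "count": 42},
    {"_id": "city15", "count": 7},  # hypothetical count; not in the top list
]


def travel_ratios(top_cities, travel_cities):
    """Return {city: travel shops / all shops} for the cities in top_cities."""
    # Index the travel counts by city so each lookup is a single dict access.
    travel_counts = {doc["_id"]: doc["count"] for doc in travel_cities}
    ratios = {}
    for city in top_cities:
        # Assumption: a city missing from the travel list has 0 travel shops.
        travel = travel_counts.get(city["_id"], 0)
        ratios[city["_id"]] = travel / city["count"]
    return ratios


print(travel_ratios(sixcities, travelcities))
```

Because the loop runs over the first list only, cities such as city15 that appear only in the second list never cause a KeyError.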
I have an array of 10 unique Object IDs named Arr
I have 10,000 documents in a collection named xyz.
How can I find documents using Object IDs in the array Arr from the collection xyz with only one request?
There are the $all and $in operators, but as far as I can tell they are used to query fields that hold an array.
Or do I need to make as many requests as there are IDs in Arr and fetch each document individually using findOne?
EDIT:
I'm expecting something like this:
db.getCollection("xyz").find({"_id" : [array containing 10 unique IDs]})
....for which the result callback will contain an array of all the matched IDs of query array.
According to the documentation here: https://docs.mongodb.com/manual/reference/operator/query/in/
You should use the following query:
db.getCollection("xyz").find({"_id" : { $in: Arr }});
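For illustration, here is how that filter looks when built in Python, together with a tiny in-memory check of what $in matches. The helper names build_id_filter and matches_in are hypothetical, and plain strings stand in for the IDs; with pymongo the filter would be passed to collection.find(), and if the stored _id values are ObjectIds, the list must contain ObjectId instances rather than their string forms.

```python
def build_id_filter(ids):
    """Build the query document that matches any _id in the given list."""
    return {"_id": {"$in": list(ids)}}


def matches_in(document, query):
    """In-memory check supporting only the {"_id": {"$in": [...]}} shape."""
    return document["_id"] in query["_id"]["$in"]


# Stand-in data: strings play the role of the IDs here.
arr = ["id3", "id7"]
collection = [{"_id": f"id{n}"} for n in range(10)]

query = build_id_filter(arr)
found = [doc for doc in collection if matches_in(doc, query)]
print(found)  # [{'_id': 'id3'}, {'_id': 'id7'}]
```

A single request with $in returns all matching documents at once, so there is no need for one findOne call per ID.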
I need help. Is there any method available to fetch documents within a range of indexes when using find in MongoDB, like [2:10] (from 2 to 10)?
If you are talking about the "index" position within an array in your document then you want the $slice operator. The first argument being the index to start with and the second is how many to return. So from a 0 index position 2 is the "third" index:
db.collection.find({},{ "list": { "$slice": [ 2, 8 ] } })
Within a collection itself, use the .skip() and .limit() modifiers to move through a range of documents:
db.collection.find({}).skip(2).limit(8)
Keep in mind that in the collection context MongoDB has no concept of "ordered" records, so the range you get depends on the query and/or the sort order that is given.
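The arithmetic behind both forms is the same: a Python-style range [start:stop] becomes a skip of start and a count of stop - start. A minimal sketch follows, with slice_to_skip_limit as a hypothetical helper name and a plain list standing in for a sorted result set.

```python
def slice_to_skip_limit(start, stop):
    """Convert a [start:stop] range into (skip, limit) values."""
    if start < 0 or stop < start:
        raise ValueError("expected 0 <= start <= stop")
    return start, stop - start


# A plain list stands in for the documents in a fixed sort order.
rows = list(range(20))
skip, limit = slice_to_skip_limit(2, 10)
print(skip, limit)  # 2 8
assert rows[skip:skip + limit] == rows[2:10]
```

The same pair of numbers serves as the two arguments of $slice for an array field, or as the values given to .skip() and .limit() for documents in a collection.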