Sort mongodb query on multiple fields - mongodb

I would like to sort a MongoDB query that searches for bloggers.
Here is the (simplified) document structure of a Blogger:
{
posts : {
hashtags : [{
hashtag : String,
weight : Number
}]
},
globalMark : Number
}
People can search for bloggers through a text input. E.g., they can type "fashion travel" and click the search button.
As a result, I would like to show the bloggers whose posts contain hashtags matching /fashion/i and /travel/i, sorted by relevancy. The relevancy depends on the globalMark and the hashtag weight.
I know how to return them while ignoring the hashtag weight, but I don't know how to include this weight in my query.
Here is my current query:
Blogger.find({
"$and" : [{
"posts.hashtags.hashtag" : {$regex: /fashion/i}
}, {
"posts.hashtags.hashtag" : {$regex: /travel/i}
}]
})
.sort("-globalMark")
How can I handle this weight?
Big thanks!

First: MongoDB is not designed for LIKE-style searching (if your data set is big, you will see some latency).
Second: the hashtags value needs to be an object or an array of objects.
Third: you can use
db.collectionName.find({
$and: [
{"posts.hashtags.hashtag" : {$regex: /fashion/i}},
{"posts.hashtags.hashtag" : {$regex: /travel/i}}
]
}).sort({
globalMark: -1, "posts.hashtags.weight": -1
})

I solved it using MongoDB aggregation and the $project pipeline stage. Basically, $project lets you reshape the result document, so I used it to build a score from the different fields and then sorted the aggregation on that score.
Here is the doc:
https://docs.mongodb.com/manual/reference/operator/aggregation/project/
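The answer above does not show the pipeline itself. A minimal sketch of what such a score-based aggregation might look like follows. The scoring formula (globalMark plus the summed weights of the matching hashtags), the score field name and the helper buildRelevancePipeline are assumptions made for illustration, not the author's actual code. $regexMatch requires MongoDB 4.2 or later.

```javascript
// Hypothetical helper: builds an aggregation pipeline that ranks bloggers by
// globalMark plus the weights of the hashtags matching any search term.
// Assumes the terms contain no regex metacharacters.
function buildRelevancePipeline(terms) {
  const anyTerm = terms.join("|"); // e.g. "fashion|travel"
  return [
    // Keep only bloggers that match every term, as in the original find().
    {
      $match: {
        $and: terms.map((term) => ({
          "posts.hashtags.hashtag": { $regex: term, $options: "i" },
        })),
      },
    },
    // Build the score field (assumed formula, adjust to your own relevancy).
    {
      $project: {
        posts: 1,
        globalMark: 1,
        score: {
          $add: [
            "$globalMark",
            {
              $sum: {
                $map: {
                  // Only the hashtags that match one of the search terms.
                  input: {
                    $filter: {
                      input: "$posts.hashtags",
                      as: "h",
                      cond: {
                        $regexMatch: { input: "$$h.hashtag", regex: anyTerm, options: "i" },
                      },
                    },
                  },
                  as: "h",
                  in: "$$h.weight",
                },
              },
            },
          ],
        },
      },
    },
    // Highest score first.
    { $sort: { score: -1 } },
  ];
}

const relevancePipeline = buildRelevancePipeline(["fashion", "travel"]);
console.log(JSON.stringify(relevancePipeline, null, 2));
```

The resulting array could then be passed to Blogger.aggregate(relevancePipeline) in place of the find().sort() call from the question.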


Exporting nested data from a MongoDB

I am trying to export nested fields from MongoDB to a CSV.
From the code below, I would like to extract the scale name (e.g. “Security” & “Power”) and the raw_score (e.g. 2 & 3, respectively) fields. These four fields would be stored in four columns in a CSV, where each column is an extracted field.
"results" : {
"scales" : [
{
"scale" : {
"name" : "Security",
"code" : "SEC",
"multiplier" : 1
},
"raw_score" : 2
},
{
"scale" : {
"name" : "Power",
"code" : "POW",
"multiplier" : -1
},
"raw_score" : 3
}
],
In the past I have been successful using dot notation to extract nested fields (a working example from a previous extraction is below), yet I am unsure how to extract fields that share the same name.
mongoexport -d production_hoganx_collector_061817 -c records --type=csv -o col_liwc_summary_061817.csv -f user_id,post_analysis.liwc_scores.tone
How can I extract the name and raw_score fields using the mongoexport command? I have tried to export the database to a JSON file and then extract the data via R, however this method takes too long to complete.
If mongoexport is not suitable, I am open to hearing alternatives!
Many thanks,
I'm assuming this is a one-time thing, so I suggest using an aggregate to build a new collection with the scales array unwound.
$unwind fans out a document into n documents, where n is the number of elements in the array field being unwound. So, for example, if you had a document like this one:
{
name: "Some name",
email: ["somename@somedomain.com", "name@someotherdomain.com"]
}
An unwind on the email field would result in two documents:
{
name: "Some name",
email: "somename@somedomain.com"
},
{
name: "Some name",
email: "name@someotherdomain.com"
}
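To make the fan-out concrete, here is a small plain JavaScript model of what $unwind does to a single document. It is not MongoDB code; the unwind function is a hypothetical stand-in that only mimics the behaviour for a top-level array field.

```javascript
// Plain JavaScript model of $unwind for a top-level array field:
// one output document per array element, all other fields copied as-is.
function unwind(doc, field) {
  return doc[field].map((element) => ({ ...doc, [field]: element }));
}

const unwound = unwind(
  { name: "Some name", email: ["somename@somedomain.com", "name@someotherdomain.com"] },
  "email"
);
console.log(unwound);
```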
So in your case I think you should use that to unwind your scales field like this:
db.collection.aggregate([
{$match: yourCondition},
{$unwind: "$results.scales"},
{$project: {
_id: false,
"results.scales": true,
... other fields ...
}},
{$out: "unwindedcollection"}
]);
At this point you should be able to run mongoexport on the newly generated collection (unwindedcollection), using the dot notation you used before.
Be sure to set _id to false; otherwise you'll end up with a duplicate _id error. Because that field is not projected, new ids are created when the aggregate results are inserted into the new collection.
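Applied to the document shape from the question, where the array lives under results.scales, the pipeline might be sketched as follows. The output field names scale_name and raw_score, the assumption that user_id is a top-level field, and the helper buildScalesExportPipeline are all illustrative choices rather than anything stated in the question.

```javascript
// Hypothetical helper: produces one flat output document per scale, so the
// fields are easy to list in mongoexport's -f option afterwards.
function buildScalesExportPipeline(outCollection) {
  return [
    { $unwind: "$results.scales" },
    {
      $project: {
        _id: false, // let the new collection generate fresh ids
        user_id: true, // assumed top-level field, as in the earlier export
        scale_name: "$results.scales.scale.name",
        raw_score: "$results.scales.raw_score",
      },
    },
    { $out: outCollection },
  ];
}

const exportPipeline = buildScalesExportPipeline("unwindedcollection");
console.log(JSON.stringify(exportPipeline, null, 2));
// In the shell: db.records.aggregate(exportPipeline)
```

Note that this yields one CSV row per scale (user_id, scale_name, raw_score) rather than four columns in a single row, so a reshaping step would still be needed if the wide layout is required.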
I'll leave the links to the docs of the concepts I used for this:
aggregate: https://docs.mongodb.com/manual/reference/method/db.collection.aggregate/
$project: https://docs.mongodb.com/manual/reference/operator/aggregation/project/
$unwind: https://docs.mongodb.com/manual/reference/operator/aggregation/unwind/
$out: https://docs.mongodb.com/manual/reference/operator/aggregation/out/

Mongodb: Combine $in and $nin

In my posts collection I have documents like this:
[{
_id : ObjectId("post-object-id"),
title : 'Post #1',
category_id : ObjectId("category-object-id")
}]
I need to make some queries where I select a range of posts based on their category_id (there can be multiple ids) but exclude some of them.
I've tried this query (in the shell):
db.posts.find({$and: [
{_id: { $nin : ['ObjectId("post-object-id")']}},
{category_id : { $in : ['ObjectId("category-object-id")']}}
]})
It returns 0 when I call count().
However, if I change the category_id condition, remove the $in and include just one id, it works, like this:
db.posts.find({$and: [
{_id: { $nin : ['ObjectId("58a1af81613119002d42ef06")']}},
{category_id : ObjectId("58761634bfb31efd5ce6e88d")}
]})
but this solution only enables me to find by one category.
How would I go about combining $in and $nin with ObjectIds in the same manner as above?
This will work; just remove the single quotes around ObjectId:
db.posts.find({$and: [
{_id: { $nin : [ObjectId("post-object-id")]}},
{category_id : { $in : [ObjectId("category-object-id")]}}
]})
You should not put single quotes around ObjectId(...); the quotes turn the values into plain strings, which never match the stored ObjectIds.
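When the ids arrive as hex strings (for example from a request), the same filter can be built programmatically. The sketch below is only an illustration: buildPostFilter and the stand-in converter are hypothetical names, and in real code the converter would be something like id => new ObjectId(id) from your driver. It also drops the explicit $and, since conditions on different fields in one filter document are combined with AND implicitly.

```javascript
// Hypothetical helper. `toObjectId` is whatever converts a hex string into an
// ObjectId in your environment (for example, id => new ObjectId(id)).
function buildPostFilter(toObjectId, excludedPostIds, categoryIds) {
  return {
    _id: { $nin: excludedPostIds.map(toObjectId) },
    category_id: { $in: categoryIds.map(toObjectId) },
  };
}

// Stand-in converter so the sketch runs without a driver installed.
const fakeObjectId = (hex) => ({ $oid: hex });

const postFilter = buildPostFilter(
  fakeObjectId,
  ["58a1af81613119002d42ef06"],
  ["58761634bfb31efd5ce6e88d"]
);
console.log(JSON.stringify(postFilter));
// With a real converter: db.posts.find(postFilter)
```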

How to update multiple documents with multiple conditions in MongoDB?

I have a MongoDB collection with various documents. As input, I have the ids of the documents I want to update, in arrays of cars. Something like this:
req.body = [
{ cars: [ '584cf6c126df866138a29408', '5852819e9693c27c136104bd' ],
name: 'Home' },
{ cars: [ '584d638242795854a091cbbf', '5842e09e372b786355ba50e7' ],
name: 'Office' } ]
Expected Operation
db.cars.update({_id : req.body[i].cars}, {name : req.body[i].name}, {new : true});
Expected Result
All four documents with ids are updated with their name field in the document.
Now, one way to update the cars is to apply an async.each on each cars array and another async.each on these two documents. That's the longer way of doing it. I was hoping that I could have one async.each for these two arrays and somehow cram both documents into a single query, which would make the code look more elegant.
I have already gone through several pages and still haven't found anything. Is this possible in MongoDB or not?
First, you may need to convert your car ids from strings to MongoDB ObjectIds, like this:
cars: [ ObjectId('584cf6c126df866138a29408'), ObjectId('5852819e9693c27c136104bd') ]
Then use the $in operator to match the documents.
You can try this:
var ObjectId = require('mongodb').ObjectID;
var cars = req.body[i].cars.map(function (id) {
return new ObjectId(id);
})
db.cars.update({_id : {$in: cars}},
{$set : {name : req.body[i].name}},
{multi : true},
function(err, docs) {
console.log(docs);
});
Try using MongoDB's $in operator in the filter of the update. See the query below:
Model.update({_id : {$in: req.body[i].cars}},
{$set : {name : req.body[i].name}},
{multi : true});
This way you have to run 2 queries to update the names.
See the $in documentation to learn more about its uses.
Hope this will help you.
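If the goal is to send both updates in a single round trip, one option is bulkWrite with one updateMany operation per group. The sketch below only builds the operations array; buildRenameOps and the stand-in converter are hypothetical names, and the real converter would be something like id => new ObjectId(id).

```javascript
// Hypothetical helper: turns the request body into bulkWrite operations,
// one updateMany per { cars, name } group.
function buildRenameOps(groups, toObjectId) {
  return groups.map((group) => ({
    updateMany: {
      filter: { _id: { $in: group.cars.map(toObjectId) } },
      update: { $set: { name: group.name } },
    },
  }));
}

// Stand-in converter so the sketch runs without a driver installed.
const asFakeId = (hex) => ({ $oid: hex });

const renameOps = buildRenameOps(
  [
    { cars: ["584cf6c126df866138a29408", "5852819e9693c27c136104bd"], name: "Home" },
    { cars: ["584d638242795854a091cbbf", "5842e09e372b786355ba50e7"], name: "Office" },
  ],
  asFakeId
);
console.log(JSON.stringify(renameOps, null, 2));
// With a real collection handle: collection.bulkWrite(renameOps)
```

The server still applies two separate updates, but the application issues only one call.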

How to get a specific embedded document inside a MongoDB collection? [duplicate]

I have a collection Notebook whose documents have an embedded array called notes. A sample document is shown below.
{
"_id" : ObjectId("4f7ee46e08403d063ab0b4f9"),
"name" : "MongoDB",
"notes" : [
{
"title" : "Hello MongoDB",
"content" : "Hello MongoDB"
},
{
"title" : "ReplicaSet MongoDB",
"content" : "ReplicaSet MongoDB"
}
]
}
I want to retrieve only the note which has the title "Hello MongoDB". I can't work out what the query should be. Can anyone help me?
You can do this with MongoDB versions higher than 2.2.
The query looks like this:
db.coll.find({ 'notes.title': 'Hello MongoDB' }, {'notes.$': 1});
You can also try $elemMatch, as in Justin Jenkins's answer.
Outdated answer: See the other answers.
I don't believe what you are asking is possible, at least without some map-reduce maybe?
See here: Filtering embedded documents in MongoDB
That answer suggests you change your schema, to better suit how you'd like to work with the data.
You can use either "dot notation" or $elemMatch to get back the correct document that has the matching "note title" ...
> db.collection.find({ "notes.title" : "Hello MongoDB"}, { "notes.title" : 1 });
or ...
> db.collection.find({ "notes" : { "$elemMatch" : { "title" : "Hello MongoDB"} }});
But you will get back the whole array, not just the array element that caused the match.
Also, something to think about ... with your current setup it would be hard to do any operations on the items in the array.
If you don't change your schema (as the answer linked to suggests) ... I would consider adding "ids" to each element in the array so you can do things like delete it easily if needed.
You can do this in MongoDb version 3.2+ with aggregation.
Query:
db.Notebook.aggregate(
{
$project: {
"notes": {
$filter: {
input: "$notes",
as: "note",
cond: {
$eq: [ "$$note.title", "Hello MongoDB" ]
}
}
}
}
}
)
Result:
{
"_id" : ObjectId("4f7ee46e08403d063ab0b4f9"),
"notes" : [
{
"title" : "Hello MongoDB",
"content" : "Hello MongoDB"
}
]
}
$$ is used here to access a variable; in this case, the note variable declared with as inside the $filter.
You can find additional details in the official documentation about $filter, $eq and $$.
$filter: Selects a subset of an array to return based on the specified condition. Returns an array with only those elements that match the condition. The returned elements are in the original order.
$eq: Compares two values and returns true/false when the values are equivalent or not (...).
$$: Variables can hold any BSON type data. To access the value of the variable, use a string with the variable name prefixed with double dollar signs ($$).
Note:
Justin Jenkin's answer is outdated and kop's answer here doesn't return multiple documents from the collection. With this aggregation query, you can return multiple documents if needed.
I needed this and wanted to post to help someone.
You can use $ or $elemMatch. The $ operator and the $elemMatch operator project a subset of elements from an array based on a condition.
The $elemMatch projection operator takes an explicit condition argument. This allows you to project based on a condition not in the query.
db.collection.find(
{
// <expression>
},
{
notes: {
$elemMatch: {
title: 'Hello MongoDB'
}
},
name: 1
}
)
The $ operator projects the array elements based on some condition from the query statement.
db.collection.find(
{
'notes.title': 'Hello MongoDB'
},
{
'notes.$': 1,
name: 1
}
)
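One difference worth keeping in mind: the $ and $elemMatch projections return at most the first matching array element, while $filter in an aggregation returns every match. The plain JavaScript model below illustrates this; it is not MongoDB code, and the third note is a hypothetical addition so that the two behaviours differ.

```javascript
// Sample document from the question, plus one hypothetical extra note
// that shares the searched title.
const notebook = {
  name: "MongoDB",
  notes: [
    { title: "Hello MongoDB", content: "Hello MongoDB" },
    { title: "ReplicaSet MongoDB", content: "ReplicaSet MongoDB" },
    { title: "Hello MongoDB", content: "A second note with the same title" },
  ],
};

const isMatch = (note) => note.title === "Hello MongoDB";

// Models the $ and $elemMatch projections: at most the first matching element.
const firstOnly = notebook.notes.filter(isMatch).slice(0, 1);

// Models $filter in an aggregation: every matching element, in original order.
const allMatches = notebook.notes.filter(isMatch);

console.log(firstOnly.length, allMatches.length); // prints: 1 2
```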
You can perform the query like this:
db.coll.find({ 'notes.title': 'Hello MongoDB' });
You can also refer to the docs for more details.

Get documents with tags in list, ordered by total number of matches

Given the following MongoDB collection of documents :
{
title : 'shirt one',
tags : [
'shirt',
'cotton',
't-shirt',
'black'
]
},
{
title : 'shirt two',
tags : [
'shirt',
'white',
'button down collar'
]
},
{
title : 'shirt three',
tags : [
'shirt',
'cotton',
'red'
]
},
...
How do you retrieve a list of items matching a list of tags, ordered by the total number of matched tags? For example, given this list of tags as input:
['shirt', 'cotton', 'black']
I'd want to retrieve the items ranked in desc order by total number of matching tags:
item          total matches
-----------   -------------
Shirt One     3 (matched shirt + cotton + black)
Shirt Three   2 (matched shirt + cotton)
Shirt Two     1 (matched shirt)
In a relational schema, tags would be a separate table, and you could join against that table, count the matches, and order by the count.
But, in Mongo... ?
It seems this approach could work:
1. break the input tags into multiple "IN" statements
2. query for items by "OR"'ing together the tag inputs,
   i.e. where ( 'shirt' IN items.tags ) OR ( 'cotton' IN items.tags );
   this would return, for example, three instances of "Shirt One", 2 instances of "Shirt Three", etc
3. map/reduce that output
   map: emit(this._id, {...});
   reduce: count total occurrences of _id
   finalize: sort by counted total
But I'm not clear on how to implement this as a Mongo query, or if this is even the most efficient approach.
As I answered in "In MongoDB search in an array and sort by number of matches", it's possible using the Aggregation Framework.
Assumptions
tags attribute is a set (no repeated elements)
Query
This approach forces you to unwind the results and re-evaluate the match predicate against the unwound results, so it's really inefficient.
db.test_col.aggregate(
{$match: {tags: {$in: ["shirt","cotton","black"]}}},
{$unwind: "$tags"},
{$match: {tags: {$in: ["shirt","cotton","black"]}}},
{$group: {
_id:{"_id":"$_id"},
matches:{$sum:1}
}},
{$sort:{matches:-1}}
);
Expected Results
{
"result" : [
{
"_id" : {
"_id" : ObjectId("5051f1786a64bd2c54918b26")
},
"matches" : 3
},
{
"_id" : {
"_id" : ObjectId("5051f1726a64bd2c54918b24")
},
"matches" : 2
},
{
"_id" : {
"_id" : ObjectId("5051f1756a64bd2c54918b25")
},
"matches" : 1
}
],
"ok" : 1
}
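On MongoDB 2.6 or later, the second $match and the $unwind can be avoided by counting the overlap directly with $setIntersection and $size. This is a sketch of an alternative technique, not the answer's own method; the helper buildTagMatchPipeline is a hypothetical name, and it assumes, like the answer, that tags contains no repeated elements and that no search tag starts with "$".

```javascript
// Hypothetical helper: ranks documents by how many of the search tags they
// contain, without unwinding the tags array.
function buildTagMatchPipeline(searchTags) {
  return [
    // Discard documents that share no tag with the search list.
    { $match: { tags: { $in: searchTags } } },
    // Count the tags the document has in common with the search list.
    {
      $project: {
        title: 1,
        matches: { $size: { $setIntersection: ["$tags", searchTags] } },
      },
    },
    // Most matches first.
    { $sort: { matches: -1 } },
  ];
}

const tagPipeline = buildTagMatchPipeline(["shirt", "cotton", "black"]);
console.log(JSON.stringify(tagPipeline, null, 2));
// In the shell: db.test_col.aggregate(tagPipeline)
```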
Right now, it isn't possible to do unless you use MapReduce. The only problem with MapReduce is that it is slow (compared to a normal query).
The aggregation framework is slated for 2.2 (so should be available in 2.1 dev release) and should make this sort of thing much easier to do without MapReduce.
Personally, I do not think using M/R is an efficient way to do it. I would rather query for all the documents and do those calculations on the application side. It is easier and cheaper to scale your app servers than it is to scale your database servers, so let the app servers do the number crunching. Of course, this approach may not work for you given your data access patterns and requirements.
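A minimal sketch of that application-side calculation, assuming the candidate documents have already been fetched (for example with a $in query on tags). The function name rankByMatchingTags is a hypothetical choice.

```javascript
// Counts, for each item, how many of the search tags it carries, then
// sorts the items by that count in descending order.
function rankByMatchingTags(items, searchTags) {
  const wanted = new Set(searchTags);
  return items
    .map((item) => ({
      title: item.title,
      matches: item.tags.filter((tag) => wanted.has(tag)).length,
    }))
    .filter((result) => result.matches > 0)
    .sort((a, b) => b.matches - a.matches);
}

const shirts = [
  { title: "shirt one", tags: ["shirt", "cotton", "t-shirt", "black"] },
  { title: "shirt two", tags: ["shirt", "white", "button down collar"] },
  { title: "shirt three", tags: ["shirt", "cotton", "red"] },
];

const ranked = rankByMatchingTags(shirts, ["shirt", "cotton", "black"]);
console.log(ranked.map((r) => `${r.title}: ${r.matches}`).join(", "));
// prints: shirt one: 3, shirt three: 2, shirt two: 1
```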
An even simpler approach may be to just include a count property in each of your tag objects and whenever you $push a new tag to the array, you also $inc the count property. This is a common pattern in the MongoDB world, at least until the aggregation framework.
I'll second @Bryan in saying that MapReduce is the only possible way at the moment (and it's far from perfect). But, in case you desperately need it, here you go :-)
var m = function() {
var searchTerms = ['shirt', 'cotton', 'black'];
var me = this;
this.tags.forEach(function(t) {
searchTerms.forEach(function(st) {
if(t == st) {
emit(me._id, {matches : 1});
}
})
})
};
var r = function(k, vals) {
var result = {matches : 0};
vals.forEach(function(v) {
result.matches += v.matches;
})
return result;
};
db.shirts.mapReduce(m, r, {out: 'found01'});
db.found01.find();