MongoDB $project query in C#

I need help building a MongoDB query with the C# driver. What I'm trying to do is compute a date difference in milliseconds, and then filter the results where that difference is greater than or equal to a specific number.
The mongodb query that I use in the mongo shell is:
db.getCollection('Coll').aggregate(
    [
        { $project: {
            "dateInMillis": { $subtract: [ new Date(), "$UpdateDate" ] },
            "Param2": "$Param2",
            "Param3": "$Param3" }
        },
        { $match: { dateInMillis: { $gte: 2662790910 } } }
    ],
    {
        allowDiskUse: true
    });
What would be the equivalent C# expression?
I've tried to build the query in many different ways without any result.

I finally found a way to run the aggregation through the MongoDB C# driver. I don't know if it's the most efficient way, but it works:
var project = new BsonDocument
{
    {
        "$project",
        new BsonDocument
        {
            { "dateInMillis", new BsonDocument
                {
                    { "$subtract", new BsonArray { new BsonDateTime(DateTime.UtcNow), "$UpdateDate" } }
                }
            },
            { "Param2", "$Param2" },
            { "Param3", "$Param3" },
            { "_id", 0 }
        }
    }
};
var match = new BsonDocument
{
    {
        "$match",
        new BsonDocument
        {
            { "dateInMillis", new BsonDocument { { "$gte", intervalInMilliseconds } } }
        }
    }
};
var collection = db.GetCollection<CollClass>("Coll");
var pipeline = new[] { project, match };
var resultPipe = collection.Aggregate<CollClassRS>(pipeline);
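For what it's worth, the reason $subtract works here is that subtracting two BSON dates yields their difference in milliseconds. The same arithmetic can be sanity-checked in plain JavaScript (the dates below are made up for illustration):

```javascript
// Subtracting two Date objects yields the difference in milliseconds,
// mirroring what { $subtract: [ new Date(), "$UpdateDate" ] } computes server-side.
const updateDate = new Date("2024-01-01T00:00:00Z");
const now = new Date("2024-01-31T00:00:00Z");
const dateInMillis = now - updateDate; // 30 days expressed in milliseconds
```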


[MongoDB][.NET Core] Improve performance of Count and Group By for more than 50 million records

I'm using MongoDB.Driver (.NET Core) with MongoDB. I'm aggregating data and I see that Mongo's count and group are very, very slow; my API takes around 2 minutes to respond.
Is there any way to improve the performance of the Count and Group methods, or an alternative way to write the query?
public async Task<(List<string>, List<int>, int)> CountByType()
{
    var collection = _connectionData.GetCollection<abc>("abc");
    var records_actor = await collection
        .Aggregate()
        .Group(new BsonDocument { { "_id", "$actor_type" }, { "count", new BsonDocument("$sum", 1) } })
        .Project(new BsonDocument
        {
            { "count", 1 },
            { "_id", 0 },
            { "name_type", new BsonDocument(
                "$switch", new BsonDocument(
                    "branches", new BsonArray
                    {
                        new BsonDocument
                        {
                            { "case", new BsonDocument("$eq", new BsonArray { "$_id", 2 }) },
                            { "then", "System" }
                        },
                        new BsonDocument
                        {
                            { "case", new BsonDocument("$eq", new BsonArray { "$_id", 1 }) },
                            { "then", "App" }
                        }
                    }))
            }
        })
        .Sort(new BsonDocument { { "count", -1 } })
        .ToListAsync();
}
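For reference, the fluent Group/Project/Sort calls above should assemble a pipeline roughly like this shell-style sketch (my reconstruction, not captured driver output):

```javascript
const pipeline = [
  { $group: { _id: "$actor_type", count: { $sum: 1 } } },
  { $project: {
      count: 1,
      _id: 0,
      name_type: { $switch: { branches: [
        { case: { $eq: ["$_id", 2] }, then: "System" },
        { case: { $eq: ["$_id", 1] }, then: "App" }
      ] } }
    } },
  { $sort: { count: -1 } }
];
// Every stage document holds exactly one top-level operator.
const stageOps = pipeline.map(s => Object.keys(s)[0]);
```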
I have an index on the actor_type field, but it doesn't improve anything.

Which query uses fewer resources in MongoDB?

I am getting familiar with Lucene and MongoDB Atlas Search, and I have a question about query efficiency.
Which one of those queries uses fewer resources?
If there are better queries for performing the below task, please let me know.
I want to return all movies (sample_mflix) that match on a title value. The movies must be for a specific year (should not return any movie that is not for that year), and I would like to return movies with "$gte" values for movies.awards.nominations & movies.awards.wins.
The first query seems more complex (which seems to increase resource utilization - query complexity?). It also does not restrict results to that year only, which makes me think there is probably a better way to do this with Atlas Search.
The second query uses the $search and a $match in different stages. It has a simple Lucene search (which might return more movies than the first query?), and the match operator will filter the results. The second query is more precise - from my tests, it respects the year constraint. If I apply a limit stage, would this be a better solution?
If those queries were executed in the same scenario, which one would be more efficient, and why (apologies, the second query is formatted for .net driver)?
new BsonArray
{
    new BsonDocument("$search",
        new BsonDocument
        {
            { "index", "nostoreindex" },
            { "compound",
                new BsonDocument
                {
                    // a BsonDocument cannot hold two elements with the same name,
                    // so the two "must" clauses (and the two "should" clauses)
                    // are grouped into BsonArrays
                    { "must",
                        new BsonArray
                        {
                            new BsonDocument("near",
                                new BsonDocument
                                {
                                    { "path", "year" },
                                    { "origin", 2000 },
                                    { "pivot", 1 }
                                }),
                            new BsonDocument("text",
                                new BsonDocument
                                {
                                    { "query", "poor" },
                                    { "path", "title" }
                                })
                        } },
                    { "should",
                        new BsonArray
                        {
                            new BsonDocument("range",
                                new BsonDocument
                                {
                                    { "path", "awards.nominations" },
                                    { "gte", 1 }
                                }),
                            new BsonDocument("range",
                                new BsonDocument
                                {
                                    { "path", "awards.wins" },
                                    { "gte", 1 }
                                })
                        } }
                } }
        })
}
vs.
var searchStage =
    new BsonDocument("$search",
        new BsonDocument
        {
            { "index", "nostoreindex" },
            { "text",
                new BsonDocument
                {
                    { "query", title },
                    { "path", "title" }
                } }
        });
var matchStage = new BsonDocument("$match",
    new BsonDocument("$and",
        new BsonArray
        {
            new BsonDocument("year",
                new BsonDocument("$eq", year)),
            new BsonDocument("awards.nominations",
                new BsonDocument("$gte", nominations)),
            new BsonDocument("awards.wins",
                new BsonDocument("$gte", awards))
        }));
When using Atlas Search, it is better to avoid a succeeding $match filter after your $search stage, because every matching document then has to be looked up in mongod by _id, which can be quite slow.
So, generally, you want to keep your search and filters "in Lucene" where possible, to avoid extra IO and comparisons.
In your case, you are using near, which returns all results ordered by distance from the origin. You should use range instead, which filters those results and speeds up your query.
near is used to score results higher the closer they are to a specific value, which can simulate a sort. For example, if you want to score results with a higher awards.wins value, you may wish to add near: { origin: 10000, pivot: 1 }; the closer the value is to 10000, the higher the score.
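To make the scoring concrete: the Atlas Search documentation describes the near score as pivot / (pivot + |value - origin|). A tiny sketch of that formula (the function name is mine, and this is my reading of the docs, not library code):

```javascript
// Sketch of the documented "near" scoring function:
//   score = pivot / (pivot + |value - origin|)
function nearScore(value, origin, pivot) {
  return pivot / (pivot + Math.abs(value - origin));
}
// An exact match scores 1; the score decays as the value moves away from the origin.
```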
new BsonArray
{
    new BsonDocument("$search",
        new BsonDocument
        {
            { "index", "nostoreindex" },
            { "compound",
                new BsonDocument
                {
                    // as above, clauses that repeat under the same key
                    // ("must", "should") are grouped into BsonArrays
                    { "must",
                        new BsonArray
                        {
                            new BsonDocument("range",
                                new BsonDocument
                                {
                                    { "path", "year" },
                                    { "gte", 2000 },
                                    { "lte", 2000 }
                                }),
                            new BsonDocument("text",
                                new BsonDocument
                                {
                                    { "query", "poor" },
                                    { "path", "title" }
                                })
                        } },
                    { "should",
                        new BsonArray
                        {
                            new BsonDocument("range",
                                new BsonDocument
                                {
                                    { "path", "awards.nominations" },
                                    { "gte", 1 }
                                }),
                            new BsonDocument("range",
                                new BsonDocument
                                {
                                    { "path", "awards.wins" },
                                    { "gte", 1 }
                                })
                        } }
                } }
        })
}

Find and change all date type fields in mongodb collection

I have a collection with multiple date-type fields. I know I can change them based on their keys, but is there a way to find all fields that have date as a type and change all of them in one script?
UPDATE
Many thanks to chridam for helping me out. Based on his code I came up with this solution. (Note: I have Mongo 3.2.9, and some code snippets from chridam's answer just wouldn't run. They might be valid, but they didn't work for me.)
map = function() {
    for (var key in this) {
        if (key != null && this[key] != null && this[key] instanceof Date) {
            emit(key, null);
        }
    }
}
collectionName = "testcollection_copy";
mr = db.runCommand({
    "mapreduce": collectionName,
    "map": map,
    "reduce": function() {},
    "out": "map_reduce_test" // out is required
})
dateFields = db[mr.result].distinct("_id")
printjson(dateFields)

// updating documents
db[collectionName].find().forEach(function (document) {
    for (var i = 0; i < dateFields.length; i++) {
        document[dateFields[i]] = new NumberLong(document[dateFields[i]].getTime());
    }
    db[collectionName].save(document);
});
Since projection didn't work, I used the above code to update the documents.
My only question is: why use bulkWrite?
(Also, getTime() seemed better than subtracting dates.)
An operation like this involves two tasks: one to get the list of fields with the date type via MapReduce, and the next to update the collection via the aggregation framework or bulk write operations.
NB: The following methodology assumes all the date fields are at the root level of the document, not inside embedded documents or subdocuments.
MapReduce
The first thing you need is to run the following mapReduce operation. It checks whether each property in every document in the collection is of date type and returns a distinct list of the date fields:
// define helper function to determine if a key is of Date type
isDate = function(dt) {
    return dt && dt instanceof Date && !isNaN(dt.valueOf());
}

// map function
map = function() {
    for (var key in this) {
        if (isDate(this[key]))
            emit(key, null);
    }
}

// variable with collection name
collectionName = "yourCollectionName";
mr = db.runCommand({
    "mapreduce": collectionName,
    "map": map,
    "reduce": function() {},
    "out": "map_reduce_dates" // "out" is required by the mapreduce command
})
dateFields = db[mr.result].distinct("_id")
printjson(dateFields)
// output: [ "validFrom", "validTo", "registerDate" ]
Option 1: Update collection via aggregation framework
You can use the aggregation framework to update your collection, in particular the $addFields operator available in MongoDB version 3.4 and newer. If your MongoDB server version does not support this, you can update your collection with the other workaround (as described in the next option).
The timestamp is calculated using the $subtract arithmetic aggregation operator, with the date field as minuend and the epoch date new Date("1970-01-01") as subtrahend.
The resulting documents of the aggregation pipeline are then written to the same collection via the $out operator thus updating the collection with the new fields.
In essence, you'd want to end up running the following aggregation pipeline which converts the date fields to timestamps using the above algorithm:
pipeline = [
    {
        "$addFields": {
            "validFrom": { "$subtract": [ "$validFrom", new Date("1970-01-01") ] },
            "validTo": { "$subtract": [ "$validTo", new Date("1970-01-01") ] },
            "registerDate": { "$subtract": [ "$registerDate", new Date("1970-01-01") ] }
        }
    },
    { "$out": collectionName }
]
db[collectionName].aggregate(pipeline)
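As a plain-JavaScript sanity check, subtracting the epoch date is the same as calling getTime(), which is why this pipeline and the getTime() approach in the question agree:

```javascript
// date - epoch is the client-side analogue of
// { $subtract: [ "$field", new Date("1970-01-01") ] }
const d = new Date("2016-09-28T10:30:00Z"); // an arbitrary example date
const epoch = new Date("1970-01-01");
const viaSubtract = d - epoch;
const viaGetTime = d.getTime();
```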
You can dynamically create the above pipeline array given the list of the date fields as follows:
var addFields = { "$addFields": { } },
    output = { "$out": collectionName };
dateFields.forEach(function(key) {
    var subtr = ["$" + key, new Date("1970-01-01")];
    addFields["$addFields"][key] = { "$subtract": subtr };
});
db[collectionName].aggregate([addFields, output])
Option 2: Update collection via Bulk
This option is a workaround for when the $addFields operator above is not supported. You can use a $project pipeline to create the new timestamp fields with the same $subtract implementation, but instead of writing the results to the same collection, you iterate the cursor of the aggregate results with the forEach() method and, for each document, update the collection using the bulkWrite() method.
The following example shows this approach:
ops = []
pipeline = [
    {
        "$project": {
            "validFrom": { "$subtract": [ "$validFrom", new Date("1970-01-01") ] },
            "validTo": { "$subtract": [ "$validTo", new Date("1970-01-01") ] },
            "registerDate": { "$subtract": [ "$registerDate", new Date("1970-01-01") ] }
        }
    }
]
db[collectionName].aggregate(pipeline).forEach(function(doc) {
    ops.push({
        "updateOne": {
            "filter": { "_id": doc._id },
            "update": {
                "$set": {
                    "validFrom": doc.validFrom,
                    "validTo": doc.validTo,
                    "registerDate": doc.registerDate
                }
            }
        }
    });
    if (ops.length === 500) {
        db[collectionName].bulkWrite(ops);
        ops = [];
    }
})

if (ops.length > 0)
    db[collectionName].bulkWrite(ops);
Using the same method as Option 1 above to create the pipeline and the bulk method objects dynamically:
var ops = [],
    project = { "$project": { } };
dateFields.forEach(function(key) {
    var subtr = ["$" + key, new Date("1970-01-01")];
    project["$project"][key] = { "$subtract": subtr };
});

setDocFields = function(doc, keysList) {
    setObj = { "$set": { } };
    return keysList.reduce(function(obj, key) {
        obj["$set"][key] = doc[key];
        return obj;
    }, setObj)
}

db[collectionName].aggregate([project]).forEach(function(doc) {
    ops.push({
        "updateOne": {
            "filter": { "_id": doc._id },
            "update": setDocFields(doc, dateFields)
        }
    });
    if (ops.length === 500) {
        db[collectionName].bulkWrite(ops);
        ops = [];
    }
})

if (ops.length > 0)
    db[collectionName].bulkWrite(ops);

$elemMatch equivalent in spring data mongodb

I need to know the equivalent code in Spring Data MongoDB for the query below:
db.inventory.find( {
    qty: { $all: [
        { "$elemMatch": { size: "M", num: { $gt: 50 } } },
        { "$elemMatch": { num: 100, color: "green" } }
    ] }
} )
I was able to figure it out. This can be done in Spring Data MongoDB using the following code:
Query query = new Query();
query.addCriteria(Criteria.where("qty")
    .elemMatch(Criteria.where("size").is("M").and("num").gt(50)
        .elemMatch(Criteria.where("num").is(100).and("color").is("green"))));
I think the query in your answer generates the query below:
{ "qty" : { "$elemMatch" : { "num" : 100 , "color" : "green"}}}
I think that's not what you need: it only checks the last $elemMatch expression, not all of them.
Try this instead:
query = new Query();
Criteria first = Criteria.where("qty").elemMatch(Criteria.where("size").is("M").and("num").gt(50));
Criteria two = Criteria.where("qty").elemMatch(Criteria.where("num").is(100).and("color").is("green"));
query.addCriteria(new Criteria().andOperator(first, two));
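If it helps, the andOperator version above should serialize to roughly the following query document (my sketch of the expected output, not captured from Spring):

```javascript
// Expected shape of the query built from two separate elemMatch criteria
// combined with andOperator:
const query = {
  $and: [
    { qty: { $elemMatch: { size: "M", num: { $gt: 50 } } } },
    { qty: { $elemMatch: { num: 100, color: "green" } } }
  ]
};
```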
I just implemented $all with $elemMatch using the Spring Data Criteria API:
var elemMatch1 = new Criteria()
.elemMatch(Criteria.where("size").is("M").and("num").gt(50));
var elemMatch2 = new Criteria()
.elemMatch(Criteria.where("num").is(100).and("color").is("green"));
var criteria = Criteria.where("qty")
.all(elemMatch1.getCriteriaObject(), elemMatch2.getCriteriaObject());
mongoTemplate.find(Query.query(criteria), Inventory.class);
Note: the important part is calling the getCriteriaObject method inside Criteria.all(...) for each Criteria.elemMatch(...) element.
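The resulting query document should match the shape of the original shell query, with both $elemMatch clauses inside $all (my sketch):

```javascript
// Expected shape of the query produced by Criteria.where("qty").all(...)
// with two elemMatch criteria objects:
const query = {
  qty: { $all: [
    { $elemMatch: { size: "M", num: { $gt: 50 } } },
    { $elemMatch: { num: 100, color: "green" } }
  ] }
};
```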
Just to make #Vaibhav's answer a bit clearer, given this document in the DB:
{
    "modified": true,
    "items": [
        { "modified": true, "created": false },
        { "modified": false, "created": false },
        { "modified": true, "created": true }
    ]
}
You could run the following query if you need items where both attributes of an item are true.
Query query = new Query();
query.addCriteria(Criteria.where("modified").is(true));
query.addCriteria(Criteria.where("items")
.elemMatch(Criteria.where("modified").is(true)
.and("created").is(true)));
Here is an example of how to query with OR inside elemMatch:
Query query = new Query();
query.addCriteria(Criteria.where("modified").is(true));
query.addCriteria(Criteria.where("items")
.elemMatch(new Criteria().orOperator(
Criteria.where("modified").is(true),
Criteria.where("created").is(true))));
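That orOperator variant should serialize to something like this (my sketch of the expected query document):

```javascript
// An item matches if either attribute is true; the document itself
// must also have modified: true at the top level.
const query = {
  modified: true,
  items: { $elemMatch: { $or: [ { modified: true }, { created: true } ] } }
};
```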
I implemented this in Kotlin too; it is extended with if statements to build a dynamic query :)
val map = segmentFilter.segmentValueMap.map {
    val segmentCriteria = where("segment").isEqualTo(it.key)
    if (it.value.isNotEmpty()) {
        segmentCriteria.and("segmentValue").`in`(it.value)
    }
    if (sovOverviewInDateRange != null) {
        segmentCriteria.and("date").lte(sovOverviewInDateRange.recent).gte(sovOverviewInDateRange.older)
    }
    Criteria().elemMatch(segmentCriteria)
}
criteriaQuery.and("sovOverview").all(map.map { it.criteriaObject })
sovOverviews: {
    $all: [
        { $elemMatch: { segment: "Velikost", segmentValue: "S" } },
        { $elemMatch: { segment: "Kategorie" } }
    ]
}

Error while using aggregation with a limit stage in MongoDB from ASP.NET MVC 4.0

I use MongoDB and MVC 4.0.
The code below gives me an error; I've tried many different approaches, but it always shows this error:
"Command 'aggregate' failed: exception: A pipeline stage specification
object must contain exactly one field. (response: { "errmsg" :
"exception: A pipeline stage specification object must contain exactly
one field.", "code" : 16435, "ok" : 0.0 })"
My code:
var matchSumcount2 = new BsonDocument
{
    {
        "$group",
        new BsonDocument
        {
            { "_id", new BsonDocument { { "Device", "$Device" } } },
            { "Clicks", new BsonDocument { { "$sum", "$Clicks" } } },
            { "Day", new BsonDocument { { "$sum", 1 } } }
        }
    },
    { "$limit", 50 }
};
var database = MongoDbManager.GetDatabase();
var pipeline = new[] { matchSumcount2 };
var list = database.GetCollection("rnd").Aggregate(pipeline);
I only want to take the first 50 records and then perform the aggregation.
What am I doing wrong here? Any suggestions or code samples?
I made the comment above, but you didn't understand. I apologize for not being more clear. I'll use code examples to show you what is wrong.
You are doing this (effectively):
{
{ $group: { _id: { Device: "$Device" } } },
{ $limit: 50 }
}
But that is wrong. $group and $limit should not be siblings in a document. They should be elements in an array.
[
{ $group: { _id: { Device: "$Device" } } },
{ $limit: 50 }
]
Like I mentioned in the comment, I cannot see the start of your code, so I can only make an assumption based on the end. Your first line is probably new BsonDocument(). That is wrong; it should be new BsonArray():
var pipeline = new BsonArray();
pipeline.Add(new BsonDocument
{
    { "$group", new BsonDocument { { "_id", new BsonDocument { { "Device", "$Device" } } } } }
});
pipeline.Add(new BsonDocument("$limit", 50));
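Expressed in shell-style JavaScript, the corrected pipeline is an array in which each stage document contains exactly one top-level operator, which is exactly what the server's error message is checking:

```javascript
const pipeline = [
  { $group: { _id: { Device: "$Device" }, Clicks: { $sum: "$Clicks" }, Day: { $sum: 1 } } },
  { $limit: 50 }
];
// Each stage document has exactly one top-level field ($group or $limit),
// so the "must contain exactly one field" check passes.
const fieldCounts = pipeline.map(stage => Object.keys(stage).length);
```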