MongoDB: pull element from nested array where date is before now

I am having an issue with something that seemed so simple. Given nested documents that have an expires field, here is one example document:
{
  _id: ObjectId(),
  stuff: [
    {
      name: 'egg',
      expires: ISODate("2019-07-19T12:52:56.163Z")
    },
    {
      name: 'potato',
      expires: ISODate("2019-07-19T12:52:56.163Z")
    }
  ]
}
I thought I could use a query like this:
db.collection.update({ "_id": ObjectId("578411d30af77c226c52b940") }, {
  "$pull": {
    "stuff.expires": { "$lt": ISODate() }
  }
});
possibly being able to apply it to multiple documents at once. But even when trying to update a single document, I run into this error:
Cannot use the part (expires) of (stuff.expires) to traverse the element
I tried a ton of modifications, but I was not able to find a way to make this work, or a similar example (which seems quite odd when searching for MongoDB material).
If there is no way to update multiple documents at once, I would be happy with a way to remove all expired items from a single document in one atomic query. The query does not need to work with older MongoDB versions - the latest version is fine.

I think you can achieve what you want using this query:
db.collection.update({}, {
  $pull: {
    stuff: {
      expires: {
        $lt: ISODate()
      }
    }
  }
}, {
  multi: true
})
The above query targets all documents and pulls every stuff element whose expires property is earlier than ISODate() (the current time).
The multi: true option allows the update to modify multiple documents.
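If it helps to see the operation outside the shell, here is a plain-JavaScript sketch of the filter that $pull applies to each matched document's stuff array. The document shape is taken from the question; the helper name pullExpired is my own:

```javascript
// Sketch of the predicate $pull evaluates per array element:
// keep an item only if its expires date is NOT before "now".
function pullExpired(doc, now) {
  return { ...doc, stuff: doc.stuff.filter(item => !(item.expires < now)) };
}

const doc = {
  _id: 1,
  stuff: [
    { name: 'egg',    expires: new Date('2019-07-19T12:52:56.163Z') }, // past
    { name: 'potato', expires: new Date('2099-01-01T00:00:00Z') }      // future
  ]
};

const result = pullExpired(doc, new Date());
// 'egg' is expired and removed; 'potato' survives
```

Note that on recent servers you would typically write db.collection.updateMany(filter, update) instead of update with the multi: true option.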

Related

MongoDB paginate 2 collections together on common field

I have two Mongo collections - File and Folder.
Both have some common fields like name, createdAt etc. I have a resources API that returns a response containing items from both collections, with a type property added; type can be file or folder.
I want to support pagination and sorting in this list, for example sort by createdAt. Is it possible with aggregation, and how?
Moving them to a container collection is not a preferred option, as then I would have to maintain the container collection on each create/update/delete on either of the collections.
I'm using Mongoose too, in case it has a utility function for this, or a plugin.
In this case, you can use $unionWith. Something like:
Folder.aggregate([
  { $project: { name: 1, createdAt: 1 } },
  {
    $unionWith: {
      coll: "files", pipeline: [ { $project: { name: 1, createdAt: 1 } } ]
    }
  },
  ... // your sorting goes here
])
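The union / sort / paginate logic the pipeline performs can be sketched in plain JavaScript. The field names come from the question; the sample data and the pagination parameters are assumptions for illustration:

```javascript
// Emulate $unionWith + $sort + $skip/$limit in memory.
const folders = [
  { name: 'docs',  createdAt: new Date('2021-01-02'), type: 'folder' },
  { name: 'media', createdAt: new Date('2021-01-04'), type: 'folder' }
];
const files = [
  { name: 'a.txt', createdAt: new Date('2021-01-01'), type: 'file' },
  { name: 'b.txt', createdAt: new Date('2021-01-03'), type: 'file' }
];

const pageSize = 3, pageNum = 0; // assumed pagination parameters
const page = folders.concat(files)                      // $unionWith
  .sort((a, b) => a.createdAt - b.createdAt)            // $sort: { createdAt: 1 }
  .slice(pageNum * pageSize, (pageNum + 1) * pageSize); // $skip + $limit
```

In the real pipeline you would append { $sort: { createdAt: 1 } }, { $skip: pageNum * pageSize }, { $limit: pageSize } after the $unionWith stage, and you can add the type property inside each branch with an $addFields stage.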

MongoDB querying aggregation in one single document

I have a short but important question. I am new to MongoDB and querying.
My database looks like the following: I only have one document stored in my database (sorry for blurring).
The document consists of different fields:
two are blurred and not important
datum -> date
instance -> Array of embedded document objects; each instance has an id, two unimportant fields, and a code.
Now I want to query how many times an object in my instance array has the group "a" and the text "sample".
Is this even possible?
I only found methods to count how many documents have something...
I am using MongoDB Compass, but I can also use PyMongo, MongoEngine or any other tool for querying MongoDB.
Thank you in advance and if you have more questions please leave a comment!
You can try this:
db.collection.aggregate([
  {
    $unwind: "$instance"
  },
  {
    $unwind: "$instance.label"
  },
  {
    $match: {
      "instance.label.group": "a",
      "instance.label.text": "sample"
    }
  },
  {
    $group: {
      _id: {
        group: "$instance.label.group",
        text: "$instance.label.text"
      },
      count: {
        $sum: 1
      }
    }
  }
])
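For intuition, the double $unwind plus $match plus $group amounts to the following plain-JavaScript count over the single document. The instance and label field names come from the pipeline above; the sample document is made up:

```javascript
// Count label objects matching a group/text pair across the instance array.
function countMatching(doc, group, text) {
  let count = 0;
  for (const inst of doc.instance)                        // first $unwind
    for (const label of inst.label)                       // second $unwind
      if (label.group === group && label.text === text)   // $match
        count++;                                          // $group + $sum: 1
  return count;
}

const doc = {
  instance: [
    { label: [{ group: 'a', text: 'sample' }, { group: 'b', text: 'other' }] },
    { label: [{ group: 'a', text: 'sample' }] }
  ]
};
```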

MongoDb - add index on 'calculated' fields

I have a query that includes an $expr-operator with a $cond in it.
Basically, I want to have objects with a timestamp from a certain year. If the timestamp is not set, I'll use the creation date instead.
{
  $expr: {
    $eq: [
      {
        $cond: {
          'if': {
            TimeStamp: {
              $type: 'null'
            }
          },
          then: {
            $year: '$Created'
          },
          'else': {
            $year: '$TimeStamp'
          }
        }
      },
      <wanted-year>
    ]
  }
}
It would be nice to have this query using a index. But is it possible to do so? Should I just add index to both TimeStamp and Created-fields? Or is it possible to create an index for a Year-field that doesn't really exist on the document itself...?
Not possible.
Indexes are built and stored on disk before the query executes, so they can only cover fields that actually exist on the documents.
Workaround: On-Demand Materialized Views - store your calculated data (with indexes) in a separate collection.
This can't be done today without precomputing that information and storing it in a field on the document. The closest alternative would probably be to use MongoDB 4.2's aggregation pipeline-powered updates to precompute and store a createdOrTimestamp field whenever your documents are updated. You could then create an index on createdOrTimestamp that would be used when querying for documents that match a certain year.
What this would look like when updating or after inserting your document:
db.collection.update({ _id: ObjectId("5e8523e7ea740b14fb16b5c3") }, [
  {
    $set: {
      createdOrTimestamp: {
        $cond: {
          if: { $gt: ['$TimeStamp', null] },
          then: '$TimeStamp',
          else: '$Created'
        }
      }
    }
  }
])
If documents already exist, you could also send off an updateMany operation with that aggregation to get that computed field into all your existing documents.
It would be really nice to be able to define computed fields declaratively on a collection just like indexes, so that they take care of keeping themselves up to date!
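The $cond in that update is just a null-coalesce. As a plain-JavaScript sketch, with the TimeStamp and Created field names taken from the question:

```javascript
// Prefer TimeStamp when present, otherwise fall back to Created,
// mirroring $cond: { if: { $gt: ['$TimeStamp', null] }, then: ..., else: ... }.
function createdOrTimestamp(doc) {
  return doc.TimeStamp != null ? doc.TimeStamp : doc.Created;
}

const withTs    = { TimeStamp: new Date('2020-06-01'), Created: new Date('2019-01-01') };
const withoutTs = { Created: new Date('2019-01-01') };
```

Precomputing this value into a real field is exactly what makes the year query indexable: the index entry exists on disk, so no per-document computation is needed at query time.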

How to query nested documents in MongoDB?

I am very new to databases and MongoDB. I tried to store financial statement information in the database with each document representing a company.
I have created nested documents (maybe not a good way), like the following diagram. The outermost level contains Annual Statement, Basic Info, Key Map, and Interim Statement. Within Annual Statement there are different dates, within each date there are different types of statements (INC, BAL, CAS), and the innermost level contains the real data.
My question is: how can I query the DB to give me all documents containing 2017 statements (for example)?
The date is formatted as YYYY-MM-DD, but I only want to filter by YYYY.
I highly discourage using variable values (the date and INC here) as field names. It makes the data (much) harder to query and update, and you cannot use indexes. So it's a very bad idea in most cases, even if it can be 'acceptable' (but bad practice) for a small number of static values (INC, BAL, CAS).
My advice would be to change your schema to something easier to use, for example:
[
  {
    "annual_statement": [
      {
        "date": ISODate("2017-12-31"),
        "INC": {
          "SREV": 1322.5,
          "RTLR": 1423.4,
          ...
        },
        "BAL": { ... },
        "CAS": { ... }
      },
      {
        "date": ISODate("2016-12-31"),
        "INC": {
          "SREV": 1322.5,
          "RTLR": 1423.4,
          ...
        },
        "BAL": { ... },
        "CAS": { ... }
      }
    ]
  }
]
To query this schema, use the following:
db.collection.find({
  "annual_statement": {
    $elemMatch: {
      date: {
        $lt: ISODate("2018-01-01"),
        $gte: ISODate("2017-01-01")
      }
    }
  }
})
This will return whole documents where at least one date is in 2017.
Adding a projection like the following will return only the matching annual statements:
db.collection.find({
  "annual_statement": {
    $elemMatch: {
      date: {
        $lt: ISODate("2018-01-01"),
        $gte: ISODate("2017-01-01")
      }
    }
  }
},
{
  "annual_statement": {
    $elemMatch: {
      date: {
        $lt: ISODate("2018-01-01"),
        $gte: ISODate("2017-01-01")
      }
    }
  }
})
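The year filter itself is just a half-open date range. Here is a plain-JavaScript sketch of the predicate the $elemMatch expresses, using the annual_statement schema suggested above (the helper name is my own):

```javascript
// True if any annual statement's date falls within the given calendar year.
function hasStatementInYear(doc, year) {
  const start = new Date(Date.UTC(year, 0, 1));     // $gte: Jan 1 of `year`
  const end   = new Date(Date.UTC(year + 1, 0, 1)); // $lt:  Jan 1 of `year + 1`
  return doc.annual_statement.some(s => s.date >= start && s.date < end);
}

const doc = {
  annual_statement: [{ date: new Date(Date.UTC(2017, 11, 31)) }] // 2017-12-31
};
```

Using a half-open range ($gte the first day of the year, $lt the first day of the next year) avoids edge cases with times on December 31 and lets an index on date be used directly.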
Use $exists (note I'm using python - pymongo driver syntax below - yours may differ)
https://docs.mongodb.com/manual/reference/operator/query/exists/
Example: Find all "2017" "INC" records, by company.
year_exists = db.collection.find({'Annual.2017-12-31': {'$exists': True}})
for business in year_exists:
    bus_name = business['BasicInfo']['CompanyName']
    financials_INC = business['Annual']['2017-12-31']['INC']
    print(bus_name, financials_INC)

Refine/Restructure data from Mongodb query

I'm using Node.js with the MongoDB native driver 2.0+.
The following query fetches one client document containing arrays of embedded staff and services:
db.collection('clients').findOne({_id: sessId}, {"services._id": 1, "staff": {$elemMatch: {_id: reqId}}}, callback)
It returns a result like this:
{
  _id: "5422c33675d96d581e09e4ca",
  staff: [
    {
      name: "Anders",
      _id: "5458d0aa69d6f72418969428"
      // More fields not relevant to the question...
    }
  ],
  services: [
    { _id: "54578da02b1c54e40fc3d7c6" },
    { _id: "54578da42b1c54e40fc3d7c7" },
    { _id: "54578da92b1c54e40fc3d7c9" }
  ]
}
Note that each embedded object in services actually contains several fields, but _id is the only field returned by means of the projection of the query.
From this returned data I start by "pluck" all id's from services and save them in an array later used for validation. This is by no means a difficult operation... but I'm curious... Is there an easy way to do some kind of aggregation instead of find, to get an array of already plucked objectId's directly from the DB. Something like this:
{
  _id: "5422c33675d96d581e09e4ca",
  staff: [
    {
      name: "Anders",
      _id: "5458d0aa69d6f72418969428"
      // More fields not relevant to the question...
    }
  ],
  services: [
    "54578da02b1c54e40fc3d7c6",
    "54578da42b1c54e40fc3d7c7",
    "54578da92b1c54e40fc3d7c9"
  ]
}
One way of doing it is to first $unwind the document on the staff field; this is done to select the intended staff member. This step is required due to the unavailability of the $elemMatch operator in the aggregation framework; there is an open ticket here: Jira.
Once the document with the correct staff member is selected, $unwind again, this time on $services.
Then $group, $pushing all the services _id values together into an array.
This is then followed by a $project operator, to show the intended fields.
db.clients.aggregate([
  { $match: { "_id": sessId } },
  { $unwind: "$staff" },
  { $match: { "staff._id": reqId } },
  { $unwind: "$services" },
  { $group: { "_id": "$_id", "services_id": { $push: "$services._id" }, "staff": { $first: "$staff" } } },
  { $project: { "services_id": 1, "staff": 1 } }
])
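The $group/$push stage performs the same "pluck" you would otherwise do in application code. In plain JavaScript, the client-side equivalent over the findOne result is a one-liner (the document shape is taken from the question):

```javascript
// Pluck the _id of each embedded service into a flat array,
// as $push: "$services._id" does server-side.
const client = {
  _id: "5422c33675d96d581e09e4ca",
  services: [
    { _id: "54578da02b1c54e40fc3d7c6" },
    { _id: "54578da42b1c54e40fc3d7c7" },
    { _id: "54578da92b1c54e40fc3d7c9" }
  ]
};

const servicesIds = client.services.map(s => s._id);
```

Whether plucking server-side or client-side is better mostly depends on how much of the services objects you would otherwise transfer; with only _id projected, the difference is small.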