Is there a way to remove a given number of elements from the start of an array in MongoDB?
Suppose I don't know any details of the elements (like id or uuid), but I know that I want to remove the first N elements of the array. Is there a way to do it in MongoDB? I know I can fetch the whole document and process it in my own programming language, but it would be nicer if MongoDB already had a way to do it atomically in its own query language.
The $pop operator removes a single element from either the start or the end of an array.
There is a closed Jira ticket, SERVER-4798, asking for multiple pop operations; the comments there suggest using an update with an aggregation pipeline instead.
Starting from MongoDB 4.2 you can do that with an update pipeline and $slice.
With a negative count, $slice keeps the last n elements of the array:
let n = 2;
db.collection.updateOne(
  {}, // your query
  [{
    $set: { arr: { $slice: ["$arr", -n] } }
  }]
)
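What @turivishal mentioned is true, but it only works if the array's size is always 4 (with n = 2, keeping the last 2 elements is then the same as dropping the first 2). For it to work for all array sizes, we should take the size of the array into account in the aggregation:
let n = 2;
db.collection.update({},
[
  {
    $set: {
      arr: {
        $slice: [
          "$arr",
          // n - size is negative, so $slice keeps the last (size - n) elements;
          // this assumes the array has more than n elements
          { $subtract: [n, { $size: "$arr" }] }
        ]
      }
    }
  }
])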
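To see why a fixed negative count is not enough, here is a minimal sketch in plain JavaScript (not MongoDB). It assumes that Array.prototype.slice is a fair stand-in for the two-argument $slice with a negative count; keepLast and dropFirst are made-up helper names.

```javascript
// Plain-JS stand-in for two-argument $slice with a negative count:
// a count of -k keeps the last k elements.
function keepLast(arr, k) {
  return k > 0 ? arr.slice(-k) : [];
}

// Dropping the first n elements means keeping the last (size - n).
function dropFirst(arr, n) {
  return keepLast(arr, arr.length - n);
}

const four = [1, 2, 3, 4];
const six = [1, 2, 3, 4, 5, 6];
const n = 2;

console.log(keepLast(four, n));  // [ 3, 4 ], equals dropFirst only because size is 4
console.log(keepLast(six, n));   // [ 5, 6 ], four elements were dropped, not two
console.log(dropFirst(six, n));  // [ 3, 4, 5, 6 ]
```

The count passed to $slice therefore has to depend on the array's size, not only on n.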
Here is the data structure for each document in the collection. The data structure is fixed.
{
  '_id': 'some-timestamp',
  'RESULT': [
    {
      'NUMERATION': [ // numeration of divisions
        {
          // numeration of producttypes
          'DIVISIONX': [{'PRODUCTTYPE': 'product xy', COUNT: 100}]
        }
      ]
    }
  ]
}
The query result should have the same structure but only contain producttypes matching a regular expression.
I tried using a nested $elemMatch operator, but this doesn't get me any closer. I don't know how I can iterate over each value in the producttypes array for each division.
How can I do that? Then I could apply $pop, $in and $each.
I looked at:
Querying an array of arrays in MongoDB
https://docs.mongodb.com/manual/reference/operator/update/each/
https://docs.mongodb.com/manual/reference/operator/update/pop/
... and more
The solution I want to avoid is writing something like this:
collection.find().forEach(function(x) { /* more for eaches */ })
Edit:
Here is an example document to copy:
{"_id":"5ab550d7e85d5930b0879cbe","RESULT":[{"NUMERATION":[{"DIVISION":[{"PRODUCTTYPE":"Book","COUNT":10},{"PRODUCTTYPE":"Giftcard","COUNT":"300"}]}]}]}
E.g. the query result should only return the entry with the giftcard:
{"_id":"5ab550d7e85d5930b0879cbe","RESULT":[{"NUMERATION":[{"DIVISION":[{"PRODUCTTYPE":"Giftcard","COUNT":"300"}]}]}]}
Using the forEach approach, the result is in the correct format. I'm still looking for a better way that does not involve that function, so I will not mark this as an answer.
But for now this works fine:
db.collection.find().forEach(
  function(wholeDocument) {
    wholeDocument['RESULT'].forEach(function (resultEntry) {
      resultEntry['NUMERATION'].forEach(function (numerationEntry) {
        // filter instead of splicing inside forEach, which would skip
        // the element that follows each removed one
        numerationEntry['DIVISION'] = numerationEntry['DIVISION'].filter(function (divisionEntry) {
          // example condition (will be replaced by regular expression evaluation)
          return divisionEntry['PRODUCTTYPE'] == 'Giftcard';
        });
      })
    })
    printjson(wholeDocument);
  }
)
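For the regular expression evaluation mentioned in the comment, a rough client-side sketch in plain JavaScript could look like this. filterProductTypes is a made-up helper name, and the pattern /^gift/i is only an example; the function works on a document that has already been fetched and does not touch the database.

```javascript
// Returns a copy of the document whose DIVISION arrays keep only the
// entries with a PRODUCTTYPE matching `pattern`. The input is not modified.
function filterProductTypes(doc, pattern) {
  return {
    ...doc,
    RESULT: doc.RESULT.map((resultEntry) => ({
      ...resultEntry,
      NUMERATION: resultEntry.NUMERATION.map((numerationEntry) => ({
        ...numerationEntry,
        DIVISION: numerationEntry.DIVISION.filter((d) => pattern.test(d.PRODUCTTYPE)),
      })),
    })),
  };
}

// The example document from the question
const example = {"_id":"5ab550d7e85d5930b0879cbe","RESULT":[{"NUMERATION":[{"DIVISION":[{"PRODUCTTYPE":"Book","COUNT":10},{"PRODUCTTYPE":"Giftcard","COUNT":"300"}]}]}]};

const filtered = filterProductTypes(example, /^gift/i);
console.log(JSON.stringify(filtered));
```

The pattern must not carry the g flag, since RegExp.prototype.test is stateful for global patterns.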
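UPDATE
Thanks to Rahul Raj's comments I have read up on aggregation with the $redact operator. A prototype of the solution to the issue is this query:
db.getCollection('DeepStructure').aggregate( [
  { $redact: {
      $cond: {
        // levels without a PRODUCTTYPE field fall back to "Giftcard" and are
        // descended into; entries with any other PRODUCTTYPE are pruned
        if: { $eq: [ { $ifNull: [ "$PRODUCTTYPE", "Giftcard" ] }, "Giftcard" ] },
        then: "$$DESCEND",
        else: "$$PRUNE"
      }
    }
  }
]
)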
I assume you are trying to update a nested array.
You need the positional operators $[] or $ for that.
With $[] (MongoDB 3.6+), the update is applied to every element of the array, so all matching nested array elements are removed.
With $, only the first matching array element is affected.
Use the $regex operator to pass your regular expression.
You also need $pull to remove array elements that match a condition; in your case, that condition is a regular expression. Note that $elemMatch is not the right thing to use with $pull, because the arguments to $pull are already applied as a query to each array element.
db.collection.update(
  {/*additional matching conditions*/},
  {$pull: {"RESULT.$[].NUMERATION.$[].DIVISIONX": {PRODUCTTYPE: {$regex: "xy"}}}},
  {multi: true}
)
Just replace xy with your regular expression and add your own matching conditions as required. I'm not quite sure about your data set, so the answer above is based on my assumptions from the given info. Feel free to adapt it to your requirements.
Currently I use the following find query to get the latest document for a certain ID:
Conditions.find({
  caveId: caveId
},
{
  sort: {diveDate:-1},
  limit: 1,
  fields: {caveId: 1, "visibility.visibility":1, diveDate: 1}
});
How can I do the same for multiple ids, using $in for example?
I tried the following query. The problem is that it limits the result to 1 document across all the found caveIds, whereas the limit should apply to each caveId separately.
Conditions.find({
  caveId: {$in: caveIds}
},
{
  sort: {diveDate:-1},
  limit: 1,
  fields: {caveId: 1, "visibility.visibility":1, diveDate: 1}
});
One solution I came up with is to use the aggregate functionality.
var conditionIds = Conditions.aggregate(
  [
    {"$match": { caveId: {"$in": caveIds}}},
    // sort first so that $last picks the most recent dive in each group
    {"$sort": { diveDate: 1 }},
    {
      $group:
      {
        _id: "$caveId",
        conditionId: {$last: "$_id"},
        diveDate: { $last: "$diveDate" }
      }
    }
  ]
).map(function(child) { return child.conditionId});
var conditions = Conditions.find({
  _id: {$in: conditionIds}
},
{
  fields: {caveId: 1, "visibility.visibility":1, diveDate: 1}
});
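What the grouping is meant to compute can be sketched on an in-memory array in plain JavaScript. This is only an illustration of "latest document per caveId"; latestPerCave and the sample data are made up, and nothing here talks to the database.

```javascript
// In-memory stand-in for "latest condition per cave":
// keep, for every caveId, the document with the greatest diveDate.
function latestPerCave(docs) {
  const latest = new Map();
  for (const doc of docs) {
    const current = latest.get(doc.caveId);
    if (!current || doc.diveDate > current.diveDate) {
      latest.set(doc.caveId, doc);
    }
  }
  return [...latest.values()];
}

// Made-up sample data
const sample = [
  { _id: "c1", caveId: "A", diveDate: new Date("2015-01-10") },
  { _id: "c2", caveId: "A", diveDate: new Date("2015-03-02") },
  { _id: "c3", caveId: "B", diveDate: new Date("2014-12-24") },
];

console.log(latestPerCave(sample).map((d) => d._id)); // [ 'c2', 'c3' ]
```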
As noted, you don't want to use $in here. You could solve this problem by looping through the caveIds and running the query for each caveId individually.
You're basically looking at a join query here: you need all caveIds and then have to look up the last dive for each.
This is a problem of database schema/denormalization in my opinion (but this is only an opinion!):
You could, as mentioned here, look up all caveIds and then run the single query for each one, every single time you need to look up the last dives.
However, I think you are much better off recording/updating the last dive inside your cave document, and then looking up all caveIds of interest, pulling only the lastDive field.
That will give you immediately what you need, rather than going through expensive search/sort queries. This comes at the expense of maintaining that field in the document, but it sounds like it should be fairly trivial, as you only need to update the one field when a new event occurs.
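A minimal sketch of how that field could be maintained, assuming a cave document keyed by caveId with a lastDive subdocument (both the field name and the helper lastDiveUpdate are hypothetical, not taken from the question). The helper only builds the filter and modifier; you would pass them to your collection's update call.

```javascript
// Hypothetical helper: builds the filter and modifier for an update that
// stores a dive as the cave's lastDive only if it is newer than the stored one.
function lastDiveUpdate(caveId, dive) {
  return {
    filter: {
      _id: caveId,
      $or: [
        { lastDive: { $exists: false } },
        { "lastDive.diveDate": { $lt: dive.diveDate } },
      ],
    },
    update: { $set: { lastDive: dive } },
  };
}

// Made-up example values
const spec = lastDiveUpdate("cave1", { diveDate: new Date("2015-03-02"), visibility: 5 });
console.log(JSON.stringify(spec, null, 2));
```

Because the date check is part of the filter, an older dive arriving late matches no document and leaves lastDive untouched.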
Let's say I have four documents in my collection:
{u'a': {u'time': 3}}
{u'a': {u'time': 5}}
{u'b': {u'time': 4}}
{u'b': {u'time': 2}}
Is it possible to sort them by the field 'time' which is common in both 'a' and 'b' documents?
Thank you
No, you should put your data into a common format so you can sort it on a common field. It can still be nested if you want but it would need to have the same path.
You can use aggregation; the following code has been tested.
db.test.aggregate([{
  $project: {
    time: {
      "$cond": [{
        "$gt": ["$a.time", null]
      }, "$a.time", "$b.time"]
    }
  }
}, {
  $sort: {
    time: -1
  }
}]);
Or if you also want the original fields returned back: gist
Alternatively, you can sort once you get the result back, using a custom compare function (not tested, for illustration purposes only):
var sorted = db.mycollection.find().toArray().sort(function(doc1, doc2) {
  var time1 = doc1.a ? doc1.a.time : doc1.b.time,
      time2 = doc2.a ? doc2.a.time : doc2.b.time;
  return time1 - time2; // ascending; swap the operands for descending
});
You can, using the aggregation framework.
The trick here is to $project a field common to all the documents, so that the $sort stage can sort on the value in that field.
The $ifNull operator can be used to check whether a.time exists. If it does, the record is sorted by that value; otherwise, by b.time.
code:
db.t.aggregate([
  {$project:{"a":1,"b":1,
             "sortBy":{$ifNull:["$a.time","$b.time"]}}},
  {$sort:{"sortBy":-1}},
  {$project:{"a":1,"b":1}}
])
Consequences of this approach:
The sort won't be able to use any index you create, because it works on a computed field.
The performance will be very poor for very large data sets.
What you could ideally do is ask the source system that sends you the data to standardize its format, something like:
{"a":1,"time":5}
{"b":1,"time":4}
That way your query can make use of an index if you create one on the time field.
db.t.createIndex({"time":-1});
code:
db.t.find({}).sort({"time":-1});
I have a MongoDB collection named col that has documents that look like this:
[
  {
    intField:123,
    strField:'hi',
    arrField:[1,2,3]
  },
  {
    intField:12,
    strField:'hello',
    arrField:[1,2,3,4]
  },
  {
    intField:125,
    strField:'hell',
    arrField:[1]
  }
]
Now I want to remove the documents from collection col in which the size of the array field is less than 2.
So I wrote a query that looks like this:
db.col.remove({'arrField':{"$size":{"$lt":2}}})
This query doesn't do anything. I checked with db.col.find() and it returns all the documents. What's wrong with this query?
With MongoDB 2.2+ you can use numeric indexes in condition object keys to do this:
db.col.remove({'arrField.2': {$exists: 0}})
This will remove any document that doesn't have at least 3 elements in arrField. For the threshold in the question (fewer than 2 elements), use 'arrField.1' instead.
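The idea behind the numeric index can be sketched in plain JavaScript: an array has an element at index i exactly when it holds at least i + 1 elements. lacksIndex is a made-up stand-in for the {$exists: 0} condition, and the fourth sample document is added here for illustration only.

```javascript
// Sample documents: the three from the question plus one made-up
// document with exactly two elements.
const docs = [
  { intField: 123, strField: "hi", arrField: [1, 2, 3] },
  { intField: 12, strField: "hello", arrField: [1, 2, 3, 4] },
  { intField: 125, strField: "hell", arrField: [1] },
  { intField: 7, strField: "hey", arrField: [1, 2] },
];

// Stand-in for {"arrField.<index>": {$exists: 0}}: true when the array
// has no element at `index`, i.e. fewer than index + 1 elements.
function lacksIndex(doc, index) {
  return !(Array.isArray(doc.arrField) && index < doc.arrField.length);
}

// Fewer than 3 elements (index 2 missing)
console.log(docs.filter((d) => lacksIndex(d, 2)).map((d) => d.strField)); // [ 'hell', 'hey' ]
// Fewer than 2 elements (index 1 missing)
console.log(docs.filter((d) => lacksIndex(d, 1)).map((d) => d.strField)); // [ 'hell' ]
```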
From the documentation for $size:
You cannot use $size to find a range of sizes (for example: arrays with more than 1 element).
The docs recommend maintaining a separate size field (so in this case, arrFieldSize) with the count of the items in the array if you want to try this sort of thing.
Note that for some queries, it may be feasible to just list all the counts you want included or excluded using (n)or conditions.
In your example, the following query will give all documents with fewer than 2 array entries:
db.col.find({
  "$or": [
    { "arrField": {"$exists": false} },
    { "arrField": {"$size": 1} },
    { "arrField": {"$size": 0} }
  ]
})
The following should work
db.col.remove({$where: "this.arrField.length < 2"})
Mongo supports arrays of documents inside documents. For example, something like
{_id: 10, "coll": [1, 2, 3] }
Now, imagine I wanted to insert an arbitrary value at an arbitrary index
{_id: 10, "coll": [1, {name: 'new val'}, 2, 3] }
I know you can update values in place with $ and $set, but I found nothing for insertion. It kind of sucks to have to replace the entire array just to insert at a specific index.
Starting with version 2.6 you finally can do this. You have to use the $position operator. For your particular example:
db.students.update(
  { _id: 10},
  { $push: {
      coll: {
        $each: [ {name: 'new val'} ],
        $position: 1
      }
  }}
)
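As a rough mental model, the effect of $push with $each and $position on the array resembles Array.prototype.splice in plain JavaScript. The sketch below is only that model; insertAt is a made-up name and nothing here runs against MongoDB.

```javascript
// Stand-in for {$push: {coll: {$each: items, $position: position}}}:
// returns a new array with `items` inserted at `position`.
function insertAt(coll, items, position) {
  const copy = coll.slice();
  copy.splice(position, 0, ...items);
  return copy;
}

const result = insertAt([1, 2, 3], [{ name: "new val" }], 1);
console.log(JSON.stringify(result)); // [1,{"name":"new val"},2,3]
```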
The following will do the trick:
var insertPosition = 1;
var newItem = {name: 'new val'};
db.myCollection.find({_id: 10}).forEach(function(item)
{
  item.coll = item.coll.slice(0, insertPosition).concat(newItem, item.coll.slice(insertPosition));
  db.myCollection.save(item);
});
If the insert position is variable (i.e., you don't know the index in advance, but you know you want to insert after the item with name = "foo"), just add a for() loop before the item.coll = assignment to find that item's index, and add 1 to it, since you want to insert AFTER name = "foo".
A handy answer (not the selected answer, but the highest rated) from this similar post:
Can you have mongo $push prepend instead of append?
It uses $set to insert 3 at the first position of an array called "array". Sample from the related answer by Sergey Nikitin:
db.test.update({"_id" : ObjectId("513ad0f8afdfe1e6736e49eb")},
  {'$set': {'array.-1': 3}})
Regarding your comment:
Well.. with concurrent users this is going to be problematic with any database...
What I would do is the following:
Add a last-modified timestamp to the document. Load the document, let the user modify it, and use the timestamp as a filter when you update the document, updating the timestamp in the same step. If the update matches 0 documents, you know the document was modified in the meantime and you can ask the user to reload it.
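The flow can be sketched with an in-memory stand-in for the collection. The field name lastModified and the helper updateIfUnchanged are assumptions for illustration; against MongoDB the same check would go into the update's filter, e.g. {_id: 10, lastModified: loadedTimestamp}.

```javascript
// In-memory stand-in for a collection, keyed by _id
const store = new Map([[10, { _id: 10, coll: [1, 2, 3], lastModified: 1 }]]);

// Applies `changes` only if the stored timestamp still equals the one the
// user loaded; returns the number of updated documents (0 or 1).
function updateIfUnchanged(id, loadedTimestamp, changes, now) {
  const doc = store.get(id);
  if (!doc || doc.lastModified !== loadedTimestamp) {
    return 0; // modified in the meantime: ask the user to reload
  }
  store.set(id, { ...doc, ...changes, lastModified: now });
  return 1;
}

// Two users loaded the document when lastModified was 1
const first = updateIfUnchanged(10, 1, { coll: [1, { name: "new val" }, 2, 3] }, 2);
const second = updateIfUnchanged(10, 1, { coll: [] }, 3); // stale timestamp
console.log(first, second); // 1 0
```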
Using the $position operator, this can be done starting from version 2.5.3.
It must be used with the $each operator. From the documentation:
db.collection.update( <query>,
  { $push: {
      <field>: {
        $each: [ <value1>, <value2>, ... ],
        $position: <num>
      }
    }
  }
)