Java MongoDB update field on condition - mongodb

I am using spring-boot-starter-data-mongodb and I am calling updateMany to update several records in one operation:
UpdateResult result = collection.updateMany(query, update);
The update is defined like this:
Bson update = Updates.combine(
    Updates.set(field1, 1),
    Updates.set(field2, 2));
But now I want to add a conditional update for field3: if field3 == null, then set field3 = 3.
How can I achieve this? Thanks for your help
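One possible approach, assuming MongoDB 4.2 or later (where update operations accept an aggregation pipeline), is to express the condition with `$ifNull` inside a pipeline-style `$set` stage. The sketch below is in Python and only builds the update document; the helper name `build_conditional_update` is hypothetical, and the commented-out call assumes a pymongo collection named `collection` and a filter named `query`.

```python
def build_conditional_update(fixed, defaults):
    """Build a pipeline-style update.

    `fixed` fields are always overwritten; `defaults` fields are only
    filled in when the current value is null or the field is missing.
    """
    stage = dict(fixed)
    for field, value in defaults.items():
        # $ifNull keeps the existing value unless it is null or missing
        stage[field] = {"$ifNull": ["$" + field, value]}
    return [{"$set": stage}]


pipeline = build_conditional_update(
    fixed={"field1": 1, "field2": 2},
    defaults={"field3": 3},
)
print(pipeline)

# Assumption: `collection` is a pymongo Collection on a 4.2+ server.
# result = collection.update_many(query, pipeline)
```

Recent versions of the Java driver also offer an updateMany overload that takes a list of pipeline stages, so the same document shape should carry over. If a pipeline update is not an option, a second updateMany whose filter adds field3 == null is a simpler alternative at the cost of one extra round trip.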

Related

How to update multiple rows with different values in a single execution in mongodb?

We know that in MySQL, instead of executing the three queries below, we can do the same in a single execution, as shown in 1> and 2>:
UPDATE feedbacks SET _id = '5c6a8bcfce1454086fefb879' WHERE user_rol = '26-02-2018';
UPDATE feedbacks SET _id = '5c6a89d3ce1454086fefb877' WHERE user_rol = '26-02-2017';
UPDATE feedbacks SET _id = '5c6a896ece1454086fefb876' WHERE user_rol = '26-02-2016';
1>
INSERT INTO feedbacks (_id, added_on)
VALUES
('5c6a8bcfce1454086fefb879', '26-02-2018'),
('5c6a89d3ce1454086fefb877', '26-02-2017'),
('5c6a896ece1454086fefb876', '26-02-2016')
ON DUPLICATE KEY UPDATE added_on = VALUES(added_on)
2>
UPDATE feedbacks
SET added_on = CASE
WHEN _id = '5c6a8bcfce1454086fefb879' THEN '26-02-2018'
WHEN _id = '5c6a89d3ce1454086fefb877' THEN '26-02-2017'
WHEN _id = '5c6a896ece1454086fefb876' THEN '26-02-2016'
END
WHERE _id IN ('5c6a8bcfce1454086fefb879', '5c6a89d3ce1454086fefb877', '5c6a896ece1454086fefb876')
Now my question is: is there any way to do the same (updating multiple rows with different values in a single execution) in MongoDB?
In MongoDB, when all documents match the same condition (query), you can update multiple records by setting the 'multi' flag to true. Please refer to this link:
https://docs.mongodb.com/manual/reference/method/db.collection.update/
But if the conditions differ, this will not work.
For bulk updates, you can refer to:
https://docs.mongodb.com/manual/reference/method/Bulk.find.update/#Bulk.find.update
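As a rough illustration of "different conditions in one request", the sketch below builds a single `update` database command that carries one statement per row, each with its own filter and its own value. It is written in Python and only constructs the command document; the helper name `build_update_command` is hypothetical, and the ids are kept as plain strings as in the question (a real collection may store ObjectId values instead).

```python
def build_update_command(collection, key_field, rows):
    """Return one `update` command with a separate statement per row.

    `rows` is a list of (key, fields) pairs: each key selects one
    document and `fields` holds the values to $set on it.
    """
    statements = [
        {"q": {key_field: key}, "u": {"$set": fields}}
        for key, fields in rows
    ]
    # The command name must be the first key of the document
    return {"update": collection, "updates": statements}


rows = [
    ("5c6a8bcfce1454086fefb879", {"added_on": "26-02-2018"}),
    ("5c6a89d3ce1454086fefb877", {"added_on": "26-02-2017"}),
    ("5c6a896ece1454086fefb876", {"added_on": "26-02-2016"}),
]
command = build_update_command("feedbacks", "_id", rows)
print(len(command["updates"]))

# Assumption: `db` is a pymongo Database.
# db.command(command)
```

The statements are still applied one by one on the server, but they travel in a single round trip, which is usually the expensive part.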

Yii2 Update many records with different conditions

I want to update many records with Yii2. For inserts I use the batchInsert() method, but for updates I don't know what to use:
update my_tbl set fld1 = 321321 where res = 46513,
fld1 = 89876 where res = 54646,
fld1 = 64564 where res = 54654;
There is no batchUpdate() function in Yii2; only batchInsert() is available.
If you want, you can iterate over the records and run an update command for each one, i.e. perform a sequence of single updates as separate commands.
No more structured solution is currently available in Yii2.
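If a single statement is preferred over a loop, the CASE technique shown for MySQL earlier on this page also applies here. The sketch below is in Python rather than PHP and uses the standard-library sqlite3 module purely as a stand-in database; the helper name `build_case_update` is hypothetical, and the table and column names are interpolated into the SQL, so they must come from trusted code, never from user input.

```python
import sqlite3


def build_case_update(table, set_col, key_col, pairs):
    """Build one parameterized UPDATE that sets a different value per key."""
    whens = " ".join("WHEN ? THEN ?" for _ in pairs)
    placeholders = ", ".join("?" for _ in pairs)
    sql = (
        f"UPDATE {table} SET {set_col} = CASE {key_col} {whens} END "
        f"WHERE {key_col} IN ({placeholders})"
    )
    # Parameters: key/value for every WHEN branch, then the keys for IN (...)
    params = [item for key, value in pairs for item in (key, value)]
    params += [key for key, _ in pairs]
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_tbl (res INTEGER PRIMARY KEY, fld1 INTEGER)")
conn.executemany(
    "INSERT INTO my_tbl VALUES (?, ?)",
    [(46513, 0), (54646, 0), (54654, 0), (1, 7)],
)

sql, params = build_case_update(
    "my_tbl", "fld1", "res",
    [(46513, 321321), (54646, 89876), (54654, 64564)],
)
conn.execute(sql, params)
rows = conn.execute("SELECT res, fld1 FROM my_tbl ORDER BY res").fetchall()
print(rows)
```

The WHERE ... IN clause matters: without it, rows that match no WHEN branch would have fld1 set to NULL.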

MongoDB 2.6 aggregation updates the $out collection

I'm currently using MongoDB 2.6 through MongoHQ. I have several mapReduce jobs which crunch raw data from a collection (c1) to produce a new collection (c2).
I also have an aggregation pipeline which parses (c2) to generate a new collection (c3) with the great $out operator.
However, I need to add extra fields to (c3) outside of the aggregation pipeline and keep them even after a new run of the aggregation. It seems that the aggregation, based on the _id key, just overwrites the content without updating it. So if I have previously added an extra field like foo : 'bar' to (c3) and I re-run the aggregation, I will lose the foo field.
Based on documentation (http://docs.mongodb.org/manual/reference/operator/aggregation/out/#pipe._S_out)
Replace Existing Collection
If the collection specified by the $out operation already exists, then upon completion of the aggregation, the $out stage atomically replaces the existing collection with the new results collection. The $out operation does not change any indexes that existed on the previous collection. If the aggregation fails, the $out operation makes no changes to the pre-existing collection.
Is there a better way, or a tricky one :-), to update the $out collection instead of overwriting records with the same _id? I could write a Python or JavaScript script to do the job, but I would like to avoid making many database calls and do it in a smarter way, as with aggregation. Maybe it is not possible, in which case I will look for a different and more 'classical' path.
Thanks for your help
Well, not directly with the $out operator: much as with mapReduce output, this is pretty much an "overwrite" operation (though mapReduce does have "merge" and "reduce" modes as well).
But since you have MongoDB 2.6, aggregate does actually return a "cursor". So while the "client/server" interaction may not be as optimal as you want, you also have "bulk update" operations, so you can do something along the lines of:
var cursor = db.collection.aggregate([
    // pipeline here
]);
var batch = [];
while ( cursor.hasNext() ) {
    var doc = cursor.next();
    var updoc = {
        "q": { "_id": doc._id },
        "u": {
            // only new fields except for
            "$setOnInsert": {
                // the fields you expect to add from before
            }
        },
        // "upsert" belongs to the update statement, next to "q" and "u"
        "upsert": true
    };
    batch.push(updoc);
    // try to do sensible under 16MB updates, number may vary
    if ( ( batch.length % 500 ) == 0 ) {
        db.runCommand({
            "update": "newcollection",
            "updates": batch
        });
        batch = []; // reset the content
    }
}
// flush the remaining statements, if any
if ( batch.length > 0 ) {
    db.runCommand({
        "update": "newcollection",
        "updates": batch
    });
}
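The same batching idea can be sketched in Python, which the question mentions as an option. This is only an illustration of the loop above, not a tested client for MongoHQ: the generator name `batched_update_commands` is hypothetical, the aggregated fields are written with `$set`, and the extra fields are supplied through `$setOnInsert` so that they are written only when a document is first created. Empty operators are left out, since the server may reject an empty `$set` or `$setOnInsert`.

```python
def batched_update_commands(docs, collection, extra_defaults, batch_size=500):
    """Yield `update` commands holding at most `batch_size` upsert statements."""
    batch = []
    for doc in docs:
        fields = {k: v for k, v in doc.items() if k != "_id"}
        update = {}
        if fields:
            update["$set"] = fields  # recomputed on every run
        if extra_defaults:
            update["$setOnInsert"] = extra_defaults  # only on first insert
        batch.append({"q": {"_id": doc["_id"]}, "u": update, "upsert": True})
        if len(batch) == batch_size:
            yield {"update": collection, "updates": batch}
            batch = []
    if batch:  # flush the remainder, never send an empty command
        yield {"update": collection, "updates": batch}


# Stand-in for the aggregation cursor: 1200 already-aggregated documents
docs = [{"_id": i, "total": i * 2} for i in range(1200)]
commands = list(batched_update_commands(docs, "newcollection", {"foo": "bar"}))
print([len(c["updates"]) for c in commands])

# Assumption: `db` is a pymongo Database.
# for command in commands:
#     db.command(command)
```

Because existing documents are matched by _id and only modified with `$set`, a foo field added by hand earlier is left untouched on later runs.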
And of course, though there will be many naysayers (and not without reason, because you really need to weigh up the consequences, which are very real), you can always wrap what is essentially a JavaScript call with db.eval() in order to get full server-side execution.
But where possible (that is, unless you have a completely remote database solution), it is generally advised to take the "client/server" option but keep the process as "close" (in networking terms) to the server as possible.
Unlike mapReduce, the $out operator in the aggregation framework seems to have a very specific set of pre-defined behaviours ( http://docs.mongodb.org/manual/reference/operator/aggregation/out/#behaviors ). However, it does seem that the $out option could change: I did not find a JIRA ticket relating to this specific case, but others have posted change requests ( https://jira.mongodb.org/browse/SERVER-13201 ).
As for solving your problem now, you are either forced to revert to mapReduce (I don't know the scenario this is being run from) or to aggregate in a way that allows you to feed in both the new data and the old data you need.
The most common way of achieving this might be to update the original rows with the new data, maybe by aggregating the original row back down to itself.
Thanks for all your messages.
As I do not want to use a cursor (too many requests), I tried to get the job done by combining 2 mapReduce jobs and one aggregation. It is quite 'fat', but it works and could give others some ideas.
Of course, I would be very pleased to hear about other great alternatives from you.
So, I have a collection c1 which is the result of a previous mapReduce job, as you can see from the value object.
c1 : { _id:'xxxx', value:{ language:'...', keyword: '...', params: '...', field1: val1, field2: val2}}
The unique ID key xxxx is the concatenation of value.language, value.keyword and value.params, as follows:
*xxxx = _*
I've got another collection c2 : { _id : ObjectID, language:'...', keyword:'...', field1: val1, field2: val2, labels: 'yyyyy'} which is more or less a projection of the c1 collection, but with an extra field labels, a string of comma-separated labels. This c2 collection is a central repository for all combinations of language and keyword with their attached field values.
Target
The target is to group all records from the c1 collection by the group key language_keyword (as built in the map functions below), make some calculations on the other fields, and store the result in the c2 collection while keeping the old 'labels' field from c2 for the same key. So field1 and field2 of the c2 collection will be recalculated each time we launch the whole batch, but the labels field will stay unchanged.
As described in my first message, you cannot reach this target with aggregation or mapReduce jobs alone, as the 'labels' field would be removed.
As I do not want to use cursors and other foreach loops, which consume a lot of network and database requests (I have a big collection and I use a MongoHQ service), I tried to solve the problem by using mapReduce and aggregation jobs.
1st Phase
So, firstly I run a mapReduce job (m1) which makes a sort of copy of the c2 collection but resets the values of field1 and field2 to 0. The result will be stored in a c3 collection.
function m1Map(){
    // c2 documents are flat (no 'value' wrapper), so read the fields directly
    var language = this['language'];
    var keyword = this['keyword'];
    var labels = this['labels'];
    var key = language + '_' + keyword;
    emit(key,{'language':language,'keyword':keyword,'field1': 0, 'field2': 0.0, 'labels' : labels});
}
function m1Reduce(key,values){
    var language = values[0]['language'];
    var keyword = values[0]['keyword'];
    var labels = values[0]['labels'];
    return {'language':language,'keyword':keyword,'field1': 0, 'field2': 0.0, 'labels' : labels};
}
So now, c3 is a copy of the c2 collection with field1 and field2 set to 0. Here is the shape of this collection:
c3 : { _id:'', value:{ language:'...', keyword: '...', field1: 0, field2: 0.0, labels: '...'}}
2nd Phase
In a second step I run a mapReduce job (m2) which groups the c1 collection values by the key language_keyword and projects an extra field 'labels' with a fixed value, 'x' in my example. This 'x' value is never used in the c2 collection; it is a special marker value. The output of this m2 mapReduce job is stored in the same c3 collection as before, using the 'reduce' option in the out directive. The Python script is described further below.
function m2Map(){
language = this['value']['language'];
keyword = this['value']['keyword'];
field1 = this['value']['field1'];
field2 = this['value']['field2'];
key = language + '_' + keyword;
emit(key,{'language':language,'keyword':keyword,'field1': field1, 'field2': field2, 'labels' : 'x'});
}
Then I make some calculations in the reduce function:
function m2Reduce(key,values){
    // Init
    var language = values[0]['language'];
    var keyword = values[0]['keyword'];
    var field1 = 0;
    var field2 = 0;
    var labels = '';
    var bLabel = 0;
    for (var i = 0; i < values.length; i++){
        if (values[i]['labels'] == 'x') {
            // We know these emitted values come from the map and not from a previous value in the c2 collection
            // 'x' is never used in the c2 collection
            field1 += parseInt(values[i]['field1']);
            field2 += parseFloat(values[i]['field2']);
        } else {
            // these values are from the c2 collection
            if (bLabel == 0) {
                // we keep the former value of the 'labels' field
                labels = values[i]['labels'];
                bLabel = 1;
            } else {
                // we concatenate the 'labels' field if we have 2 records, but theoretically this is impossible as c2 has only one record per unique key
                // anyway, a good check afterwards :-)
                labels += ','+values[i]['labels'];
            }
        }
    }
    if (bLabel == 0) {
        // if the values only come from the map emit, we force the 'x' value for labels again, in case these values are re-used in another reduce call
        labels = 'x';
    }
    return {'language':language,'keyword':keyword, 'field1': field1, 'field2': field2, 'labels' : labels};
}
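To make the marker trick easier to follow, here is a small Python mirror of the m2Reduce logic that can be run without a database. It is only a simulation of the reduce step under the document shapes described above; the function name `m2_reduce` and the sample values (language 'en', keyword 'mongo', labels 'db,nosql') are made up for the illustration.

```python
def m2_reduce(values):
    """Sum fields from fresh emits (labels == 'x'), keep labels from c2 copies."""
    field1, field2 = 0, 0.0
    kept_labels = []
    for value in values:
        if value["labels"] == "x":
            # fresh value emitted by m2Map: contributes to the sums
            field1 += int(value["field1"])
            field2 += float(value["field2"])
        else:
            # value copied from c2 in phase 1: contributes only its labels
            kept_labels.append(value["labels"])
    return {
        "language": values[0]["language"],
        "keyword": values[0]["keyword"],
        "field1": field1,
        "field2": field2,
        "labels": ",".join(kept_labels) if kept_labels else "x",
    }


base = {"language": "en", "keyword": "mongo"}
# Document already in c3 after phase 1 (fields reset, labels preserved)
existing = dict(base, field1=0, field2=0.0, labels="db,nosql")
# Two values emitted by m2Map for the same key
emits = [
    dict(base, field1=2, field2=0.5, labels="x"),
    dict(base, field1=3, field2=1.25, labels="x"),
]

fresh = m2_reduce(emits)               # reduce of the new emits only
merged = m2_reduce([existing, fresh])  # what the 'reduce' out mode combines
print(merged)
```

A value that already carries real labels contributes nothing to the sums, which is why phase 1 resets field1 and field2 to 0 before the second job runs.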
The Python mapreduce script which calls the two m1 & m2 mapreduce jobs
(see pymongo for import : http://api.mongodb.org/python/2.7rc0/installation.html)
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from pymongo import MongoClient
from pymongo import MongoReplicaSetClient
from bson.code import Code
from bson.son import SON
# MongoHQ
uri = 'mongodb://user:passwd@url_node1:port,url_node2:port/mydb'
client = MongoReplicaSetClient(uri,replicaSet='set-xxxxxxx')
db = client.mydb
coll1 = db.c1
coll2 = db.c2
#Load map and reduce functions
m1_map = Code(open('m1Map.js','r').read())
m1_reduce = Code(open('m1Reduce.js','r').read())
m2_map = Code(open('m2Map.js','r').read())
m2_reduce = Code(open('m2Reduce.js','r').read())
#Run the map-reduce queries
results = coll2.map_reduce(m1_map,m1_reduce,"c3",query={})
results = coll1.map_reduce(m2_map,m2_reduce,out=SON([("reduce", "c3")]),query={})
3rd Phase
At this point, we have a complete c3 collection with all the computed field1 and field2 values and the labels preserved. So now, we have to run a last aggregation pipeline to copy the c3 content (in mapReduce form, with a compound value object) to a more classical collection c2 with flattened fields and no value object.
db.c3.aggregate([{$project : { _id: 0, keyword: '$value.keyword', language: '$value.language', field1: '$value.field1', field2 : '$value.field2', labels : '$value.labels'}},{$out:'c2'}])
Et voilà! The target is reached. This solution is quite long, with 2 mapReduce jobs and one aggregation pipeline, but it is an alternative for those who do not want to use request-heavy cursors or an external loop.
Thanks.

AREL: writing complex update statements with from clause

I tried looking for an example of using Arel::UpdateManager to form an update statement with a from clause (as in UPDATE t SET t.itty = "b" FROM .... WHERE ...), but couldn't find any. As far as I can see, Arel::UpdateManager sets the main engine on initialization and allows setting the various fields and values to update. Is there actually a way to do this?
As an aside, I would also like to find out how to express Postgres POSIX regex matching in Arel, but this might be impossible for now.
As far as I can see, the current version of the arel gem does not support the FROM keyword in an UPDATE query. You can generate a query using the SET and WHERE keywords only, like:
UPDATE t SET t.itty = "b" WHERE ...
and the code which copies a value from field2 to field1 for the units table would look like:
relation = Unit.all
um = Arel::UpdateManager.new(relation.engine)
um.table(relation.table)
um.ast.wheres = relation.wheres.to_a
um.set(Arel::Nodes::SqlLiteral.new('field1 = "field2"'))
ActiveRecord::Base.connection.execute(um.to_sql)
Actually, you can use an additional method to update a relation. We create Arel's UpdateManager and assign it the table, the where clause, and the values to set. The values shall be passed to the method as an argument. Then we need to add the FROM keyword to the generated SQL; we add it only if the relation accesses tables other than the one named in the UPDATE clause itself. Finally, we execute the query. So we get:
def update_relation!(relation, values)
  um = Arel::UpdateManager.new(relation.engine)
  um.table(relation.table)
  um.ast.wheres = relation.wheres.to_a
  um.set(values)
  sql = um.to_sql
  # appends FROM field to the query if needed
  m = sql.match(/WHERE/)
  tables = relation.arel.source.to_a.select {|v| v.class == Arel::Table }.map(&:name).uniq
  tables.shift
  sql.insert(m.begin(0), "FROM #{tables.join(",")} ") if m && !tables.empty?
  # executes the query
  ActiveRecord::Base.connection.execute(sql)
end
Then you can issue the relation update as:
values = Arel::Nodes::SqlLiteral.new('field1 = "field2", field2 = NULL')
relation = Unit.not_rejected.where(Unit.arel_table[:field2].not_eq(nil))
update_relation!(relation, values)

Getting a MongoDB document's field value into a variable

I am using the mongo shell and want to do what is basically equivalent to SQL's "select col INTO var", and then use the value of var to look up other rows in the same table or in others (joins). For example, in PL/SQL I would declare a variable called V_Dno. I also have a table called Emp(EID, Name, Sal, Dno). I can get the value of Dno for employee 100 with "Select Dno into V_Dno from Emp where EID = 100". In MongoDB, when I find the needed employee (using its _id), I end up with a document and not a value (a field). In a sense, I get the equivalent of the entire row in SQL and not just a column. I am doing the following to find the given emp:
var V_Dno = db.emp.find({Eid : 100}, {Dno : 1});
The reason I want to do this is to traverse from one document to another using the value of a field. I know I can do it using DBRef, but I wanted to see if I could tie documents together using this method.
Can someone please shed some light on this?
Thanks.
find returns a cursor that lets you iterate over the matching documents. In this case you'd want to use findOne instead as it directly returns the first matching doc, and then use dot notation to access the single field.
var V_Dno = db.emp.findOne({Eid : 100}, {Dno : 1}).Dno;
Using your query as a starting point:
var vdno = db.emp.findOne({Eid: 100, Dno :1})
This returns a document from the emp collection where Eid = 100 and Dno = 1. Now that I have this document in the vdno variable, I can "join" it to another collection. Let's say you have a department collection in which each document has a manual reference to the _id field in the emp collection. You can use the following to filter results from the department collection based on the value in your variable.
db.department.find({"employee._id":vdno._id})
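One detail worth keeping in mind with the findOne approach is that it returns null when nothing matches, so reading a field straight off the result can fail. The sketch below illustrates the "fetch one document, pull out a field, use it for a second lookup" flow in Python with in-memory lists standing in for the collections. The data, the `find_one` helper and the assumption that a department's _id equals the employee's Dno are all hypothetical.

```python
# In-memory stand-ins for the emp and department collections (made-up data)
emp = [
    {"_id": 1, "Eid": 100, "Name": "Ann", "Dno": 10},
    {"_id": 2, "Eid": 101, "Name": "Bob", "Dno": 20},
]
department = [
    {"_id": 10, "name": "Research"},
    {"_id": 20, "name": "Sales"},
]


def find_one(collection, query):
    """Return the first document matching every key/value in `query`, or None."""
    for doc in collection:
        if all(doc.get(key) == value for key, value in query.items()):
            return doc
    return None


employee = find_one(emp, {"Eid": 100})
# Guard against "no match" before reading the field
v_dno = employee["Dno"] if employee is not None else None
# Use the extracted value for the second lookup (the manual "join")
dept = find_one(department, {"_id": v_dno})
print(v_dno, dept)
```

With pymongo the shape is the same: `find_one` returns a dict or None, and the field is read with normal dictionary access.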