Collection modified within cursor.forEach() is removed after completion - mongodb

I'm trying to iterate through a collection to build a new collection (hits_COL) with counts of entries from the first collection. The code I've written so far appears to work while the iteration is happening; however, once the .forEach() method finishes, the new collection (hits_COL) gets removed.
RAW_COL.find({}, {fields: {created_time: 1}}).forEach(function (doc) {
    var date = moment.unix(doc.created_time).format("YYYYMMDD");
    var hitCOUNT = hits_COL.findOne({'_id': date});
    try {
        if (hitCOUNT === undefined) {
            // No counter document for this day yet: create one.
            hits_COL.insert({'_id': date, 'hits': 1}, function (err, id) {
                if (err == null) console.log("Entry " + id + " was created.");
                else console.log(err);
            });
        } else {
            // Counter exists: increment it.
            hits_COL.update({'_id': date}, {$set: {'hits': hitCOUNT.hits + 1}});
        }
    } catch (err) { throw err; }
});
While RAW_COL is iterating, I can check the collection's current entries and all is well.
meteor:PRIMARY> db.hits.find()
{ "_id" : "20160121", "hits" : 7887 }
{ "_id" : "20160120", "hits" : 7417 }
{ "_id" : "20160122", "hits" : 7533 }
{ "_id" : "20160124", "hits" : 8047 }
{ "_id" : "20160123", "hits" : 8262 }
{ "_id" : "20160125", "hits" : 7579 }
{ "_id" : "20160126", "hits" : 2111 }
{ "_id" : "20160119", "hits" : 7594 }
{ "_id" : "20160118", "hits" : 7788 }
{ "_id" : "20160117", "hits" : 7746 }
{ "_id" : "20160116", "hits" : 7609 }
{ "_id" : "20160115", "hits" : 3348 }
However, after the forEach() call finishes, the collection appears to be emptied, and the same mongo query returns nothing.
meteor:PRIMARY> db.hits.find()
What am I missing here?
Thanks for any and all help!

The above code was preceded in the file by
Meteor.startup(function () { hits_COL.remove({}); });
whose callback is executed after the forEach() call completes, wiping the freshly built collection.
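
A minimal sketch of the fix (assuming the import is also meant to run at startup): do the remove() first, inside the same startup callback, so the clear is guaranteed to happen before the rebuild rather than at some later point:

Meteor.startup(function () {
    hits_COL.remove({});  // clear old counts first...
    RAW_COL.find({}, {fields: {created_time: 1}}).forEach(function (doc) {
        // ...then rebuild the per-day counts exactly as above.
    });
});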

Related

PyMongo bulk_write UpdateOne only runs last operation

Got a weird bug that I can't quite figure out.
I have some pymongo code that looks like this:
import pymongo

docdb_client = pymongo.MongoClient()
...

def update_image_locations(user_key, dataset_key, preset_name,
                           keys_and_coords):
    db = docdb_client.db
    col = db.col
    operations = []
    query = {'ownerKey': user_key, 'imageInfo.datasetKey': dataset_key}
    for key_and_coords in keys_and_coords:
        query['key'] = key_and_coords['key']
        operations.append(
            pymongo.UpdateOne(
                query, {
                    '$set': {
                        'imageInfo.presets.%s.coords' % preset_name:
                            key_and_coords['coords']
                    }
                }))
    print(operations)
    if len(operations) > 0:
        print(col.bulk_write(operations, ordered=False).bulk_api_result)

    # This section fails with a KeyError.
    cursor = col.find({
        'ownerKey': user_key,
        'imageInfo.datasetKey': dataset_key
    }, {'imageInfo': 1})
    for doc in cursor:
        print(doc['imageInfo']['presets'])
If I print out the bulk_write output, I get the following.
{'writeErrors': [], 'writeConcernErrors': [], 'nInserted': 0, 'nUpserted': 0, 'nMatched': 65, 'nModified': 65, 'nRemoved': 0, 'upserted': []}
which as far as I can tell is exactly what I expect.
However, when I iterate through the documents that should now have the new field, I get KeyError failures for all but the last document. If I then go into the actual mongodb shell, I can confirm that only the last operation from the bulk_write actually took effect.
Based on the bulk_api_result, I would expect all of the documents to be updated, not only the last one. What's going on?
EDIT:
As requested, before and after queries. I'm not showing the full docs because there's a lot of vector-embedding info that would muddle things.
Query:
> db.user_uploads.find({}, {'imageInfo.presets': 1})
Before:
{ "_id" : ObjectId("6074792104cc23375a8f979a"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f979b"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f979c"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f979d"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f979e"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f979f"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f97a0"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f97a1"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f97a2"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f97a3"), "imageInfo" : { } }
After:
{ "_id" : ObjectId("6074792104cc23375a8f979a"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f979b"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f979c"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f979d"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f979e"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f979f"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f97a0"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f97a1"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f97a2"), "imageInfo" : { } }
{ "_id" : ObjectId("6074792104cc23375a8f97a3"), "imageInfo" : { "presets" : { "preset_one" : { "coords" : [ 2.229365348815918, 1.4654869735240936 ] } } } }
Turns out the answer has to do with how the query is constructed. Specifically, this works:
for key_and_coords in keys_and_coords:
    query = {'key': key_and_coords['key']}
    operations.append(
        pymongo.UpdateOne(
            query, {
                '$set': {
                    'imageInfo.presets.%s.coords' % preset_name:
                        key_and_coords['coords']
                }
            }))
and this fails:
query = {}
for key_and_coords in keys_and_coords:
    query['key'] = key_and_coords['key']
    operations.append(
        pymongo.UpdateOne(
            query, {
                '$set': {
                    'imageInfo.presets.%s.coords' % preset_name:
                        key_and_coords['coords']
                }
            }))
What's happening here is that each UpdateOne stores a reference to the query dict rather than a copy, and the operations are only executed later, when bulk_write runs. Since every operation shares the same dict, the 'key' value gets overwritten on each iteration, so by execution time all of the operations hold the last key (which is also why only the last document is updated). Building a fresh dict per iteration, or passing query.copy(), avoids this. Unfortunately it was tough to catch, because printing the queries and the operations while building them looked fine; the sharing only mattered at execution time. Still, not really an issue with pymongo after all.
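The aliasing is easy to reproduce in a few lines (shown here in mongo-shell JavaScript, since objects alias exactly like Python dicts; the names are illustrative):

var ops = [];
var query = {};
['a', 'b', 'c'].forEach(function (k) {
    query.key = k;    // mutates the single shared object
    ops.push(query);  // stores a reference, not a copy
});
printjson(ops);       // prints three copies of { "key" : "c" }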
Thanks to everyone who responded!

How to set value from different field or collection from specific document in Mongo shell

I have a collection "ttn_data" with this doc:
{
"_id" : ObjectId("Some_different_ID"),
"dev_id" : "e0e1e20102030405",
"payload_fields" : {"temp_C" : 28.308}
}
and a collection "records" with this doc:
{
"_id" : ObjectId("5ed8af72c377d5b209597981"),
"temp_C_different" : ""
}
I would like to set temp_C_different to the value of temp_C from the ttn_data collection, so that the document after the update would be:
{
"_id" : ObjectId("5ed8af72c377d5b209597981"),
"temp_C_different" : "28.308"
}
I tried this method:
try {
    db.records.updateMany(
        { "_id" : ObjectId("5ed8af72c377d5b209597981") },
        { $set: { "temp_C_different" : db.ttn_data.temp_C.value } }
    );
} catch (e) { print(e); }
but it sets the "temp_C_different" value to some metadata info from the database. What is the right way to do that kind of update?
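
For what it's worth, db.ttn_data.temp_C in the shell is not a field lookup; it resolves to a sub-collection object named "ttn_data.temp_C", which is why collection metadata ends up in the field. A minimal sketch of one way to do it (assuming the source doc can be located by its dev_id): read the value first, then use it in $set.

// Fetch the source value, then write it (names taken from the question).
var src = db.ttn_data.findOne({ "dev_id" : "e0e1e20102030405" });
db.records.updateMany(
    { "_id" : ObjectId("5ed8af72c377d5b209597981") },
    { $set: { "temp_C_different" : String(src.payload_fields.temp_C) } }
);

String() matches the quoted "28.308" in the desired output; drop it if a numeric value is wanted.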

Remove by _id inside a nested array, inside of a collection

This is my MongoDB footballers collection:
[
    {
        "_id" : ObjectId("5d83b4a7e5511f28847f1884"),
        "prenom" : "djalil",
        "pseudo" : "dja1000",
        "email" : "djalil@gmail.com",
        "selectionned" : [
            {
                "_id" : "5d83af3be5511f28847f187f",
                "role" : "footballeur",
                "prenom" : "Gilbert",
                "pseudo" : "Gilbert"
            },
            {
                "_id" : "5d83b3d5e5511f28847f1883",
                "role" : "footballeur",
                "prenom" : "Xavier",
                "pseudo" : "xav4544"
            }
        ]
    },
    {
        "_id" : ObjectId("5d83afa8e5511f28847f1880"),
        "prenom" : "Rolande",
        "pseudo" : "Rolande4000",
        "email" : "rolande@gmail.com",
        "selectionned" : [
            {
                "_id" : "5d83b3d5e5511f28847f1883",
                "role" : "footballeur",
                "prenom" : "Xavier",
                "pseudo" : "xav4544"
            }
        ]
    }
]
How could I delete every selectionned entry that has the _id 5d83b3d5e5511f28847f1883, across the whole collection?
I need Xavier to disappear from every 'selectionned' array, just like a 'delete cascade' in SQL.
This is what I've tried, with no luck:
function delete_fb_from_all(fb) {
    var ObjectId = require('mongodb').ObjectID; // working
    var idObj = ObjectId(fb._id);               // working
    try {
        db.collection('footballers').remove( { "selectionned._id" : idObj } );
        console.log('All have been erased');
    } catch (e) {
        console.log(e);
    }
}
And this too is not working :
db.collection('footballers.selectionned').remove( { "_id" : idObj } );
I really don't know how to do this.
I'm trying out this right now:
db.collection.update({'footballers.selectionned': idObj }, {$pull: {footballers:{ selectionned: idObj}}})
This is the error :
TypeError: db.collection.update is not a function
I think that the solution is maybe there :
https://docs.mongodb.com/manual/reference/operator/update/pull/#pull-array-of-documents
EDIT 1
I'm currently trying out this:
var ObjectId = require('mongodb').ObjectID; // working
var idObj = ObjectId(fb._id);               // working
try {
    db.collection('footballers').update(
        { },
        { $pull: { selectionned: { _id: idObj } } },
        { multi: true }
    );
} catch (e) {
    console.log(e);
}
SOLVED:
Specifying the email, it is now working; I guess the problem was coming from the _id field:
try {
    db.collection('footballers').update(
        { },
        { $pull: { selectionned: { email: fb.email } } },
        { multi: true }
    );
} catch (e) {
    console.log(e);
}
Object ID:
The issue may be with your ObjectId creation. The embedded _id values are stored as plain strings, so there is no need to wrap them in a MongoDB ObjectId:
// No need for this:
var ObjectId = require('mongodb').ObjectID;
var idObj = ObjectId(fb._id);
// Query with the plain string instead:
db.collection('footballers').remove( { "selectionned._id" : fb._id } );
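
Putting the two points together, a sketch of the cascade-style cleanup: the multi-document $pull from EDIT 1, but matching the embedded _id as the plain string it is stored as.

// Remove the entry from every footballer's selectionned array.
db.collection('footballers').update(
    { },
    { $pull: { selectionned: { _id: fb._id } } }, // no ObjectId() wrapper
    { multi: true }
);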

MongoDB MapReduce producing different results for each document

This is a follow-up from this question, where I tried to solve this problem with the aggregation framework. Unfortunately, I have to wait before being able to update this particular mongodb installation to a version that includes the aggregation framework, so have had to use MapReduce for this fairly simple pivot operation.
I have input data in the format below, with multiple daily dumps:
"_id" : "daily_dump_2013-05-23",
"authors_who_sold_books" : [
{
"id" : "Charles Dickens",
"original_stock" : 253,
"customers" : [
{
"time_bought" : 1368627290,
"customer_id" : 9715923
}
]
},
{
"id" : "JRR Tolkien",
"original_stock" : 24,
"customers" : [
{
"date_bought" : 1368540890,
"customer_id" : 9872345
},
{
"date_bought" : 1368537290,
"customer_id" : 9163893
}
]
}
]
}
I'm after output in the following format, which aggregates all instances of each (unique) author across all daily dumps:
{
    "_id" : "Charles Dickens",
    "original_stock" : 253,
    "customers" : [
        {
            "date_bought" : 1368627290,
            "customer_id" : 9715923
        },
        {
            "date_bought" : 1368622358,
            "customer_id" : 9876234
        },
        etc...
    ]
}
I have written this map function...
function map() {
    for (var i in this.authors_who_sold_books) {
        var author = this.authors_who_sold_books[i];
        emit(author.id, {customers: author.customers, original_stock: author.original_stock, num_sold: 1});
    }
}
...and this reduce function.
function reduce(key, values) {
    var sum = 0;
    for (var i in values) {
        sum += values[i].customers.length;
    }
    return {num_sold : sum};
}
However, this gives me the following output:
{
    "_id" : "Charles Dickens",
    "value" : {
        "customers" : [
            {
                "date_bought" : 1368627290,
                "customer_id" : 9715923
            },
            {
                "date_bought" : 1368622358,
                "customer_id" : 9876234
            }
        ],
        "original_stock" : 253,
        "num_sold" : 1
    }
}
{ "_id" : "JRR Tolkien", "value" : { "num_sold" : 3 } }
{
    "_id" : "JK Rowling",
    "value" : {
        "customers" : [
            {
                "date_bought" : 1368627290,
                "customer_id" : 9715923
            },
            {
                "date_bought" : 1368622358,
                "customer_id" : 9876234
            }
        ],
        "original_stock" : 183,
        "num_sold" : 1
    }
}
{ "_id" : "John Grisham", "value" : { "num_sold" : 2 } }
The even-indexed documents have customers and original_stock listed, but an incorrect num_sold sum.
The odd-indexed documents only have num_sold listed, but the number is correct.
Could anyone tell me what I'm missing, please?
Your problem is due to the fact that the output format of the reduce function must be identical to the format of the values emitted by the map function, because reduce can be re-invoked on its own partial results (see the requirements for the reduce function for an explanation).
You need to change the code to something like the following to fix the problem:
function map() {
    for (var i in this.authors_who_sold_books) {
        var author = this.authors_who_sold_books[i];
        emit(author.id, {
            customers: author.customers,
            original_stock: author.original_stock,
            num_sold: author.customers.length
        });
    }
}

function reduce(key, values) {
    // Return the same shape the map function emits, so that re-reducing
    // partial results keeps working.
    var result = {
        customers: [],
        num_sold: 0,
        original_stock: (values.length ? values[0].original_stock : 0)
    };
    for (var i in values) {
        result.num_sold += values[i].num_sold;
        result.customers = result.customers.concat(values[i].customers);
    }
    return result;
}
I hope that helps.
Note: see the change to num_sold: author.customers.length in the map function. I think that's what you want.
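
For completeness, a sketch of how these functions would be run (the collection names daily_dumps and authors_out are assumptions):

// Pivot all daily dumps into one output document per author.
db.daily_dumps.mapReduce(map, reduce, { out: "authors_out" });
db.authors_out.find();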

Mongodb Map/Reduce - Multiple Group By

I am trying to run a map/reduce function in mongodb where I group by 3 different fields contained in objects in my collection. I can get the map/reduce function to run, but all the emitted fields run together in the output collection. I'm not sure whether this is normal, but it makes exporting the data for analysis more cleanup work. Is there a way to separate them, then use mongoexport?
Let me show you what I mean:
The fields I am trying to group by are the day, user ID (or uid) and destination.
I run these functions:
map = function() {
    var day = (this.created_at.getFullYear() + "-" + (this.created_at.getMonth()+1) + "-" + this.created_at.getDate());
    emit({day: day, uid: this.uid, destination: this.destination}, {count: 1});
}

/* Reduce Function */
reduce = function(key, values) {
    var count = 0;
    values.forEach(function(v) {
        count += v['count'];
    });
    return {count: count};
}
/* Output Function */
db.events.mapReduce(map, reduce, {query: {destination: {$ne:null}}, out: "TMP"});
The output looks like this:
{ "_id" : { "day" : "2012-4-9", "uid" : "1234456", "destination" : "Home" }, "value" : { "count" : 1 } }
{ "_id" : { "day" : "2012-4-9", "uid" : "2345678", "destination" : "Home" }, "value" : { "count" : 1 } }
{ "_id" : { "day" : "2012-4-9", "uid" : "3456789", "destination" : "Login" }, "value" : { "count" : 1 } }
{ "_id" : { "day" : "2012-4-9", "uid" : "4567890", "destination" : "Contact" }, "value" : { "count" : 1 } }
{ "_id" : { "day" : "2012-4-9", "uid" : "5678901", "destination" : "Help" }, "value" : { "count" : 1 } }
When I attempt to use mongoexport, I cannot split day, uid, or destination into separate columns, because the map function combines those fields into the compound _id.
What I would like to have would look like this:
{ { "day" : "2012-4-9" }, { "uid" : "1234456" }, { "destination" : "Home"}, { "count" : 1 } }
Is this even possible?
As an aside, I was able to make the output usable by applying sed to the file and cleaning up the CSV. More work, but it worked. It would be ideal if I could get it out of mongodb in the correct format.
MapReduce only returns documents of the form { _id : some_id, value : some_value }.
See: How to change the structure of MongoDB's map-reduce results?
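
Given that constraint, one workaround is to flatten the TMP output into top-level fields before exporting (a sketch; TMP_flat is an assumed collection name):

// Copy each map-reduce result into a flat document for easy export.
db.TMP.find().forEach(function (doc) {
    db.TMP_flat.insert({
        day: doc._id.day,
        uid: doc._id.uid,
        destination: doc._id.destination,
        count: doc.value.count
    });
});

mongoexport's --fields option also accepts dotted paths (e.g. --type=csv --fields _id.day,_id.uid,_id.destination,value.count), which may let you export TMP directly without the extra pass.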