Group By key Meteor Collection - mongodb

Hello, I searched a lot before asking and still have not found a decent answer to my question.
I have a collection (it is copied from an MSSQL table every x seconds) like this: https://ekhmoi.tinytake.com/sf/MzU2MTcwXzIwNDcxNTg
As you can see, there are records that share the same key (MessageId).
My goal is to group them: take the MessageId plus the Message of every record that has the same MessageId, and finally insert the result into a new collection.
So the final result should look like this:
https://ekhmoi.tinytake.com/sf/MzU2MTc3XzIwNDcyMDY
Any idea how I can do this?

You can use aggregation to group your collection data and get your final result; the process is actually quite simple.
First, run meteor add meteorhacks:aggregate and meteor add mikowals:batch-insert if you have not yet added these two packages.
Assume CollectionA is the source collection and CollectionB is the target collection. Here is how I would group the data from CollectionA and write the final result to CollectionB:
let pipeline = [
  // keep only the fields needed for grouping
  {$project: {TraceId: 1, MessageId: 1, Message: 1}},
  // one output document per MessageId, collecting all of its Messages
  {$group: {
    _id: "$MessageId",
    Message: {$push: "$Message"},
    TraceId: {$first: "$TraceId"}
  }},
  // rename _id back to MessageId
  {$project: {
    _id: 0,
    MessageId: "$_id",
    Message: 1,
    TraceId: 1
  }}
];
let groupedData = CollectionA.aggregate(pipeline);
CollectionB.batchInsert(groupedData);
Note that this example only illustrates the idea, so it may not work if you copy and paste it directly into your code.
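To make the expected output shape concrete, here is a minimal in-memory sketch of what the $group stage with $push and $first does. It is plain JavaScript rather than the Meteor API, and groupByMessageId and the sample rows are hypothetical; the real values come from the MSSQL copy.

```javascript
// In-memory illustration of $group with $push and $first (not Meteor API).
function groupByMessageId(records) {
  const groups = new Map();
  for (const { MessageId, Message, TraceId } of records) {
    if (!groups.has(MessageId)) {
      // $first: keep the TraceId of the first record seen for this key
      groups.set(MessageId, { MessageId, TraceId, Message: [] });
    }
    // $push: append every Message that shares the key
    groups.get(MessageId).Message.push(Message);
  }
  return [...groups.values()];
}

// Hypothetical sample rows.
const sampleRows = [
  { TraceId: 1, MessageId: "A", Message: "first part" },
  { TraceId: 2, MessageId: "A", Message: "second part" },
  { TraceId: 3, MessageId: "B", Message: "only part" },
];
console.log(JSON.stringify(groupByMessageId(sampleRows)));
```

Each element of the returned array has the same fields as the documents that the pipeline above would hand to batchInsert.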


BSon Object Response Size problems [duplicate]

I'm trying to test MongoDB's performance (university research) by inserting and querying the same data (with the same queries) in different ways, but I'm having some trouble with the response size. I'll try to explain what I'm doing.
My original file has this format:
[{"field":"aaaa","field2":"bbbbb","field3":"12345"},{"field":"cccc","field2":"ddddd","field3":"12345"},{"field":"ffff","field2":"ggggg","field3":"12345"},{"field":"hhhhh","field2":"iiiii","field3":"12345"},{"field":"jjjj","field2":"kkkkk","field3":"12345"},{"field":"lllll","field2":"mmmmm","field3":"12345"}]
1° Approach - I insert the whole file as a single document in Mongo, but it is not accepted that way, so I have to wrap the file in an "Array" field, like this: {"Array":[{..},{..},{..},...]}. Once it is inserted, I query it with
db.collection.aggregate([
  { $match: {_cond_} },
  { $unwind: "$Array" },
  { $match: {_cond_} },
  { $group: {_id: null, count: {$sum: 1}, Array: {$push: "$Array"}} },
  { $project: {"Numero HIT": "$count", Array: 1} }
])
to retrieve the inner documents and count the number of hits (_cond_ is of course something like "Array.field": "aaaa" or "Array.field": /something to search/).
2° Approach - I insert each inner document by itself: I split the original file (it is ALL on one line) into an array, then I loop over it, inserting each element. Then I query it with:
db.collection2.find({field: "aaaa"}) (or field: /something to search/)
I'm using two different collections, one for each approach, each of about 207/208MB.
Everything seemed fine, but then, running a query with the 1° approach, I got this error:
BSONObj size: 24002272 (0x16E3EE0) is invalid. Size must be between 0 and 16793600(16MB)
I remembered that the response from a MongoDB query MUST be smaller than 16MB, OK, but how is it possible that approach 1 gives me the error while the SAME* query in approach 2 doesn't complain at all? And how do I fix it? I mean: OK, the response is >16MB, so how do I handle it? Is there no way to run this kind of query? I hope it is clear what I mean.
Thanks in advance
*by SAME I mean something like:
1° Approach:
db.collection.aggregate([
  { $match: {"Array.field": "aaa", "Array.field3": 12345} },
  { $unwind: "$Array" },
  { $match: {"Array.field": "aaa", "Array.field3": 12345} },
  { $group: {_id: null, count: {$sum: 1}, Array: {$push: "$Array"}} },
  { $project: {"Numero HIT": "$count", Array: 1} }
])
2° Approach:
db.collection2.find({field: "aaa", field3: 12345})
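One possible reading of the difference: the 16MB limit applies to each BSON document, and the $group stage with $push in approach 1 accumulates every hit into a single result document, while find in approach 2 returns many small documents through a cursor. A minimal sketch under that assumption follows. It counts the hits and returns the matched elements as separate documents instead of pushing them into one array. buildHitPipelines is a hypothetical helper name, and whether aggregate returns a cursor depends on the server and shell version.

```javascript
// Builds two pipelines for approach 1 that avoid one oversized result document.
// `cond` is the same condition object used in the original $match stages.
function buildHitPipelines(cond) {
  const matched = [
    { $match: cond },       // narrow to documents that contain a hit
    { $unwind: "$Array" },  // one document per array element
    { $match: cond },       // keep only the matching elements
  ];
  return {
    // yields one small document: { _id: null, "Numero HIT": <n> }
    count: [...matched, { $group: { _id: null, "Numero HIT": { $sum: 1 } } }],
    // yields one small document per hit instead of a single $push-ed array
    hits: [...matched, { $project: { _id: 0, Array: 1 } }],
  };
}

const pipelines = buildHitPipelines({ "Array.field": "aaa" });
// In the shell these would be passed to db.collection.aggregate(...),
// iterating the result of pipelines.hits document by document.
console.log(JSON.stringify(pipelines.count));
```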

Select the last document from mongo collection in meteor

I want the latest document in the query. Below I get the documents whose name comes from the variable 'personalFullName', sort them by a field called 'RecordID' (this field has higher numbers for later entries), and then grab the last one. I want the latest entry (the one with the largest RecordID number) in this query:
Programs.find({ FullName: personalFullName }, { sort: { RecordID: 1 }, limit: 1}).fetch().pop();
I'm getting an error saying that the call stack size is exceeded.
If you are comfortable using the meteorhacks:aggregate package, you could always publish the item(s) you want using the Mongo aggregation pipeline, perhaps something like this (the code is CoffeeScript):
Meteor.publish 'latestPrograms', (personalFullName) ->
  return unless personalFullName?
  check personalFullName, String
  pipeline = [
    {$match: {'FullName': personalFullName}}
    {$sort: {'RecordID': 1}}
    {$group: {'_id': {FullName: '$FullName'}, RecordID: {$last: '$RecordID'}}}
    {$limit: 1}
  ]
  # publish each aggregated result into the client-side 'latestPrograms' collection
  @added 'latestPrograms', Random.id(), item for item in Programs.aggregate pipeline
  @ready()
You can then grab the data by subscribing to the latestPrograms pseudo-collection. Here is an example using an iron router route:
Router.route '/home',
  name: 'home'
  template: 'homepage'
  waitOn: ->
    Meteor.subscribe 'latestPrograms', personalFullName
  data: ->
    {latestPrograms: latestPrograms.find()}
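For comparison, the query in the question sorts ascending and then limits to one, which yields the smallest RecordID rather than the largest. A descending sort with a limit of one would presumably be enough without aggregation, in Meteor something like Programs.findOne({FullName: personalFullName}, {sort: {RecordID: -1}}). The in-memory sketch below only illustrates that selection logic; latestByRecordId and the sample documents are hypothetical.

```javascript
// Picks the document with the largest RecordID for a given FullName,
// which is what sort: { RecordID: -1 } combined with limit: 1 selects.
function latestByRecordId(docs, fullName) {
  const matches = docs.filter((d) => d.FullName === fullName);
  if (matches.length === 0) return undefined;
  return matches.reduce((best, d) => (d.RecordID > best.RecordID ? d : best));
}

// Hypothetical sample documents.
const programs = [
  { FullName: "Jane Doe", RecordID: 3 },
  { FullName: "Jane Doe", RecordID: 7 },
  { FullName: "John Roe", RecordID: 9 },
];
console.log(latestByRecordId(programs, "Jane Doe"));
```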

Meteor collection get last document of each selection

Currently I use the following find query to get the latest document for a certain ID:
Conditions.find({
  caveId: caveId
}, {
  sort: {diveDate: -1},
  limit: 1,
  fields: {caveId: 1, "visibility.visibility": 1, diveDate: 1}
});
How can I do the same for multiple IDs, using $in for example?
I tried the following query. The problem is that it limits the result to 1 document across all the found caveIds, whereas the limit should apply to each caveId separately.
Conditions.find({
  caveId: {$in: caveIds}
}, {
  sort: {diveDate: -1},
  limit: 1,
  fields: {caveId: 1, "visibility.visibility": 1, diveDate: 1}
});
One solution I came up with is using the aggregate functionality:
var conditionIds = Conditions.aggregate([
  {"$match": {caveId: {"$in": caveIds}}},
  // $last depends on document order, so sort by diveDate first
  {"$sort": {diveDate: 1}},
  {"$group": {
    _id: "$caveId",
    conditionId: {"$last": "$_id"},
    diveDate: {"$last": "$diveDate"}
  }}
]).map(function (child) { return child.conditionId; });
var conditions = Conditions.find({
  _id: {$in: conditionIds}
}, {
  fields: {caveId: 1, "visibility.visibility": 1, diveDate: 1}
});
You don't want to use $in here, as noted. You could solve this problem by looping through the caveIds and running the query on each caveId individually.
You're basically looking at a join query here: you need all caveIds and then have to look up the last document for each.
In my opinion this is a database schema/denormalization problem (but this is only an opinion!):
You could, as mentioned here, look up all caveIds and then run the single query for each one, every single time you need to look up the last dives.
However, I think you are much better off recording/updating the last dive inside your cave document, and then looking up all caveIds of interest, pulling only the lastDive field.
That will give you immediately what you need, rather than going through expensive search/sort queries. This comes at the expense of maintaining that field in the document, but it sounds like it should be fairly trivial, as you only need to update the one field when a new event occurs.
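The "last document per caveId" selection that all of these approaches aim at can be illustrated in memory. This is a sketch in plain JavaScript, not Meteor API; latestPerCave and the sample records are hypothetical.

```javascript
// Keeps, for every caveId, the condition record with the most recent diveDate.
function latestPerCave(conditions) {
  const latest = new Map();
  for (const c of conditions) {
    const seen = latest.get(c.caveId);
    if (!seen || c.diveDate > seen.diveDate) latest.set(c.caveId, c);
  }
  return [...latest.values()];
}

// Hypothetical sample records.
const sampleConditions = [
  { _id: "c1", caveId: "cave-1", diveDate: new Date("2015-01-10") },
  { _id: "c2", caveId: "cave-1", diveDate: new Date("2015-03-02") },
  { _id: "c3", caveId: "cave-2", diveDate: new Date("2015-02-14") },
];
console.log(latestPerCave(sampleConditions).map((c) => c._id));
```

With the denormalized schema suggested above, this selection would happen once at write time, when the last dive field of the cave document is updated.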

MongoDB nested Array find and projection

I'm currently working with MongoDB (mongoose). One of my example documents is:
myuser {
  _id: 7777...,
  money: 1000,
  ships: [{
    _id: 7777...,
    name: "myshipname",
    products: [{
      product_id: 7777....,
      quantity: 24
    }]
  }]
}
What I aim to do is to get a certain product given a user id, ship id and product id, with the result being something like: { product_id: 777.., quantity: 24 }
So far I have managed to find a certain ship of a user with:
findOne({_id: userId}, {ships: {$elemMatch: {_id: shipId}}})
This returns the information of the ship with shipId inside the ships array of the user with userId. However, I cannot find a way to get only a certain product from that ship.
What you want can probably best be done using the aggregation framework. Something like:
db.users.aggregate([
  {$match: { _id : <user>}},
  {$unwind: "$ships"},
  {$unwind: "$ships.products"},
  {$match: { "ships._id": <ship>}},
  {$match: { "ships.products.product_id": <product>}}
]);
Note I'm not on a computer with mongo right now, so my syntax might be a bit off.
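What the two $unwind stages and the following $match stages select can be sketched in memory. This is plain JavaScript rather than the mongo API; findProduct and the sample ids are hypothetical.

```javascript
// Walks user -> ships -> products and returns the first product that
// matches both the ship id and the product id, or null if there is none.
function findProduct(user, shipId, productId) {
  for (const ship of user.ships) {           // $unwind: "$ships"
    if (ship._id !== shipId) continue;       // $match on "ships._id"
    for (const product of ship.products) {   // $unwind: "$ships.products"
      if (product.product_id === productId) return product; // final $match
    }
  }
  return null;
}

// Hypothetical sample document.
const sampleUser = {
  _id: "u1",
  money: 1000,
  ships: [
    { _id: "s1", name: "myshipname", products: [{ product_id: "p1", quantity: 24 }] },
  ],
};
console.log(findProduct(sampleUser, "s1", "p1"));
```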

Meteor Collection: find element in array

I have no experience with NoSQL, so I think that if I just ask about the code, my question may be wrong. Instead, let me explain my problem.
Suppose I have an e-store. I have catalogs
Catalogs = new Mongo.Collection('catalogs');
and products in those catalogs
Products = new Mongo.Collection('products');
Then people add their orders to a temporary collection
Order = new Mongo.Collection();
Then people submit their comments, phone, etc. and the order. I save it to the Operations collection:
Operations.insert({
  phone: "phone",
  comment: "comment",
  etc: "etc",
  savedOrder: Order //<- Array, right? Or would an Object be better?
});
Nice, but now I want to get stats for every product: in which Operations has the product been used? How can I search through my Operations and find every operation with that product?
Or is this approach bad? How do real pros do this in the real world?
If I understand correctly, here are sample documents as stored in your Operations collection:
{
  clientRef: "john-001",
  phone: "12345678",
  other: "etc.",
  savedOrder: {
    "someMetadataAboutOrder": "...",
    "lines" : [
      { qty: 1, itemRef: "XYZ001", unitPriceInCts: 1050, desc: "USB Pen Drive 8G" },
      { qty: 1, itemRef: "ABC002", unitPriceInCts: 19995, desc: "Entry level motherboard" },
    ]
  }
},
{
  clientRef: "paul-002",
  phone: null,
  other: "etc.",
  savedOrder: {
    "someMetadataAboutOrder": "...",
    "lines" : [
      { qty: 3, itemRef: "XYZ001", unitPriceInCts: 950, desc: "USB Pen Drive 8G" },
    ]
  }
},
Given that, to find all operations having item reference XYZ001 you simply have to query:
> db.operations.find({"savedOrder.lines.itemRef":"XYZ001"})
This will return the whole document. If instead you are only interested in the client reference (and the operation _id), you can pass a projection as an extra argument to find:
> db.operations.find({"savedOrder.lines.itemRef":"XYZ001"}, {"clientRef": 1})
{ "_id" : ObjectId("556f07b5d5f2fb3f94b8c179"), "clientRef" : "john-001" }
{ "_id" : ObjectId("556f07b5d5f2fb3f94b8c17a"), "clientRef" : "paul-002" }
If you need to perform multi-document (incl. multi-embedded-document) operations, you should take a look at the aggregation framework.
For example, to calculate the total of an order:
> db.operations.aggregate([
  {$match: { "_id" : ObjectId("556f07b5d5f2fb3f94b8c179") }},
  {$unwind: "$savedOrder.lines" },
  {$group: { _id: "$_id",
             total: {$sum: {$multiply: ["$savedOrder.lines.qty",
                                        "$savedOrder.lines.unitPriceInCts"]}}
  }}
])
{ "_id" : ObjectId("556f07b5d5f2fb3f94b8c179"), "total" : 21045 }
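The total above can be checked by hand: the $unwind stage yields one document per order line, and $group sums qty times unitPriceInCts over them. A minimal in-memory sketch of the same arithmetic follows; orderTotal is a hypothetical helper and the lines are copied from the first sample operation.

```javascript
// Order lines of the first sample operation (john-001).
const orderLines = [
  { qty: 1, itemRef: "XYZ001", unitPriceInCts: 1050 },
  { qty: 1, itemRef: "ABC002", unitPriceInCts: 19995 },
];

// Equivalent of $unwind followed by $group with $sum of $multiply.
function orderTotal(lines) {
  return lines.reduce((sum, l) => sum + l.qty * l.unitPriceInCts, 0);
}

console.log(orderTotal(orderLines)); // 1 * 1050 + 1 * 19995 = 21045
```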
I'm an eternal newbie, but since no answer has been posted, I'll give it a try.
First, start by installing Robomongo or similar software; it will let you look at your collections directly in MongoDB (btw, with Meteor in development the default port is 3001).
The way I deal with your kind of problem is by using the _id field. It is a field automatically generated by MongoDB, and you can safely use it as an ID for any item in your collections.
Your catalogs collection should have a string array field called product containing the _id of each item of your products collection. The same goes for the operations: if an order is an array of product _ids, you can store this array in your savedOrder field. Feel free to add more fields to savedOrder if necessary, e.g. make it an array of product objects with additional fields such as discount.
As for the query code, I assume you will find all you need on the web as soon as you have figured out your structure.
For example, if you have a products array in your savedOrder, you can pull it out like this:
Operations.find({_id: "your operation ID"}, {fields: {"savedOrder.products": 1}})
Basically, you ask for all the product _ids in a specific operation. If you have several savedOrders in a single operation, you can also specify the savedOrder _id, if you kept the one you had in your local collection.
Operations.find({_id: "your_operation_ID", "savedOrder._id": "your_savedOrder_ID"}, {fields: {"savedOrder.products": 1}})
PS: to the bad-ass coders here, if I'm doing it wrong, please tell me.
I found an answer :) Of course, this is no revelation for real professionals, but it is a big step for me. Maybe someone will find my experience useful. All the magic is in using the correct Mongo operators. Let's solve this problem in pseudocode.
We have a structure like this:
Operations:
1. Operation: {
  _id: <- Mongo creates this unique id for us
  phone: "phone1",
  comment: "comment1",
  savedOrder: [
    {
      _id: <- and again
      productId: <- we should save our product ID from 'products'
      name: "Banana",
      quantity: 100
    },
    {
      _id:,
      productId: <- another ID that we should save in the order
      name: "apple",
      quantity: 50
    }
  ]
}
And if we want to know in which Operation the user took "banana", we should use the MongoDB operator $elemMatch (see the Mongo docs):
db.getCollection('operations').find({}, {savedOrder: {$elemMatch:{productId: "f5mhs8c2pLnNNiC5v"}}});
Put simply, we get the documents whose saved order contains the product with the id we want to find. I don't know if it is the best way, but it works for me :) Thank you!
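Note that in the find call above the $elemMatch sits in the projection (the second argument) while the selector is empty, so every operation is returned and only the savedOrder array is trimmed. To return only the operations that contain the product, the same $elemMatch condition would presumably go into the selector (the first argument) instead. The in-memory sketch below illustrates that filtering; operationsWithProduct and the sample data are hypothetical.

```javascript
// Returns only the operations whose savedOrder contains the given productId,
// which is what $elemMatch in the selector position expresses.
function operationsWithProduct(operations, productId) {
  return operations.filter((op) =>
    op.savedOrder.some((line) => line.productId === productId)
  );
}

// Hypothetical sample data.
const sampleOperations = [
  { _id: "op1", savedOrder: [{ productId: "banana-id", quantity: 100 }] },
  { _id: "op2", savedOrder: [{ productId: "apple-id", quantity: 50 }] },
];
console.log(operationsWithProduct(sampleOperations, "banana-id").map((op) => op._id));
```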