I have another problem to solve here. Thinking in arrays can sometimes be very challenging. Here is what I am faced with. This is what my data looks like:
{
"_id": { "Firm": "ABC", "year": 2014 },
"Headings": [
{
"costHead": "MNF",
"amount": 500000
},
{
"costHead": "SLS",
"amount": 25000
},
{
"costHead": "OVRHD",
"amount": 100
}
]
}
{
"_id": { "Firm": "CDF", "year": 2015 },
"Headings": [
{
"costHead": "MNF",
"amount": 15000
},
{
"costHead": "SLS",
"amount": 100500
},
{
"costHead": "MNTNC",
"amount": 7500
}
]
}
As you can see, I have a list that has a whole bunch of sub-documents.
Here is what I want to do: I need to add another element to this "Headings" list, which should be:
{
"costHead": "FxdCost",
"amount": "$Headings.amount (for costhead MFC) + $Headings.amount (for costhead OVRHD)"
}
I am unsure how to produce the above. Here are some challenges:
I can $addToSet the new sub-document I wish to add, but the problem is that $addToSet can only be used in the $group stage, which would be expensive (unless of course there is no other way).
Even if I use $addToSet, I always have to use the $ prefix to refer to fields that I read from my JSON file. The element I am trying to add here (costHead: FxdCost) is not present in my JSON file, and hence I cannot refer to it with the $ prefix.
Does anyone have any advice on how to go about this? This is, after all, basic ETL.
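One possible approach (a sketch only, assuming MongoDB 3.4+ and a placeholder collection name costs) is to skip $addToSet entirely and append the computed element with $concatArrays, building its amount from the matching Headings entries with $filter, $map and $sum:
db.costs.aggregate([
  {
    $addFields: {
      Headings: {
        $concatArrays: [
          "$Headings",
          [ {
            costHead: "FxdCost",
            amount: {
              $sum: {
                $map: {
                  input: {
                    $filter: {
                      input: "$Headings",
                      as: "h",
                      cond: { $in: [ "$$h.costHead", [ "MNF", "OVRHD" ] ] }
                    }
                  },
                  as: "h",
                  in: "$$h.amount"
                }
              }
            }
          } ]
        ]
      }
    }
  }
])
If a document has no OVRHD entry (like the second one above), $filter simply returns fewer elements and the sum still works; a $out stage can be added at the end if the result needs to be written back to a collection.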
I have a collection in MongoDB containing search history of a user where each document is stored like:
"_id": "user1"
searchHistory: {
"product1": [
{
"timestamp": 1623482432,
"query": {
"query": "chocolate",
"qty": 2
}
},
{
"timestamp": 1623481234,
"query": {
"query": "lindor",
"qty": 4
}
},
],
"product2": [
{
"timestamp": 1623473622,
"query": {
"query": "table",
"qty": 1
}
},
{
"timestamp": 1623438232,
"query": {
"query": "ike",
"qty": 1
}
},
]
}
}
Here the _id of the document acts like a foreign key to the user document in another collection.
I have a backend running on Node.js, and this function is used to store a new search history entry in the record.
exports.updateUserSearchCount = function (userId, productId, searchDetails) {
let addToSetData = {}
let key = `searchHistory.${productId}`
addToSetData[key] = { "timestamp": new Date().getTime(), "query": searchDetails }
return client.db("mydb").collection("userSearchHistory").updateOne({ "_id": userId }, { "$addToSet": addToSetData }, { upsert: true }, async (err, res) => {
})
}
Now, I want to get the search history of a user based on the query alone, using db.find().
I want something like this:
db.find({"_id": "user1", "searchHistory.somewildcard.query": "some query"})
I need a wildcard which will replace ".somewildcard." to search in all products searched.
I saw a suggestion that we should store document like:
"_id": "user1"
searchHistory: [
{
"key": "product1",
"value": [
{
"timestamp": 1623482432,
"query": {
"query": "chocolate",
"qty": 2
}
}
]
}
]
}
However, if I store the document like this, then adding search history to an existing document becomes a tedious and confusing task.
What should I do?
It's always a bad idea to save values as keys, for this exact reason you're facing. It heavily limits querying on that field; the obvious trade-off is that it makes updates much easier.
I personally recommend you do not save these searches in nested form at all. This will cause you scaling issues quite quickly: assuming these fields are indexed, you will start seeing performance issues once the arrays get too large (a few hundred searches).
So my personal recommendation is for you to save it in a new collection like so:
{
"user_id": "1",
"key": "product1",
"timestamp": 1623482432,
"query": {
"query": "chocolate",
"qty": 2
}
}
Now querying a specific user, a specific product, or even a query substring is all very easily supported by creating some basic indexes. An "update" in this case is just inserting a new document, which is also much faster.
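As a rough sketch (the userSearches collection and index names are placeholders of mine, not something the answer prescribes):
db.userSearches.createIndex({ user_id: 1, key: 1 })
db.userSearches.createIndex({ "query.query": 1 })

// an "update" is just an insert of a new search document
db.userSearches.insertOne({
  user_id: "1",
  key: "product1",
  timestamp: new Date().getTime(),
  query: { query: "chocolate", qty: 2 }
})

// every search this user ever made for "chocolate", across all products
db.userSearches.find({ user_id: "1", "query.query": "chocolate" })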
If you still prefer to keep the nested structure, then I do recommend you switch to the key/value structure you posted. As you mentioned, updates will become slightly more tedious, but you can still do them quite easily, using arrayFilters to update a specific element or just $push to add a new search.
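For example, with the key/value structure, something along these lines should work (a sketch only; the search values are made up):
// append a new search to an existing product entry
db.userSearchHistory.updateOne(
  { "_id": "user1" },
  { "$push": { "searchHistory.$[product].value": { "timestamp": new Date().getTime(), "query": { "query": "lindor", "qty": 1 } } } },
  { arrayFilters: [ { "product.key": "product1" } ] }
)

// and the "wildcard" query becomes a plain dotted path across all products
db.userSearchHistory.find({ "_id": "user1", "searchHistory.value.query.query": "chocolate" })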
I've got a bunch of documents like this in my collection.
{
"success": true,
"timestamp": 1519296206,
"base": "EUR",
"date": "2018-07-04",
"rates": {
"AUD": 1.566015,
"CAD": 1.560132,
"CHF": 1.154727,
"CNY": 7.827874,
"GBP": 0.882047,
"JPY": 132.360679,
"USD": 1.23396,
}
}
I would like to get only the date and the entire rates sub-document, like below. I know I could add rates.AUD, rates.CAD, etc. to the projection, but that would make the projection extremely big, unbearable to read, and hard to maintain, as a new field (a new currency in this case) might get added in the future.
{
"date": "2018-07-04",
"rates": {
"AUD": 1.566015,
"CAD": 1.560132,
"CHF": 1.154727,
"CNY": 7.827874,
"GBP": 0.882047,
"JPY": 132.360679,
"USD": 1.23396,
}
}
Is there any projection similar to {date: 1, "rates.*": 1} that works as described above?
Maybe this?
db.col.aggregate([ {
$project: {
date: 1,
rates: 1
}
}])
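Since rates is a single embedded document rather than a set of separate dotted paths, projecting the parent field is enough, so a plain find() projection gives the same result (adding _id: 0 to match the desired output exactly):
db.col.find({}, { _id: 0, date: 1, rates: 1 })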
I am getting requests from different devices as JSON. Some of them report temperature as "T", some others as "temp", and it can be different on other devices. Is it possible in MongoDB to put all of these values into a single field, "temperature"?
It doesn't matter whether it is "temp" or "T" or "tempC"; just put all of them in the "temperature" field.
Here is an example of my data:
[
{ "ip": "12:3B:6A:1A:E6:8B", "type": 0, "t": 37},
{ "ip": "22:33:66:1A:E6:8B", "type": 1, "temperature": 40},
{ "ip": "1A:3C:6A:1A:E6:8B", "type": 1, "temp": 30}
]
I want to put temp, t and temperature into a single Temperature field in my collection.
You can use the $ifNull operator to control which value should be transferred into your output, like below:
db.col.aggregate([
{
$addFields: { Temperature: { $ifNull: [ { $ifNull: [ "$t", "$temperature"] }, "$temp" ] } }
},
{
$project: {
t: 0,
temperature: 0,
temp: 0
}
}
])
This will merge those three fields into one Temperature field, taking the first non-null value. Additionally, if you want to update your collection, you can add $out as the last aggregation stage, like { $out: "col" }, but keep in mind that it will entirely replace your source collection.
I think MongoDB supports regular expressions, but they are meant to search data, not to insert it based on field-name matches.
I am quite sure you will need some kind of facade in front of your database to achieve that.
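Such a facade can be as small as a field-normalising insert helper; here is a sketch in Node.js (the alias list and the "readings" collection name are assumptions, not something MongoDB gives you):
const TEMPERATURE_ALIASES = ["t", "T", "temp", "tempC", "temperature"];

// copy the incoming document, moving whichever alias is present into "Temperature"
function normalizeReading(doc) {
  const out = { ...doc };
  for (const alias of TEMPERATURE_ALIASES) {
    if (out[alias] !== undefined) {
      out.Temperature = out[alias];
      delete out[alias];
    }
  }
  return out;
}

function insertReading(db, rawDoc) {
  return db.collection("readings").insertOne(normalizeReading(rawDoc));
}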
This question seems to be a very popular one, but I haven't found an answer (or at least not one I was able to understand).
I have a flat file that I would like to store in Mongo with some nesting. Though it is relatively easy to achieve with an insert and unique content, I need to update the content on a regular basis, so I also need to be able to use the update commands.
My flat file looks as follows:
Model,Category,Organisation,CountryCode,CountryWarranty,PeriodCode,PeriodQty
Model1,Category1,Org1,Code1,2Y,201707,1
Model1,Category1,Org1,Code1,2Y,201708,2
Model1,Category1,Org1,Code1,1Y,201709,3
Model1,Category1,Org1,Code2,2Y,201707,7
Model1,Category1,Org1,Code2,2Y,201708,8
Model1,Category1,Org1,Code2,5Y,201709,7
Model1,Category1,Org2,Code3,2Y,201707,5
Model1,Category1,Org2,Code3,4Y,201708,6
Model1,Category1,Org2,Code3,2Y,201709,7
...
Model_n,Category_n,Org_n,Code_n,3Y,201802,20
and what I would like to achieve is the following:
{
"_id": "Model1",
"Model_category": "Category1",
"Product_Sales": [
{
"Organisation": "Org1",
"Country": [
{
"Code": "Code1",
"Guarantee_Years": "2Y",
"Period": [
{"Code": 201707,"Qty": 1},
{"Code": 201708,"Qty": 2},
{"Code": 201709,"Qty": 3}
]
}, {
"Code": "Code2",
"Guarantee_Years": "2Y",
"Period": [
{"Code": 201707,"Qty": 7},
{"Code": 201708,"Qty": 8},
{"Code": 201709,"Qty": 7}
]
}
]
}, {
"Organisation": "Org2",
"Country": [
{
"Code": "Code3",
"Guarantee_Years": "2Y",
"Period": [
{"Code": 201707,"Qty": 5},
{"Code": 201708,"Qty": 6},
{"Code": 201709,"Qty": 7}
]
}
]
}
]
}
Below is a snippet of what I tried. Note that the syntax is specific to my development environment, so I know it is not workable or proper Mongo, but it is about the basic idea. Any example using the console will do fine for me.
concat("{update: "master_Sales",
updates: [
{
q:{"_id":", %{_id},""},
u:{$addToSet: {
"Product_Sales.Organisation": "", %{org}, "",
"Product_Sales.Organisation.Country": [
-- more here but have no clue --
]
}}
, upsert: true}
]}"
)
Adding my organisations works fine, but as soon as I want to add a second level (nested within an org) it goes wrong.
So in essence I want to be able to add this flat content to Mongo in a nested array structure, and each time one of the values changes in the future (say a quantity is updated, or a new country is added) have that line added / updated, so that I am not forced to do a full refresh and insert every time a line is modified.
What would be the best approach to deal with this?
say the quantity is updated, or a new country is added
You can try the below update queries in MongoDB 3.6.
For updating the Qty for Organisation/Country Code/Period Code Org1/Code1/201707:
db.collection.update(
{ "_id": "Model1" },
{ "$set": { "Product_Sales.$[org].Country.$[country].Period.$[period].Qty" : 2 } },
{ arrayFilters: [ { "org.Organisation": "Org1" }, { "country.Code": "Code1" }, { "period.Code": 201707 } ] }
)
For adding new Country to Organisation Org2
db.collection.update(
{ "_id": "Model1" },
{ "$push": { "Product_Sales.$[org].Country" : { "Code": "Code4", "Guarantee_Years": "2Y" } } },
{ arrayFilters: [ { "org.Organisation": "Org2" } ] }
)
Can be simplified to
db.collection.update(
{ "org.organisation": "Org2"},
{ "$push": { "Product_Sales.$.Country" : {"Code": "Code4","Guarantee_Years": "2Y"} } }
)
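The same arrayFilters pattern extends one level deeper if you later need to add a new Period entry under an existing country (the Code/Qty values below are made up for illustration):
db.collection.update(
{ "_id": "Model1" },
{ "$push": { "Product_Sales.$[org].Country.$[country].Period" : { "Code": 201710, "Qty": 4 } } },
{ arrayFilters: [ { "org.Organisation": "Org1" }, { "country.Code": "Code1" } ] }
)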
The document is like below.
{
"title": "Book1",
"dailyactiviescores":[
{
"date": 2013-06-05,
"score": 10,
},
{
"date": 2013-06-06,
"score": 21,
},
]
}
The daily active score is intended to increase once the book is opened by a reader. The first solution that comes to mind is to use "$" to find whether the target date already has a score or not, and deal with it accordingly.
err = bookCollection.Update(
{"title":"Book1", "dailyactivescore.date": 2013-06-06},
{"$inc":{"dailyactivescore.$.score": 1}})
if err == ErrNotFound {
bookCollection.Update({"title":"Book1"}, {"$push":...})
}
But I can't help thinking: is there any way to return the index of an item inside an array? If so, I could use one query to do the job rather than two. Like this:
index = bookCollection.Find(
{"title":"Book1", "dailyactivescore.date": 2013-06-06}).Select({"$index"})
if index != -1 {
incTarget = FormatString("dailyactiviescores.%d.score", index)
bookCollection.Update(..., {"$inc": {incTarget: 1}})
} else {
//push here
}
Incrementing a field that's not present isn't the issue, as doing $inc: 1 on it will just create it and set it to 1 post-increment. The issue is when you don't have an array item corresponding to the date you want to increment.
There are several possible solutions here (that don't involve multiple steps to increment).
One is to pre-create all the dates in the array elements with score: 0, like so:
{
"title": "Book1",
"dailyactiviescores":[
{
"date": 2013-06-01,
"score": 0,
},
{
"date": 2013-06-02,
"score": 0,
},
{
"date": 2013-06-03,
"score": 0,
},
{
"date": 2013-06-04,
"score": 0,
},
{
"date": 2013-06-05,
"score": 0,
},
{
"date": 2013-06-06,
"score": 0
}, { etc ... }
]
}
But how far into the future do you go? So one option here is to "bucket": for example, have an activities document per month, and before the start of a month have a job that creates the new documents for the next month. Slightly yucky, but it'll work.
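A rough sketch of what such a pre-creation job could look like in the shell (the bookActivity collection and the month field are my assumptions, not part of the original schema):
// month is 1-based; new Date(year, month, 0) is the last day of that month
function precreateMonth(title, year, month) {
  var pad = function (n) { return n < 10 ? "0" + n : "" + n; };
  var daysInMonth = new Date(year, month, 0).getDate();
  var scores = [];
  for (var day = 1; day <= daysInMonth; day++) {
    scores.push({ date: year + "-" + pad(month) + "-" + pad(day), score: 0 });
  }
  db.bookActivity.insertOne({
    title: title,
    month: year + "-" + pad(month),
    dailyactiviescores: scores
  });
}

precreateMonth("Book1", 2013, 7);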
Other options involve slight changes in schema.
You can use a collection with book, date, activity_scores. Then you can use a simple upsert to increment a score:
db.books.update({title: "Book1", date: "2013-06-02"}, {$inc: {score: 1}}, {upsert: true})
This will increment the score, or insert a new record with score: 1 for this book and date, and your collection will look like this:
{
"title": "Book1",
"date": 2013-06-01,
"score": 10,
},
{
"title": "Book1",
"date": 2013-06-02,
"score": 1,
}, ...
Depending on how much you simplified your example from your real use case, this might work well.
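With that shape, a unique compound index (my suggestion, not something the answer above prescribes) keeps the upsert lookup fast, and adding up a book's scores becomes a short aggregation:
db.books.createIndex({ title: 1, date: 1 }, { unique: true })

db.books.aggregate([
  { $match: { title: "Book1" } },
  { $group: { _id: "$title", total: { $sum: "$score" } } }
])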
Another option is to stick with embedding the scores in the book document, but switch to using the date string as a key that you increment:
Schema:
{
"title": "Book1",
"dailyactiviescores": {
"2013-06-01": 10,
"2013-06-02": 8
}
}
Note it's now a subdocument and not an array, and you can do:
db.books.update({title: "Book1"}, {$inc: {"dailyactiviescores.2013-06-03": 1}})
and it will add the new date into the subdocument and increment it, resulting in:
{
"title": "Book1",
"dailyactiviescores": {
"2013-06-01": 10,
"2013-06-02": 8,
"2013-06-03": 1
}
}
Note it's now harder to "add up" the scores for the book, so you may want to atomically update a "subtotal" in the same update statement as well, whether it's for all time or just for the month.
But here it's once again problematic to keep adding days to this subdocument: what happens when you're still around in a few years and these book documents grow huge?
I suspect that unless you will only be keeping activity scores for the last N days (which you can do with the capped-array feature in 2.4), it will be simpler to have a separate collection for book-activity-score tracking, where each book-day is a separate document, than to embed the scores for each day into the book in a collection of books.
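For reference, the capped-array feature mentioned above looks roughly like this ($slice must be combined with $each, and on 2.4 also with $sort); keeping only the last 30 days is just an example:
db.books.update(
  { title: "Book1" },
  { $push: {
      dailyactiviescores: {
        $each: [ { date: "2013-06-07", score: 1 } ],
        $sort: { date: 1 },
        $slice: -30
      }
  } }
)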
According to the docs:
The $inc operator increments a value of a field by a specified amount.
If the field does not exist, $inc sets the field to the specified
amount.
So if there is no score field in the array item, $inc will simply set it to 1. In your case, given a document like this:
{
"title": "Book1",
"dailyactiviescores":[
{
"date": 2013-06-05,
"score": 10,
},
{
"date": 2013-06-06,
},
]
}
bookCollection.Update(
{"title":"Book1", "dailyactivescore.date": 2013-06-06},
{"$inc":{"dailyactivescore.$.score": 1}})
will result in:
{
"title": "Book1",
"dailyactiviescores":[
{
"date": 2013-06-05,
"score": 10,
},
{
"date": 2013-06-06,
"score": 1
},
]
}
Hope that helps.