MongoDB update_one vs update_many - Improve speed - mongodb

I got a collection of 10000 ca. docs, where each doc has the following format:
{
"_id": {
"$oid": "631edc6e207c89b932a70a26"
},
"name": "Ethereum",
"auditInfoList": [
{
"coinId": "1027",
"auditor": "Fairyproof",
"auditStatus": 2,
"reportUrl": "https://www.fairyproof.com/report/Covalent"
}
],
"circulatingSupply": 122335921.0615,
"cmcRank": 2,
"dateAdded": "2015-08-07T00:00:00.000Z",
"id": 1027,
"isActive": 1,
"isAudited": true,
"lastUpdated": 1662969360,
"marketPairCount": 6085,
"quotes": [
{
"name": "USD",
"price": 1737.1982544180462,
"volume24h": 14326453277.535921,
"marketCap": 212521748520.66168,
"percentChange1h": 0.62330307,
"percentChange24h": -1.08847937,
"percentChange7d": 10.96517745,
"lastUpdated": 1662966780,
"percentChange30d": -13.49374496,
"percentChange60d": 58.25153862,
"percentChange90d": 42.27475921,
"fullyDilluttedMarketCap": 212521748520.66,
"marketCapByTotalSupply": 212521748520.66168,
"dominance": 20.0725,
"turnover": 0.0674117,
"ytdPriceChangePercentage": -53.9168
}
],
"selfReportedCirculatingSupply": 0,
"slug": "ethereum",
"symbol": "ETH",
"tags": [
"mineable",
"pow",
"smart-contracts",
"ethereum-ecosystem",
"coinbase-ventures-portfolio",
"three-arrows-capital-portfolio",
"polychain-capital-portfolio",
"binance-labs-portfolio",
"blockchain-capital-portfolio",
"boostvc-portfolio",
"cms-holdings-portfolio",
"dcg-portfolio",
"dragonfly-capital-portfolio",
"electric-capital-portfolio",
"fabric-ventures-portfolio",
"framework-ventures-portfolio",
"hashkey-capital-portfolio",
"kenetic-capital-portfolio",
"huobi-capital-portfolio",
"alameda-research-portfolio",
"a16z-portfolio",
"1confirmation-portfolio",
"winklevoss-capital-portfolio",
"usv-portfolio",
"placeholder-ventures-portfolio",
"pantera-capital-portfolio",
"multicoin-capital-portfolio",
"paradigm-portfolio",
"injective-ecosystem"
],
"totalSupply": 122335921.0615
}
Im pulling updated version of it and, to aviod duplicates, im doing the following by using 'update_one'
for doc in new_doc_list:
CRYPTO_TEMPORARY_LIST.update_one(
{ "name" : doc['name']},
{ "$set": {
"lastUpdated": doc['lastUpdated']
}
},
upsert=True)
The problem is it's too slow.
I'm trying to figure out how to improve speed by using update_many but can't figure out how to set it up.
I Basically want to update every document x name. Completely change the doc and not the "lastUpdated" field would b even better.
Thanks guys <3

Related

Mongodb projection exclude one of my field when another is present

I am running a simple query using mongoDB compass using filter and project and I'm having a behaviour I can't explain.
Here is my filter:
{
"$and": [
{
"_id": ObjectId('611ee5ee6b93815ee436969e')
},
{
"type": "article"
}
]
}
I get the following result:
{
"_id": {
"$oid": "611ee5ee6b93815ee436969e"
},
"type": "article",
"history": [],
"liked": [],
"parentId": "61105f00cc11ec10406fd1c4",
"permissionList": [],
"title": "Test",
"wikiId": "610de623fbfa1e58cdba9d2c"
}
As expected I get all the fields, in particular type and wikiId
However if i add the following projection:
{
"_id": 1,
"wikiId": 1,
"parentId": 1,
"title": 1,
"type": 1,
"permissionList": 1,
"liked": 1,
"history": 1
}
I would expect the same result, however i get:
{
"_id": {
"$oid": "611ee5ee6b93815ee436969e"
},
"type": "article",
"history": [],
"liked": [],
"parentId": "61105f00cc11ec10406fd1c4",
"permissionList": [],
"title": "Test"
}
This time i don't have the field wikiId, however it was requested in the projection.
And what's bug me is that if I do this projection instead:
{
"_id": 1,
"wikiId": 1,
"parentId": 1,
"title": 1,
"permissionList": 1,
"liked": 1,
"history": 1
}
Then I got the wikiId field as expected in the result again.
Anyone can provide me an insight of what is going on with those queries and where i'm mistaken.
Edit 1: The fact I want to use a projection is that depending on the type field I can use different document with different field.
In my Java code I'm using #BsonDiscriminator(key = "type") but when I explicitly want a kind of document I'm creating the appropriate projection to be sure. However in this case I just wanted to simplify the issue I'm facing to the simplest.
Thanks
in compass by default filters are having and condition so your filter can be refactored as below:
{ "_id": ObjectId('611ee5ee6b93815ee436969e'), "type": "article" }
And regarding projection you can just simply mention fields to Exclude or include like below:
{
"_id": 1,
"wikiId": 1,
"parentId": 1,
"title": 1,
"type": 1,
"permissionList": 1,
"liked": 1,
"history": 1
}
if you need all fields then there is no need to mention anything by default in compass you get all fields, but lets say you want all but few fields(_id,wikiId) then use below projection:
{
"_id": 0,
"wikiId": 0,
}
Also in compass you can try clicking find again or refresh button to see as sometimes it does not reflects current filter as as we change filter it fetches data so just hit refresh or find and see it should work.

Updating Mongo DB collection field from object to array of objects

I had to change one of the fields of my collection in mongoDB from an object to array of objects containing a lot of data. New documents get inserted without any problem, but when attempted to get old data, it never maps to the original DTO correctly and runs into errors.
subject is the field that was changed in Students collection.
I was wondering is there any way to update all the records so they all have the same data type, without losing any data.
The old version of Student:
{
"_id": "5fb2ae251373a76ae58945df",
"isActive": true,
"details": {
"picture": "http://placehold.it/32x32",
"age": 17,
"eyeColor": "green",
"name": "Vasquez Sparks",
"gender": "male",
"email": "vasquezsparks#orbalix.com",
"phone": "+1 (962) 512-3196",
"address": "619 Emerald Street, Nutrioso, Georgia, 6576"
},
"subject":
{
"id": 0,
"name": "math",
"module": {
"name": "Advanced",
"semester": "second"
}
}
}
This needs to be updated to the new version like this:
{
"_id": "5fb2ae251373a76ae58945df",
"isActive": true,
"details": {
"picture": "http://placehold.it/32x32",
"age": 17,
"eyeColor": "green",
"name": "Vasquez Sparks",
"gender": "male",
"email": "vasquezsparks#orbalix.com",
"phone": "+1 (962) 512-3196",
"address": "619 Emerald Street, Nutrioso, Georgia, 6576"
},
"subject": [
{
"id": 0,
"name": "math",
"module": {
"name": "Advanced",
"semester": "second"
}
},
{
"id": 1,
"name": "history",
"module": {
"name": "Basic",
"semester": "first"
}
},
{
"id": 2,
"name": "English",
"module": {
"name": "Basic",
"semester": "second"
}
}
]
}
I understand there might be a way to rename old collection, create new and insert data based on old one in to new one. I was wondering for some direct way.
The goal is to turn subject into an array of 1 if it is not already an array, otherwise leave it alone. This will do the trick:
update args are (predicate, actions, options).
db.foo.update(
// Match only those docs where subject is an object (i.e. not turned into array):
{$expr: {$eq:[{$type:"$subject"},"object"]}},
// Actions: set subject to be an array containing $subject. You MUST use the pipeline version
// of the update actions to correctly substitute $subject in the expression!
[ {$set: {subject: ["$subject"] }} ],
// Do this for ALL matches, not just first:
{multi:true});
You can run this converter over and over because it will ignore converted docs.
If the goal is to convert and add some new subjects, preserving the first one, then we can set up the additional subjects and concatenate them into one array as follows:
var mmm = [ {id:8, name:"CORN"}, {id:9, name:"DOG"} ];
rc = db.foo.update({$expr: {$eq:[{$type:"$subject"},"object"]}},
[ {$set: {subject: {$concatArrays: [["$subject"], mmm]} }} ],
{multi:true});

Loopback 3 query by Property of a embedded model

I'm using loopback 3 to build a backend with mongoDB.
So i have 2 models: Object and Attachment. Object have a relation Embeds2Many to Attachment.
Objects look like that in mongoDB
[
{
"fieldA": "valueA1",
"attachments": [
{
"id": 1,
"url": "abc.com/image1"
},
{
"id": 2,
"url": "abc.com/image2"
}
]
},
{
"fieldA": "valueA2",
"attachments": [
{
"id": 4,
"url": "abc.com/image4"
},
{
"id": 5,
"url": "abc.com/image5"
}
]
}
]
The question is: how can i get Objects with attachments.id=4 over the RestAPI?
I have try with the where and include filter. But it didn't work. It look like, that this function is not implemented in loopback3, right?
I have found the solution. It only works on Mongodb, Cloudant and Memory database.
{
"filter": {
"where": {
"attachments.id": 4
}
}
}

MongoDB query for Find 2 levels object element

I have a big issue, i don't know what to do...
What I wanna is to find all objects with Object2 name. I have Object 2 with name element.
What I wanna is to find all objects with the value X in the element name inside Object2. in the example is the value name is ="IWANTALLOBJECTSWITHTHISNAME"
the Json structure.
"objects": [
{
"_id": "5c69a62cf9acf00d00dbc02d",
"date": "2222-02-24T00:00:00.000Z",
"description": "22",
"Object1": {
"_id": "5c69a62cf9acf00d00dbc02b",
"date": "2222-02-24T00:00:00.000Z",
"user": "5c30fd5890bbd24a1c46c7ee",
"positionsObject1": [
{
"id": 1,
"Object2": {
"_id":"5c69a62cf9acf00d00dbc02c",
"name": "IWANTALLOBJECTSWITHTHISNAME"
},
"description": "22",
"value": 22
}
],
"id": 13,
"__v": 0
},
"user": "5c30fd5890bbd24a1c46c7ee",
"id": 7,
"__v": 0
}
]
I'm new in mongoDB and this query is really really hard. I tried everything. Thank very much for the help.
You can specify the path using dot notation:
db.col.find({ "objects.Object1.positionsObject1.Object2.name": "IWANTALLOBJECTSWITHTHISNAME" })

MongoDB - Project specific element from array (big data)

I got a big array with data in the following format:
{
"application": "myapp",
"buildSystem": {
"counter": 2361.1,
"hostname": "host.com",
"jobName": "job_name",
"label": "2361",
"systemType": "sys"
},
"creationTime": 1517420374748,
"id": "123",
"stack": "OTHER",
"testStatus": "PASSED",
"testSuites": [
{
"errors": 0,
"failures": 0,
"hostname": "some_host",
"properties": [
{
"name": "some_name",
"value": "UnicodeLittle"
},
<MANY MORE PROPERTIES>,
{
"name": "sun",
"value": ""
}
],
"skipped": 0,
"systemError": "",
"systemOut": "",
"testCases": [
{
"classname": "IdTest",
"name": "has correct representation",
"status": "PASSED",
"time": "0.001"
},
<MANY MORE TEST CASES>,
{
"classname": "IdTest",
"name": "normalized values",
"status": "PASSED",
"time": "0.001"
}
],
"tests": 8,
"time": 0.005,
"timestamp": "2018-01-31T17:35:15",
"title": "IdTest"
}
<MANY MORE TEST SUITES >,
]}
Where I can distinct three main structures with big data: TestSuites, Properties, and TestCases. My task is to sum all times from each TestSuite so that I can get the total duration of the test. Since the properties and TestCases are huge, the query cannot complete. I would like to select only the "time" value from TestSuites, but it kind of conflicts with the "time" of TestCases in my query:
db.my_tests.find(
{
application: application,
creationTime:{
$gte: start_date.valueOf(),
$lte: end_date.valueOf()
}
},
{
application: 1,
creationTime: 1,
buildSystem: 1,
"testSuites.time": 1,
_id:1
}
)
Is it possible to project only the "time" properties from TestSuites without loading the whole schema? I already tried testSuites: 1, testSuites.$.time: 1 without success. Please notice that TestSuites is an array of one element with a dictionary.
I already checked this similar post without success:
Mongodb update the specific element from subarray
Following code prints duration of each TestSuite:
query = db.my_collection.aggregate(
[
{$match: {
application: application,
creationTime:{
$gte: start_date.valueOf(),
$lte: end_date.valueOf()
}
}
},
{ $project :
{ duration: { $sum: "$testSuites.time"}}
}
]
).forEach(function(doc)
{
print(doc._id)
print(doc.duration)
}
)
Is it possible to project only the "time" properties from TestSuites
without loading the whole schema? I already tried testSuites: 1,
testSuites.$.time
Answering to your problem of prejecting only the time property of the testSuites document you can simply try projecting it with "testSuites.time" : 1 (you need to add the quotes for the dot notation property references).
My task is to sum all times from each TestSuite so that I can get the
total duration of the test. Since the properties and TestCases are
huge, the query cannot complete
As for your task, i suggest you try out the mongodb's aggregation framework for your calculations documents tranformations. The aggregations framework option {allowDiskUse : true} will also help you if you are proccessing "large" documents.