Wrong data entered on bulk write - mongodb

I'm writing bulk data into MongoDB using Go. Suppose we have the data below:
[
{
id:1,
name:"abc"
}
{
id:2,
name:"cde"
}
....upto 1000
]
To save this data, I'm doing a bulk write using the code below:
mongoSession, ctx := config.ConnectDb("myFirstDatabase")
defer mongoSession.Disconnect(ctx)
var operations []mongo.WriteModel
operationA := mongo.NewInsertOneModel()
getCollection := mongoSession.Database("myFirstDatabase").Collection("transactions")
for k, v := range transaction {
operationA.SetDocument(transaction[k])
operations = append(operations, operationA)
fmt.Println(operationA)
}
bulkOption := options.BulkWriteOptions{}
// bulkOption.SetOrdered(true)
_, err = getCollection.BulkWrite(ctx, operations, &bulkOption)
operations is of type []mongo.WriteModel, and in the for loop I append each document to it one by one. However, when I check MongoDB, only the last document is there, written 1000 times. Please let me know which line in the above snippet is wrong.
Thanks in advance!

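You are reusing the same operationA variable and updating it on every iteration. mongo.NewInsertOneModel() returns a pointer, so every element of operations points to the same model, and after the loop you have the last record repeated as many times as the loop ran.
operationA.SetDocument(transaction[k]) overwrites the document on that one shared model again and again, so operations = append(operations, operationA) keeps appending the same pointer and only the last value is used. Create a new model inside the loop instead: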
for k := range transaction {
operationA := mongo.NewInsertOneModel() // create a new model on every iteration to fix the issue
operationA.SetDocument(transaction[k])
operations = append(operations, operationA)
fmt.Println(operationA)
}
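The effect can be reproduced without a database. The following is a minimal sketch in which insertModel is a hypothetical stand-in for mongo.InsertOneModel (it is not the driver type), and buildShared and buildFresh are illustrative names for the buggy and the fixed loop:

```go
package main

import "fmt"

// insertModel is a hypothetical stand-in for mongo.InsertOneModel:
// a struct that is always handled through a pointer.
type insertModel struct {
	Document interface{}
}

// SetDocument mirrors the builder-style setter and returns the receiver.
func (m *insertModel) SetDocument(doc interface{}) *insertModel {
	m.Document = doc
	return m
}

// buildShared reuses one model for every document (the bug).
func buildShared(docs []string) []*insertModel {
	var ops []*insertModel
	shared := &insertModel{}
	for _, d := range docs {
		shared.SetDocument(d)
		ops = append(ops, shared) // appends the same pointer each time
	}
	return ops
}

// buildFresh allocates a new model per document (the fix).
func buildFresh(docs []string) []*insertModel {
	var ops []*insertModel
	for _, d := range docs {
		ops = append(ops, (&insertModel{}).SetDocument(d))
	}
	return ops
}

func main() {
	docs := []string{"abc", "cde", "efg"}
	for _, op := range buildShared(docs) {
		fmt.Println("shared:", op.Document)
	}
	for _, op := range buildFresh(docs) {
		fmt.Println("fresh:", op.Document)
	}
}
```

With the shared model, every slice element is the same pointer, so all of them show whatever document was set last. With a fresh model per iteration, each element keeps its own document.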

Related

Golang: how to check if collection.Find didn't find any documents?

I'm using Go Mongo documentation where it is explicitly written that with FindOne function, if no documents are matching the filter, ErrNoDocuments will be returned, however, this error is not returned if I use Find function and no documents are found. Is there a way to check that the cursor is empty without getting a list of all returned documents and then checking if the list is empty?
You may simply call Cursor.Next() to tell if there are more documents. If you haven't iterated over any yet, this will tell if there's at least one result document.
Note that this will cause the first batch of the results to fetch though (but will not decode any of the result documents into any Go values).
Also note that Cursor.Next() returns false as well if an error occurs or the passed context.Context expires, so check Cursor.Err() to tell an empty result apart from a failure.
Example:
var c *mongo.Collection // Acquire collection
curs, err := c.Find(ctx, bson.M{"your-query": "here"})
// handle error
hasResults := curs.Next(ctx)
if hasResults {
// There are result documents
}
// Don't forget to close the cursor!
Although if you intend to decode the results, you might as well just call Cursor.All() and check the length of the result slice:
curs, err := c.Find(ctx, bson.M{"your-query": "here"})
// handle error
var results []YourType
err = curs.All(ctx, &results)
// Handle error
if len(results) > 0 {
// There are results
}
// Note: Cursor.All() closes the cursor

Add Fields to MongoDB Inner Object

I'm trying to append fields to an object in my mongodb collection. So far this is what my document in MongoDB looks like.
My users can have multiple devices so I'm trying to append more fields to the devices object. I have tried to use $push for an array instead of an object but I didn't like how I would have to access the data later on when I retrieve it from the database.
So I started to use $set. $set works great because it gives me the format in which I want my data to save in the db but it will continually override my one key value pair in the devices object every time and I don't want that to happen.
db.go
func AddDeviceToProfile(uid string, deviceId int, deviceName string) {
client := ConnectClient()
col := client.Database(uid).Collection("User")
idString := strconv.Itoa(deviceId)
filter := bson.M{"uid": uid}
update := bson.M{
"$set": bson.M{"devices": bson.M{idString: deviceName}}, <------ Need to fix this
}
option := options.FindOneAndUpdate()
_ = col.FindOneAndUpdate(context.TODO(), filter, update, option)
log.Print("Device Added")
_ = client.Disconnect(context.TODO())
}
I have looked into using $addFields, but I don't know if I was doing it correctly. I just replaced $set above with $addFields, and I also tried it this way:
update := bson.M{
"devices": bson.M{"$addFields": bson.M{idString: deviceName}},
}
What I want my document to look like
Instead of $push or $addFields, what you need is the $set operator.
To specify a field in an embedded document, use dot notation.
For the document matching the criteria _id equal to 100, the following operation updates the make field in the devices document:
db.products.update(
{ _id: 100 },
{ $set: { "devices.make": "zzz" } }
)
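Converting this to Go is straightforward. Your use of $set is right; the only change is the key: "devices." + idString sets a single key inside devices instead of replacing the whole object. The following should work, possibly with a little tweaking.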
func AddDeviceToProfile(uid string, deviceId int, deviceName string) {
client := ConnectClient()
col := client.Database(uid).Collection("User")
idString := strconv.Itoa(deviceId)
filter := bson.M{"uid": uid}
update := bson.M{"$set": bson.M{"devices." + idString: deviceName}}
option := options.FindOneAndUpdate()
_ = col.FindOneAndUpdate(context.TODO(), filter, update, option)
log.Print("Device Added")
_ = client.Disconnect(context.TODO())
}
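To make the difference between the two update documents concrete, here is a minimal sketch that only builds them and prints them. M is a stand-in with the same underlying type as bson.M, and replaceDevices and addDevice are hypothetical helper names, not part of the driver:

```go
package main

import (
	"fmt"
	"strconv"
)

// M is a stand-in for bson.M, which has the same underlying type.
type M map[string]interface{}

// replaceDevices builds the update from the question: it assigns a whole
// new object to "devices", so previously stored devices are overwritten.
func replaceDevices(deviceID int, name string) M {
	return M{"$set": M{"devices": M{strconv.Itoa(deviceID): name}}}
}

// addDevice builds the dot-notation update: only the single key
// "devices.<id>" is set, other keys inside "devices" are left alone.
func addDevice(deviceID int, name string) M {
	return M{"$set": M{"devices." + strconv.Itoa(deviceID): name}}
}

func main() {
	fmt.Println(replaceDevices(42, "phone"))
	fmt.Println(addDevice(42, "phone"))
}
```

The first document has a nested object under the key devices, while the second has one flat key of the form devices.42, which MongoDB interprets as a path into the embedded document.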

Fail to get value with iter.Next(&result) when using golang mongo without indexes in collection, get value if index is set

I have a question about a Go MongoDB operation.
My code is like this:
iter = coll.Find(filter).Sort("-timestamp").Skip(12510).Limit(10).Iter()
for iter.Next(&result){
....
}
I have 12520 documents in the collection, but iter.Next() fails to return a value if I have not set an index on timestamp in MongoDB.
If I set an index on "timestamp", it seems to work, and I get a value in result.
So, what happened?
You need to decode your data first and then iterate over it.
Here, item is the struct for the data you get from MongoDB:
if err := iter.Decode(&item); err != nil {
return status.Errorf(
codes.Aborted,
fmt.Sprintln(errormsg.ERR_MSG_DATA_CANT_DECODE, err))
}
Then do the iteration and it will work.

Efficient paging in MongoDB using mgo

I've searched and found no Go solution to the problem, not with or without using mgo.v2, not on StackOverflow and not on any other site. This Q&A is in the spirit of knowledge sharing / documenting.
Let's say we have a users collection in MongoDB modeled with this Go struct:
type User struct {
ID bson.ObjectId `bson:"_id"`
Name string `bson:"name"`
Country string `bson:"country"`
}
We want to sort and list users based on some criteria, but have paging implemented due to the expected long result list.
To achieve paging of the results of some query, MongoDB and the mgo.v2 driver package have built-in support in the form of Query.Skip() and Query.Limit(), e.g.:
session, err := mgo.Dial(url) // Acquire Mongo session, handle error!
c := session.DB("").C("users")
q := c.Find(bson.M{"country" : "USA"}).Sort("name", "_id").Limit(10)
// To get the nth page:
q = q.Skip((n-1)*10)
var users []*User
err = q.All(&users)
This however becomes slow if the page number increases, as MongoDB can't just "magically" jump to the xth document in the result, it has to iterate over all the result documents and omit (not return) the first x that need to be skipped.
MongoDB provides the right solution: If the query operates on an index (it has to work on an index), cursor.min() can be used to specify the first index entry to start listing results from.
This Stack Overflow answer shows how it can be done using a mongo client: How to do pagination using range queries in MongoDB?
Note: the required index for the above query would be:
db.users.createIndex(
{
country: 1,
name: 1,
_id: 1
}
)
There is one problem though: the mgo.v2 package has no support specifying this min().
How can we achieve efficient paging that uses MongoDB's cursor.min() feature using the mgo.v2 driver?
Unfortunately the mgo.v2 driver does not provide API calls to specify cursor.min().
But there is a solution. The mgo.Database type provides a Database.Run() method to run any MongoDB commands. The available commands and their documentation can be found here: Database commands
Starting with MongoDB 3.2, a new find command is available which can be used to execute queries, and it supports specifying the min argument that denotes the first index entry to start listing results from.
Good. What we need to do is after each batch (documents of a page) generate the min document from the last document of the query result, which must contain the values of the index entry that was used to execute the query, and then the next batch (the documents of the next page) can be acquired by setting this min index entry prior to executing the query.
This index entry (let's call it cursor from now on) may be encoded to a string and sent to the client along with the results. When the client wants the next page, it sends back the cursor, saying it wants results starting after this cursor.
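The difference between skipping and seeking can be illustrated without a database. The following is a minimal in-memory sketch, not mgo or MongoDB code: a sorted slice plays the role of the index, the hypothetical pageAfter helper plays the role of a query that starts at a given index entry, and the last element of each page is the cursor. One assumption differs from the real feature: pageAfter seeks to the first entry strictly after the cursor, whereas MongoDB's min is inclusive, so the real query additionally has to skip one document.

```go
package main

import (
	"fmt"
	"sort"
)

// user is a simplified record; Name and ID together form the sort key.
type user struct {
	Name string
	ID   int
}

// less orders users by (Name, ID), like an index on {name: 1, _id: 1}.
func less(a, b user) bool {
	if a.Name != b.Name {
		return a.Name < b.Name
	}
	return a.ID < b.ID
}

// pageAfter returns up to limit users that sort strictly after last.
// users must already be sorted with less; a nil last means "first page".
func pageAfter(users []user, last *user, limit int) []user {
	start := 0
	if last != nil {
		// Binary search for the first entry greater than the cursor,
		// instead of walking over everything that came before it.
		start = sort.Search(len(users), func(i int) bool {
			return less(*last, users[i])
		})
	}
	end := start + limit
	if end > len(users) {
		end = len(users)
	}
	return users[start:end]
}

func main() {
	users := []user{{"Ann", 1}, {"Bob", 2}, {"Bob", 5}, {"Cid", 3}, {"Dee", 4}}
	var last *user
	for {
		page := pageAfter(users, last, 2)
		if len(page) == 0 {
			break
		}
		fmt.Println(page)
		last = &page[len(page)-1] // the cursor for the next page
	}
}
```

Including the ID in the sort key is what makes the cursor unambiguous when several users share a name, which is the same reason _id is part of the index used for the real query.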
Doing it manually (the "hard" way)
The command to be executed can be in different forms, but the command name (find) must be first in the marshaled result, so we'll use bson.D (which preserves order in contrast to bson.M):
limit := 10
cmd := bson.D{
{Name: "find", Value: "users"},
{Name: "filter", Value: bson.M{"country": "USA"}},
{Name: "sort", Value: bson.D{
{Name: "name", Value: 1},
{Name: "_id", Value: 1},
}},
{Name: "limit", Value: limit},
{Name: "batchSize", Value: limit},
{Name: "singleBatch", Value: true},
}
if min != nil {
// min is inclusive, must skip first (which is the previous last)
cmd = append(cmd,
bson.DocElem{Name: "skip", Value: 1},
bson.DocElem{Name: "min", Value: min},
)
}
The result of executing a MongoDB find command with Database.Run() can be captured with the following type:
var res struct {
OK int `bson:"ok"`
WaitedMS int `bson:"waitedMS"`
Cursor struct {
ID interface{} `bson:"id"`
NS string `bson:"ns"`
FirstBatch []bson.Raw `bson:"firstBatch"`
} `bson:"cursor"`
}
db := session.DB("")
if err := db.Run(cmd, &res); err != nil {
// Handle error (abort)
}
We now have the results, but in a slice of type []bson.Raw, while we want a slice of type []*User. This is where Collection.NewIter() comes in handy. It can transform (unmarshal) a value of type []bson.Raw into any type we usually pass to Query.All() or Iter.All(). Let's see it:
firstBatch := res.Cursor.FirstBatch
var users []*User
err = db.C("users").NewIter(nil, firstBatch, 0, nil).All(&users)
We now have the users of the next page. Only one thing left: generating the cursor to be used to get the subsequent page should we ever need it:
var cursorData bson.D
if len(users) > 0 {
lastUser := users[len(users)-1]
cursorData = bson.D{
{Name: "country", Value: lastUser.Country},
{Name: "name", Value: lastUser.Name},
{Name: "_id", Value: lastUser.ID},
}
} else {
// No more users found, use the last cursor
}
This is all good, but how do we convert a cursorData to string and vice versa? We may use bson.Marshal() and bson.Unmarshal() combined with base64 encoding; the use of base64.RawURLEncoding will give us a web-safe cursor string, one that can be added to URL queries without escaping.
Here's an example implementation:
// CreateCursor returns a web-safe cursor string from the specified fields.
// The returned cursor string is safe to include in URL queries without escaping.
func CreateCursor(cursorData bson.D) (string, error) {
// bson.Marshal() never returns error, so I skip a check and early return
// (but I do return the error if it would ever happen)
data, err := bson.Marshal(cursorData)
return base64.RawURLEncoding.EncodeToString(data), err
}
// ParseCursor parses the cursor string and returns the cursor data.
func ParseCursor(c string) (cursorData bson.D, err error) {
var data []byte
if data, err = base64.RawURLEncoding.DecodeString(c); err != nil {
return
}
err = bson.Unmarshal(data, &cursorData)
return
}
And we finally have our efficient, but not so short MongoDB mgo paging functionality. Read on...
Using github.com/icza/minquery (the "easy" way)
The manual way is quite lengthy; it can be made general and automated. This is where github.com/icza/minquery comes into the picture (disclosure: I'm the author). It provides a wrapper to configure and execute a MongoDB find command, allowing you to specify a cursor, and after executing the query, it gives you back the new cursor to be used to query the next batch of results. The wrapper is the MinQuery type which is very similar to mgo.Query but it supports specifying MongoDB's min via the MinQuery.Cursor() method.
The above solution using minquery looks like this:
q := minquery.New(session.DB(""), "users", bson.M{"country" : "USA"}).
Sort("name", "_id").Limit(10)
// If this is not the first page, set cursor:
// getLastCursor() represents your logic how you acquire the last cursor.
if cursor := getLastCursor(); cursor != "" {
q = q.Cursor(cursor)
}
var users []*User
newCursor, err := q.All(&users, "country", "name", "_id")
And that's all. newCursor is the cursor to be used to fetch the next batch.
Note #1: When calling MinQuery.All(), you have to provide the names of the cursor fields, this will be used to build the cursor data (and ultimately the cursor string) from.
Note #2: If you're retrieving partial results (by using MinQuery.Select()), you have to include all the fields that are part of the cursor (the index entry) even if you don't intend to use them directly, else MinQuery.All() will not have all the values of the cursor fields, and so it will not be able to create the proper cursor value.
Check out the package doc of minquery here: https://godoc.org/github.com/icza/minquery, it is rather short and hopefully clean.

Use mgo aggregate iterator data in upsert without unmarshaling

First of all, I am very new to go :)
I am trying to do an aggregate + upsert in mongo using go and mgo driver.
My code looks something like this:
pipe := c.Pipe([]bson.M{{"$match": bson.M{"name":"John"}}})
iter := pipe.Iter()
resp := []bson.M{}
for iter.Next(&resp) {
//
// read "value.sha1" from each response
// do a:
// otherCollection.Upsert(bson.M{"value.sha1": mySha1}, resp)
//
}
The response from the aggregate collection can have lots of formats, so I can't define a struct for it.
I just need to get one of the fields from the response, which is a sha1, and update another collection with the response received, based on the sha1 condition.
Can anybody point me in the right direction?
Maybe I misunderstood you but you can simply access returned documents as a map. Something like this:
pipe := c.Pipe([]bson.M{})
iter := pipe.Iter()
resp := bson.M{} // not array as you are using iterator which returns single document
for iter.Next(&resp) {
otherCollection.Upsert(bson.M{"value.sha1": resp["value"].(bson.M)["sha1"]}, resp)
}
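Since the documents can come in many shapes, a plain type assertion such as resp["value"].(bson.M) panics when "value" is missing or is not a document. A more defensive variant uses the comma-ok form. The following is a minimal sketch: M is a stand-in with the same underlying type as bson.M (with mgo you would assert to bson.M itself), and sha1Of is a hypothetical helper name:

```go
package main

import "fmt"

// M is a stand-in for bson.M, which has the same underlying type.
type M map[string]interface{}

// sha1Of returns doc["value"]["sha1"] as a string. The boolean is false
// if any level is missing or has an unexpected type.
func sha1Of(doc M) (string, bool) {
	value, ok := doc["value"].(M)
	if !ok {
		return "", false
	}
	s, ok := value["sha1"].(string)
	return s, ok
}

func main() {
	docs := []M{
		{"value": M{"sha1": "da39a3ee"}},
		{"value": "not a document"},
		{},
	}
	for _, d := range docs {
		s, ok := sha1Of(d)
		fmt.Printf("%q %v\n", s, ok)
	}
}
```

Inside the loop you could then skip documents for which the boolean is false instead of letting the whole aggregation run panic.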