I've searched and found no Go solution to this problem, with or without mgo.v2, neither on Stack Overflow nor on any other site. This Q&A is in the spirit of knowledge sharing / documentation.
Let's say we have a users collection in MongoDB modeled with this Go struct:
type User struct {
    ID      bson.ObjectId `bson:"_id"`
    Name    string        `bson:"name"`
    Country string        `bson:"country"`
}
We want to sort and list users based on some criteria, but have paging implemented due to the expected long result list.
To achieve paging of the results of a query, MongoDB and the mgo.v2 driver package have built-in support in the form of Query.Skip() and Query.Limit(), e.g.:
session, err := mgo.Dial(url) // Acquire Mongo session, handle error!
c := session.DB("").C("users")
q := c.Find(bson.M{"country": "USA"}).Sort("name", "_id").Limit(10)
// To get the nth page:
q = q.Skip((n-1)*10)
var users []*User
err = q.All(&users)
This however becomes slow as the page number increases: MongoDB cannot just "magically" jump to the xth document in the result; it has to iterate over all the result documents and omit (not return) the first x that need to be skipped.
MongoDB provides the right solution: if the query operates on an index (which it must for this to work), cursor.min() can be used to specify the first index entry to start listing results from.
This Stack Overflow answer shows how it can be done using a mongo client: How to do pagination using range queries in MongoDB?
Note: the required index for the above query would be:
db.users.createIndex(
    {
        country: 1,
        name: 1,
        _id: 1
    }
)
There is one problem though: the mgo.v2 package has no support for specifying this min().
How can we achieve efficient paging that uses MongoDB's cursor.min() feature using the mgo.v2 driver?
Unfortunately the mgo.v2 driver does not provide API calls to specify cursor.min().
But there is a solution. The mgo.Database type provides a Database.Run() method to run any MongoDB command. The available commands and their documentation can be found here: Database commands
Starting with MongoDB 3.2, a new find command is available which can be used to execute queries, and it supports specifying the min argument that denotes the first index entry to start listing results from.
Good. What we need to do is: after each batch (the documents of one page), generate the min document from the last document of the query result (it must contain the values of the index entry that was used to execute the query); then the next batch (the documents of the next page) can be acquired by setting this min index entry prior to executing the query.
This index entry –let's call it cursor from now on– may be encoded to a string and sent to the client along with the results, and when the client wants the next page, they send back the cursor saying they want results starting after this cursor.
Doing it manually (the "hard" way)
The command to be executed can take different forms, but the command name (find) must come first in the marshaled result, so we'll use bson.D (which preserves order, in contrast to bson.M):
limit := 10
cmd := bson.D{
    {Name: "find", Value: "users"},
    {Name: "filter", Value: bson.M{"country": "USA"}},
    {Name: "sort", Value: bson.D{
        {Name: "name", Value: 1},
        {Name: "_id", Value: 1},
    }},
    {Name: "limit", Value: limit},
    {Name: "batchSize", Value: limit},
    {Name: "singleBatch", Value: true},
}
// min is the cursor data parsed from the client's cursor string
// (nil when the first page is requested).
if min != nil {
    // min is inclusive, so we must skip the first document (the previous last)
    cmd = append(cmd,
        bson.DocElem{Name: "skip", Value: 1},
        bson.DocElem{Name: "min", Value: min},
    )
}
The result of executing a MongoDB find command with Database.Run() can be captured with the following type:
var res struct {
    OK       int `bson:"ok"`
    WaitedMS int `bson:"waitedMS"`
    Cursor   struct {
        ID         interface{} `bson:"id"`
        NS         string      `bson:"ns"`
        FirstBatch []bson.Raw  `bson:"firstBatch"`
    } `bson:"cursor"`
}
db := session.DB("")
if err := db.Run(cmd, &res); err != nil {
    // Handle error (abort)
}
We now have the results, but in a slice of type []bson.Raw, and we want them in a slice of type []*User. This is where Collection.NewIter() comes in handy. It can transform (unmarshal) a value of type []bson.Raw into any type we would usually pass to Query.All() or Iter.All(). Let's see it:
firstBatch := res.Cursor.FirstBatch
var users []*User
err = db.C("users").NewIter(nil, firstBatch, 0, nil).All(&users)
We now have the users of the next page. Only one thing left: generating the cursor to be used to get the subsequent page, should we ever need it:
var cursorData bson.D
if len(users) > 0 {
    lastUser := users[len(users)-1]
    cursorData = bson.D{
        {Name: "country", Value: lastUser.Country},
        {Name: "name", Value: lastUser.Name},
        {Name: "_id", Value: lastUser.ID},
    }
} else {
    // No more users found, keep using the last cursor
}
This is all good, but how do we convert cursorData to a string and vice versa? We may use bson.Marshal() and bson.Unmarshal() combined with base64 encoding; using base64.RawURLEncoding gives us a web-safe cursor string, one that can be added to URL queries without escaping.
Here's an example implementation:
// CreateCursor returns a web-safe cursor string from the specified fields.
// The returned cursor string is safe to include in URL queries without escaping.
func CreateCursor(cursorData bson.D) (string, error) {
    // bson.Marshal() never returns an error in practice, so we skip the
    // early-return check (but still propagate the error should it ever happen).
    data, err := bson.Marshal(cursorData)
    return base64.RawURLEncoding.EncodeToString(data), err
}

// ParseCursor parses the cursor string and returns the cursor data.
func ParseCursor(c string) (cursorData bson.D, err error) {
    var data []byte
    if data, err = base64.RawURLEncoding.DecodeString(c); err != nil {
        return
    }
    err = bson.Unmarshal(data, &cursorData)
    return
}
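For example, a full round trip with these helpers might look like this (a minimal sketch; variable names are illustrative):
// Encode the cursor data and send it to the client along with the page:
cursorStr, err := CreateCursor(cursorData)
if err != nil {
    // Handle error (abort)
}

// Later, when the client requests the next page using this cursor string:
min, err := ParseCursor(cursorStr)
if err != nil {
    // Handle error (abort)
}
// min can now be appended to the find command, as shown earlier.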
And there we have our efficient, but not so short, MongoDB paging functionality for mgo. Read on...
Using github.com/icza/minquery (the "easy" way)
The manual way is quite lengthy, but it can be generalized and automated. This is where github.com/icza/minquery comes into the picture (disclosure: I'm the author). It provides a wrapper to configure and execute a MongoDB find command, allowing you to specify a cursor; after executing the query, it gives you back the new cursor to be used to query the next batch of results. The wrapper is the MinQuery type, which is very similar to mgo.Query but supports specifying MongoDB's min via the MinQuery.Cursor() method.
The above solution using minquery looks like this:
q := minquery.New(session.DB(""), "users", bson.M{"country": "USA"}).
    Sort("name", "_id").Limit(10)
// If this is not the first page, set the cursor:
// getLastCursor() represents your logic for acquiring the last cursor.
if cursor := getLastCursor(); cursor != "" {
    q = q.Cursor(cursor)
}
var users []*User
newCursor, err := q.All(&users, "country", "name", "_id")
And that's all. newCursor is the cursor to be used to fetch the next batch.
Note #1: When calling MinQuery.All(), you have to provide the names of the cursor fields; these will be used to build the cursor data (and ultimately the cursor string).
Note #2: If you're retrieving partial results (by using MinQuery.Select()), you have to include all the fields that are part of the cursor (the index entry) even if you don't intend to use them directly, otherwise MinQuery.All() will not have all the values of the cursor fields, and so it will not be able to build the proper cursor value.
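For example, a partial-result query honoring this rule might look like the following sketch (assuming MinQuery.Select() takes an mgo-style selector; the field list is illustrative):
// Even though only "name" is needed for display, the selector must also
// include the other cursor fields "country" and "_id":
q := minquery.New(session.DB(""), "users", bson.M{"country": "USA"}).
    Select(bson.M{"country": 1, "name": 1, "_id": 1}).
    Sort("name", "_id").
    Limit(10)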
Check out the package doc of minquery here: https://godoc.org/github.com/icza/minquery; it is rather short and hopefully clean.
Related
I'm using the Go Mongo driver, whose documentation explicitly states that FindOne returns ErrNoDocuments if no documents match the filter; however, this error is not returned if I use the Find function and no documents are found. Is there a way to check that the cursor is empty without getting a list of all returned documents and then checking if the list is empty?
You may simply call Cursor.Next() to tell if there are more documents. If you haven't iterated over any yet, this tells you whether there is at least one result document.
Note that this causes the first batch of results to be fetched, though (but it will not decode any of the result documents into Go values).
Also note that Cursor.Next() returns false if an error occurs or the passed context.Context expires.
Example:
var c *mongo.Collection // Acquire collection
curs, err := c.Find(ctx, bson.M{"your-query": "here"})
if err != nil {
    // Handle error (abort)
}
defer curs.Close(ctx) // Don't forget to close the cursor!
if hasResults := curs.Next(ctx); hasResults {
    // There are result documents
}
Although if you intend to decode the results, you might as well just call Cursor.All() and check the length of the result slice:
curs, err := c.Find(ctx, bson.M{"your-query": "here"})
if err != nil {
    // Handle error (abort)
}
var results []YourType
if err := curs.All(ctx, &results); err != nil {
    // Handle error
}
if len(results) > 0 {
    // There are results
}
// Note: Cursor.All() closes the cursor
I'm using the Mongo Go adapter: github.com/mongodb/mongo-go-driver/
I'm trying different patterns, but none of them is working for me.
// ref struct (note: the "bosn" tag typo is addressed in the answer below)
type userbase struct {
    Name  string `bosn:"Name"`
    Coins int    `bson:"Coins"`
}
// ref code; it's updating _id but not updating a value
filter := bson.M{"name": "Dinamis"}
update := bson.D{{"$inc", bson.M{"Coins": 1}}}
db := Client.Database("Nothing").Collection("dataUser")
db.UpdateOne(context.Background(), filter, update)
// update filters that I also used
update := bson.D{{"$inc", bson.D{{"Coins", 1}}}}
// simpler ways were tried as well
update := &userbase{name, amount} // should I try *userbase{}?
// I also tried
ReplaceOne()
FindOneAndReplace()
FindOneAndUpdate()
It's hard to dig deeper because of the lack of actual documentation: https://docs.mongodb.com/ecosystem/drivers/go/
Thanks @Wan Bachtiar for answering this in the official mongodb-go-driver group.
By default, queries in MongoDB are case sensitive on the field name. In your struct you defined the field to be Name, but your filter specifies name. This results in no documents matching the query predicates for the update operation. For example, if you have a document as below:
{ "_id": ObjectId("..."), "Name": "Dinamis", "Coins": 1 }
You can perform an update to increment the number of Coins using the snippet below:
collection := client.Database("Nothing").Collection("dataUser")
filter := bson.M{"Name": "Dinamis"}
update := bson.D{{"$inc", bson.M{"Coins": 1}}}
result, err := collection.UpdateOne(context.TODO(), filter, update)
Also, note that you have a typo in the bson tag in your struct. It's supposed to be bson:"Name", not bosn:"Name". You may find Query Documents a useful reference (select the Go tab to show examples in Go).
Regards, Wan.
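For completeness, here is the asker's struct with the tag typo fixed (a minimal correction; field names kept as in the original):
// Corrected struct: the misspelled `bosn` tag is now `bson`, so the
// Name field is properly (un)marshaled as "Name".
type userbase struct {
    Name  string `bson:"Name"`
    Coins int    `bson:"Coins"`
}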
I need to build a list of "pages", so part of this there will be a cursor. The issue is that I can't find a way to encode (to string) and decode the cursor. Any idea? The Cursor interface has no "encoding" method (there is ID, though undocumented), and there is no way to create a new cursor from a string (or int).
type Cursor interface {
    // Get the ID of the cursor.
    ID() int64
    // Get the next result from the cursor.
    // Returns true if there were no errors and there is a next result.
    Next(context.Context) bool
    Decode(interface{}) error
    DecodeBytes() (bson.Reader, error)
    // Returns the error status of the cursor
    Err() error
    // Close the cursor.
    Close(context.Context) error
}
Why do I need the cursor encoded?
To provide pagination to the end client through an HTML or JSON API.
MongoDB does not provide a serializable cursor. The recommended workaround is to use a range query and sort on a field that generally changes in a consistent direction over time, such as _id.
function printStudents(startValue, nPerPage) {
    let endValue = null;
    db.students.find( { _id: { $lt: startValue } } )
        .sort( { _id: -1 } )
        .limit( nPerPage )
        .forEach( student => {
            print( student.name );
            endValue = student._id;
        } );
    return endValue;
}
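Since this document is Go-focused, here is a rough mgo.v2 equivalent of the shell function above (a sketch assuming the usual fmt, gopkg.in/mgo.v2 and gopkg.in/mgo.v2/bson imports; the Student struct and field names are illustrative):
type Student struct {
    ID   bson.ObjectId `bson:"_id"`
    Name string        `bson:"name"`
}

// printStudents prints one page of students and returns the _id to be
// used as startValue for the following page.
func printStudents(c *mgo.Collection, startValue bson.ObjectId, nPerPage int) (bson.ObjectId, error) {
    var students []Student
    err := c.Find(bson.M{"_id": bson.M{"$lt": startValue}}).
        Sort("-_id").
        Limit(nPerPage).
        All(&students)
    if err != nil {
        return "", err
    }
    var endValue bson.ObjectId
    for _, s := range students {
        fmt.Println(s.Name)
        endValue = s.ID
    }
    return endValue, nil
}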
There is a Go package, minquery, that tries to make the cursor query/serialization more convenient. You may find it helpful.
A mongo.Cursor object isn't something you can encode and put away for later use, which is what you intend to do.
A mongo.Cursor is something you use to iterate over a "live query", a stream of documents. You can't use it to return a batch of documents, send them to your client, and then, when the client requests more documents (the next page), decode the stored cursor and continue where you left off. A cursor has a server-side resource under the hood, which is kept for 10 minutes (configurable, see cursorTimeoutMillis) or until you close the cursor. You do not want to keep the cursor alive while waiting for the client to decide whether more documents are needed, especially in an application with heavy traffic; your MongoDB would quickly run out of resources. If a cursor is closed by the timeout, any attempt to read from it results in the error "Cursor not found, cursor id: #####".
The Cursor.Decode() method is not to decode a cursor from some encoded form. It is to decode the next document the cursor designates into a Go value.
That's why there is no magic mongo.NewCursor() or mongo.ParseCursor() or mongo.DecodeCursor() function. A mongo.Cursor is handed to you by executing queries, e.g. with Collection.Find():
func (coll *Collection) Find(ctx context.Context, filter interface{},
opts ...findopt.Find) (Cursor, error)
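To illustrate what Cursor.Decode() is actually for, a minimal consumption loop might look like this sketch (coll, ctx and YourType are placeholders):
curs, err := coll.Find(ctx, bson.M{})
if err != nil {
    // Handle error (abort)
}
defer curs.Close(ctx)
for curs.Next(ctx) {
    var doc YourType
    if err := curs.Decode(&doc); err != nil {
        // Handle error (abort)
    }
    // Use doc: one result document of the "live query".
}
if err := curs.Err(); err != nil {
    // Handle error
}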
1. In the back end I'm using Go, and for the database I use MongoDB. I'm trying to find the last document inserted in an embedded array, so I can retrieve the document at the last array index without knowing its index. Is it possible?
After researching this, I came to know that it's not possible. So I'm thinking of using $push, $each and $position. Here I can set the position to 0, so the newly added document will be at index 0 and I can retrieve it using index 0.
Here is the BSON format:
{
    empid: "L12",
    AnnualLeave: [
        {
            "atotal": 20
        }
    ]
}
Here is my schema:
type (
    Employee struct {
        EmpId       string
        AnnualLeave []*AnnualLeaveInfo
    }
    AnnualLeaveInfo struct {
        ATotal int64
    }
)
I use the mgo statement as follows:
c.Update(bson.M{"empid": "string"}, bson.M{"$push": bson.M{"annualleave": bson.M{
    "$each":     []bson.M{{"atotal": 4}},
    "$position": 0,
}}})
2. Please advise me as well how to decrement the ATotal of the previous document and keep it as the value of the atotal of the new document.
Please help me. Thanks.
I'm trying to find the last document inserted in the embedded array so I can retrieve the document at the last array index without knowing its index. Is it possible?
You can find the last array index by deriving it from the array length. Using your example:
type Employee struct {
    EmpId       string
    AnnualLeave []AnnualLeaveInfo
}

type AnnualLeaveInfo struct {
    ATotal int64
}

result := Employee{}
err := c.Find(bson.M{"empid": "example employee ID"}).One(&result)
if err != nil {
    log.Fatal(err)
}
lastAnnualTotal := result.AnnualLeave[len(result.AnnualLeave)-1].ATotal
Please advise me as well how to decrement the ATotal of the previous document and keep it as the value of the atotal of the new document.
Depending on your use case, you could try performing two database operations:
Fetch the last ATotal value from the collection.
Push a new AnnualLeaveInfo document with the new ATotal value.
// Assuming that EmpId is unique
err = c.Update(bson.M{"empid": result.EmpId},
    bson.M{"$push": bson.M{"annualleave": bson.M{"atotal": lastAnnualTotal - 1}}})
If you require atomic updates, see MongoDB Atomicity and Transactions and Model Data for Atomic Operations.
On another note, it seems that you're trying to do something related to the CQRS design pattern, which may help with calculating your annual-leave use case. See also Event Sourcing with MongoDB.
I'm working with Go and MongoDB. I have a collection which needs to keep documents that can be persistent or volatile. If an expiry date is set (as in the example's expireAt), the document is considered volatile and gets deleted; otherwise, it is kept in the collection unless deleted manually.
Reading this doc, I've found an index that might work the way I need.
Basically I need to replicate this kind of index in mgo:
db.log_events.createIndex( { "expireAt": 1 }, { expireAfterSeconds: 0 } )
db.log_events.insert( {
    "expireAt": new Date('July 22, 2013 14:00:00'),
    "logEvent": 2,
    "logMessage": "Success!"
} )
I've read (I'm searching back for the source of this information) that if expireAt is not a valid date, the deletion won't be triggered. Thus I think all I need to do is set the expire date to a valid date when I need it, and otherwise leave it at the Go time.Time zero value.
This is my codebase
type Filter struct {
    Timestamp time.Time `bson:"createdAt"`
    ExpireAt  time.Time `bson:"expireAt"`
    Body      string    `bson:"body"`
}
// Create filter from data received via REST API.
var filter Filter
timestamp := time.Now()
if theUserAction == "share" { // This action marks the document as volatile
    filter.ExpireAt = time.Now().Add(24 * time.Hour * 14)
}
filter.Timestamp = timestamp
filter.Body = "A BODY"

// Store filter in database
session, err := mdb.GetMgoSession() // Wrapping method that returns a valid mgo session
if err != nil {
    return NewErrorInternal("Error connecting to database", err)
}
defer session.Close()

// Get the collection with the global data
collection := session.DB(database).C(filtersCollection)
My question is: how can I set up the index so that it deletes the document if the expireAt key is valid?
Reading the mgo documentation about the Index type, it doesn't seem like there's a way to replicate the previously stated index, since the library only provides the ExpireAfter field.
Also, is it valid to assume that a zero value would be interpreted by MongoDB as an invalid date? From the docs it is January 1, year 1, 00:00:00.000000000 UTC, which actually seems like a valid date.
What I've thought of so far is something like this:
filtIdx := mgo.Index{
    Key:         []string{"expireAt"},
    Unique:      false,
    Background:  true,
    Sparse:      false,
    ExpireAfter: 0,
}
How can I set up the index so that it deletes the document if the expireAt key is valid?
An example to set a TTL index using mgo.v2 is as below:
index := mgo.Index{
    Key:         []string{"expireAt"},
    ExpireAfter: time.Second * 120,
}
err = coll.EnsureIndex(index)
The above example sets expiration to 120 seconds. See also Expire Data from Collections by Setting TTL.
Is it still possible to make some documents not expire at all? This is the behavior I'm looking for: a collection where some documents expire while others remain persistent.
You can specify the omitempty flag for the ExpireAt struct field, as below:
type Filter struct {
    Timestamp time.Time `bson:"createdAt"`
    Body      string    `bson:"body"`
    ExpireAt  time.Time `bson:"expireAt,omitempty"`
}
This essentially includes the field only if it's not set to a zero value. For more info, see the mgo.v2 bson.Marshal documentation.
Now, for example, you can insert two documents where one will expire and the other persists. Code example:
var foo Filter
foo.Timestamp = timestamp
foo.Body = "Will be deleted per TTL index"
foo.ExpireAt = time.Now()
collection.Insert(foo)
var bar Filter
bar.Timestamp = timestamp
bar.Body = "Persists until expireAt value is set"
collection.Insert(bar)
Later on, you can set the expireAt field with an Update(), as an example:
newValue := bson.M{"$set": bson.M{"expireAt": time.Now()}}
err = collection.Update(queryFilter, newValue)
Setting a valid time value for the expireAt field makes the document qualify for the TTL index, i.e. it no longer persists.
Depending on your use case, you may alternatively Remove() the document instead of updating it and relying on the TTL index.
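For instance, a direct removal might look like this one-liner (queryFilter as in the earlier update example):
// Remove the document immediately instead of waiting for the TTL monitor:
err = collection.Remove(queryFilter)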