Go MongoDB driver not returning all documents

I'm having trouble getting the full expected results from a MongoDB query using the Go driver.
I'm querying a collection with 5791 documents totaling around 150MB. When the query returns a large amount of data, the cursor does not seem to iterate over the complete set of expected documents.
For example:
The query returns 2290 documents instead of the expected 5791, with no error from Find and none while iterating the cursor.
Is there anything in the FindOptions for Collection.Find(), perhaps, that removes a byte size limit on the query results?
Here is the code I'm using:
func (db *Database) ExecuteQuery(coll string, query bson.M) ([]bson.M, error) {
// Retrieve the appropriate database and collection to query on
collection, ctx, cancel := database.getCollection(coll)
defer cancel()
cursor, err := collection.Find(ctx, query)
if err != nil {
return nil, err
}
var res []bson.M
for cursor.Next(ctx) {
//Create a value into which the single document can be decoded
var elem bson.M
err := cursor.Decode(&elem)
if err != nil {
return nil, err
}
res = append(res, elem)
}
cursor.Close(ctx)
return res, nil
}

It turns out the issue was in the method I wrote to get a collection from the database: it created the context with context.WithTimeout and a 10-second timeout. That capped how long any query using this context could run, so only the documents that could be fetched within that period were returned.
The code below shows the change: the context is now created with context.WithCancel, so queries can take as long as they need.
// getCollection retrieves the appropriate collection to query on and returns the context and context cancelling function tied to it.
func (db *Database) getCollection(coll string) (*mongo.Collection, context.Context, context.CancelFunc) {
mongoDB := db.client.Database(db.Config.DatabaseName)
collection := mongoDB.Collection(coll)
ctx, cancel := context.WithCancel(context.Background())
return collection, ctx, cancel
}

Related

(CursorNotFound) Cursor not found (namespace: 'dbName.collection', id: 885805460243113719)

The following code fetches results from the database, given a collection, a filter, a sort specification and a limit.
func DBFetch(collection *mongo.Collection, filter interface{}, sort interface{}, limit int64) ([]bson.M, error) {
findOptions := options.Find()
findOptions.SetLimit(limit)
findOptions.SetSort(sort)
cursor, err := collection.Find(context.Background(), filter, findOptions)
if err != nil {
// cursor is nil when Find fails, so there is nothing to close here
logger.Client().Error(err.Error())
sentry.CaptureException(err)
return nil, err
}
var result []bson.M
if err = cursor.All(context.Background(), &result); err != nil {
logger.Client().Error(err.Error())
sentry.CaptureMessage(err.Error())
return nil, err
}
return result, nil
}
I am using mongo-go driver version 1.8.2.
MongoDB community version 4.4.7, sharded, with 2 shards.
Each shard runs in k8s with 30 CPUs and 245GB of memory, and has 1 replica.
The API receives about 200 rpm.
The API fetches the data from Mongo, formats it and then serves it.
We are both reading and writing on the primary.
Heavy writes occur approximately every hour.
Getting timeouts in milliseconds (10ms-20ms approx.)
As pointed out by @R2D2 in the comments, the cursor-not-found error occurs when the default cursor timeout (10 minutes) is exceeded without Go requesting the next batch of data.
There are a couple of workarounds that can mitigate this error.
The first option is to set a batch size for your find query with the option below. This instructs MongoDB to send the data in chunks of the specified size rather than in larger batches. Note that this will usually increase the number of round trips between MongoDB and the Go server.
findOptions := options.Find()
findOptions.SetBatchSize(10) // <- Batch size is set to `10`
cursor, err := collection.Find(context.Background(), filter, findOptions)
Furthermore, you can set the NoCursorTimeout option, which keeps the cursor for your find query alive on the server until you close it manually. This option is a double-edged sword: you have to close the cursor yourself once you no longer need it, otherwise it stays in memory for a prolonged time.
findOptions := options.Find()
findOptions.SetNoCursorTimeout(true) // <- Applies no cursor timeout option
cursor, err := collection.Find(context.Background(), filter, findOptions)
// VERY IMPORTANT
_ = cursor.Close(context.Background()) // <- Don't forget to close the cursor
Combining the two options above gives the complete code below.
func DBFetch(collection *mongo.Collection, filter interface{}, sort interface{}, limit int64) ([]bson.M, error) {
	findOptions := options.Find()
	findOptions.SetLimit(limit)
	findOptions.SetSort(sort)
	findOptions.SetBatchSize(10)         // <- Batch size is set to `10`
	findOptions.SetNoCursorTimeout(true) // <- Applies no cursor timeout option
	cursor, err := collection.Find(context.Background(), filter, findOptions)
	if err != nil {
		// cursor is nil when Find fails, so there is nothing to close here
		//logger.Client().Error(err.Error())
		//sentry.CaptureException(err)
		return nil, err
	}
	// VERY IMPORTANT
	defer cursor.Close(context.Background()) // <- Don't forget to close the cursor
	var result []bson.M
	if err = cursor.All(context.Background(), &result); err != nil {
		//logger.Client().Error(err.Error())
		//sentry.CaptureMessage(err.Error())
		return nil, err
	}
	return result, nil
}

Search documents using $gt filter with go-mongodb driver

I'm stuck on what is probably a simple problem. If I apply this filter in MongoDB Compass: {dateTime:{$gt: new Date("2020-11-23T12:31:38")}}
it returns 556 documents.
Creating a cursor in Go that contains those documents is proving to be quite hard!
This is what I have right now:
cursor, err := coll.Find(context.Background(), bson.M{"dateTime": bson.M{"$gt": "new Date("+ date + ")"}}, opt)
if err != nil {
fmt.Println("Err creating cursor: ", err)
return nil, err
}
if cursor.Next(context.Background()) {
fmt.Println("Cursor0!")
cursor.Next(context.Background())
}
cursor1, err := coll.Find(context.Background(), bson.M{}, opt)
if err != nil {
fmt.Println("Err creating cursor: ", err)
return nil, err
}
if cursor1.Next(context.Background()) {
fmt.Println("Cursor1!")
cursor1.Next(context.Background())
}
Among other attempts, I've also tried passing the filter simply as bson.M{"dateTime": bson.M{"$gt": date}}, but those also returned 0 documents. The date variable holds exactly the date used in the MongoDB Compass filter.
I created another cursor with no filter, just to check that the connection to Mongo is OK and that it returns documents when there is no filter, and it does. Does anyone know the answer to this one?
Thanks!!
new Date("2020-11-23T12:31:38") is JavaScript syntax. You need to use the proper Go syntax for creating timestamps.
The problem was that I was dealing with more than one collection: in one the date was saved as a string, and in the other as a date. In the collection where the date is saved as a string, no surprise, we have to send the date as a string too; the same logic applies when the date is stored in Mongo as a Date.

How to check if a record exists with golang and the official mongo driver

I'm using the official mongo driver in golang and am trying to determine whether a record exists. Unfortunately, the documentation doesn't explain how to do this. I'm attempting to do it with FindOne, but it returns an error when no results are found, and I don't know how to distinguish that error from any other error (short of comparing strings, which feels wrong). What's the right way to check if a document exists in mongo with the official golang driver?
Here's my code.
ctx := context.Background()
var result Page
err := c.FindOne(ctx, bson.D{{"url", url}}).Decode(&result)
fmt.Println("err: ", err)
// how do I distinguish which error type here?
if err != nil {
log.Fatal(err)
}
Here's the answer.
var coll *mongo.Collection
var id primitive.ObjectID
// find the document for which the _id field matches id
// specify the Sort option to sort the documents by age
// the first document in the sorted order will be returned
opts := options.FindOne().SetSort(bson.D{{"age", 1}})
var result bson.M
err := coll.FindOne(context.TODO(), bson.D{{"_id", id}}, opts).Decode(&result)
if err != nil {
// ErrNoDocuments means that the filter did not match any documents in the collection
if err == mongo.ErrNoDocuments {
return
}
log.Fatal(err)
}
fmt.Printf("found document %v", result)

Checking if data exists in a mongodb collection in goLang?

If I want to check whether at least one document currently exists in a collection, how would I go about doing this in Go?
The most performant way to check if documents exist in a collection is to use the EstimatedDocumentCount function on a collection because it gets an estimate from the collection's metadata.
You can do something like this:
count, err := collection.EstimatedDocumentCount(context.Background())
If the exact count of documents in the collection matters and an estimate is not enough, it makes sense to look into the MongoDB aggregation framework.
You can do something like this, which wraps the aggregation framework:
count, err := collection.CountDocuments(ctx, bson.M{})
if err != nil {
panic(err)
}
if count >= 1 {
fmt.Println("Documents exist in this collection!")
}
You could also try something like the following if you want to use the aggregation framework directly:
cursor, err := episodesCollection.Aggregate(ctx, []bson.D{
bson.D{{"$count", "mycount"}},
})
if err != nil {
panic(err)
}
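var counts []bson.M
if err := cursor.All(ctx, &counts); err != nil {
panic(err)
}
// $count emits no document at all when the collection is empty
if len(counts) > 0 {
fmt.Println(counts[0]["mycount"])
}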

fetching the data from a mongodb in golang

I'm trying to fetch data from MongoDB in Go using the gopkg.in/mgo.v2 driver. The format of the data is not fixed: some documents contain fields that other documents might not.
Here is the code:
session, err := mgo.Dial("mongodb://root:root@localhost:27017/admin")
CheckError(err, "error")
db := session.DB("test")
fmt.Println(reflect.TypeOf(db))
result := make(map[string]string)
//query := make(map[string]string)
//query["_id"] = "3434"
err1 := db.C("mycollection").Find(nil).One(&result)
CheckError(err1,"error")
for k := range result {
fmt.Println(k)
}
Now the data contained in the collection is { "_id" : "3434", "0" : 1 }, but the for loop prints only _id. Shouldn't there be two keys, '_id' and '0'? Or am I doing something wrong here?
Oh, I found the solution:
the "result" variable should be of type bson.M, and then you can type-assert accordingly as you go deeper into the nested structure.
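A likely explanation for the missing key (an assumption, the thread does not confirm it): the value 1 is an integer, which does not fit into a map[string]string, so that field gets dropped during decoding. With a map of interface{} values every field survives, and a type switch tells you what each one holds. The sketch below defines its own type M, mirroring the definition of bson.M, so that it runs without a driver import. The kinds helper is made up for illustration.

```go
package main

import "fmt"

// M mirrors the definition of bson.M (a named map[string]interface{}) so the
// sketch runs without importing a driver.
type M map[string]interface{}

// kinds reports, for each top-level key, what sort of value it holds.
func kinds(doc M) map[string]string {
	out := make(map[string]string, len(doc))
	for k, v := range doc {
		switch v.(type) {
		case string:
			out[k] = "string"
		case int, int32, int64:
			out[k] = "integer"
		case float64:
			out[k] = "float"
		case M:
			out[k] = "document" // nested document: recurse or assert further
		case []interface{}:
			out[k] = "array"
		default:
			out[k] = fmt.Sprintf("%T", v)
		}
	}
	return out
}

func main() {
	doc := M{"_id": "3434", "0": 1, "meta": M{"source": "import"}}
	for k, kind := range kinds(doc) {
		fmt.Printf("%s holds a %s\n", k, kind)
	}
}
```

With real decoded data, nested documents arrive as the driver's own bson.M type, so the M case would need to name that type instead.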
Give the following piece of code a try. It fetches matching records from the database using BSON objects.
Don't forget to change the database name and collection name in the code below to those of your MongoDB deployment. The query parameter also needs to be changed accordingly.
Happy coding...
package main
import (
"context"
"fmt"
"time"
"go.mongodb.org/mongo-driver/bson"
"go.mongodb.org/mongo-driver/mongo"
"go.mongodb.org/mongo-driver/mongo/options"
)
// close is a user-defined function that releases resources.
// It closes the MongoDB connection and cancels the context.
func close(client *mongo.Client, ctx context.Context, cancel context.CancelFunc) {
defer cancel()
defer func() {
if err := client.Disconnect(ctx); err != nil {
panic(err)
}
}()
}
// connect is a user-defined function that returns
// a mongo.Client, context.Context,
// context.CancelFunc and error.
// mongo.Client is used for further database
// operations. context.Context is used to set
// deadlines for the process. context.CancelFunc
// is used to cancel the context and the resources
// associated with it.
func connect(uri string) (*mongo.Client, context.Context, context.CancelFunc, error) {
ctx, cancel := context.WithTimeout(context.Background(),
30*time.Second)
client, err := mongo.Connect(ctx, options.Client().ApplyURI(uri))
return client, ctx, cancel, err
}
// query is a user-defined function used to query MongoDB.
// It accepts a mongo.Client, a context, a database name,
// a collection name, a query and a field.
// The database name and collection name are of type
// string. query is of type interface{}.
// field is of type interface{} and limits
// the fields being returned.
// query returns a cursor and an error.
func query(client *mongo.Client, ctx context.Context, dataBase, col string, query, field interface{}) (result *mongo.Cursor, err error) {
// select database and collection.
collection := client.Database(dataBase).Collection(col)
// collection has an method Find,
// that returns a mongo.cursor
// based on query and field.
result, err = collection.Find(ctx, query,
options.Find().SetProjection(field))
return
}
func main() {
// Get the client, context, CancelFunc and error from connect.
client, ctx, cancel, err := connect("mongodb://localhost:27017")
if err != nil {
panic(err)
}
// Free the resources when the main function returns
defer close(client, ctx, cancel)
// create a filter and an option of type interface{},
// which store BSON objects.
var filter, option interface{}
// filter matches the document whose _id equals "3434"
// (the _id in the question's sample document is a string)
filter = bson.D{
{"_id", bson.D{{"$eq", "3434"}}},
}
// option removes the _id field from all returned documents
option = bson.D{{"_id", 0}}
// call the query function with the client, context,
// database name, collection name, filter and option.
// It returns a mongo.Cursor and an error, if any.
cursor, err := query(client, ctx, "YourDataBaseName",
"YourCollectionName", filter, option)
// handle the errors.
if err != nil {
panic(err)
}
var results []bson.D
// to get bson object from cursor,
// returns error if any.
if err := cursor.All(ctx, &results); err != nil {
// handle the error
panic(err)
}
// printing the result of query.
fmt.Println("Query Result")
for _, doc := range results {
fmt.Println(doc)
}
}