Go + MongoDB: polymorphic queries

(sorry this question turned out longer than I had thought...)
I'm using Go and MongoDB with the mgo driver. I'm trying to persist and retrieve different structs (implementing a common interface) in the same MongoDB collection. I'm coming from the Java world (where this is very easily done with Spring with literally no config) and I'm having a hard time doing something similar with Go.
I have read every last related article or post or StackExchange question I could find, but still I haven't found a complete solution. This includes:
Unstructured MongoDB collections with mgo
How do you create a new instance of a struct from its type at run time in Go?
Golang reflect: Get Type representation from name?
Here's a simplified setup I use for testing. Suppose two structs S1 and S2, implementing a common interface I. S2 has an embedded field of type S1, meaning that structure-wise an S2 value embeds an S1 value, and that type-wise S2 implements I.
type I interface {
    f1()
}

type S1 struct {
    X int
}

type S2 struct {
    S1
    Y int
}

func (*S1) f1() {
    fmt.Println("f1")
}
Now I can save an instance of S1 or S2 easily using mgo.Collection.Insert(), but to properly populate a value using mgo.Collection.Find().One() for example, I need to pass a pointer to an existing value of S1 or S2, which means I already know the type of the object I want to read!!
I want to be able to retrieve a document from the MongoDB collection without knowing if it's a S1, or a S2, or in fact any object that implements I.
Here's where I am so far: instead of directly saving the object I want to persist, I save a Wrapper struct that holds the MongoDB id, an identifier of the type, and the actual value. The type identifier is a concatenation of packageName + "." + typeName, and it is used to look up the type in a type registry, since there is no native mechanism in Go to map from a type name to a Type object. This means I need to register the types that I want to be able to persist and retrieve, but I could live with that. Here's how it goes:
typeregistry.Register(reflect.TypeOf((*S1)(nil)).Elem())
typeregistry.Register(reflect.TypeOf((*S2)(nil)).Elem())
Here's the code for the type registry:
var types map[string]reflect.Type

func init() {
    types = make(map[string]reflect.Type)
}

func Register(t reflect.Type) {
    key := GetKey(t)
    types[key] = t
}

func GetKey(t reflect.Type) string {
    key := t.PkgPath() + "." + t.Name()
    return key
}

func GetType(key string) reflect.Type {
    t := types[key]
    return t
}
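For reference, the Wrapper struct itself isn't shown in the question; a minimal sketch of what it could look like (field names taken from the save/get code below, the bson tags are assumptions):
type Wrapper struct {
    Id      bson.ObjectId `bson:"_id"`
    TypeKey string        `bson:"typeKey"`
    Val     interface{}   `bson:"val"`
}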
The code for saving an object is quite straightforward:
func save(coll *mgo.Collection, s I) (bson.ObjectId, error) {
    t := reflect.TypeOf(s)
    wrapper := Wrapper{
        Id:      bson.NewObjectId(),
        TypeKey: typeregistry.GetKey(t),
        Val:     s,
    }
    return wrapper.Id, coll.Insert(wrapper)
}
The code for retrieving an object is a bit trickier:
func getById(coll *mgo.Collection, id interface{}) (*I, error) {
    // read wrapper
    wrapper := Wrapper{}
    err := coll.Find(bson.M{"_id": id}).One(&wrapper)
    if err != nil {
        return nil, err
    }
    // obtain Type from registry
    t := typeregistry.GetType(wrapper.TypeKey)
    // get a pointer to a new value of this type
    pt := reflect.New(t)
    // FIXME populate value using wrapper.Val (type bson.M)
    // HOW ???
    // return the value as *I
    i := pt.Elem().Interface().(I)
    return &i, nil
}
This partially works: the returned object is typed correctly. But what I can't figure out is how to populate the value pt with the data retrieved from MongoDB, which is stored in wrapper.Val as a bson.M.
I have tried the following but it doesn't work:
m := wrapper.Val.(bson.M)
bsonBytes, _ := bson.Marshal(m)
bson.Unmarshal(bsonBytes, pt)
So basically the remaining problem is: how to populate an unknown structure from a bson.M value? I'm sure there has to be an easy solution...
Thanks in advance for any help.
Here's a Github gist with all the code: https://gist.github.com/ogerardin/5aa272f69563475ba9d7b3194b12ae57

First, you should always check returned errors, always. bson.Marshal() and bson.Unmarshal() return errors which you don't check. Doing so reveals why it doesn't work:
unmarshal can't deal with struct values. Use a pointer
pt is of type reflect.Value (which happens to be a struct), not something you should pass to bson.Unmarshal(). You should pass e.g. a pointer to a struct value you want to unmarshal into (which will be wrapped in an interface{} value). So call Value.Interface() on the value returned by reflect.New():
pt := reflect.New(t).Interface()
You can pass this to bson.Unmarshal():
bsonBytes, err := bson.Marshal(m)
if err != nil {
    panic(err)
}
if err = bson.Unmarshal(bsonBytes, pt); err != nil {
    panic(err)
}
(In your real code you'll want to do something other than panic; this is just to show you should always check errors!)
Also note that it is possible to directly convert maps to structs (directly meaning without marshaling and unmarshaling). You may implement it by hand or use a ready-made 3rd party lib. For details, see Converting map to struct
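For illustration, a minimal sketch using the github.com/mitchellh/mapstructure package (one such third-party library; it is also the one the question's author ends up using below) might look like this:
import "github.com/mitchellh/mapstructure"

// toStruct copies matching keys from a generic map (such as a bson.M)
// into a struct; field names are matched case-insensitively by default.
func toStruct(m bson.M) (S1, error) {
    var s1 S1
    err := mapstructure.Decode(m, &s1)
    return s1, err
}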
Also note that there are more clever ways to solve what you want to do. You could store the type in the ID itself, so if you have the ID, you can construct a value of the right type to unmarshal the query result into, and skip this whole wrapper process. It would be a lot simpler and a lot more efficient.
For example you could use the following ID structure:
<type>-<id>
For example:
my.package.S1-123
When fetching / loading this document, you could use reflection to create a value of my.package.S1, and unmarshal into that directly (pass that to Query.One()).
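A rough sketch of that idea, reusing the type registry from the question (the ID format and the split on the last '-' are illustrative assumptions):
func getByTypedId(coll *mgo.Collection, typedId string) (I, error) {
    // "<type>-<id>": everything before the last '-' is the type key (assumed format)
    typeKey := typedId[:strings.LastIndex(typedId, "-")]

    // construct a pointer to a new value of the registered type
    t := typeregistry.GetType(typeKey)
    pv := reflect.New(t).Interface()

    // unmarshal the matching document directly into that value
    if err := coll.FindId(typedId).One(pv); err != nil {
        return nil, err
    }
    return pv.(I), nil
}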

As per @icza's comments, here's a modified version of getById() that actually works:
func getById(coll *mgo.Collection, id interface{}) (*I, error) {
    // read wrapper
    wrapper := Wrapper{}
    err := coll.Find(bson.M{"_id": id}).One(&wrapper)
    if err != nil {
        return nil, err
    }
    // obtain Type from registry
    t := typeregistry.GetType(wrapper.TypeKey)
    // get a pointer to a new value of this type
    pt := reflect.New(t)
    // populate value using wrapper.Val
    err = mapstructure.Decode(wrapper.Val, pt.Interface())
    if err != nil {
        return nil, err
    }
    // return the value as *I
    i := pt.Elem().Interface().(I)
    return &i, nil
}
Conversion from bson.M to the struct is handled by https://github.com/mitchellh/mapstructure instead of marshalling and unmarshalling.

Related

Unable to INSERT/UPDATE data with custom type in postgresql using golang

I am trying to insert/update data in PostgreSQL using jackc/pgx into a table that has a column of a custom type. This is the table type written as a Go struct:
// Added this struct as a type in PSQL
type DayPriceModel struct {
    Date  time.Time `json:"date"`
    High  float32   `json:"high"`
    Low   float32   `json:"low"`
    Open  float32   `json:"open"`
    Close float32   `json:"close"`
}

// The 2 columns in my table
type SecuritiesPriceHistoryModel struct {
    Symbol  string          `json:"symbol"`
    History []DayPriceModel `json:"history"`
}
I have written this code for inserting data:
func insertToDB(data SecuritiesPriceHistoryModel) {
    DBConnection := config.DBConnection
    _, err := DBConnection.Exec(context.Background(),
        "INSERT INTO equity.securities_price_history (symbol) VALUES ($1)",
        data.Symbol, data.History)
}
But I am unable to insert the custom data type (DayPriceModel).
I am getting an error
Failed to encode args[1]: unable to encode
The error is very long and mostly shows my data so I have picked out the main part.
How do I INSERT data into PSQL with such custom data types?
PS: An implementation using jackc/pgx is preferred, but database/sql would do just fine.
I'm not familiar enough with pgx to know how to set up support for arrays of composite types. But, as already mentioned in the comment, you can implement the driver.Valuer interface and have that implementation produce a valid literal. This also applies if you are storing slices of structs: you just need to declare a named slice type, have it implement the valuer, and then use it instead of the unnamed slice.
// named slice type
type DayPriceModelList []DayPriceModel

// the syntax for array of composites literal looks like
// this: '{"(foo,123)", "(bar,987)"}'. So the implementation
// below must return the slice contents in that format.
func (l DayPriceModelList) Value() (driver.Value, error) {
    // nil slice? produce NULL
    if l == nil {
        return nil, nil
    }
    // empty slice? produce empty array
    if len(l) == 0 {
        return []byte{'{', '}'}, nil
    }

    out := []byte{'{'}
    for _, v := range l {
        // This assumes that the date field in the pg composite
        // type accepts the default time.Time format. If that is
        // not the case then you simply provide v.Date in such a
        // format which the composite's field understands, e.g.,
        // v.Date.Format("<layout that the pg composite understands>")
        x := fmt.Sprintf(`"(%s,%f,%f,%f,%f)",`,
            v.Date,
            v.High,
            v.Low,
            v.Open,
            v.Close)
        out = append(out, x...)
    }
    out[len(out)-1] = '}' // replace last "," with "}"
    return out, nil
}
And when you are writing the insert query, make sure to add an explicit cast right after the placeholder, e.g.
type SecuritiesPriceHistoryModel struct {
    Symbol  string            `json:"symbol"`
    History DayPriceModelList `json:"history"` // use the named slice type
}

// ...

_, err := db.Exec(ctx, `INSERT INTO equity.securities_price_history (
    symbol
    , history
) VALUES (
    $1
    , $2::my_composite_type[])
`, data.Symbol, data.History)
// replace my_composite_type with the name of the composite type in the database
NOTE#1: Depending on the exact definition of your composite type in postgres the above example may or may not work, if it doesn't, simply adjust the code to make it work.
NOTE#2: The general approach in the example above is valid, however it is likely not very efficient. If you need the code to be performant do not use the example verbatim.

Marshal and unmarshal BSON

TL;DR: Does the MongoDB driver provide a function to marshal and unmarshal a single field of a document?
This is a pretty straightforward question, but here's some context:
I have a worker responsible for synchronizing data between 2 separate databases. When it receives an event message signaling that some document must be synced, it selects the document in the primary database and replicates it in the other one (it's a whole different database, not a replica set).
The thing is: I don't know the full structure of that document, so to preserve the data I must unmarshal this document into a map[string]interface{}, or a bson.M, which works in the same fashion. But this seems like a lot of overhead: unmarshalling all this data I'm not even using, only to marshal it back for the other database.
So I thought about creating a structure that would just store the binary value of that document, without any marshaling or unmarshaling, in order to reduce the overhead, like this:
type Document = map[string]Field

type Field struct {
    Type  bsontype.Type
    Value []byte
}

func (f Field) MarshalBSONValue() (bsontype.Type, []byte, error) {
    return f.Type, f.Value, nil
}

func (f *Field) UnmarshalBSONValue(btype bsontype.Type, value []byte) error {
    f.Type = btype
    f.Value = value
    return nil
}
With this structure I can indeed reduce how much of the data is parsed, but now I need to manually unmarshal the one value in the document that I actually need to use.
So I'm wondering if the MongoDB driver would have some function such as:
// Hypothetical function to get the value of a BSON
var status string
if err := decodeBSON(doc["status"].Type, doc["status"].Value, &status); err != nil {
    return err
}
And
// Hypothetical function to set the value of a BSON
createdAt, err := encodeBSON(bsontype.Date, time.Now())
if err != nil {
    return err
}
doc["createdAt"] = Field{Type: bsontype.Date, Value: createdAt}
How can I achieve this?
The Field type in your code is equivalent to the driver's bson.RawValue type. By switching to RawValue, you can decode individual fields using the RawValue.Unmarshal method and encode fields using bson.MarshalValue, which returns the two components (type and data) that you need to construct a new RawValue.
An example of how you can change a field depending on its original value without unmarshalling all of the original document's fields: https://gist.github.com/divjotarora/06c5188138456070cee26024f223b3ee
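For reference, a minimal sketch of those two calls (assuming the document is read as a bson.Raw and written as a map[string]bson.RawValue; the "status" and "createdAt" fields are just examples):
// decode a single field of a raw document into a Go value
var status string
if err := rawDoc.Lookup("status").Unmarshal(&status); err != nil { // rawDoc is a bson.Raw
    return err
}

// encode a Go value into the (type, data) pair needed to build a RawValue
t, data, err := bson.MarshalValue(time.Now())
if err != nil {
    return err
}
doc["createdAt"] = bson.RawValue{Type: t, Value: data} // doc is a map[string]bson.RawValue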

What would be the best approach to converting protoc generated structs from bson structs?

I'm writing a RESTful API in Golang, which also has a gRPC API. The API connects to a MongoDB database and uses structs to map out entities. I also have a .proto definition which matches the struct I'm using for MongoDB like for like.
I just wondered if there was a way to share or re-use the .proto-defined code for the MongoDB calls as well. I've noticed the structs protoc generates have json tags for each field, but obviously there aren't bson tags etc.
I have something like...
// Menu -
type Menu struct {
    ID          bson.ObjectId      `json:"id" bson:"_id"`
    Name        string             `json:"name" bson:"name"`
    Description string             `json:"description" bson:"description"`
    Mixers      []mixers.Mixer     `json:"mixers" bson:"mixers"`
    Sections    []sections.Section `json:"sections" bson:"sections"`
}
But then I also have protoc generated code...
type Menu struct {
    Id          string     `protobuf:"bytes,1,opt,name=id" json:"id,omitempty"`
    Name        string     `protobuf:"bytes,2,opt,name=name" json:"name,omitempty"`
    Description string     `protobuf:"bytes,3,opt,name=description" json:"description,omitempty"`
    Mixers      []*Mixer   `protobuf:"bytes,4,rep,name=mixers" json:"mixers,omitempty"`
    Sections    []*Section `protobuf:"bytes,5,rep,name=sections" json:"sections,omitempty"`
}
Currently I'm having to convert between the two structs depending on what I'm doing, which is tedious and probably incurs quite a considerable performance hit. So is there a better way of converting between the two, or of re-using one of them for both tasks?
Having lived with this same issue, there are a couple of methods of solving it. They fall into two general approaches:
Use the same data type
Use two different struct types and map between them
If you want to use the same data type, you'll have to modify the code generation
You can use something like gogoprotobuf which has an extension to add tags. This should give you bson tags in your structs.
You could also post-process your generated files, either with regular expressions or something more complicated involving the go abstract syntax tree.
If you choose to map between them:
Use reflection. You can write a package that will take two structs and try to take the values from one and apply it to another. You'll have to deal with edge cases (slight naming differences, which types are equivalent, etc), but you'll have better control over edge cases if they ever come up.
Use JSON as an intermediary (see the sketch after this list). As long as the generated json tags match, this will be a quick coding exercise, and the performance hit of serializing and deserializing might be acceptable if this isn't in a tight loop in your code.
Hand-write or codegen mapping functions. Depending on how many structs you have, you could write out a bunch of functions that translate between the two.
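A minimal sketch of the JSON-intermediary approach from the second item (pb.Menu and db.Menu stand for the protoc-generated and the bson-tagged structs; this assumes the json tags on the two types line up):
// pbToDB converts a protoc-generated Menu into the Mongo-facing Menu
// by round-tripping through JSON, relying on matching json tags.
func pbToDB(in *pb.Menu) (*db.Menu, error) {
    buf, err := json.Marshal(in)
    if err != nil {
        return nil, err
    }
    var out db.Menu
    if err := json.Unmarshal(buf, &out); err != nil {
        return nil, err
    }
    return &out, nil
}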
At my workplace, we ended up doing a bit of all of them: forking the protoc generator to do some custom tags, a reflection based structs overlay package for mapping between arbitrary structs, and some hand-written ones in more performance sensitive or less automatable mappings.
I have played with it and have a working example with:
github.com/gogo/protobuf v1.3.1
go.mongodb.org/mongo-driver v1.4.0
google.golang.org/grpc v1.31.0
First of all I would like to share my proto/contract/example.proto file:
syntax = "proto2";

package protobson;

import "gogoproto/gogo.proto";

option (gogoproto.sizer_all) = true;
option (gogoproto.marshaler_all) = true;
option (gogoproto.unmarshaler_all) = true;
option go_package = "gitlab.com/8bitlife/proto/go/protobson";

service Service {
    rpc SayHi(Hi) returns (Hi) {}
}

message Hi {
    required bytes id = 1 [(gogoproto.customtype) = "gitlab.com/8bitlife/protobson/custom.BSONObjectID", (gogoproto.nullable) = false, (gogoproto.moretags) = "bson:\"_id\""];
    required int64 limit = 2 [(gogoproto.nullable) = false, (gogoproto.moretags) = "bson:\"limit\""];
}
It contains a simple gRPC service Service that has a SayHi method with request type Hi. It includes a set of options: gogoproto.sizer_all, gogoproto.marshaler_all, gogoproto.unmarshaler_all. You can find their meaning on the extensions page. The Hi message itself contains two fields:
id that has additional options specified: gogoproto.customtype and gogoproto.moretags
limit with only gogoproto.moretags option
The BSONObjectID used in gogoproto.customtype for the id field is a custom type that I defined in custom/objectid.go:
package custom

import (
    "go.mongodb.org/mongo-driver/bson/bsontype"
    "go.mongodb.org/mongo-driver/bson/primitive"
)

type BSONObjectID primitive.ObjectID

func (u BSONObjectID) Marshal() ([]byte, error) {
    return u[:], nil
}

func (u BSONObjectID) MarshalTo(data []byte) (int, error) {
    return copy(data, u[:]), nil
}

func (u *BSONObjectID) Unmarshal(d []byte) error {
    copy((*u)[:], d)
    return nil
}

func (u *BSONObjectID) Size() int {
    return len(*u)
}

func (u *BSONObjectID) UnmarshalBSONValue(t bsontype.Type, d []byte) error {
    copy(u[:], d)
    return nil
}

func (u BSONObjectID) MarshalBSONValue() (bsontype.Type, []byte, error) {
    return bsontype.ObjectID, u[:], nil
}
It is needed because we need to define custom marshaling and unmarshaling methods for both protocol buffers and the MongoDB driver. This allows us to use this type as an object identifier in MongoDB. To "explain" it to the MongoDB driver, I marked it with a bson tag by using the (gogoproto.moretags) = "bson:\"_id\"" option in the proto file.
To generate source code from the proto file I used:
protoc \
--plugin=/Users/pstrokov/go/bin/protoc-gen-gogo \
--plugin=/Users/pstrokov/go/bin/protoc-gen-go \
-I=/Users/pstrokov/Workspace/protobson/proto/contract \
-I=/Users/pstrokov/go/pkg/mod/github.com/gogo/protobuf@v1.3.1 \
--gogo_out=plugins=grpc:. \
example.proto
I have tested it on my macOS with a running MongoDB instance (docker run --name mongo -d -p 27017:27017 mongo):
package main

import (
    "context"
    "log"
    "net"
    "time"

    "gitlab.com/8bitlife/protobson/gitlab.com/8bitlife/proto/go/protobson"
    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
    "google.golang.org/grpc"
)

type hiServer struct {
    mgoClient *mongo.Client
}

func (s *hiServer) SayHi(ctx context.Context, hi *protobson.Hi) (*protobson.Hi, error) {
    collection := s.mgoClient.Database("local").Collection("bonjourno")
    res, err := collection.InsertOne(ctx, bson.M{"limit": hi.Limit})
    if err != nil {
        panic(err)
    }
    log.Println("generated _id", res.InsertedID)

    out := &protobson.Hi{}
    if err := collection.FindOne(ctx, bson.M{"_id": res.InsertedID}).Decode(out); err != nil {
        return nil, err
    }
    log.Println("found", out.String())
    return out, nil
}

func main() {
    ctx, cancel := context.WithCancel(context.Background())
    defer cancel()

    lis, err := net.Listen("tcp", "localhost:0")
    if err != nil {
        log.Fatalf("failed to listen: %v", err)
    }

    clientOptions := options.Client().ApplyURI("mongodb://localhost:27017")
    clientOptions.SetServerSelectionTimeout(time.Second)
    client, err := mongo.Connect(ctx, clientOptions)
    if err != nil {
        log.Fatal(err)
    }
    if err := client.Ping(ctx, nil); err != nil {
        log.Fatal(err)
    }

    grpcServer := grpc.NewServer()
    protobson.RegisterServiceServer(grpcServer, &hiServer{mgoClient: client})
    go grpcServer.Serve(lis)
    defer grpcServer.Stop()

    conn, err := grpc.Dial(lis.Addr().String(), grpc.WithInsecure())
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    hiClient := protobson.NewServiceClient(conn)
    response, err := hiClient.SayHi(ctx, &protobson.Hi{Limit: 99})
    if err != nil {
        log.Fatal(err)
    }
    if response.Limit != 99 {
        log.Fatal("unexpected limit", response.Limit)
    }
    if response.Id.Size() == 0 {
        log.Fatal("expected a valid ID of the new entity")
    }
    log.Println(response.String())
}
Sorry for the formatting of the last code snippet :)
I hope this can help.
I'm in the process of testing and may provide code shortly (ping me if you don't see it and you want it), but https://godoc.org/go.mongodb.org/mongo-driver/bson/bsoncodec looks like the ticket. protoc will make your structs and you don't have to mess with customizing them. Then you can customize the mongo-driver to do the mapping of certain types for you, and it looks like their library for this is pretty good.
This is great because if I use the protoc-generated structs, I'd like to use them in my application core / domain layer. I don't want to be concerned about MongoDB compatibility over there.
So right now, it seems to me that @Liyan Chang's answer saying
If you want to use the same data type, you'll have to modify the code generation
doesn't necessarily have to be the case, because you can opt to use one data type.
You can use one generated type and account for seemingly whatever you need to in terms of getting and setting data to the DB with this codec system.
See https://stackoverflow.com/a/59561699/8546258 - the bson struct tags are not an end-all be-all; it looks like codecs can totally help with this.
See https://stackoverflow.com/a/58985629/8546258 for a nice write-up about codecs in general.
Please keep in mind these codecs were released in version 1.3 of the MongoDB Go driver. I found this post which directed me there: https://developer.mongodb.com/community/forums/t/mgo-setbson-to-mongo-golang-driver/2340/2?u=yehuda_makarov
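As a rough illustration of wiring a custom codec registry into the driver (a sketch only; registering encoders/decoders for your generated types is left as a stub):
import (
    "context"

    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)

func connectWithCustomCodecs(ctx context.Context) (*mongo.Client, error) {
    rb := bson.NewRegistryBuilder()
    // rb.RegisterTypeEncoder(...) / rb.RegisterTypeDecoder(...) would go here,
    // mapping your protoc-generated types to the BSON representation you want.
    reg := rb.Build()

    opts := options.Client().
        ApplyURI("mongodb://localhost:27017").
        SetRegistry(reg) // make the driver use the custom registry
    return mongo.Connect(ctx, opts)
}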

Go and MongoDB: generic DAO implementation issue

In the current project we use Go and MongoDB via mgo driver.
For every entity we have to implement DAO for CRUD operations, and it's basically copy-paste, e.g.
func (thisDao ClusterDao) FindAll() ([]*entity.User, error) {
    session, collection := thisDao.getCollection()
    defer session.Close()

    result := []*entity.User{} // create a new empty slice to return
    q := bson.M{}
    err := collection.Find(q).All(&result)
    return result, err
}
For every other entity it's all the same but the result type.
Since Go has no generics, how could we avoid code duplication?
I've tried to pass the result interface{} param instead of creating it in the method, and call the method like this:
dao.FindAll([]*entity.User{})
but the collection.Find().All() method needs a slice address as input, not just an interface:
[restful] recover from panic situation: - result argument must be a slice address
/usr/local/go/src/runtime/asm_amd64.s:514
/usr/local/go/src/runtime/panic.go:489
/home/dds/gopath/src/gopkg.in/mgo.v2/session.go:3791
/home/dds/gopath/src/gopkg.in/mgo.v2/session.go:3818
Then I tried to make this param result []interface{}, but in that case it's impossible to pass []*entity.User{}:
cannot use []*entity.User literal (type []*entity.User) as type []interface {} in argument to thisDao.GenericDao.FindAll
Any idea how could I implement generic DAO in Go?
You should be able to pass a result interface{} to your FindAll function and just pass it along to mgo's Query.All method, since the arguments would have the same type.
func (thisDao ClusterDao) FindAll(result interface{}) error {
    session, collection := thisDao.getCollection()
    defer session.Close()

    q := bson.M{}
    // just pass result as is, don't do & here,
    // because that would be a pointer to the interface, not
    // to the underlying slice, which mgo probably doesn't like
    return collection.Find(q).All(result)
}

// ...

users := []*entity.User{}
if err := dao.FindAll(&users); err != nil { // pass pointer to slice here
    panic(err)
}
log.Println(users)

golang how to access interface fields

I have a function as below which decodes some json data and returns it as an interface
package search
package search

func SearchItemsByUser(r *http.Request) interface{} {
    type results struct {
        Hits             hits
        NbHits           int
        NbPages          int
        HitsPerPage      int
        ProcessingTimeMS int
        Query            string
        Params           string
    }
    var Result results
    er := json.Unmarshal(body, &Result)
    if er != nil {
        fmt.Println("error:", er)
    }
    return Result
}
I'm trying to access the data fields (e.g. Params), but for some reason it says that the interface has no such field. Any idea why?
func test(w http.ResponseWriter, r *http.Request) {
    result := search.SearchItemsByUser(r)
    fmt.Fprintf(w, "%s", result.Params)
}
An interface variable can be used to store any value that conforms to the interface, and to call methods that are part of that interface. Note that you won't be able to access fields on the underlying value through an interface variable.
In this case, your SearchItemsByUser method returns an interface{} value (i.e. the empty interface), which can hold any value but doesn't provide any direct access to that value. You can extract the dynamic value held by the interface variable through a type assertion, like so:
dynamic_value := interface_variable.(typename)
Except that in this case, the type of the dynamic value is private to your SearchItemsByUser method. I would suggest making two changes to your code:
Define your results type at the top level, rather than within the method body.
Make SearchItemsByUser directly return a value of the results type instead of interface{}.
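For illustration, a minimal sketch of those two changes (the hits type and the body variable are assumed to exist as in the question; the type is exported here so other packages can also name it):
package search

// Results is now declared at the top level of the package.
type Results struct {
    Hits             hits
    NbHits           int
    NbPages          int
    HitsPerPage      int
    ProcessingTimeMS int
    Query            string
    Params           string
}

// SearchItemsByUser now returns the concrete type instead of interface{}.
func SearchItemsByUser(r *http.Request) Results {
    var result Results
    if er := json.Unmarshal(body, &result); er != nil {
        fmt.Println("error:", er)
    }
    return result
}
The handler can then access the fields directly, without any type assertion:
result := search.SearchItemsByUser(r)
fmt.Fprintf(w, "%s", result.Params)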