Go encoding transform issue - encoding

I have a following code in go:
import (
"log"
"net/http"
"code.google.com/p/go.text/transform"
"code.google.com/p/go.text/encoding/charmap"
)
...
res, err := http.Get(url)
if err != nil {
log.Println("Cannot read", url);
log.Println(err);
continue
}
defer res.Body.Close()
The page I load contain non UTF-8 symbols. So I try to use transform
utfBody := transform.NewReader(res.Body, charmap.Windows1251.NewDecoder())
But the problem is, that it returns error even in this simple scenarion:
bytes, err := ioutil.ReadAll(utfBody)
log.Println(err)
if err == nil {
log.Println(bytes)
}
transform: short destination buffer
It also actually sets bytes with some data, but in my real code I use goquery:
doc, err := goquery.NewDocumentFromReader(utfBody)
Which sees an error and fails with not data in return
I tried to pass "chunks" of res.Body to transform.NewReader and figuried out, that as long as res.Body contains no non-UTF8 data it works well. And when it contains non-UTF8 byte it fails with an error above.
I'm quite new to go and don't really understand what's going on and how to deal with this

Without the whole code along with an example URL it's hard to tell what exactly is going wrong here.
That said, I can recommend the golang.org/x/net/html/charset package for this as it supports both char guessing and converting to UTF 8.
func fetchUtf8Bytes(url string) ([]byte, error) {
res, err := http.Get(url)
if err != nil {
return nil, err
}
contentType := res.Header.Get("Content-Type") // Optional, better guessing
utf8reader, err := charset.NewReader(res.Body, contentType)
if err != nil {
return nil, err
}
return ioutil.ReadAll(utf8reader)
}
Complete example: http://play.golang.org/p/olcBM9ughv

Related

What is the difference between the following two codes when using mongo driver in Golang?

Currently I'm working on a project, and I'm looking for a better way to source code.
I wonder What is Different,
func (db *Database) FindData(ctx context.Context, filter *Data) (*Data, error) {
col := db.client.Database(DefaultDatabase).Collection(COLLECTION_DATA)
var data Data
err := col.FindOne(ctx, filter).Decode(&data)
if err != nil {
return nil, err
}
return &data, nil
}
and
func (db *Database) FindData(ctx context.Context, filter *Data) (*Data, error) {
col := db.client.Database(DefaultDatabase).Collection(COLLECTION_DATA)
res := col.FindOne(ctx, filter)
if err:= res.Err(); err != nil {
return nil, err
}
var data Data
err := res.Decode(&reason)
return &data, err
}
What are the possible differences, and which code is better?
I verified that both sources were the same, and realized that I didn't have to use the long version.
There is no need to implement ERROR processing because ERROR processing of the corresponding result is carried out in Decode().
Thanks for Comment :) Burak Serdar

How to serve a file from mongodb with golang

I am working on a go project where I need to serve files stored in mongodb. The files are stored in a GridFs. I use gopkg.in/mgo.v2 as package to connect and query the db.
I can retrieve the file from the db, that is not hard.
f, err := s.files.OpenId(id)
But how can I serve that file with http?
I work with the JulienSchmidt router to handle all the other restfull requests.
The solutions I find always use static files, not files from a db.
Thanks in advance
Tip: Recommended to use github.com/globalsign/mgo instead of gopkg.in/mgo.v2 (the latter is not maintained anymore).
The mgo.GridFile type implements io.Reader, so you could use io.Copy() to copy its content into the http.ResponseWriter.
But since mgo.GridFile also implements io.Seeker, you may take advantage of http.ServeContent(). Quoting its doc:
The main benefit of ServeContent over io.Copy is that it handles Range requests properly, sets the MIME type, and handles If-Match, If-Unmodified-Since, If-None-Match, If-Modified-Since, and If-Range requests.
Example handler serving a file:
func serveFromDB(w http.ResponseWriter, r *http.Request) {
var gridfs *mgo.GridFS // Obtain GridFS via Database.GridFS(prefix)
name := "somefile.pdf"
f, err := gridfs.Open(name)
if err != nil {
log.Printf("Failed to open %s: %v", name, err)
http.Error(w, "something went wrong", http.StatusInternalServerError)
return
}
defer f.Close()
http.ServeContent(w, r, name, time.Now(), f) // Use proper last mod time
}
its old but i got another solution with goMongo driver by importing
"go.mongodb.org/mongo-driver/mongo/gridfs"
var bucket *gridfs.Bucket //creates a bucket
dbConnection, err := db.GetDBCollection() //connect db with your your
if err != nil {
log.Fatal(err)
}
bucket, err = gridfs.NewBucket(dbConnection)
if err != nil {
log.Fatal(err)
}
name := "br100_update.txt"
downloadStream, err := bucket.OpenDownloadStreamByName(name)
if err != nil {
log.Printf("Failed to open %s: %v", name, err)
http.Error(w, "something went wrong", http.StatusInternalServerError)
return
}
defer func() {
if err := downloadStream.Close(); err != nil {
log.Fatal(err)
}
}()
// Use SetReadDeadline to force a timeout if the download does not succeed in
// 2 seconds.
if err = downloadStream.SetReadDeadline(time.Now().Add(2 * time.Second)); err
!= nil {
log.Fatal(err)
}
// this code below use to read the file
fileBuffer := bytes.NewBuffer(nil)
if _, err := io.Copy(fileBuffer, downloadStream); err != nil {
log.Fatal(err)
}

How to encode/decode a empty string

I am using the GOB encoding for my project and i figured out (after a long fight) that empty strings are not encoded/decoded correctly. In my code i use a errormessage (string) to report any problems, this errormessage is most of the time empty. If i encode a empty string, it become nothing, and this gives me a problem with decoding. I don't want to alter the encoding/decoding because these parts are used the most.
How can i tell Go how to encode/decode empty strings?
Example:
Playground working code.
Playground not working code.
The problem isn't the encoding/gob module, but instead the custom MarshalBinary/UnmarshalBinary methods you've declared for Msg, which can't correctly round trip an empty string. There are two ways you could go here:
Get rid of the MarshalBinary/UnmarshalBinary methods and rely on GOB's default encoding for structures. This change alone wont' be enough because the fields of the structure aren't exported. If you're happy to export the fields then this is the simplest option: https://play.golang.org/p/rwzxTtaIh2
Use an encoding that can correctly round trip empty strings. One simple option would be to use GOB itself to encode the struct fields:
func (m Msg) MarshalBinary() ([]byte, error) {
var b bytes.Buffer
enc := gob.NewEncoder(&b)
if err := enc.Encode(m.x); err != nil {
return nil, err
}
if err := enc.Encode(m.y); err != nil {
return nil, err
}
if err := enc.Encode(m.z); err != nil {
return nil, err
}
return b.Bytes(), nil
}
// UnmarshalBinary modifies the receiver so it must take a pointer receiver.
func (m *Msg) UnmarshalBinary(data []byte) error {
dec := gob.NewDecoder(bytes.NewBuffer(data))
if err := dec.Decode(&m.x); err != nil {
return err
}
if err := dec.Decode(&m.y); err != nil {
return err
}
return dec.Decode(&m.z)
}
You can experiment with this example here: https://play.golang.org/p/oNXgt88FtK
The first option is obviously easier, but the second might be useful if your real example is a little more complex. Be careful with custom encoders though: GOB includes a few features that are intended to detect incompatibilities (e.g. if you add a field to a struct and try to decode old data), which are missing from this custom encoding.

How to read from request then use that result to do POST request then process its results

I'm trying to read from request then use that result to do POST request to another endpoint then process its results then return its results in JSON.
I have below code so far:
// POST
func (u *UserResource) authenticate(request *restful.Request, response *restful.Response) {
Api := Api{url: "http://api.com/api"}
usr := new(User)
err := request.ReadEntity(&usr)
if err != nil {
response.WriteErrorString(http.StatusInternalServerError, err.Error())
return
}
api_resp, err := http.Post(Api.url, "text/plain", bytes.NewBuffer(usr))
if err != nil {
response.WriteErrorString(http.StatusInternalServerError, err.Error())
return
}
defer api_resp.Body.Close()
body, err := ioutil.ReadAll(api_resp.Body)
response.WriteHeader(http.StatusCreated)
err = xml.Unmarshal(body, usr)
if err != nil {
fmt.Printf("error: %v", err)
return
}
// result, err := json.Marshal(usr)
// response.Write(result)
response.WriteEntity(&usr)
fmt.Printf("Name: %q\n", usr.UserName)
}
I'm using Go Restful package for Writes and Reads.
I'm getting this error when I compile the file:
src\login.go:59: cannot use usr (type *User) as type []byte in argument to bytes.NewBuffer
What would be the best way to solve this issue so I can do a POST with payload correctly?
You need to marshal your data structure to slice of bytes. Something like this:
usrXmlBytes, err := xml.Marshal(usr)
if err != nil {
response.WriteErrorString(http.StatusInternalServerError, err.Error())
return
}
api_resp, err := http.Post(Api.url, "text/plain", bytes.NewReader(usrXmlBytes))
http.Post takes an io.Reader as the third argument. You could implement io.Reader on your User type or more simply serialize your data and use the bytes pkg to to implement io.Reader
b, err := json.Marshal(usr)
if err != nil {
response.WriteErrorString(http.StatusInternalServerError, err.Error())
return
}
api_resp, err := http.Post(Api.url, "text/plain", bytes.NewReader(b))

Store Uploaded File in MongoDB GridFS Using mgo without Saving to Memory

noob Golang and Sinatra person here. I have hacked a Sinatra app to accept an uploaded file posted from an HTML form and save it to a hosted MongoDB database via GridFS. This seems to work fine. I am writing the same app in Golang using the mgo driver.
Functionally it works fine. However in my Golang code, I read the file into memory and then write the file from memory to the MongoDB using mgo. This appears much slower than my equivalent Sinatra app. I get the sense that the interaction between Rack and Sinatra does not execute this "middle" or "interim" step.
Here's a snippet of my Go code:
func uploadfilePageHandler(w http.ResponseWriter, req *http.Request) {
// Capture multipart form file information
file, handler, err := req.FormFile("filename")
if err != nil {
fmt.Println(err)
}
// Read the file into memory
data, err := ioutil.ReadAll(file)
// ... check err value for nil
// Specify the Mongodb database
my_db := mongo_session.DB("... database name...")
// Create the file in the Mongodb Gridfs instance
my_file, err := my_db.GridFS("fs").Create(unique_filename)
// ... check err value for nil
// Write the file to the Mongodb Gridfs instance
n, err := my_file.Write(data)
// ... check err value for nil
// Close the file
err = my_file.Close()
// ... check err value for nil
// Write a log type message
fmt.Printf("%d bytes written to the Mongodb instance\n", n)
// ... other statements redirecting to rest of user flow...
}
Question:
Is this "interim" step needed (data, err := ioutil.ReadAll(file))?
If so, can I execute this step more efficiently?
Are there other accepted practices or approaches I should be considering?
Thanks...
No, you should not read the file entirely in memory at once, as that will break when the file is too large. The second example in the documentation for GridFS.Create avoids this problem:
file, err := db.GridFS("fs").Create("myfile.txt")
check(err)
messages, err := os.Open("/var/log/messages")
check(err)
defer messages.Close()
err = io.Copy(file, messages)
check(err)
err = file.Close()
check(err)
As for why it's slower than something else, hard to tell without diving into the details of the two approaches used.
Once you have the file from multipartForm, it can be saved into GridFs using below function. I tested this against huge files as well ( upto 570MB).
//....code inside the handlerfunc
for _, fileHeaders := range r.MultipartForm.File {
for _, fileHeader := range fileHeaders {
file, _ := fileHeader.Open()
if gridFile, err := db.GridFS("fs").Create(fileHeader.Filename); err != nil {
//errorResponse(w, err, http.StatusInternalServerError)
return
} else {
gridFile.SetMeta(fileMetadata)
gridFile.SetName(fileHeader.Filename)
if err := writeToGridFile(file, gridFile); err != nil {
//errorResponse(w, err, http.StatusInternalServerError)
return
}
func writeToGridFile(file multipart.File, gridFile *mgo.GridFile) error {
reader := bufio.NewReader(file)
defer func() { file.Close() }()
// make a buffer to keep chunks that are read
buf := make([]byte, 1024)
for {
// read a chunk
n, err := reader.Read(buf)
if err != nil && err != io.EOF {
return errors.New("Could not read the input file")
}
if n == 0 {
break
}
// write a chunk
if _, err := gridFile.Write(buf[:n]); err != nil {
return errors.New("Could not write to GridFs for "+ gridFile.Name())
}
}
gridFile.Close()
return nil
}