I am working on a REST endpoint that should read the request body as a stream and consume it. I tried to get the request body (Content-Type text/csv or application/octet-stream) and read from it using a buffer.
reader := r.Body.(io.Reader)
writer := bufio.NewWriter(outputFile) // we write to
for {
buffer := make([]byte, 4000)
numBytes, err := reader.Read(buffer)
if err == io.EOF {
break
} else if err != nil {
return
}
if numBytes > 0 {
writer.Write(buffer[0:numBytes])
} else {
break
}
}
writer.Flush()
Above is my Go code. I got nothing from request.Body. However, if I use multipart/form-data I can get the data from the parts. Does HTTP always require form-data for uploads?
I'm looking for the cleanest way in Golang to transfer a message through (i.e. act as an SMTP proxy) while performing some manipulation on the message body html (e.g. adding an open tracking pixel - not yet coded).
The net/mail package includes a function, ReadMessage, that parses mail headers into a map and gives you an io.Reader for the body. This is necessary to determine the MIME parts of the body for processing, rather than just io.Copying them through. (The simple stub version of this function, shown in the block comment, does just that.)
The following function copies an incoming mail "src" to an outgoing mail stream "dst". (The calling code sets these up as a DotReader and DotWriter, which take care of most of the "dot" processing needed for RFC 5321.)
// Processing of email body via IO stream functions
package main
import (
"bufio"
"io"
"log"
"net/mail"
"strings"
)
/* If you just want to pass through the entire mail headers and body, you can just use
the following alternative:
func MailCopy(dst io.Writer, src io.Reader) (int64, error) {
return io.Copy(dst, src)
}
*/
// MailCopy transfers the mail body from downstream (client) to upstream (server)
// The writer will be closed by the parent function, no need to close it here.
func MailCopy(dst io.Writer, src io.Reader) (int64, error) {
var totalWritten int64
const smtpCRLF = "\r\n"
message, err := mail.ReadMessage(bufio.NewReader(src))
if err != nil {
return totalWritten, err
}
// Pass through headers. The message.Header map does not preserve order, but that should not matter.
for hdrType, hdrList := range message.Header {
for _, hdrVal := range hdrList {
hdrLine := hdrType + ": " + hdrVal + smtpCRLF
log.Print("\t", hdrLine)
bytesWritten, err := dst.Write([]byte(hdrLine))
totalWritten += int64(bytesWritten)
if err != nil {
return totalWritten, err
}
}
}
// Blank line denotes end of headers
bytesWritten, err := io.Copy(dst, strings.NewReader(smtpCRLF))
totalWritten += int64(bytesWritten)
if err != nil {
return totalWritten, err
}
// Copy the body
bytesWritten, err = io.Copy(dst, message.Body)
totalWritten += int64(bytesWritten)
if err != nil {
return totalWritten, err
}
return totalWritten, err
}
It does seem necessary to build this, because there is no net/mail.WriteMessage() method.
The header order is always randomised by Go's map iteration. This seems harmless in my tests.
A forced CRLF needs to be put between the end of the headers and the body, as per the RFCs. DotWriter takes care of the terminating dot.
The function shown above works, but I was wondering if there is a better way to do this?
Talk is cheap, so here is the simple code:
package main
import (
"fmt"
"time"
"net"
)
func main() {
addr := "127.0.0.1:8999"
// Server
go func() {
tcpaddr, err := net.ResolveTCPAddr("tcp4", addr)
if err != nil {
panic(err)
}
listen, err := net.ListenTCP("tcp", tcpaddr)
if err != nil {
panic(err)
}
for {
if conn, err := listen.Accept(); err != nil {
panic(err)
} else if conn != nil {
go func(conn net.Conn) {
buffer := make([]byte, 1024)
n, err := conn.Read(buffer)
if err != nil {
fmt.Println(err)
} else {
fmt.Println(">", string(buffer[0 : n]))
}
conn.Close()
}(conn)
}
}
}()
time.Sleep(time.Second)
// Client
if conn, err := net.Dial("tcp", addr); err == nil {
for i := 0; i < 2; i++ {
_, err := conn.Write([]byte("hello"))
if err != nil {
fmt.Println(err)
conn.Close()
break
} else {
fmt.Println("ok")
}
// sleep 10 seconds and re-send
time.Sleep(10*time.Second)
}
} else {
panic(err)
}
}
Output:
> hello
ok
ok
The Client writes to the Server twice. After the first read, the Server closes the connection immediately, but the Client sleeps 10 seconds and then re-writes to the Server with the same already closed connection object(conn).
Why can the second write succeed (returned error is nil)?
Can anyone help?
PS:
In order to check if the buffering feature of the system affects the result of the second write, I edited the Client like this, but it still succeeds:
// Client
if conn, err := net.Dial("tcp", addr); err == nil {
_, err := conn.Write([]byte("hello"))
if err != nil {
fmt.Println(err)
conn.Close()
return
} else {
fmt.Println("ok")
}
// sleep 10 seconds and re-send
time.Sleep(10*time.Second)
b := make([]byte, 400000)
for i := range b {
b[i] = 'x'
}
n, err := conn.Write(b)
if err != nil {
fmt.Println(err)
conn.Close()
return
} else {
fmt.Println("ok", n)
}
// sleep 10 seconds and re-send
time.Sleep(10*time.Second)
} else {
panic(err)
}
There are several problems with your approach.
Sort-of a preface
The first one is that you do not wait for the server goroutine
to complete.
In Go, once main() exits for whatever reason,
all the other goroutines still running, if any, are simply
torn down forcibly.
You're trying to "synchronize" things using timers,
but this only works in toy situations, and even then it
does so only from time to time.
Hence let's fix your code first:
package main
import (
"fmt"
"log"
"net"
"time"
)
func main() {
addr := "127.0.0.1:8999"
tcpaddr, err := net.ResolveTCPAddr("tcp4", addr)
if err != nil {
log.Fatal(err)
}
listener, err := net.ListenTCP("tcp", tcpaddr)
if err != nil {
log.Fatal(err)
}
// Server
done := make(chan error)
go func(listener net.Listener, done chan<- error) {
for {
conn, err := listener.Accept()
if err != nil {
done <- err
return
}
go func(conn net.Conn) {
var buffer [1024]byte
n, err := conn.Read(buffer[:])
if err != nil {
log.Println(err)
} else {
log.Println(">", string(buffer[0:n]))
}
if err := conn.Close(); err != nil {
log.Println("error closing server conn:", err)
}
}(conn)
}
}(listener, done)
// Client
conn, err := net.Dial("tcp", addr)
if err != nil {
log.Fatal(err)
}
for i := 0; i < 2; i++ {
_, err := conn.Write([]byte("hello"))
if err != nil {
log.Println(err)
err = conn.Close()
if err != nil {
log.Println("error closing client conn:", err)
}
break
}
fmt.Println("ok")
time.Sleep(2 * time.Second)
}
// Shut the server down and wait for it to report back
err = listener.Close()
if err != nil {
log.Fatal("error closing listener:", err)
}
err = <-done
if err != nil {
log.Println("server returned:", err)
}
}
I've sprinkled in a couple of minor fixes,
like using log.Fatal (which is
log.Print + os.Exit(1)) instead of panicking,
removing useless else clauses to adhere to the coding standard of keeping the main
flow where it belongs, and lowering the client's timeout.
I have also added checks for the errors that Close on sockets may return.
The interesting part is that we now properly shut the server down by closing the listener and then waiting for the server goroutine to report back (unfortunately Go does not return an error of a custom type from net.Listener.Accept in this case so we can't really check that Accept exited because we've closed the listener).
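The limitation mentioned above applied to the Go releases of the time. From Go 1.16 on, the net package exports net.ErrClosed, and the error returned by Accept on a closed listener can be matched with errors.Is. A small sketch, assuming Go 1.16 or later (the helper name acceptAfterClose is made up for the illustration):

```go
package main

import (
	"errors"
	"fmt"
	"net"
)

// acceptAfterClose closes a fresh listener and reports whether the
// error returned by the following Accept matches net.ErrClosed.
func acceptAfterClose() (bool, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return false, err
	}
	if err := ln.Close(); err != nil {
		return false, err
	}
	_, err = ln.Accept()
	return errors.Is(err, net.ErrClosed), nil
}

func main() {
	closed, err := acceptAfterClose()
	fmt.Println(closed, err)
}
```

With such a check, the server goroutine could send nil on the done channel when Accept failed only because the listener was closed on purpose.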
Anyway, our goroutines are now properly synchronized, and there is
no undefined behaviour, so we can reason about how the code works.
Remaining problems
Some problems still remain.
The more glaring one is the wrong assumption that TCP preserves
message boundaries, that is, that if you write "hello" to the client
end of the socket, the server reads back "hello".
This is not true: TCP considers both ends of the connection
as producing and consuming opaque streams of bytes.
This means, when the client writes "hello", the client's
TCP stack is free to deliver "he" and postpone sending "llo",
and the server's stack is free to yield "hell" to the read
call on the socket and only return "o" (and possibly some other
data) in a later read.
So, to make the code "real" you'd need to somehow introduce these
message boundaries into the protocol above TCP.
In this particular case the simplest approach would be
using "messages" consisting of a fixed-length prefix of agreed-upon
endianness, indicating the length of the following
data, and then the string data itself.
The server would then use a sequence like
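var msg [4100]byte
// read the 4-byte big-endian length prefix
_, err := io.ReadFull(sock, msg[:4])
if err != nil { ... }
mlen := int(binary.BigEndian.Uint32(msg[:4]))
if mlen < 0 || mlen > len(msg)-4 {
// length does not fit the buffer; handle error
}
if mlen == 0 {
// empty message; go back to reading the next length prefix
}
// the payload starts right after the prefix, at offset 4
_, err = io.ReadFull(sock, msg[4:4+mlen])
if err != nil { ... }
s := string(msg[4:4+mlen])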
Another approach is to agree that the messages do not contain
newlines and terminate each message with a newline
(ASCII LF, \n, 0x0a).
The server side would then use something like
a usual bufio.Scanner loop to get
full lines from the socket.
The remaining problem with your approach is not dealing with
what Read on a socket returns: note that io.Reader.Read
(that's what sockets implement, among other things) is allowed
to return an error while having read some data from the
underlying stream. In your toy example this might rightfully
be unimportant, but suppose that you're writing a wget-like
tool which is able to resume downloading of a file: even if
reading from the server returned some data and an error, you
have to deal with that returned chunk first and only then
handle the error.
Back to the problem at hand
The problem presented in the question, I believe, happens simply because in your setup you hit some TCP buffering problem due to the tiny length of your messages.
On my box which runs Linux 4.9/amd64 two things reliably "fix"
the problem:
Sending messages of 4000 bytes in length: the second call
to Write "sees" the problem immediately.
Doing more Write calls.
For the former, try something like
msg := make([]byte, 4000)
for i := range msg {
msg[i] = 'x'
}
for {
_, err := conn.Write(msg)
...
and for the latter—something like
for {
_, err := conn.Write([]byte("hello"))
...
fmt.Println("ok")
time.Sleep(time.Second / 2)
}
(it's sensible to lower the pause between sending stuff in
both cases).
It's interesting to note that the former example hits the
write: connection reset by peer (ECONNRESET in POSIX)
error while the second one hits write: broken pipe
(EPIPE in POSIX).
This is because when we're sending in chunks worth 4k bytes,
some of the packets generated for the stream manage to become
"in flight" before the server's side of the connection manages
to propagate the information on its closure to the client,
and those packets hit an already closed socket and get rejected
with the RST TCP flag set.
In the second example an attempt to send another chunk of data
sees that the client side already knows that the connection
has been torn down and fails the sending without "touching
the wire".
TL;DR, the bottom line
Welcome to the wonderful world of networking. ;-)
I'd recommend buying a copy of "TCP/IP Illustrated",
reading it and experimenting.
TCP (and IP and other protocols above IP)
sometimes do not work the way people expect them to when applying
their "common sense".
I need to implement a web service in Go that processes tar.gz files, and I wonder what the correct way is, what content type I need to define, etc.
Also, I found that a lot of things are handled automatically: on the client side I just post a gzip reader as the request body and the Accept-Encoding: gzip header is added automatically, and on the server side I do not need to gunzip the request body, it is already extracted to tar. Does that make sense?
Can I rely on it being like this with any client?
Server:
func main() {
router := mux.NewRouter().StrictSlash(true)
router.Handle("/results", dataupload.NewUploadHandler()).Methods("POST")
log.Fatal(http.ListenAndServe(*address, router))
}
Uploader:
package dataupload
import (
"errors"
log "github.com/Sirupsen/logrus"
"io"
"net/http"
)
// UploadHandler responds to /results http request, which is the result-service rest API for uploading results
type UploadHandler struct {
uploader Uploader
}
// NewUploadHandler creates UploadHandler instance
func NewUploadHandler() *UploadHandler {
return &UploadHandler{
uploader: TarUploader{},
}
}
func (uh UploadHandler) ServeHTTP(writer http.ResponseWriter, request *http.Request) {
retStatus := http.StatusOK
body, err := getBody(request)
if err != nil {
retStatus = http.StatusBadRequest
log.Error("Error fetching request body. ", err)
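} else if _, err := uh.uploader.Upload(body); err != nil {
// report upload failures instead of silently dropping the error
retStatus = http.StatusInternalServerError
log.Error("Error uploading request body. ", err)
}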
writer.WriteHeader(retStatus)
}
func getBody(request *http.Request) (io.ReadCloser, error) {
requestBody := request.Body
if requestBody == nil {
return nil, errors.New("Empty request body")
}
var err error
// this part is commented out since somehow the body is already gunzipped - no need to extract it.
/*if strings.Contains(request.Header.Get("Accept-Encoding"), "gzip") {
requestBody, err = gzip.NewReader(requestBody)
}*/
return requestBody, err
}
Client:
func main() {
f, err := os.Open("test.tar.gz")
if err != nil {
log.Fatalf("error opening file %s", err)
}
defer f.Close()
client := new(http.Client)
reader, err := gzip.NewReader(f)
if err != nil {
log.Fatalf("error gzip file %s", err)
}
request, err := http.NewRequest("POST", "http://localhost:8080/results", reader)
_, err = client.Do(request)
if err != nil {
log.Fatalf("error uploading file %s", err)
}
}
The client code you've written sends the uncompressed tar data, because gzip.NewReader decompresses the file as it is read:
reader, err := gzip.NewReader(f)
...
request, err := http.NewRequest("POST", "http://localhost:8080/results", reader)
If you sent the .tar.gz file content directly, then you would need to gunzip it on the server. E.g.:
request, err := http.NewRequest(..., f)
I think that's closer to the behavior you should expect third-party clients to exhibit.
Clearly not, but maybe...
Go provides very good support for HTTP clients (and servers). It is one of the first languages to support HTTP/2, and the design of the API clearly shows the concern for fast HTTP.
This is why the client adds Accept-Encoding: gzip automatically. That lets the server compress its response, which can dramatically reduce its size and speed up the transfer.
But gzip remains optional in HTTP/1, and not all clients will send this header to your server.
Note that Content-Type describes the type of data you are sending (here a tar.gz, but it could be application/json, text/javascript, ...), whereas Accept-Encoding tells the server which encodings the client can accept for the response; the encoding actually applied to a body for transport is declared in Content-Encoding.
Go takes care of transparently handling Accept-Encoding for you, because it is responsible for transporting the data. It is then up to you to handle the Content-Type, because only you know how to make sense of the content you received.
I'm trying to read from a request, use the result to make a POST request to another endpoint, process that response, and return the results as JSON.
I have the code below so far:
// POST
func (u *UserResource) authenticate(request *restful.Request, response *restful.Response) {
Api := Api{url: "http://api.com/api"}
usr := new(User)
err := request.ReadEntity(&usr)
if err != nil {
response.WriteErrorString(http.StatusInternalServerError, err.Error())
return
}
api_resp, err := http.Post(Api.url, "text/plain", bytes.NewBuffer(usr))
if err != nil {
response.WriteErrorString(http.StatusInternalServerError, err.Error())
return
}
defer api_resp.Body.Close()
body, err := ioutil.ReadAll(api_resp.Body)
response.WriteHeader(http.StatusCreated)
err = xml.Unmarshal(body, usr)
if err != nil {
fmt.Printf("error: %v", err)
return
}
// result, err := json.Marshal(usr)
// response.Write(result)
response.WriteEntity(&usr)
fmt.Printf("Name: %q\n", usr.UserName)
}
I'm using the go-restful package for writes and reads.
I'm getting this error when I compile the file:
src\login.go:59: cannot use usr (type *User) as type []byte in argument to bytes.NewBuffer
What would be the best way to solve this issue so I can do a POST with payload correctly?
You need to marshal your data structure to a slice of bytes. Something like this:
usrXmlBytes, err := xml.Marshal(usr)
if err != nil {
response.WriteErrorString(http.StatusInternalServerError, err.Error())
return
}
api_resp, err := http.Post(Api.url, "text/plain", bytes.NewReader(usrXmlBytes))
http.Post takes an io.Reader as the third argument. You could implement io.Reader on your User type or, more simply, serialize your data and use the bytes package to wrap it in an io.Reader:
b, err := json.Marshal(usr)
if err != nil {
response.WriteErrorString(http.StatusInternalServerError, err.Error())
return
}
api_resp, err := http.Post(Api.url, "text/plain", bytes.NewReader(b))
Noob Golang and Sinatra person here. I have hacked a Sinatra app to accept an uploaded file posted from an HTML form and save it to a hosted MongoDB database via GridFS. This seems to work fine. I am writing the same app in Golang using the mgo driver.
Functionally it works fine. However in my Golang code, I read the file into memory and then write the file from memory to the MongoDB using mgo. This appears much slower than my equivalent Sinatra app. I get the sense that the interaction between Rack and Sinatra does not execute this "middle" or "interim" step.
Here's a snippet of my Go code:
func uploadfilePageHandler(w http.ResponseWriter, req *http.Request) {
// Capture multipart form file information
file, handler, err := req.FormFile("filename")
if err != nil {
fmt.Println(err)
}
// Read the file into memory
data, err := ioutil.ReadAll(file)
// ... check err value for nil
// Specify the Mongodb database
my_db := mongo_session.DB("... database name...")
// Create the file in the Mongodb Gridfs instance
my_file, err := my_db.GridFS("fs").Create(unique_filename)
// ... check err value for nil
// Write the file to the Mongodb Gridfs instance
n, err := my_file.Write(data)
// ... check err value for nil
// Close the file
err = my_file.Close()
// ... check err value for nil
// Write a log type message
fmt.Printf("%d bytes written to the Mongodb instance\n", n)
// ... other statements redirecting to rest of user flow...
}
Question:
Is this "interim" step needed (data, err := ioutil.ReadAll(file))?
If so, can I execute this step more efficiently?
Are there other accepted practices or approaches I should be considering?
Thanks...
No, you should not read the file entirely in memory at once, as that will break when the file is too large. The second example in the documentation for GridFS.Create avoids this problem:
file, err := db.GridFS("fs").Create("myfile.txt")
check(err)
messages, err := os.Open("/var/log/messages")
check(err)
defer messages.Close()
_, err = io.Copy(file, messages)
check(err)
err = file.Close()
check(err)
As for why it's slower than something else, it's hard to tell without diving into the details of the two approaches used.
Once you have the file from the multipart form, it can be saved into GridFS using the function below. I tested this against huge files as well (up to 570 MB).
//....code inside the handlerfunc
for _, fileHeaders := range r.MultipartForm.File {
for _, fileHeader := range fileHeaders {
file, _ := fileHeader.Open()
if gridFile, err := db.GridFS("fs").Create(fileHeader.Filename); err != nil {
//errorResponse(w, err, http.StatusInternalServerError)
return
} else {
gridFile.SetMeta(fileMetadata)
gridFile.SetName(fileHeader.Filename)
if err := writeToGridFile(file, gridFile); err != nil {
//errorResponse(w, err, http.StatusInternalServerError)
return
}
}
}
}
func writeToGridFile(file multipart.File, gridFile *mgo.GridFile) error {
reader := bufio.NewReader(file)
defer func() { file.Close() }()
// make a buffer to keep chunks that are read
buf := make([]byte, 1024)
for {
// read a chunk
n, err := reader.Read(buf)
if err != nil && err != io.EOF {
return errors.New("Could not read the input file")
}
if n == 0 {
break
}
// write a chunk
if _, err := gridFile.Write(buf[:n]); err != nil {
return errors.New("Could not write to GridFs for "+ gridFile.Name())
}
}
// Close flushes the remaining data; report its error to the caller
return gridFile.Close()
}