I'm looking for the cleanest way in Go to pass a message through (i.e. act as an SMTP proxy) while manipulating the HTML of the message body (e.g. adding an open-tracking pixel, not yet coded).
The net/mail package includes a function, ReadMessage, that parses the mail headers into a map and gives you an io.Reader for the body. Parsing is necessary to determine the MIME parts of the body for processing, rather than just passing everything through with io.Copy (the simple stub version of this function, shown in the block comment, does just that).
The following function copies an incoming mail "src" to an outgoing mail stream "dst". (The calling code sets these up as a DotReader and a DotWriter, which take care of most of the "dot" processing needed for RFC 5321.)
// Processing of email body via IO stream functions
package main
import (
"bufio"
"io"
"log"
"net/mail"
"strings"
)
/* If you just want to pass through the entire mail headers and body, you can just use
the following alternative:
func MailCopy(dst io.Writer, src io.Reader) (int64, error) {
return io.Copy(dst, src)
}
*/
// MailCopy transfers the mail body from downstream (client) to upstream (server)
// The writer will be closed by the parent function, no need to close it here.
func MailCopy(dst io.Writer, src io.Reader) (int64, error) {
var totalWritten int64
const smtpCRLF = "\r\n"
message, err := mail.ReadMessage(bufio.NewReader(src))
if err != nil {
return totalWritten, err
}
// Pass through headers. The message.Header map does not preserve order, but that should not matter.
for hdrType, hdrList := range message.Header {
for _, hdrVal := range hdrList {
hdrLine := hdrType + ": " + hdrVal + smtpCRLF
log.Print("\t", hdrLine)
bytesWritten, err := dst.Write([]byte(hdrLine))
totalWritten += int64(bytesWritten)
if err != nil {
return totalWritten, err
}
}
}
// Blank line denotes end of headers
bytesWritten, err := io.Copy(dst, strings.NewReader(smtpCRLF))
totalWritten += int64(bytesWritten)
if err != nil {
return totalWritten, err
}
// Copy the body
bytesWritten, err = io.Copy(dst, message.Body)
totalWritten += int64(bytesWritten)
if err != nil {
return totalWritten, err
}
return totalWritten, err
}
It does seem necessary to build this, because there is no net/mail.WriteMessage() function.
The header order is always randomised by Go's map iteration. This seems harmless in my tests.
A CRLF has to be written between the end of the headers and the body, as per the RFCs. DotWriter takes care of the terminating dot.
The function shown above works, but I was wondering whether there is a better way to do this.
I've accidentally spotted a bug where parts of a message from a previous connection end up in the next message.
I have a basic server and client. I have removed all the error handling to avoid bloating the examples too much.
I've also replaced some Printf calls with time.Sleep, because otherwise the data is read too fast for me to break the connection in time to reproduce the bug.
The "package" is a simple structure: the first 4 bytes are the length, followed by the content.
Client code:
package main
import (
"encoding/binary"
"fmt"
"net"
)
func main() {
conn, _ := net.Dial("tcp", "0.0.0.0:8081")
defer conn.Close()
str := "msadsakdjsajdklsajdklsajdk"
// Creating a package
buf := make([]byte, len(str)+4)
copy(buf[4:], str)
binary.LittleEndian.PutUint32(buf[:4], uint32(len(str)))
for {
_, err := conn.Write(buf)
if err != nil {
fmt.Println(err)
return
}
}
}
Server code:
package main
import (
"encoding/binary"
"fmt"
"net"
"sync"
"time"
)
func ReadConnection(conn net.Conn, buf []byte) (err error) {
maxLen := cap(buf)
readSize := 0
for readSize < maxLen {
// instead of Printf
time.Sleep(time.Nanosecond * 10)
readN, err := conn.Read(buf[readSize:])
if err != nil {
return err
}
readSize += readN
}
return nil
}
func handleConnection(conn net.Conn, waitGroup *sync.WaitGroup) {
waitGroup.Add(1)
defer conn.Close()
defer waitGroup.Done()
fmt.Printf("Serving %s\n", conn.RemoteAddr().String())
var packageSize int32 = 0
int32Buf := make([]byte, 4)
for {
// read the length
conn.Read(int32Buf)
packageSize = int32(binary.LittleEndian.Uint32(int32Buf))
// assuming the length should be 26
if packageSize > 26 {
fmt.Println("Package size error")
return
}
// read the content
packageBuf := make([]byte, packageSize)
if err := ReadConnection(conn, packageBuf); err != nil {
fmt.Printf("ERR: %s\n", err)
return
}
// instead of Printf
time.Sleep(time.Nanosecond * 100)
}
}
func main() {
//establish connection
listener, _ := net.Listen("tcp", "0.0.0.0:8081")
defer listener.Close()
waitGroup := sync.WaitGroup{}
for {
conn, err := listener.Accept()
if err != nil {
break
}
go handleConnection(conn, &waitGroup)
}
waitGroup.Wait()
}
So for some reason, int32Buf receives the last 2 bytes of the previous message (d, k) and the first 2 bytes of the length, resulting in the byte slice [107, 100, 26, 0] when it should be [26, 0, 0, 0].
And of course, the rest of the data contains the remaining two zeroes:
conn.Read(int32Buf)
You need to check the return value of conn.Read and compare it against your expectations. Your code assumes that conn.Read will always completely fill the given 4-byte buffer.
This assumption is wrong: it might read less data. Specifically, it might read only 2 bytes, in which case you'll still end up with \x1a\x00\x00\x00 in your buffer (the last two bytes are left over from the previous use of the buffer), which still translates to a message length of 26. Only, the first 2 bytes of the message will then actually be the last 2 bytes of the length, which were not consumed by that read. This means that after reading 26 bytes the server has not read the full message: 2 bytes are left and will be included in the next message. This is what you observed.
To be sure that the buffer is filled completely, check the return values of conn.Read or use io.ReadFull. After you've done this it works as expected (from the comment):
Ok, now it works perfect
So why did this happen only in the context of a new connection? Maybe because the additional load from another connection changed the timing slightly but significantly enough. Still, contrary to the description in the question, this is not data read from a different connection but data from the current one. This could easily be checked by using different messages with different clients.
In the program below I have two routers. One listens on localhost:3000 and acts as a public access point. It may also send requests with data to another local address, localhost:8000, where the data is processed. The second router listens on localhost:8000 and handles processing requests from the first router.
Problem
The first router sends a request with a context to the second using the http.NewRequestWithContext() function. A value is added to the context and the context is added to the request. When the request arrives at the second router, it does not have the value that was added previously.
Some things, like error handling, are left out so as not to post a wall of code here.
package main
import (
"bytes"
"context"
"net/http"
"github.com/go-chi/chi"
"github.com/go-chi/chi/middleware"
)
func main() {
go func() {
err := http.ListenAndServe(
"localhost:3000",
GetDataAndSolve(),
)
if err != nil {
panic(err)
}
}()
go func() {
err := http.ListenAndServe( // in GetDataAndSolve() we send requests
"localhost:8000", // with data for processing
InternalService(),
)
if err != nil {
panic(err)
}
}()
// interrupt := make(chan os.Signal, 1)
// signal.Notify(interrupt, syscall.SIGTERM, syscall.SIGINT)
// <-interrupt // just a cool way to close the program, uncomment if you need it
}
func GetDataAndSolve() http.Handler {
r := chi.NewRouter()
r.Use(middleware.Logger)
r.Get("/tasks/str", func(rw http.ResponseWriter, r *http.Request) {
// receiving data for processing...
taskCtx := context.WithValue(r.Context(), "str", "strVar") // the value is being
postReq, err := http.NewRequestWithContext( // stored to context
taskCtx, // context is being given to request
"POST",
"http://localhost:8000/tasks/solution",
bytes.NewBuffer([]byte("something")),
)
if err != nil { // postReq is nil if the request could not be built
return
}
postReq.Header.Set("Content-Type", "application/json") // specifying for endpoint what we are sending
resp, err := http.DefaultClient.Do(postReq) // running actual request
// pls, proceed to Solver()
// do stuff to resp
// also despite arriving to middleware without right context
// here resp contains a request with correct context
})
return r
}
func Solver(next http.Handler) http.Handler { // here we end up after sending postReq
return http.HandlerFunc(func(rw http.ResponseWriter, r *http.Request) {
if s, ok := r.Context().Value("str").(string); !ok || s == "" {
return // the request arrives without "str" in its context
}
ctxWithResult := context.WithValue(r.Context(), "result", mockFunc(r.Context()))
next.ServeHTTP(rw, r.Clone(ctxWithResult))
})
}
func InternalService() http.Handler {
r := chi.NewRouter()
r.Use(middleware.Logger)
r.With(Solver).Post("/tasks/solution", emptyHandlerFunc)
return r
}
Your understanding of context is not correct.
Context (simplifying to an extent, and with reference to the NewRequestWithContext API) is just an in-memory object with which you can control the lifetime of the request (handling/triggering cancellations).
However, your code is making an HTTP call, which goes over the wire (marshaled) using the HTTP protocol. This protocol doesn't understand Go's context or its values.
In your scenario, both /tasks/str and /tasks/solution are served by the same program. But what if they were on different servers, probably written in different languages and running on different application servers as well? That is why the context cannot be sent across.
Since the APIs live in the same program, maybe you can avoid making a full-blown HTTP call and invoke the API/method directly. It might turn out to be faster as well.
If you still want to send additional values from the context, you'll have to use other attributes such as HTTP headers, params, or the body to send the required information across. This can provide more info on how to serialize data from context over HTTP.
I am working on a REST endpoint that should take the request body stream and consume it. I tried to get the request body (Content-Type text/csv or application/octet-stream) and read from it through a buffer.
reader := r.Body // r.Body already implements io.Reader
writer := bufio.NewWriter(outputFile) // buffered writer for the output file
buffer := make([]byte, 4000)
for {
numBytes, err := reader.Read(buffer)
// Read may return data together with io.EOF, so write before checking err
if numBytes > 0 {
writer.Write(buffer[:numBytes])
}
if err == io.EOF {
break
} else if err != nil {
return
}
}
writer.Flush()
Above is my Go code. I got nothing from request.Body. However, if I use multipart/form-data I can get the data from the parts. Does HTTP always require form-data for uploading?
I need to implement a web service in Go that processes tar.gz files, and I wonder what the correct way is, what content type I need to define, etc.
Also, I found that a lot of things are handled automatically: on the client side I just post a gzip reader as the request body and the Accept-Encoding: gzip header is added automatically, and on the server side I do not need to gunzip the request body, since it is already extracted to tar. Does that make sense?
Can I rely on it being like this with any client?
Server:
func main() {
router := mux.NewRouter().StrictSlash(true)
router.Handle("/results", dataupload.NewUploadHandler()).Methods("POST")
log.Fatal(http.ListenAndServe(*address, router))
}
Uploader:
package dataupload
import (
"errors"
log "github.com/Sirupsen/logrus"
"io"
"net/http"
)
// UploadHandler responds to /results http request, which is the result-service rest API for uploading results
type UploadHandler struct {
uploader Uploader
}
// NewUploadHandler creates UploadHandler instance
func NewUploadHandler() *UploadHandler {
return &UploadHandler{
uploader: TarUploader{},
}
}
func (uh UploadHandler) ServeHTTP(writer http.ResponseWriter, request *http.Request) {
retStatus := http.StatusOK
body, err := getBody(request)
if err != nil {
retStatus = http.StatusBadRequest
log.Error("Error fetching request body. ", err)
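} else if _, err := uh.uploader.Upload(body); err != nil {
retStatus = http.StatusInternalServerError
log.Error("Error uploading request body. ", err)
}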
writer.WriteHeader(retStatus)
}
func getBody(request *http.Request) (io.ReadCloser, error) {
requestBody := request.Body
if requestBody == nil {
return nil, errors.New("Empty request body")
}
var err error
// this part is commented out since somehow the body is already gunzipped - no need to extract it.
/*if strings.Contains(request.Header.Get("Accept-Encoding"), "gzip") {
requestBody, err = gzip.NewReader(requestBody)
}*/
return requestBody, err
}
Client
func main() {
f, err := os.Open("test.tar.gz")
if err != nil {
log.Fatalf("error opening file %s", err)
}
defer f.Close()
client := new(http.Client)
reader, err := gzip.NewReader(f)
if err != nil {
log.Fatalf("error gzip file %s", err)
}
request, err := http.NewRequest("POST", "http://localhost:8080/results", reader)
_, err = client.Do(request)
if err != nil {
log.Fatalf("error uploading file %s", err)
}
}
The client code you've written sends the already decompressed tar file, because gzip.NewReader decompresses whatever it reads:
reader, err := gzip.NewReader(f)
...
request, err := http.NewRequest("POST", "http://localhost:8080/results", reader)
If you sent the .tar.gz file content directly, then you would need to gunzip it on the server. E.g.:
request, err := http.NewRequest(..., f)
I think that's closer to the behavior you should expect third-party clients to exhibit.
Clearly not, but maybe...
Go provides very good support for HTTP clients (and servers). It was one of the first languages to support HTTP/2, and the design of the API clearly shows a concern for fast HTTP.
This is why the client adds Accept-Encoding: gzip automatically. It lets the server compress its response, which can dramatically reduce the response size and so speed up the transfer.
But gzip remains optional in HTTP/1, and not every client will send this header to your server.
Note that Content-Type describes the type of data you are sending (here a tar.gz, but it could be application/json, text/javascript, ...), whereas Accept-Encoding tells the server which encodings the client accepts for the response body. The encoding actually applied to a body is declared with Content-Encoding.
Go's client transparently handles the Accept-Encoding header that it added itself, decompressing the response for you, because it is responsible for transporting the data. It is then up to you to handle the Content-Type, because only you know how to make sense of the content you received.
I have the following code in Go:
import (
"log"
"net/http"
"code.google.com/p/go.text/transform"
"code.google.com/p/go.text/encoding/charmap"
)
...
res, err := http.Get(url)
if err != nil {
log.Println("Cannot read", url);
log.Println(err);
continue
}
defer res.Body.Close()
The page I load contains non-UTF-8 characters, so I try to use transform:
utfBody := transform.NewReader(res.Body, charmap.Windows1251.NewDecoder())
But the problem is that it returns an error even in this simple scenario:
bytes, err := ioutil.ReadAll(utfBody)
log.Println(err)
if err == nil {
log.Println(bytes)
}
transform: short destination buffer
It does actually fill bytes with some data, but in my real code I use goquery:
doc, err := goquery.NewDocumentFromReader(utfBody)
which sees the error and fails with no data in return.
I tried passing "chunks" of res.Body to transform.NewReader and figured out that as long as res.Body contains no non-UTF-8 data it works well; when it contains a non-UTF-8 byte, it fails with the error above.
I'm quite new to Go and don't really understand what's going on or how to deal with this.
Without the whole code along with an example URL it's hard to tell what exactly is going wrong here.
That said, I can recommend the golang.org/x/net/html/charset package for this, as it supports both charset guessing and conversion to UTF-8.
func fetchUtf8Bytes(url string) ([]byte, error) {
res, err := http.Get(url)
if err != nil {
return nil, err
}
defer res.Body.Close() // release the connection once the body has been read
contentType := res.Header.Get("Content-Type") // Optional, better guessing
utf8reader, err := charset.NewReader(res.Body, contentType)
if err != nil {
return nil, err
}
return ioutil.ReadAll(utf8reader)
}
Complete example: http://play.golang.org/p/olcBM9ughv