reusing read buffers when working with sockets

I'd like to know the proper way to reuse a []byte buffer in Go. I declare it like this:
buf := make([]byte, 1024)
and then use it like this:
conn, _ := net.Dial("tcp", addr)
_, err := conn.Read(buf)
I've heard that declaring a new buffer isn't efficient, since it involves memory allocations, and that we should reuse existing buffers instead. However, I'm not sure whether I can just pass the buffer again and it will be wiped, or whether it can hold parts of previous messages (especially if the current message from the socket is shorter than the previous one).

The Read method reads up to len(buf) bytes into the buffer and returns the number of bytes read.
The Read method does not modify the length of the caller's slice. It cannot, because the slice is passed by value. The application must use the returned length to get a slice of the bytes actually read:
n, err = conn.Read(buf)
bufRead := buf[:n]
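Nothing wipes the rest of the buffer between calls, so a shorter read leaves bytes from a previous, longer message in place past n. A tiny simulation of that, using copy to stand in for Read:
buf := make([]byte, 8)
copy(buf, "AAAAAAAA")       // pretend a previous 8-byte message filled the buffer
n := copy(buf, "BB")        // a shorter "read" of 2 bytes overwrites only the front
fmt.Printf("%s\n", buf)     // BBAAAAAA - the old bytes are still there
fmt.Printf("%s\n", buf[:n]) // BB       - which is why you slice with the returned n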
The application can call Read multiple times using the same buffer.
conn, err := net.Dial("tcp", addr)
if err != nil {
    // handle error
}
buf := make([]byte, 1024)
for {
    n, err := conn.Read(buf)
    if err != nil {
        // handle error
    }
    fmt.Printf("Read %s\n", buf[:n]) // buf[:n] is the slice of bytes read from conn
}

In practice you rarely call io.Reader.Read() directly; instead you pass the reader down to whatever code needs an io.Reader.
The buffer will not be wiped; you must do that by hand. Alternatively, if you want buffering, you can use bufio:
conn, _ := net.Dial("tcp", addr)
r := bufio.NewReader(conn)
which you can, for example, copy into any io.Writer for further processing:
r.WriteTo(w) // w is any io.Writer
and you can reset, to reuse the reader (and its buffer) with a new connection:
r.Reset(NewConn)
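Putting those fragments together, a minimal sketch might look like this (addr, otherAddr and dst are placeholder names; error handling is abbreviated as in the snippets above):
conn, err := net.Dial("tcp", addr)
if err != nil {
    // handle error
}
r := bufio.NewReader(conn)

// Drain the connection into dst (any io.Writer), reusing r's internal buffer.
if _, err := r.WriteTo(dst); err != nil {
    // handle error
}

// Reuse the same bufio.Reader, and therefore the same buffer, for another connection.
conn2, err := net.Dial("tcp", otherAddr)
if err != nil {
    // handle error
}
r.Reset(conn2)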

Package io

import "io"

type Reader interface {
    Read(p []byte) (n int, err error)
}
Reader is the interface that wraps the basic Read method.
Read reads up to len(p) bytes into p. It returns the number of bytes
read (0 <= n <= len(p)) and any error encountered. Even if Read
returns n < len(p), it may use all of p as scratch space during the
call. If some data is available but not len(p) bytes, Read
conventionally returns what is available instead of waiting for more.
When Read encounters an error or end-of-file condition after
successfully reading n > 0 bytes, it returns the number of bytes read.
It may return the (non-nil) error from the same call or return the
error (and n == 0) from a subsequent call. An instance of this general
case is that a Reader returning a non-zero number of bytes at the end
of the input stream may return either err == EOF or err == nil. The
next Read should return 0, EOF.
Callers should always process the n > 0 bytes returned before
considering the error err. Doing so correctly handles I/O errors that
happen after reading some bytes and also both of the allowed EOF
behaviors.
Implementations of Read are discouraged from returning a zero byte
count with a nil error, except when len(p) == 0. Callers should treat
a return of 0 and nil as indicating that nothing happened; in
particular it does not indicate EOF.
Implementations must not retain p.
Read may use all of the buffer as scratch space during the call.
For example:
buf := make([]byte, 4096)
for {
    n, err := r.Read(buf[:cap(buf)])
    buf = buf[:n]
    // process buf (per the contract above, handle the n > 0 bytes before acting on err)
    if err != nil {
        // handle error (io.EOF signals a normal end of the stream)
        break
    }
}

Related

How to get handle of all active monitors

How can I get the handle of each monitor? I'll need to know which monitor handle corresponds to each physical monitor. I can find this if I also have the positions and numbers of each monitor. But I'm unable to even get the handles of the monitors.
I've read the documentation for EnumDisplayMonitors dozens of times, but nothing that I have tried will work.
I tried doing this:
oEnumDisplayMonitors := RegisterCallback("EnumMonitorsProc")
DllCall("EnumDisplayMonitors", "Ptr", 0, "Ptr", 0, "Ptr", oEnumDisplayMonitors, "Ptr", 0)
omh := oEnumDisplayMonitors.monitorHandle
h := oEnumDisplayMonitors.hdc
olpr := oEnumDisplayMonitors.lpRect
EnumMonitorsProc(monitorHandle, hdc, lpRect, lParam){
}
But the values for every argument to EnumMonitorsProc are all null.
I have also tried the following, based on the example from this post: https://www.autohotkey.com/boards/viewtopic.php?f=6&t=4606
However, the script just aborts as soon as it reaches the DllCall("EnumDisplayMonitors",... line:
Monitors := MDMF_Enum("")
For HMON, M In Monitors {
    l := M.Left
    t := M.Top
    h := HMON
}

MDMF_Enum(HMON := "") {
    Static EnumProc := RegisterCallback("MDMF_EnumProc")
    Static Monitors := {}
    If (HMON = "") ; new enumeration
        Monitors := {}
    If (Monitors.MaxIndex() = "") ; enumerate
        DllCall("EnumDisplayMonitors", "Ptr", 0, "Ptr", 0, "Ptr", EnumProc, "Ptr", &Monitors, "UInt")
    Return (HMON = "") ? Monitors : Monitors.HasKey(HMON) ? Monitors[HMON] : False
}
I need the handles for ALL monitors, not just for the active monitor or the primary monitor.
First we define the callback function that's going to be provided for the EnumDisplayMonitors function.
Callback_Func := RegisterCallback("MONITORENUMPROC")
This could be done in-line without creating an unnecessary variable as well.
Now that we've done that, we also of course need to create the MONITORENUMPROC function we're referring to:
MONITORENUMPROC(hMonitor, hDC, pRECT, data)
{
    MsgBox, % hMonitor
    return true
}
We're only interested in the handle, which is the first param. We can ignore everything else in this small example.
And we're returning true to indicate that we want to keep enumerating through the rest of the display monitors, assuming there are any. This was specified in the documentation for the callback function.
Ok, that's our callback function all done, now we want to call the EnumDisplayMonitors function and pass it that callback function so it can do its trick.
DllCall("EnumDisplayMonitors", Ptr, 0, Ptr, 0, Ptr, Callback_Func, Ptr, 0)
We're passing null (pointer 0 in AHK) to the first two parameters, as the documentation suggests if one wants to enumerate through all the available monitors.
For the 3rd parameter we pass our callback function's pointer, that's stored in our Callback_Func variable. (AHK's RegisterCallback function returns a pointer to our function).
And to the 4th parameter we just pass null again because we don't care about it in this small example. You could pass whatever data you wish through there, and it'd appear in the 4th parameter of our user-defined MONITORENUMPROC function (the one I named "data").
In the library you were looking at, they pass in a pointer to their own "Monitors" object. It's just a clever way they have of making the function have a double use.
So that's basically it: we print a message box with each monitor's handle, as a minimal example of how it works. Since you probably want to know which handle corresponds to which monitor, you can pass the handle on to yet another function, such as GetMonitorInfo, exactly as they do in the library you were looking at.
And here's the example script I produced for you:
Callback_Func := RegisterCallback("MONITORENUMPROC")
DllCall("EnumDisplayMonitors", "Ptr", 0, "Ptr", 0, "Ptr", Callback_Func, "Ptr", 0)

MONITORENUMPROC(hMonitor, hDC, pRECT, data)
{
    MsgBox, % hMonitor
    return true
}

Read mongodump output with go and mgo

I'm trying to read a collection dump generated by mongodump. The file is a few gigabytes so I want to read it incrementally.
I can read the first object with something like this:
buf := make([]byte, 100000)
f, _ := os.Open(path)
f.Read(buf)
var m bson.M
bson.Unmarshal(buf, &m)
However I don't know how much of the buf was consumed, so I don't know how to read the next one.
Is this possible with mgo?
Using mgo's bson.Unmarshal() alone is not enough -- that function is designed to take a []byte representing a single document, and unmarshal it into a value.
You will need a function that can read the next whole document from the dump file, then you can pass the result to bson.Unmarshal().
Comparing this to encoding/json or encoding/gob, it would be convenient if mgo.bson had a Reader type that consumed documents from an io.Reader.
Anyway, from the source for mongodump, it looks like the dump file is just a series of bson documents, with no file header/footer or explicit record separators.
BSONTool::processFile shows how mongorestore reads the dump file. Their code reads 4 bytes to determine the length of the document, then uses that size to read the rest of the document. Confirmed that the size prefix is part of the bson spec.
Here is a playground example that shows how this could be done in Go: read the length field, read the rest of the document, unmarshal, repeat.
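A sketch along those lines (it assumes the gopkg.in/mgo.v2/bson package; the readDocs helper and its handle callback are illustrative names, not part of mgo):
import (
    "encoding/binary"
    "io"

    "gopkg.in/mgo.v2/bson"
)

// readDocs reads BSON documents one at a time from r and passes each,
// decoded into a bson.M, to handle.
func readDocs(r io.Reader, handle func(bson.M) error) error {
    sizeBuf := make([]byte, 4)
    for {
        // Every BSON document starts with its total size as a 4-byte
        // little-endian int32; the size includes these 4 bytes.
        if _, err := io.ReadFull(r, sizeBuf); err != nil {
            if err == io.EOF {
                return nil // clean end of the dump
            }
            return err
        }
        size := int(binary.LittleEndian.Uint32(sizeBuf))

        // Read the rest of the document and re-attach the prefix,
        // since bson.Unmarshal expects the whole document.
        doc := make([]byte, size)
        copy(doc, sizeBuf)
        if _, err := io.ReadFull(r, doc[4:]); err != nil {
            return err
        }

        var m bson.M
        if err := bson.Unmarshal(doc, &m); err != nil {
            return err
        }
        if err := handle(m); err != nil {
            return err
        }
    }
}
You would call it with something like readDocs(bufio.NewReader(f), func(m bson.M) error { fmt.Println(m); return nil }), so the multi-gigabyte dump is consumed incrementally instead of being loaded at once.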
The method File.Read returns the number of bytes read.
File.Read
Read reads up to len(b) bytes from the File. It returns the number of bytes read and an error, if any. EOF is signaled by a zero count with err set to io.EOF.
So you can get the number of bytes read by simply storing the return parameters of your read:
n, err := f.Read(buf)
I managed to solve it with the following code:
for len(buf) > 0 {
    var r bson.Raw
    var m userObject
    bson.Unmarshal(buf, &r)
    r.Unmarshal(&m)
    fmt.Println(m)
    buf = buf[len(r.Data):]
}
Niks Keets' answer did not work for me. Somehow len(r.Data) was always the whole buffer length, so I came up with this other code:
for len(buff) > 0 {
    messageSize := binary.LittleEndian.Uint32(buff)
    err = bson.Unmarshal(buff, &myObject)
    if err != nil {
        panic(err)
    }
    // Do your stuff
    buff = buff[messageSize:]
}
Of course you have to handle truncated structs at the end of the buffer. In my case I could load the whole file into memory.
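If you do read the file in fixed-size chunks rather than all at once, a sketch of the truncation handling might look like this (the carry-over of the leftover bytes is an assumption, not part of the original answer):
for {
    if len(buff) < 4 {
        break // not even a complete length prefix left
    }
    messageSize := int(binary.LittleEndian.Uint32(buff))
    if messageSize > len(buff) {
        break // truncated document: keep buff and prepend it to the next chunk read
    }
    if err := bson.Unmarshal(buff[:messageSize], &myObject); err != nil {
        panic(err)
    }
    // Do your stuff
    buff = buff[messageSize:]
}
// whatever is left in buff is the start of the next, partially read document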

Golang md5 Sum() function

package main

import (
    "crypto/md5"
    "fmt"
)

func main() {
    hash := md5.New()
    b := []byte("test")
    fmt.Printf("%x\n", hash.Sum(b))
    hash.Write(b)
    fmt.Printf("%x\n", hash.Sum(nil))
}
Output:
*md5.digest74657374d41d8cd98f00b204e9800998ecf8427e
098f6bcd4621d373cade4e832627b4f6
Could someone please explain why/how I get different results for the two prints?
I'm building on the already good answers. I'm not sure Sum is actually the function you want. From the hash.Hash documentation:
// Sum appends the current hash to b and returns the resulting slice.
// It does not change the underlying hash state.
Sum(b []byte) []byte
This function has a dual use-case, which you seem to mix in an unfortunate way. The use-cases are:
Computing the hash of a single run
Chaining the output of several runs
In case you simply want to compute the hash of something, either use md5.Sum(data) or
digest := md5.New()
digest.Write(data)
hash := digest.Sum(nil)
This code will, according to the excerpt of the documentation above, append the checksum of data to nil, resulting in the checksum of data.
If you want to chain several blocks of hashes, the second use-case of hash.Sum, you can do it like this:
hashed := make([]byte, 0)
for hasData {
    digest.Write(data)
    hashed = digest.Sum(hashed)
}
This will append each iteration's hash to the already computed hashes. Probably not what you want.
So, now you should be able to see why your code is failing. If not, take this commented version of your code (On play):
hash := md5.New()
b := []byte("test")
fmt.Printf("%x\n", hash.Sum(b)) // gives 74657374<hash> (74657374 = "test")
fmt.Printf("%x\n", hash.Sum([]byte("AAA"))) // gives 414141<hash> (41 = 'A')
fmt.Printf("%x\n", hash.Sum(nil)) // gives <hash> as append(nil, hash) == hash
fmt.Printf("%x\n", hash.Sum(b)) // gives 74657374<hash> (74657374 = "test")
fmt.Printf("%x\n", hash.Sum([]byte("AAA"))) // gives 414141<hash> (41 = 'A')
hash.Write(b)
fmt.Printf("%x\n", hash.Sum(nil)) // gives a completely different hash since internal bytes changed due to Write()
You have two ways to actually get the md5 sum of a byte slice:
func main() {
    hash := md5.New()
    b := []byte("test")
    hash.Write(b)
    fmt.Printf("way one : %x\n", hash.Sum(nil))
    fmt.Printf("way two : %x\n", md5.Sum(b))
}
According to http://golang.org/src/pkg/crypto/md5/md5.go#L88, your hash.Sum(b) is like calling append(b, actual-hash-of-an-empty-md5-hash).
The definition of Sum :
func (d0 *digest) Sum(in []byte) []byte {
    // Make a copy of d0 so that caller can keep writing and summing.
    d := *d0
    hash := d.checkSum()
    return append(in, hash[:]...)
}
When you call Sum(nil) it returns d.checkSum() directly as a byte slice, however if you call Sum([]byte) it appends d.checkSum() to your input.
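To make that concrete, here is a small sketch (the "abc" prefix is an arbitrary example, not from the question):
package main

import (
    "bytes"
    "crypto/md5"
    "fmt"
)

func main() {
    h := md5.New()
    h.Write([]byte("test"))

    sum := h.Sum(nil)                // just the digest of "test"
    prefixed := h.Sum([]byte("abc")) // the bytes of "abc" followed by the same digest

    fmt.Println(bytes.Equal(prefixed, append([]byte("abc"), sum...))) // true
}
Both calls leave the internal state untouched, which is why the digest part is identical.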
From the docs:
// Sum appends the current hash to b and returns the resulting slice.
// It does not change the underlying hash state.
Sum(b []byte) []byte
so "*74657374*d41d8cd98f00b204e9800998ecf8427e" is actually a hex representation of "test", plus the initial state of the hash.
fmt.Printf("%x", []byte{"test"})
will result in... "74657374"!
So basically hash.Sum(b) is not doing what you think it does. The second statement is the right hash.
I would like to answer to the point:
why/how do I get different results for the two prints?
Ans:
hash := md5.New()
Because you have just created a new md5 hash instance, nothing has been written to it when you call hash.Sum(b); Sum therefore appends the digest of the empty hash to b, which is why you get 74657374d41d8cd98f00b204e9800998ecf8427e as output (74657374 is "test" in hex, the rest is the md5 of the empty input).
In the next statements, hash.Write(b) writes b into the hash instance, and hash.Sum(nil) then returns the md5 of the data written so far, i.e. the md5 of "test", with nothing prepended.
This is the reason you are getting these outputs.
For your reference, look at the Sum API:
func (d0 *digest) Sum(in []byte) []byte {
    // Make a copy of d0 so that caller can keep writing and summing.
    d := *d0
    hash := d.checkSum()
    return append(in, hash[:]...)
}

what could limit bytes read in from read()

I am reading bytes off a socket initialised like this:
fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
However when I read from this socket
char buf[ETH_FRAME_LEN];
len = read(fd, buf, sizeof(buf));
len shows only 1500 bytes were read. I checked with wireshark and the packet returned is 5854. The total length field under IP says 5840 (so + 14 bytes for ethernet header = 5854). I tried using a larger buffer (6000) but still only 1500 bytes were being read off the wire.
I tried requesting a smaller file from the server (1504 bytes), but I get the same results. As it is a raw socket, the data read in includes the ethernet headers, so it is not reading the last 4 bytes into the buffer.
What could be the cause of this? I'm not aware of any argument to socket() that could cause this.
What happens if you try calling read again? Is the next chunk of the message quickly returned?
From the read man page (my emphasis)
read() attempts to read up to count bytes
If you want to read a certain number of bytes, you should be prepared to call read in a loop until you receive your target total cumulatively over the calls.
What is happening is that you're getting exactly one Ethernet MTU's worth of payload per call to read().
read() returns:
On success, the number of bytes read is returned (zero indicates end of
file), and the file position is advanced by this number. It is not an
error if this number is smaller than the number of bytes requested;
this may happen for example because fewer bytes are actually available
right now (maybe because we were close to end-of-file, or because we
are reading from a pipe, or from a terminal), or because read() was
interrupted by a signal. On error, -1 is returned, and errno is set
appropriately. In this case it is left unspecified whether the file
position (if any) changes.
You can try to use recv() with MSG_WAITALL instead of pure read():
This flag requests that the operation block until the full
request is satisfied. However, the call may still return less
data than requested if a signal is caught, an error or disconnect occurs, or the next data to be received is of a different
type than that returned.
len = recv(fd, buf, sizeof(buf), MSG_WAITALL);
Another way is to read or recv in a loop like:
ssize_t Recv(int fd, void* buf, ssize_t n)
{
    ssize_t read = 0; /* bytes received so far */
    ssize_t r;
    while (read != n)
    {
        r = recv(fd, ((char*)buf) + read, n - read, 0);
        if (r == -1)
            return (read) ? read : -1; /* return what we have so far, or the error */
        if (r == 0)
            return 0; /* peer closed the connection */
        read += r;
    }
    return read;
}

Determining the number of bytes ready to be recv()'d

I can use select() to determine if a call to recv() would block, but once I've determined that there are bytes to be read, is there a way to query how many bytes are currently available before I actually call recv()?
If your OS provides it (and most do), you can use ioctl(..,FIONREAD,..):
int get_n_readable_bytes(int fd) {
    int n = -1;
    if (ioctl(fd, FIONREAD, &n) < 0) {
        perror("ioctl failed");
        return -1;
    }
    return n;
}
Windows provides an analogous ioctlsocket(..,FIONREAD,..), which expects a pointer to unsigned long:
unsigned long get_n_readable_bytes(SOCKET sock) {
    unsigned long n = -1;
    if (ioctlsocket(sock, FIONREAD, &n) < 0) {
        /* look in WSAGetLastError() for the error code */
        return 0;
    }
    return n;
}
The ioctl call should work on sockets and some other fds, though not on all fds. I believe that it works fine with TCP sockets on nearly any free unix-like OS you are likely to use. Its semantics are a little different for UDP sockets: for them, it tells you the number of bytes in the next datagram.
The ioctlsocket call on Windows will (obviously) only work on sockets.
No, a protocol needs to determine that. For example:
If you use fixed-size messages then you know you need to read X bytes.
You could read a message header that indicates X bytes to read.
You could read until a terminal character / sequence is found.