what could limit bytes read in from read() - sockets

I am reading bytes off a socket initialised like this:
fd = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
However when I read from this socket
char buf[ETH_FRAME_LEN];
len = read(fd, buf, sizeof(buf));
len shows only 1500 bytes were read. I checked with wireshark and the packet returned is 5854. The total length field under IP says 5840 (so + 14 bytes for ethernet header = 5854). I tried using a larger buffer (6000) but still only 1500 bytes were being read off the wire.
I tried requesting a smaller file from the server (1504 bytes), but I get the same results. As it is a raw socket, the data read in includes the ethernet headers, so it is not reading the last 4 bytes into the buffer.
What could be the cause of this? I'm not aware of any argument to socket() that could cause this.

What happens if you try calling read again? Is the next chunk of the message quickly returned?
From the read man page (my emphasis)
read() attempts to read up to count bytes
If you want to read a certain number of bytes, you should be prepared to call read in a loop until you receive your target total cumulatively over the calls.

What is happening is that you're getting exactly one Ethernet MTU's worth of payload per call to read().
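The one-packet-per-call behaviour can be reproduced without root privileges. The sketch below is only an illustration: it uses an AF_UNIX datagram socketpair as a stand-in for the packet socket (an assumption, chosen so it runs unprivileged), and the helper name first_read_size is made up for this example. Two datagrams are queued, yet a single read() with a buffer big enough for both returns only the first.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: sends two datagrams of len1 and len2 bytes over an
 * AF_UNIX datagram socketpair, then reads once with a buffer large enough
 * for both. Returns what that single read() returned, or -1 on error. */
ssize_t first_read_size(size_t len1, size_t len2)
{
    int sv[2];
    char out[2048] = {0};
    char in[8192];
    ssize_t n = -1;

    if (len1 > sizeof(out) || len2 > sizeof(out))
        return -1;
    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;

    if (write(sv[0], out, len1) == (ssize_t)len1 &&
        write(sv[0], out, len2) == (ssize_t)len2)
        n = read(sv[1], in, sizeof(in));   /* delivers one datagram only */

    close(sv[0]);
    close(sv[1]);
    return n;
}
```

The second datagram stays queued until the next read(), which is why the loop suggested above is needed to see the rest of the data.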

read() returns:
On success, the number of bytes read is returned (zero indicates end of
file), and the file position is advanced by this number. It is not an
error if this number is smaller than the number of bytes requested;
this may happen for example because fewer bytes are actually available
right now (maybe because we were close to end-of-file, or because we
are reading from a pipe, or from a terminal), or because read() was
interrupted by a signal. On error, -1 is returned, and errno is set
appropriately. In this case it is left unspecified whether the file
position (if any) changes.
You can try to use recv() with MSG_WAITALL instead of pure read():
This flag requests that the operation block until the full
request is satisfied. However, the call may still return less
data than requested if a signal is caught, an error or disconnect occurs, or the next data to be received is of a different
type than that returned.
len = recv(fd, buf, sizeof(buf), MSG_WAITALL);
Another way is to read or recv in a loop like:
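ssize_t Recv(int fd, void *buf, ssize_t n)
{
    ssize_t total = 0;   /* bytes received so far; avoids shadowing read() */
    ssize_t r;

    while (total != n)
    {
        r = recv(fd, (char *)buf + total, n - total, 0);
        if (r == -1)
            return total ? total : -1;   /* error: report partial data if any */
        if (r == 0)
            return 0;                    /* peer closed the connection */
        total += r;
    }
    return total;
}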

Related

read long message from client

I am trying to read a long message from the client then print to the server stdout, but when I run the code, the length of the data that is read from client is different almost every time...
I also tried using malloc, but the result was same. I'm really wondering why...
The client code is well made, the problem seems to be on the server side.
Here's the relevant part of the code:
char buf[MAX]; //MAX=1024;
memset(buf, '\0', sizeof(buf));
size_t b;
while ((b = read(connect_fd, buf, MAX - 1)) > 0) {
    buf[b] = '\0';
    printf("%s", buf);
    fflush(stdout);
    write(connect_fd, buf, strlen(buf));
    memset(buf, '\0', MAX);
}
This read loop seems OK, there could be problems somewhere else, in the server code or in the client code. Here are a few pointers to potential problems:
There is no need to clear the array with memset().
b should be defined as ssize_t to detect read errors and avoid undefined behavior if read() returns -1.
setting the null terminator is not strictly needed: you can use printf("%.*s", (int)b, buf); or fwrite(buf, 1, b, stdout);.
when writing to connect_fd, use b instead of strlen(buf), which stops at the first null byte in the data.
make sure the client flushes its output to the socket.

saving the head of received buffer after using recv()

I am trying to save part of a received buffer in a char* variable, but it is not working.
I am using:
recv(sock,buff,BUFLEN,0);
char *head=NULL;
head= (char *) malloc (16);
strncpy (head,buff,16);
How do I save part of it (let's say the first 4 bytes) into char* head?
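One possible approach, offered as a sketch rather than a definitive answer: check how many bytes recv() actually returned, then copy with memcpy(), because strncpy() stops at the first null byte and received data is not necessarily a C string. The helper name recv_head and its parameters are made up for this example.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: receives once on sock into buff and returns a
 * malloc'd copy of the first head_len bytes, or NULL if recv() failed,
 * returned fewer bytes than head_len, or malloc() failed.
 * The copy is raw bytes, not a null-terminated string; the caller frees it. */
char *recv_head(int sock, char *buff, size_t buflen, size_t head_len)
{
    ssize_t n = recv(sock, buff, buflen, 0);
    if (n < 0 || (size_t)n < head_len)
        return NULL;

    char *head = malloc(head_len);
    if (head != NULL)
        memcpy(head, buff, head_len);   /* memcpy does not stop at '\0' */
    return head;
}
```

For the 4-byte case, a call such as recv_head(sock, buff, BUFLEN, 4) would return the copy; allocate one extra byte and add a terminator yourself if the head needs to be printed as a string.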

How to find which socket descriptor had become invalid in fdset

On the server side, select() on readfds fails with a bad file descriptor error. How can I find which of the fds in the fd_set has become invalid?
Usually, when a connection on the other side is closed or an RST segment is sent, select returns and marks the corresponding descriptors as ready for read. When you subsequently perform read/recv from them, an error or EOF is returned.
You might also try the strace tool (if available) for debugging. It will help you keep track of which descriptors are passed to select and which descriptors read/recv are called with.
You can check pending error on a socket with the following function:
int get_socket_error( int s ) {
    int error;
    socklen_t len = sizeof( error );
    if ( getsockopt( s, SOL_SOCKET, SO_ERROR, &error, &len ) < 0 )
        error = errno;
    return error;
}
But as @Maxim says, having EBADF returned from select(2) is usually an indication of sloppy coding.
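To answer the literal question of which descriptor is invalid, one common technique is to probe each member of the set with fcntl(fd, F_GETFD), which fails with EBADF for a descriptor that is not open. The sketch below assumes you pass the set as you built it before calling select(); the function name find_bad_fd is made up for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>

/* Hypothetical helper: scans descriptors 0..maxfd that are members of *set
 * and returns the first one the kernel reports as invalid (EBADF),
 * or -1 if every member is a valid open descriptor. */
int find_bad_fd(fd_set *set, int maxfd)
{
    for (int fd = 0; fd <= maxfd; fd++) {
        if (!FD_ISSET(fd, set))
            continue;
        if (fcntl(fd, F_GETFD) == -1 && errno == EBADF)
            return fd;
    }
    return -1;
}
```

This only tells you which descriptor was closed. The underlying fix is still to remove a descriptor from the set at the point where it is closed.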

Determining the number of bytes ready to be recv()'d

I can use select() to determine if a call to recv() would block, but once I've determined that there are bytes to be read, is there a way to query how many bytes are currently available before I actually call recv()?
If your OS provides it (and most do), you can use ioctl(..,FIONREAD,..):
int get_n_readable_bytes(int fd) {
    int n = -1;
    if (ioctl(fd, FIONREAD, &n) < 0) {
        perror("ioctl failed");
        return -1;
    }
    return n;
}
Windows provides an analogous ioctlsocket(..,FIONREAD,..), which expects a pointer to unsigned long:
unsigned long get_n_readable_bytes(SOCKET sock) {
    unsigned long n = 0;
    /* ioctlsocket returns 0 on success and SOCKET_ERROR on failure */
    if (ioctlsocket(sock, FIONREAD, &n) != 0) {
        /* look in WSAGetLastError() for the error code */
        return 0;
    }
    return n;
}
The ioctl call should work on sockets and some other fds, though not on all fds. I believe that it works fine with TCP sockets on nearly any free unix-like OS you are likely to use. Its semantics are a little different for UDP sockets: for them, it tells you the number of bytes in the next datagram.
The ioctlsocket call on Windows will (obviously) only work on sockets.
No, a protocol needs to determine that. For example:
If you use fixed-size messages then you know you need to read X bytes.
You could read a message header that indicates X bytes to read.
You could read until a terminal character / sequence is found.
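As an illustration of the second option, here is a minimal sketch of reading a length-prefixed message. The framing (a 4-byte big-endian length followed by the payload) and the names recv_exact and recv_message are assumptions made for this example, not part of any particular protocol.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Receives exactly n bytes unless EOF or an error occurs first.
 * Returns 0 on success, -1 otherwise. */
static int recv_exact(int fd, void *buf, size_t n)
{
    char *p = buf;
    while (n > 0) {
        ssize_t r = recv(fd, p, n, 0);
        if (r <= 0)
            return -1;
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Reads one message framed as a 4-byte big-endian length followed by that
 * many payload bytes. Returns the payload length, or -1 on error, EOF, or
 * if the payload would not fit in buf. */
ssize_t recv_message(int fd, void *buf, size_t bufsize)
{
    uint32_t netlen;
    if (recv_exact(fd, &netlen, sizeof(netlen)) == -1)
        return -1;

    uint32_t len = ntohl(netlen);
    if (len > bufsize)
        return -1;
    if (recv_exact(fd, buf, len) == -1)
        return -1;
    return (ssize_t)len;
}
```

With this kind of framing the receiver never needs to ask how many bytes are pending: the header says how many to wait for, and any bytes that follow belong to the next message.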

Using memcpy/memset

When using memset or memcpy within an Obj-C program, will the compiler optimise the setting (memset) or copying (memcpy) of data into 32-bit writes or will it do it byte by byte?
You can see the libc implementations of these functions in the Darwin source. In 10.6.3, memset works at the word level. I didn't check memcpy, but it is probably the same.
You are correct that it's possible for the compiler to do the work inline instead of calling these functions. I suppose I'll let someone who knows better answer what it will do, though I would not expect a problem.
memset comes as part of your standard C library, so it depends on the implementation you are using. I would guess most implementations copy in blocks of the native word size (32/64 bits) and then handle the remainder byte by byte.
Here is glibc's version of memcpy for an example implementation:
void *
memcpy (dstpp, srcpp, len)
     void *dstpp;
     const void *srcpp;
     size_t len;
{
  unsigned long int dstp = (long int) dstpp;
  unsigned long int srcp = (long int) srcpp;

  /* Copy from the beginning to the end. */

  /* If there not too few bytes to copy, use word copy. */
  if (len >= OP_T_THRES)
    {
      /* Copy just a few bytes to make DSTP aligned. */
      len -= (-dstp) % OPSIZ;
      BYTE_COPY_FWD (dstp, srcp, (-dstp) % OPSIZ);

      /* Copy whole pages from SRCP to DSTP by virtual address manipulation,
         as much as possible. */
      PAGE_COPY_FWD_MAYBE (dstp, srcp, len, len);

      /* Copy from SRCP to DSTP taking advantage of the known alignment of
         DSTP. Number of bytes remaining is put in the third argument,
         i.e. in LEN. This number may vary from machine to machine. */
      WORD_COPY_FWD (dstp, srcp, len, len);

      /* Fall out and copy the tail. */
    }

  /* There are just a few bytes to copy. Use byte memory operations. */
  BYTE_COPY_FWD (dstp, srcp, len);

  return dstpp;
}
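So you can see that it copies a few bytes first to get aligned, then copies in words, and finally copies the remaining tail byte by byte. In between, PAGE_COPY_FWD_MAYBE lets a platform copy whole pages by virtual address manipulation, as the comment says, where that is supported.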