TCP socket question - sockets

I started learning the TCP protocol from the internet and have been doing some experiments. I read an article at http://www.diffen.com/difference/TCP_vs_UDP which says:
"TCP is more reliable since it manages message acknowledgment and retransmissions in case of lost parts. Thus there is absolutely no missing data."
Then I did my experiment: I wrote a block of code with a TCP socket:
while( !EOF(file) )
{
    data = read_from(file, 5KB); // read 5KB from file
    write(data, socket);         // write data to socket to send
}
I thought this would be fine because "TCP is reliable" and it "retransmits lost parts"... But it is not fine at all. A small file is OK, but when it comes to about 2MB, it sometimes works but not always...
Now I try another one:
while( !EOF(file) )
{
    wait_for_ACK(); // or sleep 5 seconds
    data = read_from(file, 5KB); // read 5KB from file
    write(data, socket);         // write data to socket to send
}
Now it works...
All I can think of is that the first one fails because of:
1. Buffer overflow on the sender, because the sending rate is slower than the rate at which the program writes (the sending rate is controlled by TCP).
2. Maybe the sending rate is greater than the writing rate, but some packets get lost (after some retransmissions it still fails, and then TCP gives up...).
Any ideas?
Thanks.

TCP will ensure that you don't lose data, but you should check how many bytes actually got accepted for transmission... The typical loop is:
while (size > 0)
{
    int sz = send(socket, bufptr, size, 0);
    if (sz == -1) {
        /* error: inspect errno, then abort or retry as appropriate */
        break;
    }
    size -= sz;
    bufptr += sz;
}
When the send call accepts some data from your program, it becomes the job of the OS to get it to the destination (including retransmission). But the send buffer may be smaller than the amount you need to send, and that's why the resulting sz (the number of bytes accepted for transmission) may be less than size.
It's also important to consider that sending is asynchronous: after the send function returns, the data is not already at the destination; it has only been handed to the TCP transport system for delivery. If you want to know when it has been received, you'll have to use some other mechanism (e.g. a reply message from your counterpart).
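As a minimal sketch of that last idea, assuming the peer sends back a single acknowledgment byte after it has processed each chunk (this chunk-and-ack convention is an application-level choice invented here for illustration, not something TCP provides):
#include <sys/types.h>
#include <sys/socket.h>

/* Send one chunk completely, then block until the peer replies
   with a 1-byte application-level ack. Returns 0 on success. */
static int send_chunk_and_wait_ack(int sock, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t sz = send(sock, buf, len, 0);
        if (sz < 0)
            return -1;                  /* send error */
        buf += sz;
        len -= (size_t)sz;
    }
    char ack;
    ssize_t r = recv(sock, &ack, 1, 0); /* wait for the peer's reply */
    return (r == 1) ? 0 : -1;           /* 0/-1 from recv: closed or error */
}
This is essentially what the asker's wait_for_ACK() variant above was doing, made explicit.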

You have to check the return value of write(socket) to make sure it wrote as much as you asked.
Loop until you've sent everything or you've hit a timeout.
Do not use indefinite timeouts on socket read/write. You're asking for trouble if you do, especially on Windows.
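For example, here is a rough sketch of a bounded wait on the write side using select(2); the 5-second timeout is an arbitrary value chosen for illustration:
#include <sys/select.h>
#include <sys/socket.h>

/* Wait up to 5 seconds for the socket to become writable, then send.
   Returns bytes sent, 0 if we timed out, -1 on error. */
static ssize_t send_with_timeout(int sock, const void *buf, size_t len)
{
    fd_set wfds;
    struct timeval tv = { .tv_sec = 5, .tv_usec = 0 };

    FD_ZERO(&wfds);
    FD_SET(sock, &wfds);

    int ready = select(sock + 1, NULL, &wfds, NULL, &tv);
    if (ready <= 0)
        return ready;   /* 0 = timed out, -1 = select error */
    return send(sock, buf, len, 0);
}
The caller still has to loop over partial sends, as shown in the loop above.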

TCP connection and a different buffer size for a client and a server

What will happen if I establish a connection between a client and a server and configure a different buffer size for each of them?
This is my client's code:
import socket,sys
TCP_IP = sys.argv[1]
TCP_PORT = int(sys.argv[2])
BUFFER_SIZE = 1024
MESSAGE = "World! Hello, World!"
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((TCP_IP, TCP_PORT))
s.send(MESSAGE)
data = s.recv(BUFFER_SIZE)
s.close()
print "received data:", data
Server's code:
import socket,sys
TCP_IP = '0.0.0.0'
TCP_PORT = int(sys.argv[1])
BUFFER_SIZE = 5
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((TCP_IP, TCP_PORT))
s.listen(1)
while True:
    conn, addr = s.accept()
    print 'New connection from:', addr
    while True:
        data = conn.recv(BUFFER_SIZE)
        if not data: break
        print "received:", data
        conn.send(data.upper())
    conn.close()
Does that mean I will be limited to only 5 bytes, so I won't be able to receive the full message and will lose the remaining 1024 - 5 bytes?
Or does it mean I can only receive chunks of 5 bytes, so instead of receiving one packet of 1024 bytes as the client sent it, I'll have to divide 1024 by 5 and get 204.8 packets (?), which sounds impossible.
What in general is happening in this code?
Thanks.
Your arguments are based on the assumption that a single send should match a single recv. But this is not the case. TCP is a byte stream, not a message-based protocol. All that matters are the transferred bytes, and it does not matter whether one or ten recv calls are needed to read 50 bytes.
Apart from that, send is not guaranteed to send the full buffer either. It might send only part of the buffer, i.e. the sender actually needs to check the return code to find out how much of the given buffer was sent now and how much needs to be retried later.
And note that the underlying "packet" is again a different thing. A send of 2000 bytes will usually need multiple packets (depending on the maximum transmission unit of the underlying data link layer). But this does not mean that one also needs multiple recv calls. If all 2000 bytes have already been transferred into the OS-level receive buffer at the recipient, they can all be read at once, even though they traveled in multiple packets.
Your socket won't lose the remaining 1024 - 5 (1019) bytes. They are simply kept in the socket's receive buffer, ready to be read! So all you need to do is read from the socket again. You decide the size of the buffer you read into, and you are not limited to 5 bytes in total; you are only limiting each single read to 5 bytes. So to read 1024 bytes you would read 204 times of 5 bytes each, plus one final read that returns the remaining 4 bytes. Once the buffer is drained, a further recv simply blocks until more data arrives (or returns an empty string when the peer closes the connection).

How much data does recv() return from a socket after blocking? [duplicate]

The recv() library function man page mentions that:
It returns the number of bytes received. It normally returns any data available, up to the requested amount, rather than waiting for receipt of the full amount requested.
If we are using a blocking recv() call and request 100 bytes:
recv(sockDesc, buffer, size, 0); /* Where size is 100. */
and only 50 bytes are sent by the server, does this recv() block until 100 bytes are available, or does it return after receiving 50 bytes?
The scenario could be that:
the server crashes after sending only 50 bytes
bad protocol design where the server sends only 50 bytes while the client expects 100, and the server is also waiting for the client's reply (i.e. the server has not initiated a socket close, which would otherwise make recv return)
I am interested in the Linux / Solaris platforms. I don't have a development environment in which to check this myself.
recv will return when there is data in the internal buffers to return. It will not wait until there are 100 bytes if you request 100 bytes.
If you're sending 100-byte "messages", remember that TCP does not provide messages, it is just a stream. If you're dealing with application-level messages, you need to handle the framing at the application layer; TCP will not do it for you.
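As a sketch of one common framing scheme - a 4-byte big-endian length prefix, which is an application-level convention, not anything TCP gives you - the receiving side might look like this:
#include <arpa/inet.h>   /* ntohl */
#include <stdint.h>
#include <sys/socket.h>

/* Read exactly n bytes, looping over short reads. Returns 0 on success. */
static int recv_exact(int sock, void *buf, size_t n)
{
    char *p = buf;
    while (n > 0) {
        ssize_t r = recv(sock, p, n, 0);
        if (r <= 0)
            return -1;   /* error, or connection closed mid-message */
        p += r;
        n -= (size_t)r;
    }
    return 0;
}

/* Read one length-prefixed message into buf (at most cap bytes).
   Returns the message length, or -1 on error. */
static ssize_t recv_message(int sock, void *buf, size_t cap)
{
    uint32_t netlen;
    if (recv_exact(sock, &netlen, sizeof netlen) < 0)
        return -1;
    uint32_t len = ntohl(netlen);
    if (len > cap)
        return -1;       /* message too large for caller's buffer */
    if (recv_exact(sock, buf, len) < 0)
        return -1;
    return (ssize_t)len;
}
The sender would prepend htonl(len) to each message in the same way.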
There are many, many conditions where a send() call of 100 bytes might not be read fully on the other end with a single recv(..., 100) call; here are just a few examples:
The sending TCP stack decided to bundle together 15 write calls, and the MTU happened to be 1460, which - depending on the timing of the arriving data - might cause the client's first 14 calls to fetch 100 bytes each and the 15th call to fetch 60 bytes; the last 40 bytes will come the next time you call recv(). (And if you call recv with a buffer of 100, you might get the last 40 bytes of the prior application "message" and the first 60 bytes of the next message.)
The sender's buffers are full, perhaps because the reader is slow or the network is congested. At some point data gets through, and while the buffers empty, the last chunk of data isn't a multiple of 100.
The receiver's buffers are full; while your app recv()s that data, the last chunk it pulls up is only partial, since the whole 100 bytes of that message didn't fit into the buffers.
Many of these scenarios are rather hard to test, especially on a LAN where you might not have much congestion or packet loss - things may change as you ramp the rate at which messages are sent/produced up and down.
Anyway, if you want to read exactly 100 bytes from a socket, use something like:
int
readn(int f, void *av, int n)
{
    char *a;
    int m, t;

    a = av;
    t = 0;
    while (t < n) {
        m = read(f, a + t, n - t);
        if (m <= 0) {
            if (t == 0)
                return m;
            break;
        }
        t += m;
    }
    return t;
}
...
if (readn(mysocket, buffer, BUFFER_SZ) != BUFFER_SZ) {
    // something really bad is going on.
}
The behavior is determined by two things: the receive low-water mark, and whether or not you pass the MSG_WAITALL flag. If you pass this flag, the call blocks until the requested number of bytes has been received, the connection is closed, or an error occurs. Otherwise it returns as soon as at least SO_RCVLOWAT bytes are available in the socket's receive buffer.
SO_RCVLOWAT
Sets the minimum number of bytes to process for socket input operations. The default value for SO_RCVLOWAT is 1. If SO_RCVLOWAT is set to a larger value, blocking receive calls normally wait until they have received the smaller of the low water mark value or the requested amount. (They may return less than the low water mark if an error occurs, a signal is caught, or the type of data next in the receive queue is different from that returned, e.g. out-of-band data.) This option takes an int value. Note that not all implementations allow this option to be set.
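As a small sketch of the MSG_WAITALL variant (with error handling reduced to the bare minimum):
#include <sys/socket.h>

/* Block until exactly `size` bytes have arrived, the peer closes
   the connection, or an error occurs. Returns recv()'s result. */
static ssize_t recv_all(int sock, void *buf, size_t size)
{
    ssize_t n = recv(sock, buf, size, MSG_WAITALL);
    if (n >= 0 && (size_t)n < size) {
        /* short read: the connection closed early or a signal was caught */
    }
    return n;
}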
If you read the quote precisely, the most common scenario is:
the socket is receiving data; those 100 bytes will take some time to arrive.
the recv() call is made.
If there are more than 0 bytes in the buffer, recv() returns whatever is available and does not wait.
While there are 0 bytes available, it blocks, and the granularity of the threading system determines how long that is.

Handle multiple UDP clients

I have a single server and multiple UDP clients in my setup, each sending a boolean string ("true"/"false", 5 bytes max) at regular intervals (every few milliseconds). Based on this I determine whether a device is alive or not. The number of clients is unknown before starting the program, and I am using C++ as my programming language.
Is it possible to use multithreading with a UDP socket? I came across an example for TCP where they create a new socket descriptor per connection and spawn a thread. If it's possible to write a multithreaded UDP server, please provide example/reference code.
My receive loop currently looks like this:
while (true)
{
    if ((numbytes = recvfrom(sockfd, buf, MAXBUFLEN - 1, 0,
            (struct sockaddr *)&client_addr, &client_addr_len)) == -1)
    {
        perror("recvfrom");
        exit(1);
    }
    // Process buffer...
    // Do something with buffer
}
In a multiple-client scenario, when I receive a message from one client, it is copied into my local buffer (of maximum buffer size). If one more message arrives from another client while I am processing the first, where is that message stored in the meantime, before it is read into the local buffer? Does the socket file descriptor have its own buffer?
What happens if both clients send a message at the same time? Only one is read into the local buffer; what happens to the other message? Will it wait for the next recvfrom?
I read that if the maximum buffer is smaller than the received message / packet size, recvfrom will only read max-buffer-length bytes and might raise some error... although all of my clients send 5 bytes at most. If I set the maximum to 1024 bytes, which is far more than expected, what price am I going to pay?
Thank you.
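To sketch an answer (the handle_packet name below is made up for illustration): because UDP is connectionless, one socket serves all clients. Each recvfrom returns exactly one whole datagram together with the sender's address, and datagrams that arrive while you are busy queue up in the socket's OS-level receive buffer until it overflows, at which point new ones are silently dropped. Over-sizing your local buffer costs nothing beyond a little stack space, since recvfrom reports the actual datagram length. A single-threaded loop is therefore usually enough:
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>

#define MAXBUFLEN 1024  /* generous for 5-byte payloads; the slack is free */

/* Hypothetical handler: a real server might update a per-client
   "last seen" timestamp keyed by the sender's address. */
static void handle_packet(const char *buf, ssize_t len,
                          const struct sockaddr_in *from)
{
    printf("got %zd bytes: %.*s\n", len, (int)len, buf);
    (void)from;
}

static void serve(int sockfd)
{
    char buf[MAXBUFLEN];
    struct sockaddr_in client_addr;
    socklen_t client_addr_len;

    for (;;) {
        client_addr_len = sizeof client_addr;
        /* One datagram per call; messages from other clients wait
           in the kernel's receive buffer in the meantime. */
        ssize_t n = recvfrom(sockfd, buf, sizeof buf, 0,
                             (struct sockaddr *)&client_addr,
                             &client_addr_len);
        if (n == -1) {
            perror("recvfrom");
            exit(1);
        }
        handle_packet(buf, n, &client_addr);
    }
}
Threads only become interesting if processing a packet is slow; in that case the loop can push datagrams onto a queue consumed by worker threads, rather than spawning a socket per client as in the TCP example.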

How to write a proxy in go (golang) using tcp connections

I apologize beforehand if some of these questions are obvious to expert network programmers. I have researched and read about network programming, and it is still not clear to me how to do this.
Assume that I want to write a TCP proxy (in Go) between some TCP client and some TCP server. Something like this:
First, assume that these connections are semi-permanent (they will be closed after a long, long while) and that I need the data to arrive in order.
The idea that I want to implement is the following: whenever I get a request from the client, I want to forward that request to the backend server and wait (doing nothing) until the backend server responds to me (the proxy), and then forward that response to the client (assume that both TCP connections are maintained in the common case).
There is one main problem that I am not sure how to solve. When I forward the request from the proxy to the server and get the response, how do I know when the server has sent me all the information I need, if I do not know beforehand the format of the data being sent from the server to the proxy (i.e. I don't know if the response from the server uses a type-length-value scheme, nor do I know if `\r\n` indicates the end of the message from the server)? I was told that I should assume that I have received all the data from the server connection whenever my read from the TCP connection returns zero bytes or fewer bytes than I expected. However, this does not seem correct to me, for the following reason:
Assume that the server for some reason is writing to its socket only one byte at a time, but the total length of the response to the "real" client is much, much longer. Isn't it then possible that when the proxy reads the TCP socket connected to the server, the proxy reads only one byte, and if it loops fast enough (doing another read before more data arrives), reads zero and incorrectly concludes that it got the whole message the client was meant to receive?
One way to fix this might be to wait after each read from the socket, so that the proxy doesn't loop faster than it receives bytes. The reason I am worried is: assume there is a network partition and I can't talk to the server anymore, but it is not disconnected from me long enough to time out the TCP connection. Isn't it then possible that I try to read from the TCP socket to the server again (faster than I get data), read zero, and incorrectly conclude that it's all the data, and then send it back to the client? (Remember, the promise I want to keep is that I only write whole messages to the client connection. Thus, it would be incorrect behaviour for the proxy to read the connection again at a later time, after it has already written to the client, and send the missing chunk later, perhaps in the middle of the response to a different request.)
The code that I have written is in go-playground.
The analogy that I like to use to explain why I think this method doesn't work is the following:
Say we have a cup, and the proxy drinks half the cup every time it does a read from the server, but the server only puts in one teaspoon at a time. If the proxy drinks faster than it gets teaspoons, it might reach zero too soon and conclude that its socket is empty and that it's OK to move on! Which is wrong if we want to guarantee we are sending full messages every time. Either this analogy is wrong and some "magic" in TCP makes it work, or the algorithm that reads until the socket is empty is just plain wrong.
A question that deals with a similar problem here suggests reading until EOF. However, I am unsure why that would be correct. Does reading EOF mean that I got the intended message? Is an EOF sent each time someone writes a chunk of bytes to a TCP socket (i.e. I am worried that if the server writes one byte at a time, it sends one EOF per byte)? Or is EOF part of the "magic" of how a TCP connection really works? Does sending an EOF close the connection? If it does, it's not a method I want to use. Also, I have no control over what the server does (i.e. I do not know how often it writes to the socket, though it's reasonable to assume it writes with some "standard/normal" socket-writing behaviour). I am just not convinced that reading until EOF from the server socket is correct. Why would it be? When can I even read to EOF? Is EOF part of the data, or is it in the TCP header?
Also, regarding the idea of waiting just epsilon below the timeout: would that work in the worst case, or only on average? I was also thinking: if the Wait() call is longer than the timeout, then if you return to the TCP connection and it doesn't have anything, it's safe to move on. However, if it doesn't have anything and we don't know what happened to the server, we would time out anyway, so it's safe to close the connection (because the timeout would have done that). Thus, I think that if the Wait call is at least as long as the timeout, this procedure does work! What do people think?
I am also interested in an answer that can justify why this algorithm might work in some cases. For example, I was thinking: even if the server writes only one byte at a time, if the deployment scenario is a tight data centre, then on average, because delays are really small and the wait call is almost certainly enough, wouldn't this algorithm be fine?
Also, are there any risks of the code I wrote getting into a "deadlock"?
package main

import (
	"fmt"
	"net"
)

type Proxy struct {
	ServerConnection *net.TCPConn
	ClientConnection *net.TCPConn
}

func (p *Proxy) Proxy() {
	fmt.Println("Running proxy...")
	for {
		request := p.receiveRequestClient()
		p.sendClientRequestToServer(request)
		response := p.receiveResponseFromServer() // <-- worried about this one.
		p.sendServerResponseToClient(response)
	}
}

func (p *Proxy) receiveRequestClient() (request []byte) {
	// Assume this function is a black box and that it works.
	// Maybe we know that the messages from the client always end in \r\n,
	// or that they are length-prefixed.
	return
}

func (p *Proxy) sendClientRequestToServer(request []byte) {
	bytesSent := 0
	bytesToSend := len(request)
	for bytesSent < bytesToSend {
		n, _ := p.ServerConnection.Write(request[bytesSent:])
		bytesSent += n
	}
}

// Intended behaviour: waits until ALL of the response from the backend server
// is obtained. What it actually does: assumes that if it reads zero, the
// server has not yet written to the proxy, and therefore waits. Once the
// first bytes have been read, it keeps reading until it reads zero again and
// assumes the socket is "empty" (signaled by reading zero in the second loop).
func (p *Proxy) receiveResponseFromServer() (response []byte) {
	buf := make([]byte, 4096)
	n, _ := p.ServerConnection.Read(buf)
	for n == 0 {
		n, _ = p.ServerConnection.Read(buf)
	}
	response = append(response, buf[:n]...)
	for n != 0 {
		n, _ = p.ServerConnection.Read(buf)
		response = append(response, buf[:n]...)
		// Wait(n) could solve it here?
	}
	return
}

func (p *Proxy) sendServerResponseToClient(response []byte) {
	bytesSent := 0
	bytesToSend := len(response)
	for bytesSent < bytesToSend {
		n, _ := p.ClientConnection.Write(response[bytesSent:])
		bytesSent += n
	}
}

func main() {
	proxy := &Proxy{}
	proxy.Proxy()
}
Unless you're working with a specific higher-level protocol, there is no "message" to read from the client to relay to the server. TCP is a stream protocol, and all you can do is shuttle bytes back and forth.
The good news is that this is amazingly easy in Go, and the core part of this proxy will be:
go io.Copy(server, client)
io.Copy(client, server)
This is obviously missing error handling and doesn't shut down cleanly, but it clearly shows how the core data transfer is handled.

Socket programming Client Connect

I am working on client-server programming; I am referring to this link, and my server runs successfully.
I need to send data continuously to the server.
I don't want to connect() before sending each packet, so the first time I just create a socket and send the first packet; for the rest of the data I just use the write() function to write to the socket.
But my problem is that while sending data continuously, if the server goes away or my Ethernet is disabled, write() still successfully writes data to the socket.
Is there any method by which I can create the socket only once and send data continuously while still detecting server failure?
The main reason for doing it this way is that on the server side I am using a GPRS modem, and the modem hangs whenever connect() is called for each packet.
For creating the socket I use the code below:
Gprs_sockfd = socket(AF_INET, SOCK_STREAM, 0);
if (Gprs_sockfd < 0)
{
    Display("ERROR opening socket");
    return 0;
}

server = gethostbyname((const char *)ip_address);
if (server == NULL)
{
    Display("ERROR, no such host");
    return 0;
}

bzero((char *)&serv_addr, sizeof(serv_addr));
serv_addr.sin_family = AF_INET;
bcopy((char *)server->h_addr, (char *)&serv_addr.sin_addr.s_addr, server->h_length);
serv_addr.sin_port = htons(portno);

if (connect(Gprs_sockfd, (struct sockaddr *)&serv_addr, sizeof(serv_addr)) < 0)
{
    Display("ERROR connecting");
    return 0;
}
And each time I write to the socket using the code below:
n = write(Gprs_sockfd, data, length);
if (n < 0)
{
    Display("ERROR writing to socket");
    return 0;
}
Thanks in advance.............
TCP was designed to tolerate temporary failures. It does byte sequencing, acknowledgments and, if necessary, retransmissions. All unacknowledged data is buffered inside the kernel network stack. If I remember correctly the default is three retransmission attempts (somebody correct me if I'm wrong) with exponential back-off timeouts. That quickly adds up to dozens of seconds, if not minutes.
My suggestion would be to design application-level acknowledgments into your protocol, meaning the server would send a short reply saying how much data it has received so far, say once every second. If the client does not receive such an ack within, say, 3 seconds, the client knows the connection is unusable and can close it. By the way, this is easier done with non-blocking sockets and polling functions like select(2) or poll(2).
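A minimal sketch of that idea, assuming the server sends a 1-byte ack roughly every second (the 3-second deadline and the one-byte ack format are arbitrary choices for illustration):
#include <poll.h>
#include <sys/socket.h>

/* Wait up to 3 seconds for an application-level ack byte.
   Returns 1 if an ack arrived, 0 if the deadline passed, -1 on error. */
static int wait_for_ack(int sock)
{
    struct pollfd pfd = { .fd = sock, .events = POLLIN };

    int ready = poll(&pfd, 1, 3000 /* milliseconds */);
    if (ready <= 0)
        return ready;             /* 0 = no ack in time, -1 = poll error */

    char ack;
    ssize_t n = recv(sock, &ack, 1, 0);
    return (n == 1) ? 1 : -1;     /* recv of 0 means the peer closed */
}
If this returns 0 or -1, the client can close the socket and reconnect instead of writing into a dead connection.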
Edit 0:
I think this would be very relevant here - "The ultimate SO_LINGER page, or: why is my tcp not reliable".
Nikolai is correct here; the behaviour you are experiencing is desirable, since it basically lets you continue transferring data after a network outage without any extra logic in your application. If your application should detect outages longer than a specified amount of time, you need to add heartbeating to your protocol. This is the standard way of solving the problem. It can also let you detect the situation where the network is fine and the receiver is alive, but it has deadlocked (due to a software bug).
Heartbeating can be as simple as Nikolai mentioned -- sending a small packet every X seconds; if the server hasn't seen a packet for N*X seconds, the connection is considered dead and dropped.