Sending and receiving data over Internet - sockets

This question is not about a concrete implementation. It is more about the concept and design of sending information over the Internet with some kind of protocol, either TCP or UDP. I know only that sockets are needed, but I am wondering about the rest. For example, after a connection is made you send the information through it, but how does the other end listen on a specific port, and does it listen constantly?
Is listening done in a background thread waiting for information to be received? (In order to be able to do other things/processing while waiting for information)
So in essence, I think a real world example of how such an application works on a high level would be enough to explain the data flow. For example sending files in Skype or something similar.
P.S. Most other questions on similar topics are about a concrete implementation or a bug that someone has.

What I currently do in an application is the following using POSIX sockets with the TCP Protocol:
The most important thing: most of these functions are blocking. So when you tell your server to wait for a client connection, the function blocks until a connection is established (if you need a server that handles multiple clients at once, you need threading or non-blocking I/O!)
The server listens on a specific port until a client connects. After the connect, you get a new socket file descriptor to communicate with the client, while the initial socket keeps listening for new connections. My server then creates a new thread to handle that client while waiting for new connections on the initial socket. In the new thread, the server waits for a request command from the client (e.g. Request Login Token). After a request is received, the server gathers its information, packs it using Google's Protocol Buffers, and sends it to the client. The client then either tells the server to terminate the session (if it has received all the data it needs) or sends another request.
That's basically the idea in my server. The bigger problem is the way you transmit and receive data. E.g. you can't send structs or classes (at least not in C++) directly over the wire; you need some kind of serializer, and you have to make sure the other side knows how much to receive. So what I do is first send a 4-byte integer over the wire containing the size of the incoming package, then send the package itself using a serializer (in my case Google's Protocol Buffers). The other side waits for 4 bytes to be available, knowing that this will be the size of the incoming package. After the 4 bytes are received, the program waits for exactly that amount of data to be available on the socket, then reads the data out of the buffer and deserializes it. When the socket does not receive data for 30 seconds, a timeout is triggered and the connection is terminated.
What you always need to be aware of is the endianness of the systems. A big endian system (e.g. PowerPC) and a little endian system (e.g. x86) will have problems when you send an integer directly over the wire. For example, the 32-bit integer 1 is stored as the bytes
01 00 00 00
on the x86, and as
00 00 00 01
on the PowerPC, so bytes copied unchanged from the x86 are read as 16777216 instead of 1 on the PowerPC. So you should always use functions like htonl and ntohl, which convert from host byte order to network byte order and back (network byte order is always big endian).
Hope this kind of helps. I could also provide some code to you if that would help.

Related

Does it make sense to use RTP protocol for multiple streamers and single receiver?

I am in the process of learning and trying to use the RTP/RTCP protocol. My situation is that there are 1 to n streamers and 1 (or potentially 1 to m if needed) receiver(s), but in a way that the streamers themselves do not know about each other (they cannot directly, for technical reasons such as different networks, limited bandwidth, etc.). So it is more like multiple unicast sessions, but the receiver actually knows about them all and collects data from all of them; it is just that the senders do not know about each other.
Now reading about the protocol, it seems to me that a huge portion of it is related to sending feedback, collision detection, and so on. So I have doubts: is RTP really applicable in this case? Is it already used in this way somewhere?
It seems to me it is still beneficial to collect the statistics about data transfer that RTP provides (data sent, loss, times, etc.), it just feels like most of the protocol is sort of left out...
I also have one additional question: going through the various RTP libraries, they all assume that the sender will also open ports for receiving RTP/RTCP data. Does RTP forbid one-way communication? I mean an application that would only stream the data, not expecting to receive anything back. The libraries (e.g. ccRTP) seem to assume two-way communication only...

RTCP is the protocol that provides statistics. The stream receiver (client) will send stats to the sender (server) via RTCP. I don't believe the client will get any statistic reports from the server.
There's nothing wrong with a single client receiving multiple unicast sessions from various servers.
RTP requires two-way communication during the setup process (typically handled by a control protocol such as RTSP). Once setup is complete and the play command is sent, it is mostly one way. The exception is the "keep alive" packets that must be sent to the server periodically (usually every 60 seconds or so) to keep the stream going. The exact timeout value is sent to the client during the setup process.
But if you implement your own RTP, there's nothing stopping you from having the server send the stream continuously without any feedback from the client. Basically it would be implementing an infinite timeout value.
You can read about all the details in the spec: RTP: A Transport Protocol for Real-Time Applications

how to find that the client is reading from tcp buffer in go

I've been using Go for quite some time for a project. In my project I have to implement a TCP server which responds to TCP clients. The server has to send a number of messages to a client.
The problem is that when the server writes a message to a client connection, it has to wait until the client has read that message from the buffer before sending another message (the server has to wait until the client calls the reader.ReadString('\n') method).
In my server code I wrote:
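for {
    // Take the next queued message for this client.
    data := <-client.outgoing
    client.writer.WriteString(data + "\n")
    // Flush hands the bytes to the OS; it does not wait for the client to read them.
    client.writer.Flush()
}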
but the server sends all the messages to the client without waiting for ReadString in the client.
How can I make the server wait until the client has read a message before sending the next one?

I think that either the assignment is ambiguous or you're misinterpreting it and solving the XY problem.
The short answer is that you can never know whether the client has read a message just by looking at the TCP conversation. You have to implement this "protocol" in your application.
Here are a few problems:
From your application you don't really have access to what TCP is doing. You get a stream on which you can perform I/O.
The fact that a write to your stream "succeeds" only means that TCP has agreed to try to transport your stuff and has an independent copy. It doesn't say anything about whether the data has been received, and it doesn't even mean the data has been sent.
You may find certain mechanisms to peer into TCP's inner workings (such as the SIOCINQ and SIOCOUTQ ioctls or various socket options), but these won't help.
Even if you find out what your TCP is doing, this only tells you what the remote TCP is doing. So if you have full control over your TCP and even see the acknowledgments from the peer, you still don't know what the application is doing. It's very possible the application didn't read the data yet (it might not have requested the data, the TCP might be withholding it in a buffer for some weird reason, the scheduler might not have scheduled the remote process etc.)
Going back to your question, a way to really know whether the remote application has received your message is to have the remote application tell you. This means you have to restructure your protocol to:
1. Send stuff from the server
2. Wait for a message from the application telling you it received your stuff
3. Send more stuff (because you know from point 2 it's safe to do so)

Game server TCP networking sockets - fairness

I'm writing a game server for a turn-based game. One criterion is that the game needs to be as fair for all players as possible.
So far it works like this:
Each client has a TCP connection. (If relevant, the connection is opened via WebSockets)
While running, continually check for incoming socket messages via epoll.
Iterate through clients with sockets ready to read:
Read all messages from the client.
Update the internal game state for each message.
Queue outgoing messages to affected clients.
At the end of each "window" (turn):
Iterate through clients and write all queued outgoing messages to their sockets
My concern for fairness raises the following questions:
Does it matter in which order I send messages to the clients?
Calling write() on all the sockets takes only a fraction of a second for my program, but somewhere in the underlying OS or networking would it make a difference if I sorted the client list?
Perhaps I should be sending to the highest-latency clients first?
Does it matter how I write the outgoing messages to the sockets?
Currently I'm writing them as one large chunk. The size can exceed a single packet.
Would it be faster for the client to begin its processing if I sent messages in smaller chunks than 1 packet?
Would it be better to write 1 packet worth to each client at a time, and iterate over the clients multiple times?
Are there any linux/networking configurations that would bear impact here?
Thanks in advance for your feedback and tips.

Does it matter in which order I send messages to the clients?
Yes, by fractions of milliseconds. If the network interface is available for sending the OS will immediately start sending. Why would it wait?
Perhaps I should be sending to the highest-latency clients first?
I think you should be sending in random order. Shuffle the list prior to sending. This makes it fair. I think your question is valid and this should be addressed.
Currently I'm writing them as one large chunk. [...]
First, realize that TCP is stream-based and that there are no packets/messages at the protocol level. On a physical level data is indeed packetized.
It is not necessary to manually split off packets because clients will read data as it arrives anyway. If a client issues a read, that read will complete immediately once the first packet has arrived. There is no artificial waiting in the OS.
Are there any linux/networking configurations that would bear impact here?
I don't know. Be sure to disable Nagle's algorithm (the TCP_NODELAY socket option).

TCP Sockets: "Rollback" after timeout occurred

This is a rather general question about TCP sockets. I got a client/server application setup where messages are sent over the wire via TCP. The implementation is done via C++ POCO, however the question is not related to a certain technology.
A message can be a request (initiated by the client) or a response (initiated by the server).
A request has the structure:
Message Header
Request Header
Parameters
A response has the structure
Message Header
Response Header
Parameters
I know TCP guarantees that sent packages will be delivered in the order they have been sent. However, nothing can be assumed about the timespan a delivery might need.
On both sides I have a read/send timeout configured. Now I wonder how to have a clean set up on the transmitted data after a timeout. Don't know how to express this in the right terms, so let me describe an example:
Server S sends a response to the client (Message Header, Response Header, Parameters are put into the stream)
Client C receives the message header partially (e.g. the first 4 bytes of 12)
After these 4 bytes have been received, the reception timeout occurs
On client-side, an appropriate exception is thrown, the reception will be stopped.
The client considers the package as invalid.
Now the problem is, when the client tries to receive another package, he might receive the lasting part of the "old" response message header. From the point of view of the currently processed transaction (send request/get response), the client receives garbage.
So it seems that after a timeout has occurred (no matter whether it has been on client or server-side), the communication should continue with a "clean setup", meaning that none of the communication partners will try to send some old package data and that no old package data is stored within the stream buffer of the respective socket.
So how are such situations commonly handled? Is there some kind of design pattern / idiomatic way to solve this?
How are such situations handled within other TCP-based protocols, e.g. HTTP?
In all the TCP samples around the net I've never seen an implementation that deals with this kind of problem...
Thank you in advance

when the client tries to receive another package, he might receive the lasting part of the "old" response message header
He will receive the rest of the failed message, if he receives anything at all. He can't receive anything else, and specifically data that was sent later can't be received before or instead of data that was sent earlier. It is a reliable byte-stream. You can code accordingly.
the communication should continue with a "clean setup", meaning that none of the communication partners will try to send some old package data
You can't control that. If a following message has been written to the TCP socket send buffer, which is all that send() actually does, it will be sent, and there is no way of preventing it short of resetting the connection.
So you either need to code your client to cope with the entire bytestream as it arrives or possibly close the connection on a timeout and start again.

Will a TCP RST cause a host to drop the receive buffer?

Upon receiving a TCP RST packet, will the host drop all the remaining data in the receive buffer that has already been ACKed by the remote host but not read by the application process using the socket?
I'm wondering if it's dangerous to close a socket as soon as I'm not interested in what the other host has to say anymore (e.g. to conserve resources); e.g. if that could cause the other party to lose any data I've already sent, but he has not yet read.
Should RSTs generally be avoided and indicate a complete, bidirectional failure of communication, or are they a relatively safe way to unidirectionally force a connection teardown as in the example above?

I've found some nice explanations of the topic; they indicate that data loss is quite possible in that case:
http://blog.olivierlanglois.net/index.php/2010/02/06/tcp_rst_flag_subtleties
http://blog.netherlabs.nl/articles/2009/01/18/the-ultimate-so_linger-page-or-why-is-my-tcp-not-reliable also gives some more information on the topic, and offers a solution that I've used in my code. So far, I've not seen any RSTs sent by my server application.

An application-level close(2) on a socket does not produce an RST but a FIN packet sent to the other side, which results in the normal four-way connection tear-down. RSTs are generated by the network stack in response to packets targeting a non-existent TCP connection.
On the other hand, if you close the socket but the other side still has some data to write, its next send(2) will result in EPIPE.
With all of the above in mind, you are much better off designing your own protocol on top of TCP that includes an explicit "logout" or "disconnect" message.
