How do sockets send data in packets in Erlang? - sockets

So I'm reading around trying to find out more about sockets, and I found that the documentation kind of repeats itself without going into any further detail about the way sockets send data:
So what does {packet, N} do when specified in the options (when opening a socket)? Does it add a header of N bytes before the data, or does it break the message into N packets? Or does it add an N-byte header to each of the packets that the message ends up being broken into?
I was reading Joe Armstrong's Software for a Concurrent World and I found this paragraph:
The word packet refers to the length of an application request or response message, not to the physical packet seen on the wire.
I can't get my head around the meaning of this. What is meant by the packet seen on the wire?
I tried to look into the documentation and found little about what the option does. The brief explanation that I found is that it prepends the message with an N-byte header; however, I also found this comment in some example code:
%% Usually, it's a good idea to give up in case of a
%% send timeout, as you never know how much actually
%% reached the server, maybe only a packet header?!
According to this, then, the message gets broken into pieces (a random number, I presume?) and each gets sent with an N-byte header.
My question is: how does the {header, N} option affect the way data is sent?

The excerpt from Joe's book you quote refers to the fact that applications are typically blissfully unaware of how networks arrange data for transmission. Depending on the network type, configuration, and protocols in use, the blocks of data sent over it will vary in size and will be framed with different metadata. Applications typically don't see raw packets, framing information, or the fact that sometimes raw packets are retransmitted due to problems that cause them to be dropped or corrupted.
Your question mentions two options: {packet, N} and {header, N}. These are quite different from each other.
The {packet, N} option allows N to be 1, 2, or 4. It attaches a header of N bytes to the front of the message. The header specifies, in network order, the length of the message, i.e. the number of bytes in the message.
If you send a message consisting of X bytes, Erlang will prepend to the data N bytes containing the network order value of X, and send the whole thing. Assuming the receiver has also been configured with the same {packet, N} option, it will read the N-byte header to determine how many bytes to expect, wait to receive that many bytes, and then deliver those bytes, without the length header, to the receiving application. How the underlying networking software and hardware breaks the data into chunks for transmission across the network is a separate matter hidden from your application.
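For example, here is a minimal sketch of both ends using {packet, 4} (the port and message are made up, and the two ends are shown together only for brevity); Erlang adds the 4-byte length header when sending and strips it when delivering:
%% Sender: Erlang prepends a 4-byte, network-order length header to each send.
{ok, Sock} = gen_tcp:connect("localhost", 5555,
                             [binary, {packet, 4}, {active, false}]),
ok = gen_tcp:send(Sock, <<"hello">>),   %% bytes on the wire: <<0,0,0,5,"hello">>

%% Receiver: with the matching {packet, 4} option, recv waits for one whole
%% message and delivers it without the length header.
{ok, LSock} = gen_tcp:listen(5555, [binary, {packet, 4}, {active, false}]),
{ok, Conn}  = gen_tcp:accept(LSock),
{ok, <<"hello">>} = gen_tcp:recv(Conn, 0).
Whether that 9-byte framed message then crosses the network as one TCP segment or several is decided by the TCP/IP stack, not by your code.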
The {header, Size} option delivers the message to the receiver as a list of Size bytes followed by the remainder of the data as a binary. This option makes sense only when the binary option is in effect for the socket.

Related

Question on UDP: Packet containing multiple messages and straddling

In the context of a UDP message that I receive: apart from the header containing the size of the overall packet, the second field is an unsigned int indicating a sequence number. The rest of the packet is the payload [the actual message(s)].
The payload contains zero or more messages, which need to be further decoded and parsed. Now there is the below requirement in the design.
"Messages can straddle packet boundaries."
What does this sentence mean in layman's terms?
This might be quite simple: a message is not the same thing as a packet, and a single message may be scattered across multiple packets.
Say, the first part of a message may reside in the first packet whilst the rest of it sits in the second.

application layer protocol - different size of packets

Assume I have defined my own application layer protocol on top of TCP for Instant Messaging. I have used a packet structure for the messages. As I am using symmetric (AES) and asymmetric (RSA) encryption, I obtain a different
packet size for different message types. Now to my questions.
How do I read from a socket so that I receive a single application-layer packet?
What size should I specify?
Thanks in advance.
I have two approaches in mind.
Read from the TCP stream a fixed number of bytes that represents the actual packet size, and then read that many bytes from the stream.
Read the maximal packet size from the stream. Verify the actual number of bytes obtained and decide from that which message type it was.
Now, a more general question. Should I provide metadata like the packet size, encryption method, receiver, sender, etc.? If yes, should I encrypt this metadata as well?
Remember that with TCP, when reading from the network, there is no guarantee about the number of bytes received at that point in time. That is, a client might send a full packet in its write(), but that does not mean that your read() will receive the same number of bytes. Thus your code will always need to read some number of bytes from the network, then verify (based on the accumulated data) that you have received the necessary number of bytes, and then you can verify the packet (type, contents, etc) from there.
Some applications use state machine encoders/decoders and fixed size buffers for reading/writing their network data; other applications dynamically allocate buffers large enough for the "full packet", then continue reading bytes from the network until the "full packet" buffer is full. Which approach you take depends on your application. Thus the size you use for reading is not as important as how your code ensures that it has received a full packet.
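To make the first approach concrete, here is a rough sketch in Erlang (the language used for all examples on this page; the function name and 4-byte prefix are illustrative, and the socket is assumed to be in passive, binary, raw mode). gen_tcp:recv/2 with a positive length keeps reading until exactly that many bytes have arrived, which is the "accumulate until you have a full packet" step done for you:
%% Read one length-prefixed application packet: a 4-byte network-order
%% length, followed by exactly that many payload bytes.
read_packet(Sock) ->
    {ok, <<Len:32>>} = gen_tcp:recv(Sock, 4),
    {ok, Payload}    = gen_tcp:recv(Sock, Len),
    {ok, Payload}.
The message type and any decryption can then be decided from Payload once it is complete.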
As for whether you should encrypt additional metadata, that depends very much on your threat model (i.e. what threats your protocol wants to guard against, what assurances your protocol needs to provide to its clients/users). There's no easy way to answer that question without more context/details.
Hope this helps!

Message delimitation in TCP communication

I am a newbie to networks and in particular TCP (I have been fooling around a bit with UDP, but that's it).
I am developing a simple protocol based on exchanging messages between two endpoints. Those messages need to be certified, so I implemented a cryptographic layer that takes care of that. However, while UDP has a sound definition of a packet that constitutes the minimum unit that can get transferred at a time, the TCP protocol (as far as my understanding goes) is completely stream oriented.
Now, this puzzles me a bit. When exchanging messages, how can I tell where one starts and the other one ends? In principle, I can obviously communicate fixed length messages or first communicate the size of each message in some header. However, this can be subject to attacks: while of course it is going to be impossible to distort or determine the content of the communication, the above technique would make it easy to completely disrupt my communication just by adding a single byte in the middle.
Say that I need to transfer a message 1234567 bytes long. First of all, I communicate 4 bytes with an integer representing the size of the message. Okay. Then I start sending out the actual message. That message gets split in several packets, which get separately received. Now, an attacker just sends in an additional packet, faking it as if it was part of the conversation. It can just be one byte long: this completely destroys any synchronization mechanism I have implemented! The message has a spurious byte in the middle, and it doesn't successfully get decoded. Not only that, the last byte of the first message disrupts the alignment of the second message and so on: the connection is destroyed, and with a simple, simple attack! How likely and feasible is this attack anyway?
So I am wondering: what is the maximum data unit that can be transferred at once? I understand that a call to send does not necessarily correspond to a single call to receive: the message can be split into different chunks. How can I group the packets together in some way so that I know that they get packed together? Is there a way to define a higher-level message that gets reconstructed and aligned all together and triggers a single call to a receive-like function? If not, what other solutions can I find to keep my communication re-alignable even in the presence of an attacker?
Basically, it is difficult to control the way the OS divides the stream into TCP packets (the RFC defining the TCP protocol states that the TCP stack should allow clients to force it to send buffered data using the push function, but it does not define how many packets this should generate; and in any case the attacker can modify any of them).
These TCP packets can get divided even further into IP fragments on their way through the network (which can be opted out of with the 'Don't Fragment' IP flag -- but that flag can cause your packets not to be delivered at all).
I think that your problem is not about introducing packets into a stream protocol, but about securing it.
IPSec could be very beneficial in your scenario, as it operates on the network layer.
It provides integrity for every packet sent, so any modification on-the-wire gets detected and the invalid packets are dropped. In case of TCP the dropped packets get re-transmitted automatically.
(Almost) everything is done automatically by the OS -- so you do not need to worry about it (and make mistakes doing so).
The confidentiality can be assured as well (with the same advantage of not re-inventing the wheel).
IPSec should provide you with a reliable transport protocol on top of which you can use whatever framing format you like.
Another alternative is using SSL/TLS on top of the TCP session, which is less robust (as it closes the whole connection on an integrity error).
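If you go the SSL/TLS route, in Erlang (the example language used throughout this page) it is roughly a matter of swapping gen_tcp for the ssl module; the host, port, and CA file path below are placeholders:
%% TLS client sketch: the ssl module gives you an encrypted, integrity-checked
%% byte stream; you still need your own message framing on top of it.
{ok, _} = application:ensure_all_started(ssl),
{ok, Sock} = ssl:connect("example.com", 443,
                         [binary, {active, false},
                          {verify, verify_peer},
                          {cacertfile, "/path/to/ca.pem"}]),
ok = ssl:send(Sock, <<"hello">>),
{ok, _Reply} = ssl:recv(Sock, 0).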
Now, an attacker just sends in an additional packet, faking it as if it was part of the conversation. It can just be one byte long: this completely destroys any synchronization mechanism I have implemented!
Thwarting such an injection problem is dealt with by securing the stream. Create an encrypted stream and send your packets through that.
Of course the encrypted stream itself then has this problem; its messages can be corrupted. But those messages have secure integrity checks. The problem is detected, and the connection can be torn down and re-established to resynchronize it.
Also, some fixed-length synchronizing/framing bit sequence can be used between messages: some specific bit pattern. It doesn't matter if that pattern occurs inside messages by accident, because we only ever specifically look for that pattern when things go wrong (a corrupt message is received), otherwise we skip that sequence. If a corrupt message is received, we then receive bytes until we see the synchronizing pattern, and assume that whatever follows it is the start of a message (length followed by payload). If that fails, we repeat the process. When we receive a correct message, we reply to the peer, which will re-transmit anything we didn't get.
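A rough sketch of that resynchronization step, in Erlang with a made-up 4-byte marker (a real protocol would pick its own pattern and framing): read and discard bytes until the marker is found, then resume normal length-prefixed reading after it.
%% Hypothetical resync: scan the incoming bytes for the marker and return
%% whatever follows it, so the caller can go back to normal framing.
-define(SYNC, <<16#DE,16#AD,16#BE,16#EF>>).

resync(Sock, Buf) ->
    case binary:match(Buf, ?SYNC) of
        {Pos, Len} ->
            After = Pos + Len,
            {ok, binary:part(Buf, After, byte_size(Buf) - After)};
        nomatch ->
            {ok, More} = gen_tcp:recv(Sock, 0),   %% whatever bytes are available
            resync(Sock, <<Buf/binary, More/binary>>)
    end.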
How likely and feasible is this attack anyway?
TCP connections are identified by four items: the source and destination IP, and source and destination port number. The attacker has to fake a packet which matches your stream in these four identifiers, and sneak that packet past all the routers and firewalls between that attacker and the receiving machine. The attacker also has to be in the right ballpark with regard to the TCP sequence number.
Basically, this is next to impossible for an attacker C to perpetrate against endpoints A and B which are both distant from C on the network. The fake source IP will be rejected long before C is able to reach its destination. It's more plausible as an inside job (which includes malware): C is close to A and B.

Finding mpeg 2 packages in matlab with fread

I used a TS analyzer on a .ts file I have with the MPEG-2 codec, and I found out that it splits into 7311 packets.
I'm trying to find this through MATLAB by using fopen to open the .ts file in binary mode and fread to read the file, but all I get is a column with a huge collection of numbers (way more than the number of packets). Does anyone know how I can determine which of these data are the packets? Or, if someone knows another way to find the packets, that would help me a lot.
Thank you in advance
From some quick googling, the MPEG-2 transport stream ('ts') format consists of packets 188 bytes in length, each having a 4-byte header followed by a 184-byte payload. Essentially, you can count the number of packets by counting the number of headers you find - but beware that, if you are only interested in counting the number of, e.g., video packets in the stream, then you will need some deeper analysis of the headers, because the stream may contain any number of interleaved "elementary streams" (which can be video, audio, or arbitrary data). Each elementary packet type in the stream is denoted by a unique "PID" which is contained in the header.
Aside from the above, you will also have to handle synchronisation - each header begins with the "synchronisation byte", which has the value 0x47 (or 01000111 in binary). According to this resource, decoders begin by looking for this synchronisation byte; once they find one, they may have found a packet header. To make sure, they try to find three consecutive synchronisation bytes (188 bytes apart in the stream); if three are found, synchronisation can occur and the packet boundaries may from then on be assumed at 188-byte intervals. Note, however, that the first byte of each assumed header should be checked to see if it is a synchronisation byte - if it is not, then this is called "sync loss" and the synchronisation process must start again.
Once you have some code to synchronise to a stream, it should be fairly easy to extract the PIDs from the header of each packet and count the number of packets associated with each unique PID you find. You should probably also check the first bit after the synchronisation byte as, if set to 1, this indicates a transport error, and the packet's payload is invalid. Detailed information on the format of packet headers can be found here.
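As a rough illustration of that counting logic (sketched in Erlang to keep one language across this page; the same byte-level steps map directly onto MATLAB's fread output), assuming the file already starts on a packet boundary:
%% Count MPEG-2 TS packets per PID. Each packet is 188 bytes: a 0x47 sync
%% byte, 3 more header bytes (the 13-bit PID sits in bytes 2-3), and a
%% 184-byte payload. Stops at end of file or on sync loss.
count_pids(File) ->
    {ok, Bin} = file:read_file(File),
    count_pids(Bin, #{}).

count_pids(<<16#47, _TEI:1, _PUSI:1, _Prio:1, PID:13, _Flags:8,
             _Payload:184/binary, Rest/binary>>, Acc) ->
    count_pids(Rest, maps:update_with(PID, fun(N) -> N + 1 end, 1, Acc));
count_pids(_, Acc) ->
    Acc.
The total packet count is then just the sum of the per-PID counts (or byte_size(Bin) div 188 for a clean file).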

Sending And Receiving Sockets (TCP/IP)

I know that it is possible for multiple packets to be stacked in the buffer waiting to be read, and that a long packet might require a loop of multiple send attempts to be fully sent. But I have a question about packaging in these cases:
If I call recv (or any alternative (low-level) function) when there are multiple packets waiting to be read, would it return them all stacked into my buffer, or only one of them (or part of the first one, if my buffer is insufficient)?
If I send a long packet which requires multiple iterations to be sent fully, does it count as a single packet or multiple packets? It's basically a question of whether it marks that the packet sent is not yet complete.
These questions came to my mind when I thought about WebSocket packaging. Special characters are used to mark the beginning and end of a packet, which sorta leads to the conclusion that it's not possible to separate multiple packets.
P.S. All the questions are about TCP/IP, but you are welcome to share information (answers) about UDP as well.
TCP sockets are stream based. The order is guaranteed but the number of bytes you receive with each recv/read could be any chunk of the pending bytes from the sender. You can layer a message based transport on top of TCP by adding framing information to indicate the way that the payload should be chunked into messages. This is what WebSockets does. Each WebSocket message/frame starts with at least 2 bytes of header information which contains the length of the payload to follow. This allows the receiver to wait for and re-assemble complete messages.
For example, in libraries/interfaces that implement the standard WebSocket API or a similar API (such as a browser), the onmessage event will fire once for each message received, and the data attribute of the event will contain the entire message.
Note that in the older Hixie version of WebSockets, each frame was started with '\x00' and terminated with '\xff'. The current standardized IETF 6455 (HyBi) version of the protocol uses the header information that contains the length which allows much easier processing of the frames (but note that both the old and new are still message based and have basically the same API).
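As a rough sketch of that header (Erlang bit syntax again, assuming the whole header is already in the buffer): the 7-bit length field in the second byte either holds the payload length directly, or the values 126/127 announce a 16-bit or 64-bit extended length field. Client-to-server frames additionally carry a 4-byte masking key after the length, which is omitted here.
%% Extract the payload length from an RFC 6455 (HyBi) frame header.
payload_len(<<_Fin:1, _Rsv:3, _Opcode:4, _Mask:1, 126:7, Len:16, _/binary>>) -> Len;
payload_len(<<_Fin:1, _Rsv:3, _Opcode:4, _Mask:1, 127:7, Len:64, _/binary>>) -> Len;
payload_len(<<_Fin:1, _Rsv:3, _Opcode:4, _Mask:1, Len:7, _/binary>>) -> Len.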
A TCP connection provides a stream of bytes, so treat it as such. No application message boundaries are preserved - one send can correspond to multiple receives and the other way around. You need loops on both sides.
UDP, on the other hand, is datagram (i.e. message) based. Here one read will always dequeue a single datagram (unless you mess with low-level flags on the socket). Even if your application buffer is smaller than the pending datagram and you read only part of it, the rest of it is lost. The way around this is to limit the size of the datagrams you send to something below the normal MTU of 1500 (less IP and UDP headers, so actually 1472).
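For completeness, a tiny Erlang sketch of that datagram behaviour (ports and payload are made up): each recv returns exactly one datagram, with no framing needed, which is why keeping sends under the MTU-derived limit matters.
%% One send <-> one recv with UDP; no framing needed, but size is limited.
{ok, Rx} = gen_udp:open(9000, [binary, {active, false}]),
{ok, Tx} = gen_udp:open(0, [binary, {active, false}]),
ok = gen_udp:send(Tx, {127,0,0,1}, 9000, <<"ping">>),
{ok, {_Addr, _Port, <<"ping">>}} = gen_udp:recv(Rx, 0).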