Saving bandwidth over GPRS with TCP sockets

Hello, I have made a program for my old Windows Mobile phone that sends GPS data, temperature, etc. every 5 seconds, just for experimental reasons, to build a fleet management system.
I noticed that within one hour about 350 KB were consumed, although I sent only 20 KB of data...
As I don't have deep knowledge of networking, how much does a TCP connection cost in bytes?
Maybe I should keep the socket alive, because I currently close and reopen it every 5 seconds. Would that save bytes?
Also does MTU matter here?
Any other idea to reduce the overhead?
thank you

Let's do some math here.
Every 5 seconds is 720 connections per hour plus data. 20K / 720 is about 28 bytes of payload (your GPS data) for each connection.
IP and TCP headers alone come to roughly 48 bytes per packet (20 bytes each at minimum, more with options) in addition to whatever data is being sent.
3-way handshake connection: 3 packets (2 out, 1 in) == 96 bytes out and 48 bytes in
Outbound Data-packet: 48+28 bytes == 76 bytes (out)
Inbound Ack: 48 bytes (in)
Close: 48 bytes (out)
Final Ack: 48 bytes (in)
Total out per connection: 220
Total in per connection: 144
Total data sent/received per connection: 220 + 144 = 364
Total data usage in one hour = 364 * 720 = 262K
So I'm in the ballpark of your data usage estimates.
If you're looking to reduce bandwidth usage, here are three ideas:
Scale back on your update rate.
Don't tear down the socket connection each time. Just keep it open.
Given your GPS coordinates are periodically updated, you could consider using UDP instead of TCP. There's potential for packet loss, but given you're retransmitting fresher data every 5 seconds anyway, an update getting lost isn't worth the bandwidth to retransmit. IP and UDP headers combined are only 28 bytes with no "connection" overhead.
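To make the UDP idea concrete, here is a minimal sketch of the sending side in Python (the host, port and payload format are placeholders - adapt them to whatever your server expects and to whatever language your phone app is actually written in):

    import socket
    import time

    SERVER = ("tracking.example.com", 9000)   # placeholder host and port
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    while True:
        # Build the ~28-byte report however your server expects it,
        # e.g. "lat,lon,temp" as ASCII or a packed binary struct.
        payload = b"37.9838,23.7275,21.5"
        sock.sendto(payload, SERVER)    # one datagram: no handshake, no teardown
        time.sleep(5)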
UPDATE
When I originally posted this, I misunderstood the connection close to be a single exchange of FIN packets between client and server. In practice, the client sends a FIN as part of initiating the close; the server ACKs that FIN, then the server sends its own FIN, which is ACK'd by the client. In other words, an additional 96 bytes per connection. Redoing our math:
Total data sent/received per connection = (220 + 48) + (144 + 48) = 460
Total data usage in one hour = 460 * 720 = 331K
So my revised estimate of 331KB in one hour is a bit closer to what the OP saw.
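For reference, here is the revised arithmetic as a small Python sketch, using the same 48-byte header figure and packet counts assumed above:

    # Revised per-connection cost, using the figures from the answer above.
    HEADER = 48                         # approximate IP + TCP header bytes per packet
    PAYLOAD = 28                        # GPS/temperature payload bytes per update
    CONNECTIONS_PER_HOUR = 3600 // 5    # one connection every 5 seconds

    out_bytes = 2 * HEADER              # SYN + final ACK of the 3-way handshake
    out_bytes += HEADER + PAYLOAD       # the data packet
    out_bytes += 2 * HEADER             # client FIN + client ACK of the server's FIN
    in_bytes = HEADER                   # SYN-ACK
    in_bytes += HEADER                  # ACK of the data packet
    in_bytes += 2 * HEADER              # server ACK of the FIN + the server's own FIN

    per_connection = out_bytes + in_bytes              # 460 bytes
    per_hour = per_connection * CONNECTIONS_PER_HOUR   # 331200 bytes, about 331K
    print(per_connection, per_hour)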

Related

Write big messages to socket buffer vs lots of small messages

My question is quite simple. Assume we have a TCP socket server that sends 10 length-prefixed messages each second to its connected clients; each message is 500 bytes. Is it better to merge those 10 messages and call socket.write() once per second with a 5000-byte message, or is it better to call socket.write() 10 times per second? Which one causes lower latency?
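For concreteness, the two options being compared look roughly like this in a Python sketch (the 4-byte big-endian length prefix is an assumption for illustration; use whatever framing your protocol actually defines):

    import socket
    import struct

    def frame(msg: bytes) -> bytes:
        # 4-byte big-endian length prefix (assumed framing for illustration)
        return struct.pack(">I", len(msg)) + msg

    messages = [b"x" * 500 for _ in range(10)]   # ten 500-byte messages

    def send_individually(sock: socket.socket) -> None:
        # Option 1: one write per message, 10 writes per second
        for m in messages:
            sock.sendall(frame(m))

    def send_batched(sock: socket.socket) -> None:
        # Option 2: merge the batch and write once, ~5040 bytes per second
        sock.sendall(b"".join(frame(m) for m in messages))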

Protocol with small(est) overhead over GPRS

Our company has developed stations that collect data in agricultural fields. These fields can be in the middle of nowhere, so the stations use GSM/GPRS with a SIM card that automatically switches to the strongest provider.
Every 5 minutes, an internet connection is set up to send data to a server. The data has a structure with packet length, command, sensor data and a CRC check, but these data structures are sent with an HTTP POST to a URL.
For 480 bytes of data, about 2550 bytes of data traffic are used. There is a lot of overhead in the HTTP protocol: because we only need to send 480 bytes of data, we have roughly 80% overhead with the POST over HTTP. We now have a few hundred stations and that number is growing, so data traffic costs are increasing rapidly.
We want to redesign the transmission of the data. The data is sent by a Microchip microcontroller in the stations.
Our goal is to decrease the overhead as much as possible, with guaranteed data delivery. So I looked into TCP and UDP.
TCP has failure detection and recovery, but has a higher overhead.
UDP has lower overhead, but it does not guarantee that data is delivered without failure.
My first idea is to build a server that listens on a TCP port, with the stations sending the data over TCP, mainly because of guaranteed data delivery.
With UDP we would have to develop checking and resending of data ourselves, but the data structure of our records is already prepared for such checks.
So I am really in doubt about what to do, and I am trying to get answers to these questions:
How many bytes of overhead would TCP and UDP need to send (and deliver) 480 bytes of data?
Are TCP and UDP the best ways to consider for sending 480 bytes of data, or is there a smarter solution with even lower overhead?
How many bytes of overhead would TCP and UDP need to send (and deliver) 480 bytes of data?
A (typical) TCP header is 20 bytes long, although it can be (slightly) longer with options. If the entire 480 bytes are sent in a single TCP segment, you'd end up with 480 + 20 + 20 (IP header) = 520 bytes before layer-2 overhead.
UDP has an 8 byte header, so for UDP you'll have 480 + 8 + 20 = 508 bytes.
However, you should consider that TCP is a stream protocol. Reading from a TCP socket is like reading from a binary file: you'd need to split that stream into individual messages yourself, by using some sort of delimiter or by prepending the length of the message to each message.
UDP on the other hand works on individual messages. Reading from a UDP socket would return messages one at a time.
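To make the stream-vs-datagram difference concrete, here is a minimal Python sketch of the receiving side, assuming a 2-byte length prefix is added to each record over TCP (the prefix is an illustrative assumption, not part of your existing record structure):

    import socket
    import struct

    def recv_exact(sock: socket.socket, n: int) -> bytes:
        # Read exactly n bytes from a TCP stream, looping until complete.
        buf = b""
        while len(buf) < n:
            chunk = sock.recv(n - len(buf))
            if not chunk:
                raise ConnectionError("peer closed the connection")
            buf += chunk
        return buf

    def recv_record_tcp(sock: socket.socket) -> bytes:
        # TCP delivers a byte stream, so messages must be delimited by the
        # application - here with an assumed 2-byte big-endian length prefix.
        (length,) = struct.unpack(">H", recv_exact(sock, 2))
        return recv_exact(sock, length)

    def recv_record_udp(sock: socket.socket) -> bytes:
        # UDP preserves message boundaries: one recvfrom() returns one record.
        data, _addr = sock.recvfrom(2048)
        return data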
Are TCP and UDP the best ways to consider for sending 480 bytes of data, or is there a smarter solution with even lower overhead?
UDP and TCP are the lowest level transport protocols on the internet. HTTP and other high-level protocols are built on top of them. If the size of data is critical, raw TCP and UDP are as low-overhead as you're going to get without using RAW sockets and embedding your data directly into IP packets.

SIP over TLS bandwidth consumption

Let's say I have a connection using SIPS (SIP Secure) with the G.729 codec.
Does anyone know how much bandwidth it takes?
I know a call using the G.729 codec with 10 ms packetization consumes about 11 kbit/s of bandwidth.
According to me:
SIP and SIPS use almost the same bandwidth; it makes practically no difference.
The SIP bandwidth compared to RTP is insignificant.
Imagine you have a one minute call. The total exchange would be, for example:
For SIP:
1000 bytes for INVITE
1000 bytes for 200 OK for INVITE
500 bytes for ACK
500 bytes for BYE
500 bytes for 200 Ok for BYE
total = 3500 bytes
For RTP and g729, with 10ms:
Each of my RTP packets is 22 bytes (not including UDP headers):
G729 payload: 10 bytes
RTP header: 12 bytes
total = 100 * 22 = 2200 bytes/second (which is 17.6 kbit/s)
total = 100 * 22 * 60 = 132000 bytes for a one minute call
For only one minute, the ratio is already:
132000 / (132000 + 3500) = 97.4%
3500 / (132000 + 3500) = 2.6%
With longer call durations, the SIP-related bandwidth quickly drops below 1%.
If you have frequent SIP messages during the call (like INFO), you may want to take them into account, but this is usually not the case.
NOTE: I used an 8 kbit/s G.729 stream instead of 11 kbit/s. Just replace with your own values.
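The arithmetic above as a small Python sketch, using the same example message sizes:

    # SIP vs RTP share of a one-minute G.729 call, using the figures above.
    sip_bytes = 1000 + 1000 + 500 + 500 + 500          # INVITE, 200 OK, ACK, BYE, 200 OK
    rtp_packet = 10 + 12                               # G.729 payload + RTP header
    packets_per_second = 100                           # 10 ms packetization
    rtp_bytes = rtp_packet * packets_per_second * 60   # 132000 bytes per minute

    total = sip_bytes + rtp_bytes
    print(f"RTP share: {rtp_bytes / total:.1%}")       # ~97.4%
    print(f"SIP share: {sip_bytes / total:.1%}")       # ~2.6%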
EDIT:
With the usual SRTP encryption method, the encrypted payload remains the same size. However, an additional authentication tag is usually added: with AES_CM_128_HMAC_SHA1_80, 10 bytes are added to each packet.

Can UDP packets be partially sent like TCP ones?

I've created a client/server application that has its own data ACK system. It was originally written over TCP because of some limitations, but the base was written with UDP in mind.
The packets that I sent to the server had their own encapsulation (packet ID and packet size headers; I know UDP also has a checksum, so I didn't add a header for that). But because of how TCP works, I know the server may not receive an entire packet in one read, so I gathered and buffered the received data until a full valid packet had arrived.
Now I have the chance to change my client/server program to UDP, and I know that one difference from TCP is that data is not necessarily received in the same order it was sent (which is why I added a packet ID header).
The thing I want to know is: if I send multiple packets, will they be received with no guaranteed order but with guaranteed encapsulation? I mean, if I send a packet of 1000 bytes of data and later another packet of 400 bytes, will the server receive two packets, one of 1000 bytes and another of 400 bytes, or is there a chance it receives 200 of those 1000 bytes, then 400 more of those 1000 bytes, and later the rest, the way TCP can?
UDP is a datagram service. Datagrams may be split for transport, but they will be reassembled before being passed up to the application layer.
With small packet sizes you should have no concern that datagrams will be broken into multiple packets; that generally only becomes an issue when a datagram exceeds the MTU of the underlying network, typically Ethernet.
You ask: "will the server receive 2 packets, one of 1000 bytes and another one of 400 bytes, or is there a chance to receive 200 of that 1000 bytes, then 400 bytes of that 1000 bytes and later the rest of the bytes like TCP can do?"
With a payload size under 1472 bytes there are not going to be any partial packets.
UPDATE:
Apparently I need to clarify why I say UDP payloads of 1472 bytes or less will not affect transport robustness.
The maximum UDP length as implicitly specified in RFC 768 is 65535 bytes including the 8-byte header, so the maximum payload length is 65527 bytes.
While this number should not be in dispute, the maximum UDP data length is often reported incorrectly. This is exemplified in a previous post:
What is the largest Safe UDP Packet Size on the Internet
A datagram is not constrained by the MTU of the underlying network type or the link protocol's frame length (e.g. IP and Ethernet respectively); discrepancies between datagram size and MTU are remedied by fragmentation and reassembly.
Each network type has a specific Maximum Transmission Unit (MTU). UDP is encapsulated within IP packets, and IP packets are encapsulated by the transporting network's link layer. IP packets are often transmitted through networks of various types, including Ethernet, PPP, HDLC, and ADCCP.
When a packet is forwarded onto a link whose MTU is smaller than the packet, it must be fragmented; the fragments are reassembled at the receiving end.
Ethernet is the de facto mainstream link protocol with the lowest common MTU (the non-mainstream ARCNET has an MTU of 507 bytes). The practical lowest MTU is Ethernet's 1500 bytes; subtracting the 20-byte IP header and 8-byte UDP header leaves a maximum UDP payload of 1472 bytes.
If the UDP payload is more than 1472 bytes, the datagram will likely be fragmented and reassembled. Fragmentation and reassembly add complexity to the already complex coupling of UDP and IP, and should therefore be avoided.
Because UDP is a non-guaranteed datagram delivery protocol, it boosts transport performance; robustness is left to the originating and terminating applications. RFC 1122 sets the requirements for the link layer, IP layer, and transport layer of Internet hosts; the UDP application is responsible for packetization, reassembly, and flow control.
The maximum UDP packet size can also be lowered by the host's application layer; the packet length is a balance between performance and robustness.
The host's application layer may set a maximum UDP packet size. The typical maximum UDP data length at the application layer is whatever is allowed by the IP protocol or by the host's data link layer, typically Ethernet.
It is the application's programmer who chooses whether to rely on the host's application layer or on the data link layer. The application layer will detect UDP packet errors and discard the packet if necessary; when the application communicates directly with the data link layer, the application has the responsibility of detecting packet errors itself.
Keeping the UDP payload at or below Ethernet's maximum of 1472 bytes eliminates the issues of fragmentation and of the delivery order of multiple fragments.
That is why I said fragmentation is not an issue with packet lengths of 1000 and 400 bytes.
I do not know what you mean by "guaranteed encapsulation", it makes no sense to me.
With IP there is no guarantee of packet delivery or of ordering, whether UDP or TCP.
As long as you control both sides of the conversation, you can work out your own protocol within the data packet to handle ordering and lost packets. Reserve the first x bytes of the packet for a sequential order number and the total number of packets (e.g. 1 of 3, 2 of 3, 3 of 3). If the client side is missing a packet, the client must send a request for retransmission. You need to determine how far you want to go for data integrity - for example, what happens if the retransmission itself is lost.
That may be what you meant by "guaranteed encapsulation": other information within your datagram to ensure some integrity. You should add your own CRC over the total data being sent if it is broken into multiple datagrams; the UDP checksum is not very robust and only covers a single packet.
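Purely as an illustration (the header layout, field sizes and CRC32 choice are my assumptions, not your protocol), a per-datagram header carrying that ordering and integrity information could look like this in Python:

    import struct
    import zlib

    # Hypothetical 12-byte header: message id, fragment number, total fragments,
    # then a CRC32 over the payload. Field sizes are illustrative only.
    HEADER = struct.Struct(">IHHI")

    def pack_fragment(msg_id: int, seq: int, total: int, payload: bytes) -> bytes:
        return HEADER.pack(msg_id, seq, total, zlib.crc32(payload)) + payload

    def unpack_fragment(datagram: bytes):
        msg_id, seq, total, crc = HEADER.unpack_from(datagram)
        payload = datagram[HEADER.size:]
        if zlib.crc32(payload) != crc:
            return None    # corrupt: request a retransmission from the sender
        return msg_id, seq, total, payload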
UDP is much faster than TCP, but TCP has flow control and guaranteed delivery.
UDP is good for streaming content like voice where a lost packet is not going to matter.
Network reliability has improved a lot since the days when these issues were a major concern.

TCP, HTTP and the Multi-Threading Sweet Spot

I'm trying to understand the performance numbers I'm getting and how to determine the optimal number of threads.
See the bottom of this post for my results
I wrote an experimental multi-threaded web client in perl which downloads a page, grabs the source for each image tag and downloads the image - discarding the data.
It uses a non-blocking connect with an initial per file timeout of 10 seconds which doubles after each timeout and retry. It also caches IP addresses so each thread only has to do a DNS lookup once.
The total amount of data downloaded is 2271122 bytes in 1316 files via a 2.5 Mbit connection from http://hubblesite.org/gallery/album/entire/npp/all/hires/true/ . The thumbnail images are hosted by a company which claims to specialize in low latency for high bandwidth applications.
Wall times are:
1 Thread takes 4:48 -- 0 timeouts
2 Threads take 2:38 -- 0 timeouts
5 Threads take 2:22 -- 20 timeouts
10 Threads take 2:27 -- 40 timeouts
50 Threads take 2:27 -- 170 timeouts
In the worst case ( 50 threads ) less than 2 seconds of CPU time are consumed by the client.
avg file size 1.7k
avg rtt 100 ms ( as measured by ping )
avg cli cpu/img 1 ms
The fastest average download speed is 5 threads at about 15 KB / sec overall.
The server actually does seem to have pretty low latency as it takes only 218 ms to get each image meaning it takes only 18 ms on average for the server to process each request:
0 cli sends syn
50 srv rcvs syn
50 srv sends syn + ack
100 cli conn established / cli sends get
150 srv recv's get
168 srv reads file, sends data , calls close
218 cli recv HTTP headers + complete file in 2 segments MSS == 1448
I can see that the per file average download speed is low because of the small file sizes and the relatively high cost per file of the connection setup.
What I don't understand is why I see virtually no improvement in performance beyond 2 threads. The server seems to be sufficiently fast, but already starts timing out connections at 5 threads.
The timeouts seem to start after about 900 - 1000 successful connections whether it's 5 or 50 threads, which I assume is probably some kind of throttling threshold on the server, but I would expect 10 threads to still be significantly faster than 2.
Am I missing something here?
EDIT-1
Just for comparison's sake I installed the DownThemAll Firefox extension and downloaded the images using it. I set it to 4 simultaneous connections with a 10 second timeout. DTM took about 3 minutes to download all the files and write them to disk, and it also started experiencing timeouts after about 900 connections.
I'm going to run tcpdump to try and get a better picture what's going on at the tcp protocol level.
I also cleared Firefox's cache and hit reload. 40 Seconds to reload the page and all the images. That seemed way too fast - maybe Firefox kept them in a memory cache which wasn't cleared? So I opened Opera and it also took about 40 seconds. I assume they're so much faster because they must be using HTTP/1.1 pipelining?
And the Answer Is!??
So after a little more testing and writing code to reuse the sockets via pipelining I found out some interesting info.
When running at 5 threads the non-pipelined version retrieves the first 1026 images in 77 seconds but takes a further 65 seconds to retrieve the remaining 290 images. This pretty much confirms MattH's theory about my client getting hit by a SYN FLOOD event causing the server to stop responding to my connection attempts for a short period of time. However, that is only part of the problem since 77 seconds is still very slow for 5 threads to get 1026 images; if you remove the SYN FLOOD issue it would still take about 99 seconds to retrieve all the files. So based on a little research and some tcpdump captures it seems like the other part of the issue is latency and the connection setup overhead.
Here's where we get back to the issue of finding the "Sweet Spot" or the optimal number of threads. I modified the client to implement HTTP/1.1 Pipelining and found that the optimal number of threads in this case is between 15 and 20. For example:
1 Thread takes 2:37 -- 0 timeouts
2 Threads take 1:22 -- 0 timeouts
5 Threads take 0:34 -- 0 timeouts
10 Threads take 0:20 -- 0 timeouts
11 Threads take 0:19 -- 0 timeouts
15 Threads take 0:16 -- 0 timeouts
There are four factors which affect this: latency/RTT, maximum end-to-end bandwidth, recv buffer size, and the size of the image files being downloaded. See this site for a discussion of how receive buffer size and RTT latency affect available bandwidth.
In addition to the above, average file size affects the maximum per-connection transfer rate. Every time you issue a GET request you create an empty gap in your transfer pipe which is the size of the connection RTT. For example, if your Maximum Possible Transfer Rate (recv buffer size / RTT) is 2.5 Mbit and your RTT is 100 ms, then every GET request incurs a minimum 32 kB gap in your pipe. For a large average image size of 320 kB that amounts to a 10% overhead per file, effectively reducing your available bandwidth to 2.25 Mbit. However, for a small average file size of 3.2 kB the overhead jumps to 1000% and available bandwidth is reduced to 232 kbit/second - about 29 kB/s.
So to find the optimal number of threads:
Gap Size = MPTR * RTT
Optimal Threads = MPTR / ((MPTR / (Gap Size + AVG File Size)) * AVG File Size)
For my above scenario this gives me an optimum thread count of 11 threads, which is extremely close to my real world results.
If the actual connection speed is slower than the theoretical MPTR then it should be used in the calculation instead.
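Here is that calculation as a small Python sketch; the RTT and average file size are the figures quoted above, and you should plug in your measured connection rate for MPTR if it is below the theoretical maximum:

    def optimal_threads(mptr_bytes_per_s: float, rtt_s: float, avg_file_bytes: float) -> float:
        # Optimal thread count = MPTR divided by the per-connection throughput.
        gap = mptr_bytes_per_s * rtt_s                                # bytes "lost" per GET
        per_conn = mptr_bytes_per_s / (gap + avg_file_bytes) * avg_file_bytes
        return mptr_bytes_per_s / per_conn                            # = (gap + avg) / avg

    # Example (fill in your measured single-connection rate in bytes/second):
    # print(optimal_threads(measured_rate, rtt_s=0.100, avg_file_bytes=1700))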
Please correct me if this summary is incorrect:
Your multi-threaded client starts a thread that connects to the server and issues just one HTTP GET, then that thread closes.
When you say 1, 2, 5, 10, 50 threads, you're just referring to how many concurrent threads you allow; each thread itself only handles one request.
Your client takes between 2 and 5 minutes to download over 1000 images
Firefox and Opera will download an equivalent data set in 40 seconds
I suggest that the server rate-limits HTTP connections, either via the webserver daemon itself, a server-local firewall or, most likely, a dedicated firewall.
You are actually abusing the web service by not re-using the HTTP connections for more than one request, and the timeouts you experience are because your SYN flood is being clamped.
Firefox and Opera are probably using between 4 and 8 connections to download all of the files.
If you redesign your code to re-use the connections you should achieve similar performance.
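As a rough illustration of connection re-use (a Python sketch with placeholder host and paths; the client in the question was actually written in Perl), a single keep-alive connection can serve many GETs:

    import http.client

    # One persistent HTTP/1.1 connection serving many GETs, instead of a new
    # TCP connection (and SYN) per image.
    conn = http.client.HTTPConnection("images.example.com")   # placeholder host
    image_paths = ["/img/001.jpg", "/img/002.jpg"]             # placeholder paths

    for path in image_paths:
        conn.request("GET", path)
        resp = conn.getresponse()
        data = resp.read()    # drain the body before reusing the connection
    conn.close()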