How to use UDP from a machine with only NAT access - sockets

I have a machine, with no external IP address, it will need to send UDP packets to the outside world. Only NAT access.
Will this work?
It is really hard to prototype this in our environment.
It is still really under construction.
Any thoughts on how I can prototype this?

Most of the home network configurations in the world are made of a PC with an internal IP and a router with a public IP that NAT the internal one. (Independently of UDP/TCP or whatever protocol that needs to go out)
I see no troubles with it

It should work.
Ensure that for the socket created, set the TTL (time-to-live) to a value that is sufficiently large to cover the possible number of router hops to reach the destination. Running traceroute to the destination IP will give you a rough idea on the number of hops. Note that this value can change depending on network conditions. So it's best to set this to a larger value. Refer to sockets IOCtl API documentation for the syntax for setting TTL.
Finally, remember that UDP is not a reliable protocol. So even after taking the necessary steps above, the packet may not reach its destination. However, if the entire network, including the intermediary routers, is within a controlled environment, such as a corporate intranet, chances of packet drop are minimal.
If you want to add reliability on top of UDP, you can adopt a NAK based algorithm where packets are stamped with a sequence number. Various resources might advise you that if you need to add reliability over UDP you should consider TCP, but my experience has been that if your app runs in a controlled environment with very minimal chance of packet drops and you need fast connection setup and tear down, adding a lightweight reliability over UDP has its merits. Also TCP connections take up valuable space in the OS kernel whereas UDP don't. This could also be a consideration if you want to support very large number of 'connections' in a constrained environment.
At the end of the day you need to experiment a little to figure out what works best for you.
To prototype, I would set up a NAT server using something like Linux and then start working from there. Real world traffic scenarios that you want to simulate will determine where the client and server are to be located on either side of the NAT. That is, if the traffic should go through an ISP or all within a controlled environment.
HTH

Related

TCP is on top of IP, what does this mean? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I always hear about the layers of internet and i vaguely understand this. But, what confuse me most is that transport layer (including TCP protocol) lies on top of the internet layer(including IP protocol)..
What does this mean? For one who has a foggy understanding of the internet mechanism (I'm not a CS student or something I am just a hobby programmer)
The picture I have about the internet is that the network card sends/receives signals (packets) from the internet through wired connection / wifi then the OS using socket API sends/receives these packets acting as a layer between the hardware and the application which in turn uses some high-level protocol such as HTTP to interpret the data transferred - these protocol usually provided by languages e.g. python or java
.. I guess then that IP and TCP protocol are used at the level of the socket API? but I need more details ? I hope the explanation be in terms of coding/programming/implementation because abstractions used in this area confuse me.
Thank you , and sorry for my bad English
This is part of a layered solution to solve networking. Each layer has its own functionality:
IP (Internet Protocol) is in charge of delivering a packet (or datagram) from one interface, in one machine, with an IP address assigned to another interface in the same or other machine (node). Both nodes can be in the same LAN or different LAN connected through different paths (LAN's and routers). Basically it will make the packet get from source IP to destination IP. It provides a best-effort services, it doesn't assure the IP packet is going to arrive, it can be lost in the middle.
Above layer 3 or IP in the so-called TCP/IP stack, there is the transport layer. Its main functionality is to multiplex the lower layer (IP) service (take a packet from src to dst) among different applications. This is why in all transport layer protocols there is the concept of port or more generically Transport Service Access Point (TSAP). UDP, TCP, SCTP do that. UDP provides an unreliable service to the application. TCP provides a connected, reliable transport service to the application. This layer will make a message sent from application A in node Y reach application A1 in node Z, either reliably or unreliably (while IP only takes care of carrying the packet from node Y to node Z).
You will need to read a little about the OSI layered model and the TCP/IP layered model.
If you need to get more info I can address you to a training I have about IPv6 with a good introduction to networking: http://www.slideshare.net/rodolk/networking-tcpip-stack-introduction-ipv6
TCP is a protocol, known as "Transmission Control Protocol" - by specification it has features in place which makes sure that transmitted data is checked. On the other hand, there are things such as UDP, aka "User Datagram Protocol" which also works on top of IP - by specification this method does not check any transmitted data, so it's less useful where files must be fully intact (more utilised for video streaming, where some lost frames is acceptable, as opposed to binary file transfers where incorrect data means corruption and the whole file would be useless).
On to IP, IP is an addressing protocol, allowing a network to address and communicate with any machine that lives within it. IP stands for Internet Protocol, and it defines the fundamental way that two machines communicate over the "internet". It does not define how communications are handled, in ways such as being checked for data integrity, etc.
So, to summarise, the TCP and UDP are just extensions of IP. It is entirely possible, however, to have a socket based TCP or UDP connection, and I expect it's also possible to have some sort of MAC address protocol (as opposed to an IP address protocol). I don't know of any protocols which are similar to IP, but I imagine they do exist. In reality, using TCP over something other than IP is entirely unlikely. If you're going to the effort to create a custom protocol, chances are you'll want it fully custom and won't want to stick to design specifications designed for another protocol layer.
Note that calling it a "TCP/IP" connection is probably only ever used for legacy reasons. A lot of terms like this still exist because before the technology "bubble" growth, there were competing alternatives to IP. Even today, there is IPv6 which is technically an alternative to IPv4. It's also possible that we might one day outgrow IPv6, and at that point in time, there could be something other than IP to worry about.

TCP or UDP for lots of connections?

I want to create a P2P network with the following characteristics:
low latency is not really important
loosing packages is okay
the nodes would only send tiny amounts of data around
there will be no NAT/firewall issues, every node has an open port on its public ip
every node is connected to every other node
Usually I would use TCP for anything not time-critical but the last requirements causes the nodes to have lots of open connections for a long time. If I remember correctly, using TCP to connect to 1000 servers would mean I had to use 1000 ports to handle these connections. UDP on the other hand, would only require a single port for each node.
So my question is: Is TCP able to handle the above requirements in a network with e.g. 1000 nodes without tweaking the system? Would UDP be better suited in this case? Is there anything else that would be a deal-breaker for either protocol?
With UDP you control the "connection state" and it is pretty much the best way to do anything peer to peer related IF you have a high number of nodes or care about bandwidth, memory and CPU overhead. By moving all the control to your application in regards to the "connection state" of each node you minimize the amount of wasted resources by making it fit your needs exactly.
You will bypass a lot of operating system specific weirdness that limits the effectiveness of TCP with high numbers of connections. There is TIME_WAIT bloat and tens to hundreds of OS specific settings which will need tweaking for every user of your P2P app if it needs those high numbers. A test app I made which allowed you to use UDP with ack or TCP showed only a 10% difference in performance regardless of operating system using UDP. TCP performance was always lower than the best UDP and its performance varied wildly by over 600% depending upon the OS. With tweaks you can make most OS perform roughly the same using TCP but by default most are not properly tweaked.
So in my opinion it is harder to make a reliable UDP P2P network compared to TCP but it is often needed. However I would only advise that route it if you were quite experienced with networking as there are a lot of "gotchas" to deal with. There are libraries which help with this like Raknet or Enet. They provide ways to do reliable UDP but it still takes a higher amount of networking knowledge to know how this all ties in together, whereas with TCP it is mostly hidden from you.
In a peer to peer network you often have messages like NODE PINGs that you may not care if each one is always received, you just care if you have received one recently. ie You may send a ping every 10 seconds, and disconnect the node after 60 seconds of no ping. This would mean you would need 6 ping packets in a row to fail, which is highly unlikely unless the node is really down. If you received even one ping in that 60 second period then the node is still active. A TCP implementation of this would have involved more latency and bandwidth as it makes sure EACH ping message gets through and will block any other data going out until it does. And since you cannot rely on TCP to reliably tell you if a connection is dead, you are forced to add similar PING features for TCP, on top of all the other things TCP is already doing extra with your packets.
Games also often have data that if its not received by a client it is no big deal because there are more packets coming in a few milliseconds which will invalidate any missed packets. ie Player is moving from A to Z over a time span of 1 second, his client sends out each packet, roughly 40 milliseconds apart ABCDEFG__I__KLMNOPQRSTUVWXYZ Do we really care if we miss "H and J" since every 40ms we are receiving updates? Not really, this is where prediction can come into it, but this is usually not relevant to most P2P projects. If that was TCP instead of UDP then it would have increased bandwidth requirements and added latency to the rest of the packets being received as the data will be resent until it arrives, on top of the extra latency it is already adding by acking everything.
Essentially you can lower latency and network overhead for many messages in a peer to peer network using UDP. However there will always be some messages which NEED to be sent reliably and that requires you to basically implement some reliable way to get packets to that node, similar to that of TCP. And this is where you need some level of expertise if you want a reliable peer to peer network. Some things to look into include sequencing packets with a number, message ACKs, etc.
If you care a lot about efficiency or really need tens of thousands of connections then implementing your specific protocol in UDP will always be better than TCP. But there are cases to be made for TCP, like if the time to make the project matters or if you are a new to network programming.
If I remember correctly, using TCP to connect to 1000 servers would mean I had to use 1000 ports to handle these connections.
You remember wrong.
Take a web server which is listening on port 80 and can handle 1000s of connections at the same time on this single port. This is because a connection is defined by the tuple of {client-ip,client-port,server-ip,server-port}. And while server-ip and server-port are the same for all connections to this server the client-ip and client-port are not. Even if the client-ip is the same (i.e. same client) the client would pick a different source port.
... with e.g. 1000 nodes without tweaking the system?
This depends on the system since each of the open connections needs to preserve the state and thus needs memory. This might be a problem for embedded systems with only little memory.
In any case: if your protocol is just sending small messages and if packet loss, reordering or duplication are acceptable than UDP might be the better choice because the overhead (connection setup, ACK..) is smaller and it takes less memory. You could also use a single socket to exchange data with all 1000 nodes whereas with TCP you would need a separate socket for each connection (socket is not the same as port!). Using only a single socket might allow for a simpler application design.
I want to amend the answer by Steffen with a few points:
1000 connections are nothing for any normal computer and OS.
UDP fits your requirements. It might be easier to program because it is message oriented. TCP provides a stream of bytes. You need to layer a messaging protocol on top of that which is not that easy. Also, you need to handle broken TCP connections by reconnecting.
Ports are not scarce. No problem with consuming 1000 ports.

TURN Server WebRTC Hardware / Network Requirements

I am currently being challenged (mentally) with the idea of scaling out a TURN server(s) from a novelty to something that scales based on call volume.
Subsequently, I am trying to understand the requirements from a hardware, network, and application perspective and it's associated cost. I have some specific questions come to mind I would love the community's help with wrapping my brain around.
1) Are the ports reused for multiple destinations simultaneously? It seems to me conceptually if it's UDP and the source ip/port, destination ip/port is enough of 4-tuple for uniqueness, I could see it theoretically possible but I've never seen documentation around this.
2) What is the time (if any) before ports are reused. If the TURN server has allocated ports 1234 and 1235 for a given time, when one or both of those sockets close, how long will it be before the TURN server re-allocates those ports as a result of another request.
3) How should I think about the hardware requirements (specifically CPU and memory) of my TURN server(s) as a function of number of concurrent calls?

How to let different processes use different network interfaces?

I'm on the client side. There're multiple network interfaces. How can I let different processes use different network interfaces to communicate? Since I want to connect to the same server, routing seems not working here. Also, connect() doesn't have arguments to specify local address or interface as bind() does.
If your goal is to increase bandwidth to the server by using multiple network interfaces in parallel, then that's probably not something you can (or should) do at the application level. Instead, you should study up on Link Aggregation and then configure your computer and networking stack to use that. Once that is working properly, you will get the parallelization-speedup you want automatically, without the client application having to do anything special to enable it.
"The bind() system call is frequently misunderstood. It is used to
bind to a particular IP address. Only packets destined to that IP
address will be received, and any transmitted packets will carry that
IP address as their source. bind() does not control anything about the
routing of transmitted packets. So for example, if you bound to the IP
address of eth0 but you send a packet to a destination where the
kernel's best route goes out eth1, it will happily send the packet out
eth1 with the source IP address of eth0. This is perfectly valid for
TCP/IP, where packets can traverse unrelated networks on their way to
the destination."
More info e.g. here.
That's why you probably misunderstand bind() call.
The appropriate way to bind to physical topology (to some specific interface) is to use SO_BINDTODEVICE socket option. This is done by setsockopt() call.
Source Policy Routing might be helpful.
Try the following steps:
Use iptables to give packets from different process with different marks.
Use iproute2 to route packets with different marks to different table.
In different table, set the default route to different uplink.
The whole process require certain amount of understanding about linux networking.
Here is an example shows how to route all traffic for a user through one specific uplink: http://www.niftiestsoftware.com/2011/08/28/making-all-network-traffic-for-a-linux-user-use-a-specific-network-interface/
You could try follow similar approach by running different process with different user and route traffic from one user to one uplink.
Also you could let processes communicate with the server with different port and mark the traffic by port.

Can socket connections be multiplexed?

Is it possible to multiplex sa ocket connection?
I need to establish multiple connections to yahoo messenger and i am looking for a way to do this efficiently without having to hold a socket open for each client connection.
so far i have to use one socket for each client and this does not scale well above 50,000 connections.
oh, my solution is for a TELCO, so i need to at least hit 250,000 to 500,000 connections
i'm planing to bind multiple IP addresses to a single NIC to beat the 65k port restriction per IP address.
Please i would any help, insight i can get.
**most of my other questions on this site have gone un-answered :) **
Thanks
This is an interesting question about scaling in a serious situation.
You are essentially asking, "How do I establish N connections to an internet service, where N is >= 250,000".
The only way to do this effectively and efficiently is to cluster. You cannot do this on a single host, so you will need to be able to fragment and partition your client base into a number of different servers, so that each is only handling a subset.
The idea would be for a single server to hold open as few connections as possible (spreading out the connectivity evenly) while holding enough connections to make whatever service you're hosting viable by keeping inter-server communication to a minimum level. This will mean that any two connections that are related (such as two accounts that talk to each other a lot) will have to be on the same host.
You will need servers and network infrastructure that can handle this. You will need a subnet of ip addresses, each server will have to have stateless communication with the internet (i.e. your router will not be doing any NAT in order to not have to track 250,000+ connections).
You will have to talk to AOL. There is no way that AOL will be able to handle this level of connectivity without considering cutting your connection off. Any service of this scale would have to be negotiated with AOL so both you and they would be able to handle the connectivity.
There are i/o multiplexing technologies that you should investigate. Kqueue and epoll come to mind.
In order to write this massively concurrent and teleco grade solution, I would recommend investigating erlang. Erlang is designed for situations such as these (multi-server, massively-multi-client, massively-multithreaded telecommunications grade software). It is currently used for running Ericsson telephone exchanges.
While you can listen on a socket for multiple incoming connection requests, when the connection is established, it connects a unique port on the server to a unique port on the client. In order to multiplex a connection, you need to control both ends of the pipe and have a protocol that allows you to switch contexts from one virtual connection to another or use a stateless protocol that doesn't care about the client's identity. In the former case you'd need to implement it in the application layer so that you could reuse existing connections. In the latter case you could get by using a proxy that keeps track of which server response goes to which client. Since you're connecting to Yahoo Messenger, I don't think you'll be able to do this since it requires an authenticated connection and it assumes that each connection corresponds to a single user.
You can only multiplex multiple connections over a single socket if the other end supports such an operation.
In other words it's a function protocol - sockets don't have any native support for it.
I doubt yahoo messenger protocol has any support for it.
An alternative (to multiple IPs on a single NIC) is to design your own multiplexing protocol and have satellite servers that convert from the multiplex protocol to the yahoo protocol.
I'll trow in another approach you could consider (depending on how desperate you are).
Note that operating system TCP/IP implementations need to be general purpose, but you are only interested in a very specific use-case. So it might make sense to implement a cut-down version of TCP/IP (which only handles your use-case, but does that very well) in your application code.
For example, if you are using Linux, you could route a couple of IP addresses to a tun interface and have your application handle the IP packets for that tun interface. That way you can implement TCP/IP (optimised for your use-case) entirely in your application and avoid any operating system restriction on the number of open connections.
Of course, it's quite a bit of work doing the TCP/IP yourself, but it really depends on how desperate you are - i.e. how much hardware can you afford to throw at the problem.
500,000 arbitrary yahoo messenger connections - is your telco doing this on behalf of Yahoo? It seems like whatever solution has been in place for many years now should be scalable with the help of Moore's Law - and as far as I know all the IM clients have been pretty effective for a long time, and there's no pressing increase in demand that I can think of.
Why isn't this a reasonable problem to address with hardware plus traditional solutions?