Peer to Peer Networking - with shared public IP and DHCP - sockets

I am trying to set up peer-to-peer networking and am trying to understand how this works.
Normally in a client-to-server connection, I connect to the server's IP and port. Behind the scenes, the OS creates a client socket bound to a local port on the local IP, and the packet is sent to the router. The router then NATs the local IP and local port to the client's public IP and a (possibly different) public port, with the server's IP and port as the destination.
When the server responds, the router de-NATs the public client IP and public client port back to the local IP and local port, and the packet arrives at the computer.
In peer-to-peer networking, I may have the peer's public IP, but it is shared by many machines and the router hasn't allowed a connection yet, so there isn't an open port I can send the data to.
One option I found is that both peers contact a server. That opens a port on each router. Then the peers send packets to each other's client port.
However, usually the router will only accept packets from the same IP the request was made to, so the two peers cannot reuse the server's connection.
How do the two peers talk to each other in this scenario?

Peer-to-peer networking works exactly the same way as client/server networking. Only one of the peers will become a server and the other a client.
Normally in a peer-to-peer app like BitTorrent all peers are also servers, but of course for any individual connection one machine must take the role of the client. However, a single peer may have multiple connections, so for any single peer some of its connections will be server sockets and some will be client sockets.
How this works with NAT is exactly the same as in a client/server architecture. You must configure your router to forward a port back to your peer application in order for others to connect to it. If you don't, your peer can connect to other peers but other peers cannot connect to you. For example, if your BitTorrent client is generally slow, does not get many connections and fails to finish downloading some torrents, this often means that you have not configured your router's port forwarding back to your PC for your BitTorrent client.
For the use-case of non-expert users (consumers) there are several ways to get around NAT automatically without requiring your users to configure their routers. The most widely used method is UPnP (Universal Plug and Play). However a lot of more expert users who can configure their own routers often disable UPnP because it is a fairly well known DDoS target. So if you do decide to use UPnP you should make it optional for more advanced users to disable it if they don't want to use it.
For cases where you need a guaranteed connection regardless of router configuration, your app cannot be 100% peer-to-peer. You'd need a relay server that acts as a server to both peers and forwards each packet from the sending client peer to the receiving client peer. Of course, the disadvantage of this is that you now have a fixed cost of maintaining a server to support your app, just like traditional client/server systems, but in this case you're using peer-to-peer to reduce server costs, not eliminate the server.
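To make the relay idea concrete, here is a minimal sketch of a UDP relay running entirely on loopback. The names `udp_socket`, `relay_once` and `demo` are made up for this example, and the "a peer registers by sending any datagram" rule is an assumption, not something a particular product does. A real relay would also need authentication, timeouts and a loop.

```python
import socket


def udp_socket():
    """Create a UDP socket on loopback with an OS-chosen port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))
    s.settimeout(2.0)
    return s


def relay_once(relay, peers):
    """Receive one datagram and forward it to every known peer except the sender."""
    data, sender = relay.recvfrom(2048)
    peers.add(sender)  # assumption: a peer registers simply by sending something
    for peer in peers - {sender}:
        relay.sendto(data, peer)
    return sender


def demo():
    relay, alice, bob = udp_socket(), udp_socket(), udp_socket()
    peers = set()
    try:
        relay_addr = relay.getsockname()
        bob.sendto(b"register", relay_addr)  # Bob contacts the relay first
        relay_once(relay, peers)             # nobody else is registered yet
        alice.sendto(b"hi bob", relay_addr)  # Alice only ever talks to the relay
        relay_once(relay, peers)             # the relay forwards her packet to Bob
        data, source = bob.recvfrom(2048)
        return data, source == relay_addr
    finally:
        for s in (relay, alice, bob):
            s.close()
```

Both peers only make outbound connections to the relay, which is why this works regardless of how their routers are configured.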
One example of this "hybrid" approach is cryptocurrencies like Bitcoin and Ethereum. They need a core group of servers to exist in order to work. However, for these protocols the servers run the same software as the clients: they're all just nodes. The only difference is that you don't shut down the servers, whereas most people quit their Bitcoin wallet once they're done using it (unless they're mining). Another similar example is the Tor network. There is a set of core Tor nodes that act as the "server" part of the network, ensuring that the network always exists.

You said it yourself: "peers send packets to each other's client port". Therefore, the router will "accept packets from the same IP the request was made to".
Say, Alice is behind router A and Bob is behind router B.
Having learned their public endpoints from a server, Alice will send UDP packets to Bob's public IP, and Bob will send UDP packets to Alice's.
Having seen Alice talk to Bob's IP, router A will accept UDP packets from Bob.
Having seen Bob talk to Alice, router B will accept UDP packets from her as well.
That is, some initial packets might be rejected as coming out of the blue, but after both parties have initiated communication on their side, the routers will have no reason to block what follows.
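The sequence above can be modelled with a toy router that only admits inbound packets from IPs its inside host has already sent to. This is just a sketch of the idea: `ToyNat` and `punch` are invented names, the addresses are documentation-range placeholders, and real NATs also track ports and expire mappings.

```python
class ToyNat:
    """Toy address-restricted NAT: inbound is allowed only from IPs we have sent to."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.contacted = set()  # remote IPs the inside host has sent packets to

    def outbound(self, dest_ip):
        # Sending a packet "opens a door" for replies from dest_ip.
        self.contacted.add(dest_ip)

    def inbound_allowed(self, src_ip):
        return src_ip in self.contacted


def punch(router_a, router_b):
    """Return the delivery result of each step as (label, delivered) pairs."""
    log = []
    # Alice sends first: router A now expects Bob, but router B has not seen Bob send yet.
    router_a.outbound(router_b.public_ip)
    log.append(("alice->bob #1", router_b.inbound_allowed(router_a.public_ip)))
    # Bob sends: router B now expects Alice, and router A already expects Bob.
    router_b.outbound(router_a.public_ip)
    log.append(("bob->alice #1", router_a.inbound_allowed(router_b.public_ip)))
    # From here on both directions pass.
    log.append(("alice->bob #2", router_b.inbound_allowed(router_a.public_ip)))
    return log
```

Calling `punch(ToyNat("198.51.100.1"), ToyNat("203.0.113.2"))` shows Alice's first packet being dropped and everything after Bob's first send getting through.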
In the terms of "Symmetric NAT Traversal using STUN" (2003), by sending a packet to Bob, Alice is creating a door for Bob in A. On the other side, by sending a packet to Alice, Bob is creating a door for Alice in B.
The trick in UDP hole punching seems to be for the routers to reuse the same NAT tunnel for different IPs - so that the port discovered by a server is the same as the port reused for direct communication.
We can talk with different IPs from a normal UDP socket (by skipping connect and using sendto), so it's kind of logical that a tunneled socket would be able to do the same.
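As a small illustration of the socket side of this, the sketch below sends from one unconnected UDP socket to two different destinations with `sendto`, and both receivers see the same source endpoint. It runs on loopback, where there is no NAT, so it only shows the socket behaviour, not what a router does with the mapping. The helper names `make_udp` and `demo` are made up.

```python
import socket


def make_udp(timeout=2.0):
    """Create a UDP socket on loopback with an OS-chosen port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    s.settimeout(timeout)
    return s


def demo():
    alice, server, bob = make_udp(), make_udp(), make_udp()
    try:
        # One unconnected socket, two different destinations.
        alice.sendto(b"hello server", server.getsockname())
        alice.sendto(b"hello bob", bob.getsockname())
        _, seen_by_server = server.recvfrom(1024)
        _, seen_by_bob = bob.recvfrom(1024)
        # Both receivers see the same source endpoint for Alice.
        return alice.getsockname(), seen_by_server, seen_by_bob
    finally:
        for s in (alice, server, bob):
            s.close()
```

Whether a NAT keeps the same public port for both destinations depends on the router, which is exactly the condition hole punching relies on.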

Related

Understanding of WebSockets

My understanding is that a socket corresponds to a network identifier, port and TCP identifier. [1]
Operating systems enable a process to be associated with a port (which IIUC is a way of making the process addressable on the network for inbound data).
So a WebSocket server will typically be associated with a port well-known for accepting and understanding HTTP for the upgrade request (like 443) and then use TCP identifiers to enable multiple network sockets to be open concurrently for a single server process and a single port.
Please can someone confirm or correct my understanding?
[1] "To provide for unique names at each TCP, we concatenate a NETWORK identifier, and a TCP identifier with a port name to create a SOCKET name which will be unique throughout all networks connected together." https://www.rfc-editor.org/rfc/rfc675
When a client connects to your server on a given port, the client connection is coming from an IP address and a client-side port number. The client-side port number is automatically generated by the client and will be unique for that client. So, you end up with four items that make a connection.
Server IP address (well known to all clients)
Server port (well known to all clients)
Client IP address (unique for that client)
Client port (dynamically unique for that client and that socket)
So, it is the combination of these four items that makes a unique TCP connection. If the same client makes a second connection to the same server and port, then that second connection will have a different client port number (each connection a client makes will be given a different client port number) and thus the combination of those four items above will be different for that second client connection, allowing its traffic to be completely separate from the first connection that client made.
So, a TCP socket is a unique combination of the four items above. To see how that is used, let's look at how some traffic flows.
After a client connects to the server and a TCP socket is created to represent that connection, then the client sends a packet. The packet is sent from the client IP address and from the unique client port number that that particular socket is using. When the server receives that packet on its own port number, it can see that the packet is coming from the client IP address and from that particular client port number. It can use these items to look up in its table and see which TCP socket this traffic is associated with and trigger an event for that particular socket. This separates that client's traffic from all the other currently connected sockets (whether they are other connections from that same client or connections from other clients).
Now, the server wants to send a response to that client. The packet is sent to the client's IP address and client port number. The client TCP stack does the same thing. It receives the packet from the server IP/port and addressed to the specific client port number and can then associate that packet with the appropriate TCP socket on the client so it can trigger an event on the right socket.
All traffic can uniquely be associated with the appropriate client or server TCP socket in this way, even though many clients may connect to the same server IP and port. The uniqueness of the client IP/port allows both ends to tell which socket a given packet belongs to.
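You can see the four items for yourself with a short loopback experiment. This is only a sketch: `four_tuple` and `demo` are names I made up, and everything runs in one process for convenience. Two connections from the same client to the same server port end up with different client ports.

```python
import socket


def four_tuple(conn):
    """Return (local_ip, local_port, remote_ip, remote_port) for a connected TCP socket."""
    return conn.getsockname() + conn.getpeername()


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(5)
    listener.settimeout(2.0)
    server_addr = listener.getsockname()
    # The same client machine connects twice to the same server IP and port.
    clients = [socket.create_connection(server_addr, timeout=2.0) for _ in range(2)]
    accepted = [listener.accept()[0] for _ in range(2)]
    try:
        # Seen from the server: same local IP/port, different remote (client) ports.
        return server_addr, [four_tuple(c) for c in accepted]
    finally:
        for s in clients + accepted + [listener]:
            s.close()
```

The two tuples returned by `demo()` share the server IP and port and differ only in the client port, which is what lets the TCP stack keep the connections apart.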
webSocket connections start out with an HTTP connection (which is a TCP socket running the HTTP protocol). That initial HTTP request contains an "upgrade" header requesting the server to upgrade the protocol from HTTP to webSocket. If the server agrees to the upgrade, then it returns a response that indicates that the protocol will be changed to the webSocket protocol. The TCP socket remains the same, but both sides agree that they will now speak the webSocket protocol instead of the HTTP protocol. So, once connected, you then have a TCP socket where both sides are speaking the webSocket protocol. This TCP connection uses the same logic described above to remain unique from other TCP connections to the same server.
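For the curious, part of the server's "I agree to upgrade" response is a value derived from a key the client sent in its request. The sketch below computes it the way RFC 6455 describes (SHA-1 of the key plus a fixed GUID, then base64). The function names are mine, and a real server would also validate the request headers before answering.

```python
import base64
import hashlib

WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"  # fixed value defined by RFC 6455


def accept_value(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_response(sec_websocket_key):
    """Build the HTTP response that switches the TCP connection to webSocket."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {accept_value(sec_websocket_key)}\r\n"
        "\r\n"
    )
```

With the sample key from the RFC, `dGhlIHNhbXBsZSBub25jZQ==`, `accept_value` returns `s3pPLMBiTxaQ9kYGzzhZRbK+xOo=`. After this response is sent, the same TCP socket carries webSocket frames instead of HTTP.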
In this manner, you can have a single server on a single port that works for both HTTP connections and webSocket connections. All connections to that server start out as HTTP connections, but some are converted to webSocket connections after both sides agree to change the protocol. The HTTP connections that remain HTTP connections will be typical request/response and then the socket will be closed. The HTTP connections that are "upgraded" to the webSocket protocol will remain open for the duration of the webSocket session (which can be long lived). You can have many concurrent open webSocket connections that are all distinct from one another while new HTTP connections are regularly serviced all by the same server. The TCP logic above is used to keep track of which packets to/from the same server/port belong to which connection.
FYI, you may have heard about NAT (Network Address Translation). This is commonly used to allow private networks (like a home or corporate network) to interface to a public network (like the internet). With NAT, a server may see multiple clients as having the same client IP address even though they are physically different computers on a private network. With NAT, multiple computers are routed through a common IP address, but NAT still guarantees that the client IP address and client port number are a unique combination, so the above scheme still works. When using NAT, an incoming packet destined for a particular client arrives at the shared IP address. The IP/port is then translated to the actual client IP address and port number on the private network and the packet is forwarded to that device. The server is generally unaware of this translation and packet forwarding. Because the NAT server still maintains the uniqueness of the client IP/client port combination, the server's logic still works just fine even though it appears that many clients are sharing a common IP address. Note, home network routers are usually configured to use NAT since all computers on the home network will "share" the one public IP address that your router has when accessing the internet.
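The translation described above is essentially a lookup table. Here is a toy version, purely as an illustration: the class name `NatTable`, the starting port 40000 and the addresses are all invented, and a real NAT also tracks the protocol, the remote endpoint and timeouts.

```python
class NatTable:
    """Toy NAT: maps (private_ip, private_port) to a unique port on one public IP."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}   # (private_ip, private_port) -> public_port
        self.back = {}  # public_port -> (private_ip, private_port)

    def translate_outbound(self, private_ip, private_port):
        """Return the public (ip, port) the server will see for this client socket."""
        key = (private_ip, private_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.out[key])

    def translate_inbound(self, public_port):
        """Return the private (ip, port) for a reply, or None if no mapping exists."""
        return self.back.get(public_port)
```

Two computers that both use local port 5000 get different public ports, so the server still sees a unique client IP/port pair for each of them.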
You will not enable multiple sockets; there is no need for it. You will have multiple connections. It's a little different, but you understand it well. For UDP there's nothing to do, because there are no connections.
In TCP, if two different machines connect to the same port on a third machine, there are two distinct connections because the source IPs differ. If the same machine (or two machines behind NAT or otherwise sharing the same IP address) connects twice to a single remote end, the connections are differentiated by source port: the same machine cannot open two connections from the same source port to the same remote IP and port.

Lua Networking - Passing data through a 'closed' port

This might be a bit weird to explain, but I'll try my best.
I have a Lua program that's intended to serve some data through the network. Specifically, the internet. The data the program is actually transmitting are only strings stored within UDP packets. Generalized, this is how the program operates:
The first client launches the program and specifies that they are the 'host' of the connection. The program opens a connection on UDP port 6000 and the main loop listens for any packets received on said port.
The second client launches the program and specifies that they are to connect to the 'host' on port 6000. The user enters the IP, and the client opens a UDP connection using a random port between 6050 and 7000.
When the client successfully connects to the server, they send a 'connection' packet, simply containing a '202 OK' string. The 'host' receives this and registers the new client.
Now that the connection has been initialized, the programs can send data between each other using the registered data.
Now, on a local network this program works fine. The purpose of the 'host' mode is to have multiple clients connect to the host and have the host relay packets from one client to all the currently registered clients. Port selections are arbitrary, and random port selection from the client was simply to allow debugging and testing from a single computer. This has been tested between two and more computers on a physical network, and worked successfully. However, when I attempt to run this over the internet it's a no go. I know that the ports are closed and that's why it's not working. But seeing as I'm going to be distributing this program (privately) I can't expect every person to open ports on their router (or know how to). Security is not currently a concern with the program, and should be disregarded in the current state. That being said, I recognise there's the potential for a lot to go wrong with the use of this program through the network and I accept that. Onto the main question: how can I have the host and client communicate over the internet without having to open ports?
I'll elaborate - for example, browsers. Although the technology is quite different to what I'm doing, it's easier to paint a picture - the browser requests data from a web server, and it gets sent back to the client. But wait, if the router is in its default state (I hope) all the ports are closed? So how does the client receive this data if the port is closed?
I hope this makes some kind of sense and I don't sound like a complete fool.
I managed to find some suitable libraries and utilities to be able to communicate through the internet (NAT traversal is now a term I am familiar with), namely those supplied with Nmap. These libraries include an implementation of STUN in Lua, among HEAPS of other useful networking-related libraries and scripts.
To actually answer my own question (very simply), the clients and servers are behind what's known as a NAT gateway. IPv4 has a limited number of addresses (a total of about 4.3 billion), and NAT gateways were introduced to work around this limitation by separating the clients' internal network from the external network - in this case the internet. The NAT is supplied with a single public IP address, and the NAT then supplies each device within the internal network with an IP that is only valid on that network. As such, the devices cannot be accessed directly without forwarding connections from the NAT gateway (generally the router) to the client. However, when a client sends UDP traffic out, the NAT gateway opens a port for that exchange, which gets closed after the traffic stops. The port that is opened can differ from the one the client specified when it opened the connection, which is where the STUN methods come in. STUN lets a client ask an outside server which public address and port its packets appear to come from, so that the other side can send data back to this port and the user can receive it. Bear in mind this is an EXTREMELY simple explanation of how the technology works, and I'd suggest reading up on the Wiki and some of the Request for Comments for STUN.
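To illustrate the idea behind STUN (not the actual STUN wire format), here is a sketch in Python of a tiny "reflector": a server that answers each datagram with the source address it observed. All names here are made up, the text reply format is an assumption for the example, and it runs on loopback, so the observed address is simply the client's own.

```python
import socket


def udp_socket():
    """Create a UDP socket on loopback with an OS-chosen port."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))
    s.settimeout(2.0)
    return s


def reflect_once(server):
    """Reply to one datagram with the sender's address as the server observed it."""
    _, observed = server.recvfrom(1024)
    server.sendto(f"{observed[0]}:{observed[1]}".encode("ascii"), observed)


def discover_endpoint(client, server):
    """Ask the reflector which (ip, port) our packets appear to come from."""
    client.sendto(b"who am i", server.getsockname())
    reflect_once(server)  # in a real deployment this runs on a remote machine
    reply, _ = client.recvfrom(1024)
    host, port = reply.decode("ascii").rsplit(":", 1)
    return host, int(port)
```

Behind a NAT gateway, the reply would contain the public IP and the port the gateway opened, which is the information the other side needs in order to send data back.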

What does it mean to connect to a certain port?

For example, when you make an ssh connection, you are connected to port 22. What happens then? On a very high level brief overview, I know that if port 22 is open on the other end and if you can authenticate to it as a certain user, then you get a shell on that machine.
But I don't understand how ports tie into this model of services and connections to different services from remote machines? Why is there a need for so many specific ports running specific services? And what exactly happens when you try to connect to a port?
I hope this question isn't too confusing due to my naive understanding. Thanks.
Imagine your server as a house with 65536 doors. If you want to visit family "HTTP", you go to door 80. If you were to visit family "SMTP", you would visit door no. 25.
Technically, a port is just one of multiple possible endpoints for outgoing/incoming connections. Many of the port numbers are assigned to certain services by convention.
Opening/establishing a connection means (when the transport protocol is TCP, as it is for most of the “classical” services like HTTP, SMTP, etc.) that you are performing a TCP handshake. With UDP (used for things like streaming and VoIP), there's no handshake.
Unless you want to understand the deeper voodoo of IP networks, you could just say, that's about it. Nothing overly special.
TCP-IP ports on your machine are essentially a mechanism to get messages to the right endpoints.
Each of the 65536 possible ports (port numbers are 16 bits) falls into one of the ranges designated by the Internet Assigned Numbers Authority (IANA).
"But I don't understand how ports tie into this model of services and connections to different services from remote machines? Why is there a need for so many specific ports running specific services? ... And what exactly happens when you try to connect to a port?"
Think of it this way: How many applications on your computer communicate with other machines? Web browser, e-mail client, SSH client, online games, etc. Not to mention all of the stuff running under the hood.
Now think: how many physical ports do you have on your machine? Most desktop machines have one. Occasionally two or three. If a single application had to take complete control over your network interface nothing else would be able to use it! So TCP ports are a way of turning 1 connection into 65536 connections.
"For example, when you make an ssh connection, you are connected to port 22. What happens then?"
Think of it like sending a package. Your SSH client in front of you needs to send information to a process running on the other machine. So you supply the destination address in the form of "user@[ip or hostname]" (so that it knows which machine on the network to send it to), and "port 22" (so it gets to the right application running on the machine). Your application then packs up a TCP parcel, stamps a destination and a return address on it and sends it to the network.
The network finds the destination computer and delivers the package. So now it's at the right machine, but it still needs to get to the right application. What do you think would happen if your SSH packet got delivered to an e-mail client? That's what the port number is for. It effectively tells your computer's local TCP mailman where to make the final delivery. Then the application does whatever it needs to with the data (such as verify authentication) and sends a response packet using your machine's return address. The back and forth continues as long as the connection is active.
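Here is a small sketch of that "final delivery" step. Two listeners stand in for two services on the same machine, and the OS hands each incoming connection to whichever one is bound to the destination port. The names are made up, the ports are chosen by the OS rather than being 22 or 25, and everything runs on loopback in one process.

```python
import select
import socket


def listen_on_free_port():
    """Start a TCP listener on loopback with an OS-chosen port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    return listener


def deliver(port, listeners):
    """Connect to `port` and report which named listener received the connection."""
    client = socket.create_connection(("127.0.0.1", port), timeout=2.0)
    # Wait until at least one listener has a pending connection.
    readable, _, _ = select.select(list(listeners.values()), [], [], 2.0)
    names = [name for name, sock in listeners.items() if sock in readable]
    for sock in readable:
        sock.accept()[0].close()
    client.close()
    return names
```

If you create two listeners, say "ssh-like" and "mail-like", a connection to the first one's port is only ever seen by the first one, even though both share the same IP address.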
Hope that helps. :)
A port is meant to allow applications on TCP/IP to exchange data. Each machine on the internet is identified by its IP address. The port allows different applications on one machine to send and receive data with multiple servers on the network/internet. Common applications like FTP and HTTP servers communicate on default ports like 21 and 80 unless network administrators change those default ports for security reasons.

Methods for p2p transfer behind firewalls and NATs on multiple device types

I'm building a system that relies on a central server to send the IP address and port of the first user (on mobile or desktop app) to a second user (on mobile or desktop app). The second user establishes a P2P encrypted connection with the first user, using the IP address and port sent by the central server, to send a large file directly (ideally, the actual file doesn't pass through the central server).
This system needs to work even if the users are behind different firewalls / NATs and on mobile or desktop devices, without requiring users to manually open ports.
I've been looking into NAT Traversal Protocol (Teredo IPv6), libjingle (Google's open source suite), STUN, direct socket connections, and direct VPNs between the users.
I'm not sure whether I'm approaching this correctly. Would all of these options solve this problem independently? Or am I approaching this wrong? Would direct IPv6 connections work straight away, even behind IPv4 routers?
A P2P connection is not always guaranteed to succeed. It can fail for the following reasons:
1) Two peers are behind symmetric NAT. (Although Teredo works if one peer is behind symmetric NAT.)
2) UDP is blocked.
3) The peer is behind a proxy.
4) Double NAT scenarios.
There are three types of IPv6 address: link-local, private (unique local) and global. Two peers can connect directly over the internet if they both have global addresses (firewalls permitting). Global unicast addresses are currently allocated from 2000::/3, so they start with 2 or 3. If you're building a P2P system, you should have a fallback mechanism in which the central server relays the data between the peers. This way you can make your application reliable and at the same time make connections faster for most peers by using P2P.
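If you want to check which kind of address a peer reported, Python's standard `ipaddress` module can do it. The sketch below uses explicit prefixes (fe80::/10 via `is_link_local`, fc00::/7 for unique local, 2000::/3 for global unicast); the function name `classify` and the labels are my own, and it does not tell you whether an address is actually reachable.

```python
import ipaddress

UNIQUE_LOCAL = ipaddress.ip_network("fc00::/7")     # "private" IPv6 addresses
GLOBAL_UNICAST = ipaddress.ip_network("2000::/3")   # currently allocated global range


def classify(text):
    """Return a rough category for an IPv6 address given as a string."""
    addr = ipaddress.IPv6Address(text)
    if addr.is_link_local:
        return "link-local"
    if addr in UNIQUE_LOCAL:
        return "unique-local"
    if addr in GLOBAL_UNICAST:
        # Note: a few special-purpose blocks (e.g. documentation prefixes) also fall here.
        return "global"
    return "other"
```

A central server could use a check like this to decide whether to try a direct IPv6 connection first or go straight to another method or the relay fallback.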

Communicating between networks using sockets

I have a question about network connections among computers.
I've made some applications where messages pass through the Internet (via sockets) to make a connection between two devices. However, a strong condition is that two devices must be connected to the same network.
Can anyone give me a trick for creating communication using sockets between two computers even if they are connected to different networks?
Thank you in advance.
Here is a great tutorial on how to use sockets and general networking (in Java): http://www.thenewboston.org/watch.php?cat=25&number=38
In order to communicate between two different networks over the internet, you will need to do something called port forwarding. With port forwarding, when the public IP of your network receives a packet with a specific port number, the router knows which local IP to send that packet to.
If you don't port forward and you receive some data, the router doesn't know where to send the packet. Therefore it discards it, which means others won't be able to connect to you.
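As a rough picture of what the router does with an unsolicited inbound packet, think of the port forwarding rules as a small table. This is only a toy sketch in Python: `route_inbound`, the port 6000 and the LAN address are made-up examples, not settings from any particular router.

```python
def route_inbound(forward_rules, dest_port):
    """Return the LAN (ip, port) a packet is forwarded to, or None if it is discarded."""
    return forward_rules.get(dest_port)


# Hypothetical rule: public port 6000 goes to the PC hosting the server.
rules = {6000: ("192.168.1.20", 6000)}

forwarded = route_inbound(rules, 6000)   # matches the rule, so it reaches the server PC
discarded = route_inbound(rules, 6001)   # no rule for this port, so the router drops it
```

Only the network hosting the server needs such a rule; the client's outgoing connection creates its own temporary mapping on its router.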
You will only need to port forward on the network with your server (using the example I linked). You do that by logging in to your router and setting the port that the server uses to be forwarded to the IP of the PC hosting the server.
On the other network (client) you will need to change the IP address that the client connects to. That IP address needs to be the public IP of your server's network. You can find it by connecting to the server's network and going to: http://www.whatsmyip.org/ . Keep in mind that public IP addresses may change over time.
Hope this helped!
-Kad