How do BitTorrents connect with eachother? - router

I was just downloading a new distro of linux using uTorrent, and started to wonder how uTorrent (and other bittorrents) send files to eachother through NAT routers? They obviously use the trackers to get introduced, but how do they pass info to eachother?
Is there a whitepaper on this? I couldn't find one :/
Thanks

Most of the time, they don't. I have a restricted network, and every time I run my torrent program it warns me that some of the ports/functionality required is not available to me.
If one party has a restricted network and another has an open network, the restricted client will always connect to the open client. If you have two restricted clients they will not be able to connect to each other. The reason it works at all is that most (enough) of the people on the torrent network do have some kind of port forwarding or UPNP (universal plug and play) to facilitate this.

Torrent clients work on the basis of what are known as Distributed Hash Tables. They start off with a set of known roots, and branch out looking for other, connected nodes (i.e., neighbours). Establish connections to them, and keep this up, up to a set limit. Since the client is initiating the connection, all the remote has to do is feed the data back, and you get it through the NAT just fine. It's how network traffic works.

Related

How exactly do p2p networks connect?

If I establish a connection with a friend on Skype, the audio and video data does not go through Microsoft but directly. I also have a p2p client X that does a similar thing. I do not fully understand how this happens internally. How does a machine establish the connection with the other if there is no direct identifier such as a public IP address? Multiple computers in the same network can each do p2p or Skype calls at the same time.
I have been wondering about this for a week because I want to connect two Nodes with each other (like a socket server/client). Can you point me in the right direction?
If two clients want to connect, but neither knows the address of the other, some sort of intermediary that they both know has to be in place to help set up the connection. They both contact the intermediary with the desire to connect, the intermediary tells them both what the others direct address is, and the connection can be set up directly.
Sometimes (read: always) one or both machines share a public IP with others behind NAT, and NAT traversal techniques are needed to establish end-to-end connectivity, usually some form of ICE or SIP.

Connect sockets directly after introduction through server

I'm looking for the name of a protocol and example code that permits handing off IP/port connections to establish unmediated P2P after introduction through a server.
Simple example:
You and I both start chat programs that connect to chatintroduce.com (fictional server). I send you a "Hi! Wanna chat?" message. It doesn't get sent. Instead my chat program tells chatintroduce to send your chat program a request for connection. You respond to a prompt and your chat program tells chatintroduce to broker the connection. Chatintroduce establishes an initial two-way connection between us. Now, this final step is important, chatintroduce releases control and our two chat programs now talk directly to each other without any traffic through chatintroduce.
In other words, I construct packets which have your IP address and you receive them without interference from firewalls, NATs or any other technologies. In other words, true peer-to-peer connection independent of intermediate server.
I need to know what search terms to use to find appropriate technology. An RFC name would suffice. I've been searching for days without success.
I think what you are looking for is TCP/UDP hole punching which typically coordinates the P2P connection using a STUN server to determine the "capabilities" of the firewalls (e.g. is it a full cone nat? symmetric?).
https://en.wikipedia.org/wiki/Hole_punching_(networking)
We employed this at a company I worked for to create a kind of BitTorrent that could circumvent firewalls for streaming video between two peers.
Note that sometimes it is NOT possible to establish a connection without the intermediary.
What you are looking for is ICE protocol. RFC 5245. This protocol is used for connecting two peers through NAT traversal. There are some open source libraries and also some proprietary libraries for this. You can search google with ICE implementation.
You will also need to read about some additional protocols. These are used with ICE protocol. They are STUN and TURN.
For some cases you can't make P2P call 100% time. You will have to use a relay server. Like if the NAT combination of two peers are Symmetric vs Symmetric/PRC. That relay server is called TURN server.
Some technique like Port forwarding and TCP/UDP hole punching will help you to increase P2P rates.
See this answer for more information about which combination of NAT will require a relay server and which don't.
Thank you. I will be looking further into ICE, STUN, TURN, and hole-punching.
I also found n2n which looks like almost exactly what I wanted.
https://github.com/meyerd/n2n
http://xmodulo.com/configure-peer-to-peer-vpn-linux.html
With n2n, one makes a VPN with a super node that all other edge nodes know.
But once the introductions are made, the super node can be absent.
This was exactly what I wanted. I hope it works across platforms (linux, MacOS, Windows).
Again, I am still researching before implementation, so your advice was very important to me.
Thank you.
Use PJNATH. Its open source.
http://www.pjsip.org/pjnath/docs/html/
There is not much open source on NAT Traversal. As far as I know PJNATH is good.
For server you can use Google's Open source STUN and TURN server.

Socket programming via two network interface simuntaneoulsy

I want to have two tcp connection in a single machine via socket programming, But this two connection should connect to two different network interfaces. One is say my 3g dongle and the other is wifi modem. But is it possible for a single machine(OS) to have two connection active at a time? If possible how to program the tcp connection via socket programming?
This can definitely be done, if you just create two programs and run each of them, they will both be able to communicate over their respective network. When you run a program, the operating system creates a process dedicated to running that program, which is assigned time on the CPU by the scheduling algorithm in the OS. So long as your CPU can keep up with any processing associated with the networks, they will both be able to run simultaneously.
You make no mention of your plans for this, but be aware that I/O times can also limit your speeds. If you're using an older computer, it may not be able to transmit a lot of data very quickly due to an out-dated (or just low powered) network card.
Next time try to research your question first, information about this can be found with relative ease using any popular search engine, including the search bar at the top of this page. Also read this, or one of the other several help articles about asking questions well, that are available from the page you had to go through before asking the question.

Where would I learn more about interpreting network packets?

I'm working on a personal project. It's to recreate server software for the game "Chu Chu Rocket" for the Sega Dreamcast. Its' servers went down in 2004 I believe. My approach is to use dnsmasq to change the originl hostname that the game originally connected to, to my own system. With a DC-PC server set up, I have done just that, now instead of it looking up a non-existent dns record, it connects to my computer which will eventually run the server software. I've used tshark (cli wireshark) to capture what's going on between the client (dreamcast) and the server (my computer). The problem is, I'm getting data, but I'm not sure how to interpret it, I don't know what it's saying, but I'm sure it can be done because private PSO servers were created, those are far more complex.
Very simply, where would I go about learning how to interpret data packets, and possibly creating packets that will respond to such queries from the client?
Thanks,
Dragos240
If you can get the source code for the server software on your PC, then that is the best place to look.
Otherwise, all you can do is look at the protocol, compare runs, and make notes of similarities and differences. With any luck, the protocol won't be encrypted.

Can socket connections be multiplexed?

Is it possible to multiplex sa ocket connection?
I need to establish multiple connections to yahoo messenger and i am looking for a way to do this efficiently without having to hold a socket open for each client connection.
so far i have to use one socket for each client and this does not scale well above 50,000 connections.
oh, my solution is for a TELCO, so i need to at least hit 250,000 to 500,000 connections
i'm planing to bind multiple IP addresses to a single NIC to beat the 65k port restriction per IP address.
Please i would any help, insight i can get.
**most of my other questions on this site have gone un-answered :) **
Thanks
This is an interesting question about scaling in a serious situation.
You are essentially asking, "How do I establish N connections to an internet service, where N is >= 250,000".
The only way to do this effectively and efficiently is to cluster. You cannot do this on a single host, so you will need to be able to fragment and partition your client base into a number of different servers, so that each is only handling a subset.
The idea would be for a single server to hold open as few connections as possible (spreading out the connectivity evenly) while holding enough connections to make whatever service you're hosting viable by keeping inter-server communication to a minimum level. This will mean that any two connections that are related (such as two accounts that talk to each other a lot) will have to be on the same host.
You will need servers and network infrastructure that can handle this. You will need a subnet of ip addresses, each server will have to have stateless communication with the internet (i.e. your router will not be doing any NAT in order to not have to track 250,000+ connections).
You will have to talk to AOL. There is no way that AOL will be able to handle this level of connectivity without considering cutting your connection off. Any service of this scale would have to be negotiated with AOL so both you and they would be able to handle the connectivity.
There are i/o multiplexing technologies that you should investigate. Kqueue and epoll come to mind.
In order to write this massively concurrent and teleco grade solution, I would recommend investigating erlang. Erlang is designed for situations such as these (multi-server, massively-multi-client, massively-multithreaded telecommunications grade software). It is currently used for running Ericsson telephone exchanges.
While you can listen on a socket for multiple incoming connection requests, when the connection is established, it connects a unique port on the server to a unique port on the client. In order to multiplex a connection, you need to control both ends of the pipe and have a protocol that allows you to switch contexts from one virtual connection to another or use a stateless protocol that doesn't care about the client's identity. In the former case you'd need to implement it in the application layer so that you could reuse existing connections. In the latter case you could get by using a proxy that keeps track of which server response goes to which client. Since you're connecting to Yahoo Messenger, I don't think you'll be able to do this since it requires an authenticated connection and it assumes that each connection corresponds to a single user.
You can only multiplex multiple connections over a single socket if the other end supports such an operation.
In other words it's a function protocol - sockets don't have any native support for it.
I doubt yahoo messenger protocol has any support for it.
An alternative (to multiple IPs on a single NIC) is to design your own multiplexing protocol and have satellite servers that convert from the multiplex protocol to the yahoo protocol.
I'll trow in another approach you could consider (depending on how desperate you are).
Note that operating system TCP/IP implementations need to be general purpose, but you are only interested in a very specific use-case. So it might make sense to implement a cut-down version of TCP/IP (which only handles your use-case, but does that very well) in your application code.
For example, if you are using Linux, you could route a couple of IP addresses to a tun interface and have your application handle the IP packets for that tun interface. That way you can implement TCP/IP (optimised for your use-case) entirely in your application and avoid any operating system restriction on the number of open connections.
Of course, it's quite a bit of work doing the TCP/IP yourself, but it really depends on how desperate you are - i.e. how much hardware can you afford to throw at the problem.
500,000 arbitrary yahoo messenger connections - is your telco doing this on behalf of Yahoo? It seems like whatever solution has been in place for many years now should be scalable with the help of Moore's Law - and as far as I know all the IM clients have been pretty effective for a long time, and there's no pressing increase in demand that I can think of.
Why isn't this a reasonable problem to address with hardware plus traditional solutions?