How efficient are BSD sockets for writing a server-client application on iPhone?

I am creating a server-client application for iPhone. I want two applications on the same network to communicate with each other.
For this I am planning to use sockets. How efficient are BSD sockets on the iPhone?
Is there any other option available for implementing the same functionality?
Thanks,
Jim.

See this thread on the iPhone Dev SDK website.
The CF networking stuff is a bit confusing and hard to wrap your head around. But it's just a set of functions that use BSD sockets and integrate them with the run loop so you don't have to create threads. You can still use BSD sockets yourself.
Basically, the thread points out multiple libraries / frameworks which integrate well with the iPhone environment, and using any of them instead of straight BSD sockets probably won't make any significant performance difference. Unless you're really comfortable with low-level socket programming, you're probably better off with one of the libraries.
Don't do premature optimization - use whatever socket interface you are most comfortable with and which will help you get the job done quickly and produce clear, maintainable code.
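For example, a run-loop-driven client along the lines the quote describes might look roughly like this with CFStream. This is just a sketch: the host name "example.local" and port 5000 are placeholders, and error handling is omitted.

```c
#include <CoreFoundation/CoreFoundation.h>
#include <CFNetwork/CFNetwork.h>

/* Called by the run loop when the stream has bytes available or errors out. */
static void ReadCallback(CFReadStreamRef stream, CFStreamEventType event, void *info) {
    if (event == kCFStreamEventHasBytesAvailable) {
        UInt8 buf[1024];
        CFIndex n = CFReadStreamRead(stream, buf, sizeof(buf));
        if (n > 0) {
            /* handle n bytes in buf */
        }
    }
}

void ConnectOnRunLoop(void) {
    CFReadStreamRef readStream = NULL;
    CFWriteStreamRef writeStream = NULL;

    /* "example.local" and 5000 are placeholders for your peer's host and port. */
    CFStreamCreatePairWithSocketToHost(kCFAllocatorDefault,
                                       CFSTR("example.local"), 5000,
                                       &readStream, &writeStream);

    CFStreamClientContext ctx = {0, NULL, NULL, NULL, NULL};
    CFReadStreamSetClient(readStream,
                          kCFStreamEventHasBytesAvailable | kCFStreamEventErrorOccurred,
                          ReadCallback, &ctx);

    /* Schedule on the current run loop: no extra thread needed. */
    CFReadStreamScheduleWithRunLoop(readStream, CFRunLoopGetCurrent(), kCFRunLoopCommonModes);
    CFReadStreamOpen(readStream);
    CFWriteStreamOpen(writeStream);
}
```

Underneath, this is still a BSD socket; the CF layer just hands readiness events to your run loop instead of making you block or spawn a thread.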
EDIT
In response to Jim's question below:
Yes. There are a few factors that determine the system-wide and per-process socket limits. Take a look at this article for a discussion of these issues. iPhone OS and Linux are both Unix-based, so they probably share some of these admin-related socket limitations, but you'll have to look up the system-specific details.
Second, there are limits imposed by the architecture of UDP and TCP. Both are limited to 2^16 listening sockets per machine IP address, since a listening socket is defined by a fixed 32-bit IP address and a 16-bit port number. However, since a connected socket is defined by the 4-tuple [src IP, src port, dst IP, dst port], the number of connected sockets you can theoretically have on a single machine IP is far higher, something like 2^64, although in practice your OS would choke long before you hit that limit.

Related

Is using Erlang's gen_tcp a scalable way to construct a high-traffic socket server?

I am trying to learn Erlang to do some simple but scalable network programming. I basically want to write a program that does what servers on the backbone of the internet do--but on a smaller scale. I want to try to set up an intranet with web accessible servers which would act as gateways to the intranet [sic] and route data to connected clients and/or other gateways.
The high traffic would come from the fact that data would not only flow from client to gateway to client, but might have to bounce around a few gateways to get to the destination (like how data travels on the internet). This means that the gateways would have to not only handle traffic from their clients, but traffic from other gateways' clients.
I figured this would lead to unusually high levels of traffic, even for a medium number of clients and gateways.
Coming from a background in Python and, to a lesser extent, other scripting languages, I am used to digging for a customized module to solve my problems. I understand Erlang is designed for high traffic network programming, but all I could find in terms of libraries/modules for this kind of thing was gen_tcp.
Does this mean that Erlang is already so optimized for this kind of thing that you can fire it up with its most basic modules and expect it to scale nicely?
You can expect gen_tcp to perform extremely well, even under conditions of massive load. If you are just going to pass around data and not process it much, then my guess is you will be able to scale quite nicely - effectively you will just be passing around pointers.
All of the known scalable solutions written in Erlang use gen_tcp:
Cowboy, Mochiweb, Yaws, ...
Riak
Etorrent
RabbitMQ
and so on. One hint worth mentioning when using it: make sure you run erl as erl +K true so you get access to kernel polling, that is, epoll() on Linux, kqueue()/kevent() on BSD, and /dev/poll on Solaris. Also note that you can set options on TCP ports for buffer sizes and so on. Finally, for certain types of packets you can have the C layer parse the packet for you; see erl -man inet and the setopts/2 call. An example is {packet, 4}, which is quite popular.
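For readers who haven't met it, {packet, 4} means each message on the wire is a 4-byte big-endian length header followed by that many payload bytes, and the runtime strips the header for you. A rough C sketch of reading one such frame by hand shows what you avoid writing (fd is assumed to be a connected TCP socket):

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <arpa/inet.h>   /* ntohl */

/* Keep calling recv() until exactly len bytes have arrived (or the peer closes). */
static int recv_all(int fd, void *buf, size_t len) {
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0) return -1;   /* error or connection closed */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read one {packet,4}-style frame: 4-byte big-endian length, then the payload. */
char *read_frame(int fd, uint32_t *out_len) {
    uint32_t netlen;
    if (recv_all(fd, &netlen, sizeof(netlen)) < 0) return NULL;
    uint32_t len = ntohl(netlen);

    char *payload = malloc(len);
    if (payload == NULL) return NULL;
    if (recv_all(fd, payload, len) < 0) { free(payload); return NULL; }

    *out_len = len;
    return payload;
}
```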
In general, Erlang has a quite fast I/O sublayer. You can expect it to perform really quickly, even for large complex interactions.

Would we see any speedup using ZeroMQ instead of TCP Sockets if the two processes communicating are on the same machine?

I understand that 0MQ is supposed to be faster than TCP Sockets in a clustered environment and I can see where that would be the case (I think that's what they're referring to when they say "Faster than TCP, for clustered products and supercomputing" on the 0MQ website). However, will I see any kind of speedup using 0MQ instead of TCP sockets to communicate between two processes running on the same machine?
Well, the short version is give it a try.
The slightly longer version is that writing TCP sockets yourself can be hard: there are a lot of things that can subtly go wrong, whereas 0MQ guarantees the message will be delivered in its entirety. It is also written by experts in network sockets, which, with the best will in the world, you probably aren't, and they use a few advanced tricks to speed things along.
You are not actually running on one machine because the VM is treated as a separate machine. This means that TCP sockets have to run through the whole network stack and cannot take shortcuts like they do when you communicate between processes on one machine.
However, you could try UDP multicast under ZeroMQ to see if that speeds up your application. UDP is less reliable on a wide area network, but in a closed environment of a VM talking to its host, you can safely skip all the TCP reliability stuff.
I would guess the ipc:// transport should be faster than tcp://. If you are willing to move to a single process, inproc:// is definitely going to be much faster.
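If you do benchmark it, note that with libzmq only the endpoint string changes between transports, which makes same-machine comparisons easy. A minimal sketch, assuming a matching REP socket is bound elsewhere; the endpoint names and port are placeholders:

```c
#include <assert.h>
#include <zmq.h>

int main(void) {
    void *ctx = zmq_ctx_new();
    void *req = zmq_socket(ctx, ZMQ_REQ);

    /* Same-machine options, roughly fastest first (inproc needs a single process): */
    /* zmq_connect(req, "inproc://bench");        */
    zmq_connect(req, "ipc:///tmp/bench.sock");
    /* zmq_connect(req, "tcp://127.0.0.1:5555");  */

    zmq_send(req, "ping", 4, 0);          /* request */
    char reply[16];
    int n = zmq_recv(req, reply, sizeof(reply), 0);   /* wait for reply */
    assert(n >= 0);

    zmq_close(req);
    zmq_ctx_destroy(ctx);
    return 0;
}
```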
I think (have not tested) that the answer is false as ZMQ likely uses the same standard C lib and adds some message headers.
Same thing applies for UDP.
Same thing applies for IPC pipes.
ZMQ could be just as fast but since it adds headers it is not likely.
Now it could be a different story if you really need some sort of header and ZMQ has implemented it better than you. Like for message size or type, but I digress.

UDP for interprocess communications

I have to implement an IPC mechanism (sending short messages) between Java/C++/Python processes running on the same system. One way is to use sockets with the TCP protocol, but this requires maintaining a connection and the associated bookkeeping.
Instead I am thinking of using the UDP protocol, which does not require a connection, so I can just send messages.
My question is: does UDP on the same machine (for IPC) still have the same disadvantages it has when communicating across machines (unreliable packet delivery, out-of-order packets)?
Yes, it is still unreliable. For local communication, try named pipes or shared memory instead.
Edit:
I don't know the requirements of your application, but did you consider something like MPI (although Java is not well supported) or Thrift? (http://thrift.apache.org/)
Local UDP is still unreliable, but the major advantage is UDP multicast. You can have one data publisher and many data subscribers. The kernel does the job of delivering a copy of the datagram to each subscriber for you.
Unix local datagram sockets, on the other hand, are required to be reliable but they do not support multicast.
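As a sketch of the multicast advantage: a subscriber just binds a UDP socket and joins a group address, and the kernel fans each datagram out to every member. The group 239.0.0.1 and port 9000 below are arbitrary choices, and error checking is omitted:

```c
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    /* Bind to the group port; 9000 is an arbitrary choice. */
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(9000);
    bind(fd, (struct sockaddr *)&addr, sizeof(addr));

    /* Join the multicast group; the kernel now delivers a copy of every
     * datagram sent to 239.0.0.1:9000 to this socket as well. */
    struct ip_mreq mreq;
    mreq.imr_multiaddr.s_addr = inet_addr("239.0.0.1");
    mreq.imr_interface.s_addr = htonl(INADDR_ANY);
    setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));

    char buf[1500];
    ssize_t n = recvfrom(fd, buf, sizeof(buf), 0, NULL, NULL);
    if (n > 0)
        printf("got %zd bytes\n", n);

    close(fd);
    return 0;
}
```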
Local UDP is more unreliable than UDP on a network, like 50%+ packet drop unreliable. It is a terrible choice; kernel developers have attributed the poor quality to lack of demand.
I would recommend investigating message based middleware preferably with a BSD socket compatible interface for easy learning curve. A suggestion would be ZeroMQ which includes C++, Java and Python bindings.
Local UDP is both still unreliable and sometimes blocked by firewalls. We faced this in our MsgConnect product which uses local UDP for interthread communication. BTW MsgConnect can be an option for your task so that you don't need to deal with sockets. Unfortunately there's no Python binding, but "native" C++ and Java implementations exist.

Understanding socket basics

I've been reading up on basic network programming, but I am having a difficult time finding a straightforward explanation of what exactly a socket is and how it relates to either the OSI or TCP/IP stack.
Can someone explain to me what a socket is? Is it a programmer- or API-defined data structure, or is it a hardware device on a network card?
What layers of the mentioned network models deal with "raw" sockets? Transport layer? Network layer?
In terms of the data they pass, are sockets text-based or binary?
Is there an alternative to sockets-based network programming? Or do all networked applications use some form of socket?
If I can get this much I should have a pretty clear understanding of everything else I'm reading. Thanks for any help!
Short answers:
A socket is an abstraction of an IP connection endpoint, so if you think of it as an API-defined data structure, you are not far off. Please do read http://en.wikipedia.org/wiki/Internet_socket
The internet layer, i.e. the IP protocol. In practice you usually use sockets explicitly bound to certain transport-layer parameters (datagram/UDP or stream/TCP).
Sockets send data in network byte order; whether it is text or binary depends on the upper-layer protocol.
Theoretically, probably yes, but in practice almost all IP traffic is done using sockets.
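If it helps to see the abstraction in code, here is a minimal sketch of the lifecycle the answers describe: create a socket, connect it to an address/port pair, exchange raw bytes, close it. The address 127.0.0.1:8080 is just a placeholder.

```c
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    /* 1. Ask the OS for a TCP endpoint (the "socket"). */
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    /* 2. Describe the remote endpoint; 127.0.0.1:8080 is a placeholder. */
    struct sockaddr_in peer;
    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(8080);                 /* port in network byte order */
    inet_pton(AF_INET, "127.0.0.1", &peer.sin_addr);

    /* 3. Connect, exchange raw bytes, close. What those bytes mean
     *    (text, binary, HTTP, ...) is up to the application protocol. */
    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) == 0) {
        send(fd, "hello\n", 6, 0);
        char buf[256];
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n > 0)
            printf("received %zd bytes\n", n);
    }
    close(fd);
    return 0;
}
```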
Socket is a software mechanism provided by the operating system. Like its name implies, you can think of it like an "electrical outlet" or some electrical connector, even though socket is not a physical device, but a software mechanism. In real world when you have two electrical connectors, you can connect them with a wire. In the same way in network programming you can create one socket on one computer and another socket on another computer and then connect those sockets. And when you write data to one of them, you receive it on the other one. There are also a few different kinds of sockets. For example if you are programming a server software, you want to have a listening socket which never sends or receives actual data but only listens for and accepts incoming connections and creates a new socket for each new connection.
A socket, in C parlance, is a data structure in kernel space, corresponding to one end-point of a UDP or TCP session (I am using session very loosely when talking about UDP). It's normally associated with one single port number on the local side and seldom more than one "well-known" number on either side of the session.
A "raw socket" is an end-point on, more or less, the physical transport. They're seldom used in applications programming, but sometimes used for various diagnostic things (traceroute, ping, possibly others) and may required elevated privileges to open.
Sockets are, by nature, a binary octet transport. It is not uncommon, though, to treat sockets (TCP sockets, at least) as text-based streams.
I have not yet seen a programming model that doesn't involve something like sockets if you dig deep enough, but there have certainly been other models of doing networking. The "/net/" pseudo-filesystem, where opening "/net/127.0.0.1/tcp/80" (or "tcp/www") gives you a file handle whose writes end up at a web server on localhost, is but one.
Suppose you are on your PC at home and you have two browser windows open.
One is looking at the Facebook website, the other at the Yahoo website.
The connection to Facebook would be:
your PC: IP1 + port 30200  <---->  Facebook: IP2 + port 80 (standard port)
The combination IP1 + port 30200 is the socket on the client computer, and IP2 + port 80 is the destination socket on the Facebook server.
The connection to Yahoo would be:
your PC: IP1 + port 60401  <---->  Yahoo: IP3 + port 80 (standard port)
The combination IP1 + port 60401 is the socket on the client computer, and IP3 + port 80 is the destination socket on the Yahoo server.
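You can actually watch the client side get one of those ephemeral ports: after connect() succeeds, getsockname() reports the local half of the socket pair. A quick sketch (the peer address 203.0.113.10:80 below is a placeholder you would swap for a real server):

```c
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    /* 203.0.113.10:80 is a placeholder "web server" address. */
    struct sockaddr_in dst;
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(80);
    inet_pton(AF_INET, "203.0.113.10", &dst.sin_addr);

    if (connect(fd, (struct sockaddr *)&dst, sizeof(dst)) == 0) {
        /* Ask the kernel which local IP/port it picked for our end. */
        struct sockaddr_in local;
        socklen_t len = sizeof(local);
        getsockname(fd, (struct sockaddr *)&local, &len);
        printf("local end of the socket pair: %s:%d\n",
               inet_ntoa(local.sin_addr), ntohs(local.sin_port));
    }
    close(fd);
    return 0;
}
```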

Can socket connections be multiplexed?

Is it possible to multiplex a socket connection?
I need to establish multiple connections to Yahoo Messenger and I am looking for a way to do this efficiently without having to hold a socket open for each client connection.
So far I have to use one socket for each client, and this does not scale well above 50,000 connections.
My solution is for a telco, so I need to hit at least 250,000 to 500,000 connections.
I'm planning to bind multiple IP addresses to a single NIC to beat the 65k port restriction per IP address.
I would appreciate any help or insight.
(Most of my other questions on this site have gone unanswered. :))
Thanks
This is an interesting question about scaling in a serious situation.
You are essentially asking, "How do I establish N connections to an internet service, where N is >= 250,000".
The only way to do this effectively and efficiently is to cluster. You cannot do this on a single host, so you will need to be able to fragment and partition your client base into a number of different servers, so that each is only handling a subset.
The idea would be for a single server to hold open as few connections as possible (spreading out the connectivity evenly) while holding enough connections to make whatever service you're hosting viable by keeping inter-server communication to a minimum level. This will mean that any two connections that are related (such as two accounts that talk to each other a lot) will have to be on the same host.
You will need servers and network infrastructure that can handle this. You will need a subnet of ip addresses, each server will have to have stateless communication with the internet (i.e. your router will not be doing any NAT in order to not have to track 250,000+ connections).
You will have to talk to Yahoo. There is no way Yahoo will be able to handle this level of connectivity without considering cutting your connection off. Any service at this scale would have to be negotiated with Yahoo so that both you and they can handle the connectivity.
There are i/o multiplexing technologies that you should investigate. Kqueue and epoll come to mind.
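For instance, the epoll pattern on Linux looks roughly like this; listen_fd is assumed to be an already-bound, listening socket, and error handling is omitted:

```c
#include <sys/types.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Event loop skeleton: one thread watches tens of thousands of sockets and
 * only touches the ones the kernel reports as ready. */
void event_loop(int listen_fd) {
    int ep = epoll_create1(0);

    struct epoll_event ev;
    ev.events = EPOLLIN;
    ev.data.fd = listen_fd;
    epoll_ctl(ep, EPOLL_CTL_ADD, listen_fd, &ev);

    struct epoll_event ready[1024];
    for (;;) {
        int n = epoll_wait(ep, ready, 1024, -1);
        for (int i = 0; i < n; i++) {
            int fd = ready[i].data.fd;
            if (fd == listen_fd) {
                /* New connection: register it for readiness notifications too. */
                int client = accept(listen_fd, NULL, NULL);
                ev.events = EPOLLIN;
                ev.data.fd = client;
                epoll_ctl(ep, EPOLL_CTL_ADD, client, &ev);
            } else {
                char buf[4096];
                ssize_t r = recv(fd, buf, sizeof(buf), 0);
                if (r <= 0) {            /* peer closed or error: drop it */
                    epoll_ctl(ep, EPOLL_CTL_DEL, fd, NULL);
                    close(fd);
                }
                /* else: handle r bytes in buf */
            }
        }
    }
}
```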
In order to write this massively concurrent, telco-grade solution, I would recommend investigating Erlang. Erlang is designed for exactly these situations (multi-server, massively-multi-client, massively-multithreaded telecommunications-grade software). It is currently used for running Ericsson telephone exchanges.
While you can listen on a socket for multiple incoming connection requests, when the connection is established, it connects a unique port on the server to a unique port on the client. In order to multiplex a connection, you need to control both ends of the pipe and have a protocol that allows you to switch contexts from one virtual connection to another or use a stateless protocol that doesn't care about the client's identity. In the former case you'd need to implement it in the application layer so that you could reuse existing connections. In the latter case you could get by using a proxy that keeps track of which server response goes to which client. Since you're connecting to Yahoo Messenger, I don't think you'll be able to do this since it requires an authenticated connection and it assumes that each connection corresponds to a single user.
You can only multiplex multiple connections over a single socket if the other end supports such an operation.
In other words, it's a function of the protocol; sockets don't have any native support for it.
I doubt the Yahoo Messenger protocol has any support for it.
An alternative (to multiple IPs on a single NIC) is to design your own multiplexing protocol and have satellite servers that convert from the multiplex protocol to the yahoo protocol.
I'll throw in another approach you could consider (depending on how desperate you are).
Note that operating system TCP/IP implementations need to be general purpose, but you are only interested in a very specific use-case. So it might make sense to implement a cut-down version of TCP/IP (which only handles your use-case, but does that very well) in your application code.
For example, if you are using Linux, you could route a couple of IP addresses to a tun interface and have your application handle the IP packets for that tun interface. That way you can implement TCP/IP (optimised for your use-case) entirely in your application and avoid any operating system restriction on the number of open connections.
Of course, it's quite a bit of work doing the TCP/IP yourself, but it really depends on how desperate you are - i.e. how much hardware can you afford to throw at the problem.
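For reference, the Linux plumbing for the tun part of that idea is fairly small; the real work is the TCP state machine you would put behind it. A rough sketch, assuming a device named "tun0" and that you route your addresses to it separately:

```c
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Attach to a tun device; every read() then hands you a raw IP packet and
 * every write() injects one, so your process owns the whole stack above IP.
 * "tun0" is just an example name. */
int open_tun(const char *name) {
    int fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0) return -1;

    struct ifreq ifr;
    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TUN | IFF_NO_PI;   /* IP packets, no extra header */
    strncpy(ifr.ifr_name, name, IFNAMSIZ - 1);

    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Typical usage would be something like: open the device with open_tun("tun0"), then loop over read() to pull in raw IP packets and write() to send your hand-built replies.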
500,000 arbitrary Yahoo Messenger connections: is your telco doing this on behalf of Yahoo? It seems like whatever solution has been in place for many years should scale with the help of Moore's Law, and as far as I know all the IM clients have been pretty efficient for a long time, so there's no pressing increase in demand that I can think of.
Why isn't this a reasonable problem to address with hardware plus traditional solutions?