This Morning I Booted my Computer and Had multiple applications needing an update. While I was waiting for the applications to update, a question came to mind which I thought I'd ask in here.
The question is How does each application known which internet data being retrieved is theirs?
The applications don't even care about it, they let the Kernel sort out that information.
When an application establishes a connection with a remote computer, the Kernel assigns that application a local port, which is a number among 0-65535. This port on the receiving end can either be requested by the application or the kernel will assign a random port. Generally there is only one application per port, however it is possible for multiple applications to receive the same data, though this is rare.
When a packet is received by the network interface, the kernel will sort the packet by its destination port. There will be a table in the kernel mapping ports to processes, and each application will receive the relevant data without caring about any other possible data that could be coming in the computer.
If you are a programmer, you can learn about all this stuff by reading about socket programming:
http://en.wikipedia.org/wiki/Network_socket
is a good place to start. You can also google "socket programming" with your preferred programming language to get an idea of how this is set up on the programming end.
Related
I am working on a legacy modbus program for an industrial SCADA system.
Currently, the c++ program acts as both a modbus TCP server and client.
Client behaviour:
It reads from a number of vendor PLCs (servers) on site, performs calculations and sends control commands back to the PLCs based on the data received across the site.
Server behaviour:
responds to a variety of TCP read and write requests from web interfaces and laptops on site.
Until now, this has worked fine, but we have recently installed a logging client on the network which polls our program very frequently (sub-second) and this has revealed timing issues: the program can potentially take a very long time in its client loop performing calculations and reading PLC values before acting as a server and responding to incoming requests.
Easy solution would be to split the programs into a modbus server and client instance, and keep them both running on the same embedded PC.
The issue I have is that the remote web interface (HMI) must be able to control the behaviour of vendor PLC 2 and Vendor PLC 2 will only allow one TCP connection from the embedded PC. In the past the program has handled writes requests from the HMI by forwarding them on to the PLC 2 via the open socket.
I'd be keen to gather thoughts on best practices here.
My thinking:
the modbus server program will need to respond to the HMI requests and somehow store the information required for vendor PLC 2, and it will also need to set a status register to inform the modbus client that there is data for vendor PLC 2.
The modbus client program will need to read the status register (and data) from the server and pass this on to vendor PLC 2.
Am I heading in the right direction?
Without having details on your implementation I can only guess the problem is that your program is single-threaded, and delays are caused by waiting responses from PLCs.
So, if my assumption is correct, you need to switch to 'select' function and redesign your software to be totally async. You have to put all sockets (both connected and accepted) in a FDs set and wait events on them.
win32:
https://learn.microsoft.com/en-us/windows/desktop/api/winsock2/nf-winsock2-select
linux:
https://www.opennet.ru/cgi-bin/opennet/man.cgi?topic=select&category=2
I've written the same app ages ago on win32 (but without calculations) and it easily processed about 200 PLCs, working on the same machine with SCADA.
I want to have two tcp connection in a single machine via socket programming, But this two connection should connect to two different network interfaces. One is say my 3g dongle and the other is wifi modem. But is it possible for a single machine(OS) to have two connection active at a time? If possible how to program the tcp connection via socket programming?
This can definitely be done, if you just create two programs and run each of them, they will both be able to communicate over their respective network. When you run a program, the operating system creates a process dedicated to running that program, which is assigned time on the CPU by the scheduling algorithm in the OS. So long as your CPU can keep up with any processing associated with the networks, they will both be able to run simultaneously.
You make no mention of your plans for this, but be aware that I/O times can also limit your speeds. If you're using an older computer, it may not be able to transmit a lot of data very quickly due to an out-dated (or just low powered) network card.
Next time try to research your question first, information about this can be found with relative ease using any popular search engine, including the search bar at the top of this page. Also read this, or one of the other several help articles about asking questions well, that are available from the page you had to go through before asking the question.
The server consists of several services with which a user interacts: profiles, game logics, physics.
I heard that it's a bad practice to have multiple client connections to the same server.
I'm not sure whether I will use UDP or TCP.
The services are realtime, they should reply as fast as possible so I don't want to include any additional rerouting if there are no really important reasons. So are there any reasons to rerote traffic through one external endpoint service to specific internal services in my case?
This seems to be multiple questions in one package. I will try to answer the ones I can identify as separate...
UDP vs TCP: You're saying "real-time", this usually means UDP is the right choice. However, that means having to deal with lost packets and possible re-ordering of packets. But, using UDP leaves a couple of possible delay-decreasing tricks open.
Multiple connections from a single client to a single server: This consumes resources (end-points, as it were) on both the client (probably ignorable) and on the server (possibly a problem, possibly ignorable). The advantage of using separate connections for separate concerns (profiles, physics, ...) is that when you need to separate these onto separate servers (or server farms), you don't need to update the clients, they just need to connect to other end-points, using code that's already tested.
"Re-router" (or "load balancer") needed: Probably not going to be an issue initially. However, it will probably become an issue later. Depending on your overall design and server OS, using UDP may actually become an asset here. UDP packet arrives at the load balancer, dispatched to the right backend and that could then in theory send back a reply with the source IP of the load balancer.
An alternative would be to have a "session broker". The client makes an initial connection to a well-known endpoint, says "I am a client, tell me where my profile, physics, what-have0-you servers are", the broker considers the current load, possibly the location of the client and other things that may make sense and the client then connects to the relevant backends on its own. The downside of this is that it's harder (not impossible, but harder) to silently migrate an ongoing session to a new backend, when there's a load-balancer in the way, this can be done essentially-transparently.
I am making a unix ssl server/client. So far I have implemented FD_SET with select to handle all connections concurrently in one master server process. However due to __FD_SETSIZE the number of clients can only be 1024. I need to increase the number of clients and efficiency of the server. Changing the __FD_SETSIZE has potential problems (apparently?) so I am stuck.
So far the network includes: errno.h detection, signal detection -> atomic handling, fd_set -> select(), successful stream socket based communication.
I would really appreciate it if someone can tell me what should I do? do I fork() after 1024 (which presents its own problems, if its even doable?) do I implement threads to handle each client request, or just client data or both?
What is the best network architecture in your opinion? keep in mind its a socket stream based connection that is meant to handle as much punishment as possible and allowing as many clients to the server as possible.
Don't write your own production web server.
There are too many open source servers out there all written by people who know more about high connectivity and SSL than you do. They also have the advantage of being tested to a degree that you'd never be able to accomplish with your homebrew server.
Is it possible to multiplex sa ocket connection?
I need to establish multiple connections to yahoo messenger and i am looking for a way to do this efficiently without having to hold a socket open for each client connection.
so far i have to use one socket for each client and this does not scale well above 50,000 connections.
oh, my solution is for a TELCO, so i need to at least hit 250,000 to 500,000 connections
i'm planing to bind multiple IP addresses to a single NIC to beat the 65k port restriction per IP address.
Please i would any help, insight i can get.
**most of my other questions on this site have gone un-answered :) **
Thanks
This is an interesting question about scaling in a serious situation.
You are essentially asking, "How do I establish N connections to an internet service, where N is >= 250,000".
The only way to do this effectively and efficiently is to cluster. You cannot do this on a single host, so you will need to be able to fragment and partition your client base into a number of different servers, so that each is only handling a subset.
The idea would be for a single server to hold open as few connections as possible (spreading out the connectivity evenly) while holding enough connections to make whatever service you're hosting viable by keeping inter-server communication to a minimum level. This will mean that any two connections that are related (such as two accounts that talk to each other a lot) will have to be on the same host.
You will need servers and network infrastructure that can handle this. You will need a subnet of ip addresses, each server will have to have stateless communication with the internet (i.e. your router will not be doing any NAT in order to not have to track 250,000+ connections).
You will have to talk to AOL. There is no way that AOL will be able to handle this level of connectivity without considering cutting your connection off. Any service of this scale would have to be negotiated with AOL so both you and they would be able to handle the connectivity.
There are i/o multiplexing technologies that you should investigate. Kqueue and epoll come to mind.
In order to write this massively concurrent and teleco grade solution, I would recommend investigating erlang. Erlang is designed for situations such as these (multi-server, massively-multi-client, massively-multithreaded telecommunications grade software). It is currently used for running Ericsson telephone exchanges.
While you can listen on a socket for multiple incoming connection requests, when the connection is established, it connects a unique port on the server to a unique port on the client. In order to multiplex a connection, you need to control both ends of the pipe and have a protocol that allows you to switch contexts from one virtual connection to another or use a stateless protocol that doesn't care about the client's identity. In the former case you'd need to implement it in the application layer so that you could reuse existing connections. In the latter case you could get by using a proxy that keeps track of which server response goes to which client. Since you're connecting to Yahoo Messenger, I don't think you'll be able to do this since it requires an authenticated connection and it assumes that each connection corresponds to a single user.
You can only multiplex multiple connections over a single socket if the other end supports such an operation.
In other words it's a function protocol - sockets don't have any native support for it.
I doubt yahoo messenger protocol has any support for it.
An alternative (to multiple IPs on a single NIC) is to design your own multiplexing protocol and have satellite servers that convert from the multiplex protocol to the yahoo protocol.
I'll trow in another approach you could consider (depending on how desperate you are).
Note that operating system TCP/IP implementations need to be general purpose, but you are only interested in a very specific use-case. So it might make sense to implement a cut-down version of TCP/IP (which only handles your use-case, but does that very well) in your application code.
For example, if you are using Linux, you could route a couple of IP addresses to a tun interface and have your application handle the IP packets for that tun interface. That way you can implement TCP/IP (optimised for your use-case) entirely in your application and avoid any operating system restriction on the number of open connections.
Of course, it's quite a bit of work doing the TCP/IP yourself, but it really depends on how desperate you are - i.e. how much hardware can you afford to throw at the problem.
500,000 arbitrary yahoo messenger connections - is your telco doing this on behalf of Yahoo? It seems like whatever solution has been in place for many years now should be scalable with the help of Moore's Law - and as far as I know all the IM clients have been pretty effective for a long time, and there's no pressing increase in demand that I can think of.
Why isn't this a reasonable problem to address with hardware plus traditional solutions?