How to maintain a persistant network-connection between two applications over a network? - sockets

I was recently approached by my management with an interesting problem - where I am pretty sure I am telling my bosses the correct information but I really want to make sure I am telling them the correct stuff.
I am being asked to develop some software that has this function:
An application at one location is constantly processing real-time data every second and only generates data if the underlying data has changed in any way.
On the event that the data has changed send the results to another box over a network
Maintains a persistent connection between the both machines, altering the remote box if for some reason the network connection went down
From what I understand, I imagine that I need to do some reading on doing some sort of TCP/IP socket-level stuff. That way if the connection is dropped the remote location will be aware that the data it has received may be stale.
However management seems to be very convinced that this can be accomplished using SOAP. I was under the impression that SOAP is more or less a way for a client to initiate a procedure from a server and get some results via the HTTP protocol. Am I wrong in assuming this? I haven't been able to find much information on how SOAP might be able to solve a problem like this.
I feel like a lot of people around my office are using SOAP as a buzzword and that has generated a bit of confusion over what SOAP actually is - and is capable of.
Any thoughts on how to accomplish this task would be appreciated!

I think SOAP is the wrong tool. SOAP is a spec for exchanging structured data. For your problem, the simplest thing would be to write a program to just transfer data and figure out if the other end is alive. Sockets are a good way to go. There are lots of socket programming tutorials on the net. Pick your language, and ask Mr. Google. Write a couple of demo programs to teach yourself how it works. Ask if you have more specific questions.
For the problem, you'll need a sender and a receiver. The sender sends data when it gets it, the receiver waits for data and hands it off when it arrives. Get that working first. Next, add in heartbeats; a message that says "I'm alive", sent periodically. Get that working next. You'll need to be determine the exact behavior you want -- should both sides send heartbeats to the other end, the maximum time you are willing to wait for a heartbeat, and what action you take should heartbeats stop arriving. The network connection can drop, the other end can crash, the other end can hang, and perhaps there are other conditions you should think about (e.g., what if the real time data is nonsense?). Figure out how to handle each condition, and code up the error handling. Test it out, and serve with a side of documentation.

SOAP certainly won't tell you when the data source goes down, though you could use "heartbeats" to add that.
Probably you are right and they are just repeating a buzz word, and don't actually know much about what SOAP is or does or have any real argument for why it ought to be used here.

Related

low connectivity protocols or technologies

I'm trying to enhance a server-app-website architecture in reliability, another programmer has developed.
At the moment, android smartphones start a tcp connection to a server component to exchange data. The server takes the data, writes them into a DB and another user can have a look on the data through a website. The problem is that the smartphones very regularly are in locations where connectivity is really bad. The consequence is that the smartphones lose the tcp connection and it's hard to reconnect. Now my question is, if there are any protocols that are so lightweight or accomodating concerning bad connectivity that the data exchange could work better or more reliable.
For example, I was thinking about replacing the raw TCP interface with a RESTful API, but I don't really know how well REST works in this scenario, as I don't have any experience in this area.
Maybe useful to know for answering this question: The server component is programmed in c#. The connecting components are android smartphones.
Please understand that I dont add some code to this question, because in my opinion its just a theoretically question.
Thank you in advance !
REST runs over HTTP which runs over TCP so it would have the same issues with connectivity.
Moving up the stack to the application you could perhaps think in terms of 'interference'. I quite often have to use technical stuff in remote areas with limited reception and it reminds of trying to communicate in a storm. If you think about it, if you're trying to get someone to do something in a storm where they can hardly hear you and the words get blown away (dropped signal), you don't read them the manual on how to fix something, you shout key words such as 'handle', 'pull', 'pull', 'PULL', 'ok'. So the information reaches them in small bursts you can repeat (pull, what? pull, eh? PULL! oh righto!)
Can you redesign the communications between the android app and the server so the server can recognise key 'words' with corresponding data and build up the request over a period of time? If you consider idempotency, each burst of data would not alter the request if it has already been received (pull, PULL!) and over time the android app could send/receive smaller chunks of the request. If the signal stays up, just keep sending. If it goes down, note which parts of the request haven't been sent and retry them when the signal comes back.
So you're sending the request jigsaw-style but the server knows how to reassemble the pieces in the right order. A STOP word at the end tells the server ok this request is complete, go work on it. Until that word arrives the server can store the incomplete request or discard it if no more data comes in.
If the server respond to the first request chunk with an id, the app can use the id to get the response and keep trying until the full response comes back, at which point the server can remove the response from its jigsaw cache. A fair amount of work though.

Architecture diagram involving the flow of data between trading engine, order routing engine,quickfix and the exchange

If I write an order routing system based on QuickfixJ, can I just start submitting my trades to an exchange? Or do I need to register myself with the exchange or get permission or something like that?
I am not able to understand how QuickfixJ, the order routing system, the actual trading engine and the exchange fits together. Any online architecture diagram would be very helpful for how these components fit together.
FIX is just a transmission protocol. By itself, it's pretty dumb. QuickFIX (any language port) is just an engine that does all the boring dirty work of managing a FIX connection.
The FIX specification includes a list of messages and fields. In reality, you can treat these as suggestions that, in practice, no commercial FIX counterparty uses as-is. Every counterparty I've connected to makes modifications to those messages and fields, sometimes adding entirely new messages. No counterparty supports every message and field.
When connecting to a counterparty, do not assume anything. Your counterparty should provide documentation on how they expect their interface to be used, and which messages and fields they will send and which they expect to receive from you.
Their docs should tell you which message to send them to request market data and any special fields/options you must use.
Their docs will tell you how to submit a trade.
Their docs will tell you how to do anything that they support, and which messages/fields you will receive in return.
Do not try to send any message type to your counterparty unless their docs say they support it.
If you are writing the ORS side... then you have no docs. If you haven't written a FIX client before, you probably shouldn't be writing a FIX server without some assistance from someone who has. At the least, you should try to get ahold of some other systems' FIX interface docs to get an idea of how to go about it. (Unfortunately, such firms usually only give them to client-developers.)

What is the best, most efficient, Client pool technique with Erlang

I'm a real Erlang newbie (started 1 week ago), and I'm trying to learn this language by creating a small but efficient chat server. (When I say efficient I mean I have 5 servers used to stress test this with hundreds of thousands connected client - A million would be great !)
I have find some tutorials doing so, the only thing is, that every tutorial i found, are IRC like. If one user send a message, all user except sender will receive it.
I would like to change that a bit, and use one-to-one discussion.
What would be the most effective client pool for searching a connected user ?
I thought about registering the process, because it seems to do everything I need, but I really don't think this is the better way to do it. (Or most pretty way to do it anyway).
Does anyone would have any suggestions doing this ?
EDIT :
Every connected client is affected to an ID.
When the user is connected, it first send a login command to give it's id.
When an user wants to send a message to another one the message looks like this
[ID-NUMBER][Message] %% ID-NUMBER IS A FIXED LENGTH
When I ask for "the most effective client pool", I'm actually looking for the fastest way to retrieve/add/delete one client on the connected client list which could potentially be large (hundred of thousands -- maybe millions)
EDIT 2 :
For answering some questions :
I'm using Raw Socket (Using telnet right now to communicate with server) - will probably move to ssl later...
It is my own protocol
Every Client is a spawned Pid
Every Client's Pid is linked to it's own monitor (mostly for debugging reason - The client if disconnected should reconnect by it's own starting auth from scratch)
I have read a couple a book before starting coding, So I do not master yet every aspect of Erlang but I'm not unaware of it, I will read more about it when needed I guess.
What I'm really looking for is the best way to store and search thoses PIDs to send message directly from process to process.
Should I write my own search Client function using lists ?
or should I use ets ?
Or even use register/2 unregister/1 and whereis/1 to maintain my client list, using it's unique id as atom, it seems to be the simplest way to do so, I really don't know if it is efficient, but I'm pretty sure this is the ugly solution ;-) ?
I'm doing something similar to your chat program using gproc as a pubsub (similar to the demo on that page). Each client registers as it's id. To find a particular client, you do a lookup on that client id. To subscribe to a client, you add a property to that process of the client id being subscribed to. To publish, you call gproc:send(ClientId,Message). This covers your use case, the more general room based chat as well, and can handle distributed masterless registry of processes.
I haven't tested to see if it scales to millions, but it uses ets to do the storage and gproc is rock solid code by Ulf Wiger. I wouldn't count on being able to write a better implementation.
I'm also kind of new to Erlang (a couple of months), so I hope this can put you in the correct path :)
First of all, since you're a "newbie", you should know about these sites:
Erlang Official Documentation:
Most common modules are in the stdlib application, so start from there.
Alternative Documentation:
There's a real time search engine, so it is really good when searching
for specific modules.
Erlang Programming Rules:
For you to enter in the mindset of erlang programming.
Learn You Some Erlang Book:
A must read for everyone starting with Erlang. It's really comprehensive
and fun to read!
Trapexit.org:
Forum and cookbooks, to search for common problems faced by programmers.
Well, thinking about a non persistent database, I would suggest the sets or gb_sets modules (documentation here).
If you want persistence, you should try dets (see documentation above), but I can't state anything about efficiency, so you should research this topic a bit further.
In the book Learn You Some Erlang there is a chapter on data structures that says that sets are better for read intensive systems, while gb_sets is more appropriate for a balanced usage.
Now, Messaging systems are what everyone wants to do when they come to Erlang because the two naturally blend. However, there are a number of things to look into before one continues. Messaging basically involves the following things: User Registration, User Authentication, Sessions Management,Logging, Message Switching/routing e.t.c. Now, to do all or most of these, one needs to have a Database, certainly IN-MEMORY, thats leads me to either Mnesia or ETS Tables. Since you are new to Erlang, i suppose you have not yet really mastered working with these. At one moment, you will need to maintain Who is communicating with who, Who is available for Chat e.t.c. Hence you might need to look up things and write things some where.Another thing is you have not told us the Client. Is it going to be a Web Client (HTTP), is it an entirely new protocol you are implementing over raw Sockets ? Which ever way, you will need to master something called: Concurrency in Erlang. If a user connects and is assigned an ID, if your design is A process Per User, then you will have to save the Pids of these Processes or register them against some criteria, yet again monitor them if they die e.t.c. Which brings me to OTP and Supervision trees. There is quite alot, however, tell us more about the Client and Server interaction, the Network Communication you need e.t.c. Or is it just a simple Erlang RPC project you are doing for your own revision ?
EDIT Use ETS Tables, or use Mnesia RAM tables. Do not think of registering these Pids or Storing them in a list, Array or set. Look at this solution which was given to this question

How can I measure the breakdown of network time spent in iOS?

Uploads from my app are too slow, and I'd like to gather some real data as to where the time is being spent.
By way of example, here are a few stages a request goes through:
Initial radio connection (significant source of latency in EDGE)
DNS lookup (if not cached)
SSL/TLS handshake.
HTTP request upload, including data.
Server processing time.
HTTP response download.
I can address most of these (e.g. by powering up the radio earlier via a dummy request, establishing a dummy HTTP 1.1 connection, etc.), but I'd like to know which ones are actually contributing to network slowness, on actual devices, with my actual data, using actual cell towers.
If I were using WiFi, I could track a bunch of these with Wireshark and some synchronized clocks, but I need cellular data.
Is there any good way to get this detailed breakdown, short of having to (gak!) use very low level socket functions to reproduce my vanilla http request?
Ok, the method I would use is not easy, but it does work. Maybe you're already tried this, but bear with me.
I get a time-stamped log of the sending time of each message, the time each message is received, and the time it is acted upon. If this involves multiple processes or threads, I have each one generate a log, and then merge them into a common timeline.
Then I plot out the timeline. (A tool would be nice, but I did it by hand.)
What I look for is things like 1) messages re-transmitted due to timeouts, 2) delays between the time a message is received and the time it's acted upon.
Usually, this identifies problems that I can fix in the code I can control. This improves things, but then I do it all over again, because chances are pretty good that I missed something the last time.
The result was that a system of asynchronous message-passing can be made to run quite fast, once preventable sources of delay have been eliminated.
There is a tendency in posting questions about performance to look for magic fixes to improve the situation. But, the real magic fix is to refine your diagnostic technique so it tells you what to fix, because it will be different from anyone else's.
An easy solution to this would be once the application get's fired, make a Long Polling connection with the server (you can choose when this connection need's to establish prior hand, and when to disconnect), but that is a kind of a hack if you want to avoid all the sniffing of packets with less api exposure iOS provides.

Deciphering MMORPG Protocol Encoding

I plan on writing an automated bot for a game.
The tricky part is figuring out how they encoded their protocol... To make the bot run around is easy, simply make the character run and record what it does in wireshark. However, interpreting the environment is more difficult... It recieves about 5 packets each second if you are idle, hence lots of garbarge.
My plan: Because the game runs under TCP, I will use freecap (http://www.freecap.ru/eng) to force the game to connect to a proxy running on my machine. I will need this proxy to be capable of packet injection, or perhaps a server that is capable of resending captured packets. This way I can recreate and tinker around with what the server sends, and understand their protocol encoding.
Does anyone know where I can get a proxy that allows packet injection or where I can perform packet injection (not via hardware, as is the case with wireless or anything!)
Where of if I can find a server/proxy that resends captured packets (ie: replays a connection).
Any better tools or methodologies for pattern matching? Something which can highlight patterns from mutliple messages would be GREAT.
OR, is there a better way to decipher this here? Possibly a dissasembly strategy (via hooking a winsock function and starting the dissassembly from there) ? I have not done this before so I am not sure. OR , any other ideas?
Network traffic interception and protocol analysis is generally a less favored method to accomplish your goal here. For most modern games, encryption is a serious factor, and there are serious headaches associated with the protocol analysis for any but trivial factors of the most common gameplay scenarios.
Most modern implementations* of what you are trying to do rely on reading and manipulating the memory space and process of a running client. The client will have already done all the hard parts for you, including decrypting the traffic and sorting it into far more easy to read data structures. For interacting with the server you can call functions built into the client instead of crafting entire series of packets from scratch. The plus to this approach is that you have to do far less work to interpret the data and produce activity. The minus is that there is often some data in the network traffic that would be useful to a bot but is discarded by the client, or that you may want to send traffic to the server that the client cannot produce (which, in my own well-developed hierarchy for such, is a few steps farther down the "cheating" slope).
*...I say this having seen the evolution of the majority of MMORPG botting/hacking communities from network protocol analyzers like ShowEQ and Odin's Eye / Excalibur to memory-based applications like MacroQuest and InnerSpace. On that note, InnerSpace provides an excellent extensible framework for the memory/process-based variant of what you are attempting, and you should look into it as a basis for your project if you abandon the network analysis approach.
As I've done a few game bots in the past (for fun, not profit or griefing of course - writing game bots is a lot of fun), I recommend the following:
If you can code and there isn't cheat protection preventing you from doing it, I highly recommend writing an injected DLL for the following reasons:
Your DLL will be able to access the game's memory space directly, and once you reverse-engineer the data structures (either by poking around memory or by code disassembly), you'll have access to lots of data. This will also allow you to bypass any network encryption the game may have. The downside of accessing process memory directly is that offsets and data structures change between versions - however, data structures don't change very often with a stable game, and you can compensate offset changes by searching for code patterns instead of using fixed offsets.
Either way, you'll still be able to hook WinSock functions using API hooks (check out Microsoft Detours and the excellent but now-commercial madCodeHook).
otherwise, I can only advise that you give live/interactive packet editors like WPE Pro a try.
In most scenarios, the coolest methods (code reverse-engineering and direct memory access) tend to be the least productive. They require a lot of skill (to understand the code) and time, both initially (to go through all the code and develop code to interact with the data structure) and for maintainance (in case the game is being updated). (Of course, they sometimes do allow doing cool stuff which is impossible to do with the official client, but most of the time this is obvious as blatant cheating, and likely to attract the GMs quickly). Most of the time bots are made by replacing game graphics/textures with solid colours, and creating simple "pixel" bots which search for certain colours on the screen and react accordingly (e.g. click them).
Hope this helps, and remember - cheating is only fun when it doesn't make the game less fun for everyone else ;)
There are probably a few reasonable assumptions you can make that should simplify your task enormously. However, to make the best use of them you will probably need greater comfort with sleeves-rolled-up programming than it sounds like you have.
First, it's a safe bet that the encryption they are using falls into one of three categories:
None
Cheesy
Far better than you are likely to crack
With the odds of the middle case being very low.
Next, it's a safe bet that the packets are encrypted / decrypted close to the edge of the program (right as they come in, right before they go out) and that the body of the game deals with them in decrypted form.
Finally, the protocol they are using most likely consists of either
ascii with data blocks
binary goo
So do a little packet sniffing with a card set in promiscuous mode for unencrypted ascii. If you see some, great, you're ahead of the game. But if you don't give up the whole tapping-the-line idea and instead start following the code as it returns from the sending data out by breakpointing and stepping with a debugger. Figure the outermost layer or three will be standard network stuff, then will come the encryption layer, and beyond that the huge mass of stuff that deals with the protocol unencrypted.
You should be able to get this far in an hour if you're hot, a weekend if you're reasonably skilled, motivated, and diligent, and never if you are hopeless. But it is possible in principle (and doubtlessly far easier in practice) to do it this way.
Once you get to where something that looks like unencrypted goo comes in, gets mungled, and the mungled form goes out, then start worrying about what it means.
-- MarkusQ
A) I play a MMO and do not support bots, voting down...
B) Download Backtrack v.3, run an arpspoof on your default gateway and your host. There is an application that will spoof the remote host's SSL cert sslmitm (I believe is the name) which will then allow you to create a full connection through your host. Then fireup tcpdump/ethereal/wireshark (choose your pcap poison) and move around do random stuff to find out what packet is doing what. That will be your biggest challenge; but proxying with a Man in the Middle attack on yourself is the way to go.
C) I do not condone this activity, this information is only being provided as free information.
Sounds like there is not encryption going on, so you could do a network approach.
A great place to start would be to find the packet ID's - most of the time, something near the front of the packet is going to be an ID of the type of the packet. For example move could be 1, shoot fired could be "2", chat could be "4".
You can write your own proxy that listens on one port for your game to connect, and then connects to the server. You can make keypresses to your proxy fire off commands, or you can make your proxy write out debugging info to help you go further.
(I've written a bot for an online in game in PHP - of all things.)