low connectivity protocols or technologies - rest

I'm trying to enhance a server-app-website architecture in reliability, another programmer has developed.
At the moment, android smartphones start a tcp connection to a server component to exchange data. The server takes the data, writes them into a DB and another user can have a look on the data through a website. The problem is that the smartphones very regularly are in locations where connectivity is really bad. The consequence is that the smartphones lose the tcp connection and it's hard to reconnect. Now my question is, if there are any protocols that are so lightweight or accomodating concerning bad connectivity that the data exchange could work better or more reliable.
For example, I was thinking about replacing the raw TCP interface with a RESTful API, but I don't really know how well REST works in this scenario, as I don't have any experience in this area.
Maybe useful to know for answering this question: The server component is programmed in c#. The connecting components are android smartphones.
Please understand that I dont add some code to this question, because in my opinion its just a theoretically question.
Thank you in advance !

REST runs over HTTP which runs over TCP so it would have the same issues with connectivity.
Moving up the stack to the application you could perhaps think in terms of 'interference'. I quite often have to use technical stuff in remote areas with limited reception and it reminds of trying to communicate in a storm. If you think about it, if you're trying to get someone to do something in a storm where they can hardly hear you and the words get blown away (dropped signal), you don't read them the manual on how to fix something, you shout key words such as 'handle', 'pull', 'pull', 'PULL', 'ok'. So the information reaches them in small bursts you can repeat (pull, what? pull, eh? PULL! oh righto!)
Can you redesign the communications between the android app and the server so the server can recognise key 'words' with corresponding data and build up the request over a period of time? If you consider idempotency, each burst of data would not alter the request if it has already been received (pull, PULL!) and over time the android app could send/receive smaller chunks of the request. If the signal stays up, just keep sending. If it goes down, note which parts of the request haven't been sent and retry them when the signal comes back.
So you're sending the request jigsaw-style but the server knows how to reassemble the pieces in the right order. A STOP word at the end tells the server ok this request is complete, go work on it. Until that word arrives the server can store the incomplete request or discard it if no more data comes in.
If the server respond to the first request chunk with an id, the app can use the id to get the response and keep trying until the full response comes back, at which point the server can remove the response from its jigsaw cache. A fair amount of work though.

Related

WebSocket/REST: Client connections?

I understand the main principles behind both. I have however a thought which I can't answer.
Benchmarks show that WebSockets can serve more messages as this website shows: http://blog.arungupta.me/rest-vs-websocket-comparison-benchmarks/
This makes sense as it states the connections do not have to be closed and reopened, also the http headers etc.
My question is, what if the connections are always from different clients all the time (and perhaps maybe some from the same client). The benchmark suggests it's the same clients connecting from what I understand, which would make sense keeping a constant connection.
If a user only does a request every minute or so, would it not be beneficial for the communication to run over REST instead of WebSockets as the server frees up sockets and can handle a larger crowd as to speak?
To fix the issue of REST you would go by vertical scaling, and WebSockets would be horizontal?
Doe this make sense or am I out of it?
This is my experience so far, I am happy to discuss my conclusions about using WebSockets in big applications approached with CQRS:
Real Time Apps
Are you creating a financial application, game, chat or whatever kind of application that needs low latency, frequent, bidirectional communication? Go with WebSockets:
Well supported.
Standard.
You can use either publisher/subscriber model or request/response model (by creating a correlationId with each request and subscribing once to it).
Small size apps
Do you need push communication and/or pub/sub in your client and your application is not too big? Go with WebSockets. Probably there is no point in complicating things further.
Regular Apps with some degree of high load expected
If you do not need to send commands very fast, and you expect to do far more reads than writes, you should expose a REST API to perform CRUD (create, read, update, delete), specially C_UD.
Not all devices prefer WebSockets. For example, mobile devices may prefer to use REST, since maintaining a WebSocket connection may prevent the device from saving battery.
You expect an outcome, even if it is a time out. Even when you can do request/response in WebSockets using a correlationId, still the response is not guaranteed. When you send a command to the system, you need to know if the system has accepted it. Yes you can implement your own logic and achieve the same effect, but what I mean, is that an HTTP request has the semantics you need to send a command.
Does your application send commands very often? You should strive for chunky communication rather than chatty, so you should probably batch those change request.
You should then expose a WebSocket endpoint to subscribe to specific topics, and to perform low latency query-response, like filling autocomplete boxes, checking for unique items (eg: usernames) or any kind of search in your read model. Also to get notification on when a change request (write) was actually processed and completed.
What I am doing in a pet project, is to place the WebSocket endpoint in the read model, then on connection the server gives a connectionID to the client via WebSocket. When the client performs an operation via REST, includes an optional parameter that indicates "when done, notify me through this connectionID". The REST server returns saying if the command was sent correctly to a service bus. A queue consumer processes the command, and when done (well or wrong), if the command had notification request, another message is placed in a "web notification queue" indicating the outcome of the command and the connectionID to be notified. The read model is subscribed to this queue, gets messessages and forward them to the appropriate WebSocket connection.
However, if your REST API is going to be consumed by non-browser clients, you may want to offer a way to check of the completion of a command using the async REST approach: https://www.adayinthelifeof.nl/2011/06/02/asynchronous-operations-in-rest/
I know, that is quite appealing to have an low latency UP channel available to send commands, but if you do, your overall architecture gets messed up. For example, if you are using a CQRS architecture, where is your WebSocket endpoint? in the read model or in the write model?
If you place it on the read model, then you can easy access to your read DB to answer fast search queries, but then you have to couple somehow the logic to process commands, being the read model the responsible of send the commands to the write model and notify if it is unable to do so.
If you place it on the write model, then you have it easy to place commands, but then you need access to your read model and read DB if you want to answer search queries through the WebSocket.
By considering WebSockets part of your read model and leaving command processing to the REST interface, you keep your loose coupling between your read model and your write model.

RTP/RTSP start up latency: Would this method help to reduce it, and if yes, why we don't have it

This is probably not the best forum for such a specialized question, but at the moment I don't know of a better one (open to suggestions/recommendations).
I work on a video product which for the last 10+ years has been using proprietary communications protocol (DCOM-based) to send the video across the network. A while ago we recognized the need to standardize and currently are almost at a point of ripping out all that DCOM baggage and replacing it with a fully compliant RTP/RTSP client/server framework.
One thing we noticed during testing over the last few months is that when we switch the client to use RTP/RTSP, there's a noticeable increase in start-up latency. The problem is that it's not us but RTSP.
BEFORE (DCOM): we would send one DCOM command and before that command even returned back to the client, the server would already be sending video. -- total latency 1 RTT
NOW (RTSP): This is the sequence of commands, each one being a separate network request: DESCRIBE, SETUP, SETUP, PLAY (assuming the session has audio and video) -- total of 4 RTTs.
Works as designed - unfortunately it feels like a step backwards because prior user experience was actually better.
Can this be improved? If you stay with the standard, short answer is, NO. However, my team fully controls our entire RTP/RTSP stack and I've been thinking we could introduce a new RTSP command (without touching any of existing commands so we are still fully inter-operable) as a solution: DESCRIBE_SETUP_PLAY.
We could send this one command, pass in types of streams interested in (typically, there's only one video and 0..1 audio). Response would include the full SDP text, as well as all the port information and just like before, server would start streaming instantly without waiting for anything else from the client.
Would this work? any downside that I may not be seeing? I'm curious why this wasn't considered (or was dropped) from official spec, since latency even in local intranet is definitely noticeable.
FYI, it is possible according to the RTSP 1.0 specification:
9.1 Pipelining
A client that supports persistent connections or connectionless mode
MAY "pipeline" its requests (i.e., send multiple requests without
waiting for each response). A server MUST send its responses to those
requests in the same order that the requests were received.
The RTSP 2.0 draft also contains support for pipelining.
However none of the clients/servers I've used implement it AFAIK.

How can I measure the breakdown of network time spent in iOS?

Uploads from my app are too slow, and I'd like to gather some real data as to where the time is being spent.
By way of example, here are a few stages a request goes through:
Initial radio connection (significant source of latency in EDGE)
DNS lookup (if not cached)
SSL/TLS handshake.
HTTP request upload, including data.
Server processing time.
HTTP response download.
I can address most of these (e.g. by powering up the radio earlier via a dummy request, establishing a dummy HTTP 1.1 connection, etc.), but I'd like to know which ones are actually contributing to network slowness, on actual devices, with my actual data, using actual cell towers.
If I were using WiFi, I could track a bunch of these with Wireshark and some synchronized clocks, but I need cellular data.
Is there any good way to get this detailed breakdown, short of having to (gak!) use very low level socket functions to reproduce my vanilla http request?
Ok, the method I would use is not easy, but it does work. Maybe you're already tried this, but bear with me.
I get a time-stamped log of the sending time of each message, the time each message is received, and the time it is acted upon. If this involves multiple processes or threads, I have each one generate a log, and then merge them into a common timeline.
Then I plot out the timeline. (A tool would be nice, but I did it by hand.)
What I look for is things like 1) messages re-transmitted due to timeouts, 2) delays between the time a message is received and the time it's acted upon.
Usually, this identifies problems that I can fix in the code I can control. This improves things, but then I do it all over again, because chances are pretty good that I missed something the last time.
The result was that a system of asynchronous message-passing can be made to run quite fast, once preventable sources of delay have been eliminated.
There is a tendency in posting questions about performance to look for magic fixes to improve the situation. But, the real magic fix is to refine your diagnostic technique so it tells you what to fix, because it will be different from anyone else's.
An easy solution to this would be once the application get's fired, make a Long Polling connection with the server (you can choose when this connection need's to establish prior hand, and when to disconnect), but that is a kind of a hack if you want to avoid all the sniffing of packets with less api exposure iOS provides.

How to maintain a persistant network-connection between two applications over a network?

I was recently approached by my management with an interesting problem - where I am pretty sure I am telling my bosses the correct information but I really want to make sure I am telling them the correct stuff.
I am being asked to develop some software that has this function:
An application at one location is constantly processing real-time data every second and only generates data if the underlying data has changed in any way.
On the event that the data has changed send the results to another box over a network
Maintains a persistent connection between the both machines, altering the remote box if for some reason the network connection went down
From what I understand, I imagine that I need to do some reading on doing some sort of TCP/IP socket-level stuff. That way if the connection is dropped the remote location will be aware that the data it has received may be stale.
However management seems to be very convinced that this can be accomplished using SOAP. I was under the impression that SOAP is more or less a way for a client to initiate a procedure from a server and get some results via the HTTP protocol. Am I wrong in assuming this? I haven't been able to find much information on how SOAP might be able to solve a problem like this.
I feel like a lot of people around my office are using SOAP as a buzzword and that has generated a bit of confusion over what SOAP actually is - and is capable of.
Any thoughts on how to accomplish this task would be appreciated!
I think SOAP is the wrong tool. SOAP is a spec for exchanging structured data. For your problem, the simplest thing would be to write a program to just transfer data and figure out if the other end is alive. Sockets are a good way to go. There are lots of socket programming tutorials on the net. Pick your language, and ask Mr. Google. Write a couple of demo programs to teach yourself how it works. Ask if you have more specific questions.
For the problem, you'll need a sender and a receiver. The sender sends data when it gets it, the receiver waits for data and hands it off when it arrives. Get that working first. Next, add in heartbeats; a message that says "I'm alive", sent periodically. Get that working next. You'll need to be determine the exact behavior you want -- should both sides send heartbeats to the other end, the maximum time you are willing to wait for a heartbeat, and what action you take should heartbeats stop arriving. The network connection can drop, the other end can crash, the other end can hang, and perhaps there are other conditions you should think about (e.g., what if the real time data is nonsense?). Figure out how to handle each condition, and code up the error handling. Test it out, and serve with a side of documentation.
SOAP certainly won't tell you when the data source goes down, though you could use "heartbeats" to add that.
Probably you are right and they are just repeating a buzz word, and don't actually know much about what SOAP is or does or have any real argument for why it ought to be used here.

What is a RESTful way of monitoring a REST resource for changes?

If there is a REST resource that I want to monitor for changes or modifications from other clients, what is the best (and most RESTful) way of doing so?
One idea I've had for doing so is by providing specific resources that will keep the connection open rather than returning immediately if the resource does not (yet) exist. For example, given the resource:
/game/17/playerToMove
a "GET" on this resource might tell me that it's my opponent's turn to move. Rather than continually polling this resource to find out when it's my turn to move, I might note the move number (say 5) and attempt to retrieve the next move:
/game/17/move/5
In a "normal" REST model, it seems a GET request for this URL would return a 404 (not found) error. However, if instead, the server kept the connection open until my opponent played his move, i.e.:
PUT /game/17/move/5
then the server could return the contents that my opponent PUT into that resource. This would both provide me with the data I need, as well as a sort of notification for when my opponent has moved without requiring polling.
Is this sort of scheme RESTful? Or does it violate some sort of REST principle?
Your proposed solution sounds like long polling, which could work really well.
You would request /game/17/move/5 and the server will not send any data, until move 5 has been completed. If the connection drops, or you get a time-out, you simply reconnect until you get a valid response.
The benefit of this is it's very quick - as soon as the server has new data, the client will get it. It's also resilient to dropped connections, and works if the client is disconnected for a while (you could request /game/17/move/5 an hour after it's been moved and get the data instantly, then move onto move/6/ and so on)
The issue with long polling is each "poll" ties up a server thread, which quickly breaks servers like Apache (as it runs out of worker-threads, so can't accept other requests). You need a specialised web-server to serve the long-polling requests.. The Python module twisted (an "an event-driven networking engine") is great for this, but it's more work than regular polling..
In answer to your comment about Jetty/Tomcat, I don't have any experience with Java, but it seems they both use a similar pool-of-worker-threads system to Apache, so it will have that same problem. I did find this post which seems to address exactly this problem (for Tomcat)
I'd suggest a 404, if your intended client is a web browser, as keeping the connection open can actively block browser requests in the client to the same domain. It's up to the client how often to poll.
2021 Edit: The answer above was in 2009, for context.
Today, I would suggest using a WebSocket interface with push notifications.
Alternatively, in the above suggestion, I might suggest holding the connection for 500-1000ms and check twice at the server before returning the 404, to reduce the overhead of creating multiple connections at the client.
I found this article proposing a new HTTP header, "When-Modified-After", that essentially does the same thing--the server waits and keeps the connection open until the resource is modified.
I prefer a version-based approach rather than a timestamp-based approach, since it's less prone to race conditions and gives you a little more information about what it is you're retrieving. Any thoughts to this approach?