Streaming Data - streaming

I unsuccessfully searched Google for a good definition and understanding of streaming data and its characteristics. My questions are:
What is streaming data?
How can it be detected?
Correction:
"How can it be detected" is not an appropriate question. Instead my question is:
How is it different from buffered data and other data transfer mechanisms?

It depends in what context you mean but basically streaming data is analagous to asynchronous data. Take the Web as an example. The Web (or HTTP specifically) is (basically) a request-response mechanism in that a client makes a request and receives a response (typically a Web page of some kind).
HTTP doesn't natively support the ability for servers to push content to clients. There are a number of ways this can be faked, including:
Polling: forcing the client to make repeated requests, typically inconspicuously (as far as the client is concerned);
Long-lived connections: this is where the client makes a normal HTTP request but instead of returning immediately the server hangs on to the request until there's something to send back. When the request times out or a response is sent th eclient sends another request. In this way you can fake server push;
Plug-ins: Java applets, Flash, Silverlight and others can be used to achieve this.
Anything where the server effectively sends data to the client (rather than the client asking for it)--regardless of the mechanism and whether or not the client is polling for that data--can be characterised as streaming data.
With non-HTTP transports (eg vanilla TCP) server push is typically easier (but can still run afoul of firewalls and th elike). An example of this might be a sharetrading application that receives market information from a provider. That's streaming data.
How do you detect it? Bit of a vague question. I'm not really sure what you're getting at.

When you say streaming data I think of the following, although I'm not sure if this is what you're getting at. To me it's playing a video/audio file while it's downloading. That's what happens when you go to YouTube and watch a video and it starts playing even though you haven't downloaded the whole video yet. But you can see the video downloading - I'm sure you're familiar with the seek bar filling up as the file downloads. It doesn't necessarily have to be a video or audio file but that's the most common.

Related

low connectivity protocols or technologies

I'm trying to enhance a server-app-website architecture in reliability, another programmer has developed.
At the moment, android smartphones start a tcp connection to a server component to exchange data. The server takes the data, writes them into a DB and another user can have a look on the data through a website. The problem is that the smartphones very regularly are in locations where connectivity is really bad. The consequence is that the smartphones lose the tcp connection and it's hard to reconnect. Now my question is, if there are any protocols that are so lightweight or accomodating concerning bad connectivity that the data exchange could work better or more reliable.
For example, I was thinking about replacing the raw TCP interface with a RESTful API, but I don't really know how well REST works in this scenario, as I don't have any experience in this area.
Maybe useful to know for answering this question: The server component is programmed in c#. The connecting components are android smartphones.
Please understand that I dont add some code to this question, because in my opinion its just a theoretically question.
Thank you in advance !
REST runs over HTTP which runs over TCP so it would have the same issues with connectivity.
Moving up the stack to the application you could perhaps think in terms of 'interference'. I quite often have to use technical stuff in remote areas with limited reception and it reminds of trying to communicate in a storm. If you think about it, if you're trying to get someone to do something in a storm where they can hardly hear you and the words get blown away (dropped signal), you don't read them the manual on how to fix something, you shout key words such as 'handle', 'pull', 'pull', 'PULL', 'ok'. So the information reaches them in small bursts you can repeat (pull, what? pull, eh? PULL! oh righto!)
Can you redesign the communications between the android app and the server so the server can recognise key 'words' with corresponding data and build up the request over a period of time? If you consider idempotency, each burst of data would not alter the request if it has already been received (pull, PULL!) and over time the android app could send/receive smaller chunks of the request. If the signal stays up, just keep sending. If it goes down, note which parts of the request haven't been sent and retry them when the signal comes back.
So you're sending the request jigsaw-style but the server knows how to reassemble the pieces in the right order. A STOP word at the end tells the server ok this request is complete, go work on it. Until that word arrives the server can store the incomplete request or discard it if no more data comes in.
If the server respond to the first request chunk with an id, the app can use the id to get the response and keep trying until the full response comes back, at which point the server can remove the response from its jigsaw cache. A fair amount of work though.

WebSocket/REST: Client connections?

I understand the main principles behind both. I have however a thought which I can't answer.
Benchmarks show that WebSockets can serve more messages as this website shows: http://blog.arungupta.me/rest-vs-websocket-comparison-benchmarks/
This makes sense as it states the connections do not have to be closed and reopened, also the http headers etc.
My question is, what if the connections are always from different clients all the time (and perhaps maybe some from the same client). The benchmark suggests it's the same clients connecting from what I understand, which would make sense keeping a constant connection.
If a user only does a request every minute or so, would it not be beneficial for the communication to run over REST instead of WebSockets as the server frees up sockets and can handle a larger crowd as to speak?
To fix the issue of REST you would go by vertical scaling, and WebSockets would be horizontal?
Doe this make sense or am I out of it?
This is my experience so far, I am happy to discuss my conclusions about using WebSockets in big applications approached with CQRS:
Real Time Apps
Are you creating a financial application, game, chat or whatever kind of application that needs low latency, frequent, bidirectional communication? Go with WebSockets:
Well supported.
Standard.
You can use either publisher/subscriber model or request/response model (by creating a correlationId with each request and subscribing once to it).
Small size apps
Do you need push communication and/or pub/sub in your client and your application is not too big? Go with WebSockets. Probably there is no point in complicating things further.
Regular Apps with some degree of high load expected
If you do not need to send commands very fast, and you expect to do far more reads than writes, you should expose a REST API to perform CRUD (create, read, update, delete), specially C_UD.
Not all devices prefer WebSockets. For example, mobile devices may prefer to use REST, since maintaining a WebSocket connection may prevent the device from saving battery.
You expect an outcome, even if it is a time out. Even when you can do request/response in WebSockets using a correlationId, still the response is not guaranteed. When you send a command to the system, you need to know if the system has accepted it. Yes you can implement your own logic and achieve the same effect, but what I mean, is that an HTTP request has the semantics you need to send a command.
Does your application send commands very often? You should strive for chunky communication rather than chatty, so you should probably batch those change request.
You should then expose a WebSocket endpoint to subscribe to specific topics, and to perform low latency query-response, like filling autocomplete boxes, checking for unique items (eg: usernames) or any kind of search in your read model. Also to get notification on when a change request (write) was actually processed and completed.
What I am doing in a pet project, is to place the WebSocket endpoint in the read model, then on connection the server gives a connectionID to the client via WebSocket. When the client performs an operation via REST, includes an optional parameter that indicates "when done, notify me through this connectionID". The REST server returns saying if the command was sent correctly to a service bus. A queue consumer processes the command, and when done (well or wrong), if the command had notification request, another message is placed in a "web notification queue" indicating the outcome of the command and the connectionID to be notified. The read model is subscribed to this queue, gets messessages and forward them to the appropriate WebSocket connection.
However, if your REST API is going to be consumed by non-browser clients, you may want to offer a way to check of the completion of a command using the async REST approach: https://www.adayinthelifeof.nl/2011/06/02/asynchronous-operations-in-rest/
I know, that is quite appealing to have an low latency UP channel available to send commands, but if you do, your overall architecture gets messed up. For example, if you are using a CQRS architecture, where is your WebSocket endpoint? in the read model or in the write model?
If you place it on the read model, then you can easy access to your read DB to answer fast search queries, but then you have to couple somehow the logic to process commands, being the read model the responsible of send the commands to the write model and notify if it is unable to do so.
If you place it on the write model, then you have it easy to place commands, but then you need access to your read model and read DB if you want to answer search queries through the WebSocket.
By considering WebSockets part of your read model and leaving command processing to the REST interface, you keep your loose coupling between your read model and your write model.

RTP/RTSP start up latency: Would this method help to reduce it, and if yes, why we don't have it

This is probably not the best forum for such a specialized question, but at the moment I don't know of a better one (open to suggestions/recommendations).
I work on a video product which for the last 10+ years has been using proprietary communications protocol (DCOM-based) to send the video across the network. A while ago we recognized the need to standardize and currently are almost at a point of ripping out all that DCOM baggage and replacing it with a fully compliant RTP/RTSP client/server framework.
One thing we noticed during testing over the last few months is that when we switch the client to use RTP/RTSP, there's a noticeable increase in start-up latency. The problem is that it's not us but RTSP.
BEFORE (DCOM): we would send one DCOM command and before that command even returned back to the client, the server would already be sending video. -- total latency 1 RTT
NOW (RTSP): This is the sequence of commands, each one being a separate network request: DESCRIBE, SETUP, SETUP, PLAY (assuming the session has audio and video) -- total of 4 RTTs.
Works as designed - unfortunately it feels like a step backwards because prior user experience was actually better.
Can this be improved? If you stay with the standard, short answer is, NO. However, my team fully controls our entire RTP/RTSP stack and I've been thinking we could introduce a new RTSP command (without touching any of existing commands so we are still fully inter-operable) as a solution: DESCRIBE_SETUP_PLAY.
We could send this one command, pass in types of streams interested in (typically, there's only one video and 0..1 audio). Response would include the full SDP text, as well as all the port information and just like before, server would start streaming instantly without waiting for anything else from the client.
Would this work? any downside that I may not be seeing? I'm curious why this wasn't considered (or was dropped) from official spec, since latency even in local intranet is definitely noticeable.
FYI, it is possible according to the RTSP 1.0 specification:
9.1 Pipelining
A client that supports persistent connections or connectionless mode
MAY "pipeline" its requests (i.e., send multiple requests without
waiting for each response). A server MUST send its responses to those
requests in the same order that the requests were received.
The RTSP 2.0 draft also contains support for pipelining.
However none of the clients/servers I've used implement it AFAIK.

iPhone: Strategies for uploading large files from phone to server

We're running into issues uploading hires images from the iPhone to our backend (cloud) service. The call is a simple HTTP file upload, and the issue appears to be the connection breaking before the upload is complete - on the server side we're getting IOError: Client read error (Timeout?).
This happens sporadically: most of the time it works, sometimes it fails. When a good connection is present (ie. wifi) it always works.
We've tuned various timeout parameters on the client library to make sure we're not hitting any of them. The issue actually seems to be unreliable mobile connectivity.
I'm thinking about strategies for making the upload reliable even when faced with poor connectivity.
The first thing that came to mind was to break the file into smaller chunks and transfer it in pieces, increasing the likelihood of each piece getting there. But that introduces a fair bit of complexity on both the client and server side.
Do you have a cleverer approach? How would you tackle this?
I would use the ASIHTTPRequest library. It's have some great features like bandwidth throttling. It can upload files directly from the system instead of loading the file into memory first. Also I would break the photo into like 10 parts. So for a 5 meg photo, it would be like 500k each. You would just create each upload using a queue. Then when the app goes into background, it can complete the part it's currently uploading. If you cannot finish uploading all the parts in the allocated time, just post a local notification reminding the user it's not completed. Then after all the parts have been sent to your server, you would call a final request that would combine all the parts back into your photo on the server-side.
Yeah, timeouts are tricky in general, and get more complex when dealing with mobile connections.
Here are a couple ideas:
Attempt to upload to your cloud service as you are doing. After a few failures (timeouts), mark the file, and ask the user to connect their phone to a wifi network, or wait till they connect to the computer and have them manually upload via the web. This isn't ideal however, as it pushes more work to your users. The upside is that implementationwise, it's pretty straight forward.
Instead of doing an HTTP upload, do a raw socket send instead. Using raw socket, you can send binary data in chunks pretty easily, and if any chunk-send times out, resend it until the entire image file is sent. This is "more complex" as you have to manage binary socket transfer but I think it's easier than trying to chunk files through an HTTP upload.
Anyway that's how I would approach it.

Long polling with NSURLConnection

I'm working on an iPhone application which will use long-polling to send event notifications from the server to the client over HTTP. After opening a connection on the server I'm sending small bits of JSON that represent events, as they occur. I am finding that -[NSURLConnectionDelegate connection:didReceiveData] is not being called until after I close the connection, regardless of the cache settings I use when creating the NSURLRequest. I've verified that the server end is working as expected - the first JSON event will be sent immediately, and subsequent events will be sent over the wire as they occur. Is there a way to use NSURLConnection to receive these events as they occur, or will I need to instead drop down to the CFSocket API?
I'm starting to work on integrating CocoaAsyncSocket, but would prefer to continue using NSURLConnection if possible as it fits much better with the rest of my REST/JSON-based web service structure.
NSURLConnection will buffer the data while it is downloading and give it all back to you in one chunk with the didReceiveData method. The NSURLConnection class can't tell the difference between network lag and an intentional split in the data.
You would either need to use a lower-level network API like CFSocket as you mention (you would have access to each byte as it comes in from the network interface, and could distinguish the two parts of your payload), or you could take a look at a library like CURL and see what types of output buffering/non-buffering there is there.
I ran into this today. I wrote my own class to handle this, which mimics the basic functionality of NSURLConnection.
http://github.com/nall/SZUtilities/blob/master/SZURLConnection.h
It sounds as if you need to flush the socket on the server-side, although it's really difficult to say for sure. If you can't easily change the server to do that, then it may help to sniff the network connection to see when stuff is actually getting sent from the server.
You can use a tool like Wireshark to sniff your network.
Another option for seeing what's getting sent/received to/from the phone is described in the following article:
http://blog.jerodsanto.net/2009/06/sniff-your-iphones-network-traffic/
Good luck!
We're currently doing some R&D to port our StreamLink comet libraries to the iPhone.
I have found that in the emulator you will start to get didReceiveData callbacks once 1KB of data is received. So you can send a junk 1KB block to start getting callbacks. It seems that on the device, however, this doesn't happen. In safari (on device) you need to send 2KB, but using NSURLConnection I too am getting no callbacks. Looks like I may have to take the same approach.
I might also play with multipart-replace and some other more novel headers and mime types to see if it helps stimulate NSURLConnection.
There is another HTTP API Implementation named ASIHttpRequest. It doesn't have the problem stated above and provides a complete toolkit for almost every HTTP feature, including File Uploads, Cookies, Authentication, ...
http://allseeing-i.com/ASIHTTPRequest/