Best Practices for serving dynamic files in a backend - rest

Does anyone know of best practices or common strategies in backend design for serving dynamic images and videos to client applications?
Background: I'm currently building an application that allows users to upload their own images and videos. I'm not really sure about how to serve these media files back to the client in the most efficient way. Do I store the files on the same VPS that my application server is running on? Do I need to save the files in different qualities / densities to better adjust for the clients' screen resolution? (I'll have mostly mobile clients)
I tried googling these questions but apparently I'm asking the wrong questions :-)
I would really appreciate maybe a reference or professional vocabulary on these topics.
Thanks in advance.

1) Split the web server from the application server.
First of all, do not try to stream media files from your backend unless you can offload the low-level work to the OS; you will most likely get it wrong.
Use a proxy server as a web server to serve such files.
nginx will do.
You also need to back up your media files the same way you back up your database.
Storing huge static media files alongside the application server is the wrong move; it will not scale at all.
You can add a cron task that moves files to a CDN server; when a move completes, you replace the URL in the database to match the new location.
By using nginx you save precious CPU and RAM on the application server while a file is still waiting to be moved to the external server.
A CDN then lets you dedicate bandwidth and CPU/RAM resources to the application server.
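The cron task described above could look roughly like the following sketch. It is an illustration under assumptions, not a definitive implementation: the `media` table, its `url` and `migrated` columns, and the `upload_to_cdn` callable are hypothetical names, and the real upload step depends on the CDN or object store you choose.

```python
import sqlite3
from typing import Callable


def migrate_pending_media(
    conn: sqlite3.Connection,
    upload_to_cdn: Callable[[str], str],
    batch_size: int = 100,
) -> int:
    """Move not-yet-migrated files to external storage and rewrite their URLs.

    `upload_to_cdn` is a hypothetical callable: it receives the local path
    and returns the new public URL once the upload has completed.
    """
    rows = conn.execute(
        "SELECT id, url FROM media WHERE migrated = 0 LIMIT ?", (batch_size,)
    ).fetchall()
    moved = 0
    for media_id, local_path in rows:
        try:
            new_url = upload_to_cdn(local_path)
        except OSError:
            # Leave the row untouched: the web server keeps serving the
            # local copy and the next cron run retries the upload.
            continue
        # Switch the URL only after the upload has finished.
        conn.execute(
            "UPDATE media SET url = ?, migrated = 1 WHERE id = ?",
            (new_url, media_id),
        )
        conn.commit()
        moved += 1
    return moved
```

Updating the row only after a successful upload means clients never receive a URL that points at a file the external server does not have yet.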
2) Regarding image resolution and downsampling:
Screens of modern handsets have the same or even better resolution than a typical office workstation.
Link speed has a much bigger impact on UX.
If a client has a smartphone with a huge screen but a slow link, you still have to deliver the image or video as fast as possible, even if the quality of the media does not match the resolution of the handset.
It makes sense to downsample images on demand and store the result on disk so that nginx or the CDN can serve it again.
For videos, it makes sense to create a "bad" version with heavy compression (quality loss) for slow links; the device will scale it itself during playback.
You can also keep client statistics (screen sizes, downlink speeds) and generate optimized versions of a video file later, once you see that it is "popular".
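A minimal sketch of the "downsample on demand and keep the result on disk" idea follows. The width buckets, the file naming scheme, and the `resize` callable are assumptions made for illustration; the actual resizing would be done by whatever image library or tool you use.

```python
from pathlib import Path
from typing import Callable

# Assumed set of widths. Snapping requests to a few buckets keeps the
# number of cached variants per file bounded.
WIDTH_BUCKETS = (320, 640, 1280)


def pick_width(requested: int) -> int:
    """Return the smallest bucket that covers the requested width."""
    for width in WIDTH_BUCKETS:
        if requested <= width:
            return width
    return WIDTH_BUCKETS[-1]


def cached_variant(
    original: Path,
    requested_width: int,
    cache_dir: Path,
    resize: Callable[[Path, Path, int], None],
) -> Path:
    """Return the path of a downsampled copy, creating it on first request.

    `resize(src, dst, width)` is a hypothetical callable that writes the
    resized image to `dst`.
    """
    width = pick_width(requested_width)
    target = cache_dir / f"{original.stem}_{width}{original.suffix}"
    if not target.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        tmp = target.with_name(target.name + ".tmp")
        resize(original, tmp, width)
        # Rename into place so the web server never sees a half-written file.
        tmp.replace(target)
    return target
```

Once the variant exists on disk, later requests for it can be served by nginx or the CDN without touching the application code.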
FYI: Several years ago some social media giant dropped the idea of preparing all possible versions of the same media file in favour of an FPGA-based on-the-fly resampler.
I do not remember the name of the company or the URL of the article. It was probably Instagram.
Some cloud providers offer instances with FPGAs or CUDA-capable GPUs on board to do the heavy lifting.
So in some cases you can trade storage for heavy horsepower and do the conversion on the fly.

Related

Where I can store data for spreading them online?

My company has an application that can be installed with Qt online installers. The data is stored on our own server, but over time we found that the connection is rather slow for users on the other side of the world. So the question is: which services designed for this purpose can we use to store this data? While investigating, I found information about something called a "Content Delivery Network", but I'm not sure whether it fits.
Unfortunately, I don't have enough experience in this area, so maybe somebody knows more and could give me some advice. Thank you!
CloudFront on AWS. It depends on what your content is, but you can probably store it on S3 and then use CloudFront to cache it at edge locations across the globe.
Your research led you to the right topic, because it sounds like you could benefit from a CDN. CDNs store cached versions of your website, download files, video, etc. on their servers, which usually form a distributed network across the globe known as "Points of Presence" (PoPs). When a user requests a file from your website, assuming it uses a CDN, the request actually goes to the closest PoP, which returns the file. This improves performance because the user may be very far from your origin server, or your origin server may not have enough resources to answer every request by itself.
The amount of time a CDN caches objects from your site depends on configurable settings. You can inform the CDN on how to cache objects using HTTP cache headers. Here is an intro video from Akamai, the largest CDN, with some helpful explanation of HTTP caching headers.
https://www.youtube.com/watch?v=zAxSE1M4yKE
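As a rough illustration of the HTTP cache headers mentioned above, the origin server can attach a `Cache-Control` header to each response. The sketch below only builds the header value; the function name and the chosen lifetimes are assumptions, and how a given CDN interprets each directive should be checked in its documentation.

```python
from typing import Dict, Optional


def cache_headers(
    max_age: int,
    s_maxage: Optional[int] = None,
    immutable: bool = False,
) -> Dict[str, str]:
    """Build a Cache-Control header for a cacheable download.

    max_age  -- seconds a browser may reuse the response
    s_maxage -- optional override for shared caches such as a CDN
    """
    directives = ["public", f"max-age={max_age}"]
    if s_maxage is not None:
        directives.append(f"s-maxage={s_maxage}")
    if immutable:
        directives.append("immutable")
    return {"Cache-Control": ", ".join(directives)}


# Example: browsers revalidate after a minute, the CDN keeps the file a day.
print(cache_headers(60, s_maxage=86400))
```

For installer packages whose file names contain a version number, a long lifetime is usually safe because a new release gets a new URL.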
Cheers.

Using CouchDB as interface. Is it appropriate way?

Our devices (microscopes with cameras) produce images and additional information for each image.
Now a middleware supplier wants to connect these devices to a lab automation system. They have to acquire the data and we have to provide it. What astonished me was their interface suggestion: a very cryptic token-separated format (ASTM E1394-97). Unfortunately, they cannot even accommodate images in their protocol and are aiming to get file paths instead.
I thought this was not an up-to-date approach. While looking for alternatives, I came across CouchDB.
So my idea was that our devices would import the data, including images, into CouchDB, and they could fetch the data from there. It even seems that, using mustache, we could produce the format they want (ASCII text) and put URLs in as image references instead of paths.
My question is: has anyone already applied CouchDB to such a use case? It seems to be a bit of a misuse of CouchDB, as the main intention is an interface, not data storage. Another point that bothers me is that the inventor of CouchDB moved on to another project, Couchbase. Could that mean a lack of support for CouchDB in the future?
Thank you very much for any insights and suggestions!
It's an OK use case, and we are actually using CouchDB this way: as proxying middleware between medical laboratory analyzers and a LIS. Some of the analyzers publish images or PDF data on shared folders, and we just load them into the related document as attachments.
Moreover, you may want to know that CouchDB can run external processes (aka os_daemons) and manage their lifecycle: it restarts them if they are terminated and starts them right after you update the config options through the HTTP interface. This helps to set up ASTM client and server processes, since that protocol is different from HTTP (which is native to CouchDB); those processes communicate with the devices and create documents as regular CouchDB clients. In the same way you can set up daemons to monitor shared folders for specific files. And all of this is just CouchDB with a few "low bounded" plugins.
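To illustrate loading a file into a document as an attachment, here is a small sketch that builds the HTTP request for CouchDB's attachment endpoint (`PUT /{db}/{docid}/{attname}` with the current document revision in the `rev` query parameter). The server address, database name, document id, and revision used in the example are made-up placeholders, and the request is only constructed, not sent.

```python
import urllib.parse
import urllib.request


def build_attachment_request(
    base_url: str,
    db: str,
    doc_id: str,
    rev: str,
    filename: str,
    data: bytes,
    content_type: str,
) -> urllib.request.Request:
    """Build a PUT request that attaches `data` to an existing document."""
    # Quote each path segment separately so spaces or slashes in a file
    # name cannot change the URL structure.
    path = "/".join(
        urllib.parse.quote(part, safe="") for part in (db, doc_id, filename)
    )
    query = urllib.parse.urlencode({"rev": rev})
    url = f"{base_url.rstrip('/')}/{path}?{query}"
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": content_type},
    )


# Placeholder values for illustration only.
req = build_attachment_request(
    "http://localhost:5984", "lab", "sample-42", "1-abc",
    "slide 1.png", b"\x89PNG", "image/png",
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; a daemon that watches a shared folder could call this for every new image it finds.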

How to troubleshoot streaming video (rtmp) performance?

I'm streaming videos via rtmp from Amazon Cloudfront. Videos are taking a loooong time to start playing, and I don't have any way of figuring out why. Normally I'd use the "Net" panel in Firebug or Web Inspector to get a good first impression of when an asset starts to load and how long it takes to be sent (which can indicate whether the problem is on the server end or network versus the browser rendering). But since the video is played within a Flash player (Flowplayer in this case), it's not possible to glean any info about the status of the stream. Also since it's served from Amazon Cloudfront, I can't put any kind of debugging or measuring tools on the server (if such a tool even exists).
So... my question is: what are some ways I can go about investigating this problem? I'm hoping there would be some settings I can tweak on either the front-end (flowplayer) or back-end (Cloudfront), but without being able to measure anything or even understand where the problem is, I'm at a loss as to what those could be.
Any ideas for how to troubleshoot streaming video performance?
You can use Wireshark (it can dissect RTMP) or Fiddler to check what is going on. Another point to keep in mind, besides the client and the server, is your ISP.
To dig deeper you can use http://rtmpdump.mplayerhq.hu/ OR http://www.fluorinefx.com/ OR http://www.broccoliproducts.com/softnotebook/rtmpclient/rtmpclient.php.
Keep in mind that RTMP isn't ideal, since it usually bypasses proxies and tries to make a direct connection. If this doesn't work it can fall back, but by then some time has already passed (it waits for a connection timeout etc.). If you have an option to set CloudFront/Flowplayer to RTMPT, I would recommend doing so, since that uses port 80 for the connection.
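One quick way to check whether the direct RTMP connection is what costs the time is to measure how long a plain TCP connect takes on the RTMP port (1935 by default) compared with port 80. The sketch below is only a rough probe under that assumption; the host name in the usage comment is a placeholder, and a successful TCP connect does not prove the streaming handshake itself is fast.

```python
import socket
import time
from typing import Optional


def time_connect(host: str, port: int, timeout: float = 5.0) -> Optional[float]:
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Refused, unreachable, or timed out: a client would fall back here.
        return None


# Usage (placeholder host name):
# for port in (1935, 80):
#     print(port, time_connect("streaming.example.com", port))
```

If port 1935 returns `None` only after the full timeout while port 80 connects quickly, the startup delay is likely the fallback described above rather than the server.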
Presumably - if you go and attempt to view a video - then come back 20min later and hit it again - it loads quickly?
SAN -> Edge Servers ---> Client
This is all well and good in a specific use case (i.e. small origin files and a large, long-lived cache), but it becomes an issue when it is scaled out, with lots of media hosts running content through the system, as with CloudFront.
The media cache on their edge servers gets purged fairly often: once the cache is full, eviction starts with the oldest file in the cache. So if you have large video files that are not viewed often, they won't be sitting in the edge server cache and will take a long time to transfer to the edges, giving an utterly horrific end-user experience.
The same is true of YouTube, for example: go and watch some obscure, long video, and try it through a couple of proxies so that you hit different edge servers. You'll see exactly the same thing occur.
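The eviction behaviour described above can be simulated with a small sketch. It assumes a least-recently-used policy and a cache limited by total bytes; the actual policy of CloudFront or any other CDN is not documented here, so treat this purely as a model for reasoning about hit rates.

```python
from collections import OrderedDict


class EdgeCache:
    """Toy model of an edge cache that evicts the oldest entries first."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._items: "OrderedDict[str, int]" = OrderedDict()

    def request(self, key: str, size_bytes: int) -> bool:
        """Return True on a cache hit, False when the origin must be fetched."""
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return True
        if size_bytes > self.capacity_bytes:
            return False  # too large to cache at all
        # Miss: evict from the oldest end until the new object fits.
        while self.used_bytes + size_bytes > self.capacity_bytes:
            _, evicted_size = self._items.popitem(last=False)
            self.used_bytes -= evicted_size
        self._items[key] = size_bytes
        self.used_bytes += size_bytes
        return False


cache = EdgeCache(capacity_bytes=100)
cache.request("rare.mp4", 60)
cache.request("popular.mp4", 30)
cache.request("popular.mp4", 30)      # hit, refreshes its position
cache.request("other.mp4", 50)        # forces rare.mp4 out
print(cache.request("rare.mp4", 60))  # prints False: fetched from origin again
```

A large, rarely viewed file is the first to be pushed out, which matches the slow start on the first view and the quick start on a repeat view shortly after.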
I noticed a very noticeable lag when streaming RTMP from CloudFront. I found that switching to plain HTTP progressive download from the Amazon S3 bucket made the lag go away.

iPhone web-app: HTML5 database and audio files

I'm having issues with audio files in an iPhone web app. It seems that each time an audio file is played, it is loaded first and then played, even when repeating the same audio on a page that hasn't been refreshed (done via JavaScript). From what I've researched, manifest files would be great, but they are meant for offline applications. I'm now researching HTML5 databases.
Does anyone know if HTML5 databases can store audio files such as MP3s? The end result is then to pull the MP3 from the database. It might still have to load the file from the database each time, but I'm hoping that is quicker than retrieving it from a server.
Thank you.
I think what you are after is possible; however, you have a significant hurdle in that the implementation of HTML5 databases in most browsers is limited to 5 MB, as per the W3C recommendation:
"A mostly arbitrary limit of five megabytes per origin is recommended."
Having said that, the way it's implemented in iPhone Safari is that databases can grow until they reach 5 MB in size, at which point the browser asks the user whether to allow the extra size, asking again at 10, 50, 100 and 500 MB (see the section "Estimated Database Size" in this post by html5doctor).
There is no limit on the number of databases you can create per domain in Safari; however, according to this post by Cantina Consulting, you can have a total of 50 MB across all databases in a single domain.
Given these parameters, a possible workaround is to split your MP3 blobs across multiple databases, creating a new database each time you reach 4.9 MB. However, even if you follow this design it may not be ideal, as you will still run into the following:
50 MB is not a lot of audio: a typical 5-6 minute song is about 5 MB at 128 kbps, so that only gives you space for roughly one CD (60 min) of MP3 songs; beyond that you will need user cooperation to use additional database space.
You will still have significant security issues trying to play the MP3 blobs from the JavaScript runtime. It may be possible to bypass these by tricking Flash into thinking they are an MP3 stream, but I'm not sure how you'd go about it.
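The storage estimate above can be checked with a little arithmetic. The sketch assumes a constant bitrate of 128 kbit/s and decimal megabytes (1 MB = 1,000,000 bytes); variable-bitrate files and container overhead will shift the numbers somewhat.

```python
def mp3_size_bytes(duration_seconds: float, bitrate_kbps: int = 128) -> float:
    """Approximate size of a constant-bitrate MP3 (8 bits per byte)."""
    return duration_seconds * bitrate_kbps * 1000 / 8


def playable_seconds(quota_bytes: float, bitrate_kbps: int = 128) -> float:
    """How many seconds of audio fit into a storage quota."""
    return quota_bytes * 8 / (bitrate_kbps * 1000)


five_minute_song = mp3_size_bytes(5 * 60)
quota_minutes = playable_seconds(50_000_000) / 60

print(five_minute_song / 1_000_000)  # 4.8 (MB)
print(round(quota_minutes))          # 52 (minutes)
```

So a 50 MB quota holds a bit under an hour of audio at this bitrate, which is in line with the "about one CD" estimate.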
Feel free to have a play around with this iPhone HTML5 SQL Client I put together, you may want to use something similar for experimenting with your local mp3 Database.

Keeping iPhone application in sync with GWT application

I'm working on an iPhone application that should work in offline and online modes.
In its online mode it's supposed to feed all the information the user enters to a web service backed by GWT/GAE.
In its offline mode it's supposed to store the information locally and, when a connection is available, sync it up to the web service.
Currently my plan is as follows:
Provide a connection between the app and the web service using Protocol Buffers for efficient over-the-wire communication
Work with local DB using Core Data
Poll the network status, and when available sync the database and keep some sort of local-db-to-remote-db key synchronization.
The question is - am I heading in the right direction? Are there standard patterns for implementing this? Maybe someone can point me to an open-source application that works in a similar fashion?
I am really new to iPhone coding, and would be very glad to hear any suggestions.
Thanks
I think you're blurring two questions together.
If you've got a question about making a GWT web interface, that's one question.
Questions about how to sync an iPhone to a web service are a different question. For that, you don't want to use GWT's RPCs for syncing, as you'd have to fake out the 'browser-side' of the serialization system in your iPhone code, which GWT normally provides for you.
About the system design direction:
First, if there is no REAL need, do not create two different apps, one GWT and one iPhone.
Create one well-written GWT app. It will work offline without problems and will manage your data using an HTML feature, the offline application cache.
If you must create two separate apps,
then at least save yourself the effort and do not write the server twice. If you go with the standard GWT approach, you will almost certainly either fail to talk to the server from a standalone app (it is zipped JSON over HTTP with some tricky headers...) or end up writing things twice. So look into the Restlet library; it is well supported on GAE.
About the way to keep things in sync when switching between offline and online:
There are several approaches to consider, and none of them is perfect. So when you consider yours, think of what the user expects. Do not be Microsoft Word; do not try to outsmart the user.
If at least one scenario in your use cases demands user intervention to merge changes (and there will be one, take it to the bank), then you will have to implement a UI for it. That is a good reason to use that UI often, so that the user gets used to it. This is better than the user first seeing it long after starting to use the app, because the need for it is rare, because you implemented super-duper merging logic that asks the user only in very special cases. Don't do that.
Balance the effort, because the mess that a bug in such code inflicts on the user is much more painful than the benefit altogether.
So, the HOW:
One way is the Do-Undo way.
While offline, keep a log of the actions the user performed on the data, in the order they were performed.
As soon as you are connected, send them to the server and execute them there. Do the same from server to client.
This works fine in most cases, as long as you are not writing Photoshop-like software with huge amounts of data per operation. It is known as the Command pattern (also called Action) in the Gang of Four book.
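The action-log approach described above might be sketched as follows. Python is used here only for brevity (the same structure carries over to an iPhone client), and the `Action` fields and the `send` callable are hypothetical; a real client would persist the log and send each action to the web service.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Action:
    name: str                # e.g. "create_note", "rename_note"
    payload: Dict[str, Any]  # data needed to re-execute the action
    timestamp: float         # when the user performed it


@dataclass
class OfflineLog:
    pending: List[Action] = field(default_factory=list)

    def record(self, name: str, payload: Dict[str, Any], timestamp: float) -> None:
        """Remember an action performed while offline."""
        self.pending.append(Action(name, payload, timestamp))

    def replay(self, send: Callable[[Action], bool]) -> int:
        """Send pending actions in time order; stop at the first failure.

        `send` is a hypothetical callable that returns True once the
        server has executed the action.
        """
        ordered = sorted(self.pending, key=lambda a: a.timestamp)
        sent = 0
        for action in ordered:
            if not send(action):
                break
            sent += 1
        # Keep whatever was not delivered for the next attempt.
        self.pending = ordered[sent:]
        return sent
```

Stopping at the first failed action preserves the original order, so a later edit is never applied on the server before the action it depends on.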
Another way is the source-control way: versions, and maybe even locks. It is very application dependent. DBMSs sometimes use it internally to implement transactions.
And there is always the option to be read-only when offline :-)
I wonder if you have considered using a sync framework to manage the synchronization. If that interests you, take a look at the open-source project OpenMobster and its Sync service. It supports the following sync operations:
two-way
one-way client
one-way device
bootup
Besides that, all modifications are automatically tracked and synced with the Cloud. Your app can stay offline while the network connection is down; it will track any changes and automatically synchronize them with the Cloud in the background when the connection returns. It also provides iCloud-like synchronization across multiple devices.
Also, modifications in the Cloud are synced using push notifications, so the data is always current even though it is stored locally.
Here is a link to the open source project: http://openmobster.googlecode.com
Here is a link to iPhone App Sync: http://code.google.com/p/openmobster/wiki/iPhoneSyncApp