WebApi supporting Range requests without querying the db multiple times - entity-framework

Currently I have a .NET Core WebApi that serves up videos. The videos are stored in a SQL Server table as a varbinary(MAX). This was working; however, I read that to support iOS Safari we needed to accept the Range header, so I have added support for this (I think).
However, now I am noticing two things (which could be unrelated):
1) Whenever a call is made to this API, the CPU spikes to 100%. I can only assume that is Entity Framework materializing a 25 MB file from the database. That seems excessive given the API is doing nothing else. Can this be improved? The server just grinds.
2) Multiple requests are made to the API, each asking for a different byte range, but my API queries the database again on each request, which sends the CPU into overdrive for a long period.
Is there a better way of handling Range requests when querying for a large object?

If you ask me, EF is not really well suited for this; it's too clunky and resource-hungry. You can write your own T-SQL using something like SUBSTRING to read only the requested byte range (a sketch follows). That said, from a practical point of view, depending on how many files there are, how big they are, and how many users you have, I would not go with such a solution.
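As a minimal sketch of that SUBSTRING idea, assuming a Videos table with Id and Data (varbinary(MAX)) columns (names invented for illustration), you can bypass EF for the range read and let SQL Server slice the blob, so the full 25 MB never crosses into the web process. Note that T-SQL's SUBSTRING is 1-based, hence the +1:

using System.Data.SqlClient;

public class VideoRepository
{
    private readonly string _connString;
    public VideoRepository(string connString) { _connString = connString; }

    // Returns only the requested slice; SQL Server does the slicing, so the
    // whole varbinary(MAX) value is never materialized in the API process.
    public byte[] ReadRange(int videoId, long from, long length)
    {
        using (var conn = new SqlConnection(_connString))
        using (var cmd = new SqlCommand(
            "SELECT SUBSTRING(Data, @from + 1, @len) FROM Videos WHERE Id = @id",
            conn))
        {
            cmd.Parameters.AddWithValue("@id", videoId);
            cmd.Parameters.AddWithValue("@from", from);  // zero-based offset from the Range header
            cmd.Parameters.AddWithValue("@len", length);
            conn.Open();
            return (byte[])cmd.ExecuteScalar();
        }
    }
}

Your controller still parses the Range header and returns 206 Partial Content with a Content-Range header; alternatively, a SqlDataReader opened with CommandBehavior.SequentialAccess lets you stream the column in chunks rather than pulling it in one piece.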
I don't think a SQL database is the right place to store this data at all.
You could start by doing some research on how Netflix does it: https://www.techhive.com/article/2158040/how-netflix-streams-movies-to-your-tv.html
You probably want something like that: a CDN and some sort of caching. Your way of doing it now might work while you build it, with one or two users, but if this is an API used by lots of people, you will quickly find that it won't scale.

REST stateless with database

A little backstory
I have to develop a web application for college. This web application has to do with managing locations on Google Maps: pinning new locations, adding custom descriptions, and so on. The login part is done using Facebook (Login with Facebook). The more interesting part is that the client-server queries have to be done using REST.
The part that I'm trying to understand
If I use a database to store my users' unique IDs, their online status (online/offline), and somehow (I haven't actually settled on the idea) keep a JSON document on the server containing each user's pinned locations, would all this actually be OK with the REST paradigm?
I find mixed answers on the internet and I don't know how to think about the statelessness of the application correctly. A session would not be created, but the credentials from the database would be necessary for the users to communicate with each other.
The other side of the question
Supposing I'm mistaken and I shouldn't use the database to store the credentials and locations like that, how am I supposed to keep all that data? I'm thinking of something like JSON cached client-side, but what if my client changes computers? Wouldn't that mean they lose all their data? (Also, wouldn't this leave MVC handicapped by not having a model?) How do I really keep track of all these things?
You're making this way too hard on yourself; try to keep it simple, since you probably have a deadline. REST is an architectural style for APIs built on the HTTP verbs GET, POST, PUT, and DELETE. Statelessness means the server keeps no per-client session state between requests; each request carries everything needed to process it (for example, an auth token). It says nothing about how you store the data behind your APIs, so persisting users and locations in a database is perfectly RESTful.
As for storing the data, a database should be fine. Storing it as JSON in the DB could work, but you'll have to parse the JSON every time you want to use it, so I would suggest storing it in the DB in a shape that can be queried directly.
For a beginner (especially if you're doing this for a school project), I would definitely suggest setting up a relational database such as Microsoft SQL Server (Microsoft stack) or a MySQL/Postgres database (the usual choices on Linux). But if you want to skip the relational approach (because it might not be all that easy to get going), you can always try a NoSQL database like MongoDB.
Relevant links to help:
http://rest.elkstein.org/ (REST explained)
http://www.restapitutorial.com/lessons/httpmethods.html (REST verbs)
http://en.wikipedia.org/wiki/Relational_database (what is a relational db)
http://en.wikipedia.org/wiki/Database_normalization (roughly the goal of relational design; note that you can go too far: http://lemire.me/blog/archives/2010/12/02/over-normalization-is-bad-for-you/)
http://www.mongodb.com/nosql-explained (NoSQL explanation)

Programmatic export/dump/mass data retrieval (BaaS)

Does anyone have experience with programmatic exports of data from BaaS providers such as parse.com or StackMob?
I am aware that both providers (as far as I can tell from the marketing talk) offer a REST API which will allow for queries against the database, not only to be used by mobile clients but also by e.g. custom web apps.
I am also aware that both providers offer a manual export of data (parse.com via their web interface, StackMob via support).
But let's say I would like to dump all data nightly, so that I can import it into a reporting system, for instance. Or maybe simply to keep an up-to-date backup.
In this case, I would need a programmatic way to export/replicate the data stored in the backend. Manual exports are not an option for obvious reasons.
The REST APIs offered, however, seem to be designed for specific queries, not for mass reads (performance?). Let alone the pricing: I assume neither provider would be happy about a nightly X-gigabyte data export via their REST API, so there will probably be a price tag.
I just couldn't find any specific information on this topic so far, so I was wondering if anyone else has already gone through this. Also, any suggestions on StackMob/parse alternatives are welcome, especially if related to the data export topic.
Cheers, Alex
Did you see the section of the Parse REST API docs on batch operations? Batching reduces the number of API calls needed, so you are not using one call per row. Keep in mind that queries still have a row limit: the default is 100, and you can raise it to a maximum of 1000 with the limit parameter. That means you can pull down at most 1000 rows per API call, so a full export has to page through the data (e.g. with the skip parameter).
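As a hedged sketch of such a nightly export, paging with limit and skip (the class name GameScore and the credential placeholders are invented; the endpoint and headers follow the hosted Parse REST API of the time):

using System;
using System.IO;
using System.Net.Http;
using System.Text.Json;
using System.Threading.Tasks;

class ParseExporter
{
    const int PageSize = 1000;   // Parse's documented maximum for 'limit'

    static async Task Main()
    {
        using var http = new HttpClient();
        http.DefaultRequestHeaders.Add("X-Parse-Application-Id", "YOUR_APP_ID");
        http.DefaultRequestHeaders.Add("X-Parse-REST-API-Key", "YOUR_REST_KEY");

        for (int skip = 0; ; skip += PageSize)
        {
            var url = $"https://api.parse.com/1/classes/GameScore?limit={PageSize}&skip={skip}";
            var json = await http.GetStringAsync(url);

            using var doc = JsonDocument.Parse(json);
            int count = doc.RootElement.GetProperty("results").GetArrayLength();
            if (count == 0) break;               // nothing left to export

            File.WriteAllText($"export_{skip}.json", json);  // dump raw page to disk
            Console.WriteLine($"fetched {count} rows at offset {skip}");
            if (count < PageSize) break;         // last (partial) page
        }
    }
}

If I remember correctly, hosted Parse also capped skip, so for really large classes the usual workaround was to page on createdAt instead: query for rows with createdAt greater than the last value seen, ordered by createdAt.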
I can't comment on StackMob because I haven't used it. At my present job, we are using Parse and we wrote a C# app which compares the data in a Parse class with a SQL table and pulls down any changes.

How should I architect my iPhone app to talk to my website?

I'm planning my first iPhone app and I'd like to get some input on how to build it, right from the start. The iPhone app is being built to be paired with a public-facing web application that is already built in PHP.
I'd like the web platform to be central (the data is housed in a MySQL database) and have the iPhone clients talk to it using RESTful methods to perform the functions of the site (fetching the latest content, posting content, voting, and account management, as examples).
I'd like the clients to get a local copy of the data in a SQLite database, but refresh to get the latest version of the feed (similar to the Twitter app).
Couple of thoughts I have right now:
Use something like ASIHTTPRequest to send/receive data to PHP files on the server listening for requests
JSON - would I be better off sending the GETs/POSTs to a PHP script that returns JSON objects, and working with some sort of wrapper that manages the data and communicates changes to the local SQLite database?
Am I totally off in how I should be building this thing to communicate with the web? Is there a best practice for this?
I'd really appreciate any input on how you would architect this sort of a setup.
Thank you,
EDIT: After reading my own post again, I know it sounds like a Twitter client, but it is NOT, although it has similar features and structure to a Twitter-type setup. Thanks!
As you already outlined in your plan, XML and REST are a great way to communicate with a web application. I want to suggest a few details about how to actually design and build it, and what you should keep in mind.
First of all, I believe it's important to stick with MVC. I've seen people creating HTTP connections in view-controllers, controllers being NSXMLParser's delegate, controllers containing data in member variables. I've even seen UITableCells establishing HTTP connections. Don't do it!
Your model and its basic manipulation code should be extracted from the user interface as much as possible. As you have already created the model in your web application, try to recreate the entities in your iPhone project. Don't be afraid of having some simple methods in entity classes, but do not make them use external resources, especially TCP connections. As examples of methods in an entity class, you might have methods that format data in specific ways (dates, for example, or returning a full name as the concatenation of first name and surname), or you can even have a method like - (void)update that acts as a wrapper calling the class responsible for updating the model.
Create another class for updating the model, i.e. fetching the XML from the web app. Do not even consider using synchronous connections, not even from a dedicated thread. Asynchronous connections with a delegate are the way to go. Sometimes multiple requests need to be made to get all the required data. You might want to create some kind of state machine to keep track of which stage of downloading you are in, progressing from stage to stage, skipping to the end if an error occurs, and re-executing the failed stage after a few moments.
Download data somewhere temporary, and only when you have it all, swap it in and update the user interface. This helps responsiveness during app launch: the user gets to work immediately with the data stored locally, while the update mechanism downloads the new data.
If you need to download lots of files, try to download them simultaneously, if the dependencies between files allow for it. This involves creating a connection per request, and probably a delegate instance for each of them. You can of course have only one delegate instance for all of those connections, but then it gets a bit more complex to track the data. Downloading simultaneously can decrease latency considerably, making the mechanism much faster for the user.
To save time and bandwidth, consider using HTTP's If-Modified-Since and/or ETag headers. Remember the time or tag from the last time you requested the data, and send it in the HTTP header the next time. Your web application should return HTTP status 304 if the content has not changed, and the iPhone app should react to this code accordingly in connection:didReceiveResponse:.
Create a dedicated class to parse the XML and update the model. You can use NSXMLParser, but if your files are not huge I strongly recommend TouchXML; it's such a pleasure to work with the XML as a document (it also supports XPath) instead of through an event-based API. You can also use this parser when files are downloaded to check their validity, re-downloading if parsing fails. That's when a dedicated parsing class comes in handy.
If your dataset is not huge and you do not need to persist downloaded data on the iPhone forever, you probably don't need to store it in an SQLite database; you can simply store it in XML format as simple caching. That at least might be the way to go for a Twitter-style app. It is easier that way, but for bigger data sets XML consumes a lot of memory and processing power; in that case SQLite is better.
I'd suggest using Core Data, but you mention this is your first iPhone app, so I suggest you don't use it. Yet.
Do not forget about multitasking: your app can go to sleep in the middle of a download, so you need to cancel connections and clean up your update mechanism. On the app's wake-up you might want to resume the update.
Regarding the view part of the application - use Interface Builder. It might be painful in the beginning, but it pays off in the long run.
View controllers are the glue between model and views. Do not store data in there. Think twice about what to implement where, and who should call it.
This is not related to the architecture of the app, but I want to point out that Objective-C is a very expressive language; code should read much like a sentence. Extend classes with categories. As an example, the other day I needed the first line of a string. Sure, you can write a one-liner that finds the first occurrence of a newline and takes the substring from the beginning to there, but it doesn't look right. Instead I added - (NSString*)firstLine to NSString via a category. Code looks so much better this way; it doesn't need any comments.
There are lots of things to consider in both architecture and design of any project, they both should go hand in hand. If one is causing trouble to the other, you need to adapt. Nothing is written in stone.
I'm currently working on an app that sounds similar to yours. I'd also suggest ASIHTTPRequest, and probably something like TouchJSON for JSON parsing, or extending/making a delegate of NSXMLParser if you want to parse XML.
As suggested by JosephH, depending on how your app works you may want to consider alternate authentication methods: I'd take a look at something token-based like OAuth, which has ready-made libraries for people to dig in to.
SQLite is totally viable for feed caching, although I prefer NSCoding so that you can freeze-dry your custom data structures.
As a general suggestion, make sure to spend a lot of time thinking about every use case and corner case for connections: it's easy to assume a user will only contact the server in certain ways and at certain times, and then after you throw in multitasking/incoming calls/lock screen/memory warnings, things can get hairy without any planning.
All in all, you seem to be on the right track, just make sure you plan out everything beforehand :)
Apple has a brand new, in-depth piece of sample code, MVCNetworking, that shows how to use subclasses of NSHTTPRequest and NSOperationQueue.
As others mentioned, I think you are asking the right questions and are heading in the right direction. All of the replies above are valuable advice. Here is my advice, and I hope you'll find it useful.
No matter which method or library you choose to talk to your web services, I think it's important to make a clean separation between the data model on the phone vs. the data model in your web application. You have three major distinctions to keep in mind in your design:
Data model on the web application (reflected by your existing mySQL database)
Since this is already there, there is not much to say about it, except that it will heavily influence your design for the following two parts. I suggest making this model the 'master reference' for how your data is represented across platforms.
Data model on the iPhone app (reflected by the information you need to display in the iPhone app)
This is where the fun begins. First, you need a good understanding of what data you need to display in the phone app. So have a good, high level design of your app first (use pen and paper, draw mock-ups of each view and the interactions between them, model the navigation between your view controllers etc.). It really helps to understand the interactions between your view controllers and the various bits and pieces of data you want to show in the app. This will help you create the requirements for the data model on the phone. Based on these requirements, map the existing (web) data model to a new model, suited to your iPhone app. This new model may or may not include all tables and fields found in your web app. But the general representation of the 2 models should be very similar (e.g. relationships, data types, etc.)
Data model used to communicate between the 2 above (this is your 'data exchange protocol')
Once you have the 2 representations of your data above, you need to 'translate' from one to the other, both ways. Design your data exchange protocol to be as simple and compact as possible. You don't want to waste bytes on useless information, as transmissions over the network are costly. (As a side note, you might think of compressing the transmitted data later on, but it's just as important to have a good design from the beginning). It's probably best to begin with a protocol in which the metadata is the same as the one in your web application model (e.g. same relationships, names of tables, attributes, etc.). But remember, you'll only have to serialize/de-serialize those entities and relationships that you listed in point 2) above. So design accordingly. Your exchange protocol may also include session tokens, authentication info, a version number, or other metadata, if you need it.
Remember: your data exchange protocol is what will de-couple your web application and iPhone application models. I found that it's best to de-couple them because they may both evolve over time. The data model on the iPhone for example, may evolve a lot especially when you will find that you need to re-model some relationships or add/remove attributes from your entities in order to improve application responsiveness, or the user experience, the navigation, or whatever.
Since this is a whole concern in and of itself, you need to design a generic serialization/de-serialization mechanism on top of whatever JSON/XML parser you choose, one that is flexible enough to absorb the potential differences between your two data models. These differences might be: entity/attribute/relationship names, primary-key identifier names, data types, attributes to ignore, and the list goes on. I would definitely implement a serializer/de-serializer utility class in the iPhone app, backed by a .plist configuration file containing all the supported entities, concerns, and aliases you might have. Of course, each model object should 'know' how to serialize and de-serialize itself and its relationships (i.e. to the required object-graph depth).
One last note, since you will end up with 2 representations of your data, you will need a way to uniquely identify an object on both sides. So for example, think of adding a uuid attribute to all data that needs to be exchanged, or use any other approach that suits your needs.
I am building an app that has similar requirements to yours, and these are the approaches I found to be best so far. Also, you might find this video useful (it inspired me a lot on how to implement some of the issues I mentioned above and is especially interesting if you're using CoreData) :
http://itunes.apple.com/ca/podcast/linkedin-important-life-lessons/id384233225?i=85092597
(see the lecture entitled "LinkedIn: Important Life Lessons on CoreData & GameKit (March 12, 2010)" )
Good luck!
It's quite a broad question, and I think you're going in the right way anyway, however I'll do my best to give some advice:
JSON, ASIHTTPRequest and POSTs to PHP scripts sound like a great way to go.
If the data is not really sensitive, I'd use http most of the time, and use https only for a login page that either sets a cookie or returns a "token" that you use in subsequent requests. (HTTPS can be quite slow over a 3G connection as the overhead in terms of number of packets to setup an SSL connection is higher than a plain TCP connection.)
You should make sure you correctly pass any data from the input to the PHP scripts to the database, to avoid SQL injection attacks; i.e. use parameterised SQL, and don't build SQL queries by doing "SELECT * FROM users WHERE username='" . $_GET['username'] . "'"
I would do this the way I have done a lot of AJAX web-page stuff, i.e.:
Have a URL on your server side package the information to be transmitted into XML format (this can be through a CGI/PHP script or whatever). You're transmitting XML in the message body, so it's easy for a human to read and to debug with a standard web browser.
Use the standard iPhone NSXMLParser methods to parse the individual data fields out of the XML document, and write them back to your database. This class is equipped both to fetch the data from a URL and to parse it in one call, like:
NSURL *xmlURL = [NSURL URLWithString:@"http://www.example.com/livefeed.cgi"];
NSXMLParser *myParser = [[NSXMLParser alloc] initWithContentsOfURL:xmlURL];
Walk through the data hierarchy with the NSXMLParser methods and populate your database accordingly.

Strategies for "Always-Connected" Windows Client Data Architecture

Let me start by saying: this is my first post here, it is a bit lengthy, and I haven't done Windows Forms development in years... With that in mind, please excuse me if this isn't directly a programming question, and please bear with me as I really need the help!!
I have been asked to develop a Windows Forms app for our company that talks to a central (local area network) Linux server hosting a PostgreSQL database. The app is to allow users to authenticate themselves into the system and thereafter conduct the usual transactions with the PG database. Ordinarily, I would propose writing a WebForms app against Mono, but the clients need to utilise local resources such as USB peripheral devices, so that is out of the question. My questions are set out in the dilemmas below:
Dilemma #1:
The application is meant to be always connected. How should I structure my DAL/BLL - Should this reside on the server or with the client?
Dilemma #2:
I have been reading up on Client Application Services (CAS), and it seems like a great fit for authentication, as everything is exposed via URIs. I know that a .NET data provider exists for PostgreSQL, but I'm not too sure whether CAS will all work against a Linux (Debian) server. Believe me, I would get my hands dirty and try it myself, but I need to come up with a logical design first, before resources are allocated to me for "trial purposes"!
Dilemma #3:
If the DAL/BLL is to reside on the server, is there any way I can create data services and expose only these services to authenticated clients? There is a (security) requirement whereby a connection string with a username and password for the database cannot be present on any client machine, even if security on the database side is quite rigid. I'm guessing that the only way for this to work would be to create the various CRUD data-service methods exposed by an ASP.NET app, and have the Windows Forms app request or persist data through the ASP.NET app (via a URI), which would return a result set or value. Would I be correct in assuming this? Should I be looking into WCF Data Services? And will WCF work with a non-SQL Server database?
Thank you for taking the time out to read this, but know that I am desperately seeking any advice on this! THANKS A MILLION!!!!
EDIT:
I am considering also using NHibernate as my ORM
Some parts of your questions are complicated and beyond my expertise. However, in general you can do almost anything you put effort into, CAP theorem and the like aside.
DAL/BLL stuff in general can reside in any of the tiers. I put a lot of this in my database and some in the middle tier, but that is to allow re-use in different environments, which may or may not be a goal for you. The thing is, I would think carefully through the separation-of-concerns issues here and what sort of centralization of logic you want. The further back you push the logic, the more re-usable it becomes, but this is not always a free tradeoff.
I am not entirely familiar with CAS, but it looked like AJAX kinds of stuff from what I saw on the MSDN web site. That could be wrong, but if it is right, then you have a problem in that such requests may be stateless, which could be an issue if you need a constant connection.
On the whole, based on what you are saying, it sounds cleanest to build a two-tier rather than a three-tier app and have the DAL/BLL sit on the client, possibly supported by stored procedures on the server. You can then set PostgreSQL up to authenticate against whatever you use on your network (KRB5 against AD is what I would recommend). This simplifies your data access, and it allows you to control permissions based on authentication against the database. Since you can authenticate users based on AD, you can then set permissions accordingly.
One important consideration is going to be the number of connections. PostgreSQL has some code paths where every current connection must be checked and iterated through, and connection startup and tear-down overhead can be significant in some cases. So one important decision will involve connection pooling. Whether connection pooling boosts performance will depend on what you are doing, but I have seen cases where PostgreSQL handled 600 connections without serious problems.
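To make the two-tier suggestion concrete, here is a minimal sketch assuming the Npgsql ADO.NET provider; the host, database, and table names are invented, and connection-string keyword spellings vary somewhat between Npgsql versions. The point is that there is no password in the connection string, and that pooling is configured to bound the number of backend connections:

using System;
using Npgsql;

class Repository
{
    // No password here: with integrated security, Npgsql authenticates via
    // Kerberos/SSPI using the current Windows (AD) identity, which satisfies
    // the "no credentials on the client" requirement.
    const string ConnString =
        "Host=dbserver;Database=appdb;" +
        "Integrated Security=true;" +
        "Pooling=true;Maximum Pool Size=20";   // cap backend connections

    public static int CountOrdersFor(string customer)
    {
        using (var conn = new NpgsqlConnection(ConnString))
        using (var cmd = new NpgsqlCommand(
            "SELECT count(*) FROM orders WHERE customer = @c", conn))
        {
            cmd.Parameters.AddWithValue("c", customer);
            conn.Open();                       // drawn from the pool when one is idle
            return Convert.ToInt32(cmd.ExecuteScalar());
        }
    }
}

With pooling on, the 600-connection scenario rarely materializes: a handful of pooled connections per client machine is typically enough.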

How do we share data between two different services

I am currently working on a web service which is periodically polled. It does not store its state and is instantiated every time it is queried. Essentially, it retrieves the state of other external entities, e.g. databases, and delivers it back to the requester.
Recently, the need to store state has arisen, in that:
There is a need to continuously collect data from a particular source and store the bits that are important/relevant
There is a need to collect the aggregate of a particular data source over a period of time
I came up with the following idea: share the data between the two services through a static class. My main concern here is the fact that I am using a static class (essentially a global) to share data between the two services. Is there a better way of doing this?
edit: Thanks for the responses thus far. Apologies for the vagueness of this question; I'm just trying to work out the best way to share data across different services and am unsure as to the specifics (i.e. what is required). The platform I am developing on is the .NET Framework, and both services are simply WCF services hosted in a Windows service.
The database route sounds like the most conventional way to go; however, I am reluctant to go down that path for now (mainly for deployment/setup reasons: it introduces the need to create new tables, etc., in addition to simply installing the software) given that, at this point, only relatively small amounts of data are transferred. This may of course change in the future, and the database route might be the way to go at that point.
Is there any other way besides adding a database persistence layer?
If you need to collect and aggregate data, you might want to consider using a database between the two layers. Or have I misunderstood something?
You should consider enhancing your question with more requirements: pretty much all options are open here.
Sure - how about data binding? I don't have a lot of information to go on here about your platform, but most sufficiently advanced systems offer it in some form.
You could replace your static shared data with some database representation, with a caching layer (like memcached) between the database and the web service, so that most of the time the data is available very quickly from the cache but can be retrieved from the database as needed (see the sketch below).
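As a sketch of that read path (the ICache interface here is a hypothetical wrapper over whatever cache client you pick, e.g. a memcached client; the point is only the cache-aside pattern): try the cache first, fall back to the database on a miss, then repopulate the cache.

using System;

public interface ICache
{
    bool TryGet(string key, out string value);
    void Set(string key, string value, TimeSpan ttl);
}

public class ReadThroughStore
{
    private readonly ICache _cache;
    private readonly Func<string, string> _loadFromDb;  // stand-in for the real DB query

    public ReadThroughStore(ICache cache, Func<string, string> loadFromDb)
    {
        _cache = cache;
        _loadFromDb = loadFromDb;
    }

    public string Get(string key)
    {
        if (_cache.TryGet(key, out var cached))
            return cached;                              // fast path: served from cache

        var value = _loadFromDb(key);                   // slow path: hit the database
        _cache.Set(key, value, TimeSpan.FromMinutes(5)); // repopulate for next time
        return value;
    }
}

The time-to-live bounds how stale the shared data can get; pick it based on how fresh the polling service needs the aggregates to be.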
I appreciate that you want to keep the architecture simple. Depending on the number of items you have to look up and their permanence, you might just consider leveraging your file system or a message queue. It sounds like you want a file system, because that sounds like the least impact on your design.
If you start dealing with tens of thousands of small files, your directories can get hard to navigate and slow to do file lookups in. I typically shoot for about 1000 - 10000 files per directory, and concoct a routine that generates a path to the file from the file name (a sketch of such a routine follows). Keeping the spread of files across subdirectories even is important, and some file systems have a limit on the number of subdirectories in a parent directory.
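As a minimal sketch of such a path-generating routine (the storage root and .dat extension are invented): hash the key and use the first two hash bytes as two directory levels, which keeps the spread even and any single directory small.

using System.IO;
using System.Security.Cryptography;
using System.Text;

static class FileStore
{
    const string Root = @"C:\data\store";    // hypothetical storage root

    // Maps a key to Root\xx\yy\key.dat, where xx and yy come from a hash of
    // the key: 256 x 256 subdirectories, so a million files lands at about
    // 15 files per directory.
    public static string PathFor(string key)
    {
        byte[] hash;
        using (var md5 = MD5.Create())       // uniform spread; security is irrelevant here
            hash = md5.ComputeHash(Encoding.UTF8.GetBytes(key));

        string dir = Path.Combine(Root, hash[0].ToString("x2"), hash[1].ToString("x2"));
        Directory.CreateDirectory(dir);      // no-op if it already exists
        return Path.Combine(dir, key + ".dat");
    }
}

// Usage: File.WriteAllText(FileStore.PathFor("order-42"), payload);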