On an iPhone app (or mobile in general) that constantly needs to send requests to a Web Service, is it better to work with one single requests that will fetch a large amount of data or multiple (possibly simultaneous) requests for each element with smaller amount of data fetched.
Example
I want to load a list of elements in a node. I have the node's ID. The 2 ways I can fetch the elements are the following :
send a single request with the node ID and get all the information about the n first elements in the node in a single response ;
send a first request with the node ID to get the IDs of the n first elements in the node, then for each one send another request to have one response per element.
I'm balanced between about that.
the heavyweight single response may cause more lag and timeouts because of the very unstable and slow mobile internet connection ;
the phone may have trouble handling too many responses at the same time.
What's your opinion ?
Since there is overhead for every request, one large request is generally faster than several small ones of the same size. This applies to high speed networks too, but in mobile networks the ratio between transfer speed and latency is even bigger.
I don't think the phone will have any problem handling the responses, so the multiple requests approach seems better for large requests/answers. However, depending on the size of your requests/responses, it may actually be faster to do it in a single request, in order to reduce the delay associated with multiple requests. The single request approach will also need to transfer slightly less data than the multiple request one.
Every call will have its overhead (i.e. network load), the number of connections might also be limited.
You might or might not be able to update your user interface during download, depending on how often your callbacks are called - you may be able to process partial data as it arrives.
If your data is easy to compress (typically text data), then using a single call might even reduce your total network usage footprint even more.
If the chunks of data are large, I'd go with several individual ones. This will also make things easier in case of network errors. Bottom line for me is to just get the right balance - make the packets reasonable sized and don't flood the server.
This is depend upon the situation. If you don't want to bother your user to waiting everytime throughout the app then you can use single request to load all the data at a time.
If you don't mind to let user wait then you can use multiple request on demand. For example if you just want to show title in tableview and detail when user tap on any title. So you can first get the title only and then when user tap you can get details for that title by ID. so that would be pretty good way to request on demand only.
Sometimes the situation merits use for only single requests for say a certain category. Say you have a twitter app and the tweets are seperated out into categories. Someone who has the app but only cares about sports may only look at the sports section which could be a single ajax call. Another user may only be intersted in two categories out of 15 categories. This means the user doesn't have to load unneccessary data. The important thing you need to determine is this.
Does all of the data need to be loaded all at once for the app to work correctly and are your users generally going to want all that data in the first place.
Related
I know that from most of today's REST APIs, web calls responses have to be paginated.
But, I don't see on the web any insight on how to select the ideal size of a batch returned by an API call: should it be 10, 100, 1000. To be short: on what factors should we base the reflection of the size of an API response?
Some people state that it should be based on the number of elements displayed by the UI. I don't agree with this, as not all APIs are directly linked with an UI, and, in any cases, modern REST APIs allow to chose the number of items in output batch with a configurable parameter up to a certain amount.
So, how could we define the value for this "maximum number of elements returned by an HTTP request"?
Should it be based on the payload size? Internal architecture of the API? Should it be based on performance measurement?
Any insight on this? I'm not really looking for an explicit figure, but more some techniques that could help to find the answer. The best for me would be the process followed by some successful APIs. But, I cannot find any.
My general approach to this is to:
Try to avoid paging
Make REST clients resilient against paging changes, by using next and previous links, instead of using a ?page= attribute. A REST client knows another page is available strictly by the appearance of the link.
I don't use hard rules or measurements to figure out when paging is needed. Paging is generally painful, so my first approach would be to try to figure out what requirements drives the need for paging, and if that requirement can be removed.
Once I've determined it's not possible to remove this requirement in another way, I would set the cut-off of a page as large as reasonable, to remove the likelyhood clients need to do additional requests.
If it's a closed API (used only by clients you control), pick whatever the UI wants. It's trivial to change. If clients can choose from several options, you can include a pageSize parameter. Or, better..
If it's an open API (used by clients you don't control), then let clients control what size paging they want. Rather than support a pageNumber parameter, support offset and limit. Offset is how many records to skip before starting to return records, and limit is the maximum number of records to return. If the client is not happy with how their request performs, they can adjust the parameters to suit their needs. Any upper limit your API has should be driven by what your service can handle. It's neither possible nor desirable for you to try to figure out the Magic Maximum Page Size that makes all clients happy and no clients sad.
Also, please note that none of this has anything to do with ReST, which is silent when it comes to paging.
I usually make a rough performance measurement by hand. I want as few requests as necessary, but do not want to risk timeouts.
I need some guidance on how to properly build out a system that will be able to scale. I will give you some information about what I am trying to do and then ask my specific question.
I have a site where I want visitors to send some data to be processed. They input the data into a textarea or upload it in a file. Simple. The data is somewhat preprocessed on the client side before a POST request is made to a REST endpoint.
What I am stuck on is what is a good way to take this posted data store it and then associate an id with it that references the user since I cannot process the data fast enough for it to be returned to the user in a reasonable amount of time?
This question is a bit vague and open to opinion, I admit it. I just need a push in the right direction to keep moving. What I have been considering is throwing the data into a message queue and then having some workers process the data elsewhere and when the data is processed alert the user as to where to find it with some sort of link to an S3 bucket or just a URL to a file. The other idea was to just run the request for each item to be processed against another end-point that already processes individual records in some sort of loop client side. The problem is as follows with this idea:
To process the data it may take somewhere from 30 minutes to 2 hours depending upon the amount that they want processed. It's not ideal for them to just sit there and wait for that to finish depending on the amount of records they need processed, so I have ruled this out mostly.
Any guidance would be very much appreciated as I don't have any coworkers to bounce things off of, nor do I know many people with the domain knowledge that I could freely ask. If this isn't the right place to ask this, could you point me in the right direction as to where it should be asked?
Chris
If I've got you right, your pipeline is:
Accept item from user
Possibly preprocess/validate it (?)
Put into some queue
Process data
Return result.
You man use one or several queues on stage (3). Entity from user gets added to one of the queues. If it's big enough, it could be stored in S3 or storage alike, and only info about it put into the queue: link, add date, user id (or email of alike). Processors can pull items from queue and give feedback to users.
If you have no strict requirements on order, things get much simpler: you don't need any sync between them. Treat all the components: upload acceptors, queues, storages and processors as independent pools of processes. Monitor each pool separately. If there's some bottlenecks - add machines to that pool.
I have a small APP which allows users to view information on Beers and Beers they have tried for a local Bars Beer Club.
I have 4 Views. Beer Menu, All Stats, My Stats and Settings.
Originally, I thought to pull all of the data via a web service and return xml at initial load of the app, and use it throughout.
OR...
I could just pull what I need when I need it. This would result in just pulling the data I need, which would be faster, but it would result in more requests. What would be better:
a) pull all data, store globally, build views as needed.
b) pull only data I need, when I need it. This means if they click on a beer, I would make a request for that beers info. If they clicked on 10 different beers, then that would be 10 different requests.
What is better? Or does it even matter.
yeah, I think on mobile devices these kind of decisions to matter.
With these kind of concerns I think sometimes there is no right answer but here are a few pointers:
Use json, not xml (if you can)
it's less verbose and, depending on the data, could make a difference to the speed.
Do not block the UI thread
This is really a general guide to all app development, in my opinion. The worst thing you can do is block the UI thread.
Coding for a progressive UI that loads data separately will always be more fiddly than just doing a batch load, and then returning everything. But the extra work will really make your User Experience a lot more pleasurable.
Be clever about your requests
This kinda of carries on from the last point. I'm not saying do a million request, but do try and find a balance before less requests, and loading data as needed (which would suggest more requests).
Try and really think about how the user is going to use your app, and see if you can do some clever pre-fetching based on what you THINK the user might need more in the certain view.
i.e What is the most likely view to be used next? can you pre fetch the data for that?
This last part is really the fine tuning, and will result in a lot of trial and error. But the end result will hopefully be a really great app that just feels fast, and feels right.
I'd go with loading cached data on launch (if it exists) and then load fresh data in the background as needed. This keeps your app as responsive as possible. it's a balance between draining battery life on requests VS responsiveness and data availability. I think the balance is caching information with a timestamp (if the data changes, if not it's even better) and then update as needed.
So, here's the problem. iPhones are awesome, but bandwidth and latency are serious issues with apps that have serverside requirements. My initial plan to solve this was to make multiple requests for bits of data (pun unintended) and have that be how the issue of lots of incoming//outgoing data was handled. This is a bad idea for a lot of reasons, most obvious to me is that my poor database (MySQL) can't handle this very well. From what I understand it's better to request large chunks all at once, especially if I'm going to ask for all of it anyways.
The problem is now I'm waiting again for a large amount of data to get through. I was wondering if there's a way to basically send the server a bunch of IDs to get from the database, and then that SINGLE request then sends a lot of little responses, each one containing all the information about a single db entry. Order is irrelevant, and ideally I'd be able to send another request to the server telling it to stop sending me things because I have what I need.
I realize this is probably NOT a simple thing to do so if you (awesome) guys could point me in the right direction that would also be incredible.
Current system is iPhone (Cocoa//Objective-C) -> PHP -> MySQL
Thanks a ton in advance.
AFAIK, a single request cannot get multiple responses. From what you are asking, it seems that you need to do this in two parts.
Part 1: Send a single call with the IDs.
Your server responds with a single message that contains the URLs or the information needed to call the unique "smaller" answers.
Part 2: Working from that list of responses, fire off multiple requests that run on their own threads.
I am thinking of this similar to how a web page works. You call the HTML URL in a web browser. The HTML tells the browser all the places/URLS it needs to get additional pieces (images, css, js, etc) to build the full page.
Hope this helps.
Does anyone know how sites that have a real-time feed of a lot of data work? I am referring to something like a stock site, where they can tell you in real time (well, 20 minute delay mostly, but still real-time - 20 minutes as I understand it).
They have thousands of data pieces delivered to them every second, I would imagine: MSFT 25.00 +.23 VOL 12000 ???? for each stock that had a change during some interval.
So, is there just a constant feed of small pushes going on? Or do you think a site will pull from the place that has the real data and say "give me all changes since 12:23:45 CST to now" type query?
I ask this because at work we might have a situation where we need to have at our application's fingertips real time information like this, and it won't make sense to hit our third party provider over and over and over again every second...
Generally there is a server/client protocol defined between the 2 parties. In the company I work for the connection is maintained at all times.
Here is info on real time data feeds to go with your stock example
NYSE,NASDAQ
It is common for data providers to also have FTP sites with (delayed) batched data. One that comes to mind is the NWS EMWIN
Sites like Twitter feed data to certain approved sites in real-time via XMPP (Wiki link).
In the broadest terms, a push model is going to be the best way of achieving "real time" transfer, particularly if you're talking about a large amount of data.
However you do always have a problem when using a purely push model of how to recover from missed data.
Depending on the nature of your data that may not be a problem (thinking of video delivery as an analogue, where the amount of data is huge but there is sufficient redundancy for it to recover from missing data). And if you have any control over the data you may be able to build some redundancy in. For example, on every change event you can provide absolute values rather than changes, or previous value and new value.
I've done this making an attempt to retrieve the stock quote from the source, and falling back to a timestamped on-disk cache of the quote when the main source fails or times out.