High traffic site. >10 million user a day. VPS or dedicated server?

High traffic site. >10 million user a day. VPS or dedicated server? - iphone

We're launching an iPhone app soon, and if everything goes well, we might reach up to tens of millions of user each day.
What server solution would you use for this? I guess a small VPS isn't enough. Is dedicated server a better choice? Is there any good hosting provider that can provide such servers?
I'm a newbie when It comes to servers, and would like some basic info about how to handle this.
Thanks in advance

Unfortunately, you are not really going to know the apps requirements until the app is launched. It all depends on how much the app needs to communicate with the server, and how often users are using the app. Depending on those variables and even more, a VPS might be enough, or you may need a dedicated box, or several. It also depends a lot on the performance of the VPS and dedicated boxes, furthermore it depends on how much access to the system you need.
Ultimately, it seems you may not even know how well the app is going to do, so I suggest you take the cheap/efficient route of using cloud computing. That way you will limit your expenses initially when you app has a small user base. Then your performance can amp up as quickly as your app requires (of course so will the price). That is the benefit of cloud computing, you will not be losing money in the beginning until you have the user base to use your server to its limit. Furthermore, you do not have downtime, etc when/if your server is no longer enough.
Check out Google's Cloud Computing to get a hint of what is possible. I personally like Google's cloud experience, but you have many more options with varying degrees of freedom that you will have to check out. Amazon of course is another possibility.

Related

Google cloud VMs and Cloud SQL

Basically, I'm confused after looking at so many Google Cloud products. I'm starting up a new project that includes a website, an iOS app, and an android app. I've decided to move forward with the Compute Engine as I'll have the flexiblility to do a lot stuff.
I'm thinking of using Cloud SQL for database service. I know that I can install MySQL on my VM. But I'm not sure what's the pros and cons. I'm still researching on this but in the mean time some experts opinion would be greatly appreciated.

TL;DR: Go with managed Cloud SQL. Better than doing it yourself and it doesn't cost much.
I'm no expert but I can tell you from previous experience that a managed database solution feels like much less of a hassle than doing it from scratch. Installing and configuring MySQL isn't especially hard, but it can get tedious (especially for devs like me who have done this many times over).
Also, when your app begins to grow, it'll just be a matter of pushing a few sliders to make your DB respond better to all the traffic. Trust me, you can enjoy a higher quality of life with words like "sharding" and "replication" not being part of your technical vocabulary.
Lastly, I don't remember Cloud SQL to be very expensive.

Getting started with servers / networking for an iOS application

I'm developing an iOS social networking application that will involve users sharing and rating photos. I've spent about the last year or so on and off teaching myself how to develop in cocoa touch and now I'm ready to get started with the networking aspect of the app. Unfortunately, I have 0 networking / database experience and was wondering if anyone had any good advice on what things to consider and where / how to get started. In all likelihood I'm probably not going to build my own server and instead will go with something like rack space. Any advice would be appreciated.

Getting started to iOS Networking application. Here are the required things:-
You really need good backend developing experience in MySQL and complex database queries Plus experience in developing web services. At the root you need server host for backend and Admin module as you mentioned in question like race space or any other. You need to spend some months with mysql backend,web service and admin module implementation.
For fully functional social networking app your first task will be to manage users. Log in/Sign up will be there. Every user can post his status and can comment to other user's status.All status posts or comments will have their unique id and relationships to userid's table. May be image uploading and comments on photo will also be there So there will be lots of tables and relationships between them on backend side.
For Social Networking app, amount of work will be bigger on both backend/web service side and iOS side.

This is a pretty abstract question.
Eventually, you're going to have to focus on a particular architecture ( which database, which network technology), but before you do that, you need to get an outline idea of what the available options are and what the strengths and weaknesses of each are.
The server database is probably the easiest, as it doesn't make such a big difference. The choices are an sql database ( mysql, sqllite ....), not an sql database ( nosql, in memory tables) or some higher level abstraction where you can hide the differences and decide later ( core data for example). This may be constrained by your deployment decisions, you'll have more choices of rack options if you go for LAMP (Linux, Apache, MySql, PHP) for example. If you stick with apple kit and use a MacOS X back end, you may have additional options for network architecture ( DistributedObjects for instance.
The network architecture is also a difficult choice from a wide range of choices. There are really dozens of options with lots of different pros and cons. As this is your first foray into this area, I'd count ease of use and availability of help high among your priorities. Here are a few popular technologies you might want to investigate ( in alphabetical order to minimise flammability :) ). DistributedObjects ( apple), JAXP, JSON, RPC, SOAP, XML ( bare, without the soapy bits).
One more question you should ask yourself is "Can you get away with just a database connection, or do you need to do processing on the back end?". If you can, you might be able to get away with just using a remote database and then you only need to learn core data ( which will still keep you busy for a long time).
Once you've decided what technologies you want to use, then you can start learning.
You mentioned using a hosted server. You'll want to be able to run a test server locally. Fortunately almost all the worthwhile options for both database and network technology will run on any unix-like machine, so you can probably use your regular dev machine.
Also bear in mind that some of these choices are religious, so everyone you read will have strong biases ( myself definitely included).

Which database back-end shall I use?

I am writing an iPhone app, that requires cloud back-end DB storage. I have a couple options in mind, and was wondering which one is better fit?
What I need:
be able to perform GRUD in the cloud from the iPhone app
the DB needs to scale (speed-wise) without much or any management
schema free
all i need is to store maybe 1 million records
Google App Engine:
Uses bigTable, scales, and schema free, but I need to write a RESTful interface
CouchDB:
Recently released iOS support, RESTful built-in, but I worry about scaling when syncing with remote server
SimpleDB: (seems to be my best pick)
Has iOS SDK, so I can do GRUD directly, auto scale (I probably won't be running into the 10GB limit), schema free
MongoDB:
Don't know much about, from what I hear, it's faster than SimpleDB, and easy to setup, but again I need to do the admin work
Cassandra:
Too much work, for what I need.
Any insight or feedback or correction is great appreciated.
Regards,
Johnny

If you're looking for zero management on your end, then you've already answered yourself that SimpleDB or GAE are probably your best options.
SimpleDB is probably better in your case, because it'll save you from having to write a simple RESTful interface on top of GAE.
Note that both of them aren't great in terms of speed. I worked with both and there's visible query latency. Unfortunately there's no way for you to tune that - you're completely in the hands of Amazon/Google. That's the price you pay for not managing the datastore yourself, so I guess you'll have to decide if you're willing to pay that price.
I recommend that you try SimpleDB, which is simple enough, first. If latency is a problem then you can move to hosting and tuning your own Mongo or some other option.

SQL Azure Services. Meets your requirements above.
http://en.wikipedia.org/wiki/SQL_Azure

Host at Facebook to avoid traffic or other possibilities?

is it possible to let my own facebook apps (not generating revenue) being hosted by facebook?
The problem is that by using the iframe-version the traffic/requests are killing the server :-(
But I need to connect to a database and print/calculate values, so I think there is no other way than hosting everything on own servers. But maybe there are things I don't know.
What is the way you would go?

I don't think Facebook has an option to host apps, at least not that I've ever heard of or was quickly able to find on their developers site.
Honestly, when it comes to hosting a high-demand website, there's no free way to do it. Resources cost money. You can pick from tons of hosting providers and see who gives you the features you need at the best rate. Maybe some will offer free hosting if you include ads in the Facebook app, maybe some will offer free hosting for other means, etc.
For a non-revenue-generating app, when it becomes popular and successful and requires real resources to keep it running, it's generally time to start thinking about how to generate revenue from it. Maybe use it as a free gateway app to other revenue-generating apps (a loss leader), maybe have ads, maybe use it to generate useful marketing data, etc. For a successful site it may involve a good bit of personal investment and risk before the profits roll in (Facebook being a good, though extreme and uncommon example of this).

You have to host the application on your own, there's no way that FB does it for you.

What to do when you've really screwed up the design of a distributed system?

Related question: What is the most efficient way to break up a centralised database?
I'm going to try and make this question fairly general so it will benefit others.
About 3 years ago, I implemented an integrated CRM and website. Because I wanted to impress the customer, I implemented the cheapest architecture I could think of, which was to host the central database and website on the web server. I created a desktop application which communicates with the web server via a web service (this application runs from their main office).
In hindsight this was rather foolish, as now that the company has grown, their internet connection becomes slower and slower each month. Now, because of the speed issues, the desktop software times out on a regular basis, the customer is left with 3 options:
Purchase a faster internet connection.
Move the database (and website) to an in-house server.
Re-design the architecture so that the CRM and web databases are separate.
The first option is the "easiest", but certainly not the cheapest long term. Second option; if we move the website to in-house hosting, the client has to combat issues like overloaded/poor/offline internet connection, loss of power, etc. And the final option; the client is loathed to pay a whole whack of cash for me to re-design and re-code the architecture, and I can't afford to do this for free (I need to eat).
Is there any way to recover from when you've screwed up the design of a distributed system so bad, that none of the options work? Or is it a case of cutting your losses and just learning from the mistake? I feel terrible that there's no quick fix for this problem.

You didn't screw up. The customer wanted the cheapest option, you gave it to them, this is the cost that they put off. I hope you haven't assumed blame with your customer. If they're blaming you, it's a classic case of them paying for a Chevy while wanting a Mercedes.
Pursuant to that:
Your customer needs to make a business decision about what to do. Your job is to explain to them the consequences of each of the choices in as honest and professional a way as possible and leave the choice up to them.
Just remember, you didn't screw up! You provided for them a solution that served their needs for years, and they were happy with it until they exceeded the system's design basis. If they don't want to have to maintain the system's scalability again three years from now, they're going to have to be willing to pay for it now. Software isn't magic.

I wouldn't call it a screw up unless:
It was known how much traffic or performance requirements would grow. And
You deliberately designed the system to under-perform. And
You deliberately designed the system to be rigid and non adaptable to change.
A screw up would have been to over-engineer a highly complex system costing more than what the scale at the time demanded.
In fact it is good practice to only invest as much as can currently be leveraged by the business, using growth to fund further investment in scalability, should it be required. It is simple risk management.
Surely as the business has grown over time, presumably with the help of your software, they have also set aside something for the next level up. They should be thanking you for helping grow their business beyond expectations, and throwing money at you so you can help them carry through to the next level of growth.
All of those three options could be good. Which one is the best depends on cost benefits analysis, ROI etc. It is partially a technical decision but mostly a business one.
Congratulations on helping build a growing business up til now, and on to the future.

Are you sure that the cause of the timeouts is the internet connection, and not some performance issues in the web service / CRM system? By timeout I'm going to assume you mean something like ~30 seconds, in which case:
Either the internet connection is to blame and so you would see these sorts of timeouts to other websites (e.g. google), which is clearly unacceptable and so sorting the internet is your only real option.
Or the timeout is caused either by the desktop application, the web serice, or due to exessively large amounts of information being passed backwards and forwards, in which case you should either address the performance issue how you might any other bug, or look into ways of optimising the Desktop application so that less information is passed backwards and forwards.
In sort: the architecture that you currently have seems (fundamentally) fine to me, on the basis that (performance problems aside) access for the company to the CRM system should be comparable to accesss for the public to the system - as long as your customers have reasonable response times, so should the company.

Install a copy of the database on the local network. Then let the client software communicate with the local copy and let the database software do the synchronization between the local database server and the database on the webserver. It depends on which database you use, but some of them have tools to make that work. In MSSQL it is called replication.

First things first how much of the code do you really have to throw away? What language did you use for the Desktop client? Something .NET and you may be able to salvage a good chuck of the logic of the system and only need to redo the UI and some of the connections.
My thoughts are that 1 and 2 are out of the question, while 1 might be a good idea it doesn't solve the real problem. And we as engineers should try and build solutions not dependent on the client when ever possible. And 2 makes them get into something they aren't experts at and it is better to keep the hosting else where.
Also since you mention a web service is all you are really losing the UI? You can alway reuse the webservices for the web server interface.
Lastly you could look at using a framework to help provide a simple web based CRUD to start and then expand from there.

Are you sure the connection is saturated? You could be hitting all sorts of network, I/O and database problems... Unless you've already done so, use wireshark to analyze the traffic; measure the throughput and share the results with us.