Serving static files from Amazon S3 - deployment

What is better for serving the static files of most websites (JavaScript, CSS, images, HTML): S3, something like EC2, or yet another option?

S3 with CloudFront enabled would be my choice; then you get the benefits of S3, plus fast access for your users through its edge locations.
If you have really high performance demands, you should look into something that is physically located near your users. In my case, Swedish providers are a bit faster than any of Amazon's solutions (but usually not as nice to use).
EC2 is not really made for static files (or rather, it's made for so much more).

Why go to the trouble of EC2 when Amazon S3 will do this for you, and is a LOT easier to manage? Just use something like BucketExplorer for upload/download, and you don't need to worry about launching instances, making sure they're still running, keeping software versions up to date, paying for idle time, etc.
Also look at Amazon's CloudFront service, which is a CDN for Amazon S3. It may or may not be economical for you, but it can greatly improve download times for your users.
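As a rough sketch of how simple the upload side can be, here is how you might push one static asset to S3 with a long cache lifetime using the AWS SDK for JavaScript (v3); the bucket name, region, and file paths are placeholders:

```typescript
import { readFile } from "node:fs/promises";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

// Placeholder region; substitute your own.
const client = new S3Client({ region: "us-east-1" });

async function uploadAsset(localPath: string, key: string, contentType: string) {
  const body = await readFile(localPath);
  // A long Cache-Control lets CloudFront edge locations keep the file cached.
  await client.send(new PutObjectCommand({
    Bucket: "my-static-site", // placeholder bucket name
    Key: key,
    Body: body,
    ContentType: contentType,
    CacheControl: "public, max-age=31536000, immutable",
  }));
}

uploadAsset("dist/app.css", "assets/app.css", "text/css").catch(console.error);
```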

Related

Application hosting best approach

I'm finishing building my MERN stack web application, and I've started to think about the optimal way of deploying it.
What should be mentioned is that one of the features of the app is to allow the user to upload and modify a moderate number of images. I know that the two main approaches would be to either:
host the app on a shared server like Namecheap and have the images stored in a filesystem
host it using cloud-based PaaS solutions like Heroku with additional storage like Cloudinary.
My question is: which of those would be the optimal solution for an app that will serve mostly academic purposes, plus as an addition to a portfolio? And what would be the best approach if the goal were to host the portfolio's website alongside the project's?
If I understand correctly, I would say the first option is probably cheaper if you have the disk space for it. You would also be under the hood more, connecting the filesystem to wherever your app is hosted. Depending on how many users you anticipate, the second option would scale better but could end up costing a bit.
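To make the first option concrete, here is a minimal sketch of an image-upload route for an Express backend using the multer middleware; the route, folder, and port are illustrative, not prescribed by either hosting choice:

```typescript
import express from "express";
import multer from "multer";
import path from "node:path";

const app = express();

// Store uploads on the local filesystem (illustrative folder name).
const storage = multer.diskStorage({
  destination: "uploads/",
  filename: (_req, file, cb) => cb(null, `${Date.now()}-${file.originalname}`),
});
const upload = multer({ storage });

// Accept a single image and respond with the path it was saved under.
app.post("/api/images", upload.single("image"), (req, res) => {
  if (!req.file) return res.status(400).json({ error: "no file" });
  res.json({ path: req.file.path });
});

// Serve the stored images back as static files.
app.use("/uploads", express.static(path.resolve("uploads")));

app.listen(3000);
```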

Is it a bad idea to host a REST API on a CDN?

I'm new to server architecture and have been reading around a lot, but I haven't yet formed a solid opinion on whether the setup below is good practice. I was hoping someone more experienced could confirm whether I'm setting up my architecture correctly:
Use Angular Universal to pre-render HTML to a CDN (e.g. Cloudflare)
Cloudinary for image assets
One or a few strong machines with nginx handling the load and sending requests off to the other servers listed below (all hosted on DigitalOcean):
REST API (Express server)
Database (MongoDB)
I'm really concerned about the speed of my REST API, as the regions offered by DigitalOcean seem significantly fewer than those of a CDN like Cloudflare. How much does this matter for my speed?
I know this might sound ridiculous, but the region issue makes me wonder if hosting an Express REST API server on a CDN would be better than a place like DigitalOcean. (My instincts tell me I shouldn't do this on a CDN, but I'm at a loss for reasons and hope someone can provide clear reasons why I should or shouldn't host an Express REST API server there.)
From my knowledge, I would do this a little differently.
A CDN is used to serve content, hence the name CDN (Content Delivery Network). The CDN itself doesn't serve the content; it routes the user to a server which serves it. For example, if you have servers in the US, France, and Asia, and you requested the website (with images hosted on these servers) from the UK, the CDN would direct you to the closest/best server for you. In this case that would be the server in France.
So, to answer your question: it isn't a bad idea to host the RESTful API on the CDN, but you would need multiple servers around the world (if you are going for worldwide coverage) and use the Cloudflare CDN to direct your traffic.
This is what I would do:
If you're not expecting loads of traffic (like millions of requests), just have 1-2 servers in each location: 1-2 in North America, South America, France (EU), Asia, and maybe Australia. This will give you decent coverage. Then, when you set up your CDN, it should handle who goes where. Using Node and nginx will help you a lot here; they are pretty lightweight, which allows you to get by with cheaper, less powerful servers.
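To illustrate how lightweight each regional Node box can be, here is a minimal sketch of an Express API server you might run in every region (the region tag and route are made up); the CDN in front then routes each user to the nearest copy:

```typescript
import express from "express";

const app = express();
const REGION = process.env.REGION ?? "eu-fr"; // illustrative region tag

// A stateless API endpoint: any region can answer it, so the CDN is
// free to route each user to whichever copy is closest.
app.get("/api/ping", (_req, res) => {
  res.set("X-Served-From", REGION);
  res.json({ ok: true, region: REGION });
});

app.listen(8080, () => console.log(`API up in ${REGION}`));
```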
Now for your databases, you can do one of two things: have one dedicated solution somewhere central, like France (EU), so latency is tolerable for all regions, or have multiple databases and keep them in sync. Having multiple synced databases is more work and will require quite a bit of research. Having one server is a lot easier to manage.
The database will be your biggest problem: deciding whether to go with one and deal with latency, or multiple and have to manage them and keep them in sync. Keep in mind you could use a cloud hosting platform for your database; that would help with this issue, because many platforms offer worldwide coverage as well as synchronized databases. You will, however, run into cost issues with cloud platforms.
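If you do go multi-region with MongoDB, one middle ground between those two options is a single replica set whose primary sits in the central region with secondaries elsewhere; here is a sketch, assuming the Node driver and placeholder hostnames, where the "nearest" read preference lets each regional server read from the lowest-latency member while writes still go through the one primary:

```typescript
import { MongoClient } from "mongodb";

// Placeholder connection string listing replica set members in several regions.
const uri =
  "mongodb://db-fr.example.com,db-us.example.com,db-sg.example.com/?replicaSet=rs0";

const client = new MongoClient(uri, {
  readPreference: "nearest", // read from the lowest-latency member
});

async function main() {
  await client.connect();
  const users = client.db("app").collection("users"); // illustrative names
  // Reads can be served by a nearby secondary; writes go to the primary.
  console.log(await users.findOne({ name: "alice" }));
  await client.close();
}

main().catch(console.error);
```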
Hope this answers your questions and provides you with the knowledge you need!

High-traffic site, >10 million users a day: VPS or dedicated server?

We're launching an iPhone app soon, and if everything goes well, we might reach up to tens of millions of users each day.
What server solution would you use for this? I guess a small VPS isn't enough. Is a dedicated server a better choice? Are there any good hosting providers that can provide such servers?
I'm a newbie when it comes to servers and would like some basic info about how to handle this.
Thanks in advance
Unfortunately, you are not really going to know the app's requirements until it is launched. It all depends on how much the app needs to communicate with the server and how often users are using it. Depending on those variables and more, a VPS might be enough, or you may need a dedicated box, or several. It also depends a lot on the performance of the VPS or dedicated boxes, and on how much access to the system you need.
Ultimately, it seems you may not even know how well the app is going to do, so I suggest you take the cheap/efficient route of cloud computing. That way you limit your expenses initially, while your app has a small user base, and then your capacity can scale up as quickly as your app requires (of course, so will the price). That is the benefit of cloud computing: you will not be losing money in the beginning, before you have the user base to push a server to its limit. Furthermore, you avoid downtime when/if your server is no longer enough.
Check out Google's Cloud Computing to get a hint of what is possible. I personally like Google's cloud experience, but you have many more options with varying degrees of freedom that you will have to check out. Amazon of course is another possibility.

I need suggestions for a distributed media storage data store

I want to develop a multimedia system. The system needs to store millions of videos and images, so I want to select a distributed storage subsystem. Can anyone give me some suggestions? Thanks!
I guess the best option for the "millions of videos and images" is a content distribution/delivery network (CDN):
A CDN is a server setup which allows for faster, more efficient delivery of your media files. It does this by maintaining copies of your media at different points of presence (POPs) along a global network to ensure quick client access and the fastest delivery possible.
If you use a CDN, you don't need to worry about many problems (distribution, fast access). Integration with a CDN should also be very simple.
@yi_H:
You can configure your writes to be replicated to multiple nodes before the call returns to the client. Whether or not that is needed is of course up to the use case, and it definitely involves a performance hit: if you are implementing a write-heavy analytical database, it will have a significant impact on write throughput.
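In MongoDB's case that trade-off is exposed as the write concern; a minimal sketch with made-up database and collection names, where w: 3 makes the driver wait until three members hold the write before acknowledging:

```typescript
import { MongoClient } from "mongodb";

const client = new MongoClient("mongodb://localhost:27017/?replicaSet=rs0");

async function main() {
  await client.connect();
  const events = client.db("analytics").collection("events"); // illustrative names
  // Wait for the write to reach 3 replica set members before acknowledging.
  // Safer, but each insert now pays the replication round-trip.
  await events.insertOne(
    { type: "click", at: new Date() },
    { writeConcern: { w: 3 } }
  );
  await client.close();
}

main().catch(console.error);
```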
As for the other points you make about the question's lack of requirements etc., I second those.
Having a replicated file system with metadata in a NoSQL database is a very common way of doing things. Why didn't you consider this kind of approach?
Have you taken a look at MongoDB GridFS? I have never used it, but it is something I would look at to see if it gives you any ideas.
You gave us (near) zero information about what your requirements are. E.g.:
Do you want atomic transactions?
Is the system read or write heavy?
Do you need fast queries or want to batch-process the data set?
How big are the videos?
Do you want to distribute data locally (on a LAN) or spanning multiple data centers / continents?
How are we supposed to pick the right tool if we don't know what it needs to support?
Without any knowledge of the system, I would advise using some kind of FS replication for the videos and images, and then storing the metadata associated with the items in either MongoDB, MySQL master-master, or MySQL Cluster.
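A sketch of what that split might look like using MongoDB for the metadata side (the collection schema below is made up for illustration): the video bytes live on the replicated filesystem, while the database records only where each file is and what it contains.

```typescript
import { MongoClient } from "mongodb";

const client = new MongoClient("mongodb://localhost:27017");

async function main() {
  await client.connect();
  const media = client.db("mediastore").collection("media"); // illustrative names
  // The file itself lives on the replicated FS; Mongo holds only metadata.
  await media.insertOne({
    path: "/mnt/media/2024/03/cat.mp4", // location on the replicated FS
    contentType: "video/mp4",
    sizeBytes: 10_485_760,
    tags: ["cats", "funny"],
    uploadedAt: new Date(),
  });
  await client.close();
}

main().catch(console.error);
```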
Distributed related to what?
If you are talking about replication to distribute:
MongoDB is restricted to master-slave replication, so only one node can accept writes, which leaves you with a single point of failure for a really distributed system.
CouchDB is able to peer-to-peer replicate.
You can find a very good comparison here and here, also compared with HBase.
With CouchDB you also have to be aware that you are going to talk HTTP to the database, which comes with built-in web services.
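Since CouchDB is driven entirely over HTTP, kicking off a continuous peer-to-peer sync is just a POST to its _replicate endpoint; a minimal sketch, with placeholder hostnames and database names:

```typescript
// Trigger continuous replication between two CouchDB peers over plain HTTP.
// Hostnames and database names are placeholders.
async function replicate(sourceUrl: string, targetUrl: string) {
  const res = await fetch("http://couch-a.example.com:5984/_replicate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      source: sourceUrl,
      target: targetUrl,
      continuous: true, // keep the two peers in sync as changes arrive
    }),
  });
  console.log(await res.json());
}

replicate(
  "http://couch-a.example.com:5984/media",
  "http://couch-b.example.com:5984/media"
).catch(console.error);
```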
Regards,
Chris
An alternative is to use MongoDB's GridFS, serving as a (very easily manageable) redundant and distributed filesystem.
Some will say that it's slow on reads (and it is, mostly because of the nature of its design), but that doesn't have to be a dealbreaker for your system as a whole, because if you need performance later on you could always put Varnish or Squid in front of the filesystem tier.
As far as I know, Squid also supports an on-disk cache for the less-hot files.
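As a sketch of how GridFS usage looks from the Node driver (database and file names are illustrative), with a cache like Varnish or Squid then sitting in front of whatever HTTP tier serves these streams:

```typescript
import { createReadStream, createWriteStream } from "node:fs";
import { MongoClient, GridFSBucket } from "mongodb";

const client = new MongoClient("mongodb://localhost:27017");

async function main() {
  await client.connect();
  const bucket = new GridFSBucket(client.db("media")); // illustrative db name

  // Upload: GridFS chunks the file into documents behind the scenes.
  await new Promise<void>((resolve, reject) => {
    createReadStream("cat.mp4")
      .pipe(bucket.openUploadStream("cat.mp4"))
      .on("finish", () => resolve())
      .on("error", reject);
  });

  // Download: stream the chunks back out by filename.
  await new Promise<void>((resolve, reject) => {
    bucket.openDownloadStreamByName("cat.mp4")
      .pipe(createWriteStream("cat-copy.mp4"))
      .on("finish", () => resolve())
      .on("error", reject);
  });

  await client.close();
}

main().catch(console.error);
```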
Sources:
http://www.mongodb.org/display/DOCS/GridFS
http://www.squid-cache.org/Doc/config/cache_dir/

Does cloud computing need some regulation?

I was involved with a couple of cloud computing platforms recently.
First of all, please note that I am not trying to criticize any platform.
Cloud computing is a large area, but to make my point simple and understandable, let me use a very simple scenario: data storage services hosted in the cloud.
Take any storage service, like Amazon EC2, SQL Data Services (SDS), or Salesforce.com services.
If you want to consume any such data storage service, the goal of all of them is the same: to serve the requested data on demand, without worrying about how it is stored, where it is stored, who is maintaining it, etc. (all the cloud goodies).
Now, my area of concern: the way ANSI SQL regulated platform vendors to make sure they follow a similar language across all their products, can't a similar concept be regulated across cloud service providers?
Why are there no such initiatives?
Any thoughts are appreciated.
It seems to me like you're worried about vendor lock-in with cloud computing. I may be naive, but I would normally choose technologies and then go look for cloud vendors able to deliver those technologies. And if I were aiming for a "write once, run anywhere" approach, I'd have to select technology that makes this as realistic as possible.
With the fairly rapid pace of development, I really think standardization committees would struggle to keep up. ANSI SQL has 20+ years of history. It seems to me like you're requesting standardization long before we even know what the cloud is up to...
I think this emerging cloud computing initiative is just too young to have standards.
Service providers right now just worry about rushing into the market, rather than interoperability and standards.
Later on, when the situation is more established, some common guidelines may emerge. But there is still a long way to go.
You seem to be asking specifically about cloud storage services, rather than cloud computing in general. So your Amazon example would be S3, not EC2.
I think the field is a little young to be standardising on an API just yet. The services differentiate themselves in ways which rule this out. For example, S3 trades sophistication for scalability/reliability/performance: you can't do a complex SQL LIKE query. You can store and retrieve blobs of data based on a key, and that's about it.
I think as such services become more and more the mainstream way to do things, standards will emerge. Users will want the freedom to switch providers on a whim, move their data around, test against free local storage, etc.
The APIs used are all based on Web Standards already. Making an abstraction layer to make them look the same is fairly trivial.
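As a sketch of how thin such an abstraction layer can be, here is a minimal key/blob interface with an S3-backed implementation using the AWS SDK v3 (the bucket name and region are placeholders); an implementation for any other provider would only need to fill in the same two methods:

```typescript
import { S3Client, PutObjectCommand, GetObjectCommand } from "@aws-sdk/client-s3";

// The lowest common denominator of key/value blob stores.
interface BlobStore {
  put(key: string, data: Uint8Array): Promise<void>;
  get(key: string): Promise<Uint8Array>;
}

class S3Store implements BlobStore {
  private client = new S3Client({ region: "us-east-1" }); // placeholder region
  constructor(private bucket: string) {}

  async put(key: string, data: Uint8Array): Promise<void> {
    await this.client.send(
      new PutObjectCommand({ Bucket: this.bucket, Key: key, Body: data })
    );
  }

  async get(key: string): Promise<Uint8Array> {
    const res = await this.client.send(
      new GetObjectCommand({ Bucket: this.bucket, Key: key })
    );
    return res.Body!.transformToByteArray(); // SDK v3 streaming body helper
  }
}

// Any other provider plugs in behind the same interface.
const store: BlobStore = new S3Store("my-bucket"); // placeholder bucket
```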