For putting together a site from scratch, what are the advantages and disadvantages of using external database services, e.g. MongoHQ, Amazon RDS?
Advantage: you don't have to fix it yourself when it breaks.
Disadvantage: you can't fix it yourself when it breaks.
My take on this is simple:
If you have an application hosted on Amazon then you should go for Amazon RDS or MongoHQ (which is also hosted on Amazon). The rationale is that, since both your application and the database are on the same internal network, you will get a significant performance advantage.
If your application is hosted elsewhere then go for a local install.
A couple more points.
Advantages:
You do not have to administer the hardware.
I assume they take care of security and software updates on the server (software administration).
It saves room: you do not have to find space in your building for a database cluster.
Disadvantages:
Depending on your internet speed, data transfer can be slower. If the application and data are on the same network you might have 1 Gbit/s between them, versus perhaps a 50 Mbit/s internet connection. Now share that among 1000 concurrent users.
You have to work to their release schedule. If you use a third party and they update to a database version that has a breaking change, you will be forced to update. If you host it yourself, the upgrade happens on your terms.
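To put rough numbers on the bandwidth point above, here is a back-of-the-envelope sketch in Python. It assumes the link is shared evenly among users (a simplification, since real traffic is bursty) and uses the 1 Gbit/s, 50 Mbit/s and 1000-user figures from the text. The function name per_user_mbit is made up for illustration.

```python
def per_user_mbit(link_mbit: float, concurrent_users: int) -> float:
    """Evenly divide a link's capacity (Mbit/s) among concurrent users."""
    if concurrent_users <= 0:
        raise ValueError("concurrent_users must be positive")
    return link_mbit / concurrent_users


USERS = 1000        # concurrent users, figure from the text
LAN_MBIT = 1000.0   # 1 Gbit/s when app and database share a network
WAN_MBIT = 50.0     # 50 Mbit/s internet connection

print(f"Same network: {per_user_mbit(LAN_MBIT, USERS):.2f} Mbit/s per user")  # 1.00
print(f"Over internet: {per_user_mbit(WAN_MBIT, USERS):.2f} Mbit/s per user")  # 0.05
```

Under these assumptions each user gets a twentieth of the throughput when the database sits behind the slower link, before latency is even considered.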
Related
I've created a few apps which use a Postgres database, but in all of those projects I've used either Pool or Client from the pg npm package. Recently I came across the pg-promise node package, and was wondering if there are any drawbacks to using pg-promise over Pool or Client. I'm just worried about runtime changes that would affect how many clients the app could serve at one time.
pg-promise is "Built on top of node-postgres". You're still using the same pools and clients.
Nothing changes regarding the number of connections your database will be able to handle, and unless you use a different approach to building your application (for example, using transactions where you previously did not, or using individual clients instead of pooling), nothing will change regarding the number of clients your app will be able to serve.
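The point can be illustrated with a toy model. This is not pg or pg-promise code, just a sketch in Python under the assumption that the wrapper does nothing but delegate to the underlying pool. Pool, Wrapper and peak_under_load are hypothetical names; the pool size of 5 and the 20 requests are arbitrary.

```python
import threading
import time


class Pool:
    """Toy stand-in for a connection pool: at most `size` checkouts at once."""

    def __init__(self, size):
        self._slots = threading.BoundedSemaphore(size)
        self._lock = threading.Lock()
        self._in_use = 0
        self.peak = 0  # highest number of simultaneous checkouts seen

    def run(self, work):
        with self._slots:  # blocks while all connections are busy
            with self._lock:
                self._in_use += 1
                self.peak = max(self.peak, self._in_use)
            try:
                return work()
            finally:
                with self._lock:
                    self._in_use -= 1


class Wrapper:
    """Hypothetical convenience layer; it only delegates to the pool."""

    def __init__(self, pool):
        self._pool = pool

    def query(self, work):
        return self._pool.run(work)


def peak_under_load(run_one, pool, requests=20):
    """Fire `requests` concurrent calls and report the pool's peak usage."""
    threads = [
        threading.Thread(target=run_one, args=(lambda: time.sleep(0.01),))
        for _ in range(requests)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return pool.peak


direct_pool = Pool(size=5)
wrapped_pool = Pool(size=5)
print(peak_under_load(direct_pool.run, direct_pool))
print(peak_under_load(Wrapper(wrapped_pool).query, wrapped_pool))
```

In both runs the peak can never exceed the pool size, because the limit lives in the pool and not in whatever API sits on top of it.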
I'm creating a video chat web app and I need to deploy a coturn TURN server somewhere. My question: do I need to purchase 2 droplets on DigitalOcean, one for the TURN server and one for my web server, or can I place them on the same droplet and save money?
Is there any good reason to buy 2 hosting spots, such as performance issues?
Technically, you can deploy it on the same server. But you shouldn't.
There will be a time when you need to upgrade the TURN server, e.g. because the number of users is increasing. If you split these applications, the node server will not be affected, since it does not need to be upgraded at the same time.
Another example: you may want to use more than one TURN server, for load balancing for instance. Also, if there is a problem, one application will not be affected by the other, and it's easier to integrate a replacement or backup server.
My advice would be:
yourdomain.com -> Main application
turn1.yourdomain.com -> TURN server
turnX.yourdomain.com -> maybe an additional server in the future.
With that solution you can also make sure that the TURN server is not using all the bandwidth, so yes, that's the possible performance issue you mentioned.
This is a set of questions rather than one very specific question. Over the last couple of weeks I pieced together information on how to properly host a Java PLAY application "in the cloud". As much of this information is scattered over different services, I felt like gathering all these small pieces in one place, because many things need to be seen in full context. However, I moved my considerations to the bottom of the question, as they are mainly my opinions and subjective findings, which I don't want to be held responsible for. If I got something wrong, please don't hesitate to point that out.
Hosting Java PLAY + MySQL on AWS for world wide accessibility
Our scenario: we have a fairly straightforward application written with the Java PLAY framework (https://www.playframework.com/), serving iOS and Android clients as well as a backend system (for administration, content management and API), and storing data in a MySQL DB. While most of the users' interactions with the server are quick and easy (login, sync some data), there are also some more data-intensive tasks (download data zips of <100 MB to the mobile phone, upload a couple of MB to the server). Therefore we were looking for a solution to provide users far away from our servers with reasonable response times. The obvious next step was hosting in the cloud.
Hosting setup within AWS:
Horizontal scaling: at the start, only 1 EC2 instance with our app will be running in eu-1a. We will need to evaluate how many resources one instance actually requires, whether more instances are needed, and whether more instances would actually lead to quicker response times.
Horizontal scaling across regions: once the app generates heavy user load from another region, the whole EC2 instance should be duplicated and placed in that region, running a DB read replica (see Setting up a globally available web app on amazon web services and https://aws.amazon.com/de/blogs/aws/cross-region-read-replicas-for-amazon-rds-for-mysql/ ).
Vertical scaling of EC2 instances: in recent tests of the old hosting setup, the database proved to be the bottleneck rather than the PLAY app and its server's hardware specifications. Therefore it is not yet fully clear how much vertical scaling would affect response times. If a t2.micro instance serves as well as an m3.xlarge instance, of course we would rather climb our way up from the bottom here.
Vertical scaling of RDS: we will need to estimate how much traffic hits the DB server and what CPU/RAM/etc. will be required. We will probably work our way up here as well.
Global redirection: done using Amazon Route 53 (?). A user from Tokyo should be redirected to the EC2 instance running in Asia; a user from Rome to the EC2 instance in Europe. This affects not only API calls within the app, but also content delivery (in both directions).
Open Questions regarding the setup
Is this setup coherent? Am I missing crucial components?
Regarding global redirection: is Amazon Route 53 the right tool? How does it differ from CloudFront (which strikes me as being purely for content / media distribution)?
How do I define correct data/API endpoints for my app? Of course I don't want to define the database endpoint of a DB read replica during app deployment. Will this also happen during the Route 53 setup (question 2)? The same goes for API calls: the app should direct its calls to https://myurl.com/api and be redirected from there. Is this realistic?
I would highly appreciate all kinds of thoughts (!), also regarding the background info written below. If you can point me to further reading to solve my questions on my own, I am also very thankful - there is simply a huge load of information regarding this, but this makes it hard to narrow the answers down. I do have knowledge in hosting/servers, but I am pretty sure there are true experts out there waiting to slap me with knowledge. :)
Background-Information
Current Hosting Setup: a load balancer distributes the traffic on 2 root linux servers, both of them running the PLAY app, one of them also holding the MySQL installation.
The current hosting setup has 3 big flaws:
No vertical scalability: the hosting company would take money for each scaling step. Currently the servers are running idle, but if the app booms, we could run short on capacity quickly. Running idle is still paid as if permanently under full load. This is expensive!
No deployment support: currently, we connect through SSH, manually deploy the correct folders to the file system, recompile on the server, set privileges, apply database evolutions; do the same for the second server (with different db connection parameters). What could possibly go wrong. ;)
No worldwide availability: to set up another server in another region of the world would mean a huge effort. To have a synchronized replica of our DB can be done, but once again deploying would mean downtime, room for errors and therefore time and money.
Hosting Options for Java PLAY:
There are a lot of different blog posts about this. In short:
AWS: Amazon Web Services is one of the first places you start looking. Here you get everything that's possible, at a flexible price. You set yourself up an EC2 instance, a MySQL RDS and you're good to go - all of this in the free tier, so you can experiment, play around, test your stuff.
Microsoft Azure: similar to AWS regarding pricing and possibilities. However, I did not dive into setting up and deploying our application for test purposes.
Heroku: super easy deployment from within PLAY, scalable servers. However (at first glance?) it lacks the ability to supply remote regions with high-speed content.
Jelastic: even easier deployment from within PLAY / IntelliJ IDEA. You push your app image to Jelastic, and Jelastic distributes it further to their infrastructure providers.
RedHat OpenShift (https://www.openshift.com/): sounds promising, yet not as complete as AWS.
Lots of choices and possible setups/prices. Especially after finding out about deployment using Boxfuse (https://cloudcaptain.sh/), I made my choice for AWS, as it offers absolutely everything we need from one source. Boxfuse has low monthly costs and is well integrated into AWS. Scaling is supported, as are the 3 common environments (dev/test/prod). Support is outstanding.
The setup looks good. I would however make one change, regarding your large uploads and downloads. As mobile speeds may not be ideal, having your app serve long-running requests is something you should avoid, as this will needlessly tie up server threads. Instead consider having users upload and download straight from S3 using presigned URLs. You can then later add CloudFront to the mix when it makes financial sense to do so.
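As a rough sketch of the presigned URL idea (in Python for brevity; the AWS SDK for Java offers an equivalent, which is what a PLAY app would use): the server only hands out a short-lived URL, and the phone then talks to S3 directly. The helper names, the bucket and key arguments, and the 15-minute expiry are assumptions for illustration. The s3_client argument is expected to be a boto3 S3 client, whose generate_presigned_url method is used here.

```python
def presigned_download_url(s3_client, bucket, key, expires_in=900):
    """Return a time-limited URL that lets a client GET the object from S3."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,  # seconds; 900 is an arbitrary example value
    )


def presigned_upload_url(s3_client, bucket, key, expires_in=900):
    """Return a time-limited URL that lets a client PUT the object to S3."""
    return s3_client.generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

With boto3 installed and credentials configured, the client would be created with boto3.client("s3"). The API endpoint returns the URL to the app, and the transfer itself never touches the app server's threads.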
R53 will work just fine for picking the best server(s) for each end user.
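Conceptually, latency-based routing boils down to "send each user to the region that answers fastest for them". Route 53 makes that decision itself at DNS resolution time, so you would not write this code; the sketch below only illustrates the selection logic. The latency figures are made up, and pick_region is a hypothetical helper.

```python
def pick_region(latency_ms):
    """Return the region name with the lowest latency for one user."""
    if not latency_ms:
        raise ValueError("no regions to choose from")
    return min(latency_ms, key=latency_ms.get)


# Made-up latency measurements (milliseconds) for two users.
tokyo_user = {"eu-west-1": 240.0, "ap-northeast-1": 12.0}
rome_user = {"eu-west-1": 35.0, "ap-northeast-1": 230.0}

print(pick_region(tokyo_user))  # ap-northeast-1
print(pick_region(rome_user))   # eu-west-1
```

The app keeps calling a single hostname such as https://myurl.com/api, and the DNS answer differs per user.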
For EC2 consider having an ELB + Auto-Scaling Group setup. Even just for a single instance you get the benefit of permanent health monitoring and auto-respawns. If you expect more load you can then auto-scale based on your expected bottleneck (cpu, network i/o). This will give you a more autonomous and robust setup than manually having to scale up and down based on your own monitoring analysis (even though the scaling part is very easy if you stick with immutable infrastructure & blue/green deployments like what Boxfuse offers).
Your focus on vertical server scaling might not serve you well on AWS. I would start thinking about horizontal scaling of app servers behind an Elastic Load Balancer, and possibly look into Elastic Beanstalk.
I'm not sure you can set up a read replica in another region via RDS; you might have to set that up via MySQL servers running on standard EC2 instances. And even if you can, that's going to be some expensive and high-latency data transfer.
If file uploads and downloads are all you are worried about, you just need to put CloudFront (Amazon's CDN service) in front of your application, and allow it to handle file uploads and downloads via its global edge servers. You could even do this without moving your entire application into AWS. I would recommend reading this blog post as a start.
I'm in the early stages of planning out a virtualised environment for our production system (Moodle). The layers are relatively simple:
web - Apache 2.2
Database - MySQL 5
PHP 5.2
My question is this: what is the generally accepted approach for distributing the above layers among physical servers? In this case, we are planning to have 2 physical servers. Should I aim to keep my web server cluster on a single physical server and the database cluster on another? Or replicate a full stack on both servers, in case one fails? Any insights into this would be a great help to me.
thanks,
Cathal.
We use separate (virtual) servers, but do maintain separate stacks on each simply because the overheads are small and it allows for flexibility if we want to scale up/down. This is not for fallback however, because if one server is so broken that it's not web accessible, you probably won't be able to get data off it and onto the second server in order for it to be a useful replacement. Use proper backups for fallback and practice restoring from them regularly.
Moodle generally blocks on the PHP side rather than the DB side, and we see roughly 3.5:1 PHP:MySQL CPU loads when they are on separate machines. With that in mind, you need to consider what the maximum capacity of one server is: you will get the best performance if you have no network overhead between the machines at all, so bigger is better. If you can't do it with one, then making 2 VMs, a larger one for PHP and a smaller one for MySQL, is the best option, but do benchmark the differences under load for your particular setup (use Apache JMeter for this).
Our largest installs involve 70,000 users or so and we have two 4-CPU/8GB VMs, one for PHP and one for MySQL (although the DB one rarely goes above 30% CPU). This allows for about 400 concurrent connections via Apache. However, we are using a large farm of VMs and can scale up and down between 2 and 16 CPUs easily, so you may wish to consider one monster machine if you want flexibility.
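If you do split into two VMs, the 3.5:1 ratio mentioned above gives a rough starting point for dividing a CPU budget. This is only a sketch: split_cpus is a made-up helper, the total of 9 CPUs is an arbitrary example, and real sizing should come from benchmarking as noted.

```python
def split_cpus(total_cpus, php_to_mysql=3.5):
    """Divide a CPU budget between PHP and MySQL by an observed load ratio."""
    if total_cpus <= 0 or php_to_mysql <= 0:
        raise ValueError("total_cpus and php_to_mysql must be positive")
    php = total_cpus * php_to_mysql / (php_to_mysql + 1.0)
    return php, total_cpus - php


php, mysql = split_cpus(9)  # 9 CPUs is an arbitrary example budget
print(f"PHP: {php:.1f} CPUs, MySQL: {mysql:.1f} CPUs")  # PHP: 7.0 CPUs, MySQL: 2.0 CPUs
```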
For more information on Moodle performance, look here, particularly under 'scalability'.
My company is hosting a few separate, but related, moderately hit, web sites. Accordingly, a production database server, staging database server, production web server, staging web server, etc are needed. My question is, should we invest in physically separate servers for each of our needs, or should we put that money together and invest in a much higher end server and virtualize all of the aforementioned servers? Which route would you guys decide on, and why?
That depends on a lot of things. Here are the main considerations.
If you have a lot of servers with low to moderate usage, virtualization should generally save you money on hardware, power, and floorspace. There is a tipping point, however, based on the overhead of the VM layer itself. Honestly, you will have to experiment to find the right cost/performance balance on this. I am sure the VM vendors would be happy to help you with the math.
The downside is that virtualization creates a single point of failure. If that box fails, you have downtime for all of your servers. Having them separate makes it far less likely for everything to take a dive at once.
You certainly want physical separation between the development and the production servers. You shouldn't ever have to worry that something you do in dev could bring down the machine on which production is hosted. And, there are some problems in development that really require either a hard reset of the physical machine or a ludicrous work-around to avoid a hard reset.
As for production web server and production database, you're not really introducing any new points of failure by virtualizing them on the same machine, particularly if you can colocate a static version of the site on another server. For any modern web site of even moderate complexity, database failure is web site failure anyway.
From my experience, for low or moderate usage a VM is the way to go. If you get just one very powerful server instead of several moderately powerful servers, you save money, power and space, and make the application faster at the same time because it's running on faster hardware.
A VM also has some other nice advantages: if the server hardware fails you can load the same VM on different physical hardware and continue running as if nothing happened (you do have full backups, don't you?), and you can take a copy of the actual production server and run it in isolation on a development machine to debug those annoying bugs that only appear in production.
But I would put the development (and maybe testing) servers on a different physical machine from production. You need to make sure that no matter what stupid mistake you make in development, it won't take down the production server.