Queuing in Rackspace Cloud [closed] - deployment

I've been using EC2 for deployment all along and now I want to give Rackspace a try. My application has to be scalable, so I use RabbitMQ as the main queuing system. Actions on the front end can generate a very large number of jobs that need to be executed, and I want to queue them somewhere.
Due to the expected load profile of the application it makes sense to use a scalable infrastructure like the Rackspace cloud. Now I am wondering where it would be best to queue the jobs. Queueing them on the front-end servers means that the number of front-end servers can only be scaled back down once the queues are processed, which is a waste of resources: once the peak load on the front end is over, we want to scale that tier down and scale up the machines that process the queue items.
If we queue them on the database server we add load to a single machine which, in the current setup, is already the most likely bottleneck. How would you design this?
Is there any built-in queuing on Rackspace, something like Amazon SQS?

They don't have anything like SQS but there are a few good services that you may be able to take advantage of:
Cloud Files
With the Akamai CDN you can push all your static content right out to your clients. (I'm on the Gold Coast, Australia, and Cloud Files public content reaches me from a server in Brisbane: 13 ms vs 250 ms ping times to US servers.) Because distance affects download speed, that means faster download times for your users, plus absolutely no clogging of the pipes on the web server during the Christmas rush.
The way I use it is:
I create a Cloud Files container; this gets a unique hostname.
I create a CNAME DNS record (for example: cdn.supa.ws) pointing to that unique hostname.
I use cloudfuse to mount the container as a directory, both on my cloud server and on my home Linux box.
Then I just copy or upload files straight into that directory and serve them from http://cdn.yourdomain.com (a sketch of the mount/upload commands is below).
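A minimal sketch of that workflow, assuming cloudfuse reads its credentials from ~/.cloudfuse; the container name, mount point and file paths are just placeholders:

# ~/.cloudfuse holds the Rackspace credentials cloudfuse will use
echo "username=myrackuser" > ~/.cloudfuse
echo "api_key=0123456789abcdef" >> ~/.cloudfuse
chmod 600 ~/.cloudfuse

# mount the Cloud Files account as a local directory (containers appear as subdirectories)
mkdir -p /mnt/cdn
cloudfuse /mnt/cdn

# copy static assets straight into the container that backs the CNAME
cp -r ./public/static/* /mnt/cdn/cdn_container/

Once the CNAME points at the container's public hostname, those files are served from the CDN edge closest to the visitor.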
Load balancers as a service
http://www.rackspace.com/cloud/cloud_hosting_products/loadbalancers/ - Basically a bunch of Zeus load balancers that you can use to push requests to your back end servers. Cool because they're API programmable, so you can scale on the fly and add more backend servers as needed. They also have nice weighting algorithms, so you can send more traffic to certain servers if needed.
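As a rough illustration of the API-driven scaling, here is a sketch that adds a back-end node to an existing load balancer; it assumes the v1.0 Cloud Load Balancers API, and the region, account id, load balancer id, node address and auth token are all placeholders:

# add a back-end node to load balancer 12345 (all ids and the token are made up)
curl -X POST \
  -H "X-Auth-Token: $AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"nodes": [{"address": "10.180.1.17", "port": 80, "condition": "ENABLED", "weight": 5}]}' \
  https://ord.loadbalancers.api.rackspacecloud.com/v1.0/123456/loadbalancers/12345/nodes

The same endpoint family lets you remove nodes or change their weights, which is what makes scaling the back end up and down scriptable.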
Internal VLAN
I would recommend using the 'internal IPs' (10.x.y.z) (the eth1 interface) for message queuing and DB data between Cloud Servers as they give you a higher outgoing bandwidth cap.
Outgoing bandwidth (speed) caps:
256 MB RAM - 10 Mb/s eth0 - 20 Mb/s eth1
512 MB RAM - 20 Mb/s eth0 - 40 Mb/s eth1
1 GB RAM - 30 Mb/s eth0 - 60 Mb/s eth1
2 GB RAM - 40 Mb/s eth0 - 80 Mb/s eth1
4 GB RAM - 50 Mb/s eth0 - 100 Mb/s eth1
8 GB RAM - 60 Mb/s eth0 - 120 Mb/s eth1
15.5 GB RAM - 70 Mb/s eth0 - 140 Mb/s eth1
eth1 is called an internal VLAN, but it is shared with other customers, so it's best to firewall off eth1 as well as eth0; for example, only allow MySQL connections from your own Cloud Servers, and if you have really sensitive data maybe use MySQL with SSL, just in case :)
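A minimal firewall sketch of that idea, assuming iptables and that the application servers sit at the made-up internal addresses 10.180.1.17 and 10.180.1.18:

# allow MySQL on the internal VLAN only from known app servers, drop everything else on eth1
iptables -A INPUT -i eth1 -p tcp --dport 3306 -s 10.180.1.17 -j ACCEPT
iptables -A INPUT -i eth1 -p tcp --dport 3306 -s 10.180.1.18 -j ACCEPT
iptables -A INPUT -i eth1 -j DROP

The same pattern applies to the RabbitMQ port (5672) if your brokers also listen on the internal VLAN.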
MySQL as a service
There is also a MySQL as a service private beta. I haven't tried it yet, but looks like it has a lot of potential coolness: http://www.rackspace.com/cloud/blog/2011/12/01/announcing-the-rackspace-mysql-cloud-database-private-beta/

Rackspace doesn't offer a hosted queuing system.
I've been running RabbitMQ on their Cloud Servers for more than 2 years and things are good.
I haven't tried clustering though so I don't know how easy it would be to setup over there, nor how stable it would be in their environment.
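For reference, a minimal clustering sketch, assuming two Cloud Servers with the placeholder hostnames rabbit1 and rabbit2 that can resolve each other over the internal VLAN:

# on rabbit2, join the cluster formed by rabbit1
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@rabbit1
rabbitmqctl start_app
rabbitmqctl cluster_status

All nodes must share the same Erlang cookie (/var/lib/rabbitmq/.erlang.cookie) for join_cluster to succeed; how stable the cluster is on shared virtualised networking is exactly the untested part.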

Beanstalkd just rocks: tubes function as pub-sub and it works like a charm on any cloud vendor, with 3-7 minutes to set up. It is blazingly fast since it uses an in-memory, memcached-style queue.
You can write workers in any language you choose. You cannot go wrong with this one.
Link:
http://kr.github.com/beanstalkd/
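A quick sketch of that setup, assuming a Debian/Ubuntu Cloud Server and a made-up internal address of 10.180.1.20 for the queue host:

# install and run beanstalkd, bound to the internal VLAN, with the binlog enabled for persistence
apt-get install -y beanstalkd
beanstalkd -l 10.180.1.20 -p 11300 -b /var/lib/beanstalkd &

# smoke-test it from another server: put a 5-byte job on the 'mytube' tube
printf 'use mytube\r\nput 0 0 60 5\r\nhello\r\n' | nc 10.180.1.20 11300

Workers then connect with whatever client library fits your language (pheanstalk for PHP, beanstalkc for Python, and so on) and reserve/delete jobs from the tube.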

Related

Monitor Nestjs backend

In the old days, when we wanted to monitor a "daemon" / service, we would ask the software vendor for the list of all the services it runs in the background on Windows.
If a daemon / service went down, it would be restarted.
On top of that, we would use software like Nagios or Centreon to monitor that particular daemon / service.
I have a team of software developers in charge of implementing a nice NestJS backend.
Here is what we are going to implement:
2 different VMs running on a high-availability VMware cluster with a SAN
the two VMs have vMotion / high-availability settings
an HAProxy is set up to provide load balancing and additional high availability
Our questions are: how can we detect that
one of our backends is down?
one of our backends moves from a 50 ms average response time to 800 ms?
one of our backends consumes more than 15 GB of RAM?
etc.
When we were using "old school" daemons that was enough, but when it comes to a JS backend I am a bit clueless.
Cheers
Kynes
NB: the datacenter in charge of our infrastructure is not docker / kubernetes / ansible etc. compliant.
To be fair, all of these seem doable out of the box for Centreon/Nagios. I'd say check the documentation...
one of our backends is down?
VM DOWN: the centreon-vmware plugin provides monitoring of VM status.
VM UP but Backend DOWN : use the native http/https url checks provided by Centreon/Nagios to load the web page.
Or use the native SNMP plugins to monitor the status of your node process.
one of our backends moves from a 50 ms average response time to 800 ms?
Ping Response time: Use the native ping check
Status of the network interfaces of the VM: the centreon-vmware plugin has network interface checks for VMs.
Page loading time: use the native http/https URL checks provided by Centreon/Nagios (a check_http sketch follows this list).
You may go even further and use a browser automation tool like selenium to run scenarios on your pages and monitor the time for each step.
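A minimal sketch of such a response-time check with the classic Nagios plugin; the hostname, port, URL and thresholds are placeholders, and the Centreon http plugin takes equivalent thresholds:

# warn above 0.3 s, go critical above 0.8 s, against the backend's health endpoint
/usr/lib/nagios/plugins/check_http -H backend1.example.com -p 3000 -u /health -w 0.3 -c 0.8 -t 5

check_http's -w/-c flags are response-time thresholds in seconds, so this directly captures the "50 ms became 800 ms" case.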
one of our backends consumes more than 15 GB of RAM?
Total RAM consumed on server: use the native SNMP memory checks from centreon/nagios.
RAM consumed by a specific process: possible through the native SNMP memory plugin.
Like so:
/usr/lib/centreon/plugins/centreon_linux_snmp.pl --plugin os::linux::snmp::plugin --mode processcount --hostname=127.0.0.1 --process-name="centengine" --memory --cpu
OK: Number of current processes running: 1 - Total memory usage: 8.56 MB - Average memory usage: 8.56 MB - Total CPU usage: 0.00 % | 'nbproc'=1;;;0; 'mem_total'=8978432B;;;0; 'mem_avg'=8978432.00B;;;0; 'cpu_total'=0.00%;;;0;

Redis Simple Production Server Specification

Currently I use Redis Labs to host my Redis server, but a Redis Labs cloud server is not available at my web server's host (SoftLayer), so the performance of my web server suffers from network latency (~20 ms per round trip).
For that reason, I want to create a VPS at SoftLayer to host Redis so my web server can connect to the Redis server over the LAN.
From Redis Labs I know that it consumes ~400 MB of memory and handles ~250 ops/sec on a normal day, but it can go up to ~1500 ops/sec when we have an event like a flash sale.
The question is: what server specification can handle that kind of traffic?
Is a VPS with 1 CPU and 4 GB of memory enough?
Thank you
In the SoftLayer control portal there are many ordering options for a VPS with the characteristics you want; we cannot give you a specific specification for your requirements because we do not know whether it would fulfil your expectations.
I would suggest ordering an hourly VPS with the characteristics you want and trying it; if it does not work you can cancel it immediately, so you do not incur large costs as you would with a monthly server.
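One way to sanity-check a candidate VPS before settling on it is to replay roughly the expected load with redis-benchmark; the host and request mix below are just an example:

# simulate the flash-sale peak; ~1500 ops/sec is tiny for Redis, so push well past it
redis-benchmark -h 10.0.0.5 -p 6379 -c 50 -n 200000 -t get,set -d 256

Redis executes commands on a single thread, so one reasonably fast core plus enough RAM for the ~400 MB dataset (with headroom for growth and persistence forks) usually matters more than the core count.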

Amazon EC2 Elastic Load Balancer TCP disconnect after couple of hours

I am testing the reliability of TCP connections using Amazon Elastic Load Balancer compared to not using the Load Balancer to see if it has any impact.
I have set up a small Elastic Load Balancer in the Amazon EC2 us-east zones with 8 t2.micro instances, using an auto scaling group without a policy and set to a min/max of 8 instances.
Each instance runs a simple TCP server that accepts connections on port 8017 and relays to the clients data coming from another remote server located on my network. The same data is sent to all clients.
For the purpose of the test, the servers running on the micro instances are only sending 1 byte of data every 60 seconds (to be sure the connections don't time out).
I connected multiple clients from various outside networks using the ELB DNS name provided, and after maybe 6-24 hours I always stop receiving data and eventually the connections all die.
All clients stop around the same time, even though they are on different networks/ISPs. Each "client" application opens about 10 TCP connections, and they all stop receiving data.
All server instances look fine after this happens; they are still sending data.
To test further and rule out a problem in the TCP server code, I also have external clients connected directly to the public IP of a single instance, without the ELB, and in that case the data doesn't stop and the connections are not lost (so far).
The Load balancer Idle Timeout is set to 900 seconds.
The Cross-Zone load balancing is enabled and I am using the following zones: us-east-1e, us-east-1b, us-east-1c, us-east-1d
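For reference, a sketch of checking and setting that idle timeout from the CLI, assuming the classic-ELB aws elb commands and a made-up load balancer name:

# show the current ConnectionSettings (IdleTimeout) on the classic ELB
aws elb describe-load-balancer-attributes --load-balancer-name my-tcp-elb
# set the idle timeout to the 900-second value mentioned above
aws elb modify-load-balancer-attributes --load-balancer-name my-tcp-elb \
  --load-balancer-attributes "{\"ConnectionSettings\":{\"IdleTimeout\":900}}"

Since the servers already send a byte every 60 seconds, the idle timeout should not be the trigger here, but it is worth confirming the value actually in effect.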
I read the documentation and searched everywhere to see if this is a known behaviour, but I couldn't find any clear answer or confirmation of others having the same issue; it does, however, seem clear that it is happening in my case.
My question: Is this a known/expected behaviour for TCP load balancer? Otherwise, any idea what could be the problem in my setup?

Docker instead of multiple VMs

So we have around 8 VMs running on a server with 32 GB of RAM and 8 physical cores. Six of them each run a mail server (Zimbra); two of them run multiple web applications. The load on the server is very high, primarily because of the heavy load on each VM.
We recently came across Docker. It seems a cool idea to create containers for applications. Do you think it's viable to run the applications from each of these VMs inside 8 Docker containers? Currently the server is heavily utilized because multiple VMs have serious I/O issues.
Or can Docker only be used where we are running web applications, and not email or other infrastructure apps? Do advise...
Docker will certainly alleviate your server's CPU load by removing the hypervisor's overhead in that respect.
Regarding I/O, my tests revealed that Docker has its own I/O overhead, due to how AUFS (or, lately, device mapper) works. On that front you will still gain some benefit compared to the hypervisor's I/O overhead, but not bare-metal I/O performance. My observations, for my own needs, were that Docker was not "bare-metal like" when dealing with I/O-intensive services.
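If you do try it, here is a sketch of running one of the web applications as a container with explicit resource caps; the image name, limits and paths are made up, and keeping the heavy I/O on a bind-mounted host volume sidesteps most of the AUFS/device-mapper overhead mentioned above:

# cap memory, give a relative CPU share, and keep application data on the host filesystem
docker run -d --name webapp1 \
  -m 4g --cpu-shares 512 \
  -v /srv/webapp1/data:/var/lib/webapp \
  -p 8080:80 \
  mycompany/webapp:latest

Volumes bypass the copy-on-write storage driver, which is where most of the container I/O penalty comes from.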
Have you thought about adding more RAM? 64 GB or more? For a large Zimbra deployment, 4 GB per VM may not be enough. Zimbra, like all messaging and collaboration systems, is an I/O-bound application.
Having zmdiaglog (/opt/zimbra/libexec/zmdiaglog) data to see if you are allocating memory correctly would help, as per here:
http://wiki.zimbra.com/wiki/Performance_Tuning_Guidelines_for_Large_Deployments#Memory_Allocation

JMeter throughput drops when hitting Amazon ELB

I am hosting a web application on Amazon's AWS servers. I am currently in the process of load testing the application with JMeter. My main problem seems to be that when I go through an Elastic Load Balancer (ELB) to hit the Amazon servers, rather than hitting the servers directly, I hit a cap in my throughput.
If I hit my web application directly, I am able to achieve a throughput of 50 RPS per server.
If I hit my web application via Amazon's ELB, I am only able to achieve a max throughput of 50 RPS (total).
I was wondering if anyone else has experienced similar behavior when load testing via Amazon's ELB with JMeter.
For more context my web application is a REST application which allows users to download content (~150 kb) via HTTP requests.
I am running Jmeter with the following flag "-Dsun.net.inetaddr.ttl=0" and running it with 10 threads. I have tried running these tests with multiple clients on different machines.
Thanks for any help in advance.
Load balancers may be tricky to test, since they can use different mechanisms to distribute traffic depending on the request's origin. The most common approach to identifying the origin of a request and redirecting it to the host that served the previous request is a cookie. You can use the HTTP Cookie Manager to manipulate your cookies correctly and make sure you have different ones for each testing thread or thread group (depending on your use case). Another flaky area is the origin host IP: you may need to bind each testing thread to a different IP address in order to hit different servers behind the load balancer. There can also be some DNS issues with Amazon load balancers; there is a useful guide on how to test Amazon ELBs.
The most probable cause would be DNS caching by JMeter. The ELB returns the IPs of additional servers depending on how autoscaling is set up, but JMeter does not use these additional servers. This problem can be solved by ensuring that JMeter does not cache DNS results...
The ELB is a name, not an IP, and can suffer from DNS caching. Make sure you use "-Dsun.net.inetaddr.ttl=0" when starting JMeter.
http://wiki.apache.org/jmeter/JMeterAndAmazon
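A sketch of a non-GUI run with that flag in place; the test plan and results file names are arbitrary:

# disable the JVM's positive DNS cache so each lookup can return a different ELB IP
jmeter -n -t elb-test-plan.jmx -l results.jtl -Dsun.net.inetaddr.ttl=0

With the TTL at 0, java.net.InetAddress re-resolves the ELB hostname on every lookup instead of pinning all threads to the first A record it saw.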
A really late response, and slightly different than the original question, but I hope this can help others as it took me a while to get it all straight. My original problem was not reduced throughput as a result of the ELB, but the introduction of HTTP 503 errors. Actually, the ELB increased my throughput as compared to querying the web application directly, though even with 1 hour tests, the results were sporadic to say the least.
First, the ELB has 2-staged load balancing going on. The first load balance is across the ELBs themselves. That's done by associating multiple IP addresses with the hostname AWS provides for the ELB you provision. The second is then, of course, across your application instances behind the ELB.
Without trying to offend the SO gods, this is a really helpful article.
https://blazemeter.com/blog/dns-cache-manager-right-way-test-load-balanced-apps
The most helpful information in there was to use the DNS Cache Manager module in JMeter. This will query multiple DNS servers, and wipe out your DNS cache.
I implemented that module and then setup Wireshark, filtering on the two IP addresses belonging to the ELB hostname and sure enough, it was querying both IP addresses, though clearly favored one over the other.
That didn't make a big difference, at least not over short tests.
The real difference (2-3 times more throughput) came when I tweaked the ELB health settings. I initially had a high error rate; however, after reducing the unhealthy threshold and the interval between health checks, my error rates dropped dramatically.
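For anyone wanting to reproduce that tweak, a sketch using the classic-ELB CLI; the load balancer name, health-check target and thresholds are placeholders rather than the exact values used here:

# shorter interval and lower unhealthy threshold, similar in spirit to the tuning described above
aws elb configure-health-check --load-balancer-name my-elb \
  --health-check Target=HTTP:80/health,Interval=10,Timeout=5,UnhealthyThreshold=2,HealthyThreshold=3

Faster health checks take misbehaving instances out of (and put them back into) rotation sooner, which is consistent with the error-rate drop described above.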
Additionally, whereas all my other tests had been 60 - 90 minutes in duration, this one was 8 hours. I started out with decent throughput and it then quickly dropped (by about 2/3). After about 20 minutes or more, the throughput then started ticking back up and by the end of the test, it had sustained throughput of about 5 times what I was getting without the ELB (which was similar to what the throughput was when it dropped shortly after beginning this test).