Configuring load/traffic split using HAProxy

I have two app servers which are behind an HAProxy load balancer. Is there a configuration I can use to split the traffic between the two as required, e.g. sending x% of the requests to server A and the rest to server B?

You have several options, but I do not think that you can directly do what you want to do. Dividing the traffic close to 50/50 is as easy as setting the load balancing algorithm to "round robin". What it sounds like you want to do is send, say, 15% of traffic to server A and 85% to server B. To do this, set a cookie on the client (some random number between 1 and 100, for example) and then send all traffic with a cookie value of less than 16 to server A and the rest to server B.
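For reference, here is a minimal HAProxy sketch of a fixed split using server weights under round robin (the server names and addresses are placeholders, and this is an alternative to the cookie approach described above rather than a drop-in for it):

listen app *:80
mode http
balance roundrobin
# weights are relative, so this sends roughly 15% of requests to serverA and 85% to serverB
server serverA 192.168.0.10:80 weight 15 check
server serverB 192.168.0.11:80 weight 85 check

The weighted split is only statistical per request; the cookie approach above is what gives each individual client a stable assignment.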

Is redirection a valid strategy for an API Gateway?

I read this article about the API Gateway pattern. I realize that API Gateways typically serve as reverse proxies, but this forces a bottleneck situation. If all requests to an application's public services go through a single gateway, or even a single load balancer across multiple replicas of a gateway (perhaps a hardware load balancer which can handle large amounts of bandwidth more easily than an API gateway), then that single access point is the bottleneck.
I also understand that it is a wide bottleneck, as it simply has to deliver messages in proxy, as the gateways and load balancers themselves are not responsible for any processing or querying. However, imagining a very large application with many users, one would require extremely powerful hardware to not notice the massive bandwidth traveling over the gateway or load balancer, given that every request to every microservice exposed by the gateway travels through that single access point.
If the API gateway instead simply redirected the client to publicly exposed microservices (sort of like a custom DNS lookup), the hardware requirements would be much lower. This is because the messages traveling to and from the API Gateway would be very small, the requests consisting only of a microservice name, and the responses consisting only of the associated public IP address.
I recognize that this pattern would involve greater latency due to increased external requests. It would also be more difficult to secure, as every microservice is publicly exposed, rather than providing authentication at a single entrypoint. However, it would allow for bandwidth to be distributed much more evenly, and provide a much wider bottleneck, thus making the application much more scalable. Is this a valid strategy?
A DNS/public-IP based approach is not good from a lot of perspectives:
Higher attack surface area, as you have too many exposed points and each needs to be protected
The number of public IPs needed is higher
You may need more DNS entries (subdomains or domains) for these APIs
A lot of the time your APIs will run on a root path, but you may want to expose them on a folder path such as example.com/service1, which requires you to use some gateway for that (see the sketch after this list)
Handling SSL certificates for all of these public exposures
Securing a focused set of nodes is manageable, whereas securing every publicly exposed service becomes a very difficult task
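To illustrate the folder-path point above, a minimal HAProxy sketch that exposes two services under path prefixes behind a single public entry point (the backend names, addresses, and certificate path are placeholders):

frontend public_api
bind *:443 ssl crt /etc/haproxy/certs/example.com.pem
mode http
acl is_svc1 path_beg /service1
acl is_svc2 path_beg /service2
use_backend be_service1 if is_svc1
use_backend be_service2 if is_svc2

backend be_service1
mode http
# if the service expects to run at the root path, the /service1 prefix would also need to be stripped (e.g. with http-request set-path)
server svc1 10.0.1.10:8080 check

backend be_service2
mode http
server svc2 10.0.1.11:8080 check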
While theoretically it is possible to redirect clients directly to the nodes, there are a few pitfalls.
Security, certificate, and DNS management have been covered in Tarun's answer.
Issues with High Availability
DNS resolvers cache the domain-to-IP mappings they serve fairly aggressively, because these seldom change. If we use DNS to expose multiple instances of services publicly and one of the servers goes down, or if we're doing a deployment, resolvers will continue routing requests to the nodes that are down for a fair amount of time. We have no control over external DNS resolvers and their caching policies.
Using a reverse proxy, we avoid sending traffic to those nodes based on health checks, as sketched below.
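A minimal sketch of that health-check behaviour in HAProxy terms (the health path, names, and addresses are assumptions):

backend api_servers
mode http
balance roundrobin
# a node that stops answering its health endpoint is marked down and no longer receives traffic
option httpchk GET /health
server node1 10.0.0.1:8080 check
server node2 10.0.0.2:8080 check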

Knative/Kubernetes unique IP for outbound traffic

Question:
Does Knative expose low-level network components that allow me to configure the stack in such a way, that each instance has a unique IP address available for outbound networking?
Info
I have a workload that has to run on a queue event. The incoming event starts the fetching from an API. Due to rate limiting and the number of requests (around 100), the process is long-running: wait / request / wait / request / wait / ... . What the code (JS) basically does is hit an API endpoint with parameters from the queue message and send the result of the 100 API requests back on another queue.
Serverless on Lambda is therefore expensive; also, on AWS, multiple instances are likely to be spawned on the same VM (tested), resulting in the same IP for outbound traffic. Therefore Lambda is not an option for me.
I read a lot about Knative lately and I imagine that the Kubernetes stack offers better configurability. I need to have concurrent instances of my service, but I need to have a unique outbound IP per instance.
Currently, the solution is deployed on AWS Beanstalk where I scale them out based on queue-length. Therefore 1 - 10 instances exist at the same time and perform the API requests. I use micro instances since CPU/../.. load is really low. There have been multiple issues with Beanstalk, that's why we'd like to move.
I do not expect a monthly cost advantage (IPs are expensive, that's ok), I am just unhappy with the deployment on Beanstalk.
IMHO, going with Knative/Kubernetes is probably not the way to proceed here. You will have to manage a ton of complexity just to get some IP addresses. Beanstalk will seem like a walk in the park.
Depending on how many IPs you need, you can just set up a few EC2 instances loaded up with IP addresses. One cheap t3.small instance can host 12 IPv4 addresses (ref) and your JS code can simply send requests from each of the different IP addresses. (Depending on your JS HTTP client, there's usually a localAddress option you can set.)
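If you would rather keep the per-IP plumbing out of the application code, the same idea can be sketched with a small local HAProxy in front of the outbound requests, pinning each listener to a different local source address (the addresses, ports, and target host below are placeholders; this is only an illustration of the idea, not a recommendation over the client-side localAddress option):

listen egress_a 127.0.0.1:8081
mode tcp
# outbound connections made through this listener use the first secondary IP
server api api.example.com:443 source 198.51.100.10

listen egress_b 127.0.0.1:8082
mode tcp
# ...and through this one, the second
server api api.example.com:443 source 198.51.100.11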

Is there a way to have a proxied request respond straight to the requestor?

So let's say I have the following (obviously simplified) architecture: I have 10 servers running the same REST API endpoint. I have an intermediate API which fields requests and then forwards each one to one of the servers (i.e. it acts as a load balancer).
Now let's imagine that this is a really big, streaming response. As such, I obviously don't want the data to have to go back through the load balancer, because wouldn't that bog it down and defeat the purpose of the load balancing server? What would be the proper way to implement a load balancing system that delegates a request to a node but does not force the response back through the load balancing server?
Further, are there any REST frameworks on the JVM that implement this?
What you are looking for is called DSR (direct server return). You can Google it a bit; AFAIK most hardware load balancers have this option.
The question is what load balancer are you using? Is it hardware, ELB on AWS, HAProxy?
For example:
http://blog.haproxy.com/2011/07/29/layer-4-load-balancing-direct-server-return-mode/
If you're not really into load balancers, you could attempt to set this up in 2 stages: first, the client hits the API and gets the IP of a server; second, the client talks to that server directly. The hard part will be not overloading some servers while leaving others idle (both in the initial assignment and when rebalancing workloads as time goes by).
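A very rough HAProxy sketch of that two-stage idea, using an HTTP redirect so the large response never flows back through the proxy (the hostname is a placeholder, and real server selection would need ACLs or an external lookup rather than a single hard-coded target):

listen dispatch *:80
mode http
# stage 1: tell the client which app server to talk to;
# stage 2 then happens directly between the client and that server
redirect prefix http://app1.example.com code 302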

HA-Proxy balancing by source doesn't appear consistent

Using HA-Proxy 1.4.18, I am using balance source to balance a TCP stream across 2 servers. However, from an admittedly very small sample set of connections, it appears that they all just go to one server - the server listed first in the haproxy config.
listen videos *:1935
balance source
mode tcp
server server1 192.168.0.1:1935
server server2 192.168.0.2:1935
I have not seen it split the load onto the 2 boxes. This does work when I use balance roundrobin; however, for this particular application I cannot use that method.
Any ideas for keeping a client's session persistent while still load balancing across these 2 machines?
Cheers
How did you test the balancing? The doc says:
The source IP address is hashed and divided by the total
weight of the running servers to designate which server will
receive the request. This ensures that the same client IP
address will always reach the same server as long as no
server goes down or up. If the hash result changes due to the
number of running servers changing, many clients will be
directed to a different server. This algorithm is generally
used in TCP mode where no cookie may be inserted. It may also
be used on the Internet to provide a best-effort stickiness
to clients which refuse session cookies. This algorithm is
static by default, which means that changing a server's
weight on the fly will have no effect, but this can be
changed using "hash-type"
If you tested with just 2 different source IPs, you may have fallen into a particular case; see the illustration below.
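To make the quoted behaviour concrete, a small illustration assuming the default static map-based hash and two running servers of equal weight (so the divisor is 2):

# server index = hash(source_ip) mod 2
# hash(client_A_ip) mod 2 == 0 -> server1
# hash(client_B_ip) mod 2 == 0 -> server1   (perfectly possible with only two test clients)
# testing from a larger, more varied set of source addresses should show both servers being used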

Redirect users to the nearest server based on their location without changing url

This is my case:
I have 6 servers across the US and Europe. All servers are behind a load balancer. When you visit the website (www.example.com) it points to the load balancer's IP address, and from there you are redirected to one of the servers. Currently, if you visit the website from Germany, for example, you are transferred randomly to one of the servers. You could end up on the Germany server or the server in San Francisco.
I am looking for a way to redirect users to the nearest server based on their location, but without changing the URL. So I am NOT looking to have many URLs such as www.example.com, www.example.co.uk, www.example.dk, etc.
I am looking for something like a CDN, where you retrieve your files from the nearest server (?), so I can also get rid of the load balancer, because if it crashes the website does not respond (?)
For example:
If you are from the UK, redirect to IP 53.235.xx.xxx
If you are from the western US, redirect to IP ....
If you are from southern Europe, redirect to IP ..., etc.
DNSMadeeasy offers a feature similar to this, but they charge a 600 dollar upfront price; for a startup that doesn't know whether the feature will work as expected, and with no trial version, we cannot afford it: http://www.dnsmadeeasy.com/enterprise-dns/global-traffic-director/
What is another way of doing this?
Also, another question on the current setup: even with 6 servers all connected to the load balancer, if the load balancer has lag issues it takes everything with it, right? And if by any chance it goes down, the website does not respond. So what is the best way to eliminate that downtime, so that if one server IP address does not respond we move to the next (as a load balancer would do, but load balancers can have issues themselves)?
It would help to know what type of application servers you're talking about, e.g. J2EE (like JBoss/Tomcat), IIS, etc.
You can use a hardware or software load balancer with sticky IP and define ranges of IPs to stick to different application servers (a sketch follows below the link). Each country's ISPs should have their own blocks of IPs.
There's a list at the website below.
http://www.nirsoft.net/countryip/
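A minimal HAProxy sketch of that idea (the CIDR ranges and backend addresses below are placeholders; real per-country ranges would come from a list like the one above):

frontend www
bind *:80
mode http
acl from_uk src 51.0.0.0/8
acl from_us_west src 13.0.0.0/8
use_backend uk_servers if from_uk
use_backend us_west_servers if from_us_west
default_backend default_servers

backend uk_servers
mode http
server uk1 10.0.1.10:80 check

backend us_west_servers
mode http
server usw1 10.0.2.10:80 check

backend default_servers
mode http
server def1 10.0.3.10:80 check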
Here's also a really, really good article on load balancing in general, with many high availability / persistence issues addressed. That should answer your second question about the single point of failure at your load balancer; there are many different techniques to provide both high availability and load distribution. A lot depends on what kind of application you run and whether you require persistent sessions or not. Load balancing by sticky IP, if persistence isn't required and your LB does health checks properly, can provide high availability with easy failover. The downside is that load isn't evenly distributed, but it seems you're looking for distribution based on proximity, not on load.
http://1wt.eu/articles/2006_lb/index.html