Getting 504 gateway timeout error when accessing node application through haproxy - haproxy

I am facing following situations when configuring haproxy with node/express application. I am trying to
achieve following.
(https) (http)
browser ======> haproxy =====> node application
When loading the node application through the browser I am getting http 504 gateway time-out error.
Below is my haproxy configurtions.
haproxy configurations
Following are the haproxy logs.
vm-2 haproxy[21255]: 127.0.0.1:45948 [23/Dec/2019:10:57:51.411] https-in~ servers/server1 0/0/0/-1/100001 504 194 - - sH-- 1/1/0/0/0 0/0 "GET / HTTP/1.1"
vm-2 haproxy[21255]: 127.0.0.1:45948 [23/Dec/2019:10:57:51.411] https-in~ servers/server1 0/0/0/-1/100001 504 194 - - sH-- 1/1/0/0/0 0/0 "GET / HTTP/1.1"
vm-2 haproxy[21255]: 127.0.0.1:46122 [23/Dec/2019:10:59:31.435] https-in~ servers/server1 0/0/0/-1/100002 504 194 - - sH-- 1/1/0/0/0 0/0 "GET /favicon.ico HTTP/1.1"
Any help would be appreciated.

You're haproxy logs indicate that it's taking over 100 seconds (ie 100001/100002) for the request to complete and that it's being aborted (ie -1) before your backend server can send the full response.
If you're looking for a strictly haproxy solution (ie. you can't/won't tune your application) then you would need to play with haproxy timeout settings.

We faced the same problem, the client requests to the server were 504s sent by the HAProxy. We found out that the defaults configurations in the haproxy.cfg file had the timeout server property that defined the 504 response (setting it to a lower value, 1s in our case, would automatically result in a 504). Increasing that value is a way to have a longer connection between the proxy and the backend.

Related

haproxy - layer 7 health check failure

I am getting occasional layer 7 health check failures. This happens on production machine seemingly at random, maybe once a minute or every few minutes on average. Here is the configuration:
backend api
mode http
option httpchk GET /api/v1/status HTTP/1.0
http-check expect status 200
balance roundrobin
server api1 127.0.0.1:8001 check fall 3 rise 2
server api2 127.0.0.1:8002 check fall 3 rise 2
The HAproxy log tells me the following:
Health check for server api/api2 failed, reason: Layer7 timeout, check duration: 10001ms, status: 2/3 UP.
Strange thing is when I run a script to fetch the same URL at a much faster pace than HAproxy, it never fails to return 200 response. It never hangs like it seems to do for HAproxy.
In addition, I'm getting occasional HAProxy error for various API calls, not just health checks, all looking quite similar:
https-in~ api/api1 45/0/0/-1/30045 504 194 - - sHVN 50/49/13/10/0 0/0 "POST /api/v1/accounts HTTP/1.1"
What could be the issue here? This one really got me stumped.

Long request returns with empty response after 120 seconds, caused by Network Load Balancer

I have a GKE cluster with 2 nodes, with a service of type LoadBalancer.
When I call the service internally a long request will not timeout after 120 seconds.
But if I call the external IP of the Network Load Balancer that forwards to the internal service, I get a "Empty reply from server" response.
External call example:
curl -v "http://<public-ip>/longResponse"
* Trying <public-ip>...
* TCP_NODELAY set
* Connected to <public-ip> (<public-ip>) port 80 (#0)
> GET /longResponse HTTP/1.1
> Host: <public-ip>
> User-Agent: curl/7.54.0
> Accept: */*
>
* Empty reply from server
* Connection #0 to host <public-ip> left intact
curl: (52) Empty reply from server
Internal call example:
/ # wget -O - -S <service-name>/longResponse
Connecting to location-service (10.3.255.181:80)
HTTP/1.1 200 OK
Access-Control-Allow-Origin: *
Content-Type: application/json
Content-Length: 15
Date: Thu, 28 Feb 2019 10:31:14 GMT
Connection: close
- 100% |*********************************************************************************************************************************************************************************************************************| 15 0:00:00 ETA
/ #
I've tried to find documentation for request or socket timeout in the load balancer level, but I didn't encounter anything. Any idea?
Thanks.
Are you sure that's not a client-side timeout? Network LB doesn't process packets other than to route them, so it should never send any response back.
Try the -m flag to curl?
Also maybe capture a tcpdump on your client-side so you can see what the network is actually doing.
Get the load-balancer's backend name with:
gcloud compute backend-services list
then
BACKEND=name-of-your-backend
gcloud compute backend-services update $BACKEND --timeout=600s
otherwise, in the console: Network services ⇒ Load balancing ⇒ Backends then you can click your HTTP backend(s) and edit the settings, including the timeout.
On a wider note, this may be one of serval hops between server and client, each of which might timeout. You're better off either living with the timeout (and making your long polls complete before the timeout), or drip feeding data down the line... for instance, you can preprend whitespace to json, so for instance, send a space character every 30 seconds until you have a proper response body. This will keep the load-balance from timing out.

increase the upload limit of HAProxy

When using HAProxy, I've been getting the error 413: Request Entity Too Large
This error occurs when I'm trying to upload some files which are too large, however I can not find any documentation on how to increase this limit.
How can you increase the maximum upload limit to a specified amount of MB's?
This is not a HAProxy error, as you can see here http://cbonte.github.io/haproxy-dconv/configuration-1.7#1.3.1, 413 Error is not in the list.
So this probably an error returned from the server and HAProxy is just "forwarding" the error to the client.
To be 100% sure, you can see the logs:
An error returned by HAProxy:
127.0.0.1:35494 fe_main be_app/websrv1 0/0/-1/-1/3002 503 212 - - SC-- 0/0/0/0/3 0/0 "GET /test HTTP/1.1"
An error returned by the backend server:
127.0.0.1:39055 fe_main be_app/websrv2 0/0/0/0/0 404 324 - - --NI 1/1/0/1/0 0/0 "GET /test HTTP/1.1"
Notice the "-1" in the timers.

404 redirect to another server/domain

I'm looking for a solution with redirects to another domain if the response from HTTP server was 404.
acl not_found status 404
acl found_ceph status 200
use_backend minio_s3 rsprep ^HTTP/1.1\ 404\ (.*)$ HTTP/1.1\ 302\ Found\nLocation:\ / if not_found
use_backend ceph if found_ceph
But still not working, this rule goes to minio_s3 backend.
Thank you for you advice.
When the response from this backend has status 404, first add a Location header that will send the browser to example.com with the original URI intact, then set the status code to 302 so the browser executes a redirect.
backend my-backend
mode http
server my-server 203.0.113.113:80 check inter 60000 rise 1 fall 2
http-response set-header Location http://example.com%[capture.req.uri] if { status eq 404 }
http-response set-status 302 if { status eq 404 }
Test:
$ curl -v http://example.org/pics/funny/cat.jpg
* Hostname was NOT found in DNS cache
* Trying 127.0.0.1...
* Connected to example.org (127.0.0.1) port 80 (#0)
> GET /pics/funny/cat.jpg HTTP/1.1
> User-Agent: curl/7.35.0
> Host: example.org
> Accept: */*
The actual back-end returns 404, but we don't see it. Instead...
< HTTP/1.1 302 Moved Temporarily
< Last-Modified: Thu, 04 Aug 2016 16:59:51 GMT
< Content-Type: text/html
< Content-Length: 332
< Date: Sat, 07 Oct 2017 00:03:22 GMT
< Location: http://example.com/pics/funny/cat.jpg
The response body from the back-end's 404 error page will still be sent to the browser, but -- as it turns out -- the browser will not display it, so no harm done. This requires HAProxy 1.6 or later.
#Michael's answer is rather good, but isno't working for me for two reasons:
Mainly because the %[capture.req.uri] tag resolves to empty (HA Proxy 1.7.9 Docker image)
Also due to the fact that the original assumptions are incomplete, due to the fact that the frontend section is missing...
So I struggled for a while, as you find all kinds of answers on the Internet, between those guys who swear the 404 logic should be put in the frontend, vs those who choose the backend, and any possible kind of tags...
This is my answer, which works for me.
My use case is that if an image is not found on the backend behind HA Proxy, then an S3 bucket is checked.
The entry point is: https://myhostname:8080/path/to/image.jpeg
defaults
mode http
global
log 127.0.0.1:514 local0 debug
frontend come_on_over_here
bind :8080
# The following two lines are here to save values while we have access to them. They won't be available in the backend section.
http-request set-var(txn.path) path
http-request set-var(txn.query) query
http-request replace-value Host localhost:8080 dev.local:80
default_backend onprems_or_s3_be
backend onprems_or_s3_be
log global
acl path_photos var(txn.path) -m beg /path/prefix/i/want/to/strip/off
acl p_ext_jpeg var(txn.path) -m end .jpeg
acl is404 status eq 404
http-response set-header Location https://mybucket.s3.eu-west-3.amazonaws.com"%[var(txn.path),regsub(^/path_prefix_i_want_to_strip_off/,/)]?%[var(txn.query)]" if path_photos p_ext_jpeg is404
http-response set-status 301 if is404
server onprems_server dev.local:80 check

Haproxy 503 Service Unavailable . No server is available to handle this request

How does haproxy deal with static file , like .css, .js, .jpeg ? When I use my configure file , my brower says :
503 Service Unavailable
No server is available to handle this request.
This my config :
global
daemon
group root
maxconn 4000
pidfile /var/run/haproxy.pid
user root
defaults
log global
option redispatch
maxconn 65535
contimeout 5000
clitimeout 50000
srvtimeout 50000
retries 3
log 127.0.0.1 local3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout check 10s
listen dashboard_cluster :8888
mode http
stats refresh 5s
balance roundrobin
option httpclose
option tcplog
#stats realm Haproxy \ statistic
acl url_static path_beg -i /static
acl url_static path_end -i .css .jpg .jpeg .gif .png .js
use_backend static_server if url_static
backend static_server
mode http
balance roundrobin
option httpclose
option tcplog
stats realm Haproxy \ statistic
server controller1 10.0.3.139:80 cookie controller1 check inter 2000 rise 2 fall 5
server controller2 10.0.3.113:80 cookie controller2 check inter 2000 rise 2 fall 5
Does my file wrong ? What should I do to solve this problem ? ths !
What I think is the cause:
There was no default_backend defined. 503 will be sent by HAProxy---this will appear as NOSRV in the logs.
Another Possible Cause
Based on one of my experiences, the HTTP 503 error I receive was due to my 2 bindings I have for the same IP and port x.x.x.x:80.
frontend test_fe
bind x.x.x.x:80
bind x.x.x.x:443 ssl blah
# more config here
frontend conflicting_fe
bind x.x.x.x:80
# more config here
Haproxy configuration check does not warn you about it and netstat doesn't show you 2 LISTEN entries, that's why it took a while to realize what's going on.
This can also happen if you have 2 haproxy services running. Please check the running processes and terminate the older one.
Try making the timers bigger and check that the server is reachable.
From the HAproxy docs:
It can happen from many reasons:
The status code is always 3-digit. The first digit indicates a general status :
- 1xx = informational message to be skipped (eg: 100, 101)
- 2xx = OK, content is following (eg: 200, 206)
- 3xx = OK, no content following (eg: 302, 304)
- 4xx = error caused by the client (eg: 401, 403, 404)
- 5xx = error caused by the server (eg: 500, 502, 503)
503 when no server was available to handle the request, or in response to
monitoring requests which match the "monitor fail" condition
When a server's maxconn is reached, connections are left pending in a queue
which may be server-specific or global to the backend. In order not to wait
indefinitely, a timeout is applied to requests pending in the queue. If the
timeout is reached, it is considered that the request will almost never be
served, so it is dropped and a 503 error is returned to the client.
if you see SC in the logs:
SC The server or an equipment between it and haproxy explicitly refused
the TCP connection (the proxy received a TCP RST or an ICMP message
in return). Under some circumstances, it can also be the network
stack telling the proxy that the server is unreachable (eg: no route,
or no ARP response on local network). When this happens in HTTP mode,
the status code is likely a 502 or 503 here.
Check ACLs, check timeouts... and check the logs, that's the most important...