My backend, written in Rocket (Rust), does not have compression built in, so it depends on the proxy to compress responses. The nginx ingress controller supports this, but I wondered whether the default one does too, since it offers high availability.
If it does not, how should I set it up?
UPDATE(2018-01-31): It looks like Cloud HTTP(S) Load Balancer supports GZIP. You just have to serve compressed content from your backend and the load balancer will pass it on.
However, NGINX is confused because of the Via header (it thinks proxies don't support GZIP, and on most cloud providers this is correct, but not Google). See this FAQ: https://cloud.google.com/cdn/docs/troubleshooting#compression-not-working
If you are using the nginx web server software, modify the nginx.conf
configuration file to enable compression. The location of this file
depends on where nginx is installed. In many Linux distributions, the
file is stored at /etc/nginx/nginx.conf. To allow nginx compression to
work with HTTP(S) load balancing, add the following two lines to the
http section of nginx.conf:
gzip_proxied any;
gzip_vary on;
I believe nginx does not, by default, compress responses to requests that arrive via a proxy.
You can change its config to enable that:
gzip_proxied any;
gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
Source: https://blog.percy.io/tuning-nginx-behind-google-cloud-platform-http-s-load-balancer-305982ddb340
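For the other route mentioned in the update (serving already-compressed content from the backend and letting the load balancer pass it on), here is a minimal sketch of the idea in Python rather than Rocket. The names `accepts_gzip` and `maybe_gzip` and the 256-byte threshold are assumptions for illustration, not part of any framework.

```python
import gzip


def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the Accept-Encoding value lists gzip with q > 0."""
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def maybe_gzip(body: bytes, accept_encoding: str, min_size: int = 256):
    """Compress the body when the client accepts gzip; always set Vary."""
    # Vary tells caches and proxies that the response depends on Accept-Encoding.
    headers = {"Vary": "Accept-Encoding"}
    if len(body) >= min_size and accepts_gzip(accept_encoding):
        headers["Content-Encoding"] = "gzip"
        body = gzip.compress(body)
    headers["Content-Length"] = str(len(body))
    return body, headers
```

Tiny bodies are left alone here because gzip overhead can make them larger; the cutoff is arbitrary.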
I have a Cloudflare Load Balancer configuration with two origin servers:
app.example.com -> backend1.example.com
                -> backend2.example.com
This works fine most of the time. However, when a backend server does an HTTP redirect, it reveals the backend server hostname to the browser. For example, if there is a redirect from /a to /b the request/response would look like this (with some headers omitted for brevity):
Request
GET /a HTTP/1.1
Host: app.example.com
Response
HTTP/1.1 302 Found
Location: https://backend1.example.com/b
This means the browser tries to connect to the backend server directly, bypassing the load balancer.
What I want
Is it possible for the Location to be corrected by the Cloudflare Load Balancer, similar to what ProxyPassReverse does in an Apache reverse proxy?
For example:
HTTP/1.1 302 Found
Location: https://app.example.com/b
or even
HTTP/1.1 302 Found
Location: /b
Or do I need to find a way to fix this on the backend server?
Here's an approach that may work, if the backend supports it.
The X-Forwarded-Host request header is (a) injected by some reverse proxies and (b) honoured by some application servers. It allows the application to see what original hostname the browser connected to before it was reverse proxied, and then use that hostname when constructing redirects.
It's easily spoofed by a client, so it's often not automatically trusted by the application server.
Here's how to use it.
Add a Cloudflare Transform Rule:
Rule Name: Add X-Forwarded-Host,
When: Hostname equals app.example.com
HTTP Request Header Modification,
Set Dynamic,
Header Name: X-Forwarded-Host,
Value: http.host
Deploy
Now on the backend, configure the application server to support it (if required).
For example, JBoss or Wildfly:
/subsystem=undertow/server=default-server/https-listener=default:write-attribute(name=proxy-address-forwarding,value=true)
Express for Node.js: Use the trust proxy setting
Your application server may support it out of the box, it may need a bit of configuration, or it may not support it at all. Look for X-Forwarded-Host in the docs.
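To make the idea concrete, here is a rough sketch of what an application server does when it honours the header. This is not the code of any particular server; `redirect_location` and the `trusted_proxy` flag are hypothetical names, and the header dict is assumed to use exactly these key spellings.

```python
def redirect_location(headers: dict, path: str, trusted_proxy: bool,
                      scheme: str = "https") -> str:
    """Build an absolute redirect URL for the hostname the browser used."""
    host = headers.get("Host", "")
    forwarded = headers.get("X-Forwarded-Host")
    # Only honour the header when the request came through a proxy we trust,
    # because any client can send X-Forwarded-Host itself.
    if trusted_proxy and forwarded:
        # If several proxies appended values, the first is the original one.
        host = forwarded.split(",")[0].strip()
    return f"{scheme}://{host}{path}"


# Hypothetical request as the backend might see it.
request_headers = {
    "Host": "backend1.example.com",
    "X-Forwarded-Host": "app.example.com",
}
print(redirect_location(request_headers, "/b", trusted_proxy=True))
```

With `trusted_proxy=False` the same call falls back to the backend's own hostname, which is the behaviour described in the question.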
My company is thinking of moving from AWS to GCP. One of the features we want from Cloud CDN is Brotli encoding. We have a tech stack that bundles our JavaScript into 3 files:
chunk.js
chunk.js.gz
chunk.js.br
If Cloud CDN receives a client request with the header Accept-Encoding: br, gzip, is it smart enough to serve the Brotli file? Moreover, will it be cached? If not, are there any other ways to achieve this with Cloud CDN?
AWS CloudFront only offers this feature through the use of 2 Lambdas, which I think is a bad idea.
Yes, Cloud CDN can cache all 3 representations and serve the correct one based on the client's Accept-Encoding header so long as the response from your origin server includes a Vary: Accept-Encoding header. There's more information at https://cloud.google.com/cdn/docs/caching#vary_headers.
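As a toy illustration of what a Vary-aware cache does (this is not Cloud CDN's actual implementation, and the exact-match comparison of header values is an assumption), the cache key can be thought of as the URL plus the request's value for each header named in Vary. `VaryCache` is a hypothetical name.

```python
class VaryCache:
    """Toy cache that keeps one entry per (URL, varied request header values)."""

    def __init__(self):
        self._entries = {}

    @staticmethod
    def _key(url, request_headers, vary):
        # One component per header listed in the response's Vary header.
        varied = tuple((name.lower(), request_headers.get(name, "")) for name in vary)
        return (url, varied)

    def store(self, url, request_headers, body, vary=("Accept-Encoding",)):
        self._entries[self._key(url, request_headers, vary)] = body

    def lookup(self, url, request_headers, vary=("Accept-Encoding",)):
        return self._entries.get(self._key(url, request_headers, vary))


cache = VaryCache()
cache.store("/chunk.js", {"Accept-Encoding": "br, gzip"}, b"<brotli bytes>")
cache.store("/chunk.js", {"Accept-Encoding": "gzip"}, b"<gzip bytes>")
```

A lookup with a header value the cache has not seen yet returns `None`, which corresponds to a cache miss that goes back to the origin.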
Update:
I didn't realize you were using a Cloud Storage bucket as the origin. Unfortunately, neither Cloud CDN nor Cloud Storage has functionality that will rewrite client requests for /chunk.js to /chunk.js.br based on whether the client supports Brotli. I agree that would be useful, so I filed an internal feature request.
When an origin server such as nginx is configured to select the appropriate file, Cloud CDN needs to go back to the origin server only on a cache miss. So long as the origin server's responses contain a Vary: Accept-Encoding header, Cloud CDN can serve cache hits directly from the edge by comparing the client's Accept-Encoding request header with the Accept-Encoding value specified when the response was cached. Clients that specify Accept-Encoding: br, gzip will be served from one cache entry while clients that specify Accept-Encoding: gzip will be served from another.
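The file selection an origin would have to do can be sketched roughly like this. It is an assumption-laden example, not nginx's logic: `pick_variant` is a hypothetical helper, the server simply prefers br over gzip, and q-values are ignored except for q=0.

```python
def pick_variant(accept_encoding: str, base: str = "chunk.js"):
    """Return (filename, content_encoding) for the best precompressed file."""
    offered = set()
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                if float(params[2:]) == 0:
                    continue  # q=0 means the client refuses this coding
            except ValueError:
                continue  # skip malformed entries
        if token.strip():
            offered.add(token.strip().lower())
    if "br" in offered:
        return base + ".br", "br"
    if "gzip" in offered:
        return base + ".gz", "gzip"
    return base, None  # fall back to the uncompressed file


filename, encoding = pick_variant("br, gzip")
# The response would carry Content-Encoding (when not None) and Vary: Accept-Encoding.
```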
CloudFront now supports Brotli compression natively. If you're using S3 as your origin (or any origin that returns uncompressed content), CloudFront can automatically compress at the edge using Brotli or Gzip. You don't need to create three versions of the file or use Lambda@Edge.
https://aws.amazon.com/about-aws/whats-new/2020/09/cloudfront-brotli-compression/
I need to integrate several web applications, on-premise and off-site, under a common internally hosted URL. The on-premise applications are in the same data center as the haproxy, but the off-site applications can only be reached via an HTTP proxy because the server on which haproxy is running has no direct Internet access. Therefore I have to use an HTTP Internet proxy; SOCKS might be an option too.
How can I tell haproxy that a backend can only be reached via a proxy?
I would rather not use an additional component like socksify / proxifier / proxychains / tsocks / ... because this introduces additional overhead.
This picture shows the components involved in the setup:
When I run this on a machine with direct Internet connection I can use this config and it works just fine:
frontend main
bind *:8000
acl is_extweb1 path_beg -i /policies
acl is_extweb2 path_beg -i /produkte
use_backend externalweb1 if is_extweb1
use_backend externalweb2 if is_extweb2
backend externalweb1
server static www.google.com:80 check
backend externalweb2
server static www.gmx.net:80 check
(Obviously these are not the URLs I am talking to, this is just an example)
Haproxy is able to check the external applications and routes traffic to them:
In the safe environment of the company I work at I have to use a proxy and haproxy is unable to connect to the external applications.
How can I enable haproxy to use those external web application servers behind an HTTP proxy (no authentication needed) while providing access to them through a common HTTP page via the browser?
How about using delegate (http://delegate.org/documents/) for this? Just an idea.
haproxy -> delegate -f -vv -P127.0.0.1:8081 PROXY=<your-proxy>
http://delegate9.org/delegate/Manual.shtml?PROXY
I know it's not that elegant but it could work.
I have tested this setup with a local squid and this curl call
echo 'GET http://www.php.net/' |curl -v telnet://127.0.0.1:8081
The curl call simulates the haproxy TCP call.
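What matters in that test is the request line: a forward proxy expects the absolute URL in it, not just the path. A small sketch of building such a request follows. `proxy_request_bytes` is a hypothetical helper, and unlike the bare curl test it emits a full HTTP/1.1 request with a Host header.

```python
from urllib.parse import urlsplit


def proxy_request_bytes(url: str) -> bytes:
    """Build a GET request in absolute form, as sent to an HTTP forward proxy."""
    parts = urlsplit(url)
    if parts.scheme != "http" or not parts.netloc:
        raise ValueError("this sketch only handles plain http:// URLs")
    path = parts.path or "/"
    query = f"?{parts.query}" if parts.query else ""
    target = f"http://{parts.netloc}{path}{query}"
    lines = [
        f"GET {target} HTTP/1.1",  # absolute URL instead of just the path
        f"Host: {parts.netloc}",
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


print(proxy_request_bytes("http://www.php.net/").decode("ascii"))
```

These bytes would be written to the socket connected to the proxy (127.0.0.1:8081 in the test above), not to the target server.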
I was intrigued and wanted to make this work, but I could not find anything in the haproxy documentation. After some googling I found that nginx might do the trick, but it didn't work for me. After a bit more googling I ended up finding a configuration for Apache that works.
Here is the important part:
Listen 80
SSLProxyEngine on
ProxyPass /example/ https://www.example.com/
ProxyPassReverse /example/ https://www.example.com/
ProxyRemote https://www.example.com/ http://corporateproxy:port
ProxyPass /google/ https://www.google.com/
ProxyPassReverse /google/ https://www.google.com/
ProxyRemote https://www.google.com/ http://corporateproxy:port
I'm quite sure there should be a way to translate this configuration to nginx and even to haproxy. If I manage to find the time I will update the answer with my findings.
For Apache to work you also need to enable a few modules. I put up a GitHub repository with a basic Docker configuration that showcases this; feel free to have a look at it to see the full working configuration.
I have two components to my application, an API server (which is shared between several versions of the app), and static asset servers for the different distributions (mobile/desktop). I am using HAproxy to make the API server and the static asset servers behave as though they are on the same domain (to prevent CORS nastiness). My static asset servers are on CloudFront. Eventually, the HTML will reference the cloudfront URLs for the assets it depends on (to leverage global distribution). Temporarily for ease, I'm just having everything go through HAProxy. I'm having a hard time, however, getting HAProxy to send stuff properly to cloudfront.
My backend definition looks like this:
backend music_static
http-request set-header Host <hash>.cloudfront.net
option httpclose
server cloudfront <hash>.cloudfront.net
I figured that by setting the Host header value, I would be "spoofing" things correctly on their way to CloudFront. Obviously, visiting <hash>.cloudfront.net directly behaves exactly as I expect.
You have probably moved on from this issue, but I see it's not answered yet.
One solution to this issue is to enable SNI on CloudFront (this costs money, but worked for me - http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/SecureConnections.html). The Host header above doesn't help, because the HTTP Host header is only sent after the TLS handshake has completed, whereas to support SNI CloudFront needs the hostname during the handshake itself.
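To see that the hostname really travels in the handshake and not in the HTTP headers, here is a small offline sketch using Python's ssl module with in-memory BIOs, so no network connection is made. The hostname d111111abcdef8.cloudfront.net is a made-up placeholder, and `client_hello_bytes` is a hypothetical helper name.

```python
import ssl


def client_hello_bytes(server_hostname: str) -> bytes:
    """Return the first TLS flight (ClientHello) produced for a given SNI name."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    incoming = ssl.MemoryBIO()  # bytes from the server (none in this sketch)
    outgoing = ssl.MemoryBIO()  # bytes the client wants to send
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=server_hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # Expected: the ClientHello has been written and the client now
        # waits for a server reply that never arrives here.
        pass
    return outgoing.read()


hello = client_hello_bytes("d111111abcdef8.cloudfront.net")
# The SNI extension carries the hostname in clear text inside the ClientHello.
print(b"d111111abcdef8.cloudfront.net" in hello)
```

No HTTP request exists yet at this point, so a Host header set by the proxy cannot influence which certificate or distribution the TLS endpoint selects.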
Is it possible to force HTTPS URLs even when the X-Forwarded-Host header is not present?
Update:
We are using HAProxy in front of the Neo4j server. The configuration is
frontend proxy-ssl
bind 0.0.0.0:1591 ssl crt /etc/haproxy/server.pem
reqadd X-Forwarded-Proto:\ https
default_backend neo-1
This works well when every connection contains only one request. However, for Neo4j drivers that use keep-alive (like Py2neo), the header is added only to the first request.
Without the X-Forwarded-Proto header, the generated URLs are http://host:1591, instead of https://host:1591.
According to the HAProxy documentation, this is the normal behavior:
since HAProxy's HTTP engine does not support keep-alive, only headers
passed during the first request of a TCP session will be seen. All subsequent
headers will be considered data only and not analyzed. Furthermore, HAProxy
never touches data contents, it stops analysis at the end of headers.
The workaround is to add option http-server-close in the frontend, which forces every request onto its own connection, but it would be nicer if keep-alive could be supported.
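The effect can be reproduced with a toy model. This is neither HAProxy nor Neo4j code: `add_proto_header` and `generated_base_url` are hypothetical names, and the assumption that the server falls back to http without the header is taken from the behaviour described above.

```python
def add_proto_header(requests, per_request: bool):
    """Simulate a proxy adding X-Forwarded-Proto on one TCP connection.

    per_request=False mimics a proxy that only analyses the first request
    of the session; per_request=True mimics one that analyses every request.
    """
    out = []
    for index, headers in enumerate(requests):
        headers = dict(headers)  # copy so the input stays untouched
        if per_request or index == 0:
            headers["X-Forwarded-Proto"] = "https"
        out.append(headers)
    return out


def generated_base_url(headers, host="host:1591"):
    """Pick the URL scheme the way a header-aware server might."""
    scheme = headers.get("X-Forwarded-Proto", "http")
    return f"{scheme}://{host}"


# Two requests sent over the same keep-alive connection.
connection = [{"Host": "host:1591"}, {"Host": "host:1591"}]
for headers in add_proto_header(connection, per_request=False):
    print(generated_base_url(headers))
```

With `per_request=False` only the first request on the connection gets an https URL; with `per_request=True` both do.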
Put something like Apache or Nginx in front of your Neo4j server to perform that task.
In terms of py2neo, I can add some functionality to cater for this situation quite easily. What if I were to include X-Forwarded-Proto: https for all https connections? Would that cause a problem in cases where a proxy isn't used?