How to strip Proxy protocol with HAproxy?

Consider the following situation:
                                 Internet
                                    ||
                                    ||
                             .------''------.
                             | HTTPS (:443) |
                             '------..------'
                                    ||
            .-----------------------'|
            |                       \/
            |           3rd party HAproxy service
            |                       ||
            |                       ||
         optional       .-----------''-----------.
          route         | PROXY Protocol (:5443) |
            |           '-----------..-----------'
            |                       ||                                 ________
 ___________|_______________________||________________________________| SERVER |____
|           |                       \/                                              |
|           |                 local HAproxy                                         |
|           |                       ||                                              |
|           |                       ||                                              |
|           |                .------''------.                                       |
|           |                | HTTPS (:443) |                                       |
|           |                '------..------'                                       |
|           |                       ||                                              |
|           |                       ||                                              |
|           |                       \/                                              |
|           '---------------> local webserver                                       |
|___________________________________________________________________________________|
The backend server has both HAproxy and Apache httpd running locally, on ports 5443 and 443 respectively.
My local webserver does not support the PROXY protocol. So I want HAproxy to catch the PROXY Protocol from the 3rd party service, and pass the data to the local webserver in either HTTPS or simply a TCP pass-through.
In the case of HTTPS I suppose it should manipulate the HTTP packets using the correct SSL-certificate to add the original sender IP in the X-Forwarded-For HTTP headers (which should be provided by the PROXY protocol).
However, the documentation of HAproxy is awful if you are new to HAproxy, and I could not find examples that explain how to do this. I know it has to be possible since HAproxy is listed as "Proxy-protocol ready software", but how?

Yes, this is possible: you need to use the accept-proxy keyword after bind in the frontend declaration. It is also worth reading about the related send-proxy keyword, which is what the "3rd party HAproxy service" uses on its side.
The PROXY Protocol header can be stripped, restoring the original stream, with the following HAproxy configuration:
frontend app-proxy
    bind *:5443 accept-proxy
    mode tcp
    option tcplog
    default_backend app-httpd

backend app-httpd
    mode tcp
    server app1 127.0.0.1:443 check
This will accept a PROXY Protocol on port 5443, strip it, and send the TCP data to 443.
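To make "strip" concrete: in version 1 of the PROXY protocol, the sender prepends a single human-readable line to the TCP stream before any application data. The Python sketch below is not what HAproxy runs internally. It is a minimal illustration, using a hypothetical helper named strip_proxy_v1, of separating that line from the payload. It assumes the v1 text format only (not the binary v2 format), and the addresses in the sample are made-up documentation addresses.

```python
def strip_proxy_v1(data: bytes):
    """Split a PROXY protocol v1 header from the start of a TCP stream.

    Returns (info, payload): info describes the original connection,
    payload is everything that follows the header line.
    """
    if not data.startswith(b"PROXY "):
        raise ValueError("stream does not start with a PROXY v1 header")
    # A v1 header is one line ending in CRLF, at most 107 bytes in total.
    end = data.find(b"\r\n", 0, 107)
    if end == -1:
        raise ValueError("PROXY v1 header is not terminated by CRLF")
    fields = data[:end].decode("ascii").split(" ")
    payload = data[end + 2:]
    if fields[1] == "UNKNOWN":
        # The sender could not determine the original addresses.
        return {"family": "UNKNOWN"}, payload
    if fields[1] not in ("TCP4", "TCP6") or len(fields) != 6:
        raise ValueError("malformed PROXY v1 header")
    _, family, src_ip, dst_ip, src_port, dst_port = fields
    info = {
        "family": family,
        "src_ip": src_ip,
        "dst_ip": dst_ip,
        "src_port": int(src_port),
        "dst_port": int(dst_port),
    }
    return info, payload


# Sample stream: header line followed by the application data.
# In the HTTPS setup above the payload would be TLS bytes; plain HTTP
# is used here only to keep the example readable.
stream = b"PROXY TCP4 203.0.113.7 192.0.2.10 56324 5443\r\nGET / HTTP/1.1\r\n\r\n"
info, payload = strip_proxy_v1(stream)
print(info["src_ip"], info["src_port"])
print(payload)
```

In TCP mode HAproxy forwards only the payload part to port 443, which is why the webserver no longer needs to understand the PROXY protocol.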
If you would like to manipulate the HTTP packets in the SSL-encrypted TCP data, you would need to have access to the correct SSL certificates (which your webserver should have access to already). This is what you'll likely want to do.
frontend app-proxy
    bind *:5443 accept-proxy ssl crt /path/to/certnkey-file.pem
    mode http
    option httplog
    option forwardfor
    default_backend app-httpd

backend app-httpd
    mode http
    server app1 127.0.0.1:443 check ssl verify none
The advantage of the latter approach is that the original client address is preserved while passing through the proxies, so you know the original IP of your visitor, which is the whole point of using the PROXY Protocol in the first place. Note that HAproxy only adds the X-Forwarded-For header when option forwardfor is set; with it, the header carries the client IP address that was transferred using the PROXY Protocol.

Related

How TCP connections are distinguished during backend service communication?

Basically, I know that browsers give each TCP connection a different port by choosing a free ephemeral port, which makes the connection unique. However, I don't know what this looks like at the TCP level when two backend services connect to each other. Is it similar to how browsers work?
For example, let's say I'm sending a request from some HTTP client to 'Service A', which runs on a 'thread-per-connection' server and listens on port 'X'. Within the chosen endpoint I also send an HTTP request to 'Service B', which listens on port 'Y' (a similar service or a database). How will a unique TCP connection be established between these two services? Does 'Service A' act similarly to how browsers handle that?
The outside HTTP client application is acting as a client to Service A. So that app will use an ephemeral port when making that 1st connection.
Service A then acts as a client to Service B. So Service A will use an ephemeral port when making that 2nd connection.
----------       -------------           -------------
| client | ----> | service A | --------> | service B |
----------       -------------           -------------
    ^             ^         ^             ^
    |             |         |             |
x.x.x.x:e1    y.y.y.y:X y.y.y.y:e2    z.z.z.z:Y
What you describe is common to all TCP connections, including HTTP. The party creating the connection (the "client") picks an ephemeral port (it is actually picked by the OS, not by the application) when connecting to the party accepting the connection (the "server").
Note that the terms "client" and "server" can be confusing since they are used with several meanings. A "server" is often a piece of hardware that provides services. It can be the service application itself, which accepts connections. But it can also be a role in the communication, i.e. the client is the one initiating the connection and the server is the one accepting it. In your case Service A, which is a server application, acts in the role of the client when initiating a TCP connection to Service B.
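The behaviour described above can be observed with a few lines of Python. This is only a local sketch: the listening socket stands in for Service B, and the two outgoing connections stand in for Service A acting as a client. The variable names are made up for the illustration, and the listener uses port 0 (any free port) instead of a real fixed port Y.

```python
import socket

# "Service B": a listening socket. Port 0 asks the OS for any free port,
# standing in for the fixed, well-known port Y.
service_b = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
service_b.bind(("127.0.0.1", 0))
service_b.listen()
port_y = service_b.getsockname()[1]

# "Service A" in the client role, twice. It never chooses a source port;
# the OS assigns an ephemeral one when the connection is made.
conn1 = socket.create_connection(("127.0.0.1", port_y))
conn2 = socket.create_connection(("127.0.0.1", port_y))

# Service B sees each connection with its own (client IP, client port) pair.
accepted1, peer1 = service_b.accept()
accepted2, peer2 = service_b.accept()

e1 = conn1.getsockname()[1]
e2 = conn2.getsockname()[1]
print("service B listens on port", port_y)
print("connection 1 uses source port", e1)
print("connection 2 uses source port", e2)
print("service B sees peers", peer1, peer2)

for s in (conn1, conn2, accepted1, accepted2, service_b):
    s.close()
```

Both connections go to the same destination address and port, and they are told apart only by their differing source ports.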

Access PostgreSQL Database Outside Local Network

I have a Windows 10 machine, and I would like to access a database hosted on another machine outside the local network.
Is it possible to achieve that using PostgreSQL?
Thanks a lot, and I'd appreciate your effort to help me overcome this situation.
It is possible, provided that:
The firewall of your local network allows outgoing connections to the PostgreSQL listen port (usually 5432).
The firewall of the other network allows incoming connections to the PostgreSQL listen port (usually 5432).
The firewall of the PostgreSQL server allows connection on its listen port (usually 5432).
The PostgreSQL server is configured to accept network connections.
You can use a network scanner such as Nmap to test things. The thing to do is to get a laptop on the customer's network and scan from there. If you can connect to PostgreSQL from an address on the same subnet, then you know nothing else is needed on the PostgreSQL server, and your attention needs to be on the customer's firewall. This is where things can get difficult, and you'll need to work with whoever controls that firewall / router.
Chances are that the customer's network is on an RFC 1918 subnet. If this is the case, the firewall / router will need to be configured to port forward like this:
          public internet
                 |
  ----public address--port nnn---
  |                             |
  |          firewall           |
  |                             |
  |-----rfc 1918 address--------|
                 |
                 |
                 |
----rfc 1918 address--port 5432---
|                                |
|       PostgreSQL server        |
|                                |
|--------------------------------|
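The reachability checks listed above can also be scripted instead of using a scanner. The following is a minimal sketch with a hypothetical helper named can_connect. It only tests whether a TCP connection to the port can be opened, so a True result says the firewalls and the listen configuration let you through, not that PostgreSQL will accept your login.

```python
import socket


def can_connect(host: str, port: int = 5432, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and tries to connect;
        # any failure (refused, timed out, unresolvable) raises OSError.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running can_connect with the server's address from a machine on the same subnet, and then with the public address and forwarded port from outside, narrows down which of the conditions above is not met.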

Is there a Perl PSGI/Plack server available that only speaks PSGI and not also HTTP?

A usual deployment, past and present, has looked like the following to me:
+------------------+     +---------+ tcp +-------+ tcp
| PSGI Application |----o| Starman |---->| nginx |<----(internet)
+------------------+     +---------+     +-------+
In effect, I have two full-fledged web servers between the internet and the actual web application.
Since nginx has uWSGI directly built in, and uWSGI supports the PSGI protocol, which is a fork of WSGI, I would love to use a PSGI broker (PSGI only, no HTTP) instead of a full-fledged web server (Starman).
Is there a PSGI-only broker solution available?
The PSGI 'protocol' (like WSGI) is essentially a calling convention for a subroutine. A request comes into the application as a subroutine call with a hash as an argument. The application responds through the subroutine's return value: an arrayref containing HTTP status code, HTTP headers and body. There's more to it than that, but those are the essentials.
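Since the answer compares PSGI to WSGI, the calling-convention idea can be shown with the Python counterpart. The sketch below is WSGI, not PSGI: in WSGI the status and headers are passed to a start_response callback and the body is returned, whereas a PSGI application returns status, headers and body together in one arrayref. The helper call_app is hypothetical and plays the part of the server with a deliberately incomplete request dictionary.

```python
def app(environ, start_response):
    """A WSGI application: one callable, invoked once per request."""
    body = f"Hello from {environ['PATH_INFO']}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(application, path):
    """Play the server's role: build the request dict and call the app."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    # A real server fills in many more keys; two are enough for this app.
    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path}
    chunks = application(environ, start_response)
    return captured["status"], captured["headers"], b"".join(chunks)


status, headers, body = call_app(app, "/ping")
print(status)
print(body)
```

No socket or wire protocol is involved anywhere in this exchange, which is the point: the request arrives as a function call inside the interpreter.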
What this means is that a process can only implement PSGI if the process contains a Perl interpreter. To achieve this, the process might be implemented in Perl or it might be implemented in a language like C that can load the libperl.so shared library. Similarly a process can only implement WSGI if it contains a Python interpreter.
Your block diagram contains three parts, but in reality the PSGI application is inside the Starman process. So there are really only two parts (although both parts are multiprocess containers).
You say that nginx has uWSGI built in. This does not mean that a WSGI application runs inside the Nginx process. It means that the WSGI application runs in a separate uwsgi process and Nginx communicates with that process over a TCP socket using the uWSGI protocol. This is essentially the same model as Nginx with Starman behind it, but with the distinction that the socket connection to Starman will use the HTTP protocol:
.----------------------.          .-----------.
|       Starman        |          |   Nginx   |
|                      |   HTTP   |           |  HTTP
| .------------------. |<---------|           |<-------(internet)
| | PSGI Application | |          |           |
| '------------------' |          |           |
'----------------------'          '-----------'
The HTTP protocol does have higher overheads than the uwsgi protocol, so you could get better performance by running an application server that speaks the uwsgi socket protocol and can load libperl.so to implement the PSGI interface. uWSGI can do that:
.----------------------.          .----------.
|        uWSGI         |          |  Nginx   |
|                      |  uwsgi   |          |  HTTP
| .------------------. |<---------|          |<-------(internet)
| | PSGI Application | |          |          |
| '------------------' |          |          |
'----------------------'          '----------'

64k connection myth and NAT translation

I have a lot (tens of thousands) of connected mobile devices which maintain an open connection to a server. If my understanding of the 64k connection limitation is correct, you cannot have more than 64k connections to a single port of a server per client IP, because of the range of ephemeral ports on the client side in TCP/IP.
But most of the time, these devices are connected through a network provider which uses NAT to translate addresses (for example, a smartphone won't have a static IP address).
So in this context, my server will see the same IP address, and nothing guarantees that the source port won't be the same for 2 different clients.
My question may be dumb, but here it is: how can my server identify the correct connection in this situation if we think of a connection as the 5-tuple (protocol, server port, server IP, client IP, client port)? Is there a risk of losing a connection, or of conflicts between 2 different clients?
my server will see the same ip address and nothing guarantees that the source port won't be the same in 2 different clients [...] Is there a risk of losing a connection or conflicts?
No, that's the job of the router performing the NAT: keeping the IP:port combinations at one side linked to the ones on the other side.
So:
Client | IP | Src | < NAT > | IP | Src | Dest   | Dst
=====================================================
1      | .1 | 42  | <-----> | .3 | 1   | Server | 80
2      | .2 | 84  | <-----> | .3 | 2   | Server | 80
Given two clients, (source IP 10.0.0.1, source port 42) and (source IP 10.0.0.2, source port 84), that wish to connect to your server at port 80, the NAT will translate each IP:port pair to a pair that is valid on the other (right) side of the NAT (e.g. with IP 11.0.0.3), giving each a unique source port on that side of the NAT. It will keep this translation in memory in order to be able to send packets both ways.
You'll see that the tuples on the right side of the NAT (so what your server sees) are unique:
11.0.0.3:1 - Server:80
11.0.0.3:2 - Server:80
If the router determines that the possible tuples towards your server have been exhausted (so after 11.0.0.3:65535 - Server:80), it may refuse to open new connections to it.
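The bookkeeping described above can be modelled in a few lines. This is a toy sketch, not how any particular router implements NAT: the class name Nat and its methods are invented for the illustration, ports are handed out from 1 upwards to match the table, and real devices add timeouts and usually try to keep the original source port when it is free.

```python
class Nat:
    """Toy port-translating NAT: one public IP, one mapping per connection."""

    def __init__(self, public_ip, ports=range(1, 65536)):
        self.public_ip = public_ip
        self.ports = ports
        self.outbound = {}  # (src_ip, src_port, dst) -> public source port
        self.inbound = {}   # (public source port, dst) -> (src_ip, src_port)

    def translate(self, src_ip, src_port, dst):
        """Map an inside (IP, port) to a unique outside (IP, port) for dst."""
        key = (src_ip, src_port, dst)
        if key not in self.outbound:
            # Pick the first public port not yet used towards this destination.
            free = next(
                (p for p in self.ports if (p, dst) not in self.inbound), None
            )
            if free is None:
                raise RuntimeError(f"no free source ports left towards {dst}")
            self.outbound[key] = free
            self.inbound[(free, dst)] = (src_ip, src_port)
        return (self.public_ip, self.outbound[key])

    def reverse(self, public_port, dst):
        """Find the inside client a reply from dst belongs to."""
        return self.inbound[(public_port, dst)]


nat = Nat("11.0.0.3")
server = ("Server", 80)
first = nat.translate("10.0.0.1", 42, server)
second = nat.translate("10.0.0.2", 84, server)
# Two clients that happen to use the same source port still get distinct
# ports on the public side.
third = nat.translate("10.0.0.2", 42, server)
print(first, second, third)
print(nat.reverse(2, server))
```

The server only ever sees the right-hand tuples, which are unique by construction, and the RuntimeError branch corresponds to the router refusing new connections once every port towards that server is taken.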

Using TUN/TAP to read incoming data, encapsulate as UDP and transmit

I have a tun/tap device which is used to read incoming packets from one interface and send them as UDP packets via another interface. I could implement this, read ICMP packets sent to the tun/tap interface, and also receive them remotely over UDP. But the issue appears when I try to change the default gateway of the input interface to the tun/tap device so that I can read all the incoming data from the tun/tap. When this is done, I can't send the UDP packets because the routing isn't correct.
I also tried to use the "SO_BINDTODEVICE" socket option, but it still didn't work. Please note that I haven't used the write() method on the tun/tap. I just used the read() function, collected the data, and sent it over a UDP socket.
Please let me know if my approach is wrong, or if there is any other workaround to overcome this. Thanks.
/********More Details********/
Thanks Rob.
What I am trying to achieve is a simulation of IP-based header communication (ROHC) in a high-latency channel.
For this I have 4 virtual machines. VM1 is a normal desktop machine. VM2 is a gateway which takes the packets using tun/tap(from VM1) and does the UDP based communication with VM4. VM3 is the channel where parameters like latency, error rate etc can be set. VM4 is connected to the WAN. The user in VM1 should be able to browse the WAN just like normal. Please find the diagram below.
IP Packets
|
| +------------------+ +--------------+ +----------------+
'---|eth1..... | | | | |
| | | | | | |
| tun/tap | | eth0|___|UDP Sock eth0|___
| | | | | | | | | |
| ..UDP Sock|_____|eth1 | | | | | |
| | | | | +tun/tap+ | '
+------------------+ +--------------+ +----------------+ WAN
VM2 VM3(Channel) VM4
Update:
Thanks Tommi. Your solution worked. I could get the UDP packets one way to the final NAT gateway, but so far I could not get the reverse direction to work.
I tried enabling masquerading using iptables and also setting up the host route to the tun/tap at VM1, but it was not working.
I have a few queries regarding this.
1) In VM4 I receive the UDP data and write to the tun/tap. This will get routed to the WAN by the kernel. But for the incoming packet, do I again need to read using the tun/tap? In this case do I need to make the read and write in different threads? I am asking this because I need to transport them back also as UDP data. Let me know if I am missing something here.
Once again thanks a lot for your help.
Your UDP packets will get routed to your tun/tap interface, too (or, depending on some settings, they may just get discarded). You need to add a route for the UDP peer you are sending them to: a host route, or a smaller network route that won't interfere with your other communication.