Is it possible to use Traefik to proxy PostgreSQL over SSL? - postgresql

Motivations
I am a running into an issue when trying to proxy PostgreSQL with Traefik over SSL using Let's Encrypt.
I did some research but it is not well documented and I would like to confirm my observations and leave a record to everyone who faces this situation.
Configuration
I use latest versions of PostgreSQL v12 and Traefik v2. I want to build a pure TCP flow from tcp://example.com:5432 -> tcp://postgresql:5432 over TLS using Let's Encrypt.
Traefik service is configured as follow:
version: "3.6"
services:
traefik:
image: traefik:latest
restart: unless-stopped
volumes:
- "/var/run/docker.sock:/var/run/docker.sock:ro"
- "./configuration/traefik.toml:/etc/traefik/traefik.toml:ro"
- "./configuration/dynamic_conf.toml:/etc/traefik/dynamic_conf.toml"
- "./letsencrypt/acme.json:/acme.json"
networks:
- backend
ports:
- "80:80"
- "443:443"
- "5432:5432"
networks:
backend:
external: true
With the static setup:
[entryPoints]
[entryPoints.web]
address = ":80"
[entryPoints.web.http]
[entryPoints.web.http.redirections.entryPoint]
to = "websecure"
scheme = "https"
[entryPoints.websecure]
address = ":443"
[entryPoints.websecure.http]
[entryPoints.websecure.http.tls]
certresolver = "lets"
[entryPoints.postgres]
address = ":5432"
PostgreSQL service is configured as follow:
version: "3.6"
services:
postgresql:
image: postgres:latest
environment:
- POSTGRES_PASSWORD=secret
volumes:
- ./configuration/trial_config.conf:/etc/postgresql/postgresql.conf:ro
- ./configuration/trial_hba.conf:/etc/postgresql/pg_hba.conf:ro
- ./configuration/initdb:/docker-entrypoint-initdb.d
- postgresql-data:/var/lib/postgresql/data
networks:
- backend
#ports:
# - 5432:5432
labels:
- "traefik.enable=true"
- "traefik.docker.network=backend"
- "traefik.tcp.routers.postgres.entrypoints=postgres"
- "traefik.tcp.routers.postgres.rule=HostSNI(`example.com`)"
- "traefic.tcp.routers.postgres.tls=true"
- "traefik.tcp.routers.postgres.tls.certresolver=lets"
- "traefik.tcp.services.postgres.loadBalancer.server.port=5432"
networks:
backend:
external: true
volumes:
postgresql-data:
It seems my Traefik configuration is correct. Everything is OK in the logs and all sections in dashboard are flagged as Success (no Warnings, no Errors). So I am confident with the Traefik configuration above. The complete flow is about:
EntryPoint(':5432') -> HostSNI(`example.com`) -> TcpRouter(`postgres`) -> Service(`postgres#docker`)
But, it may have a limitation at PostgreSQL side.
Debug
The problem is that I cannot connect the PostgreSQL database. I always get a Timeout error.
I have checked PostgreSQL is listening properly (main cause of Timeout error):
# - Connection Settings -
listen_addresses = '*'
port = 5432
And I checked that I can connect PostgreSQL on the host (outside the container):
psql --host 172.19.0.4 -U postgres
Password for user postgres:
psql (12.2 (Ubuntu 12.2-4), server 12.3 (Debian 12.3-1.pgdg100+1))
Type "help" for help.
postgres=#
Thus I know PostgreSQL is listening outside its container, so Traefik should be able to bind the flow.
I also have checked external traefik can reach the server:
sudo tcpdump -i ens3 port 5432
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens3, link-type EN10MB (Ethernet), capture size 262144 bytes
09:02:37.878614 IP x.y-z-w.isp.com.61229 > example.com.postgresql: Flags [S], seq 1027429527, win 64240, options [mss 1452,nop,wscale 8,nop,nop,sackOK], length 0
09:02:37.879858 IP example.com.postgresql > x.y-z-w.isp.com.61229: Flags [S.], seq 3545496818, ack 1027429528, win 64240, options [mss 1460,nop,nop,sackOK,nop,wscale 7], length 0
09:02:37.922591 IP x.y-z-w.isp.com.61229 > example.com.postgresql: Flags [.], ack 1, win 516, length 0
09:02:37.922718 IP x.y-z-w.isp.com.61229 > example.com.postgresql: Flags [P.], seq 1:9, ack 1, win 516, length 8
09:02:37.922750 IP example.com.postgresql > x.y-z-w.isp.com.61229: Flags [.], ack 9, win 502, length 0
09:02:47.908808 IP x.y-z-w.isp.com.61229 > example.com.postgresql: Flags [F.], seq 9, ack 1, win 516, length 0
09:02:47.909578 IP example.com.postgresql > x.y-z-w.isp.com.61229: Flags [P.], seq 1:104, ack 10, win 502, length 103
09:02:47.909754 IP example.com.postgresql > x.y-z-w.isp.com.61229: Flags [F.], seq 104, ack 10, win 502, length 0
09:02:47.961826 IP x.y-z-w.isp.com.61229 > example.com.postgresql: Flags [R.], seq 10, ack 104, win 0, length 0
So, I am wondering why the connection cannot succeed. Something must be wrong between Traefik and PostgreSQL.
SNI incompatibility?
Even when I remove the TLS configuration, the problem is still there, so I don't expect the TLS to be the origin of this problem.
Then I searched and I found few posts relating similar issue:
Introducing SNI in TLS handshake for SSL connections
Traefik 2.0 TCP routing for multiple DBs;
As far as I understand it, the SSL protocol of PostgreSQL is a custom one and does not support SNI for now and might never support it. If it is correct, it will confirm that Traefik cannot proxy PostgreSQL for now and this is a limitation.
By writing this post I would like to confirm my observations and at the same time leave a visible record on Stack Overflow to anyone who faces the same problem and seek for help. My question is then: Is it possible to use Traefik to proxy PostgreSQL?
Update
Intersting observation, if using HostSNI('*') and Let's Encrypt:
labels:
- "traefik.enable=true"
- "traefik.docker.network=backend"
- "traefik.tcp.routers.postgres.entrypoints=postgres"
- "traefik.tcp.routers.postgres.rule=HostSNI(`*`)"
- "traefik.tcp.routers.postgres.tls=true"
- "traefik.tcp.routers.postgres.tls.certresolver=lets"
- "traefik.tcp.services.postgres.loadBalancer.server.port=5432"
Everything is flagged as success in Dashboard but of course Let's Encrypt cannot perform the DNS Challenge for wildcard *, it complaints in logs:
time="2020-08-12T10:25:22Z" level=error msg="Unable to obtain ACME certificate for domains \"*\": unable to generate a wildcard certificate in ACME provider for domain \"*\" : ACME needs a DNSChallenge" providerName=lets.acme routerName=postgres#docker rule="HostSNI(`*`)"
When I try the following configuration:
labels:
- "traefik.enable=true"
- "traefik.docker.network=backend"
- "traefik.tcp.routers.postgres.entrypoints=postgres"
- "traefik.tcp.routers.postgres.rule=HostSNI(`*`)"
- "traefik.tcp.routers.postgres.tls=true"
- "traefik.tcp.routers.postgres.tls.domains[0].main=example.com"
- "traefik.tcp.routers.postgres.tls.certresolver=lets"
- "traefik.tcp.services.postgres.loadBalancer.server.port=5432"
The error vanishes from logs and in both setups the dashboard seems ok but traffic is not routed to PostgreSQL (time out). Anyway, removing SSL from the configuration makes the flow complete (and unsecure):
labels:
- "traefik.enable=true"
- "traefik.docker.network=backend"
- "traefik.tcp.routers.postgres.entrypoints=postgres"
- "traefik.tcp.routers.postgres.rule=HostSNI(`*`)"
- "traefik.tcp.services.postgres.loadBalancer.server.port=5432"
Then it is possible to connect PostgreSQL database:
time="2020-08-12T10:30:52Z" level=debug msg="Handling connection from x.y.z.w:58389"

I'm using Traefik to proxy PostgreSQL, so answer is yes. But I'm not using TLS, because my setup is a bit different. First of all, if PostgreSQL doesn't support SNI, then I would suggest to try to modify labels, especially HostSNI rule to this:
"traefik.tcp.routers.postgres.rule=HostSNI(`*`)"
That says: ignore SNI and just take any name from specified entrypoint as valid.

SNI routing for postgres with STARTTLS has been added to Traefik in this PR. Now Treafik will listen to the initial bytes sent by postgres and if its going to initiate a TLS handshake (Note that postgres TLS requests are created as non-TLS first and then upgraded to TLS requests), Treafik will handle the handshake and then is able to receive the TLS headers from postgres, which contains the SNI information that it needs to route the request properly. This means that you can use HostSNI("example.com") along with tls to expose postgres databases under different subdomains.
As of writing this answer, I was able to get this working with the v3.0.0-beta2 image (Reference)

Related

consul - connect client to server

I'm new at consul and I try to setup a server-client environment. I have started my server with the following command and configuration:
consul.exe agent -ui -config-dir=P:\Consule\config
The config file looks the following ("P:\Consule\config\server.json")
{
"bootstrap": false,
"server": true,
"datacenter": "MyServices",
"data_dir": "P:\\Consule\\data",
"log_level": "INFO"
}
Output when I start consul from commandline with above command:
==> Starting Consul agent...
==> Consul agent running!
Version: 'v0.8.3'
Node ID: '1a244456-e725-44be-0549-33603ea7087d'
Node name: 'MYCOMPUTERNAMEA'
Datacenter: 'myservices'
Server: true (bootstrap: false)
Client Addr: 127.0.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600)
Cluster Addr: 127.0.0.1 (LAN: 8301, WAN: 8302)
Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
Atlas: <disabled>
Now, at another computer in my domain I try to run an consul client with follwoing commandline and config-file:
consul.exe agent -config-dir C:\Consul -bind=127.0.0.1
Config file ("C:\Consul\client.json")
{
"server": false,
"datacenter": "MyServices",
"data_dir": "C:\\TEMP",
"log_level": "INFO",
"start_join": ["MYCOMPUTERNAMEA"]
}
But I always get follwing output/error message:
==> Starting Consul agent...
==> Joining cluster...
==> 1 error(s) occurred:
* Failed to join <IP_OF_MYCOMPUTERNAMEA>: dial tcp <IP_OF_MYCOMPUTERNAMEA>:8301: connectex: No connection could be made because the target machine actively refused it.
Does anyone know what I'm doing wrong?
Thanks and best regards
I suppose, the reason is that your server is available only for 127.0.0.1 ip-address, which is localhost ip and available only from the same server. This can be seen here:
Client Addr: 127.0.0.1 (HTTP: 8500, HTTPS: -1, DNS: 8600)
Cluster Addr: 127.0.0.1 (LAN: 8301, WAN: 8302)
You have to configure your server, to make it listening all network interfaces or some specific interface, which have to be available from other server.
Try to run it with the client and advertise options set to 0.0.0.0 (or some specific ip). Read about it here and here.
And you might have to delete -bind=127.0.0.1 from the client configuration, since it might be available from the server too.

InfluxDB not getting packets from StatsD

I have an Amazon EC2 instance running and I am trying to set up StatsD+InfluxDB+Grafana. InfluxDB and Grafana work well (and Grafana sees the data from InfluxDB), but I can't manage to get any data from StatsD to InfluxDB.
I have a domain registered, which is pointed to my EC2 instance with an Elastic IP.
What I can see is that:
- I can perfectly interact with the InfluxDB database (including inserting values) when I don't use StatsD
- StatsD seems to be getting the data I randomly generate from Python (I can see it in its logs). It is sent through the port 8125 to StatsD.
- UTC packets sent from StatsD to InfluxDB through port 8086 seem to not be getting to InfluxDB (or not sending....?)
- Port 8086 is open on my AWS security settings for both TCP and UDP
- Port 8125 is open on my AWS security settings for UDP
I am wondering whether some of my settings are wrong, but I don't know what else to try:
InfluxDB configuration file contains:
# hostname = "localhost"
hostname = MYDOMAIN.com
[[udp]]
enabled = true
bind-address = ":8086"
database = "MY_DATABASE"
retention-policy = ""
batch-size = 1000 # will flush if this many points get buffered
batch-pending = 10 # number of batches that may be pending in memory
batch-timeout = "1s" # will flush at least this often even if we haven't hit buffer limit
read-buffer = 0 # UDP Read buffer size, 0 means OS default. UDP listener will fail if set above OS max.
udp-payload-size = 65536
My StatsD configuration file contains (among other things) the following lines:
{
influxdb: {
/*
host: '127.0.0.1', // InfluxDB host (default 127.0.0.1)
*/
host: 'MYDOMAIN.com', // InfluxDB host (default 127.0.0.1)
port: 8086, // InfluxDB port (default 8086)
database: 'MY_DATABASE', // InfluxDB db instance (required)
username: 'MY_USERNAME', // InfluxDB db username (required)
password: 'MY_PASSWORD', // InfluxDB db password (required)
flush: {
enable: true // enable regular flush strategy (default true)
},
proxy: {
enable: false, // enable the proxy strategy (default false)
suffix: 'raw', // metric name suffix (default 'raw')
flushInterval: 1000
}
},
port: 8125, // statsD port
backends: ['./backends/console'],
debug: true,
legacyNamespace: false
}
As far as I understand, the process is:
Python --> Port 8125 --> StatsD --> Port 8086 --> InfluxDB
Is there a need to use something like Telegraf or statsd-influxdb-backend to connect StatsD and InfluxDB?
I would truly appreciate any helps because I have been trying to set it up for hours and I don't see what could be wrong.
Thanks!
The part of the stack I'm not sure about is your StatsD server. It's probably having a problem posting the data to InfluxDB. If you use Telegraf instead it should "just work". Telegraf can act as a StatsD server (among many other things) and send data to InfluxDB via either UDP or the regular HTTP protocol.

Haproxy 1.6.2 not recognizing resolvers section

As a test, I have a local bind instance running:
>netstat -ant | grep LISTEN
tcp 0 0 10.72.186.23:53 0.0.0.0:* LISTEN
tcp 0 0 127.0.0.1:53 0.0.0.0:* LISTEN
...
>nslookup mysubdomain.example.com 127.0.0.1
Server: 127.0.0.1
Address: 127.0.0.1#53
Name: mysubdomain.example.com
Address: nn.nn.nn.251
Name: mysubdomain.example.com
Address: nn.nn.nn.249
Name: mysubdomain.example.com
Address: nn.nn.nn.201
Name: mysubdomain.example.com
Address: nn.nn.nn.138
I'm running haproxy 1.6.2 on the same host, with a resolvers section:
resolvers dns
nameserver dns1 127.0.0.1:53
nameserver dns2 10.72.186.23:53
hold valid 10s
It doesn't reject the resolvers section, but doesn't seem to be using it, either. It doesn't show in the stats page, and attempting to add this service command:
server mysubdomain-dev mysubdomain.example.com
causes this error:
>service haproxy restart
* Restarting haproxy haproxy
[ALERT] 322/171813 (10166) : parsing [/etc/haproxy/haproxy.cfg:77] : 'server mysubdomain-dev' : invalid address: 'mysubdomain.example.com' in 'mysubdomain.example.com'
[ALERT] 322/165300 (29751) : Error(s) found in configuration file : /etc/haproxy/haproxy.cfg
[ALERT] 322/165300 (29751) : Fatal errors found in configuration.
The haproxy doc https://cbonte.github.io/haproxy-dconv/configuration-1.6.html indicates this should work.
server <name> <address>[:[port]] [param*]
...
<address> is the IPv4 or IPv6 address of the server. Alternatively, a
resolvable hostname is supported, but this name will be resolved
during start-up. Address "0.0.0.0" or "*" has a special meaning.
Is there some other piece that needs to be added to the haproxy.cfg that activates the resolvers section?
When HAProxy first starts, it attempts to resolve the hostnames of any servers in all the backends to fill the server structures. During this first startup phase, HAProxy uses the OS resolver, i.e. generally the servers defined in your /etc/resolv.conf file.
Only later, when the server's IP addresses are updated during checks, HAProxy uses its internal resolver configuration and its internal DNS resolver.
From your error description, it now seems as if your host itself can not resolve the mysubdomain.example.com hostname. HAProxy will only be able to start if it can resolve the hostnames without an explicit named nameserver. This can be verified with e.g.
dig mysubdomain.example.com
might be you are not specifying the resolvers to use for that server
server mysubdomain-dev mysubdomain.example.com ->
server mysubdomain-dev mysubdomain.example.com resolvers dns

Nginx can't access a uWSGI unix socket on CentOS 7

I have configured uWSGI to serve my Django app on a unix socket, and Nginx as a proxy to this socket. The server is running CentOS 7. I think I have configured Nginx so that it has permission to read and write to uWSGI's socket, but I'm still getting a permission denied error. Why can't Nginx access the uWSGI socket on CentOS 7?
[uwsgi]
socket=/socket/uwsgi.sock
virtualenv=/home/site/virtsite/
chdir=/home/site/wsgitest/
module=wsgitest.wsgi:application
vhost = true
master=True
workers=8
chmod-socket=666
pidfile=/home/site/wsgitest/uwsgi-master.pid
max-requests=5000
chown-socket=nginx:nginx
uid = nginx
gid = nginx
listen.owner = nginx
listen.group = nginx
server {
listen 80;
location / {
uwsgi_pass unix:///home/site/wsgitest/uwsgi.sock;
include uwsgi_params;
}
}
uwsgi --ini uwsgi.ini (as root)
ls -l /home/site/wsgitest/uwsgi.sock
srwxrwxrwx. 1 nginx nginx 0 Oct 13 10:05 uwsgi.sock
2014/10/12 19:01:44 [crit] 19365#0: *10 connect() to unix:///socket/uwsgi.sock failed (13: Permission denied) while connecting to upstream, client: 2.191.102.217, server: , request: "GET / HTTP/1.1", upstream: "uwsgi://unix:///socket/uwsgi.sock:", host: "179.227.126.222"
The Nginx and uWSGI configurations are correct. The problem is that SELinux denied Nginx access to the socket. This results in a generic access denied error in Nginx's log. The important messages are actually in SELinux's audit log.
# show the new rules to be generated
grep nginx /var/log/audit/audit.log | audit2allow
# show the full rules to be applied
grep nginx /var/log/audit/audit.log | audit2allow -m nginx
# generate the rules to be applied
grep nginx /var/log/audit/audit.log | audit2allow -M nginx
# apply the rules
semodule -i nginx.pp
You may need to generate the rules multiple times, trying to access the site after each pass, since the first SELinux error might not be the only one that can be generated. Always inspect the policy that audit2allow suggests creating.
These steps were taken from this blog post which contains more details about how to investigate and what output you'll get.
Configure your uwsgi.ini with uid and gid user.
#uwsgi.ini
uid = nginx
gid = nginx
Regards,
I wished I could comment :(
Everything looks fine from here except unix socket path
unix:///socket/uwsgi.sock failed (2: No such file or directory)
Docs says it has just one slash
uwsgi_pass unix:/tmp/uwsgi.socket;

CentOS 6.3 Samba share over internet not working

Summary:
This is a 2 part question. A simple Samba share on one ISP with router doesn't work while another ISP with a different router setup the same and a similar server with same Samba configuration works.
It seems to be either the router not forwarding the ports, although it successfully forwards SSH and others, or the ISP somehow blocking the standard Samba ports. It still bugs me that I can't figure out why it doesnt work and I'll still try to narrow down the cause.
The second question is I'm looking for a business use, simple, easy to use (for end users), secure share for a small number of people and files, hosted internally and accessible externally on the internet, between Windows 7, XP, Mac, and linux servers with simple clients for end users.
A new friend outside of stackoverflow helped with sshfs as a solution. On CentOS ssh already supports sshfs. The Windows client win-sshfs is working well and I'll be trying OSXFUSE with MACFusion described at UO.
Additionally, setup linux users for each person. To allow write by everyone in the linux group, change the umask in /etc/ssh/sshd_config described in this question at serverfault. People get to their home directory first, where I placed links to a shared folder with sticky bit set so they can't delete the folder. They can delete the links but that's easy enough to put back. The only issues I can see are lack of file locking and lack of auto-refresh.
Original Question:
I can't seem to get Samba working on a Centos 6.3 server over the internet. I have a similar test server on another internet connection working fine with the exact same setup. I've gone through http://www.samba.org/samba/docs/man/Samba-HOWTO-Collection/diagnosis.html twice, made sure the ports are forwarded through to the internet (although not sure how to test they are really open), double checked samba configuration, its only sharing /tmp simply now. The user account is setup, it can ssh in and get to /tmp and the samba password is set the same. I can't ping the server but that is because the router or IP is set not pingable by the owner/work. SSH and HTTPS apache work well on the server with ports forwarded the same way. I haven't been able to test the share within the local network yet since I am not there, but I assume that it should work internally. When trying to connect from Windows 7 it just times out, no prompt and it has never connected, whereas my test server on my own internet connection is always working internally and externally.
Any help would be greatly appreciated.
The requirement is a easy to use internally hosted shared folder alternative to using "dropbox" for use between Windows 7, XP, mac, and linux servers that works over external internet connection. It won't see heavy usage but should be quick, easy to access/setup on the client side, and secure for business. If there are any alternatives to install on CentOS that would be great as well.
Thank you!
Andrew
Edit, details:
Ports are forwarded:
(I had an image but as new user I cant post) 137, 138, 139, 445 are forwarded all with both TCP and UDP for testing now.
smb.conf is setup simply and exactly the same as the working test server:
# cat /etc/samba/smb.conf
[global]
workgroup=WORKGROUP
log level = 3
log file = /var/log/samba/log.%m
max log size = 50
security = user
passdb backend = tdbsam
[tmp]
comment = temporary files
path = /tmp
read only = yes
Samba restarted for good measure:
# service smb restart
Shutting down SMB services: [ OK ]
Starting SMB services: [ OK ]
Windows 7 times out when trying to access the share as \ which works fine with the test server:
(I had a screenshot but new users cant post)
A search for the error 0x80004005 results in http://answers.microsoft.com/en-us/windows/forum/windows_vista-networking/cannot-access-network-share-get-unspecified-error/9f840844-9d5b-e011-8dfc-68b599b31bf5
I've checked the workgroup, share settings, and restarted windows. Since the test share works I would think the Windows machine is working. I'll continue with the details.
Edit again:
Following the troubleshooting guide again:
Simplify the smb.conf to just:
# cat /etc/samba/smb.conf
[tmp]
comment = temporary files
path = /tmp
read only = yes
/etc/resolv.conf is using the ISPs servers and they work. They are different than the working server's DNS but that one is on a different ISP:
# nslookup google.com
Server: 71.242.0.12
Address: 71.242.0.12#53
Non-authoritative answer:
Name: google.com
Address: 74.125.228.2
I'm doing everything with IP addresses so I don't know that DNS would come into play.
I added dns proxy = no to smb.conf for fun but that didn't help.
/var/log/samba/log.smbd doesn't report anything different from the working server:
[2012/09/20 16:59:41, 0] smbd/server.c:1141(main)
smbd version 3.5.10-125.el6 started.
Copyright Andrew Tridgell and the Samba Team 1992-2010
[2012/09/20 16:59:41.484699, 0] param/loadparm.c:7648(lp_do_parameter)
Global parameter dns proxy found in service section!
[2012/09/20 16:59:41.486645, 0] printing/print_cups.c:109(cups_connect)
Unable to connect to CUPS server localhost:631 - Connection refused
[2012/09/20 16:59:41.486809, 0] printing/print_cups.c:468(cups_async_callback)
failed to retrieve printer list: NT_STATUS_UNSUCCESSFUL
[2012/09/20 16:59:41.507198, 0] smbd/server.c:501(smbd_open_one_socket)
smbd_open_once_socket: open_socket_in: Address already in use
[2012/09/20 16:59:41.507407, 0] smbd/server.c:501(smbd_open_one_socket)
smbd_open_once_socket: open_socket_in: Address already in use
[2012/09/20 17:00:39, 0] smbd/server.c:1141(main)
smbd version 3.5.10-125.el6 started.
Copyright Andrew Tridgell and the Samba Team 1992-2010
[2012/09/20 17:00:39.513793, 0] printing/print_cups.c:109(cups_connect)
Unable to connect to CUPS server localhost:631 - Connection refused
[2012/09/20 17:00:39.513955, 0] printing/print_cups.c:468(cups_async_callback)
failed to retrieve printer list: NT_STATUS_UNSUCCESSFUL
[2012/09/20 17:00:39.535458, 0] smbd/server.c:501(smbd_open_one_socket)
smbd_open_once_socket: open_socket_in: Address already in use
[2012/09/20 17:00:39.535689, 0] smbd/server.c:501(smbd_open_one_socket)
smbd_open_once_socket: open_socket_in: Address already in use
However the working server creates a log file in the directory named log. which the non working server does not.
testparm:
# testparm
Load smb config files from /etc/samba/smb.conf
rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384)
Processing section "[tmp]"
Loaded services file OK.
Server role: ROLE_STANDALONE
Press enter to see a dump of your service definitions
[global]
[tmp]
comment = temporary files
path = /tmp
continuing...
Continued:
nmb is running as well:
# service nmb restart
Shutting down NMB services: [ OK ]
Starting NMB services: [ OK ]
"Respond to Ping on Internet Port" is normally turned off on the routers. I turned it on, on both the Windows client and the server. Each can ping the other, sharing still doesn't work.
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.
C:\Users\xxxx>ping xxxx
Pinging xxxx with 32 bytes of data:
Reply from xxxx: bytes=32 time=25ms TTL=51
Reply from xxxx: bytes=32 time=23ms TTL=51
Reply from xxxx: bytes=32 time=26ms TTL=51
Reply from xxxx: bytes=32 time=24ms TTL=51
Ping statistics for xxxx:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 23ms, Maximum = 26ms, Average = 24ms
# ping xxxx -c 5
PING xxxx (xxxx) 56(84) bytes of data.
64 bytes from xxxx: icmp_seq=1 ttl=251 time=20.7 ms
64 bytes from xxxx: icmp_seq=2 ttl=251 time=24.6 ms
64 bytes from xxxx: icmp_seq=3 ttl=251 time=21.4 ms
64 bytes from xxxx: icmp_seq=4 ttl=251 time=25.3 ms
64 bytes from xxxx: icmp_seq=5 ttl=251 time=22.9 ms
--- xxxx ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4029ms
rtt min/avg/max/mdev = 20.776/23.022/25.319/1.764 ms
continuing...
Continued:
iptables are off:
# iptables -L -v
Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
pkts bytes target prot opt in out source destination
SELinux is off:
# sestatus
SELinux status: disabled
smbclient using a user setup in samba works from the samba server to its local IP and to its external IP. The Windows client gets:
Connection to <ip addr> failed (Error NT_STATUS_UNSUCCESSFUL)
Samba is running as a daemon/service and netbios-ssn is in listen mode:
# netstat -a|grep netbios-ssn
tcp 0 0 *:netbios-ssn *:* LISTEN
Continuing...
Continued:
We're not restricting connections or using inetd.
log.nmbd does not report any problems.
nmblookup -B BIGSERVER SAMBA works using the server's name
nmblookup -B ACLIENT * fails on all log files using the windows client name OR the external IP address
nmblookup -d 2 `*'. fails
"If your PC and server aren't on the same subnet, then you will need to use the -B option to set the broadcast address to that of the PC's subnet.
This test will probably fail if your subnet mask and broadcast address are not correct. (Refer to test 3 notes above)."
Im not sure here, since we're going over the internet do we need these to match and work?
smbclient //BIGSERVER/TMP works
On the client:
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.
C:\Users\xxxx>net view \\xxxx (ip addr)
System error 53 has occurred.
The network path was not found.
C:\Users\xxxx>
net use has the same problem, even with providing user and passwd.
nmblookup -M WORKGROUP returns a local windows machine on the network there, whereas on my test server it returns the client which is local to the test machine. Perhaps there is an issue here with workgroup being on another machine, but how would others connect from other networks if this was the issue?
I tried preferred master = yes as well.
Page 2 of samba howto next.
Update: A new friend said to try nmap to see check the ports:
# nmap -sS -P0 -sV -O xxxx
Starting Nmap 5.51 ( ) at 2012-09-21 11:09 EDT
Nmap scan report for xxxx (xxxx)
Host is up (0.024s latency).
Not shown: 995 filtered ports
PORT STATE SERVICE VERSION
22/tcp open ssh OpenSSH 5.3 (protocol 2.0)
25/tcp open smtp Postfix smtpd
110/tcp open pop3 Dovecot pop3d
443/tcp open ssl/http Apache httpd 2.2.15 ((CentOS))
9100/tcp open jetdirect?
Warning: OSScan results may be unreliable because we could not find at
least 1 open and 1 closed port
OS fingerprint not ideal because: Missing a closed TCP port so results
incomplete
No OS matches for host
Service Info: Host: xxxx
Since the Samba ports do not show up, I'm thinking the router or ISP is not forwarding/blocking the ports at this point.
As for a solution to sharing, I'm trying sshfs with a windows and mac client.
Answering your original question, the good way to test if your ISP is not blocking listed ports is this:
# yum -y install tcpdump
# tcpdump -i eth0 "port 137 or port 138 or port 139 or port 445"
(substitute eth0 with the name of the interface connected to the Internet).
Then you should try accessing the share (net view / net use / Windows Shell). If ports are forwarded correctly you should see something like that:
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
01:25:48.631173 IP 192.168.0.10.54032 > 192.168.0.1.microsoft-ds: Flags [S], seq 4008761512, win 5840, options [mss 1460,sackOK,TS val 136010468 ecr 0,nop,wscale 7], length 0
01:25:48.631198 IP 192.168.0.1.microsoft-ds > 192.168.0.10.54032: Flags [S.], seq 2220435566, ack 4008761513, win 14480, options [mss 1460,sackOK,TS val 15507714 ecr 136010468,nop,wscale 7], length 0
01:25:48.631397 IP 192.168.0.10.54032 > 192.168.0.1.microsoft-ds: Flags [.], ack 1, win 46, options [nop,nop,TS val 136010468 ecr 15507714], length 0
01:25:48.642171 IP 192.168.0.10.54032 > 192.168.0.1.microsoft-ds: Flags [P.], seq 1:184, ack 1, win 46, options [nop,nop,TS val 136010479 ecr 15507714], length 183SMB PACKET: SMBnegprot (REQUEST)
...
If you see nothing at all it means that your ISP (or intermediate router) is blocking packets to those ports and it's most likely the case — SMB protocol proved to be quite insecure for open Internet deployments.
In the file /etc/samba/smb.conf, under the section [global], below the workgroup line add this two lines :
client min protocol = NT1
client max protocol = SMB3