Ansible hangs on setup or playbook but SSH works fine - server

I'm having an odd issue with Ansible connecting to a host (any host) and hoping someone can spot something I'm not seeing. I can SSH directly to the host without any issue. I can run -m ping without issue. But that's where success ends. If I run -m setup, it appears to connect and gather some info, but subsequent connections fail.
This is a server spun up on Proxmox (7.2.11). I've done this hundreds of times without issue, which is why I can't seem to identify what has changed. I typically spin up a container and set it up with an SSH key (requiring a passphrase) for root. If it's a VM, I simply copy the public key to the root user's authorized_keys. I then run an Ansible playbook to add the user(s) and services and to lock down SSH, so my playbooks initially run as the root user. Ansible has always prompted for the passphrase and then gone along its merry way.
I'm using pipelining, but I've also set it to false while testing.
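For example, one way to toggle it per run without editing ansible.cfg is the ANSIBLE_PIPELINING environment variable:
ANSIBLE_PIPELINING=False ansible all -vvv -i ./inventory.yml -m setup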
Appreciate any insight you may have... Thank you
Here's the output of a simple fact gather. You can see that the first two SSH: EXEC calls return a result, but the third connection hangs.
➜ ansible git:(main) ✗ ansible all -vvv -i ./inventory.yml -m setup
ansible 2.10.8
config file = /home/johndoe/NAS1-Mounts/Code/ansible/ansible.cfg
configured module search path = ['/home/johndoe/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python3/dist-packages/ansible
executable location = /usr/bin/ansible
python version = 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]
Using /home/johndoe/NAS1-Mounts/Code/ansible/ansible.cfg as config file
host_list declined parsing /home/johndoe/NAS1-Mounts/Code/ansible/inventory.yml as it did not pass its verify_file() method
script declined parsing /home/johndoe/NAS1-Mounts/Code/ansible/inventory.yml as it did not pass its verify_file() method
Parsed /home/johndoe/NAS1-Mounts/Code/ansible/inventory.yml inventory source with ini plugin
Skipping callback 'default', as we already have a stdout callback.
Skipping callback 'minimal', as we already have a stdout callback.
Skipping callback 'oneline', as we already have a stdout callback.
META: ran handlers
<target_server> Attempting python interpreter discovery
<10.2.0.27> ESTABLISH SSH CONNECTION FOR USER: root
<10.2.0.27> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="root"' -o ConnectTimeout=10 -o ControlPath=/home/johndoe/.dotfiles/ansible/.ansible/cp/27e670244a 10.2.0.27 '/bin/sh -c '"'"'echo PLATFORM; uname; echo FOUND; command -v '"'"'"'"'"'"'"'"'/usr/bin/python'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python3.9'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python3.8'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python3.7'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python3.6'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python3.5'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python2.7'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python2.6'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'/usr/libexec/platform-python'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'/usr/bin/python3'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python'"'"'"'"'"'"'"'"'; echo ENDFOUND && sleep 0'"'"''
<10.2.0.27> (0, b'PLATFORM\nLinux\nFOUND\n/usr/bin/python3\nENDFOUND\n', b'')
<10.2.0.27> ESTABLISH SSH CONNECTION FOR USER: root
<10.2.0.27> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="root"' -o ConnectTimeout=10 -o ControlPath=/home/johndoe/.dotfiles/ansible/.ansible/cp/27e670244a 10.2.0.27 '/bin/sh -c '"'"'/usr/bin/python3 && sleep 0'"'"''
<10.2.0.27> (0, b'{"platform_dist_result": [], "osrelease_content": "PRETTY_NAME=\\"Ubuntu 22.04.1 LTS\\"\\nNAME=\\"Ubuntu\\"\\nVERSION_ID=\\"22.04\\"\\nVERSION=\\"22.04.1 LTS (Jammy Jellyfish)\\"\\nVERSION_CODENAME=jammy\\nID=ubuntu\\nID_LIKE=debian\\nHOME_URL=\\"https://www.ubuntu.com/\\"\\nSUPPORT_URL=\\"https://help.ubuntu.com/\\"\\nBUG_REPORT_URL=\\"https://bugs.launchpad.net/ubuntu/\\"\\nPRIVACY_POLICY_URL=\\"https://www.ubuntu.com/legal/terms-and-policies/privacy-policy\\"\\nUBUNTU_CODENAME=jammy\\n"}\n', b'')
Using module file /usr/lib/python3/dist-packages/ansible/modules/setup.py
Pipelining is enabled.
<10.2.0.27> ESTABLISH SSH CONNECTION FOR USER: root
<10.2.0.27> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="root"' -o ConnectTimeout=10 -o ControlPath=/home/johndoe/.dotfiles/ansible/.ansible/cp/27e670244a 10.2.0.27 '/bin/sh -c '"'"'/usr/bin/python3 && sleep 0'"'"''
^C [ERROR]: User interrupted execution
➜ ansible git:(main) ✗
And here's -m ping, which succeeds:
➜ ansible git:(main) ✗ ansible all -vvv -i ./inventory.yml -m ping
ansible 2.10.8
config file = /home/johndoe/NAS1-Mounts/Code/ansible/ansible.cfg
configured module search path = ['/home/johndoe/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python3/dist-packages/ansible
executable location = /usr/bin/ansible
python version = 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]
Using /home/johndoe/NAS1-Mounts/Code/ansible/ansible.cfg as config file
host_list declined parsing /home/johndoe/NAS1-Mounts/Code/ansible/inventory.yml as it did not pass its verify_file() method
script declined parsing /home/johndoe/NAS1-Mounts/Code/ansible/inventory.yml as it did not pass its verify_file() method
Parsed /home/johndoe/NAS1-Mounts/Code/ansible/inventory.yml inventory source with ini plugin
Skipping callback 'default', as we already have a stdout callback.
Skipping callback 'minimal', as we already have a stdout callback.
Skipping callback 'oneline', as we already have a stdout callback.
META: ran handlers
<target_server> Attempting python interpreter discovery
<10.2.0.27> ESTABLISH SSH CONNECTION FOR USER: root
<10.2.0.27> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="root"' -o ConnectTimeout=10 -o ControlPath=/home/johndoe/.dotfiles/ansible/.ansible/cp/27e670244a 10.2.0.27 '/bin/sh -c '"'"'echo PLATFORM; uname; echo FOUND; command -v '"'"'"'"'"'"'"'"'/usr/bin/python'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python3.9'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python3.8'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python3.7'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python3.6'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python3.5'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python2.7'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python2.6'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'/usr/libexec/platform-python'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'/usr/bin/python3'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python'"'"'"'"'"'"'"'"'; echo ENDFOUND && sleep 0'"'"''
<10.2.0.27> (0, b'PLATFORM\nLinux\nFOUND\n/usr/bin/python3\nENDFOUND\n', b'')
<10.2.0.27> ESTABLISH SSH CONNECTION FOR USER: root
<10.2.0.27> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="root"' -o ConnectTimeout=10 -o ControlPath=/home/johndoe/.dotfiles/ansible/.ansible/cp/27e670244a 10.2.0.27 '/bin/sh -c '"'"'/usr/bin/python3 && sleep 0'"'"''
<10.2.0.27> (0, b'{"platform_dist_result": [], "osrelease_content": "PRETTY_NAME=\\"Ubuntu 22.04.1 LTS\\"\\nNAME=\\"Ubuntu\\"\\nVERSION_ID=\\"22.04\\"\\nVERSION=\\"22.04.1 LTS (Jammy Jellyfish)\\"\\nVERSION_CODENAME=jammy\\nID=ubuntu\\nID_LIKE=debian\\nHOME_URL=\\"https://www.ubuntu.com/\\"\\nSUPPORT_URL=\\"https://help.ubuntu.com/\\"\\nBUG_REPORT_URL=\\"https://bugs.launchpad.net/ubuntu/\\"\\nPRIVACY_POLICY_URL=\\"https://www.ubuntu.com/legal/terms-and-policies/privacy-policy\\"\\nUBUNTU_CODENAME=jammy\\n"}\n', b'')
Using module file /usr/lib/python3/dist-packages/ansible/modules/ping.py
Pipelining is enabled.
<10.2.0.27> ESTABLISH SSH CONNECTION FOR USER: root
<10.2.0.27> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="root"' -o ConnectTimeout=10 -o ControlPath=/home/johndoe/.dotfiles/ansible/.ansible/cp/27e670244a 10.2.0.27 '/bin/sh -c '"'"'/usr/bin/python3 && sleep 0'"'"''
<10.2.0.27> (0, b'\n{"ping": "pong", "invocation": {"module_args": {"data": "pong"}}}\n', b'')
target_server | SUCCESS => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/bin/python3"
},
"changed": false,
"invocation": {
"module_args": {
"data": "pong"
}
},
"ping": "pong"
}
META: ran handlers
META: ran handlers
ansible.cfg
➜ ansible git:(main) ✗ cat ansible.cfg
[defaults]
inventory = /home/johndoe/NAS1-Mounts/Code/ansible/inventory.yml
# Use the Beautiful Output callback plugin.
stdout_callback = beautiful_output
# Use specific ssh key and user
# ed25519 w/ passphrase
private_key_file = /home/johndoe/.ssh/johndoe_default
host_key_checking = False
# For updates/maintenance as sudo user
remote_user = johndoe
# Set remote host working directory
remote_tmp = ~/.ansible/tmp
# Misc
allow_world_readable_tmpfiles = True
display_skipped_hosts = False
# display_args_to_stdout = True
# stdout_callback = full_skip
transport = ssh
[ssh_connection]
pipelining = True
timeout = 30
[connection]
pipelining = True
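For what it's worth, ansible-config dump --only-changed shows which of these settings Ansible actually picks up from this file:
ansible-config dump --only-changed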
My inventory.yml
➜ ansible git:(main) ✗ cat inventory.yml
# Vagrant Host
#default
[workstation]
[server]
target_server ansible_user=root ansible_host=10.2.0.27 install_docker=true
[pve_container]
My .ssh/config file
➜ ansible git:(main) ✗ cat ~/.ssh/config
# Defaults
Host *
# Default ed25519 Keypair for all connections - unless otherwise specified
IdentityFile ~/.ssh/johndoe_default
IdentitiesOnly yes
# Always use multiplexed sessions - unless otherwise specified in a host definition below
ControlMaster auto
ControlPath /tmp/ssh-%r@%h:%p
ControlPersist 10m
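While testing I also ruled out a stale multiplexed session by checking and closing the control master for the host (these use the ControlPath from the config above):
ssh -O check root@10.2.0.27
ssh -O exit root@10.2.0.27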
SSH directly to the host
➜ ansible git:(main) ✗ ssh root@10.2.0.27
Welcome to Ubuntu 22.04.1 LTS (GNU/Linux 5.15.0-56-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
System information as of Wed Dec 14 05:06:20 PM UTC 2022
System load: 0.0 Processes: 117
Usage of /: 31.8% of 14.66GB Users logged in: 1
Memory usage: 11% IPv4 address for ens18: 10.2.0.27
Swap usage: 0%
50 updates can be applied immediately.
To see these additional updates run: apt list --upgradable
Last login: Wed Dec 14 17:05:53 2022 from 10.0.2.5
root@ubuntu-ansible-test:~#

My apologies for wasting anyone's time. My issue turned out to be an MTU problem with the tunnel to the remote site. Someone had set the WireGuard tunnel up with an MTU of 1500. A packet capture (pcap) showed a bunch of TCP Out-of-Order, TCP Dup ACK, and TCP Retransmission segments. Setting it back to 1420 cured the issue.
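In case it helps anyone else, this is roughly how I verified and corrected it (the interface name wg0 is an assumption; adjust to your tunnel):
# Show the current MTU of the WireGuard interface
ip link show wg0
# Set it back to 1420 (WireGuard's usual default over an IPv4 underlay)
sudo ip link set mtu 1420 dev wg0
# To persist with wg-quick, set "MTU = 1420" in the [Interface] section of wg0.conf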
Best

Related

what is the root password of telepresence in kubernetes remote debugging

I am using telepresence for remote debugging of a Kubernetes cluster, and I log in to the cluster using the command:
telepresence
But when I want to install some software in the telepresence pod:
sudo apt-get install wget
I do not know the password of the telepresence pod, so what should I do to install the software?
You could use this script to log in to the pod as root:
#!/usr/bin/env bash
set -xe
# Resolve the node and the Docker container ID from the pod description
POD=$(kubectl describe pod "$1")
NODE=$(echo "$POD" | grep -m1 Node | awk -F'/' '{print $2}')
CONTAINER=$(echo "$POD" | grep -m1 'Container ID' | awk -F 'docker://' '{print $2}')
# Shell to run inside the container; defaults to bash, second argument overrides it
CONTAINER_SHELL=${2:-bash}
set +e
# SSH to the node and exec into the container as root (user 0)
ssh -t "$NODE" sudo docker exec --user 0 -it "$CONTAINER" "$CONTAINER_SHELL"
if [ "$?" -gt 0 ]; then
    set +x
    echo 'SSH into pod failed. If you see an error message similar to "executable file not found in $PATH", please try:'
    echo "$0 $1 sh"
fi
Log in like this:
./login-k8s-pod.sh flink-taskmanager-54d85f57c7-wd2nb

Rsync with sshpass on Linux using systemd: 'Host key verification failed.'

I am trying to set up rsync with sshpass on a Raspberry Pi to connect to a Synology drive in order to synchronize data.
The listed command:
sshpass -p 'password' rsync -avz -e 'ssh -p 22' /home/pi host@IP::home/example
works fine if I run it manually at the command prompt. It also works when I embed it in a Python script using the subprocess module:
import subprocess
subprocess.run("sshpass -p 'password' rsync -avz -e 'ssh -p 22' /home/pi host@IP::home/example", shell=True)
Whenever I autostart the Python script as a systemd service, however, I get the following error:
Host key verification failed.
rsync error: received SIGINT, SIGTERM or SIGHUP (code 20) at rsync.c(644) [sender=3.1.3]
I am wondering: what is the difference between the command prompt and systemd in this case?
Thank you so much in advance for your help! I really appreciate every tip!
Kilian
The rude way is to add -o "StrictHostKeyChecking=no" to your SSH command:
sshpass -p 'password' rsync -avz -e 'ssh -o "StrictHostKeyChecking=no" -p 22' /home/pi host@IP::home/example
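A gentler alternative is to pre-populate known_hosts for the account the systemd unit runs as (often root, which is why the interactive run works but the service fails), so strict host-key checking can stay on. A sketch, with the Synology address as a placeholder:
# Accept the Synology's host key once, for the user the service runs as
ssh-keyscan -p 22 synology.example.com >> /root/.ssh/known_hosts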

ICP 1.2.0 Module failure

During the ICP 1.2.0 installation process I encounter the following error:
TASK [kubelet : Starting Kubelet container on Worker nodes] ********************
task path: /installer/playbook/roles/kubelet/tasks/kubelet.yaml:3
Using module file /usr/lib/python2.7/dist-packages/ansible/modules/cloud/docker/docker_container.py
<192.168.240.14> ESTABLISH SSH CONNECTION FOR USER: user
<192.168.240.14> SSH: EXEC sshpass -d10 ssh -C -o CheckHostIP=no -o LogLevel=ERROR -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o Port=22 -o 'IdentityFile="cluster/ssh_key"' -o User=user -o ConnectTimeout=10 -oPubkeyAuthentication=no 192.168.240.14 '/bin/bash -c '"'"'echo ~ && sleep 0'"'"''
<192.168.240.14> (0, '/home/user\n', '')
<192.168.240.14> ESTABLISH SSH CONNECTION FOR USER: user
<192.168.240.14> SSH: EXEC sshpass -d10 ssh -C -o CheckHostIP=no -o LogLevel=ERROR -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o Port=22 -o 'IdentityFile="cluster/ssh_key"' -o User=user -o ConnectTimeout=10 -oPubkeyAuthentication=no 192.168.240.14 '/bin/bash -c '"'"'( umask 77 && mkdir -p "` echo /home/user/.ansible/tmp/ansible-tmp-1529485552.37-109409849437093 `" && echo ansible-tmp-1529485552.37-109409849437093="` echo /home/user/.ansible/tmp/ansible-tmp-1529485552.37-109409849437093 `" ) && sleep 0'"'"''
<192.168.240.14> (0, 'ansible-tmp-1529485552.37-109409849437093=/home/user/.ansible/tmp/ansible-tmp-1529485552.37-109409849437093\n', '')
<192.168.240.14> PUT /tmp/tmpQDhbak TO /home/user/.ansible/tmp/ansible-tmp-1529485552.37-109409849437093/docker_container.py
<192.168.240.14> SSH: EXEC sshpass -d10 sftp -o BatchMode=no -b - -C -o CheckHostIP=no -o LogLevel=ERROR -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o Port=22 -o 'IdentityFile="cluster/ssh_key"' -o User=user -o ConnectTimeout=10 -oPubkeyAuthentication=no '[192.168.240.14]'
<192.168.240.14> (0, 'sftp> put /tmp/tmpQDhbak /home/user/.ansible/tmp/ansible-tmp-1529485552.37-109409849437093/docker_container.py\n', '')
<192.168.240.14> ESTABLISH SSH CONNECTION FOR USER: user
<192.168.240.14> SSH: EXEC sshpass -d10 ssh -C -o CheckHostIP=no -o LogLevel=ERROR -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o Port=22 -o 'IdentityFile="cluster/ssh_key"' -o User=user -o ConnectTimeout=10 -oPubkeyAuthentication=no 192.168.240.14 '/bin/bash -c '"'"'chmod u+x /home/user/.ansible/tmp/ansible-tmp-1529485552.37-109409849437093/ /home/user/.ansible/tmp/ansible-tmp-1529485552.37-109409849437093/docker_container.py && sleep 0'"'"''
<192.168.240.14> (0, '', '')
<192.168.240.14> ESTABLISH SSH CONNECTION FOR USER: user
<192.168.240.14> SSH: EXEC sshpass -d10 ssh -C -o CheckHostIP=no -o LogLevel=ERROR -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o Port=22 -o 'IdentityFile="cluster/ssh_key"' -o User=user -o ConnectTimeout=10 -oPubkeyAuthentication=no -tt 192.168.240.14 '/bin/bash -c '"'"'sudo -H -S -i -p "[sudo via ansible, key=iunllbazxshyeergbibbpevmrjmrbrte] password: " -u root /bin/bash -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-iunllbazxshyeergbibbpevmrjmrbrte; /usr/bin/python /home/user/.ansible/tmp/ansible-tmp-1529485552.37-109409849437093/docker_container.py; rm -rf "/home/user/.ansible/tmp/ansible-tmp-1529485552.37-109409849437093/" > /dev/null 2>&1'"'"'"'"'"'"'"'"' && sleep 0'"'"''
<192.168.240.14> (0, '\r\nTraceback (most recent call last):\r\n File "/tmp/ansible_L67oqX/ansible_module_docker_container.py", line 660, in <module>\r\n from ansible.module_utils.docker_common import *\r\n File "/tmp/ansible_L67oqX/ansible_modlib.zip/ansible/module_utils/docker_common.py", line 34, in <module>\r\n File "/root/.local/lib/python2.7/site-packages/requests-2.18.4-py2.7.egg/requests/__init__.py", line 84, in <module>\r\n from urllib3.contrib import pyopenssl\r\n File "/root/.local/lib/python2.7/site-packages/urllib3-1.22-py2.7.egg/urllib3/contrib/pyopenssl.py", line 46, in <module>\r\n import OpenSSL.SSL\r\n File "/usr/lib/python2.7/dist-packages/OpenSSL/__init__.py", line 8, in <module>\r\n from OpenSSL import rand, crypto, SSL\r\n File "/usr/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 118, in <module>\r\n SSL_ST_INIT = _lib.SSL_ST_INIT\r\nAttributeError: \'module\' object has no attribute \'SSL_ST_INIT\'\r\n', 'Connection to 192.168.240.14 closed.\r\n')
fatal: [192.168.240.14] => MODULE FAILURE
PLAY RECAP *********************************************************************
192.168.240.14 : ok=51 changed=27 unreachable=0 failed=1
localhost : ok=15 changed=0 unreachable=0 failed=0
Playbook run took 0 days, 0 hours, 4 minutes, 1 seconds
user@user:/opt/ibm-cloud-private-ce-1.2.0/cluster$
I'm not sure if this is relevant, but I have Python 2.7.14 and OpenSSL 1.0.2o installed:
user@user:/opt/ibm-cloud-private-ce-1.2.0/cluster$ python --version
Python 2.7.14 :: Anaconda, Inc.
user@user:/opt/ibm-cloud-private-ce-1.2.0/cluster$ openssl version
OpenSSL 1.0.2o 27 Mar 2018
The error seems to be the part below, but I don't understand how I could fix it:
File "/usr/lib/python2.7/dist-packages/OpenSSL/SSL.py", line 118, in <module>
    SSL_ST_INIT = _lib.SSL_ST_INIT
AttributeError: 'module' object has no attribute 'SSL_ST_INIT'
Any help is appreciated, thanks.
I saw folks experiencing the same error in this SO post. It seems to be tied to the OpenSSL/pyOpenSSL version. Review the post and try to install a newer version, or make sure you installed it with the same user that runs ICP:
Python AttributeError: 'module' object has no attribute 'SSL_ST_INIT'
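The workaround usually reported for that AttributeError is upgrading pyOpenSSL for the Python interpreter that runs the modules, e.g. (a sketch; verify the target interpreter on your nodes first):
sudo pip install --upgrade pyopenssl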
Also, there is a new version of ICP-CE available: ICP 2.1.0.3. Check it out:
https://www.ibm.com/developerworks/community/blogs/fe25b4ef-ea6a-4d86-a629-6f87ccf4649e/entry/IBM_Cloud_Private_version_2_1_0_3_is_now_available_for_download?lang=en

docker varnish cmd error - no such file or directory

I'm trying to get a Varnish container running as part of a multi-container Docker environment.
I'm using https://github.com/newsdev/docker-varnish as a base.
My Dockerfile looks like:
FROM newsdev/varnish:4.1.0
COPY start-varnishd.sh /usr/local/bin/start-varnishd
ENV VARNISH_VCL_PATH /etc/varnish/default.vcl
ENV VARNISH_PORT 80
ENV VARNISH_MEMORY 64m
EXPOSE 80
CMD [ "exec /usr/local/sbin/varnishd -j unix,user=varnishd -F -f /etc/varnish/default.vcl -s malloc,64m -a 0.0.0.0:80 -p http_req_hdr_len=16384 -p http_resp_hdr_len=16384" ]
When I run this as part of a docker-compose setup, I get:
ERROR: for eventsapi_varnish_1 Cannot start service varnish: oci
runtime error: container_linux.go:262: starting container process
caused "exec: \"exec /usr/local/sbin/varnishd -j unix,user=varnishd -F
-f /etc/varnish/default.vcl -s malloc,64m -a 0.0.0.0:80 -p http_req_hdr_len=16384 -p http_resp_hdr_len=16384\": stat exec
/usr/local/sbin/varnishd -j unix,user=varnishd -F -f
/etc/varnish/default.vcl -s malloc,64m -a 0.0.0.0:80 -p
http_req_hdr_len=16384 -p http_resp_hdr_len=16384: no such file or
directory"
I get the same if I try
CMD ["start-varnishd"]
(as it is in the base newsdev/docker-varnish)
or
CMD [/usr/local/bin/start-varnishd]
But if I run a bash shell on the container directly:
docker run -t -i eventsapi_varnish /bin/bash
and then run the varnishd command from there, varnish starts up fine (and starts complaining that it can't find the web container, obviously).
What am I doing wrong? What file can't it find? Again looking around the running container directly, it seems that Varnish is where it thinks it should be, the VCL file is where it thinks it should be... what's stopping it running from within docker-compose?
Thanks!
I didn't get to the bottom of why I was getting this error, but "fixed" it by using the (more recent?) fork: https://hub.docker.com/r/tripviss/varnish/. My Dockerfile is now just:
FROM tripviss/varnish:5.1
COPY default.vcl /usr/local/etc/varnish/
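For reference, the stat error above is consistent with how the exec (JSON) form of CMD works: Docker treats the first array element as the path of a single executable, so the whole string "exec /usr/local/sbin/varnishd ..." is looked up verbatim as a file name. A sketch of two spellings that should avoid this (same flags as the original Dockerfile):
# Shell form: the line is run via /bin/sh -c, so a full command line is fine
CMD /usr/local/sbin/varnishd -j unix,user=varnishd -F -f /etc/varnish/default.vcl -s malloc,64m -a 0.0.0.0:80 -p http_req_hdr_len=16384 -p http_resp_hdr_len=16384
# Exec form: one argument per array element
CMD ["/usr/local/sbin/varnishd", "-j", "unix,user=varnishd", "-F", "-f", "/etc/varnish/default.vcl", "-s", "malloc,64m", "-a", "0.0.0.0:80", "-p", "http_req_hdr_len=16384", "-p", "http_resp_hdr_len=16384"]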

How do I get pcp to automatically attach nodes to postgres pgpool?

I'm using Postgres 9.4.9 and pgpool 3.5.4 on CentOS 6.8.
I'm having a hard time getting pgpool to automatically detect when nodes are up (it often detects the first node but rarely detects the secondary), but if I use pcp_attach_node to tell it which nodes are up, then everything is hunky dory.
So I figured that until I could properly sort the issue out, I would write a little script to check the status of the nodes and attach them as appropriate, but I'm having trouble with the password prompt. According to the documentation, I should be able to issue commands like
pcp_attach_node 10 localhost 9898 pgpool mypass 1
but that just complains:
pcp_attach_node: Warning: extra command-line argument "localhost" ignored
pcp_attach_node: Warning: extra command-line argument "9898" ignored
pcp_attach_node: Warning: extra command-line argument "pgpool" ignored
pcp_attach_node: Warning: extra command-line argument "mypass" ignored
pcp_attach_node: Warning: extra command-line argument "1" ignored
It will only work when I use parameters like
pcp_attach_node -U pgpool -h localhost -p 9898 -n 1
and there is no parameter for the password; I have to enter it manually at the prompt.
Any suggestions for sorting this out, other than using Expect?
You have to create a PCPPASSFILE; search the pgpool documentation for more info.
Example 1:
Create a PCPPASSFILE for the logged-in user (vi ~/.pcppass). The file content is 127.0.0.1:9897:user:pass (hostname:port:username:password). Set the file permissions to 0600 (chmod 0600 ~/.pcppass).
The command should then run without asking for a password:
pcp_attach_node -h 127.0.0.1 -U user -p 9897 -w -n 1
Example 2:
Create a PCPPASSFILE (vi /usr/local/etc/.pcppass) with the same content and permissions as above (chmod 0600 /usr/local/etc/.pcppass), and point the PCPPASSFILE variable at it (export PCPPASSFILE=/usr/local/etc/.pcppass).
The command should then run without asking for a password:
pcp_attach_node -h 127.0.0.1 -U user -p 9897 -w -n 1
Script to auto-attach the nodes
You can schedule this script with, for example, crontab.
#!/bin/bash
#pgpool status
#0 - This state is only used during the initialization. PCP will never display it.
#1 - Node is up. No connections yet.
#2 - Node is up. Connections are pooled.
#3 - Node is down.
source $HOME/.bash_profile
export PCPPASSFILE=/appl/scripts/.pcppass
STATUS_0=$(/usr/local/bin/pcp_node_info -h 127.0.0.1 -U postgres -p 9897 -n 0 -w | cut -d " " -f 3)
echo $(date +%Y.%m.%d-%H:%M:%S.%3N)" [INFO] NODE 0 status "$STATUS_0;
if (( $STATUS_0 == 3 ))
then
    echo $(date +%Y.%m.%d-%H:%M:%S.%3N)" [WARN] NODE 0 is down - attaching node"
    TMP=$(/usr/local/bin/pcp_attach_node -h 127.0.0.1 -U postgres -p 9897 -n 0 -w -v)
    echo $(date +%Y.%m.%d-%H:%M:%S.%3N)" [INFO] "$TMP
fi
STATUS_1=$(/usr/local/bin/pcp_node_info -h 127.0.0.1 -U postgres -p 9897 -n 1 -w | cut -d " " -f 3)
echo $(date +%Y.%m.%d-%H:%M:%S.%3N)" [INFO] NODE 1 status "$STATUS_1;
if (( $STATUS_1 == 3 ))
then
    echo $(date +%Y.%m.%d-%H:%M:%S.%3N)" [WARN] NODE 1 is down - attaching node"
    TMP=$(/usr/local/bin/pcp_attach_node -h 127.0.0.1 -U postgres -p 9897 -n 1 -w -v)
    echo $(date +%Y.%m.%d-%H:%M:%S.%3N)" [INFO] "$TMP
fi
exit 0
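For example, to run it every minute from the postgres user's crontab (the script path is an assumption):
* * * * * /appl/scripts/attach_nodes.sh >> /appl/scripts/attach_nodes.log 2>&1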
Yes, you can trigger execution of this command using a customised failover_command (failover.sh in your /etc/pgpool).
An automated way to bring up your down pgpool nodes:
Copy the script below into a file with execute permission, at your desired location, owned by postgres, on all nodes.
Run the crontab -e command as the postgres user.
Finally, set the script to run every minute in crontab. To execute it every second you may create your own service instead.
#!/bin/bash
# This script will up all pgpool down nodes
#************************
#******NODE STATUS*******
#************************
# 0 - This state is only used during the initialization.
# 1 - Node is up. No connection yet.
# 2 - Node is up and connection is pooled.
# 3 - Node is down
#************************
#******SCRIPT*******
#************************
server_node_list=(0 1 2)
for server_node in "${server_node_list[@]}"
do
    source $HOME/.bash_profile
    export PCPPASSFILE=/var/lib/pgsql/.pcppass
    node_status=$(pcp_node_info -p 9898 -h localhost -U pgpool -n $server_node -w | cut -d ' ' -f 3);
    if [[ $node_status == 3 ]]
    then
        pcp_attach_node -n $server_node -U pgpool -p 9898 -w -v
    fi
done
done