How to deal with ECONNREFUSED for connect? - solaris

We have a hanging process, and truss shows that it repeatedly tries to connect but fails with ECONNREFUSED.
The man page says the following, but why is the connection rejected again and again?
ECONNREFUSED The attempt to connect was forcefully rejected. The calling program should close(2) the socket descriptor, and issue another socket(3SOCKET) call to obtain a new descriptor before attempting another connect() call.
truss -p 2145
/3: lwp_park(0x00000000, 0) (sleeping...)
/2: nanosleep(0xFFFFFFFF7B5FBE60, 0xFFFFFFFF7B5FBE50) (sleeping...)
/2: nanosleep(0xFFFFFFFF7B5FBE60, 0xFFFFFFFF7B5FBE50) = 0
/2: so_socket(PF_INET, SOCK_STREAM, IPPROTO_TCP, "", SOV_DEFAULT) = 17
/2: fcntl(17, F_SETFD, 0x00000001) = 0
/2: connect(17, 0xFFFFFFFF7B5FBF40, 16, SOV_DEFAULT) Err#146 ECONNREFUSED
/2: close(17) = 0
/2: nanosleep(0xFFFFFFFF7B5FBE60, 0xFFFFFFFF7B5FBE50) (sleeping...)
/2: nanosleep(0xFFFFFFFF7B5FBE60, 0xFFFFFFFF7B5FBE50) = 0
/2: so_socket(PF_INET, SOCK_STREAM, IPPROTO_TCP, "", SOV_DEFAULT) = 17
/2: fcntl(17, F_SETFD, 0x00000001) = 0
/2: connect(17, 0xFFFFFFFF7B5FBF40, 16, SOV_DEFAULT) Err#146 ECONNREFUSED
/2: close(17) = 0
/2: nanosleep(0xFFFFFFFF7B5FBE60, 0xFFFFFFFF7B5FBE50) (sleeping...)

Does it sometimes work from this machine and then start failing, or is the error returned every time? Does it work from some machines and not others?
The server program may have crashed or closed the listening socket. Try "netstat -af inet" on the server to ensure that there is a socket in LISTEN state on that port, and to check the current number of connections on that port. The Solaris command "pfiles pid" on the server process id can also be used to verify that the server still has the listening socket open, and to check the current number of client connections. If many connections are being made, ensure that the listen() backlog is sufficient. Add the -vall option to your truss command on the client to show the address and port where you are connecting, to ensure they are correct. Also try making the same connection from the server machine to rule out any network, firewall, or NAT issue.
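The "try the same connection from another machine" step can be scripted. Below is a minimal Python 3 sketch, not a definitive tool: probe is a hypothetical helper name, and the timeout value is an assumption. It makes one connection attempt and reports whether the port accepted it, actively refused it (ECONNREFUSED, typically meaning nothing is listening or something sent a reset), or did not answer at all (more typical of a firewall silently dropping packets).

```python
import errno
import socket


def probe(host, port, timeout=3.0):
    """Make one TCP connection attempt and classify the outcome."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except ConnectionRefusedError:
        # The peer answered with a reset: ECONNREFUSED
        return "refused"
    except socket.timeout:
        # No answer within the timeout
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. EHOSTUNREACH or a name resolution failure
        return "error: " + errno.errorcode.get(exc.errno, str(exc))
    finally:
        sock.close()
```

Running probe(host, port) from both the client machine and the server machine, with the address and port reported by truss -vall, and comparing the two results helps separate a server-side problem from a network one.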

Firewall perhaps? There are lots of potential reasons.

Related

kafka-python: Connection reset during recv when using SASL_SSL + SCRAM-SHA-512

I am using kafka-python to connect to a Kafka cluster using SASL:
consumer = KafkaConsumer(bootstrap_servers=['fooserver1:9092', 'fooserver2:9092'], client_id='foo', api_version=(2,2,1), security_protocol='SASL_SSL', sasl_mechanism='SCRAM-SHA-512', sasl_plain_username='myusername', sasl_plain_password='password123')
However I am getting the following error while connecting:
<BrokerConnection node_id=bootstrap-0 host=fooserver1:9092 <authenticating> [IPv4 ('my.ip.ad.dress', 9092)]>: Error receiving reply from server
Traceback (most recent call last):
File "/opt/python/kafka/conn.py", line 692, in _try_authenticate_scram
(data_len,) = struct.unpack('>i', self._recv_bytes_blocking(4))
File "/opt/python/kafka/conn.py", line 616, in _recv_bytes_blocking
raise ConnectionError('Connection reset during recv')
ConnectionError: Connection reset during recv
I have made sure that appropriate ports are open for establishing connections.
I need help in resolving this error.
This error may appear if you enter an incorrect username/password combination.
You could try verifying whether the username/password used when configuring the Kafka cluster is the same username/password you are using in kafka-python.

Cannot kill python program on port 8000 causing Tryton server socket.error

I have been getting more deeply involved in python for scientific computing (as a hobby) over the last 2 years and as I also have a medical degree, I really, really want to get a copy of GNU Health running on my new Kubuntu 15.10 OS so I can learn how it all works and play around with it! I followed the instructions to install it on this page: https://en.wikibooks.org/wiki/GNU_Health/Installation
I got pretty much to the end but when I try to launch the tryton server with ./trytond I get this error message:
[Thu Oct 29 10:25:02 2015] INFO:trytond.server:using /home/gnuhealth/gnuhealth/tryton/server/config/trytond.conf as configuration file
[Thu Oct 29 10:25:02 2015] INFO:trytond.server:initialising distributed objects services
Traceback (most recent call last):
File "./trytond", line 80, in <module>
trytond.server.TrytonServer(options).run()
File "/home/gnuhealth/gnuhealth/tryton/server/trytond-3.4.6/trytond/server.py", line 71, in run
self.start_servers()
File "/home/gnuhealth/gnuhealth/tryton/server/trytond-3.4.6/trytond/server.py", line 178, in start_servers
self.jsonrpcd.append(JSONRPCDaemon(hostname, port, ssl))
File "/home/gnuhealth/gnuhealth/tryton/server/trytond-3.4.6/trytond/protocols/jsonrpc.py", line 382, in __init__
self.server = server_class((interface, port), handler_class, 0)
File "/home/gnuhealth/gnuhealth/tryton/server/trytond-3.4.6/trytond/protocols/jsonrpc.py", line 317, in __init__
bind_and_activate)
File "/usr/lib/python2.7/SocketServer.py", line 420, in __init__
self.server_bind()
File "/home/gnuhealth/gnuhealth/tryton/server/trytond-3.4.6/trytond/protocols/jsonrpc.py", line 346, in server_bind
SimpleJSONRPCServer.server_bind(self)
File "/usr/lib/python2.7/SocketServer.py", line 434, in server_bind
self.socket.bind(self.server_address)
File "/usr/lib/python2.7/socket.py", line 228, in meth
return getattr(self._sock,name)(*args)
socket.error: [Errno 98] Address already in use
On further investigation with sudo netstat -pant | grep 8000 I get
tcp 0 0 127.0.0.1:8000 0.0.0.0:* LISTEN 2516/python
I have tried to kill this python program running on port 8000 every different way I could find but it keeps coming back with a new number in front i.e.
tcp 0 0 127.0.0.1:8000 0.0.0.0:* LISTEN 916/python
I kill it then...
tcp 0 0 127.0.0.1:8000 0.0.0.0:* LISTEN some_other_number etc../python
Can someone please explain what is going on with this python program keeping on restarting and how I fix this one little problem getting in the way of me starting the server!?
I was looking at the install instructions you mentioned.
Look at this section:
Activate Network Devices for the JSON-RPC Protocol
The Tryton GNU Health server listens to localhost at port 8000, not allowing direct connections from other workstations.
editconf
You can edit the parameter listen in the [jsonrpc] section , to activate the network device so workstations in your net can connect. For example, the following block
[jsonrpc]
listen = *:8000
will allow to connect to the server in the different devices of your system.
Check if you can change the value of the port and see if it works.
Use a port number that is unused. Use this command to check whether the port number is available. It has to be greater than 1024.
netstat -nlp | grep <self-chosen-hopefully-unused-port-number>
Hope this helps.
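As an alternative to grepping netstat output, the port check can be done from Python. This is a minimal Python 3 sketch under stated assumptions: port_is_free is a hypothetical helper name, and the check is only a snapshot, since another process could take the port between the check and the server start.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        return True
    except OSError:
        # Typically "Address already in use" (errno 98 on Linux),
        # or a permission error for ports below 1024 when not root
        return False
    finally:
        sock.close()
```

For example, port_is_free(8000) would be expected to return False while the python process shown in the netstat output above is still listening on 127.0.0.1:8000.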

Failed connection with sphinx

I have a problem starting and working with sphinx.
I was able to run indexer --all, but now I want to search it, and I keep getting this error when I run searchd --status.
WARNING: failed to connect to 127.0.0.1:9312: Connection refused
WARNING: failed to connect to 0.0.0.0:9306: Connection refused
FATAL: failed to connect to daemon: please specify listen with sphinx protocol in your config file
sphinx query() returns false, and I guess that's related to connection problem.
Here's the part of my .conf file.
searchd
{
listen = 127.0.0.1:9312
listen = 9306:sphinx
listen = 2471:mysql41
log = /var/log/sphinx/searchd.log
query_log = /var/log/sphinx/query.log
max_matches = 1000
read_timeout = 5
max_children = 30
pid_file = /var/run/sphinx/searchd.pid
seamless_rotate = 1
preopen_indexes = 1
unlink_old = 1
workers = threads # for RT to work
binlog_path = /var/lib/sphinx
}
What am I missing in configuration of listening ports?
As noted in the comments, this indicates that the searchd daemon is not actually running.
Try starting the daemon by running searchd directly (and later stopping it with searchd --stop); this can show errors you might not see when starting it through service/init.d
(because if the log file itself is not functional, there is nowhere for errors to go :)

how to know if a server is listening at a port

I have created a TCP socket at one end of my application; call it end 1. This socket closes after about 10 seconds. The other side of my application (end 2) is allowed to connect to that socket. I'm coding this socket app in Python. If end 2 tries to connect to the TCP socket after it no longer exists, my program terminates because of an exception. I don't want that to happen. End 2 runs in a while loop, so if a connection is not available it should go back and wait.
Are you handling the exception correctly?
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    s.connect((host, port))
except socket.error as e:
    # Connection failed (for example, nothing is listening): clean up instead of crashing
    s.close()
    print("Could not open socket: " + str(e))
    # Code to handle a retry goes here
On getting an error, you can retry by creating a new socket and calling connect() again. You should also keep a retry count, say 5, and then perhaps exit.
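The retry idea can be sketched as follows. This is a minimal Python 3 example rather than a definitive implementation: connect_with_retry is a hypothetical helper name, and the retry count, delay, and timeout defaults are assumptions. A fresh socket is created for every attempt, because a socket whose connect() failed should be closed rather than reused.

```python
import socket
import time


def connect_with_retry(host, port, retries=5, delay=1.0, timeout=3.0):
    """Return a connected socket, or None if every attempt fails."""
    for attempt in range(1, retries + 1):
        # New socket per attempt; the failed one is closed below
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))
            return sock
        except OSError as exc:
            sock.close()
            print("Attempt %d/%d failed: %s" % (attempt, retries, exc))
            if attempt < retries:
                time.sleep(delay)  # wait before trying again
    return None
```

The caller checks for None and decides whether to exit or go back to waiting, so a missing server no longer terminates the program with an unhandled exception.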

Memcached - Connection refused

I tried a simple test with memcached from Jelastic and I always get the exception "Connection refused", but the URL is correct. Is some add
MemcachedClient c = new MemcachedClient(
new InetSocketAddress("memcached-myexample.jelastic.dogado.eu", 11211));
c.set("someKey", 3600, user);
User cachedUser = (User) c.get("someKey");
Here is the exception:
2014-01-02 00:07:41.820 INFO net.spy.memcached.MemcachedConnection: Added {QA sa=memcached-myexample.jelastic.dogado.eu/92.51.168.106:11211, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2014-01-02 00:07:41.833 WARN net.spy.memcached.MemcachedConnection: Could not redistribute to another node, retrying primary node for someKey.
2014-01-02 00:07:41.835 WARN net.spy.memcached.MemcachedConnection: Could not redistribute to another node, retrying primary node for someKey.
2014-01-02 00:07:41.858 INFO net.spy.memcached.MemcachedConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl#2dc1482f
2014-01-02 00:07:41.859 INFO net.spy.memcached.MemcachedConnection: Reconnecting due to failure to connect to {QA sa=memcached-myexample.jelastic.dogado.eu/92.51.168.106:11211, #Rops=0, #Wops=2, #iq=0, topRop=null, topWop=Cmd: set Key: someKey Flags: 1 Exp: 3600 Data Length: 149, toWrite=0, interested=0}
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:629)
at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:409)
at net.spy.memcached.MemcachedConnection.run(MemcachedConnection.java:1334)
I would try to telnet to your memcached cluster in order to rule out a firewall issue. You can do that with the following command.
telnet memcached-myexample.jelastic.dogado.eu 11211
If that doesn't work then you have network issues. If this is the case I would first check to see if you have a firewall up.
Declare int portNum = 11211; first, and try again.
int portNum = 11211;
MemcachedClient c = new MemcachedClient(
new InetSocketAddress("memcached-myexample.jelastic.dogado.eu", portNum));
// Store a value (async) for one hour
c.set("someKey", 3600, someObject);
// Retrieve a value (synchronously).
Object myObject=c.get("someKey");
Thanks, but the error was caused by a firewall rule from the provider, so it was not my mistake.
Check /etc/memcached.conf file and update the server IP address from which you want to access the cache.