I have a replica set consisting of four nodes (ux002, ux009, ux019, ux020), and a program that I'd like to run in parallel on each of those same four nodes; it connects to the replica set using the Mongo Java driver.
Examining the status of the replica set shows that all four nodes are operating fine; however, the program logs the following warning on all four nodes:
Nov 12, 2014 2:34:40 PM com.mongodb.ConnectionStatus$UpdatableNode update
WARNING: Server seen down: ux009/127.0.1.1:27017 - java.io.IOException - message: couldn't connect to [ux009/127.0.1.1:27017] bc:java.net.ConnectException: Connection refused
However, on each node, the server reported as down is the one the program is running on. I.e. I run the program on ux009, and it tells me ux009 is down; I run it on ux002, and it tells me ux002 is down.
I made a stupidly simple program to test whether there was something wrong with my original code, but the same warning persists:
import java.util.ArrayList;
import java.util.List;
import com.mongodb.Mongo;
import com.mongodb.ServerAddress;

public static void main(String[] args) throws Exception {
    List<ServerAddress> addrs = new ArrayList<>();
    if (args.length == 0) {
        addrs.add(new ServerAddress("localhost", 27017));
    } else {
        for (String a : args) {
            String[] host = a.split(":");
            addrs.add(new ServerAddress(host[0], Integer.valueOf(host[1])));
        }
    }
    Mongo mongo = new Mongo(addrs);
    Thread.sleep(5000); // Sleep to give the driver time to print messages
    mongo.close();
}
And I run it as follows:
java -jar mongo-test.jar ux002:27017 ux009:27017 ux019:27017 ux020:27017
Could it be that mongod isn't configured correctly? Or perhaps I am misusing the Java API?
The Mongo Java driver is version 2.9.3, and mongod is version 2.6.5.
Many thanks in advance!
-Jim
The IP is a little strange for the local host:
ux009/127.0.1.1:27017
I would have expected that to be:
ux009/127.0.0.1:27017
Most likely someone fat-fingered the IP address in /etc/hosts on each machine.
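If that's the case, /etc/hosts on each box probably maps the hostname to that loopback-style address, something like this (layout guessed for illustration, not the actual file from the question):
127.0.0.1   localhost
127.0.1.1   ux009
The driver then resolves ux009 to 127.0.1.1 and tries to connect there, where nothing is listening.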
Posting the answer here for completeness. The issue was that the bind_ip parameter in the mongod configuration file had been set to the IP address of only one of the nodes. Thanks to helmy for spotting that.
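For anyone hitting the same thing, the broken setting in an old-style 2.6-era mongod.conf would have looked something like the first line below on every node (addresses invented for illustration); the fix is to have each node bind to the interfaces it should actually listen on:
bind_ip = 10.1.2.2               # ux002's address, wrongly copied to all four nodes
bind_ip = 127.0.0.1,10.1.2.9     # fix on ux009: loopback plus its own address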
I am trying to implement an interface for my Erlang program using jinterface. When I call OtpNode otpNode = new OtpNode(nodeName, cookie); Java throws an IOException with
java.io.IOException: Nameserver not responding on DESKTOP-GIR29G3 when publishing javanode.
It doesn't seem to be a common problem, as I couldn't find anything similar online. It's a local node with the node name "javanode", containing no full stops or dashes. Why would there be a DNS issue on a local node?
I have tried starting an Erlang node in the directory the Java program is started from, as well as starting the Erlang console on my PC, but I'm very new to Erlang, so those were just wild guesses that some Erlang VM must be running.
Here is the code that may help:
public Erlterface()
{
    Thread t = new Thread(new Runnable() {
        public void run() {
            setupMBox();
        }
    });
    t.start();
}

private void setupMBox()
{
    try {
        String nodeName = "javanode";
        String cookie = "jinterface";
        //String[] names = OtpEpmd.lookupNames();
        OtpNode otpNode = new OtpNode(nodeName, cookie); // CRASH HAPPENS HERE
        OtpMbox mbox = otpNode.createMbox("javaserver");
    } catch (IOException e) {
        e.printStackTrace();
    }
}
The error from the console:
Connected to the target VM, address: '127.0.0.1:54025', transport: 'socket'
java.io.IOException: Nameserver not responding on DESKTOP-GIR29G3 when publishing javanode
at com.stellar.base.schedule.com.ericsson.otp.erlang.OtpEpmd.r4_publish(OtpEpmd.java:344)
at com.stellar.base.schedule.com.ericsson.otp.erlang.OtpEpmd.publishPort(OtpEpmd.java:141)
at com.stellar.base.schedule.com.ericsson.otp.erlang.OtpNode$Acceptor.publishPort(OtpNode.java:784)
at com.stellar.base.schedule.com.ericsson.otp.erlang.OtpNode$Acceptor.<init>(OtpNode.java:776)
at com.stellar.base.schedule.com.ericsson.otp.erlang.OtpNode.init(OtpNode.java:232)
at com.stellar.base.schedule.com.ericsson.otp.erlang.OtpNode.<init>(OtpNode.java:196)
at com.stellar.base.schedule.com.ericsson.otp.erlang.OtpNode.<init>(OtpNode.java:149)
at com.stellar.base.schedule.Erlterface.setupMBox(Erlterface.java:40)
at com.stellar.base.schedule.Erlterface.access$000(Erlterface.java:16)
at com.stellar.base.schedule.Erlterface$1.run(Erlterface.java:26)
at java.lang.Thread.run(Thread.java:745)
Thanks in advance
Dale
UPDATE: additional information
I went into a dive to try to figure out where exactly the train leaves the rails, but I'm taking wild guesses as to what I should flag as potential issues. I just want to add some additional information here to help:
In OtpEpmd, the following is caught before the IOException is thrown:
java.net.ConnectException: Connection refused: connect
The final stop is the native DualStackPlainSocketImpl class, which I suppose calls into Windows to do the actual connect, and it fails:
static native int connect0(int var0, InetAddress var1, int var2) throws IOException;
Am I missing something in setting up the Erlang node? Surely I don't have to start it manually? I've disabled my firewall completely to test it. How do I figure out why the connection was refused?
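One way to narrow this down from the Java side (a diagnostic sketch of my own, building on the commented-out lookupNames() call in the code above, not a known fix): OtpEpmd.lookupNames() talks directly to the local epmd daemon on TCP port 4369, so if this also fails with Connection refused, no epmd is running on the machine at all. epmd normally ships with Erlang and gets started when an Erlang node starts.

import java.io.IOException;
import com.ericsson.otp.erlang.OtpEpmd;

public class EpmdProbe {
    public static void main(String[] args) {
        try {
            // Ask the local epmd (TCP port 4369) which nodes it has registered
            String[] names = OtpEpmd.lookupNames();
            for (String name : names) {
                System.out.println(name);
            }
        } catch (IOException e) {
            // "Connection refused" here would mean nothing is listening on epmd's port
            e.printStackTrace();
        }
    }
}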
I am trying to determine the OS of a particular IP address using nmap. Here is my code so far:
import java.io.*;
public class NmapFlags {
    public static void main(String[] args) throws Exception {
        try {
            // Example: trying to find the OS or device details of this IP address
            String[] cmdarray = { "nmap", "-O", "66.110.59.130" };
            Process process = Runtime.getRuntime().exec(cmdarray);
            BufferedReader r = new BufferedReader(new InputStreamReader(
                    process.getInputStream()));
            String s;
            while ((s = r.readLine()) != null) {
                System.out.println(s);
            }
            r.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
After running this code, the output I got is:
All 1000 scanned ports on 66.110.59.130 are filtered
Too many fingerprints match this host to give specific OS details
OS detection performed. Please report any incorrect results at http://nmap.org/submit/ .
Nmap done: 1 IP address (1 host up) scanned in 246.06 seconds
Are there any other nmap flags I can use to detect the device type? I tried the -A option. I need to find the device details at each hop of a traceroute.
Nmap performs “active fingerprinting” (it sends packets and then analyses the responses) to guess what the remote operating system is. These probes are quite intrusive, and I'd recommend reading more about them (http://nmap.org/book/osdetect-fingerprint-format.html).
"Too many fingerprints match this host to give specific OS details" means that the probe results are contradictory or too broad.
For example, in a NAT scenario, some port scans return the router's information (e.g. Cisco IOS) while other probes return the real host's specifications (e.g. Windows).
The best way to understand how the network is designed is to rely on your own judgment based on different probes and outputs.
IP ID sequence, fingerprint analysis and service detection (-sV) can help:
e.g. if port 3389 is open, the OS is likely Windows (3389 is the default RDP port).
e.g. if the IP ID sequence varies, the target might be multiple hosts (load-balanced).
Your analysis of the network traffic will always be more accurate than what nmap attempts to guess in an automated way.
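For the per-hop part of the question: nmap itself has a --traceroute option that performs a traceroute and reports what it finds at each hop, and -sV adds service/version detection. A minimal sketch along the lines of the question's code (same example address as in the question; assumes nmap is on the PATH):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;

public class NmapTrace {
    public static void main(String[] args) throws IOException {
        // -O: OS detection, -sV: service/version detection, --traceroute: per-hop results
        String[] cmd = { "nmap", "-O", "-sV", "--traceroute", "66.110.59.130" };
        Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}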
I have a stupid problem with a scotty web app and the mongodb service starting in the right order.
I use systemd to start mongodb first and then the scotty web app. It does not work for some reason. The app errors out with connect: does not exist (Connection refused) from the mongodb driver, meaning that the connection is not ready yet.
So my question: how can I test the connection availability, say, three times at 0.5s intervals and only then error out?
This is the application main function
main :: IO ()
main = do
    pool <- createPool (runIOE $ connect $ host "127.0.0.1") close 1 300 5
    clearSessions pool
    let r = \x -> runReaderT x pool
    scottyT 3000 r r basal

basal :: ScottyD ()
basal = do
    middleware $ staticPolicy (noDots >-> addBase "static")
    notFound $ runSession
    routes
Although the app service is ordered after the mongodb service, the connection to mongodb is still unavailable during the app's start-up, so I get the above-mentioned error.
This is the systemd service file, to head off questions about the service ordering.
[Unit]
Description=Basal Web Application
Requires=mongodb.service
After=mongodb.service iptables.service network-online.target
[Service]
User=http
Group=http
WorkingDirectory=/srv/http/basal/
ExecStart=/srv/http/basal/bin/basal
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
I don't know why the connection to mongodb is not available given the correct service order.
So I want to probe the connection availability within the Haskell code three times with a 0.5s delay, and then error out. How can I do it?
Thanks.
I guess from the functions you're using that you're using something like mongoDB 1.5.0.
Here, connect returns something in the IOE monad, which is an alias for ErrorT IOError IO.
So the best approach is to use the retrying mechanisms ErrorT offers. As it's an instance of MonadPlus, we can just use mplus if we don't care about checking for the specific error:
retryConnect :: Int -> Int -> Host -> IOE Pipe
retryConnect retries delayInMicroseconds host
    | retries > 0 =
        connect host `mplus`
            (liftIO (threadDelay delayInMicroseconds) >>
             retryConnect (retries - 1) delayInMicroseconds host)
    | otherwise = connect host
(threadDelay comes from Control.Concurrent).
Then replace connect with retryConnect 2 500000 and it'll retry twice after the first failure with a 500,000 microsecond gap (i.e. 0.5s).
If you do want to check for a specific error, then use catchError instead and inspect the error to decide whether to swallow it or rethrow it.
I'm using Python. I did a yum install memcached followed by an easy_install python-memcached.
I used the simple test program from help(memcache). When I wasn't getting the proper answers, I threw in some print statements:
[~/test]$ cat m2.py
import memcache
mc = memcache.Client(['127.0.0.1:11211'], debug=0)
x = mc.set("some_key", "Some value")
print 'Just set a key and value into the cache (supposedly)'
value = mc.get("some_key")
print 'Just retrieved that value from the cache using the key'
print 'X %s' % x
print 'Value %s' % value
[~/test]$ python m2.py
Just set a key and value into the cache (supposedly)
Just retrieved that value from the cache using the key
X 0
Value None
[~/test]$
The question now is: what have I failed to do in my installation? It appears to be working from an API perspective, but it fails to put anything into the shared memcached area.
I'm using a VirtualBox VM running CentOS:
[~]# cat /proc/version
Linux version 2.6.32-358.6.2.el6.i686 (mockbuild@c6b8.bsys.dev.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Thu May 16 18:12:13 UTC 2013
Is there a daemon that is supposed to be running? I don't see an obviously named one when I do a ps.
I tried to get pylibmc installed on my VM but was unable to find a working installation, so for now I will see if I can get the above stuff working first.
I discovered that if I run straight from the Python console I get a bit more output when I set debug=1:
>>> mc = memcache.Client(['127.0.0.1:11211'], debug=1)
>>> mc.stats
{}
>>> mc.set('test','value')
MemCached: MemCache: inet:127.0.0.1:11211: connect: Connection refused. Marking dead.
0
>>> mc.get('test')
MemCached: MemCache: inet:127.0.0.1:11211: connect: Connection refused. Marking dead.
When I try to connect to the port with telnet, per the example, I get a connection refused:
[root@~]# telnet 127.0.0.1 11211
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
[root@~]#
I tried the instructions I found on the net for configuring telnet so localhost wouldn't be disabled:
vi /etc/xinetd.d/telnet
service telnet
{
flags = REUSE
socket_type = stream
wait = no
user = root
server = /usr/sbin/in.telnetd
log_on_failure += USERID
disable = no
}
And then ran the commands to restart the service(s):
service iptables stop
service xinetd stop
service iptables start
service xinetd start
service iptables stop
I ran both cases (iptables started and stopped) but it had no effect, so I am out of ideas. What do I need to do so that the port will be allowed, if that is the problem?
Or is there a memcached service that needs to be running that would open up the port?
Well, this is what it took to get it working (a series of manual steps):
1) su -
cd /var/run
mkdir memcached # this was missing
In the memcached file I added "-l 127.0.0.1" to the OPTIONS statement; it's apparently a listen option. Do this for steps 2 & 3 (see the example file after these steps). I'm not certain which file is actually used at runtime.
2) cd /etc/sysconfig
cp memcached memcached.old
vi memcached
3) cd /etc/init.d
cp memcached memcached.old
vi memcached
4) Try some commands to see if the server starts now
/etc/init.d/memcached start
/etc/init.d/memcached status
/etc/init.d/memcached stop
/etc/init.d/memcached restart
I tried opening a browser, but it never seemed to actually display anything, so I don't really know how valid this approach is. I'm not running Apache or anything like that, so perhaps it's not relevant to my case. Perhaps I would have to supply a ?key=blah or something.
5) http://127.0.0.1:11211
6) Now it should be ready to go. If one runs the test shown with the following, it should work; at least it did for me. Doing help(memcache) will display a simple program; just paste that in and it should work just fine.
[~]$ python
>>> import memcache
>>> help(memcache)
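For reference, after the edits in steps 2 & 3, /etc/sysconfig/memcached ends up looking roughly like this (the surrounding values are stock CentOS defaults, shown here as an assumption for context; only the OPTIONS line was the actual change):
PORT="11211"
USER="memcached"
MAXCONN="1024"
CACHESIZE="64"
OPTIONS="-l 127.0.0.1"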
For the relevant part of our server stack, we're running:
NGINX 1.2.3
PHP-FPM 5.3.10 with PECL mongo 1.2.12
MongoDB 2.0.7
CentOS 6.2
We're getting some strange but predictable behavior when the MongoDB server goes away (crashes, gets killed, etc.). Even with a try/catch block around the connection code, i.e.:
try
{
    $mdb = new Mongo('mongodb://localhost:27017');
}
catch (MongoConnectionException $e)
{
    die( $e->getMessage() );
}

$db = $mdb->selectDB('collection_name');
Depending on which PHP-FPM workers have connected to mongo already, the connection state is cached, causing further exceptions to go unhandled, because the $mdb connection handler can't be used. The troubling thing is that the try does not consistently fail for a considerable amount of time, up to 15 minutes later, when -- I assume -- the php-fpm processes die/respawn.
Essentially, the behavior is that when you hit a worker that hasn't connected to mongo yet, you get the die message above, and when you connect to a worker that has, you get an unhandled exception from $mdb->selectDB('collection_name'); because catch does not run.
When PHP is a single process, i.e. via Apache with mod_php, this behavior does not occur. Just for posterity, going back to Apache/mod_php is not an option for us at this time.
Is there a way to fix this behavior? I don't want the connection state to be inconsistent between different php-fpm processes.
Edit:
While I wait for the driver to be fixed in this regard, my current workaround is to do a quick poll to determine whether the driver can handle requests, and to skip loading the MongoDB library and running queries if it can't connect/query:
try
{
    // connect
    $mongo = new Mongo("mongodb://localhost:27017");

    // try to do anything with the connection handle
    try
    {
        $mongo->YOUR_DB->YOUR_COLLECTION->findOne();
        $mongo->close();
        define('MONGO_STATE', TRUE);
    }
    catch (MongoCursorException $e)
    {
        $mongo->close();
        error_log('Error connecting to MongoDB: ' . $e->getMessage());
        define('MONGO_STATE', FALSE);
    }
}
catch (MongoConnectionException $e)
{
    error_log('Error connecting to MongoDB: ' . $e->getMessage());
    define('MONGO_STATE', FALSE);
}
The PHP mongo driver connectivity code is getting a big overhaul in the 1.3 release, currently in beta2 as of this writing. Based on your description, your issues may be resolved by the fixes for:
https://jira.mongodb.org/browse/PHP-158
https://jira.mongodb.org/browse/PHP-465
Once it is released you will be able to see the full list of fixes here:
https://jira.mongodb.org/browse/PHP/fixforversion/10499
Or, alternatively, on the PECL site. If you can test 1.3 and confirm that your issues are still present, I'm sure the driver devs would love to hear from you before the 1.3.0 release, especially if it is easily reproducible.