JMX Connection refused on Kubernetes with AdoptOpenJDK OpenJ9 - kubernetes

With my team we are trying to move our micro-services to openj9, they are running on kubernetes. However, we encounter a problem on the configuration of JMX. (openjdk8-openj9)
We have a connection refused when we try a connection with jvisualvm (and a port-forwarding with Kubernetes).
We haven't changed our configuration, except for switching from Hotspot to OpenJ9.
The error :
E0312 17:09:46.286374 17160 portforward.go:400] an error occurred forwarding 1099 -> 1099: error forwarding port 1099 to pod XXXXXXX, uid : exit status 1: 2020/03/12 16:09:45 socat[31284] E connect(5, AF=2 127.0.0.1:1099, 16): Connection refused
The java options that we use :
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.local.only=false
-Dcom.sun.management.jmxremote.port=1099
-Dcom.sun.management.jmxremote.rmi.port=1099
We are using the last adoptopenjdk/openjdk8-openj9 docker image.
Do you have any ideas?
Thank you !
Regards.

I managed to figure out why it wasn't working.
It turns out that to pass the JMX options to the service we were using the Kubernetes service descriptor in YAML. It looks like this:
- name: _JAVA_OPTIONS
value: -Dzipkinserver.listOfServers=http://zipkin:9411 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.local.only=false -Dcom.sun.management.jmxremote.port=1099 -Dcom.sun.management.jmxremote.rmi.port=1099
I realized that the JMX properties were not taken into account from _JAVA_OPTIONS when the application is not launch with ENTRYPOINT in the docker container.
So I pass the properties directly into the Dockerfile like this and it works.
CMD ["java", "-Dcom.sun.management.jmxremote", "-Dcom.sun.management.jmxremote.authenticate=false", "-Dcom.sun.management.jmxremote.ssl=false", "-Dcom.sun.management.jmxremote.local.only=false", "-Dcom.sun.management.jmxremote.port=1099", "-Dcom.sun.management.jmxremote.rmi.port=1099", "-Djava.rmi.server.hostname=127.0.0.1", "-cp","app:app/lib/*","OurMainClass"]
It's also possible to keep _JAVA_OPTIONS and setup an ENTRYPOINT in the dockerfile.
Thanks!

Related

Enable JMX port for monitoring kafka

Using reference to https://docs.microfocus.com/itom/MP_for_Apache_Kafka:1.10/Kafka/Kafka_JMX,
I created the jmx_local.config and modified the Kafka start up script)
The Kafka start up script picks the jmx_local.coonfig but the port is not getting exposed.
This is what I see on grepping the java process:
"/Library/Java/JavaVirtualMachines/adoptopenjdk-8.jdk/Contents/Home/bin/java -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.config.file=/usr/local/etc/kafka/jmx_local.conf kafka.Kafka /usr/local/etc/kafka/server.properties"
cat /usr/local/etc/kafka/jmx_local.conf
Dcom.sun.management.jmxremote.port=9395
Dcom.sun.management.jmxremote.authenticate=false
Dcom.sun.management.jmxremote.ssl=false
Also tried with port 10167 but the port is not enabled. Also modified as 'com.sun.management.jmxremote.port=9395'
I could see the other jmx properties.
Any suggestion please
I did grep -rl "jmxremote" /usr/local/Cellar/kafka/2.6.0, and found the jxm config was considered from bin/kafka-run-class.sh. So added 'Dcom.sun.management.jmxremote.port=9395' in bin/kafka-run-class.sh and restarted the kafka service.
To find if the port is available:
netstat -an | grep 1099

Can't talk to HBase from different kubernetes namespace: java.net.UnknownHostException: hregion-0.hregion

I am using kubernetes, where I have a Hadoop cluster running in namespace 'platform'.
I have an example application running in namespace 'example'
The example application needs to talk to HBase. When it does so, we see the following error:
java.net.UnknownHostException: hregion-0.hregion
at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
at java.net.InetAddress.getAllByName(InetAddress.java:1192)
at java.net.InetAddress.getAllByName(InetAddress.java:1126)
at java.net.InetAddress.getByName(InetAddress.java:1076)
at org.apache.hadoop.hbase.client.ConnectionUtils.getStubKey(ConnectionUtils.java:233)
at org.apache.hadoop.hbase.client.ConnectionImplementation.getClient(ConnectionImplementation.java:1192)
at org.apache.hadoop.hbase.client.ClientServiceCallable.setStubByServiceName(ClientServiceCallable.java:44)
at org.apache.hadoop.hbase.client.RegionServerCallable.prepare(RegionServerCallable.java:229)
at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:105)
at org.apache.hadoop.hbase.client.HTable.get(HTable.java:386)
at org.apache.hadoop.hbase.client.HTable.get(HTable.java:360)
at org.apache.hadoop.hbase.MetaTableAccessor.getTableState(MetaTableAccessor.java:1078)
at org.apache.hadoop.hbase.MetaTableAccessor.tableExists(MetaTableAccessor.java:403)
at org.apache.hadoop.hbase.client.HBaseAdmin$6.rpcCall(HBaseAdmin.java:445)
at org.apache.hadoop.hbase.client.HBaseAdmin$6.rpcCall(HBaseAdmin.java:442)
at org.apache.hadoop.hbase.client.RpcRetryingCallable.call(RpcRetryingCallable.java:58)
at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:107)
at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3084)
at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:3076)
at org.apache.hadoop.hbase.client.HBaseAdmin.tableExists(HBaseAdmin.java:442)
The command
> nslookup hregion-0.hregion
on the client machine fails, because the hregion service is in the platform namespace (where that command will succeed).
We suspected that the HBase region server has registered itself with zookeeper using an incomplete name, and verified by connecting to the zookeeper server:
[zk: localhost:2181(CONNECTED) 8] ls /hbase/rs
[hregion-0.hregion,16020,1560851357442]
The ConnectionUtils.getStubKey method simply uses java.net.InetAddress.getByName(hostname) and it is this method which fails.
Here is some zookeeper debugging info (this from the HBase master):
hbase(main):001:0> zk_dump
HBase is rooted at /hbase
Active master address: hmaster-0.hmaster.platform.svc.cluster.local,16000,1560851357485
Backup master addresses:
Region server holding hbase:meta: hregion-0.hregion,16020,1560851357442
Region servers:
hregion-0.hregion,16020,1560851357442
On the hregion-0 server, we have the following entries in /etc/hosts:
# Kubernetes-managed hosts file.
127.0.0.1 localhost
10.1.14.53 hregion-0.hregion.platform.svc.cluster.local hregion-0
And the /etc/resolv.conf file looks like this:
nameserver 10.96.0.10
search platform.svc.cluster.local svc.cluster.local cluster.local mycompany.com
options ndots:5
How do I fix this? I assume I need to tell HBase to register its nodes in zookeeper using their fully qualified domain name - how?

How do i enable remote jmx with port in zookeeper zkServer.cmd

Here my zkServer.cmd file :
#echo off
setlocal
call "%~dp0zkEnv.cmd"
set ZOOMAIN=org.apache.zookeeper.server.quorum.QuorumPeerMain
echo on
call %JAVA% "-Dzookeeper.log.dir=%ZOO_LOG_DIR%" "-Dzookeeper.root.logger=%ZOO_LOG4J_PROP%" -cp "%CLASSPATH%" %ZOOMAIN% "%ZOOCFG%" %*
endlocal
The skServer.sh script will run the zkEnv.sh script which in-turn will look for a script '../conf/zookeeper-env.sh'
create a file on the conf folder called zookeeper-env.sh
Paste this into the file and restart Zookeeper:
JMXLOCALONLY=false
JMXDISABLE=false
JMXPORT=4048
JMXAUTH=false
JMXSSL=false
First obtain the hostname (or reachable IP eg. lan/public/NAT address):
hostname -i
# or find ip
ip a
next add following options to ZOOMAIN (assumed hostname my.remoteconsole.org and desired port 8989)
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.port=8989
-Djava.rmi.server.hostname=my.remoteconsole.org
More details about available options in java docs (http://docs.oracle.com/javase/8/docs/technotes/guides/management/agent.html).
ADD org.apache.zookeeper.server.quorum.QuorumPeerMain in server-start.
The class org.apache.zookeeper.server.quorum.QuorumPeerMain will start a JMX manageable ZooKeeper server. This class registers the proper MBeans during initalization to support JMX monitoring and management of the instance.
In addition to above answer by Marcell du Plessis, if you are running zookeeper as a systemd service, then you can specify jmx port in the environment variable.
[Unit]
Description=Apache Kakfa Zookeeper
Requires=network.target
After=network.target
[Service]
Type=simple
User=user
Group=users
ExecStart=/your-zookeeper-install-path/bin/zkServer.sh start
ExecStop=/your-zookeeper-install-path/bin/zkServer.sh stop
TimeoutStopSec=180
Restart=on-failure
Environment="JMX_PORT=9999"
[Install]
WantedBy=multi-user.target
Alias=zookeeper.service

The Datastax cassandra community server 2.1.10 service on local computer started and then stopped

I am trying to configure a two node cluster with cassandra in windows r2 2008
So i installed cassandra community version in one server (10.xxx.0.1,10.xxx.0.2)
And then I stopped the service and then edited the configuraton.yaml file in the conf folder.
The changes are:
cluster_name
commented the num_tokens
gave the tokens in initial_token,
seeds as 10.xxx.0.1,10.xxx.0.2,
listen_addresses are their respective ip addresses which are 10.xxx.0.1,10.xxx.0.2,
rpc_addresses as 0.0.0.0,
endpointsnitch as gossip
I also changed the cassandra rackdc.properties file to dc=DC1 rack=RAC1.
I then saved and started back the service and opened the cqlsh, but it is not connecting. Below is the error:
2015-10-12 16:20:13 Commons Daemon procrun stderr initialized
If rpc_address is set to a wildcard address (0.0.0.0), then you must set broadcast_rpc_address to a value other than 0.0.0.0
Fatal configuration error; unable to start. See log for stacktrace.
..
ERROR 21:20:14 Fatal configuration error
org.apache.cassandra.exceptions.ConfigurationException: If rpc_address is set to a wildcard address (0.0.0.0), then you must set broadcast_rpc_address to a value other than 0.0.0.0
at org.apache.cassandra.config.DatabaseDescriptor.applyAddressConfig(DatabaseDescriptor.java:285) ~[apache-cassandra-2.1.10.jar:2.1.10]
at org.apache.cassandra.config.DatabaseDescriptor.applyConfig(DatabaseDescriptor.java:443) ~[apache-cassandra-2.1.10.jar:2.1.10]
at org.apache.cassandra.config.DatabaseDescriptor.<clinit>(DatabaseDescriptor.java:136) ~[apache-cassandra-2.1.10.jar:2.1.10]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:168) [apache-cassandra-2.1.10.jar:2.1.10]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:562) [apache-cassandra-2.1.10.jar:2.1.10]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:651) [apache-cassandra-2.1.10.jar:2.1.10]
If you out 0.0.0.0 to the rpc_address you have to change the broadcast_rpc_address like in http://docs.datastax.com/en/cassandra/2.1/cassandra/configuration/configCassandra_yaml_r.html , I think that the right broadcast_rpc_address can be the own ip address.

Hector test example not working on Cassandra 0.7.4

I have set up my single node Cassandra 0.7.4 and started the service with
bin/cassandra -f. Now I am trying to use the Hector API (v. 0.7.0) to manage the
DB.
The Cassandra CLI works fine and I can create keyspaces and so on.
I tried to run the test example and create a single keyspace:
Cluster cluster = HFactory.getOrCreateCluster("TestCluster",
new CassandraHostConfigurator("localhost:9160"));
Keyspace keyspace = HFactory.createKeyspace("Keyspace1", cluster);
But all I get is this:
2011-04-14 22:20:27,469 [main ] INFO
me.prettyprint.cassandra.connection.CassandraHostRetryService
- Downed Host
Retry service started with queue size -1 and retry delay 10s
2011-04-14 22:20:27,492 [main ] DEBUG
me.prettyprint.cassandra.connection.HThriftClient -
Transport open status false
for client CassandraClient<localhost:9160-1>
....this again about 20 times
me.prettyprint.cassandra.service.JmxMonitor - Registering JMX
me.prettyprint.cassandra.service_TestCluster:ServiceType=hector,
MonitorType=hector
2011-04-14 22:20:27,636 [Thread-0 ] INFO
me.prettyprint.cassandra.connection.CassandraHostRetryService -
Downed Host
retry shutdown hook called
2011-04-14 22:20:27,646 [Thread-0 ] INFO
me.prettyprint.cassandra.connection.CassandraHostRetryService -
Downed Host
retry shutdown complete
Can you please tell me what I'm doing wrong?
Thanks
When you connect via the CLI, do you specify "-h localhost -p 9160"?
Can you actually do stuff on the command line with the above?
The error from HThriftClient indicates it could not connect to the Cassandra Daemon.
FTR, you would get responses much faster via hector-users#googlegroups.com
If you are on a linux machine, try starting up your cassandra server by this command:
/bin$ ./cassandra start -f
Then for the cli, use this command:
./cassandra-cli -h {hostname}/9160.
Then make sure that the configures are ok.