How to work with the glusterfs-hadoop plugin? - hadoop-plugins

I installed GlusterFS and it works fine. After that I installed Hadoop 1.x and it works fine with HDFS, but when I use the glusterfs-hadoop plugin to make GlusterFS the filesystem backend for my Hadoop I get an error. I used the plugin's GitHub site, copied the jar file to the Hadoop library directory, and changed my core-site.xml to this:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>fs.glusterfs.impl</name>
<value>org.apache.hadoop.fs.glusterfs.GlusterFileSystem</value>
</property>
<property>
<name>fs.default.name</name>
<value>glusterfs://fedora1:9010</value>
</property>
<property>
<name>fs.AbstractFileSystem.glusterfs.impl</name>
<value>org.apache.hadoop.fs.local.GlusterFs</value>
</property>
<property>
<name>fs.glusterfs.volumes</name>
<value>test1</value>
</property>
<property>
<name>fs.glusterfs.volume.fuse.test1</name>
<value>/mnt/Hadoop</value>
</property>
</configuration>
When I execute start-mapred.sh, the JobTracker and TaskTracker start without any problem, but when I execute the command "hadoop fs -mkdir ossl" I get this output:
15/04/14 12:52:53 INFO glusterfs.GlusterVolume: Initializing gluster volume..
15/04/14 12:52:53 INFO glusterfs.GlusterFileSystem: Configuring GlusterFS
15/04/14 12:52:53 INFO glusterfs.GlusterFileSystem: Initializing GlusterFS, CRC disabled.
15/04/14 12:52:53 INFO glusterfs.GlusterFileSystem: GIT INFO={git.commit.id.abbrev=f0fee73, git.commit.user.email=bchilds@redhat.com, git.commit.message.full=Merge pull request #122 from childsb/getfattrparse
Refactor and cleanup the BlockLocation parsing code, git.commit.id=f0fee73c336ac19461d5b5bb91a77e05cff73361, git.commit.message.short=Merge pull request #122 from childsb/getfattrparse, git.commit.user.name=bradley childs, git.build.user.name=Unknown, git.commit.id.describe=GA-12-gf0fee73, git.build.user.email=Unknown, git.branch=master, git.commit.time=31.03.2015 @ 00:36:46 IRDT, git.build.time=12.04.2015 @ 14:45:49 IRDT}
15/04/14 12:52:53 INFO glusterfs.GlusterFileSystem: GIT_TAG=GA
15/04/14 12:52:53 INFO glusterfs.GlusterFileSystem: Configuring GlusterFS
15/04/14 12:52:53 INFO glusterfs.GlusterVolume: Initializing gluster volume..
15/04/14 14:36:01 INFO glusterfs.GlusterVolume: Gluster volume: test at : /mnt/hadoop
15/04/14 14:36:01 INFO glusterfs.GlusterVolume: Working directory is : /
15/04/14 14:36:01 INFO glusterfs.GlusterVolume: Write buffer size : 131072
15/04/14 14:36:01 INFO glusterfs.GlusterVolume: Default block size : 67108864
15/04/14 14:36:01 INFO glusterfs.GlusterVolume: Directory list order : fs ordering
15/04/14 14:36:01 INFO glusterfs.GlusterVolume: File timestamp lease significant digits removed : 0
mkdir: Error undefined volume:fedora1:9010 in path: glusterfs://fedora1:9010/ossl
Please help me, and thanks for your reply.

If I'm not mistaken this should work:
<property>
  <name>fs.default.name</name>
  <value>glusterfs:///fedora1:9010</value>
</property>
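Either way, it is worth double-checking that the volume and mount point the plugin is told about actually exist on the node; note that the log above reports the volume as "test" at /mnt/hadoop while core-site.xml uses "test1" and /mnt/Hadoop. A minimal check (volume name and mount path taken from the config above):
gluster volume info test1    # the volume should exist and be in the Started state
mount | grep glusterfs       # the FUSE mount path should match /mnt/Hadoop exactly (paths are case-sensitive)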

Related

Error: Could not find or load main class org.apache.tez.dag.app.DAGAppMaster

I have installed tez and want to run the example like this
hadoop jar tez-examples-0.10.1.jar orderedwordcount /input /output
but it does not work, and the log is:
Log Type: stderr
Log Upload Time: Thu May 12 13:19:25 +0800 2022
Log Length: 77
Error: Could not find or load main class org.apache.tez.dag.app.DAGAppMaster
Log Type: stdout
Log Upload Time: Thu May 12 13:19:25 +0800 2022
Log Length: 716
Heap
PSYoungGen total 17920K, used 921K [0x00000000eef00000, 0x00000000f0300000, 0x0000000100000000)
eden space 15360K, 6% used [0x00000000eef00000,0x00000000eefe67a8,0x00000000efe00000)
from space 2560K, 0% used [0x00000000f0080000,0x00000000f0080000,0x00000000f0300000)
to space 2560K, 0% used [0x00000000efe00000,0x00000000efe00000,0x00000000f0080000)
ParOldGen total 40960K, used 0K [0x00000000ccc00000, 0x00000000cf400000, 0x00000000eef00000)
object space 40960K, 0% used [0x00000000ccc00000,0x00000000ccc00000,0x00000000cf400000)
Metaspace used 2541K, capacity 4480K, committed 4480K, reserved 1056768K
class space used 283K, capacity 384K, committed 384K, reserved 1048576K
my_env.sh is
#JAVA_HOME
export JAVA_HOME=/opt/module/jdk1.8.0_331
export PATH=$PATH:$JAVA_HOME/bin
#HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-3.3.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
#HIVE_HOME
export HIVE_HOME=/opt/module/hive-3.1.3
export PATH=$PATH:$HIVE_HOME/bin
#MAVEN_HOME
export MAVEN_HOME=/opt/module/maven-3.8.5
export PATH=$PATH:$MAVEN_HOME/bin
#TEZ_HOME
export TEZ_HOME=/opt/module/tez-0.10.1
export HADOOP_CLASSPATH=${TEZ_HOME}/conf:${TEZ_HOME}/*:${TEZ_HOME}/lib/*
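(A quick way to verify that the Tez jars from HADOOP_CLASSPATH are actually visible to the hadoop command, using the paths set above:)
hadoop classpath | tr ':' '\n' | grep -i tez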
tez-site.xml is
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>tez.lib.uris</name>
<value>${fs.defaultFS}/apps/tez/tez-0.10.1.tar.gz</value>
</property>
<property>
<name>tez.use.cluster.hadoop-libs</name>
<value>false</value>
</property>
<property>
<name>tez.history.logging.service.class</name>
<value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>
</configuration>
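(For reference, whether the archive referenced by tez.lib.uris is really at that HDFS path can be checked with:)
hdfs dfs -ls /apps/tez/tez-0.10.1.tar.gz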
I have tried the answer from https://issues.apache.org/jira/browse/TEZ-3392 but it does not work.
Please help me in resolving this. Thanks in advance!
I had a similar error message when running the tez example. The installation guide https://tez.apache.org/install.html is not very obvious, but it has some valuable notes.
The application tracking page was helpful: from its logs I found out that the path inside the container after decompressing the tez archive is incorrect.
The whole archive is decompressed into a directory called ./tezlib, and an excerpt of the CLASSPATH looks like this:
$PWD:$PWD/*:$PWD/tezlib/*:$PWD/tezlib/lib/*
but the archive apache-tez-0.10.1-bin.tar.gz (on my HDFS at path /apps/apache-tez-0.10.1-bin.tar.gz) is decompressed inside the container to ./tezlib/apache-tez-0.10.1-bin.
So, after several hours of trial and error, I resolved this issue with the following steps:
tar -xf apache-tez-0.10.1-bin.tar.gz
tar -czf apache-tez-0.10.1-bin-nodir.tar.gz -C apache-tez-0.10.1-bin .
hdfs dfs -copyFromLocal apache-tez-0.10.1-bin-nodir.tar.gz /apps/
The second line above packs the tez jars into an archive without a parent directory.
After that the tez example runs without error and finishes successfully.
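A quick way to confirm that the repacked archive really has the jars at its top level (file name as used in the steps above) is to list it before uploading:
tar -tzf apache-tez-0.10.1-bin-nodir.tar.gz | head
# entries should start with ./ (or the jar names directly), not with apache-tez-0.10.1-bin/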
My tez-site.xml:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>tez.lib.uris</name>
<value>${fs.defaultFS}/apps/apache-tez-0.10.1-bin-nodir.tar.gz</value>
</property>
<property>
<name>tez.use.cluster.hadoop-libs</name>
<value>true</value>
</property>
</configuration>
Of course there are probably other ways to deal with this error about an incorrect path to the jars.
I've tested this with Hadoop 3.2.2 from the Bigtop distribution and Tez 0.10.0/0.10.1.
I solved my question by uploading the uncompressed tez package to HDFS and changing my tez-site.xml file.
hadoop fs -put tez-0.10.1 /apps/tez
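(Assuming the same layout, a quick sanity check that the jars landed where tez.lib.uris below will point:)
hdfs dfs -ls /apps/tez/tez-0.10.1 | head
hdfs dfs -ls /apps/tez/tez-0.10.1/lib | head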
My changed tez-site.xml is below.
The main difference is in "tez.lib.uris":
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>tez.lib.uris</name>
<value>${fs.defaultFS}/apps/tez/tez-0.10.1,${fs.defaultFS}/apps/tez/tez-0.10.1/lib</value>
</property>
<property>
<name>tez.use.cluster.hadoop-libs</name>
<value>false</value>
</property>
<property>
<name>tez.history.logging.service.class</name>
<value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>
<property>
<name>tez.am.resource.memory.mb</name>
<value>1024</value>
</property>
<property>
<name>tez.am.resource.cpu.vcores</name>
<value>1</value>
</property>
</configuration>

Hazelcast Kubernetes plugin is not defined

I'm trying to create an OrientDB (version 3.0.10) cluster using Kubernetes. OrientDB uses Hazelcast (version 3.10.4) in its distributed mode, which is why I had to set up the KubernetesHazelcast plugin. I used this repository as an example.
I have created all the necessary configuration files, defined the hazelcast-kubernetes dependency (version 1.3.1) in my project's build.sbt file, and this dependency appeared on the classpath.
However, the logs on each pod show this error message:
com.orientechnologies.orient.server.distributed.ODistributedStartupException: Error on starting distributed plugin
Caused by: com.hazelcast.config.properties.ValidationException: There is no discovery strategy factory to create 'DiscoveryStrategyConfig{properties={service-dns=orientdbservice2.default.svc.cluster.local, service-dns-timeout=10}, className='com.hazelcast.kubernetes.HazelcastKubernetesDiscoveryStrategy', discoveryStrategyFactory=null}' Is it a typo in a strategy classname? Perhaps you forgot to include implementation on a classpath?
So it looks like the Hazelcast Kubernetes dependency is set up in the wrong way. How can this error be fixed?
Here is my config hazelcast.xml file:
<properties>
  <property name="hazelcast.discovery.enabled">true</property>
</properties>
<network>
  <join>
    <multicast enabled="false"/>
    <tcp-ip enabled="false"/>
    <discovery-strategies>
      <discovery-strategy enabled="true"
          class="com.hazelcast.kubernetes.HazelcastKubernetesDiscoveryStrategy">
        <properties>
          <property name="service-dns">orientdbservice2.default.svc.cluster.local</property>
          <property name="service-dns-timeout">10</property>
        </properties>
      </discovery-strategy>
    </discovery-strategies>
  </join>
</network>
For the cluster creation, I use StatefulSet with OrientDB image and mount all the config files as config maps. I am pretty sure that the problem is not in my config files as with multicast instead of the dns strategy everything works fine. Also, there are no network problems in the Kubernetes cluster itself.
First of all, the OrientDB version should be updated to the latest (3.0.10), which embeds the newest Hazelcast version. Also, I mounted the hazelcast-kubernetes.jar dependency file directly into the /orientdb/lib folder and it started to work properly. The HazelcastKubernetes plugin is discovered and the nodes join the cluster:
INFO [172.17.0.3]:5701 [orientdb-test-cluster-1] [3.10.4] Kubernetes Discovery activated resolver: DnsEndpointResolver [DiscoveryService]
INFO [172.17.0.3]:5701 [orientdb-test-cluster-1] [3.10.4] Activating Discovery SPI Joiner [Node]
INFO [172.17.0.3]:5701 [orientdb-test-cluster-1] [3.10.4] Starting 2 partition threads and 3 generic threads (1 dedicated for priority tasks) [OperationExecutorImpl]
Members {size:3, ver:3} [
Member [172.17.0.3]:5701 - hash
Member [172.17.0.4]:5701 - hash
Member [172.17.0.8]:5701 - hash
]
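A minimal sketch of getting the jar in place (the artifact version matches the 1.3.1 dependency mentioned in the question; the Maven Central URL follows the standard repository layout and the pod name is hypothetical):
# download the discovery plugin jar
curl -fLO https://repo1.maven.org/maven2/com/hazelcast/hazelcast-kubernetes/1.3.1/hazelcast-kubernetes-1.3.1.jar
# for a quick test, copy it into a running pod so it ends up in /orientdb/lib
# (for a permanent setup, bake it into the image or mount it via a volume)
kubectl cp hazelcast-kubernetes-1.3.1.jar orientdb-0:/orientdb/lib/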

GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt) while connecting Polybase with Kerberos

We want to connect our SQL Server 2016 Enterprise via PolyBase with our Kerberized on-prem Hadoop cluster running Cloudera 5.14.
I followed the Microsoft PolyBase guide to configure PolyBase. After working a few days on this topic I'm not able to continue because of an exception: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
Microsoft has a built-in diagnostic tool for troubleshooting the connectivity with PolyBase and Kerberos. In the troubleshooting guide from Microsoft there are 4 checkpoints, and I'm stuck on checkpoint 4.
Short information about the checkpoints (where I'm successful):
Checkpoint 1: Successful! Authenticated against the KDC and received a TGT.
Checkpoint 2: Successful! According to the troubleshooting guide, PolyBase makes an attempt to access HDFS and fails because the request did not contain the necessary Service Ticket.
Checkpoint 3: Successful! A second hex dump indicates that SQL Server successfully used the TGT and acquired the applicable Service Ticket for the name node's SPN from the KDC.
Checkpoint 4: Not successful. At this checkpoint, SQL Server should be authenticated by Hadoop using the ST (Service Ticket) and a session granted to access the secured resource.
krb5.conf file
[libdefaults]
default_realm = COMPANY.REALM.COM
dns_lookup_kdc = false
dns_lookup_realm = false
ticket_lifetime = 86400
renew_lifetime = 604800
forwardable = true
default_tgs_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
default_tkt_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
permitted_enctypes = aes256-cts-hmac-sha1-96 aes128-cts-hmac-sha1-96
udp_preference_limit = 1
kdc_timeout = 3000
[realms]
COMPANY.REALM.COM = {
kdc = ipadress.kdc.host
admin_server = ipadress.kdc.host
}
[logging]
default = FILE:/var/log/krb5/kdc.log
kdc = FILE:/var/log/krb5/kdc.log
admin_server = FILE:/var/log/krb5/kadmind.log
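(For completeness, with this krb5.conf in place the basic round-trip to the KDC and the encryption types of the cached tickets can be re-checked with the MIT tools; the principal name is just a placeholder:)
kinit some-user@COMPANY.REALM.COM   # obtains a TGT from ipadress.kdc.host
klist -e                            # -e shows the encryption type of each cached ticket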
core-site.xml for Polybase on SQL-Server
<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>ipc.client.connect.max.retries</name>
<value>2</value>
</property>
<property>
<name>ipc.client.connect.max.retries.on.timeouts</name>
<value>2</value>
</property>
<!-- kerberos security information, PLEASE FILL THESE IN ACCORDING TO HADOOP CLUSTER CONFIG -->
<property>
<name>polybase.kerberos.realm</name>
<value>COMPANY.REALM.COM</value>
</property>
<property>
<name>polybase.kerberos.kdchost</name>
<value>ipadress.kdc.host</value>
</property>
<property>
<name>hadoop.security.authentication</name>
<value>KERBEROS</value>
</property>
</configuration>
hdfs-site.xml for Polybase on SQL-Server
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.block.size</name>
<value>268435456</value>
</property>
<!-- Client side file system caching is disabled below for credential refresh and
settting the below cache disabled options to true might result in
stale credentials when an alter credential or alter datasource is performed
-->
<property>
<name>fs.wasb.impl.disable.cache</name>
<value>true</value>
</property>
<property>
<name>fs.wasbs.impl.disable.cache</name>
<value>true</value>
</property>
<property>
<name>fs.asv.impl.disable.cache</name>
<value>true</value>
</property>
<property>
<name>fs.asvs.impl.disable.cache</name>
<value>true</value>
</property>
<property>
<name>fs.hdfs.impl.disable.cache</name>
<value>true</value>
</property>
<!-- kerberos security information, PLEASE FILL THESE IN ACCORDING TO HADOOP CLUSTER CONFIG -->
<property>
<name>dfs.namenode.kerberos.principal</name>
<value>hdfs/_HOST#COMPANY.REALM.COM</value>
</property>
</configuration>
Polybase Exception
[2018-06-22 12:51:50,349] WARN 2872[main] - org.apache.hadoop.security.UserGroupInformation.hasSufficientTimeElapsed(UserGroupInformation.java:1156) - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
[2018-06-22 12:51:53,568] WARN 6091[main] - org.apache.hadoop.security.UserGroupInformation.hasSufficientTimeElapsed(UserGroupInformation.java:1156) - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
[2018-06-22 12:51:56,127] WARN 8650[main] - org.apache.hadoop.security.UserGroupInformation.hasSufficientTimeElapsed(UserGroupInformation.java:1156) - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
[2018-06-22 12:51:58,998] WARN 11521[main] - org.apache.hadoop.security.UserGroupInformation.hasSufficientTimeElapsed(UserGroupInformation.java:1156) - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
[2018-06-22 12:51:59,139] WARN 11662[main] - org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:676) - Couldn't setup connection for hdfs@COMPANY.REALM.COM to IPADRESS_OF_NAMENODE:8020
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
Log Entry on NameNode
Socket Reader #1 for port 8020: readAndProcess from client IP-ADRESS_SQL-SERVER threw exception [javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: Failure unspecified at GSS-API level (Mechanism level: AES128 CTS mode with HMAC SHA1-96 encryption type not in permitted_enctypes list)]]
Auth failed for IP-ADRESS_SQL-SERVER:60484:null (GSS initiate failed) with true cause: (GSS initiate failed)
The confusing part for me is the log entry from our NameNode, because AES128 CTS mode with HMAC SHA1-96 is already in the list of permitted enctypes, as shown in krb5.conf and in the Cloudera Manager UI.
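(One way to cross-check this, assuming access to the NameNode host, is to look at which encryption types the HDFS service keys actually carry; the keytab path is a placeholder and depends on the deployment:)
klist -kte /path/to/hdfs.keytab    # lists keytab entries together with their encryption types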
We appreciate your help!
The problem took care of itself after we restarted the cluster.
I think the problem was that the krb5.conf file in our Hadoop cluster could not be distributed to all nodes because of some running services. There was also a warning in Cloudera Manager about a stale configuration regarding Kerberos.
Many thanks to everyone!

Graphite server shows blank kafka metrics while using collectd JMX and java plugin

I am using the collectd Java and GenericJMX plugins to gather Kafka metrics and write them to a Graphite server. When I run the command to see Kafka metrics from the node it shows the data, but when I use the collectd plugin, blank metrics are exported. Any idea if I am missing some configuration? Below is the sample code snippet.
Try this:
<Plugin java>
  JVMARG "-Djava.class.path=/usr/share/collectd/java/collectd-api.jar:/usr/share/collectd/java/generic-jmx.jar"
  LoadPlugin "org.collectd.java.GenericJMX"

  <Plugin "GenericJMX">
    <MBean "kafka-all-messages">
      ObjectName "kafka.server:type=BrokerTopicMetrics,name=MessagesInPerSec"
      <Value>
        Type "counter"
        Table false
        Attribute "Count"
      </Value>
    </MBean>
    <Connection>
      ServiceURL "service:jmx:rmi:///jndi/rmi://localhost:2999/jmxrmi"
      Host "localhost"
      Collect "kafka-all-messages"
    </Connection>
  </Plugin>
</Plugin>
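Note that this only covers the JMX side: it assumes Kafka itself was started with JMX enabled on port 2999 (typically by exporting JMX_PORT=2999 in Kafka's environment) and that a write plugin such as write_graphite is loaded elsewhere in the collectd config. A quick check that the JMX port is reachable at all:
nc -zv localhost 2999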

Unable to start Terracotta

I'm having an issue when starting Terracotta. I grepped for the 37.139.24.150 IP across the whole system but couldn't find any file containing this IP; are there any other places to look? Also, I couldn't find tc-config.xml in Terracotta. It's actually an old system that was not installed/configured by me; I'm just starting Terracotta.
2015-03-12 13:02:09,737 [main] INFO com.terracottatech.dso - Statistics store: '/root/terracotta/server-statistics'.
2015-03-12 13:02:09,750 [main] INFO com.terracottatech.console - Available Max Runtime Memory: 490MB
2015-03-12 13:02:09,958 [main] INFO com.terracottatech.dso - Standard DSO Server created
2015-03-12 13:02:09,962 [main] INFO com.terracottatech.dso - Creating server nodeID: NodeID[37.139.24.150:9510]
2015-03-12 13:02:09,973 [main] ERROR com.terracottatech.console - Unable to find local network interface for 37.139.24.150
2015-03-12 13:02:09,975 [main] ERROR com.terracottatech.dso - Unable to find local network interface for 37.139.24.150
com.tc.exception.TCRuntimeException: Unable to find local network interface for 37.139.24.150
at com.tc.objectserver.impl.DistributedObjectServer.start(DistributedObjectServer.java:502)
at com.tc.server.TCServerImpl.startDSOServer(TCServerImpl.java:531)
at com.tc.server.TCServerImpl.access$600(TCServerImpl.java:92)
at com.tc.server.TCServerImpl$StartAction.execute(TCServerImpl.java:479)
at com.tc.lang.StartupHelper.startUp(StartupHelper.java:39)
at com.tc.server.TCServerImpl.startServer(TCServerImpl.java:510)
at com.tc.server.TCServerImpl.start(TCServerImpl.java:271)
at com.tc.server.TCServerMain.main(TCServerMain.java:30)
I made it work. I created a new tc-config.xml and started the server with ./start-tc-server.sh -f /home/tomcat/terracotta/latest/terracotta/bin/tc-config.xml &
<?xml version="1.0" encoding="UTF-8"?>
<!-- All content copyright Terracotta, Inc., unless otherwise indicated. All rights reserved. -->
<tc:tc-config xsi:schemaLocation="http://www.terracotta.org/schema/terracotta-5.xsd"
xmlns:tc="http://www.terracotta.org/config"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<servers>
<!-- Sets where the Terracotta server can be found. Replace the value of host with the server's IP address. -->
<server host="<my-server-ip>" name="localhost">
<data>/home/tomcat/terracotta/server-data</data>
<logs>/home/tomcat/terracotta/server-logs</logs>
<statistics>/home/tomcat/terracotta/server-statistics</statistics>
</server>
<!-- If using more than one server, add an <ha> section. -->
<ha>
<mode>networked-active-passive</mode>
<networked-active-passive>
<election-time>5</election-time>
</networked-active-passive>
</ha>
</servers>
<!-- Sets where the generated client logs are saved on clients. Note that the exact location of Terracotta logs on client machines may vary based on the value of user.home and the local disk layout. -->
<clients>
<logs>/opt/terracotta/client-logs</logs>
</clients>
</tc:tc-config>
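The original error just means that 37.139.24.150 is not bound to any local interface; a quick way to see which addresses are actually available on the machine (so the host attribute above can be set to one of them):
ip addr | grep 'inet '    # addresses configured on local interfaces
hostname -i               # address the hostname resolves to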