Classpath is empty error when running zookeeper instance - apache-kafka

I am trying to follow the instructions on https://kafka.apache.org/quickstart to start a Kafka install and then send some messages from a Scala client.
I am using a Windows system.
I am getting this error (see screencap) when I run the ZooKeeper instance.

The most probable reason is that your directory path has a space ("Development Tools"). Try running this from a path that has no spaces; the space is likely breaking the path handling in the shell script.
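For example, assuming the archive were unpacked to a hypothetical space-free location such as C:\kafka, starting ZooKeeper would look something like this:

cd C:\kafka
:: Start ZooKeeper with the Windows batch script from the binary distribution
bin\windows\zookeeper-server-start.bat config\zookeeper.properties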
Also, I assume that you downloaded the binary and not the source files?
Hope it works and let us know.

Related

Kafka Connect on Windows failed to find the connector class

I have installed Kafka on my Windows machine and everything worked properly until I added a connector configuration file for the Debezium MySQL connector. Everything I found on Google about this problem points to plugin.path, so I tried all the possible workarounds to make it work, in vain: moving the jar folders to literally c:/mypluginfolder, switching between "\" and "/" in my paths, using absolute paths, adding the directory to the classpath, etc. The logs even say that some of the Debezium plugins are being added before it crashes, so technically the server sees that path. Help a fellow out, I've been at a loss for more than 2 weeks. Thank you.
The cmd output: https://controlc.com/880d73f2
my standalone.props and connector: https://controlc.com/4ee0164f
PS: Sorry for the controlc links, I don't know how to format questions; I'm new here.
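For reference, the kind of configuration I am trying looks roughly like this (the folder and file names below are placeholders for my actual setup):

:: In config\connect-standalone.properties, point plugin.path at the folder holding the Debezium jars
:: (forward slashes avoid escaping problems on Windows):
::   plugin.path=C:/mypluginfolder
:: Then start Connect in standalone mode from the Kafka root directory:
bin\windows\connect-standalone.bat config\connect-standalone.properties config\debezium-mysql.properties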

Kafka does not start, blank output

I'm working on installing Kafka and ZooKeeper.
I have already started ZooKeeper and it is currently running.
I set up everything as in [https://dzone.com/articles/running-apache-kafka-on-windows-os]
but when I finally run this in my cmd,
.\bin\windows\kafka-server-start.bat .\config\server.properties
there is no output; it just moves on to the next command prompt.
Please help me out.
Finally I find someone with the same issue I had! ZooKeeper running, but Kafka not doing anything at all except returning to the next line with no log, error, or anything. I don't know if the cause is the same, but the solution for me, oddly enough, was to download and open Cygwin and run the command exactly as you have it, except with all the \s flipped to /s, and it worked.
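For anyone trying the same thing, the Cygwin invocation looks roughly like this (run from the Kafka root directory; adjust the paths to your own layout):

# Same batch script, but with forward slashes so Cygwin resolves the path
./bin/windows/kafka-server-start.bat ./config/server.properties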
After a lot of searching, this is the way I solved it.
You have to add the following to the user PATH environment variable:
%SystemRoot%\System32\Wbem;%SystemRoot%\System32\;%SystemRoot%
Add it to the user %PATH% environment variable, not to the system %PATH% environment variable.
This question was already answered on this page:
Kafka server not returning anything
Solution that worked for me:
Create the logs folder and reference it in server.properties; Kafka will not create the folder automatically.
Then go to your cmd and run kafka-server-start.bat D:\<pathofkafka>\config\server.properties
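For illustration, assuming Kafka were unpacked to a hypothetical C:\kafka (adjust to your own path), the steps would look roughly like this:

:: Create the log folder yourself, since the broker will not create it
mkdir C:\kafka\kafka-logs
:: In config\server.properties, point log.dirs at that folder, e.g.:
::   log.dirs=C:/kafka/kafka-logs
:: Then start the broker
bin\windows\kafka-server-start.bat C:\kafka\config\server.properties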
Thanks!

Divolte-collector with MAPR, Storm, Kafka and Cassandra

I am not sure if I can get help for this on here, but I thought it was worth a try.
I have a 3-node cluster on AWS running MapR M3, and I have installed Storm, Kafka, Divolte Collector and Cassandra. I would like to try some of the clickstream examples, and I am running into an issue with the tcp-consumer example. Also, being quite new to Java and distributed processing, I have some clarification questions. Again, I am not quite sure where to post this, because I feel this is Divolte Collector specific and I also have some gaps in my understanding of the Javadoc concept and of building and running jar files; but I figured someone could point me to some resources or help with some clarifications. I can't get the JSON string to appear in the console running the netcat socket listening for clicks:
Divolte tcp-kafka-consumer example
Everything works until the netcat part (step 7), and my knowledge gap is with step 6.
Step 1: install and configure Divolte Collector
Install works and the hello world click collection is promising :-)
Step 2: download, unpack and run Kafka
# In one terminal session
cd kafka_2.10-0.8.1.1/bin
./zookeeper-server-start.sh ../config/zookeeper.properties
# Leave Zookeeper running and in another terminal session, do:
cd kafka_2.10-0.8.1.1/bin
./kafka-server-start.sh ../config/server.properties
No errors, plus I tested the Kafka examples, so that seems to be working as well.
Step 3: start Divolte Collector
Go into the bin directory of your installation and run:
cd divolte-collector-0.2/bin
./divolte-collector
Step 3 went without a hitch; I can load the default divolte-collector test page.
Step 4: host your Javadoc files
Set up an HTTP server that serves the Javadoc files that you generated or downloaded for the examples. If you have Python installed, you can use this:
cd <your-javadoc-directory>
python -m SimpleHTTPServer
OK, so I can reach the Javadoc pages.
Step 5: listen on TCP port 1234
nc -kl 1234
Note: when using netcat (nc) as a TCP server, make sure that you configure the Kafka consumer to use only 1 thread, because nc won't handle multiple incoming connections.
Tested netcat by opening the port and sending messages, so I figured I don't have any port issues on AWS.
Step 6: run the example
cd divolte-examples/tcp-kafka-consumer
mvn clean package
java -jar target/tcp-kafka-consumer-*-jar-with-dependencies.jar
Note: for this to work, you need to have the avro-schema project installed into your local Maven repository.
I installed the avro-schema project with mvn clean install in the avro project that comes with the examples, as per the instructions here.
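For reference, the install step was roughly this (the directory name may differ in your checkout of the examples):

cd divolte-examples/avro-schema
mvn clean install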
Step 7: click around and check that you see events being flushed to the console where you run netcat
When you click around the Javadoc pages, your console should show events in JSON format similar to this:
I don't see the clicks in my netcat window :(
Investigating the issue, I viewed the console and network tabs using the Chrome developer tools; it seems Divolte is running, but I am not sure how to dig further. This is the console view. Any ideas or pointers?
Thanks anyway
Initializing Divolte.
divolte.js:140 Divolte base URL detected http://ec2-x-x-x-x.us-west-x.compute.amazonaws.com:8290/
divolte.js:280 Divolte party/session/pageview identifiers ["0:i6i3g0jy:nxGMDVdU9~f1wF3RGqwmCKKICn4d1Sb9", "0:i6qx4rmi:IXc1i6Qcr17pespL5lIlQZql956XOqzk", "0:6ZIHf9BHzVt_vVNj76KFjKmknXJixquh"]
divolte.js:307 Module initialized. Object {partyId: "0:i6i3g0jy:nxGMDVdU9~f1wF3RGqwmCKKICn4d1Sb9", sessionId: "0:i6qx4rmi:IXc1i6Qcr17pespL5lIlQZql956XOqzk", pageViewId: "0:6ZIHf9BHzVt_vVNj76KFjKmknXJixquh", isNewPartyId: false, isFirstInSession: false…}
divolte.js:21 Signalling event: pageView 0:6ZIHf9BHzVt_vVNj76KFjKmknXJixquh0
allclasses-frame.html:9 GET http://ec2-x-x-x-x.us-west-x.compute.amazonaws.com:8000/resources/fonts/dejavu.css
overview-summary.html:200 GET http://localhost:8290/divolte.js net::ERR_CONNECTION_REFUSED
(Intro: I work on Divolte Collector)
It seems that you are running the example on an AWS instance somewhere. If you are using the pre-packaged Javadoc files that come with the examples, they have the Divolte location hard-coded as http://localhost:8290/divolte.js. So if you are running somewhere other than localhost, you should probably create your own Javadoc for the example, using the correct hostname for the Divolte Collector server.
You can do so using this command. Be sure to run it from the directory where your source tree is rooted, and of course change localhost to the hostname where you are running the collector.
javadoc -d YOUR_OUTPUT_DIRECTORY \
-bottom '<script src="//localhost:8290/divolte.js" defer async></script>' \
-subpackages .
As an alternative, you could also just try to run the examples locally first (possibly in a virtual machine, if you are on a Windows machine).
It doesn't seem there is anything MapR specific about the issue that you are seeing so far. The Kafka based examples and pipeline should work in any environment that has the required components installed; this doesn't touch MapR-FS or anything else MapR specific. Writing to the distributed filesystem is another story.
We don't currently compile Divolte Collector against MapR Hadoop, but incidentally I have given it a run on the MapR sandbox VM. When installing from the RPM distribution, create /etc/divolte/divolte-env.sh with the following env var setting:
HADOOP_CONF_DIR=/usr/share/divolte/lib/guava-18.0.jar:/usr/share/divolte/lib/avro-1.7.7.jar:$(hadoop classpath)
Obviously this is a bit of a hack to get around classpath peculiarities and we hope to provide a distribution compiled against MapR that works out of the box in the future.
Also, you need Java 8 to run Divolte. If you install this from the Oracle RPM, add the proper JAVA_HOME to divolte-env.sh as well, e.g.:
JAVA_HOME=/usr/java/jdk1.8.0_31
With these settings I'm able to run the server, collect Avro files on MapR-FS, create an external Hive table on those files and run a query.

Starting warden after ZooKeeper on MapR

I am installing MapR and I am stuck at starting warden after starting ZooKeeper on a single node.
# service mapr-warden start
Error: warden can not be started. See /opt/mapr/logs/warden.log for details
There is no detail in that file. Does anybody have a hint? Thanks =)
If you aren't getting anything in warden.log, then it's likely that the warden JVM is never even being started by the mapr-warden init script.
In some MapR versions, the mapr-warden init script will log some details into /opt/mapr/logs/wardeninit.log. You can try checking there.
However, I will also caution that currently the logging done by the init script is sparse and not necessarily user friendly to read. If you can't discern the cause from the contents of the wardeninit.log you can post them here and maybe I can help.
Another thing you can do is edit /etc/init.d/mapr-warden and add "set -x" towards the top of the file, right before the "BASEMAPR=" line, then try starting warden again and you'll get a bunch of shell debugging output on your screen. If you copy and paste that output here that should be enough to tell the root cause of the problem.
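For illustration, the change would look something like this (file locations as in a typical package install):

# In /etc/init.d/mapr-warden, just before the BASEMAPR= line, add:
set -x
# Then retry and capture the shell debug output:
service mapr-warden start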
One more thing to mention: you may be better off using the http://answers.mapr.com forum, as that is MapR specific and I think there may be more users there who could help.
Was configure.sh (/opt/mapr/server/configure.sh -C nodeA -Z nodeA) run on the node? Did ZooKeeper come up successfully?
service mapr-zookeeper status
Even when using MapR on a single node, configure.sh is still required. In fact, without configure.sh, warden, ZooKeeper, CLDB and other MapR components will lack their configuration and in many cases will fail to start.
You must run configure.sh after installing the software packages (deb or rpm).
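As a rough sketch, the single-node sequence would be something like this (nodeA stands for your own hostname):

/opt/mapr/server/configure.sh -C nodeA -Z nodeA
service mapr-zookeeper start
service mapr-zookeeper status
service mapr-warden start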

Unable to view any folders in DFS Locations when connecting to Hadoop from Eclipse

I have set up Hadoop 1.2.1 on Windows with Cygwin installed.
I have started the sshd service.
I have also started the namenode, datanode and mapreduce (job tracker, task tracker). I am able to see the namenode, datanode and mapreduce running status through the following URLs.
When I try connecting to Hadoop through Eclipse, I am able to. Though I can connect to Hadoop from Eclipse, I do not see any folders when opening DFS Locations. It displays as (0) (refer to Pic #1), which I guess means no directories/files are available. I checked the same with the namenode storage (refer to Pic #2).
Even when I create a directory through the Cygwin terminal (refer to Pic #4), I am not able to see it in DFS Locations in the Eclipse environment.
That being said, I tried the WordCount example, setting the input path and output path as follows:
// specify input and output dirs
FileInputFormat.addInputPath(conf, new Path("Input"));
FileOutputFormat.setOutputPath(conf, new Path("Output"));
When I run that against the HDFS location from Eclipse, I get the following exception:
13/10/30 06:52:44 ERROR security.UserGroupInformation: PriviledgedActionException as:Administrator cause:org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://localhost:47110/user/Administrator/Input
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://localhost:47110/user/Administrator/Input
Questions:
Why am I not able to see the directory that I created through the Cygwin terminal, or any folders for that matter?
What does "hdfs://localhost:47110" point to?
Am I getting the above exception because it doesn't see the directory in the datanode?
What input path should I set?
Please advise me on this.
Thanks in advance.
First, you should check all the settings of your Hadoop cluster from scratch, because this problem suggests that you have not configured Eclipse properly for the Hadoop cluster.
See the following link, which should help you:
https://www.youtube.com/watch?v=TavehEdfNDk
Also check whether your DFS is connected to your cluster, i.e. whether you are able to store files in your DFS or not; a quick sanity check is sketched below.
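For example (the paths below just mirror the error message in the question, and somelocalfile.txt is a placeholder), you can create and inspect the missing input directory from the Cygwin terminal:

# Create the input directory the WordCount job expects, upload a file, and list it to confirm DFS access
bin/hadoop fs -mkdir /user/Administrator/Input
bin/hadoop fs -put somelocalfile.txt /user/Administrator/Input
bin/hadoop fs -ls /user/Administrator/Input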