I have successfully installed Hadoop on Ubuntu and it is running well. Now I want to run a sample MapReduce job from Eclipse, connecting to the Hadoop installation.
It would be really great if someone could help sort this out.
Thanks
Rajesh
You can export the Eclipse MapReduce code as a JAR, place it on the local file system, and run the following command:
hadoop jar <jarFileName> [<argumentsToBePassed>]
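For example, assuming the exported JAR is named wordcount.jar with a main class WordCount (both names are placeholders), and the input path exists in HDFS while the output path does not yet exist:
hadoop jar wordcount.jar WordCount /user/rajesh/input /user/rajesh/output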
I am trying to use Snowpark (0.6.0) via Jupyter notebooks (after installing the Scala Almond kernel). I am using a Windows laptop and had to adjust the examples a bit to work around Windows. I am following the documentation here:
https://docs.snowflake.com/en/developer-guide/snowpark/quickstart-jupyter.html
Ran into this error
java.lang.NoClassDefFoundError: Could not initialize class com.snowflake.snowpark.Session$
ammonite.$sess.cmd5$Helper.<init>(cmd5.sc:6)
ammonite.$sess.cmd5$.<init>(cmd5.sc:7)
ammonite.$sess.cmd5$.<clinit>(cmd5.sc:-1)
I also tried earlier with the IntelliJ IDE and got a bunch of errors about missing dependencies for log4j etc.
Can I get some help?
I have not set it up on Windows, only on Linux.
You have to do the setup steps for each notebook that is going to use Snowpark (apart from installing the kernel).
It's important to make sure you are using a unique folder for each notebook, as in step 2 in the guide.
What was the output of import $ivy.`com.snowflake:snowpark:0.6.0`?
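For reference, a minimal sketch of the per-notebook setup from that guide, in Scala for the Almond kernel; the connection values below are placeholders, not real settings:
import $ivy.`com.snowflake:snowpark:0.6.0`
import com.snowflake.snowpark._
import com.snowflake.snowpark.functions._
// Placeholder connection properties - replace with your own account details
val configs = Map(
  "URL" -> "https://<account>.snowflakecomputing.com",
  "USER" -> "<user>",
  "PASSWORD" -> "<password>",
  "ROLE" -> "<role>",
  "WAREHOUSE" -> "<warehouse>",
  "DB" -> "<database>",
  "SCHEMA" -> "<schema>"
)
val session = Session.builder.configs(configs).create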
I recently installed PDI 8.2 CE, and it doesn't look like it comes with the MongoDB input and output steps...
I found the plugin here: https://github.com/pentaho/pentaho-mongodb-plugin. I unzipped it, put the entire folder into the ../data-integration/plugins directory, and restarted PDI, but there is still no MongoDB input/output.
What am I doing wrong?
You do not need to add any additional plugins; under the Big Data section you will find both the input and output steps for MongoDB.
Figured this out. It appears that PDI is very picky about which versions of the JDK/JRE you're running on your machine, regardless of what you have set as PENTAHO_JAVA_HOME.
I had to uninstall all of them and then ensure that only OpenJDK 1.8 was installed.
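For anyone hitting the same issue, a quick sanity check; the JDK path below is a typical Ubuntu location and only an example, adjust it to wherever OpenJDK 8 lives on your machine:
java -version    # should report 1.8.x
export PENTAHO_JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64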
To state the setup up front: I have recently installed Spark on Ubuntu (VMware Workstation). Below are my PC specs.
Windows Dell laptop (running Windows 10).
Installed VMware Workstation Pro 12 and loaded Ubuntu 15 on it.
Installed Spark 1.6.1, Java 1.7, Python 2.7, and Scala 2.11.8 using the standard scripts.
I ran a sample program using the spark-submit command and it completes fine. But when I try to launch the pyspark shell, I get the error message "pyspark: command not found".
What could the problem be? I can see all the files in Spark's bin directory (both pyspark and spark-submit).
Regards
VVSAUCSE
Not sure if this is how it is supposed to work, but the full path with the command name works fine. My apologies for not trying that out earlier.
/SparkPath/bin/pyspark works fine rather than just pyspark
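If you'd rather not type the full path each time, the Spark bin directory can be added to PATH; a sketch using the same /SparkPath placeholder as above:
export SPARK_HOME=/SparkPath
export PATH=$PATH:$SPARK_HOME/bin
pyspark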
I'm trying to install Spark on my Ubuntu machine. I have installed sbt and Scala, and I'm able to view their versions. But when I try to build Spark using the 'sbt/sbt assembly' command, I get the error below.
'bash: sbt/sbt: No such file or directory'
Can you please let me know where I am making a mistake? I have been stuck on this since yesterday.
Thank you for the help in advance.
You may have downloaded the pre-built version of Spark. If it is pre-built, you don't need to run the build tool command (sbt), and it won't be available.
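One quick way to tell, assuming the archive was extracted to ~/spark (a placeholder path): a pre-built package ships the compiled jars and launch scripts but no sbt/ directory, so it can be started directly:
ls ~/spark               # pre-built packages have bin/ and lib/ but no sbt/
~/spark/bin/spark-shell  # works immediately on a pre-built distribution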
I downloaded Apache Kafka and successfully completed the quickstart. When I tried running the example program provided, I got an error:
cannot load main class.
I don't know what I'm missing; I even set the classpath.
Run this command if you are on Amazon EC2:
sudo yum install java-1.6.0-openjdk-devel
javac must be available. For that you need the JDK, not just the JRE; you probably only have the JRE installed.
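After installing the JDK you can confirm the compiler is on the PATH:
which javac      # should print a path such as /usr/bin/javac
javac -version   # should report the installed JDK version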