"Not A Valid Jar" When trying to run Map Reduce Job - eclipse

I am trying to run my MapReduce job by building a jar from Eclipse, but when I execute the job I get a "Not a valid Jar" error.
I have tried to follow the linked question "Not a valid Jar", but that didn't help.
Can anyone please give me instructions on how to build the jar from Eclipse so that it runs on Hadoop?
I am aware of the general process of building a jar file from Eclipse; however, I am not sure whether I need to take any special care when building it so that it runs on Hadoop.

When you submit the command, make certain the following things are right on the command line:
When you specify the jar, make certain the path to it is correct. The easiest way to be sure is to use the absolute path. To get it, navigate to the directory that contains the jar and run 'readlink -f' on the jar file. So for you, not just hist.jar, but maybe /home/akash_user/jars/hist.jar or wherever it is on your system. If you are using Eclipse, it may be saving the jar somewhere unexpected, so make sure that is not the problem. The jar cannot be run from HDFS storage; it must be run from local storage.
When you name your main class (Histogram in your example), you must use the fully qualified name of the class, that is, the package name followed by the class name. So if your classes HistogramDriver, HistogramMapper and HistogramReducer live in a package named Histogram and your main() is in HistogramDriver, you need to type Histogram.HistogramDriver to get the program running. (Unless you made your jar runnable, which requires extra setup: a manifest file, MANIFEST.MF, with a Main-Class entry.)
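As a rough sanity check before submitting, the two points above can be tested from plain Java. This is only a sketch: the class name JarCheck and its methods are made up for illustration, and it does not reproduce the checks Hadoop itself performs. It only confirms that the path names a regular local file that opens as a JAR archive, and reports the Main-Class from the manifest if there is one.

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, not part of Hadoop.
public class JarCheck {

    /** Returns true if the path names an existing local file that opens as a JAR/ZIP archive. */
    public static boolean isReadableLocalJar(String path) {
        File file = new File(path);
        if (!file.isFile()) {
            return false; // missing, a directory, or not on the local file system
        }
        try (JarFile jar = new JarFile(file)) {
            return true;
        } catch (IOException e) {
            return false; // exists, but is not a readable archive
        }
    }

    /** Returns the Main-Class manifest entry, or null if the jar is not runnable on its own. */
    public static String mainClassOf(String path) throws IOException {
        try (JarFile jar = new JarFile(path)) {
            Manifest manifest = jar.getManifest();
            return manifest == null ? null : manifest.getMainAttributes().getValue("Main-Class");
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the jar path as the first argument; hist.jar is the name from the question.
        String path = args.length > 0 ? args[0] : "hist.jar";
        System.out.println("Absolute path: " + new File(path).getAbsolutePath());
        if (isReadableLocalJar(path)) {
            System.out.println("Main-Class in manifest: " + mainClassOf(path));
        } else {
            System.out.println("Not a readable local jar");
        }
    }
}
```

If the manifest has no Main-Class, the fully qualified driver class has to be given on the command line after the jar path.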

Make sure that the jar you are submitting (hist.jar) is in the current directory from where you are submitting the 'hadoop jar' command.
If the issue still persists, please tell us which Java, Hadoop and Linux versions you are using.

You should not keep the jar file in HDFS when executing the MapReduce job. Make sure the jar is available on a local path. Only the input path and the output directory should be HDFS paths.

Related

integrate newrelic in flink scala project

I want to integrate newrelic in my flink project. I have downloaded my newrelic.yml file from my account and have changed the app name only and I have created a folder named newrelic in my project root folder and have placed newrelic.yml file in it.
I have also placed the following dependency in my build.sbt file:
"com.newrelic.agent.java" % "newrelic-api" % "3.0.0"
I am using the following command to run my jar:
flink run -m yarn-cluster -yn 2 -c Main /home/hadoop/test-assembly-0.2.jar
I guess my code is not able to read my newrelic.yml file, because I can't see my app name in New Relic. Do I need to initialize the New Relic agent somewhere (and if yes, how)? Please help me with this integration.
You should only need the newrelic.jar and newrelic.yml files to be accessible and have -javaagent:path/to/newrelic.jar passed to the JVM as an argument. You could try putting both newrelic.jar and newrelic.yml into your lib/ directory so they get copied to the job & task managers, then adding this to your conf/flink-conf.yaml:
env.java.opts: -javaagent:lib/newrelic.jar
Both New Relic files should be in the same directory and you ought to be able to remove the New Relic line from your build.sbt file. Also double check that your license key is in the newrelic.yml file.
I haven't tested this, but the main goal is for the .yml and .jar to be accessible in the same directory (the yml can go into a different directory, but then other JVM arguments will need to be passed to reference it) and to pass -javaagent:path/to/newrelic.jar as a JVM argument. If you run into issues, try checking for New Relic logs in the log folder of the directory where the .jar is located.

Editing Spark Module in Spark-kernel

We are currently editing a specific module in Spark. We are using spark-kernel https://github.com/ibm-et/spark-kernel to run all our spark jobs. So, what we did is compile again the code that we have edited. This produces a jar file. However, we do not know how to point the code to the jar file.
It looks like it is referencing again to the old script and not to the newly edited and newly compiled one. Do you have some idea on how to modify some spark packages/modules and reflect the changes with spark-kernel? If we're not going to use spark-kernel, is there a way we can edit a particular module in spark for example, the ALS module in spark: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala. Thanks!
You likely edited a Scala or Java file and recompiled (even though you call them scripts, they are not scripts in the strict sense because they are not interpreted). Assuming that's what you did....
You probably then don't have a clean replacement of the resulting JAR file in the deployment you are testing. Odds are your newly compiled JAR file is somewhere, just not in the somewhere you are observing. To get it there properly, you will have to build more than the JAR file, you will have to repackage your installable and reinstall.
Other techniques exist, if you can identify the unpacked item in an installation, sometimes you can copy it in place; however, such a technique is inherently unmaintainable, so I recommend it only on throw away verification of the change and not on any system that will be used.
Keep in mind that with Spark, the worker nodes are sometimes deployed dynamically. If that is so, you might have to locate the installable used by the dynamic deployment system and ensure you have the right packaging there too.

Is it possible to save settings and load resources when compiling to just one standalone exe?

If I compile a script for distribution as a standalone exe, is there any way I can store settings within the exe itself, to save having to write to an external file? The main incentive for this is to save having to develop an installation process. I only need to store a few bytes.
Also, can resources such as images be compiled into the exe?
Using alternate data streams opens up a can of worms, so I wouldn't go that way. Writing config data back into the exe itself won't work, as the file is locked for write access during execution.
What I usually do is store config data under %A_AppData%\%A_ScriptName%\%A_ScriptName%.ini
When the script starts I use IniRead, which also provides a default value if the key isn't found, which is the case when the script is executing for the first time.
The complementing IniWrite calls in an OnExit subroutine/function will create the ini file if necessary.
This way no installation is needed and the config is stored in the proper, familiar place.
The autohotkey forum has dealt with this question before.
In that case, the user didn't want extra files -- period.
The method was to use the file system to save alternate data.
Unfortunately I can't find the post.
A simpler method is to use the FileInstall command.
When the script is compiled, the external file is stored within the exe.
When the compiled exe executes the same command, the file is copied to the same directory as the running script. It is a simple yet effective 'install'.
If the script first tests whether the config file already exists, the FileInstall command can be skipped.
Skipping FileInstall in that case means changes made to the configuration after 'installation' are not overwritten.
I have not tried saving settings within the compiled exe file, but I have included resources. I'm not sure which version of AHK you're using or how you are compiling, but I can right-click my scripts to compile. There's an option to compile with options, where you can include resources in your compiled exe.

How to distribute my Java program so that it is runnable by double-clicking a single file?

I have a Java rich client desktop app that I want to distribute on some computers at work, but I've never done something like this before. People aren't too computer-savvy at my workplace, and since it is a student job, I won't be there for much longer, so I'd like to make my program easy to run by letting people double-click on it.
I also don't want to have to manually install a JRE to have it run. Basically, what I'd like to know is how to make my java application runnable easily by double-clicking (even if it's only on windows, it's okay). I'm pretty sure I'm going to need to package the correct JRE version alongside, but I don't know what's the correct way of doing this.
I read on some sites that you should not package a JRE along with your program because it makes people have multiple different versions, some of which are outdated, and it causes security issues, but this is not a problem in this case since the computers that are going to run my application are not connected to the internet and are only used to run this program anyway.
Somewhat related question: Since my application is currently an Eclipse project, I get my resources such as icons, images, SQLite database (for read and write), etc. using relative paths (e.g.: img/test.png).
Am I going to have to change any of those paths to have them keep working even while packaged?
What you're looking for is a JAR file. In Eclipse, it's quite easy to make one: right-click on your project, go to Export, and then select "Runnable JAR file". Be careful with paths to folders. You may need to keep a resources folder next to the JAR file, that is, in the same folder on your computer as the JAR. You may need to provide some more specifics to get an exact answer on that.
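One way to keep relative paths such as img/test.png working with a resources folder next to the JAR is to resolve them against the directory that contains the running JAR instead of the current working directory. The following is a minimal sketch under that assumption; the class name AppPaths and its methods are hypothetical, and when the program is started from an IDE the code location is the compiled-classes directory rather than a JAR.

```java
import java.net.URISyntaxException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration only.
public class AppPaths {

    /** Directory containing the running JAR, or the classes directory itself when run unpackaged. */
    public static Path appDir() throws URISyntaxException {
        Path location = Paths.get(
                AppPaths.class.getProtectionDomain().getCodeSource().getLocation().toURI());
        return Files.isDirectory(location) ? location : location.getParent();
    }

    /** Resolves a relative resource path (for example img/test.png) against a base directory. */
    public static Path resolve(Path baseDir, String relative) {
        return baseDir.resolve(relative).normalize();
    }

    public static void main(String[] args) throws URISyntaxException {
        Path icon = resolve(appDir(), "img/test.png");
        System.out.println(icon + " exists: " + Files.exists(icon));
    }
}
```

Files that the application writes to, such as the SQLite database, have to stay outside the JAR in any case, so resolving them this way keeps reads and writes pointing at the same place regardless of where the user launches the program from.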
A better option for easy install of a Java app. with a GUI is to launch it using Java Web Start. For the user, JWS is the 'one click' installation option that can (install & launch the app. then) add desktop shortcuts and menu items. A JWS launch would mean some more work for you, but it is a breeze for the end user.
To ensure a suitable JRE is present to run the app., use deployJava.js (see the JWS link for more details). The script would need to be reconfigured to get the JRE installer from your local network - the default is to get it from Oracle.
Most of the resources should be packaged in Jar files and supplied along with the app., but for the DB, use the JNLP ExtensionInstallerService to call the DB installer.
..Java Web Start is kind of a link (or I can make it a shortcut on the desktop) that the users will click to either install the JRE and run the program if the JRE isn't installed, or just run the program if the JRE is present on the computer.
The way it would work is to have a web page on the local intranet. When the user visits the page, the script checks for a suitable JRE.
If it is present, it writes the link to the launch file.
If there is no JRE, or the version is too low, it will guide the user through installing it (just a matter of them clicking 'OK' when prompted). Then it will put the link to the app.
I can then configure the link to grab the JRE from the server on our network.
That's the part where you need to reconfigure the script. AFAIR the script exposes an URL at which to look for JREs - that can be changed to point to a place on the intranet.
..So "Web" is only just in the name, the computers don't have to be connected to the internet to have this work, right?
Yes. JWS is a great launch technology for Java rich clients, but is a poorly chosen name.
To make the program run by double-clicking it, you can distribute it as a jar file or as a batch file that calls the jar file.
For the installation part, you can make a batch file that checks whether Java is present and calls the installer if it isn't.
Edit:
The batch code:
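rem Try to run java; ERRORLEVEL is non-zero if it is not on the PATH
java -version >nul 2>&1
IF NOT ERRORLEVEL 1 GOTO ok
rem Java was not found, so run the installer first
java-installer.exe
GOTO end
:ok
java -jar your-application.jar
:end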
If you are finding it tough to implement the methods mentioned above, you can proceed with this simple approach.
Create a folder named lib at some location. Place all the jars that your application uses into it. If you are able to create a jar for your application, you can place your application.jar into the lib folder too. Create a batch file at the same location that contains the java command for your main class. The text within your batch file might look something similar to this:
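rem Do not overwrite PATH, or Windows can no longer find java.exe
rem %~dp0 is the folder containing this batch file; lib\* puts every jar in lib on the classpath
set "APP_CP=%~dp0lib\*"
java -cp "%APP_CP%" package1.package2.MainClass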
If you have any other dependencies, for example images that your code loads from img/icon.jpg, then you just have to move the img folder to this location too.
Zip these files (using WinRAR, for example) and share the archive. Running the batch file after extracting the zip launches your Java MainClass irrespective of where it is placed on the client system.
PS: If you are unable to create a jar for your application and place it in the lib folder, just copy your bin folder with the class files to the same location and change the batch file accordingly so that it looks for classes inside bin.

Environment.CurrentDirectory with NUnit GUI differs to the TeamCity value, how can I sync them?

As above really, I have some integration tests that use files from a relative file path. To help picture it here is the file structure:
/Dependencies
/VideoTests/bin/release/video.dll
/SearchTests/bin/release/search.dll
/OtherProjects
The GUI runs the tests from the root; however, when TeamCity runs the tests, it runs them from each test DLL's bin directory. I don't mind which one follows the other, but I do need them to be the same, otherwise my relative paths just won't work!
Any ideas?
P.S. Using TeamCity 5.0 and NUnit 2.5.
You probably don't want to rely on CurrentDirectory. I'd suggest reading the doc, but the main point you'll want to take away is that the CurrentDirectory is where the .exe was started from: it could be any path in the system. For example, let's assume your users add your .exe (or whatever .exe uses your DLLs) to their path. They could then navigate to c:\foo\bar and start the .exe from there, which would set the CurrentDirectory to "C:\foo\bar" and you may not be able to deal with that.
I think it would be preferable for you to rework whatever you're doing so you don't rely on CurrentDirectory. What problems are you encountering by relying on CurrentDirectory right now?
Have you made sure that both TeamCity and NUnit are using the same working directory when starting the application?
And if they aren't, you could adjust the current directory in the test code.