Selectively include dependencies in JAR - scala

I have a library that I wrote in Scala that uses Bouncy Castle and has a whole bunch of dependencies. When I roll a jar, I can either roll a "fat" jar that has all the dependencies (including scala), which weighs in around 19 MB, or I can roll a skinny jar, which doesn't have dependencies, but is only a few hundred KB.
The problem is that I need to include the Bouncy Castle classes/jar with my library, because if it's not on the classpath at runtime, all kinds of exceptions get thrown.
So, I think the ideal situation is if there is some way that I can get either Maven or SBT to include some but not all dependencies in the jar that gets rolled. Some dependencies are needed at compile-time, but not at run time, such as the Scala standard libraries. Is there some way to get that to happen?
Thanks!

I would try out the sbt proguard plugin from https://github.com/nuttycom/sbt-proguard-plugin . It should be able to weed out the classes that are not in use.

If it is sufficient to explicitly define which dependencies should be added (on the artifact level, i.e., single JARs), you can define an assembly (in the case of a single project) or an additional assembly project (in the case of a multi-module project). Assembly descriptors can explicitly exclude/include artifacts from the dependencies.
Here is some good documentation on this topic (section 8.5.4), and here is the official documentation.
Note that you can include all artifacts that belong to one group by using the wildcard notation in dependencySets, e.g. hibernate:*:jar would include all JAR files belonging to the hibernate group.
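For instance, a minimal descriptor along these lines (the file name is hypothetical, and the org.bouncycastle coordinates are an assumption about the asker's setup) would bundle only the Bouncy Castle artifacts next to your own classes and leave everything else, including scala-library, out:

<!-- src/main/assembly/with-bouncycastle.xml (hypothetical file name) -->
<assembly>
  <id>with-bouncycastle</id>
  <formats>
    <format>jar</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <!-- your own compiled classes -->
  <fileSets>
    <fileSet>
      <directory>${project.build.outputDirectory}</directory>
      <outputDirectory>/</outputDirectory>
    </fileSet>
  </fileSets>
  <!-- plus only the dependencies you pick -->
  <dependencySets>
    <dependencySet>
      <outputDirectory>/</outputDirectory>
      <unpack>true</unpack>
      <useProjectArtifact>false</useProjectArtifact>
      <!-- wildcard include: every JAR in the org.bouncycastle group -->
      <includes>
        <include>org.bouncycastle:*:jar</include>
      </includes>
    </dependencySet>
  </dependencySets>
</assembly>

The descriptor is then referenced from the maven-assembly-plugin's descriptors configuration and built with its single goal.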

Covering maven...
Because you declare your project to be dependent upon bouncy castle in your maven pom, anybody using maven to depend upon your library will by default pull in bouncy castle as a transitive dependency.
You should set the appropriate scope on your dependencies, e.g. compile for stuff needed at compile time and runtime, test for dependencies only needed in testing, and provided for stuff you expect to be provided by the environment.
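As a rough sketch of how that might look for the library in the question (artifact ids and versions are placeholders, and the scope choices simply mirror what the question asks for):

<dependencies>
  <!-- needed to compile and at runtime: ships as a normal transitive dependency -->
  <dependency>
    <groupId>org.bouncycastle</groupId>
    <artifactId>bcprov-jdk15on</artifactId>
    <version>1.70</version>
    <scope>compile</scope>
  </dependency>
  <!-- the question treats the Scala standard library as supplied by the environment,
       which is what the provided scope expresses: on the compile and test classpaths,
       but not packaged or propagated -->
  <dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>2.13.12</version>
    <scope>provided</scope>
  </dependency>
  <!-- only needed when running the test suite -->
  <dependency>
    <groupId>org.scalatest</groupId>
    <artifactId>scalatest_2.13</artifactId>
    <version>3.2.17</version>
    <scope>test</scope>
  </dependency>
</dependencies>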
Whether your library's dependencies are packaged into dependent projects when they are built is a question of how those projects are configured, and setting the scopes will influence the default behaviour.
For example, jar type packaging by default does not include dependencies, whereas war will include those in compile scope (but not test or provided). The design aim here was to have packaging plugins behave in the most commonly required way without needing configuration, but of course packaging plugins in maven can be configured to have different behaviour if needed. The plugins themselves which do packaging are well documented at the apache maven site.
If users of your library are unlikely to be using maven to build their projects, an option is to use the shade plugin which will allow you to produce an "uber-jar" which contains all the dependencies you wish. You can configure particular includes or excludes.
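A sketch of such a shade configuration, with the plugin version and the include/exclude patterns as placeholders:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.5.1</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <artifactSet>
          <!-- pull Bouncy Castle into the uber-jar -->
          <includes>
            <include>org.bouncycastle:*</include>
          </includes>
          <!-- but leave the Scala standard library out -->
          <excludes>
            <exclude>org.scala-lang:scala-library</exclude>
          </excludes>
        </artifactSet>
      </configuration>
    </execution>
  </executions>
</plugin>

The artifactSet only filters which dependency artifacts get merged in alongside your own classes.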
Delivering an uber-jar like this can be problematic, for example where your library bundles dependencies whose versions clash with the direct dependencies of the projects using it, i.e. they use a different version of the same libraries that yours does.
However, if you can, it is best to leave this to Maven to manage, so that projects using your library can decide whether they want your dependencies or would rather specify particular versions, which gives them more flexibility. This is the idiomatic approach.
For more information on dependencies and scopes in maven, see the reference guide published by Sonatype.

I'm not a scala guy, but I have played around with assembling stuff in Java + Maven.
Have you tried looking into creating your own assembly descriptor for the assembly plugin? https://maven.apache.org/plugins/maven-assembly-plugin/assembly.html
You can copy/paste the jar-with-dependencies descriptor and then just add some excludes to your <dependencySet>. I'm not a Maven expert, but you should be able to configure it so different profiles kick off different assembly builds.
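Something along these lines should work for the profile idea; the descriptor path, profile id and plugin version below are just placeholders:

<profiles>
  <profile>
    <id>fat-jar</id>
    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-assembly-plugin</artifactId>
          <version>3.6.0</version>
          <configuration>
            <!-- the customised copy of jar-with-dependencies with your excludes -->
            <descriptors>
              <descriptor>src/main/assembly/selected-deps.xml</descriptor>
            </descriptors>
          </configuration>
          <executions>
            <execution>
              <phase>package</phase>
              <goals>
                <goal>single</goal>
              </goals>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>
  </profile>
</profiles>

Running mvn -Pfat-jar package would then produce the trimmed-down assembly, while a plain mvn package skips it.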

Related

Adding Dependency Management to an Existing Java Project

I'm working on upgrading a legacy Java project to be compatible with jboss wildfly. As part of that process, I'm replacing our old system of managing dependencies (manually scanning for jars in a folder) with an automated system.
My first thought was to use maven, which worked well initially. The maven plugin for eclipse was able to scan my project and create a pom with most of the required dependencies. That works fine for compiling and running with eclipse, but production deployment uses an ant build script. I looked into maven-ant-resolver (https://maven.apache.org/resolver-ant-tasks/index.html), but as far as I can tell that project doesn't have a way to add dependencies to the classpath; the best it can do is bundle them into a jar.
The other option I looked at was Ivy. It seems better suited to integration with ant. Unfortunately, the tooling for ivy seems primitive compared to maven. From what I can tell, there is no option to generate the dependency file (ivy.xml) from an existing project. With the number of dependencies I'm dealing with, especially from jboss, creating the dependency xml from scratch is not a realistic option.
What are my options for solving this problem? Is there a way to do what I want with maven or ivy that I'm not seeing? Is there another dependency management tool out there that offers all the features I need?
The maven-assembly-plugin is what I can recommend for such use cases. Not sure if it suits you, though.
In a nutshell:
You can pack folders, jars, resources, dependencies, whatever into a jar for production deployment. This jar is assembled via the Maven Archiver (used internally by the maven-assembly-plugin, so it does not need to be referenced explicitly), which can also store a MANIFEST.MF with the classpath in it (not by default, but with a few lines of tweaking).
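As a sketch of that tweaking (plugin version, descriptor path and main class are placeholders), the archive block below is handed to the Maven Archiver, and addClasspath is what writes the Class-Path entry into MANIFEST.MF:

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-assembly-plugin</artifactId>
  <version>3.6.0</version>
  <configuration>
    <descriptors>
      <descriptor>src/main/assembly/deployment.xml</descriptor>
    </descriptors>
    <archive>
      <manifest>
        <mainClass>com.example.Main</mainClass>
        <!-- writes a Class-Path: entry for jars kept outside the assembly -->
        <addClasspath>true</addClasspath>
        <classpathPrefix>lib/</classpathPrefix>
      </manifest>
    </archive>
  </configuration>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>single</goal>
      </goals>
    </execution>
  </executions>
</plugin>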
Useful to know, though: Maven allows you to quite easily create your own plugins that do exactly what you want. If it's just a file with the stored classpath that you need, this could be a clean solution.

How to package only the necessary libs in Google App Engine Project(Java)?

Let me explain first what I mean by necessary libs. I'm creating my first project using Google App Engine for Java with the official Google Maven plugin. The main problem with Maven as a packaging solution (or maybe with Java development as a whole) is that if the dependency tree grows too much, the release process gets harder.
Let me illustrate it with an example. Let's start with the Jackson JSON library (it's a good starting point since it has no parent dependencies); now someone makes a JSON-RPC library and uses Jackson for the JSON serialization/deserialization. Imagine that this library provides not just a JSON-RPC client implementation but also a server; that means the POM of this lib will add some Java EE related libraries such as Jetty as dependencies.
Probably the guidelines say that such a library should either be divided into modules or mark the server-related dependencies as optional, but you know that many people don't follow the standards.
Now someone needs a JSON-RPC client for his/her project, call it Project X, and uses the lib mentioned above. At compile time there will be no problems: Maven will successfully download the required libs and the application will compile fine. But the problem comes when that person wants to release the application: which dependencies should be distributed along with the package (in a lib folder, for example)?
Actually, that's something that happened to me. I wasn't very familiar with Maven, so I used the Eclipse Runnable Jar Exporter, which produced a jar file with all the Maven libs copied to a lib subfolder. The workaround I used then was to just delete the libs that looked unnecessary and test whether the application still worked. As far as I know, classes that are never executed are not loaded by the ClassLoader, so they can be omitted and are unnecessary.
I can't use the same trick now since the scenario is much more complex: we are talking about a Java web application, not a desktop application like the other one, and the library that I want to include is a Liquid template engine, which uses the ANTLR framework to generate the parsers, plus Jackson for the JSON handling and Jsoup for HTML parsing.
Which libs should be packaged inside the WEB-INF/lib folder? I'm sure I will need Jackson for JSON parsing, but I'm not so sure about Jsoup, and what about ANTLR: is it necessary at runtime, or is it used only at compile time?
Update: I think I need to reformulate my question. What I actually want is to determine which dependencies are really necessary for the application and package those into the app's WEB-INF/lib folder.
Solution: It seems that the POM file packaged inside the web app's WAR is used, once the app is in the Google App Engine production environment, to retrieve the necessary dependencies, and the appengine:update goal probably only packages those dependencies that can't be retrieved from the Maven central repo, so there is no need to worry about this.
Thanks to David for pointing this out.
You should check Maven's dependency scopes. Here's an extract from the documentation:
There are 6 scopes available:
compile: This is the default scope, used if none is specified. Compile dependencies are available in all classpaths of a project. Furthermore, those dependencies are propagated to dependent projects.
provided: This is much like compile, but indicates you expect the JDK or a container to provide the dependency at runtime. For example, when building a web application for the Java Enterprise Edition, you would set the dependency on the Servlet API and related Java EE APIs to scope provided because the web container provides those classes. This scope is only available on the compilation and test classpath, and is not transitive.
runtime: This scope indicates that the dependency is not required for compilation, but is for execution. It is in the runtime and test classpaths, but not the compile classpath.
test: This scope indicates that the dependency is not required for normal use of the application, and is only available for the test compilation and execution phases.
system: This scope is similar to provided except that you have to provide the JAR which contains it explicitly. The artifact is always available and is not looked up in a repository.
import (only available in Maven 2.0.9 or later): This scope is only used on a dependency of type pom in the <dependencyManagement> section. It indicates that the specified POM should be replaced with the dependencies in that POM's <dependencyManagement> section. Since they are replaced, dependencies with a scope of import do not actually participate in limiting the transitivity of a dependency.
So in a Maven project, the developer indicates which dependencies should be bundled in the application and which should not.
Basically there are two cases here :
If you're building a web application (WAR or EAR format) and want to deploy it, or if you're building an actual runnable jar, then you will need to bundle it with all the dependencies that have scope compile and runtime (see the sketch at the end of this answer).
If you're building a library, then you do not package any dependency with your library. Instead you include the pom.xml so that others know what dependencies your library requires. For Maven to know how to find the associated POM for a given jar, the best and most common solution is to deploy the library to a Maven repository. Repos have a directory structure that helps Maven find the right version of a library, and find the POM that indicates the required dependencies.
Depending on whether your library is open source or not, you will be able to get it hosted for free by some repositories such as Sonatype (complete list here). But you can also set up your own repository, either by installing dedicated software such as Nexus or by configuring a GitHub project as the repo, as is explained on this blog.
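For the first case, a rough sketch of how the maven-dependency-plugin can gather exactly the compile- and runtime-scoped artifacts that have to ship with the application (version and output directory are arbitrary):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <version>3.6.0</version>
  <executions>
    <execution>
      <id>gather-runtime-deps</id>
      <phase>package</phase>
      <goals>
        <goal>copy-dependencies</goal>
      </goals>
      <configuration>
        <!-- "runtime" here means compile + runtime scopes, i.e. what must ship with the app -->
        <includeScope>runtime</includeScope>
        <outputDirectory>${project.build.directory}/bundled-libs</outputDirectory>
      </configuration>
    </execution>
  </executions>
</plugin>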
You can exclude any transitive dependency.
For your case, to remove jetty from this json-rpc-library, you need:
<dependency>
  <groupId>com.somecomp</groupId>
  <artifactId>jsonrpclib</artifactId>
  <version>1.0</version>
  <scope>compile</scope>
  <exclusions>
    <exclusion>
      <groupId>org.eclipse.jetty</groupId>
      <artifactId>jetty-server</artifactId>
    </exclusion>
  </exclusions>
</dependency>
See docs: http://maven.apache.org/guides/introduction/introduction-to-optional-and-excludes-dependencies.html

How do I retrieve the struts2-junit-plugin without using Maven?

I've googled on it and I've seen a lot of different sites that offers the struts2-junit-plugin.
I'm currently using struts-2.2.1.1. Should I get struts2-junit-plugin-2.2.1.1 as well?
Also, my project doesn't use Maven. When I downloaded a struts2-junit-plugin, I inspected the .jar file and found a pom.xml containing all of its dependencies. Should I separately download all these dependencies manually since I don't use Maven?
Yes, you should use the matching junit-plugin version (2.2.1.1).
Yes, but you also need to load the dependencies' dependencies, the dependencies' dependencies' dependencies, and so on ad nauseam until you reach the end.
Point 2 is why you really should be using Maven or similar mechanism.
Getting dependencies by hand is error-prone, tedious, and silly.

is there a way to generate a pom.xml with dependencies from an eclipse project?

I have inherited a big project with several subprojects.
All of them use several jar files, located under each project's lib directory. I want to take all the projects and migrate them to Maven, but the dependencies are a problem (there are too many of them); some of them are commonly used libraries (Apache projects, Xerces, JMS, etc.) and others are not.
Is there a way to autogenerate Maven dependencies for those jars that can be found in public Maven repositories? For example, see that my project uses the spice-jndikit-1.2.jar file and automatically get the appropriate dependency with group, artifact and (if possible) version?
thank you
I wrote a groovy script to generate a starting set of Apache ivy files.
https://github.com/myspotontheweb/ant2ivy
In my case, I wanted to "Maven-ize" my ANT builds without switching completely away from ANT.
It is feasible to extend this code to generate a Maven POM, if people were interested in this feature.
You can convert a project to Maven using the m2e plugin, but this erases your jar references, and should not be used.
I doubt that such a thing exists since typical jars (unless themselves built with Maven) don't have the necessary information to correlate the groupId, artifactId and version back to a repository to get the proper path.
You might be able to write something that parses the file name for the name and version, but you still have the package-based path to figure out.
If you're building using Ant, you might also consider using Apache Ivy and its file-system based resolution (very fast and easy to configure) to get you started, and then slowly roll over to the Maven repos for the artifacts; this way you're not spending a lot of time up-front finding Maven dependencies.
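A minimal ivysettings.xml sketch of that idea, assuming the existing jars sit in a lib directory and follow an artifact-revision naming scheme (both assumptions; adjust the pattern to match reality):

<!-- ivysettings.xml: resolve from the existing lib folder first, then Maven Central -->
<ivysettings>
  <settings defaultResolver="default"/>
  <resolvers>
    <chain name="default" returnFirst="true">
      <filesystem name="local-lib">
        <artifact pattern="${basedir}/lib/[artifact]-[revision].[ext]"/>
      </filesystem>
      <ibiblio name="central" m2compatible="true"/>
    </chain>
  </resolvers>
</ivysettings>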

eclipse, one classpath for compiling, another for launching

example:
For logging, my code uses log4j, but other jars my code depends upon use slf4j instead, so both jars must be in the build path. Unfortunately, it's now possible for my code to directly use (depend on) slf4j, either through content assist or some other developer's changes. I would like any use of slf4j to show up as an error, but my application (and tests) will still need it on the classpath when running.
explanation:
I'd like to find out if this is possible in Eclipse. This scenario happens often for me: I'll have a large project that uses a lot of 3rd-party libraries, and of course those 3rd-party jars have their own dependencies as well. So I have to include all the dependencies in the classpath ("build path" in Eclipse) for the application and its tests to compile and run (from within Eclipse).
But I don't want my code to use all of those jars, just the few direct dependencies I've decided upon myself. So if my code accidentally uses a dependency of a dependency, I want it to show up as a compilation error. Ideally, as class not found, but any error would do.
I know I can manually configure the classpath when running outside of Eclipse, and even within Eclipse I can modify the classpath for a specific class I'm running (in the run configurations), but that's not manageable if you run a lot of individual test cases or have a lot of main() classes.
It sounds like your project has enough dependency relationships that you might consider structuring it with OSGi bundles (plug-ins). Each bundle gets its own classloader and gets to specify what bundles (and optionally what version ranges, etc.) it depends on, what packages it exports, whether it re-exports stuff from its dependencies, etc.
Eclipse itself is structured out of Eclipse plug-ins and fragments, which are just OSGi bundles with an optional tiny bit of additional Eclipse wiring (plugin.xml, which is used to declare Eclipse "extension points" and "extensions") attached. Eclipse thus has fairly good tooling for creating and managing bundles built-in (via the Plug-in Development Environment). Much of what you find out there may lead you to conflate "OSGi bundle" with "plug-in that extends the Eclipse IDE", but the two concepts are quite separable.
The Eclipse tooling does distinguish rather clearly (and sometimes annoyingly, but in the "helpful medicine" way) between the bundles in your build environment vs. the bundles that a particular run configuration includes.
After a few years of living in OSGi land, the default Java "flat classpath" feels weird and even kind of broken to me, largely because (as you've experienced) it throws all JARs into one giant arena and hopes they can sort of work things out. The OSGi environment gives me a lot more control over dependency relationships, and as a "side effect" also naturally demands clarification of those relationships. Between these clear declarations and the tooling's enforcement of them, the project's structure is more obvious to everyone on the team.
if my code accidentally uses a dependency of a dependency, I want it to show up as a compilation error. Ideally, as class not found, but any error would do.
Put your code in one plug-in, your direct dependencies in other plug-ins, their dependencies in other plug-ins, etc. and declare each plug-in's dependencies. Eclipse will immediately do exactly what you want. You won't be offered dependencies' dependencies' contents in autocompletes; you'll get red squiggles and build errors; etc.
Why not use access rules to keep your code clean?
It looks like this would be better managed with Maven, integrated into Eclipse with m2eclipse.
That way, you can execute only part of the Maven build lifecycle, and you can manage a separate set of dependencies per build step.
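For instance, declaring slf4j with runtime scope keeps it off the compile classpath while still making it available when the application and its tests run; a minimal sketch with placeholder versions is below. A command-line Maven build will then reject direct uses of slf4j; how faithfully m2eclipse mirrors that separation inside the IDE is a separate question.

<dependencies>
  <!-- what the code is allowed to compile against -->
  <dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>1.2.17</version>
  </dependency>
  <!-- declared directly with runtime scope: this overrides the compile scope
       slf4j would otherwise get transitively, so it stays available when the
       application and tests run but is absent from the compile classpath -->
  <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
    <version>1.7.36</version>
    <scope>runtime</scope>
  </dependency>
</dependencies>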
In my experience it helps to be more restrictive: I made the team fill out (paper) forms explaining why each jar is needed and under what license...
They would rather type in a few lines of code than drag along 20 jars just to open a file with one line of code, or some other fancy 'feature'.
Using Maven could help for a while, but when you first spot jars with names like nightly-build or snapshot, you will know you're in jar hell.
Conclusion: choose dependencies well.
Would using the slf4j binding for log4j be useful? That allows the other jars to keep using slf4j while the actual logging goes to log4j.
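If that refers to the standard slf4j binding for log4j, the artifact is usually org.slf4j:slf4j-log4j12; a sketch of the dependency, with an illustrative version:

<!-- routes slf4j API calls made by the third-party jars into log4j -->
<dependency>
  <groupId>org.slf4j</groupId>
  <artifactId>slf4j-log4j12</artifactId>
  <version>1.7.36</version>
  <scope>runtime</scope>
</dependency>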