How to isolate libraries in an unmanaged dependency .jar file so they don't conflict with others - scala

I need to add a .jar as an unmanaged dependency to an sbt Scala project (it is the java-stellar-sdk). Everything works well as long as I don't run sbt test. There seems to be a Mockito version in the .jar file that conflicts with the one I am using in the project. I get a lot of errors that certain Mockito matchers are not found but everything works fine without the .jar in the lib folder.
Is there a way to tell sbt that it should ignore certain libraries in the .jar or that managed dependencies take precedence? I also found this related question but obviously it didn't help me.
An alternative workaround would also help a lot. Is it possible to isolate the libraries in the jar in a way that allows me to just make a certain package visible to the outside?
Update: The .jar contains Mockito 2 but my project uses Mockito 1, so this is a very simple and obvious conflict that I can solve by upgrading to Mockito 2 (which I tried, and it works). However, the question remains: is there another reasonable way to isolate the Mockito dependency in the .jar so that it does not interfere with my project, in case I can't or don't want to resolve the conflict by switching to a newer version of the library in question? Maybe altering the .jar to rename the conflicting packages? I don't know. Something like that.
I know that this is a very general question that has likely been discussed somewhere else in depth. However, I didn't find anything that really satisfied me. Links to relevant discussions of the topic are of course appreciated as well.

I can think of 3 ways for you to do it (ordered from simple to difficult):
1. Delete Mockito 2 manually from the jar file. Since the jar is just a zip file, you can extract it, delete all the conflicting files, and pack it again.
2. Compile that jar from source yourself, and set Mockito as a test dependency (as it should be). If you do that, consider opening a PR with your change to fix the problem for the community.
3. Shade the Mockito files in the jar. Shading is the process of renaming classes in a jar file according to certain rules, so that they no longer clash with classes of the same name elsewhere. You can use either Jar Jar Links or the sbt-assembly plugin. See this answer to get you started with sbt-assembly: https://stackoverflow.com/a/47974750/245024
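As a rough sketch of the shading option with sbt-assembly: the settings below would go into the build.sbt of a separate helper project whose only job is to repackage the SDK jar. The helper project, the target namespace shaded.org.mockito and the lib/ layout are assumptions made for illustration; only the shade rule itself is the part that matters.

```scala
// build.sbt of a hypothetical helper project that only repackages the SDK jar.
// Assumes sbt-assembly is enabled in project/plugins.sbt
// (addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "<version>"))
// and that the SDK jar has been copied into this helper project's lib/ folder.

// This project contains no Scala code of its own, so leave the Scala library out.
autoScalaLibrary := false
crossPaths := false

// Move every class under org.mockito into a private namespace and rewrite
// all references to those classes in every jar that goes into the assembly.
// "shaded.org.mockito" is an arbitrary name chosen for this sketch.
assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("org.mockito.**" -> "shaded.org.mockito.@1").inAll
)
```

Running sbt assembly in the helper project should then produce a jar in which the bundled Mockito 2 no longer occupies the org.mockito package; that jar can replace the original one in the main project's lib folder. On older sbt versions the key is written assemblyShadeRules in assembly.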

You should be able to arrange for your Mockito 1 classes to appear before the Mockito 2 classes on the classpath. That will cause your classes to win any conflicts.
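One possible way to arrange that in sbt is to reorder the test classpath so that the unmanaged jar comes last. This is only a sketch: "stellar-sdk" is an assumed fragment of the jar's file name and has to be adjusted to whatever the file in lib/ is actually called.

```scala
// build.sbt
// Assumption: the unmanaged jar's file name contains "stellar-sdk".
Test / dependencyClasspath := {
  val cp = (Test / dependencyClasspath).value
  // Split off the SDK jar and append it after all other entries, so that
  // classes from managed dependencies (Mockito 1) are found first.
  val (sdkJars, others) = cp.partition(_.data.getName.contains("stellar-sdk"))
  others ++ sdkJars
}
```

Note that this only decides which copy of a duplicated class is loaded. Classes that exist only in Mockito 2 remain visible, so mixing the two versions this way may still lead to surprises.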

Related

sbt-assembly: Generate a minimal JAR file

I've been using sbt-assembly to generate a standalone JAR file for my Scala project. However, I would like to reduce the size of my JAR file (it's currently around 150MB and there's definitely room for improvement there).
I used the following command to list the contents of the JAR file that's produced:
jar tf <JAR file>
This revealed that there are lots of classes in the generated JAR file that are not used in the project. I believe these classes get included as part of third-party JARs.
Questions
(a) Is there an option that I can use to instruct sbt-assembly to generate a minimal JAR file that does not include the third-party classes that are not used in my project?
(b) I could use AssemblyStrategy to manually specify which files need to be excluded. Is this a sound strategy? I'm a bit concerned that with this approach the JAR file might end up throwing unexpected ClassNotFound exceptions.
Thanks in advance.
It's not easy to say what's used in your project and what is not. If you include a dependency into a project it might bring a few other ones in. Those child dependencies might also require their own dependencies and so on.
By default, if you include a dependency in your project, you intend to use it. The author of a dependency usually does the same thing. Thus there is usually not much you can throw away; it's there for a reason. There are a couple of cases where this is not true:
- The dependency author includes additional dependencies that are used only in some settings, and those settings do not apply to your project.
- You are using a mega-dependency when you actually need only one of its libraries/features.
There are counter examples to this as well: Scalatest does not ship pegdown for generating html test reports because you don't need it usually. But it might be needed if you try to use -h flag to generate html.
Imagine the case when you use Apache Tika for pdf parsing. It wraps PDFBox to do the parsing. You don't need a bloat of all other libraries in that case that parse MS documents. The best thing to do is not to exclude files manually via sbt exclude or sbt-assembly rules because there is a risk you get it wrong and get run time class loading exception. Instead you need to use the right dependency like PDFBox directly. Unfortunately this is a lot of manual work in many cases to figure out all dependencies that you need, so it's your choice: easy and fat JAR, or painful and lean.
There are two ways to exclude dependencies:
1. Exclude transitive dependencies with exclude. See the docs here.
2. Don't use the top-level dependency and manually add its subdependencies as you need them.
One more, less fun option: mark libraries as provided and make sure they are copied to your target environment and are on the classpath. If you have many jars using the same libraries, this helps to share them.
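The options above could look roughly like this in build.sbt. The coordinates follow the Tika/PDFBox example from earlier in this answer; the version numbers are placeholders rather than recommendations, and the slf4j line is just an illustration of the provided scope.

```scala
// build.sbt: version numbers are placeholders, pick the ones your project needs.
// The three options are alternatives; a real build would use only what it needs.

// Option 1: keep the top-level dependency but cut off one transitive branch.
libraryDependencies += ("org.apache.tika" % "tika-parsers" % "1.4")
  .exclude("org.apache.poi", "poi")

// Option 2: skip the top-level dependency and depend directly on the
// library that does the actual work.
libraryDependencies += "org.apache.pdfbox" % "pdfbox" % "1.8.2"

// Option 3: compile against a library without packaging it. The jar must
// then be present on the classpath of the target environment.
libraryDependencies += "org.slf4j" % "slf4j-api" % "1.7.5" % "provided"
```

With option 1 in particular, it is worth running the relevant code paths afterwards, since a wrongly excluded module only shows up as a class loading error at run time.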
You can visualize your dependency tree with this plugin: https://github.com/jrudolph/sbt-dependency-graph. It's very helpful when trying to figure out what you are using and what you can remove. There are some tools like tattletale and loosejar that people suggest but I haven't tried them. If anyone has experience with those please share.
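A minimal sketch of enabling the dependency graph, assuming sbt 1.4 or newer, where the plugin ships with sbt itself. For older sbt versions the plugin has to be added explicitly, as shown in the comment.

```scala
// project/plugins.sbt

// sbt 1.4 and newer: the dependency tree plugin is bundled and only needs enabling.
addDependencyTreePlugin

// Older sbt versions would instead add the plugin from the repository linked above,
// for example (version is a placeholder):
// addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.9.2")
```

After reloading the build, sbt dependencyTree prints the resolved dependencies as a tree, which makes it easier to see which top-level dependency pulls in a given jar.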
What you might want to look at are tree shakers.
For Java there is ProGuard (I have not tried or used it):
http://proguard.sourceforge.net/

How do you tell sbt-eclipse to ignore (errors of) a very specific folder under /src

I have an infrastructure project that contains other projects as resources. (Because it compiles them on the fly). One of those contained projects is deliberately one that fails to compile.
This makes the entire project show in eclipse as "with errors".
How can I make sbt-eclipse configure eclipse such that e.g. anything under src/main/resources/foo should be ignored?
Of course this isn't exactly the scenario eclipse was built for, but might there be some clean way around it? as much as it matters, sbt itself does not try to compile these resources.
If not, maybe a way to tell eclipse to not even load source directories under src/main/resources?
Thanks!

Generate a JAR from one Scala source file

I have no Scala experience, but I need to create a JAR to include on a project's classpath from a single Scala source file.
I'm thinking there is a relatively straightforward way to do this, but I can't seem to figure it out.
The Scala file is here: http://pastebin.com/MYqjNkac
The JAR doesn't need to be executable, it just needs to be able to be referenced from another program.
The most convenient way is to use some build tool like Sbt or Maven. For maven there is the maven-scala-plugin plugin, and for Sbt here is a tutorial.
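For the sbt route, a minimal build.sbt might look like the sketch below. Everything in it is an assumption: the project name, the versions, and in particular the Spark dependency, which is only guessed from the package name of the source file. Whatever libraries the file actually imports need to be listed instead.

```scala
// build.sbt, placed in the project root; the source file goes under src/main/scala/.
name := "pythonconverters"   // hypothetical project name
version := "0.1"

// Assumption: use the Scala version of the program that will load the JAR.
scalaVersion := "2.10.4"

// Assumption: the source file imports Spark classes. "provided" keeps Spark
// out of the packaged JAR, since the consuming program brings its own copy.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0" % "provided"
```

Running sbt package then compiles the file and writes a plain (non-executable) JAR under target/, with the directory structure matching the package declaration.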
If you don't want to use any build tool, you may want to compile the code with scalac and then create the jar file manually by using zip on the resulting class files and renaming it to jar. But you have to preserve the directory structure. In your pastebin you use the package org.apache.spark.examples.pythonconverters, so make sure the directories match.
Btw, if you want to just integrate this piece of code with your java project, and using maven, you can have the scala code in your 1 project as well (in src/main/scala). Just use the maven-scala-plugin plugin and hook it to the compile phase, or some sooner phase if your Java code depends on it. However, I don't recommend mixing multiple languages in one project, I would split it into two separate ones.

Setting up a Scala project in Eclipse, together with JUnit & Scalatest

I have recently completed the Scala course on Coursera, and since then I have been looking forward to getting my hands dirty with Scala again. I have written code for some years, but I was neither educated as a programmer nor do I work as one, so it took me a while to get a good opportunity. Now that I have some time to invest and a good project to work on, it's time...
Except I can't seem to get things set up properly, which I find really frustrating. I have OpenJDK 1.7.0_25 running on my Linux machine. I have downloaded and installed the Bundle Scala IDE build for Eclipse (just like we used in the course). And I got ScalaTest both as a jar file and the Eclipse plug-in.
I have a simple project (so far) and no matter what I do I can't seem to get my builds and tests in order. First off how exactly am I supposed to set up my project so that my classes and tests are actually run properly? All the assignments we got were projects that had the same structure, so do I have to have:
project
|--src
   |--main
   |  |--scala
   |--test
      |--scala
structure? If so why is it not the default way the project is setup when I create a new project? Do I create these folders manually, as packages or as source folders? The whole thing gets pretty murky..
I should mention that I tried to "Mavenize" the project using the contextual menu in Eclipse, added my ScalaTest dependency. The first thing that happens is that I get compile errors, at every point of dependency in my code. So clearly the library is not visible, in other words Maven does not seem to be doing much of management. I thought the whole point of Maven was to get and maintain dependencies as the project evolves. I concluded that I do not fully understand the way Maven works and thus I eventually gave up on Maven, once again, and went back to doing things manually.
Secondly, I can't seem to run my tests; the Run As... menu item does not include ScalaTest as it's mentioned in the documentation of ScalaTest Eclipse Plug-in. I have double checked that the plugin is installed. If I instead try to run using JUnitRunner then my tests are not recognized as valid tests. I have JUnit and ScalaTest on my build path, so it's got to be something else.
I suppose my overarching question is as follows:
given the Scala IDE build of Eclipse and ScalaTest, just exactly how am I supposed to set up my project (in Eclipse) so that I can just focus on writing my code and testing it, and hopefully not have any other headaches?
I work alone, and this project is not a product I need to deliver to some client. In other words I do not need to adhere to strict professionalism here. Honestly I just want to be able to code, get better acquainted with Scala and hopefully build a small data analysis tool that I will be using from time to time.
Thanks in advance!
Try using the sbt eclipse plugin:
https://github.com/typesafehub/sbteclipse
This of course assumes that you use sbt as your build tool. If you don't at the moment, you can find instructions on installation and usage here: http://www.scala-sbt.org/
Personally I've been using typesafe giter8 template (https://github.com/typesafehub/scala-sbt.g8) to setup my Scala projects, and then I use the sbt plugin mentioned above to generate eclipse project files.
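Enabling the plugin is a one-line addition to the build. The version below is a placeholder; the plugin's README lists the release that matches a given sbt version.

```scala
// project/plugins.sbt
// Version is a placeholder; check the sbteclipse README for the current one.
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "5.2.4")
```

Running sbt eclipse in the project root then generates the .project and .classpath files, after which the folder can be imported into Eclipse as an existing project.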
Scala build tools largely follow Maven's conventions (sometimes implicitly), which is why you use that directory structure.
The easiest way, I think, is to create a simple sbt build or Maven POM and generate the Eclipse project configuration from it (for example with sbt eclipse). There you can set the dependencies (such as the actual versions of JUnit and ScalaTest to use), so you can use the ScalaTest plugin easily.
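A minimal build.sbt along those lines could look like this. The project name and all version numbers are placeholders; the Scala version should match the one bundled with the installed Scala IDE.

```scala
// build.sbt: name and versions are placeholders.
name := "my-analysis-tool"   // hypothetical project name
scalaVersion := "2.10.3"     // assumption: match the Scala IDE's Scala version

// Both libraries are only needed for tests, so they go into the test scope.
libraryDependencies ++= Seq(
  "org.scalatest" %% "scalatest" % "2.0"  % "test",
  "junit"         %  "junit"     % "4.11" % "test"
)
```

With this in place, sources are expected under src/main/scala and tests under src/test/scala, and sbt test runs the tests from the command line independently of Eclipse.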
In case of other issues, feel free to ask at the ScalaTest mailing list, Chee Seng and Bill Venners can help you a lot there.
The Scala IDE website has full documentation on how to run unit testing frameworks with the IDE; have a look! If you find missing elements, the bug tracker of the Scala IDE project is here.
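For reference, a test that both the ScalaTest plugin and the plain JUnit runner should be able to pick up might look like the sketch below. It assumes ScalaTest 2.x or 3.0.x and JUnit 4 on the test classpath; in ScalaTest 3.1 and later the classes were moved to other packages, so the imports would differ. The suite name and file path are made up for illustration.

```scala
// src/test/scala/ExampleSuite.scala (hypothetical file and class name)
import org.junit.runner.RunWith
import org.scalatest.FunSuite
import org.scalatest.junit.JUnitRunner

// The annotation lets JUnit 4 (and thus "Run As > JUnit Test") execute the suite.
@RunWith(classOf[JUnitRunner])
class ExampleSuite extends FunSuite {

  test("a list reports its size") {
    assert(List(1, 2, 3).size == 3)
  }
}
```

Without the @RunWith annotation, JUnit has no way of knowing that a ScalaTest suite is a test class, which is one common reason for tests not being recognized.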

eclipse, one classpath for compiling, another for launching

example:
For logging, my code uses log4j, but other jars my code depends on use slf4j instead. So both jars must be in the build path. Unfortunately, this makes it possible for my code to directly use (depend on) slf4j, either through content assist or through some other developer's changes. I would like any use of slf4j to show up as an error, but my application (and tests) will still need it in the classpath when running.
explanation:
I'd like to find out if this is possible in eclipse. This scenario happens often for me. I'll have a large project that uses a lot of 3rd party libraries. And of course those 3rd party jars have their own dependencies as well. So I have to include all dependencies in the classpath ("build path" in eclipse) for the application and its tests to compile and run (from within eclipse).
But I don't want my code to use all of those jars, just the few direct dependencies I've decided upon myself. So if my code accidentally uses a dependency of a dependency, I want it to show up as a compilation error. Ideally, as class not found, but any error would do.
I know I can manually configure the classpath when running outside of eclipse, and even within eclipse I can modify the classpath for a specific class I'm running (in the run configurations), but that's not manageable if you run a lot of individual test cases, or have a lot of main() classes.
It sounds like your project has enough dependency relationships that you might consider structuring it with OSGi bundles (plug-ins). Each bundle gets its own classloader and gets to specify what bundles (and optionally what version ranges, etc.) it depends on, what packages it exports, whether it re-exports stuff from its dependencies, etc.
Eclipse itself is structured out of Eclipse plug-ins and fragments, which are just OSGi bundles with an optional tiny bit of additional Eclipse wiring (plugin.xml, which is used to declare Eclipse "extension points" and "extensions") attached. Eclipse thus has fairly good tooling for creating and managing bundles built-in (via the Plug-in Development Environment). Much of what you find out there may lead you to conflate "OSGi bundle" with "plug-in that extends the Eclipse IDE", but the two concepts are quite separable.
The Eclipse tooling does distinguish rather clearly (and sometimes annoyingly, but in the "helpful medicine" way) between the bundles in your build environment vs. the bundles that a particular run configuration includes.
After a few years of living in OSGi land, the default Java "flat classpath" feels weird and even kind of broken to me, largely because (as you've experienced) it throws all JARs into one giant arena and hopes they can sort of work things out. The OSGi environment gives me a lot more control over dependency relationships, and as a "side effect" also naturally demands clarification of those relationships. Between these clear declarations and the tooling's enforcement of them, the project's structure is more obvious to everyone on the team.
"if my code accidentally uses a dependency of a dependency, I want it to show up as a compilation error. Ideally, as class not found, but any error would do."
Put your code in one plug-in, your direct dependencies in other plug-ins, their dependencies in other plug-ins, etc. and declare each plug-in's dependencies. Eclipse will immediately do exactly what you want. You won't be offered dependencies' dependencies' contents in autocompletes; you'll get red squiggles and build errors; etc.
Why not use access rules to keep your code clean?
It looks like it would better be managed with maven, integrated in eclipse with m2eclipse.
That way, you can only execute part of the maven build lifecycle, and you can manage separate set of dependencies per build steps.
In my experience it helps to be more restrictive. I made the team fill out (paper) forms stating why a jar is needed and under what license,
and they would rather type a few lines of code than drag along 20 jars just to open a file in one line of code, or to get some other fancy 'feature'.
Using maven could help for a while, but when you first spot jars having names like nightly-build or snapshot, you will know you're in jar-hell.
conclusion: Choose dependencies well
Would using the slf4j-to-log4j binding jar (slf4j-log4j12) be useful? That allows using slf4j with the actual logging going to log4j.