Is there an equivalent to shap (Python) in Spark Scala?

I'm currently working in Spark with Scala and need to calculate some Shapley values, but I can't find any package equivalent to Python's shap.
Is there an equivalent in Scala?
If not, I'm working on a project where I need to explain the biggest mistakes the ML algorithm makes. If any of you have tips or advice, they would be welcome!
PS: English is not my mother tongue, so I may have made some big mistakes.
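For a handful of features, Shapley values can be computed exactly in plain Scala without any library. The following is only a minimal sketch, not a port of shap: it enumerates every feature subset (so the cost grows as 2^n) and, as an assumption, treats a "missing" feature as one replaced by a baseline value. The names ShapleySketch, shapley, model and baseline are hypothetical and chosen for illustration.

```scala
object ShapleySketch {
  // Exact Shapley values for a single instance `x`, by brute-force enumeration
  // of all feature subsets. Features outside a coalition are replaced by the
  // corresponding `baseline` value (an assumption of this sketch).
  def shapley(model: Array[Double] => Double,
              x: Array[Double],
              baseline: Array[Double]): Array[Double] = {
    val n = x.length
    require(baseline.length == n, "x and baseline must have the same length")
    require(n < 31, "subset enumeration with an Int bitmask needs fewer than 31 features")

    // fact(k) == k! for k = 0..n
    val fact = (1 to n).scanLeft(1.0)(_ * _)

    // Model output when only the features whose bit is set in `mask` keep their real value
    def value(mask: Int): Double =
      model(Array.tabulate(n)(j => if ((mask & (1 << j)) != 0) x(j) else baseline(j)))

    Array.tabulate(n) { i =>
      val bit = 1 << i
      (0 until (1 << n)).filter(m => (m & bit) == 0).map { m =>
        val s = Integer.bitCount(m)                       // size of the coalition without feature i
        val weight = fact(s) * fact(n - s - 1) / fact(n)  // |S|! (n - |S| - 1)! / n!
        weight * (value(m | bit) - value(m))              // marginal contribution of feature i
      }.sum
    }
  }
}
```

The values sum to model(x) - model(baseline), which gives a quick sanity check. Because the function only needs a model closure and plain arrays, it could be applied per row on the driver for the few rows with the largest errors; for many features an approximation would be needed instead.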

Related

Consume Scala syntax-trees from external tool

I would like to develop a tool that would consume scala syntax-trees (as the title suggests). More specifically it would be great if I could consume the trees after each compilation phase.
My research led me to Dotty's TASTY interchange format which seemed to be what I was looking for. Perhaps it is.
However, I was not able to find adequate documentation on-line to figure out how to extract it and consume it.
I also looked at dotc compiler flags and couldn't figure out an obvious approach.
I noticed the option "-print-tasty", but I couldn't verify the expected output. Perhaps I am missing something?
Of course, I can always print the AST after each phase using the Scala printer (i.e., -Yshow-trees etc.). Is this my only option? If it is, then fine.
Ideally, it would be great if I could consume the ASTs in a more "machine-friendly" format, if you will. TASTY seems to be what I want in theory, i.e., a serialization of the AST, but I am not sure how to extract this after each phase.
I do apologize if my question is too trivial or has already been addressed. Any feedback would be highly appreciated! Thanks!
P.S.: What if the ASTs were encoded in a JSON format? Would a Scala tool like that make sense (i.e., a tool that converts Scala ASTs to JSON and back)?

Converting SBML model into a simulatable Matlab Function

I'm looking for a tool to convert an SBML model into a MATLAB function. I've tried the SBMLTranslate() function from libSBML, but it returns a MATLAB struct, not a function. Does anybody know if such a tool exists? Thanks
There are at least three efforts in this direction:
Frank Bergmann offers an online service for SBML translation where you can upload an SBML file and it will generate a MATLAB file. The comments at the top of the generated MATLAB file explain how to use the results. The C++ source code is available on SourceForge.
Bergmann's code referenced above was used by Stanley Gu to create sbml2matlab, a Windows standalone program. Off-hand, I don't know whether Gu's version changed or enhanced the algorithm used by the Bergmann version, but it seems likely. (Note: Gu now works at Google and does not maintain this code anymore, as far as I know.)
The Systems Biology Format Converter (SBFC) is a framework written principally by Nicolas Rodriguez; it includes a collection of converters, one of which is an SBML-to-MATLAB converter. This converter is written in Java.
I have not compared the results of the translators myself yet, so cannot speak to the differences or quality of output. If you try them and have any feedback to relate, please let the authors know. Knowing what has or hasn't worked for real users will help improve things in the future.
A final caveat is that all of these have been research projects, so make sure to set your expectations accordingly. (This is not a criticism of the authors; the authors are very good – I know most of them personally – but the reality of academic development work is that we all lack the time and resources to make these systems comprehensive, hardened, polished, and documented to the degree that we wish we could.)

Scala criterion equivalent

Is there a Scala (or Java, I guess) equivalent of criterion? I'm not just talking about a benchmarking library: check out what criterion does for HTML results.
No. As far as I can tell as of 2012-Nov-26 Criterion has not been ported to any other language ecosystem. There's no fundamental reason for this.

random forest code review

I'm doing a research project on the random forest algorithm. I have found numerous implementations, but the main part of the code is often written in Fortran, which I have no experience with.
I have to edit the code, change the main parameters (like tree depth, number of feature variables, ...) and trace the algorithm's performance during each run.
Currently I'm using "Windows-Precompiled-RF_MexStandalone-v0.02-". The train and predict functions are MATLAB MEX files and cannot be opened or edited. Can anyone give me advice on what to do, or is there a valid, completely MATLAB-based version of random forests?
I've read randomforest-matlab carefully. Unfortunately, the main training part is a DLL file. After reading more, most of my questions are now resolved. My question was mainly how to run several trees simultaneously.
Have you taken a look at these libraries?
Stochastic Bosque
randomforest-matlab
If you're doing a research project on it, the best thing is probably to implement the individual tree training yourself in C and then write Mex wrappers. I'd start with an ID3 tree (before attempting C4.5 for instance.) Then write the random forest code itself, which, once you write the tree code, isn't all that hard.
You'll:
learn a lot
be able to modify them as much as you like
eventually move on to exploring new areas with them
I've implemented them myself from scratch so I can help once you post some of your own code. But I don't think anybody on this site will write the code for you.
Will it take effort? Yes. Will you come out of it with more knowledge and ability than you had going in? Undoubtedly.
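To give an idea of how small the core of an ID3 tree is, here is a rough sketch of its split criterion (entropy and information gain). It is written in Scala rather than the C suggested above, purely for illustration, and it assumes categorical features stored as strings; the names Id3Sketch, entropy, informationGain and bestFeature are hypothetical.

```scala
object Id3Sketch {
  // Shannon entropy (in bits) of a sequence of class labels
  def entropy[L](labels: Seq[L]): Double = {
    val n = labels.size.toDouble
    labels.groupBy(identity).values.map { group =>
      val p = group.size / n
      -p * math.log(p) / math.log(2)
    }.sum
  }

  // Information gain obtained by splitting `rows` on the categorical feature at index `feature`
  def informationGain[L](rows: Seq[(Vector[String], L)], feature: Int): Double = {
    val before = entropy(rows.map(_._2))
    val after = rows.groupBy(_._1(feature)).values.map { part =>
      part.size.toDouble / rows.size * entropy(part.map(_._2))
    }.sum
    before - after
  }

  // ID3 splits each node on the candidate feature with the highest information gain
  def bestFeature[L](rows: Seq[(Vector[String], L)], candidates: Seq[Int]): Int =
    candidates.maxBy(f => informationGain(rows, f))
}
```

A random forest would then train each tree on a bootstrap sample of the rows and pass only a random subset of feature indices as candidates at each node, which is where parameters such as the number of features per split come in.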
There is a nice library in R called randomForest. It is based on the original implementation of Breiman in Fortran but it is now mainly recoded in C.
http://cran.r-project.org/web/packages/randomForest/index.html
The main parameters you talk about (tree depth, number of features to be tested, ...) are directly available.
Another library I would recommend is Weka. It is Java-based and lucid. Performance is slightly off, though, compared to R. The source code can be downloaded from http://www.cs.waikato.ac.nz/ml/weka/

How to start on scala [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 7 years ago.
I am a .NET developer and I'd like to broaden my horizons a bit. After checking out current trends, I decided to try Scala. Can you please advise a good strategy for getting started? Should I learn Java first? What source or handbook should I read? Are there any open-source projects where I could practice Scala and grow?
Thanks,
Dominique
You might gain a first impression by visiting Simply Scala where you have an online interpreter available.
An absolute classic is Scala for Java Refugees which was originally written for people coming from Java, but will be quite helpful for you, considering how similar the basics of C#/Java are.
You don't need to learn Java first, but you need to have the Java runtime/development kit installed and working.
Then go to http://www.scala-lang.org/downloads and download the appropriate package for your operating system (I always prefer the nightly builds of Scala, they have more bug-fixes than the latest stable one).
After that, run the Scala REPL which is basically "Simply Scala offline" (Simply Scala uses the Scala REPL behind the covers, too). Even many Java programmers use the Scala REPL to prototype things first.
If you prefer to learn from books, I can recommend Programming in Scala (2nd edition) by Martin Odersky (if you start from a language design point of view and want the "reference book"). There are others like "Programming Scala" which are more targeted at beginners, so to speak, but personally I found "Programming in Scala" excellent and learned Scala with just that book.
A nice way to start Scala is working with the collection classes. .NET has added something similar lately with LINQ and extension methods, so it will be easy to pick up for you.
A small example to get you started:
//Define a class with some properties
case class Person(name: String, var age: Int, spokenLanguages: String*)
//Create some persons
val joe = Person("Joe", 42, "English","French","Danish")
val doe = Person("Doe", 23, "English","German")
val don = Person("Don", 11, "Italian","French","Polish")
val bob = Person("Bob", 17, "German")
//Access a property
joe.name
//Don had his 12th birthday!
don.age = 12
//Put the persons into a list
val persons = List(joe, doe, don, bob)
//Divide the list into minors and adults
val (minors, adults) = persons.partition(_.age < 18)
//Get the total age of all persons
val personsTotalAge = persons.map(_.age).sum
//Return a list with only those speaking English
val englishSpeakers = persons.filter(_.spokenLanguages.contains("English"))
//Same as the example above.
val englishSpeakers2 =
  for {
    person <- persons
    language <- person.spokenLanguages
    if language == "English"
  } yield person
I'm not that fluent in C#, but I believe many things might look similar to you.
Some examples of Scala's XML support:
//The shoppingCart for breakfast
val shoppingCart = <list>
<item><name>Tomatoes</name><price>0.30</price><amount>4</amount></item>
<item><name>Eggs</name><price>0.15</price><amount>10</amount></item>
<item><name>Bread</name><price>2.20</price><amount>1</amount></item>
</list>
//How much does it cost?
val total = (shoppingCart \ "item").map(i => (i \ "price").text.toDouble * (i \ "amount").text.toDouble).sum
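//This is a Symbol (symbol literals are deprecated since Scala 2.13 and dropped in Scala 3)
val sym = 'SomeSymbol
//I'm too lazy to use Strings for XML! (Example for implicits)
import scala.language.implicitConversions
implicit def symbol2string(symbol: Symbol): String = symbol.name
//Now I can use Symbols too! (named total2 so it does not clash with total above)
val total2 = (shoppingCart \ 'item).map(i => (i \ 'price).text.toDouble * (i \ 'amount).text.toDouble).sum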
You don't need to learn Java first. Are you familiar with functional programming? If you are, you should be able to jump in quite fast. Anyway, here are some thoughts on how you can learn Scala:
Get a good reference book. I recommend Programming in Scala by Odersky, Spoon, and Venners. I find it one of the most comprehensive Scala books.
As with learning any new language, try writing several small applications in Scala. If you're not a functional programmer, you might write them in a different paradigm, but that's okay for now. Then try writing your programs without var (use val instead), without loops, and with as little state change as possible.
Use sbt to build your programs. I'm a bit hesitant to recommend this, since you have to learn a new tool just to write your program, but I find it a great tool for writing Scala apps, and many Scala projects seem to use it.
Also check out this comment, and the thread overall, to help you transition to Scala: "Struggle against habits formed by Java when migrating to Scala".
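To make the advice about avoiding var and loops concrete, here is a small illustrative sketch of the same computation written three ways. The object and method names are made up for this example.

```scala
object NoVarSketch {
  // Imperative style: a mutable accumulator and an explicit loop
  def sumOfSquaresImperative(xs: List[Int]): Int = {
    var total = 0
    for (x <- xs) total += x * x
    total
  }

  // Functional style: no var, no loop, same result
  def sumOfSquares(xs: List[Int]): Int =
    xs.map(x => x * x).sum

  // The same idea with an explicit fold, which threads the accumulator through as a parameter
  def sumOfSquaresFold(xs: List[Int]): Int =
    xs.foldLeft(0)((acc, x) => acc + x * x)
}
```

Coming from LINQ, map and foldLeft play roughly the roles of Select and Aggregate.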
Java as a language will not be necessary to start with scala (and anyway java itself is very similar to c#, or actually it's the other way around...).
Once you start doing productive things with Scala, though, you will be interacting with a lot of Java libraries, and you will learn that the Java world is a much broader galaxy of more-or-less standard libraries than the .NET world, where a lot of the things you need are directly in the standard .NET libraries. You can learn them as you go, but if you don't come from a Java background, the experience might feel overwhelming. It would be the same had you started by learning Java, though.
Other java-specific things you may have to learn are about generics being much less powerful in the JVM and how scala tries to work around this.
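As an illustration of what this means in practice: on the JVM, generic type arguments are erased at runtime, so code cannot normally create an Array[T] or test whether a value is a T. One workaround in current Scala is a ClassTag context bound, sketched below. Whether this is the exact mechanism meant above is an assumption, and the names ErasureSketch, fill and onlyOfType are hypothetical.

```scala
import scala.reflect.ClassTag

object ErasureSketch {
  // Without the ClassTag bound this would not compile: creating an Array[T]
  // needs the element class at runtime, and erasure has removed it.
  def fill[T: ClassTag](n: Int, value: T): Array[T] =
    Array.fill(n)(value)

  // With a ClassTag in scope, the type pattern `t: T` is checked against the
  // runtime class carried by the tag instead of being an unchecked match.
  def onlyOfType[T: ClassTag](xs: List[Any]): List[T] =
    xs.collect { case t: T => t }
}
```

For example, ErasureSketch.onlyOfType[String](List(1, "a", 2.0, "b")) keeps only the two strings.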
As for scala itself as a language, coming from .NET, you may benefit more from reading a few things on functional programming than from learning java. The functional paradigm is the part where I was the most ignorant in my initial approach to scala and that caused me the most trouble in understanding example code from the resources you can find on the scala website.
Building a foundation
For functional programming I would recommend reading SICP (it's online and free).
While you learn java try to take a look at F#. It's a good language, it's well documented and "lives" in the .NET ecosystem
learning scala
Resources on the scala website
Daniel Spiewak's blog
If you want to learn Scala, you'll be much better served by learning Scala and picking up the Java you need as you go. You don't need to learn Java to be able to start using Scala.
A good place to start would be reading through this question, which lists most of the Scala books currently available.
https://stackoverflow.com/questions/3359852/scala-programming-book/3360308#3360308
I don't think you need to know Java to get started on Scala. It's helpful to know the broad strokes though, because most Scala documentation I have read refers back to Java features or bugs.
Regarding open-source projects, you can have a look on GitHub for Scala projects. Most open-source Scala projects tend to be frameworks, though.
Regarding books, I found both the Artima book, Programming in Scala, and the Pragmatic Programmers, Programming Scala, very good.
In addition to the others: Write simple programs for a while and completely ignore advanced features of Scala. Only move there when you have a good grasp of the basics, the functional paradigm and the type system.
I advise you to take part in the Coursera course Functional Programming Principles in Scala by Martin Odersky. There are video lectures, assignments and even a final exam.
Twitter (one of the Scala lovers) recently unveiled its own complete Scala tutorial, Scala School. It is an awesome guide which I recommend to all Scala beginners.
Scala school was started as a series of lectures at Twitter to prepare
experienced engineers to be productive Scala programmers. ... We
think it makes the most sense to approach teaching Scala not as if
it's an improved Java but as a new language. Experience in Java is not
expected. Focus will be around the interpreter and the
object-functional style as well as the style of programming we do
here. An emphasis will be placed on maintainability, clarity of
expression, and leveraging the type system.
Most of the lessons require no software other than a Scala REPL. The
reader is encouraged to follow along, and go further! Use these
lessons as a starting point to explore the language.
I'm currently developing one which doesn't require prior programming knowledge. It will show the strength of combining functional and imperative programming, and it is very simple to follow. Furthermore, the posts will cover best practice and solve increasingly difficult problems using Scala.
http://vigtig.it/blog/
You should give it a try, part 2 is almost finished :)
Download and install Apache Maven. Use it to create a blank Scala project and write the classic Scala hello world application. This will require you to configure the pom.xml file in the project directory that Maven creates.
Then run mvn compile over and over, correcting errors until it compiles.
Then run mvn package until you pass all the unit tests.
And finally, run mvn scala:run
Once you build a real project scala:run has an option to spit out the full Java command needed to run it from a shell script or batch file.
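For reference, the hello world application mentioned in the first step could be as small as the sketch below. The file location (for example src/main/scala/HelloWorld.scala, following the usual Maven layout) and the greeting helper are assumptions for illustration; the pom.xml configuration depends on the Scala plugin you choose and is not shown.

```scala
object HelloWorld {
  // Kept as a separate function so that a unit test has something to check
  def greeting(name: String): String = s"Hello, $name!"

  // Entry point that the build tool runs
  def main(args: Array[String]): Unit =
    println(greeting("world"))
}
```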