Scala: run two functions in parallel

I have the following declaration of a function
def myfunc(l: List[RoseTree]): Option[RoseTree] = {
//Complex calculations
}
Now, I have to run this function on two different, huge lists, so I would like to run the same function on different data in parallel.
I have been taking a look at Scala's Future module.
However, I would also like the following: as soon as either call returns a Some(RoseTree), the other call should be told to stop, and that result should be kept. Is this possible?
Kind regards.

Scala has a built-in feature for this: you can use Future in combination with Future.firstCompletedOf().
Please check the Scala docs; I would also suggest having a look at this post.
As pointed out in the comments section, Future.firstCompletedOf does not meet the requirement "cancel the other futures after one has completed". You can use Monix Task (which is a replacement for Scala Futures), which supports this by default. Please have a look at this.
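With plain Futures, one workaround is cooperative cancellation: the worker polls a shared flag and gives up once the other call has found a result. The following is only a minimal sketch under assumptions. The names FirstSome, search and firstSome are hypothetical, search stands in for myfunc on a List[Int], and failed futures are not handled. It uses a Promise rather than Future.firstCompletedOf, because firstCompletedOf would also accept a None that happens to finish first.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FirstSome {
  // Hypothetical stand-in for myfunc: looks for a multiple of 7 and
  // stops scanning as soon as the shared flag is set.
  def search(xs: List[Int], cancelled: AtomicBoolean): Option[Int] =
    xs.iterator.takeWhile(_ => !cancelled.get()).find(_ % 7 == 0)

  def firstSome(a: List[Int], b: List[Int]): Future[Option[Int]] = {
    val cancelled = new AtomicBoolean(false)
    val result = Promise[Option[Int]]()
    val fa = Future(search(a, cancelled))
    val fb = Future(search(b, cancelled))

    // The first Some wins and signals the other call to stop.
    for (f <- Seq(fa, fb)) f.foreach {
      case some @ Some(_) => if (result.trySuccess(some)) cancelled.set(true)
      case None           => ()
    }
    // Once both calls have finished, complete the promise if it is still open
    // (this covers the case where neither call found anything).
    for (ra <- fa; rb <- fb) result.trySuccess(ra.orElse(rb))

    result.future
  }

  // Blocking helper for demonstration purposes only.
  def firstSomeBlocking(a: List[Int], b: List[Int]): Option[Int] =
    Await.result(firstSome(a, b), 10.seconds)
}
```

The flag only helps if the long-running computation checks it regularly; a Future that never looks at the flag keeps running until it finishes on its own.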

Related

Scala legacy code: how to access input parameters at different points in execution path?

I am working with a legacy Scala codebase and, as is always the case, it is quite difficult to modify the code without touching many different parts.
One of my new requirements is to make several decisions based on some input parameters. The problem is that these decisions have to be made at various points along the execution path. One option is to encapsulate all those parameters in a case class instance and pass it along, but that means I would have to modify multiple method signatures, and I want to avoid this approach as much as possible.
Another approach would be to create a global object that contains all those input parameters and is accessible from different points in the execution. Is this a good approach in Scala?
No, using global mutable variables to pass “hidden” parameters is not a good idea, not in Scala and not in any other programming language. It makes the code hard to understand and modify, because a function's behaviour will now depend on which functions were invoked earlier. And it's extremely fragile, because you might forget to set one of those global parameters before invoking the function, which means that it will use whatever value was stored there before. This is the kind of thing that can appear to work for years, and then break when you modify a completely unrelated part of the program.
I can't stress this enough: do not use global mutable variables, period. The solution is to man up and change those method signatures. Depending on the details, dependency injection may or may not help in your particular case.
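One way to keep the signature changes small is to bundle the inputs in an immutable case class and thread it through as an implicit parameter. This is not something the answer above proposes; it is a sketch of one possible option, and RunParams, Pipeline and all of their members are hypothetical names.

```scala
// Hypothetical, immutable bundle of the input parameters.
final case class RunParams(dryRun: Boolean, maxRetries: Int)

object Pipeline {
  // Each method that needs the parameters declares them explicitly,
  // but callers inside the chain do not have to pass them by hand.
  def load(path: String)(implicit p: RunParams): String =
    if (p.dryRun) s"skip $path" else s"load $path (retries=${p.maxRetries})"

  def process(path: String)(implicit p: RunParams): String =
    load(path).toUpperCase

  // Entry point: the parameters enter the call chain exactly once.
  def run(path: String, params: RunParams): String = {
    implicit val p: RunParams = params
    process(path)
  }
}
```

Unlike a global mutable object, the dependency stays visible in every signature and the compiler rejects a call where no RunParams is in scope.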

scala companion object templates (Iterator.tabulate)

I'm new to Scala and struggling with the documentation a little bit. I was looking at a piece of code in the Spark codebase (cosine similarity for RowMatrix) and saw that it uses Iterator.tabulate. Not knowing what that function does, I looked in the Scala API docs, only to find that the function does not exist. Except that it does exist, because I can use it in the REPL (hmm, maybe I'm looking at the wrong API docs version ... no, this is the current version).
After a bit of searching I found out that tabulate is defined (at least) in scala.collection.generic.SeqFactory and scala.collection.generic.TraversableFactory. These two, however, do not appear to be connected in the dependency graph. I can't find any path between the two, and hence no way of actually knowing - from looking at the API docs - that .tabulate even exists.
So the question is: how do you find .tabulate and its documentation from the API docs for the class (say Iterator or Seq)? Do I just have to google my way around it, or is there some magic button in the Scala docs that will make the thing appear?
This doesn't seem to be limited to .tabulate; it is a more common issue (at least for me). Looking at library code, functions seem to exist that are never mentioned in the API docs. Another example is
org.apache.spark.mllib.linalg.distributed.RowMatrix.toBreeze
I still don't know whether that function exists; some code seems to use it, but I can't find any documentation for it.
In the Scala source code, all of the logic for Iterator is defined in one file, Iterator.scala. The tabulate function you're looking for is defined in object Iterator, but in the Scala API docs you searched under trait Iterator, which is why you couldn't find it.
In the right corner of the docs page you can switch to object Iterator, and there you will find the Iterator$#tabulate utility function.
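As a quick illustration of what the companion-object method does: Iterator.tabulate(n)(f) produces an iterator over f(0), f(1), ..., f(n - 1). The wrapper object TabulateDemo below is a hypothetical name used only for this sketch.

```scala
object TabulateDemo {
  // tabulate is called on the companion object Iterator, not on an Iterator instance.
  val squares: List[Int] = Iterator.tabulate(5)(i => i * i).toList
  // squares == List(0, 1, 4, 9, 16)
}
```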

Scala pipelines - DSL for building a DAG workflow

I'm curious about the current libraries for Scala and Akka that would allow me to elegantly build a workflow pipeline.
In my case a workflow is just a DAG of operations, so actors/Akka feels like a good fit.
My question is: what's the best approach? There are libraries like reactive streams that allow really elegant composition of a pipeline, but they seem very record-focused.
My use case is a flow of operations passing messages between them. Future composition is nice, but the syntax becomes unwieldy after a while. Maybe there is something better with scalaz and shapeless.
What are the approaches and tools for building a DSL for pipelines of computation steps using message passing?
Although it is still in early development (pre-1.0 as of writing), you should have a look at akka-streams, which is exactly that - a way to describe a computation graph and then run it asynchronously.
If your pipeline is very much like a chain of method calls, use a chain of method calls!
There's no point making the solution more complicated than it needs to be; if it's well-modelled by a chain of method calls, just use that. (Or functions, which you can compose.)
If you need something slightly more complicated but you don't actually need any message-passing, you might want something like AsyncFP or Scala.Rx.
If you need a multi-core solution, but you have stretches that look like method calls, then have a chain of method calls inside one step. You could use Akka streams for that without having to worry so much about the ratio of overhead to useful computation.
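For the simplest case mentioned above, a pipeline of composed functions, a minimal sketch with the standard library's andThen might look like this. The step names parse, keepEven and total are hypothetical and stand in for real computation steps.

```scala
object Steps {
  // Each step is an ordinary function value.
  val parse: String => List[Int]       = _.split(",").toList.map(_.trim.toInt)
  val keepEven: List[Int] => List[Int] = _.filter(_ % 2 == 0)
  val total: List[Int] => Int          = _.sum

  // andThen composes the steps left to right into a single function.
  val pipeline: String => Int = parse andThen keepEven andThen total
}
```

Calling Steps.pipeline("1, 2, 3, 4") runs the three steps in order on the calling thread; no actors or message passing are involved.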

Scala formatter - show named parameter

I have a relatively large Scala code base that does not use named parameters for any function/class calls. Rather than going in and entering them manually, which would be a very tedious process, I was looking for a formatter to do the job. The best I found is scalariform, but I'm not sure whether I can even write a rule for something so complex.
I'm curious whether anyone has run into a similar problem and found a powerful formatter.
The Scala Refactoring library might be something you could use. You will need some knowledge of Scala's Abstract Syntax Tree representation.
Why do you want to use named parameters throughout your code base? I like IntelliJ's default, which is to suggest naming boolean arguments (only).
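For reference, this is what the transformation being asked about looks like at a call site. The method resize and the object NamedArgs are hypothetical and serve only to show the syntax.

```scala
object NamedArgs {
  // Hypothetical method, used only to illustrate the call syntax.
  def resize(width: Int, height: Int, keepAspect: Boolean): String =
    s"${width}x$height keepAspect=$keepAspect"

  // Positional call: the meaning of `true` is not obvious at the call site.
  val positional: String = resize(640, 480, true)

  // Same call with every argument named.
  val named: String = resize(width = 640, height = 480, keepAspect = true)

  // Named arguments may also be given in a different order.
  val reordered: String = resize(keepAspect = true, height = 480, width = 640)
}
```

All three calls produce the same result; naming only the boolean argument, as in resize(640, 480, keepAspect = true), is also valid.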

How can I reify a Symbol in order to pass it into runtime?

Macro contexts in Scala come with two handy methods, reifyType and reifyTree, which essentially generate code that, when executed at runtime, will return the Type or Tree being reified.
I wonder if there is some way to achieve something similar with Symbols - some kind of reifySymbol method?
We didn't implement reifySymbol yet, but it might be decently emulated by wrapping a symbol in an Ident and then reifying the resulting tree. Pull requests are welcome as well :)