Scala: Invokation of methods with/without () with overridable implicits - scala

Here is a definition of method, that uses ExecutionContext implicitly, and allows client to override it. Two execution contexts are used to test it:
val defaultEc = ExecutionContext.fromExecutor(
Executors.newFixedThreadPool(5))
Names of threads look like: 'pool-1-thread-1' to 'pool-1-thread-5'
And the 2nd one from Scala:
scala.concurrent.ExecutionContext.Implicits.global
Names of threads look like: 'scala-execution-context-global-11'
Client can override default implicit via:
implicit val newEc = scala.concurrent.ExecutionContext.Implicits.global
Unfortunately it is overridable only, when a method with implicit is invoked without ():
val r = FutureClient.f("testDefault") //prints scala-execution-context-global-11
not working:
val r = FutureClient.f("testDefault")() //still prints: pool-1-thread-1
The question is WHY it works this way? Cause it makes it much more complicated for clients of API
Here is a full code to run it and play:
object FutureClient {
//thread names will be from 'pool-1-thread-1' to 'pool-1-thread-5'
val defaultEc = ExecutionContext.fromExecutor(
Executors.newFixedThreadPool(5))
def f(beans: String)
(implicit executor:ExecutionContext = defaultEc)
: Future[String] = Future {
println("thread: " + Thread.currentThread().getName)
TimeUnit.SECONDS.sleep(Random.nextInt(3))
s"$beans"
}
}
class FutureTest {
//prints thread: pool-1-thread-1
#Test def testFDefault(): Unit ={
val r = FutureClient.f("testDefault")
while (!r.isCompleted) {
TimeUnit.SECONDS.sleep(2)
}
}
//thread: scala-execution-context-global-11
#Test def testFOverridable(): Unit ={
implicit val newEc = scala.concurrent.ExecutionContext.Implicits.global
val r = FutureClient.f("testDefault")
while (!r.isCompleted) {
TimeUnit.SECONDS.sleep(2)
}
}
//prints pool-1-thread-1, but not 'scala-execution-context-global-11'
//cause the client invokes f with () at the end
#Test def testFOverridableWrong(): Unit ={
implicit val newEc = scala.concurrent.ExecutionContext.Implicits.global
val r = FutureClient.f("testDefault")()
while (!r.isCompleted) {
TimeUnit.SECONDS.sleep(2)
}
}
}
I have already discussed a couple of related topics, but they are related to API definition, so it is a new issue, not covered by those topics.

Scala Patterns To Avoid: Implicit Arguments With Default Values
f("testDefault") (or f("testDefault")(implicitly)) means that implicit argument is taken from implicit context.
f("testDefault")(newEc) means that you specify implicit argument explicitly. If you write f("testDefault")() this means that you specify implicit argument explicitly but since the value isn't provided it should be taken from default value.

Related

How do I abstract over effects and use ContextShift with Scala Cats?

I am creating in Scala and Cats a function that does some I/O and that will be called by other parts of the code. I'm also learning Cats and I want my function to:
Be generic in its effect and use a F[_]
Run on a dedicated thread pool
I want to introduce async boundaries
I assume that all my functions are generic in F[_] up to the main method because I'm trying to follow these Cat's guidelines
But I struggle to make these constraint to work by using ContextShift or ExecutionContext. I have written a full example here and this is an exctract from the example:
object ComplexOperation {
// Thread pool for ComplexOperation internal use only
val cs = IO.contextShift(
ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())
)
// Complex operation that takes resources and time
def run[F[_]: Sync](input: String): F[String] =
for {
r1 <- Sync[F].delay(cs.shift) *> op1(input)
r2 <- Sync[F].delay(cs.shift) *> op2(r1)
r3 <- Sync[F].delay(cs.shift) *> op3(r2)
} yield r3
def op1[F[_]: Sync](input: String): F[Int] = Sync[F].delay(input.length)
def op2[F[_]: Sync](input: Int): F[Boolean] = Sync[F].delay(input % 2 == 0)
def op3[F[_]: Sync](input: Boolean): F[String] = Sync[F].delay(s"Complex result: $input")
}
This clearly doesn't abstract over effects as ComplexOperation.run needs a ContextShift[IO] to be able to introduce async boundaries. What is the right (or best) way of doing this?
Creating ContextShift[IO] inside ComplexOperation.run makes the function depend on IO which I don't want.
Moving the creation of a ContextShift[IO] on the caller will simply shift the problem: the caller is also generic in F[_] so how does it obtain a ContextShift[IO] to pass to ComplexOperation.run without explicitly depending on IO?
Remember that I don't want to use one global ContextShift[IO] defined at the topmost level but I want each component to decide for itself.
Should my ComplexOperation.run create the ContextShift[IO] or is it the responsibility of the caller?
Am I doing this right at least? Or am I going against standard practices?
So I took the liberty to rewrite your code, hope it helps:
import cats.effect._
object Functions {
def sampleFunction[F[_]: Sync : ContextShift](file: String, blocker: Blocker): F[String] = {
val handler: Resource[F, Int] =
Resource.make(
blocker.blockOn(openFile(file))
) { handler =>
blocker.blockOn(closeFile(handler))
}
handler.use(handler => doWork(handler))
}
private def openFile[F[_]: Sync](file: String): F[Int] = Sync[F].delay {
println(s"Opening file $file with handler 2")
2
}
private def closeFile[F[_]: Sync](handler: Int): F[Unit] = Sync[F].delay {
println(s"Closing file handler $handler")
}
private def doWork[F[_]: Sync](handler: Int): F[String] = Sync[F].delay {
println(s"Calculating the value on file handler $handler")
"The final value"
}
}
object Main extends IOApp {
override def run(args: List[String]): IO[ExitCode] = {
val result = Blocker[IO].use { blocker =>
Functions.sampleFunction[IO](file = "filePath", blocker)
}
for {
data <- result
_ <- IO(println(data))
} yield ExitCode.Success
}
}
You can see it running here.
So, what does this code does.
First, it creates a Resource for the file, since close has to be done, even on guarantee or on failure.
It is using Blocker to run the open and close operations on a blocking thread poo (that is done using ContextShift).
Finally, on the main, it creates a default Blocker for instance, for **IO*, and uses it to call your function; and prints the result.
Fell free to ask any question.

Scala Implicit conversion for Lambdas

I am trying to understand the implicit function types from the link - http://www.scala-lang.org/blog/2016/12/07/implicit-function-types.html and below is a sample code as example. In the below code we first create a class Transaction.
class Transaction {
private val log = scala.collection.mutable.ListBuffer.empty[String]
def println(s: String): Unit = log += s
private var aborted = false
private var committed = false
def abort(): Unit = { aborted = true }
def isAborted = aborted
def commit(): Unit =
if (!aborted && !committed) {
Console.println("******* log ********")
log.foreach(Console.println)
committed = true
}
}
Next I define two methods f1 and f2 as shown below
def f1(x: Int)(implicit thisTransaction: Transaction): Int = {
thisTransaction.println(s"first step: $x")
f2(x + 1)
}
def f2(x: Int)(implicit thisTransaction: Transaction): Int = {
thisTransaction.println(s"second step: $x")
x
}
Then a method is defined to invoke the functions
def transaction[T](op: Transaction => T) = {
val trans: Transaction = new Transaction
op(trans)
trans.commit()
}
Below lambda is used to invoke the code
transaction {
implicit thisTransaction =>
val res = f1(3)
println(if (thisTransaction.isAborted) "aborted" else s"result: $res")
}
My question is if I change the val trans: Transaction = new Transaction to implicit val thisTransaction = new Transaction and change op(trans) to op it is not working.
I am not able to understand why even if thisTransaction of type Transaction is present in the scope it is not being used?
Implicit function types are planned for a future version of Scala. As far as I know, not even the next version (2.13).
For now, you can only use them in Dotty.
This here compiles just fine with dotty 0.4.0-RC1:
def transaction[T](op: implicit Transaction => T) = {
implicit val trans: Transaction = new Transaction
op
trans.commit()
}
I think it should be clear from the introduction of the blog post that this is a new feature that Odersky invented to eliminate some boilerplate in the implementation of the Dotty compiler, quote:
For instance in the dotty compiler, almost every function takes an implicit context parameter which defines all elements relating to the current state of the compilation.
This seems to be currently not available in the mainstream versions of Scala.
EDIT
(answer to follow-up question in the comment)
If I understood the blog post correctly, it is desugared into something like
transaction(
new ImplicitFunction1[Transaction, Unit] {
def apply(implicit thisTransaction: Transaction): Unit = {
val res = f1(args.length)(implicit thisTransaction:Transaction)
println(if (thisTransaction.isAborted) "aborted" else s"result: $res")
}
}
)
the new ImplicitFunction1[...]{} is an anonymous local class. The base class ImplicitFunction1 is this new feature, which is similar to the Function for ordinary lambdas, but with implicit arguments.
op is of type Transaction => T, not (and I don't think is possible to specify this, unfortunately) implicit Transaction => T.
So its argument of type Transaction can't be supplied implicitly. It must be an explicit argument. (Only arguments to a function in an argument list marked implicit can be supplied implicitly.)

Implicits over function closures in Scala

I've been trying to understand implicits for Scala and trying to use them at work - one particular place im stuck at is trying to pass implicits in the following manner
object DBUtils {
case class DB(val jdbcConnection: Connection) {
def execute[A](op: =>Unit): Any = {
implicit val con = jdbcConnection
op
}
}
object DB {
def SQL(query: String)(implicit jdbcConnection: Connection): PreparedStatement = {
jdbcConnection.prepareStatement(query)
}
}
val someDB1 = DB(jdbcConnection)
val someDB2 = DB(jdbcConnection2)
val someSQL = SQL("SOME SQL HERE")
someDB1.execute{someSQL}
someDB2.execute{someSQL}
Currently i get an execption saying that the SQL() function cannot find the implicit jdbcConnection.What gives and what do i do to make it work in the format i need?
Ps-:Im on a slightly older version of Scala(2.10.4) and cannot upgrade
Edit: Changed the problem statement to be more clear - I cannot use a single implicit connection in scope since i can have multiple DBs with different Connections
At the point where SQL is invoked there is no implicit value of type Connection in scope.
In your code snippet the declaration of jdbcConnection is missing, but if you change it from
val jdbcConnection = //...
to
implicit val jdbcConnection = // ...
then you will have an implicit instance of Connection in scope and the compiler should be happy.
Try this:
implicit val con = jdbcConnection // define implicit here
val someDB = DB(jdbcConnection)
val someSQL = SQL("SOME SQL HERE") // need implicit here
someDB.execute{someSQL}
The implicit must be defined in the scope where you need it. (In reality, it's more complicated, because there are rules for looking elsewhere, as you can find in the documentation. But the simplest thing is to make sure the implicit is available in the scope where you need it.)
Make the following changes
1) execute method take a function from Connection to Unit
2) Instead of this val someDB1 = DB(jdbcConnection) use this someDB1.execute{implicit con => someSQL}
object DBUtils {
case class DB(val jdbcConnection: Connection) {
def execute[A](op: Connection =>Unit): Any = {
val con = jdbcConnection
op(con)
}
}
Here is the complete code.
object DB {
def SQL(query: String)(implicit jdbcConnection: Connection): PreparedStatement = {
jdbcConnection.prepareStatement(query)
}
}
val someDB1 = DB(jdbcConnection)
val someDB2 = DB(jdbcConnection2)
val someSQL = SQL("SOME SQL HERE")
someDB1.execute{implicit con => someSQL}
someDB2.execute{implicit con => someSQL}

Invoke a method on a generic type with scala and reflect package

My question is based on a search that I have made on the following pages (but I am still to new to scala to succeed in what I want to do):
reflection overview
The purpose of my code is to invoke a method from a generic type and not an instance of a known type.
The following demonstrate the idea:
class A {
def process = {
(1 to 1000).foreach(x => x + 10)
}
}
def getTypeTag[T: ru.TypeTag](obj: T) = ru.typeTag[T]
def perf[T: ru.TypeTag](t: T, sMethodName: String): Any = {
val m = ru.runtimeMirror(t.getClass.getClassLoader)
val myType = ru.typeTag[T].tpe
val mn = myType.declaration(ru.newTermName(sMethodName)).asMethod
val im = m.reflect(getTypeTag(t))
val toCall = im.reflectMethod(mn)
toCall()
}
val a = new A
perf(a, "process")
The code compile perfectly (on a worksheet) but give the following stack at execution:
scala.ScalaReflectionException: expected a member of class TypeTagImpl, you provided method A$A11.A$A11.A.process
at scala.reflect.runtime.JavaMirrors$JavaMirror.scala$reflect$runtime$JavaMirrors$JavaMirror$$ErrorNotMember(test-log4j.sc:126)
at scala.reflect.runtime.JavaMirrors$JavaMirror$$anonfun$scala$reflect$runtime$JavaMirrors$JavaMirror$$checkMemberOf$1.apply(test-log4j.sc:221)
at scala.reflect.runtime.JavaMirrors$JavaMirror.ensuringNotFree(test-log4j.sc:210)
at scala.reflect.runtime.JavaMirrors$JavaMirror.scala$reflect$runtime$JavaMirrors$JavaMirror$$checkMemberOf(test-log4j.sc:220)
at scala.reflect.runtime.JavaMirrors$JavaMirror$JavaInstanceMirror.reflectMethod(test-log4j.sc:257)
at scala.reflect.runtime.JavaMirrors$JavaMirror$JavaInstanceMirror.reflectMethod(test-log4j.sc:239)
at #worksheet#.perf(test-log4j.sc:20)
at #worksheet#.get$$instance$$res0(test-log4j.sc:28)
at #worksheet#.#worksheet#(test-log4j.sc:138)
Any idea about how to correct this ?
Many thanks to all
In order to reflect a particular object, you have to pass it to Mirror.reflect(obj: T), and you're passing its typeTag for some reason. To fix, you have to modify perf signature to generate a ClassTag along with a TypeTag, and pass t directly to reflect, like so:
class A {
def process = {
(1 to 1000).foreach(x => x + 10)
println("ok!")
}
}
def perf[T : ClassTag : ru.TypeTag](t: T, sMethodName: String): Any = {
// ^ modified here
val m = ru.runtimeMirror(t.getClass.getClassLoader)
val myType = ru.typeTag[T].tpe
val mn = myType.decl(ru.TermName(sMethodName)).asMethod
val im = m.reflect(t)
// ^ and here
val toCall = im.reflectMethod(mn)
toCall()
}
val a = new A
perf(a, "process")
// ok!
// res0: Any = ()
(Note: I also replaced deprecated declaration and newTermName with recommended alternatives)

Access Spark broadcast variable in different classes

I am broadcasting a value in Spark Streaming application . But I am not sure how to access that variable in a different class than the class where it was broadcasted.
My code looks as follows:
object AppMain{
def main(args: Array[String]){
//...
val broadcastA = sc.broadcast(a)
//..
lines.foreachRDD(rdd => {
val obj = AppObject1
rdd.filter(p => obj.apply(p))
rdd.count
}
}
object AppObject1: Boolean{
def apply(str: String){
AnotherObject.process(str)
}
}
object AnotherObject{
// I want to use broadcast variable in this object
val B = broadcastA.Value // compilation error here
def process(): Boolean{
//need to use B inside this method
}
}
Can anyone suggest how to access broadcast variable in this case?
There is nothing particularly Spark specific here ignoring possible serialization issues. If you want to use some object it has to be available in the current scope and you can achieve this the same way as usual:
you can define your helpers in a scope where broadcast is already defined:
{
...
val x = sc.broadcast(1)
object Foo {
def foo = x.value
}
...
}
you can use it as a constructor argument:
case class Foo(x: org.apache.spark.broadcast.Broadcast[Int]) {
def foo = x.value
}
...
Foo(sc.broadcast(1)).foo
method argument
case class Foo() {
def foo(x: org.apache.spark.broadcast.Broadcast[Int]) = x.value
}
...
Foo().foo(sc.broadcast(1))
or even mixed-in your helpers like this:
trait Foo {
val x: org.apache.spark.broadcast.Broadcast[Int]
def foo = x.value
}
object Main extends Foo {
val sc = new SparkContext("local", "test", new SparkConf())
val x = sc.broadcast(1)
def main(args: Array[String]) {
sc.parallelize(Seq(None)).map(_ => foo).first
sc.stop
}
}
Just a short take on performance considerations that were introduced earlier.
Options proposed by zero233 are indeed very elegant way of doing this kind of things in Scala. At the same time it is important to understand implications of using certain patters in distributed system.
It is not the best idea to use mixin approach / any logic that uses enclosing class state. Whenever you use a state of enclosing class within lambdas Spark will have to serialize outer object. This is not always true but you'd better off writing safer code than one day accidentally blow up the whole cluster.
Being aware of this, I would personally go for explicit argument passing to the methods as this would not result in outer class serialization (method argument approach).
you can use classes and pass the broadcast variable to classes
your psudo code should look like :
object AppMain{
def main(args: Array[String]){
//...
val broadcastA = sc.broadcast(a)
//..
lines.foreach(rdd => {
val obj = new AppObject1(broadcastA)
rdd.filter(p => obj.apply(p))
rdd.count
})
}
}
class AppObject1(bc : Broadcast[String]){
val anotherObject = new AnotherObject(bc)
def apply(str: String): Boolean ={
anotherObject.process(str)
}
}
class AnotherObject(bc : Broadcast[String]){
// I want to use broadcast variable in this object
def process(str : String): Boolean = {
val a = bc.value
true
//need to use B inside this method
}
}