Scala: Rewrite code duplication with closures

Scala: Rewrite code duplication with closures - scala

I have this code:
val arr: Array[Int] = ...
val largestIndex = {
var i = arr.length-2
while (arr(i) > arr(i+1)) i -= 1
i
}
val smallestIndex = {
var k = arr.length-1
while (arr(largestIndex) > arr(k)) k -= 1
k
}
But there is to much code duplication. I tried to rewrite this with closures but I failed. I tried something like this:
def index(sub: Int, f: => Boolean): Int = {
var i = arr.length-sub
while (f) i -= 1
i
}
val largest = index(2, i => arr(i) > arr(i+1))
val smallest = index(1, i => arr(largest) > arr(i))
The problem is that i can't use parameter i of the method index() in the closure. Is there a way to avoid this problem?

val arr = Array(1,2,4,3,3,4,5)
def index(sub: Int, f: Int => Boolean): Int = {
var i = arr.length-sub
while (f(i)) i -= 1
i
}
val largest = index(2, i => arr(i) > arr(i+1))
val smallest = index(1, i => arr(largest) > arr(i))

val arr = Array(1,2,4,3,3,4,5)
arr: Array[Int] = Array(1, 2, 4, 3, 3, 4, 5)
scala> arr.zipWithIndex.max(Ordering.by((x: (Int, Int)) => x._1))._2
res0: Int = 6
scala> arr.zipWithIndex.min(Ordering.by((x: (Int, Int)) => x._1))._2
res1: Int = 0
or
scala> val pairOrdering = Ordering.by((x: (Int, Int)) => x._1)
pairOrdering: scala.math.Ordering[(Int, Int)] = scala.math.Ordering$$anon$4#145ad3d
scala> arr.zipWithIndex.max(pairOrdering)._2
res2: Int = 6
scala> arr.zipWithIndex.min(pairOrdering)._2
res3: Int = 0

Related

Spark in Scala - Map with Function with Extra Arguments

Is there a way in Scala to define an explicit function for an RDD Transformation with additional/extra arguments?
For example, the Python code below uses a lambda expression to apply the transformation map (requiring a function with one argument) with the function my_power (actually having 2 arguments).
def my_power(a, b):
res = a ** b
return res
def my_main(sc, n):
inputRDD = sc.parallelize([1, 2, 3, 4])
powerRDD = inputRDD.map(lambda x: my_power(x, n))
resVAL = powerRDD.collect()
for item in resVAL:
print(item)
However, when attempting an equivalent implementation in Scala, I get a Task not serializable exception.
val myPower: (Int, Int) => Int = (a: Int, b: Int) => {
val res: Int = math.pow(a, b).toInt
res
}
def myMain(sc: SparkContext, n: Int): Unit = {
val inputRDD: RDD[Int] = sc.parallelize(Array(1, 2, 3, 4))
val squareRDD: RDD[Int] = inputRDD.map( (x: Int) => myPower(x, n) )
val resVAL: Array[Int] = squareRDD.collect()
for (item <- resVAL){
println(item)
}
}

In this way it was working for me.
package examples
import org.apache.log4j.Level
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
object RDDTest extends App {
val logger = org.apache.log4j.Logger.getLogger("org")
logger.setLevel(Level.WARN)
val spark = SparkSession.builder()
.appName(this.getClass.getName)
.config("spark.master", "local[*]").getOrCreate()
val myPower: (Int, Int) => Int = (a: Int, b: Int) => {
val res: Int = math.pow(a, b).toInt
res
}
val scontext = spark.sparkContext
myMain(scontext, 10);
def myMain(sc: SparkContext, n: Int): Unit = {
val inputRDD: RDD[Int] = sc.parallelize(Array(1, 2, 3, 4))
val squareRDD: RDD[Int] = inputRDD.map((x: Int) => myPower(x, n))
val resVAL: Array[Int] = squareRDD.collect()
for ( item <- resVAL ) {
println(item)
}
}
}
Result :
1024
59049
1048576
There is another option to broadcast n using sc.broadcast and access in the closure like map is also possible...

Simply adding a local variable as a function alias made it work:
val myPower: (Int, Int) => Int = (a: Int, b: Int) => {
val res: Int = math.pow(a, b).toInt
res
}
def myMain(sc: SparkContext, n: Int): Unit = {
val inputRDD: RDD[Int] = sc.parallelize(Array(1, 2, 3, 4))
val myPowerAlias = myPower
val squareRDD: RDD[Int] = inputRDD.map( (x: Int) => myPowerAlias(x, n) )
val resVAL: Array[Int] = squareRDD.collect()
for (item <- resVAL){
println(item)
}
}

None if no clear winner in Scala Map maxBy

val valueCountsMap: mutable.Map[String, Int] = mutable.Map[String, Int]()
valueCountsMap("a") = 1
valueCountsMap("b") = 1
valueCountsMap("c") = 1
val maxOccurredValueNCount: (String, Int) = valueCountsMap.maxBy(_._2)
// maxOccurredValueNCount: (String, Int) = (b,1)
How can I get None if there's no clear winner when I do maxBy values? I am wondering if there's any native solution already implemented within scala mutable Maps.

No, there's no native solution for what you've described.
Here's how I might go about it.
implicit class UniqMax[K,V:Ordering](m: Map[K,V]) {
def uniqMaxByValue: Option[(K,V)] = {
m.headOption.fold(None:Option[(K,V)]){ hd =>
val ev = implicitly[Ordering[V]]
val (count, max) = m.tail.foldLeft((1,hd)) {case ((c, x), v) =>
if (ev.gt(v._2, x._2)) (1, v)
else if (v._2 == x._2) (c+1, x)
else (c, x)
}
if (count == 1) Some(max) else None
}
}
}
Usage:
Map("a"->11, "b"->12, "c"->11).uniqMaxByValue //res0: Option[(String, Int)] = Some((b,12))
Map(2->"abc", 1->"abx", 0->"ab").uniqMaxByValue //res1: Option[(Int, String)] = Some((1,abx))
Map.empty[Long,Boolean].uniqMaxByValue //res2: Option[(Long, Boolean)] = None
Map('c'->2.2, 'w'->2.2, 'x'->2.1).uniqMaxByValue //res3: Option[(Char, Double)] = None

Partially applied function as object

Let
def f(i:Int)(j:Int) = i + j
and so
f(1) _
Int => Int = <function1>
However,
val f: (Int)(Int) => Int = (a:Int)(b:Int) => a + b // wrong
namely, error: ';' expected but '(' found. How to declare val f ?

Is this what you were looking for?
scala> val f: Int => Int => Int = a => b => a + b
f: Int => (Int => Int) = <function1>
scala> f(1)
res7: Int => Int = <function1>
scala> f(1)(2)
res8: Int = 3

How to find two elements satisfying two predicates in one pass with Scala?

Suppose there is a List[A] and two predicates p1: A => Boolean and p2: A => Boolean.
I need to find two elements in the list: the first element a1 satisfying p1 and the first element a2 satisfying p2 (in my case a1 != a2)
Obviously, I can run find twice but I would like to do it in one pass. How would you do it in one pass in Scala ?

So, here's an attempt. It's fairly straightforward to generalise it to take a list of predicates (and return a list of elements found)
def find2[A](xs: List[A], p1: A => Boolean, p2: A => Boolean): (Option[A], Option[A]) = {
def find2helper(xs: List[A], p1: A => Boolean, p2: A => Boolean, soFar: (Option[A], Option[A])): (Option[A], Option[A]) = {
if (xs == Nil) soFar
else {
val a1 = if (soFar._1.isDefined) soFar._1 else if (p1(xs.head)) Some(xs.head) else None
val a2 = if (soFar._2.isDefined) soFar._2 else if (p2(xs.head)) Some(xs.head) else None
if (a1.isDefined && a2.isDefined) (a1, a2) else find2helper(xs.tail, p1, p2, (a1, a2))
}
}
find2helper(xs, p1, p2, (None, None))
} //> find2: [A](xs: List[A], p1: A => Boolean, p2: A => Boolean)(Option[A], Option[A])
val foo = List(1, 2, 3, 4, 5) //> foo : List[Int] = List(1, 2, 3, 4, 5)
find2[Int](foo, { x: Int => x > 2 }, { x: Int => x % 2 == 0 })
//> res0: (Option[Int], Option[Int]) = (Some(3),Some(2))
find2[Int](foo, { x: Int => x > 2 }, { x: Int => x % 7 == 0 })
//> res1: (Option[Int], Option[Int]) = (Some(3),None)
find2[Int](foo, { x: Int => x > 7 }, { x: Int => x % 2 == 0 })
//> res2: (Option[Int], Option[Int]) = (None,Some(2))
find2[Int](foo, { x: Int => x > 7 }, { x: Int => x % 7 == 0 })
//> res3: (Option[Int], Option[Int]) = (None,None)
Generalised version (which is actually slightly clearer, I think)
def findN[A](xs: List[A], ps: List[A => Boolean]): List[Option[A]] = {
def findNhelper(xs: List[A], ps: List[A => Boolean], soFar: List[Option[A]]): List[Option[A]] = {
if (xs == Nil) soFar
else {
val as = ps.zip(soFar).map {
case (p, e) => if (e.isDefined) e else if (p(xs.head)) Some(xs.head) else None
}
if (as.forall(_.isDefined)) as else findNhelper(xs.tail, ps, as)
}
}
findNhelper(xs, ps, List.fill(ps.length)(None))
} //> findN: [A](xs: List[A], ps: List[A => Boolean])List[Option[A]]
val foo = List(1, 2, 3, 4, 5) //> foo : List[Int] = List(1, 2, 3, 4, 5)
findN[Int](foo, List({ x: Int => x > 2 }, { x: Int => x % 2 == 0 }))
//> res0: List[Option[Int]] = List(Some(3), Some(2))
findN[Int](foo, List({ x: Int => x > 2 }, { x: Int => x % 7 == 0 }))
//> res1: List[Option[Int]] = List(Some(3), None)
findN[Int](foo, List({ x: Int => x > 7 }, { x: Int => x % 2 == 0 }))
//> res2: List[Option[Int]] = List(None, Some(2))
findN[Int](foo, List({ x: Int => x > 7 }, { x: Int => x % 7 == 0 }))
//> res3: List[Option[Int]] = List(None, None)

scala> val l = List(1,2,3)
l: List[Int] = List(1, 2, 3)
scala> val p1 = {x:Int => x % 2 == 0}
p1: Int => Boolean = <function1>
scala> val p2 = {x:Int => x % 3 == 0}
p2: Int => Boolean = <function1>
scala> val pp = {x:Int => p1(x) || p2(x) }
pp: Int => Boolean = <function1>
scala> l.find(pp)
res2: Option[Int] = Some(2)
scala> l.filter(pp)
res3: List[Int] = List(2, 3)

Does this work for you?
def predFilter[A](lst: List[A], p1: A => Boolean, p2: A => Boolean): List[A] =
lst.filter(x => p1(x) || p2(x)) // or p1(x) && p2(x) depending on your need
This will return you a new list that matches either of the predicates.
val a = List(1,2,3,4,5)
val b = predFilter[Int](a, _ % 2 == 0, _ % 3 == 0) // b is now List(2, 3, 4)

Scala, extending the iterator

Im looking to extended the iterator to create a new method takeWhileInclusive, which will operate like takeWhile but include the last element.
My issue is what is best practice to extend the iterator to return a new iterator which I would like to be lazy evaluated. Coming from a C# background I normal use IEnumerable and use the yield keyword, but such an option doesn't appear to exist in Scala.
for example I could have
List(0,1,2,3,4,5,6,7).iterator.map(complex time consuming algorithm).takeWhileInclusive(_ < 6)
so in this case the takeWhileInclusive would only have resolve the predicate on the values until I get the a result greater than 6, and it will include this first result
so far I have:
object ImplicitIterator {
implicit def extendIterator(i : Iterator[Any]) = new IteratorExtension(i)
}
class IteratorExtension[T <: Any](i : Iterator[T]) {
def takeWhileInclusive(predicate:(T) => Boolean) = ?
}

You can use the span method of Iterator to do this pretty cleanly:
class IteratorExtension[A](i : Iterator[A]) {
def takeWhileInclusive(p: A => Boolean) = {
val (a, b) = i.span(p)
a ++ (if (b.hasNext) Some(b.next) else None)
}
}
object ImplicitIterator {
implicit def extendIterator[A](i : Iterator[A]) = new IteratorExtension(i)
}
import ImplicitIterator._
Now (0 until 10).toIterator.takeWhileInclusive(_ < 4).toList gives List(0, 1, 2, 3, 4), for example.

This is one case where I find the mutable solution superior:
class InclusiveIterator[A](ia: Iterator[A]) {
def takeWhileInclusive(p: A => Boolean) = {
var done = false
val p2 = (a: A) => !done && { if (!p(a)) done=true; true }
ia.takeWhile(p2)
}
}
implicit def iterator_can_include[A](ia: Iterator[A]) = new InclusiveIterator(ia)

The following requires scalaz to get fold on a tuple (A, B)
scala> implicit def Iterator_Is_TWI[A](itr: Iterator[A]) = new {
| def takeWhileIncl(p: A => Boolean)
| = itr span p fold (_ ++ _.toStream.headOption)
| }
Iterator_Is_TWI: [A](itr: Iterator[A])java.lang.Object{def takeWhileIncl(p: A => Boolean): Iterator[A]}
Here it is at work:
scala> List(1, 2, 3, 4, 5).iterator takeWhileIncl (_ < 4)
res0: Iterator[Int] = non-empty iterator
scala> res0.toList
res1: List[Int] = List(1, 2, 3, 4)
You can roll your own fold over a pair like this:
scala> implicit def Pair_Is_Foldable[A, B](pair: (A, B)) = new {
| def fold[C](f: (A, B) => C): C = f.tupled(pair)
| }
Pair_Is_Foldable: [A, B](pair: (A, B))java.lang.Object{def fold[C](f: (A, B) => C): C}

class IteratorExtension[T](i : Iterator[T]) {
def takeWhileInclusive(predicate:(T) => Boolean) = new Iterator[T] {
val it = i
var isLastRead = false
def hasNext = it.hasNext && !isLastRead
def next = {
val res = it.next
isLastRead = !predicate(res)
res
}
}
}
And there's an error in your implicit. Here it is fixed:
object ImplicitIterator {
implicit def extendIterator[T](i : Iterator[T]) = new IteratorExtension(i)
}

scala> List(0,1,2,3,4,5,6,7).toStream.filter (_ < 6).take(2)
res8: scala.collection.immutable.Stream[Int] = Stream(0, ?)
scala> res8.toList
res9: List[Int] = List(0, 1)
After your update:
scala> def timeConsumeDummy (n: Int): Int = {
| println ("Time flies like an arrow ...")
| n }
timeConsumeDummy: (n: Int)Int
scala> List(0,1,2,3,4,5,6,7).toStream.filter (x => timeConsumeDummy (x) < 6)
Time flies like an arrow ...
res14: scala.collection.immutable.Stream[Int] = Stream(0, ?)
scala> res14.take (4).toList
Time flies like an arrow ...
Time flies like an arrow ...
Time flies like an arrow ...
res15: List[Int] = List(0, 1, 2, 3)
timeConsumeDummy is called 4 times. Am I missing something?

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Scala: Rewrite code duplication with closures - scala

val arr = Array(1,2,4,3,3,4,5) def index(sub: Int, f: Int => Boolean): Int = { var i = arr.length-sub while (f(i)) i -= 1 i } val largest = index(2, i => arr(i) > arr(i+1)) val smallest = index(1, i => arr(largest) > arr(i))

Related

Spark in Scala - Map with Function with Extra Arguments

None if no clear winner in Scala Map maxBy

Partially applied function as object

How to find two elements satisfying two predicates in one pass with Scala?

Scala, extending the iterator

Categories

Resources