ReduceLeft with Vector of pairs? - scala

I have a Vector of Pairs
Vector((9,1), (16,2), (21,3), (24,4), (25,5), (24,6), (21,7), (16,8), (9,9), (0,10))
and I want to return pair with maximum first element in pair.
I've tried it to do like this:
data reduceLeft[(Int, Int)]((y:(Int, Int),z:(Int,Int))=>y._1 max z._1)
and
data reduceLeft((y:(Int, Int),z:(Int,Int))=>y._1 max z._1)
but there is type mismatch error and I can't understand what is wrong with this code.

Why use reduceLeft?
The default max method works well here, because tuples are compared by their first element first:
scala> val v = Vector((9,1), (16,2), (21,3), (24,4), (25,5), (24,6), (21,7), (16,8), (9,9), (0,10))
v: scala.collection.immutable.Vector[(Int, Int)] = Vector((9,1), (16,2), (21,3), (24,4), (25,5), (24,6), (21,7), (16,8), (9,9), (0,10))
scala> v.max
res1: (Int, Int) = (25,5)
If you want reduceLeft instead:
v.reduceLeft( (x, y) => if (x._1 >= y._1) x else y )
Your error is that the function passed to reduceLeft has to return a tuple, not an Int:
y._1 max z._1
Here max is applied to two Ints, so it returns an Int.

max works great in this example. However, if you are wondering how to do this using reduceLeft here it is:
val v = Vector((9,1), (16,2), (21,3), (24,4), (25,5), (24,6), (21,7), (16,8), (9,9), (0,10))
v.reduceLeft( ( x:(Int, Int), y:(Int,Int) ) => if(y._1 > x._1) y else x)
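Another option, not mentioned in the answers above, is the standard maxBy method, which compares elements by a key you extract. The following is a minimal sketch; the object name MaxPair and the ties vector are made up for illustration. It also shows that plain max on tuples compares the second element when the first ones are equal.

```scala
object MaxPair {
  val data: Vector[(Int, Int)] =
    Vector((9,1), (16,2), (21,3), (24,4), (25,5), (24,6), (21,7), (16,8), (9,9), (0,10))

  // maxBy compares only the extracted key, here the first element of each pair
  val byFirst: (Int, Int) = data.maxBy(_._1)

  // max uses the default tuple ordering, which is lexicographic:
  // equal first elements are resolved by the second element
  val ties: Vector[(Int, Int)] = Vector((25, 1), (25, 7))
  val lexMax: (Int, Int) = ties.max
}
```

If only the first element matters, maxBy states that intent more directly than max.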

Related

How to sum up pair elements individually in Scala

I have the following method to sum up the pair elements in an array of pairs. I am new to scala and feel like there will be a better way than the following piece of code.
def accumulate(results: Array[(Int, Int)]): (Int, Int) = {
var x: Int = 0
var y: Int = 0
for (elem <- results) {
x = x + elem._1
y = y + elem._2
}
(x, y)
}
Yes, you can use foldLeft.
(BTW, I would also use List, instead of Array)
results.foldLeft((0, 0)) {
case ((accX, accY), (x, y)) =>
(accX + x, accY + y)
}
All of the operations in scala.collection.ArrayOps are available on Array[T]. In particular, you can unzip an array of pairs into a pair of arrays
val (xs, ys) = results.unzip
Summing a container is a standard use of fold
val x = xs.fold(0)(_ + _)
val y = ys.fold(0)(_ + _)
And then you can return the pair of values
(x, y)
https://scalafiddle.io/sf/meEKv6T/0 has a complete working example.

Error: Value min is not a member of (Int, Int)

I am trying to produce the RDD that contains an array of tuples that has country names as the first element, and the minimum integer of the tuple as the second element.
I have this code here.
val test = sc.parallelize(Array(("US", (4,2)), ("France", (1,2)), ("Italy", (2,3))))
I want to store a variable to a value that looks like this:
Array( ("US", 2), ("France", 1), ("Italy", 2) )
I tried to use this code, but it produced a 'Value min is not a member of (Int, Int)' error.
val test1 = test.map(x => (x._1, x._2.min))
How to get minimum of Tuple2[Int, Int]?
To compute the minimum of numeric elements in a Tuple (x, y), you could use x min y:
val test = sc.parallelize(Array(("US", (4,2)), ("France", (1,2)), ("Italy", (2,3))))
test.map(t => (t._1, t._2._1 min t._2._2)).collect
// res1: Array[(String, Int)] = Array((US,2), (France,1), (Italy,2))
For readability, an alternative is to use case partial function, as follows:
test.map{ case (country, (t1, t2)) => (country, t1 min t2) }
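The same pattern match works on ordinary Scala collections, so it can be tried without a Spark context. A minimal sketch, assuming a plain Seq in place of the RDD (the object name MinOfPair is made up for illustration):

```scala
object MinOfPair {
  // Same data as in the question, held in a plain Seq instead of an RDD
  val rows: Seq[(String, (Int, Int))] =
    Seq(("US", (4, 2)), ("France", (1, 2)), ("Italy", (2, 3)))

  // Destructure the nested tuple and keep the smaller of its two Ints
  val mins: Seq[(String, Int)] =
    rows.map { case (country, (a, b)) => (country, a min b) }
}
```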

Scala type mismatch when adding elements to hashmap

I am representing a graph's adjacency list in Scala in the variable a.
val a = new HashMap[Int, Vector[Tuple2[Int, Int]]] withDefaultValue Vector.empty
for(i <- 1 to N) {
val Array(x, y, r) = readLine.split(" ").map(_.toInt)
a(x) += new Tuple2(y, r)
a(y) += new Tuple2(x, r)
}
I am reading each edge in turn (x and y are nodes, while r is the cost of the edge). After reading it, I am adding it to the adjacency list.
However, when adding the Tuples containing a neighbouring node and a cost to the HashMap I get:
Solution.scala:17: error: type mismatch;
found : (Int, Int)
required: String
a(x) += new Tuple2(y, r)
I don't understand why it wants String. I haven't specified String anywhere.
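Vector has no += method, so the compiler rewrites a(x) += t as a(x) = a(x) + t. Vector has no + method either, and the only + the compiler can find is the String concatenation that is available on any value, which is why it asks for a String.
You probably want something like: a.update(x, a.getOrElse(x, Vector()) :+ ((y, r))).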
Also, you are writing Java code in Scala. It compiles, but amounts to abuse of the language :/
Consider doing something like this next time:
val a = (1 to N) // Range(1, N) would stop at N - 1 and skip the last edge
.map { _ => readLine.split(" ").map(_.toInt) }
.flatMap { case Array(x, y, r) =>
Seq((x, (y, r)), (y, (x, r)))
}
.groupBy(_._1)
.mapValues { _.map(_._2) }
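The update-based approach can also be tried without reading from stdin. The following is a minimal sketch, assuming the edges are already available as a Seq of (x, y, r) triples; the names AdjacencyList and build are made up for illustration.

```scala
import scala.collection.mutable

object AdjacencyList {
  // Builds an undirected adjacency list from (node, node, cost) triples.
  def build(edges: Seq[(Int, Int, Int)]): Map[Int, Vector[(Int, Int)]] = {
    val a = mutable.HashMap.empty[Int, Vector[(Int, Int)]].withDefaultValue(Vector.empty)
    for ((x, y, r) <- edges) {
      // a(x) = ... desugars to a.update(x, ...); :+ appends to the Vector
      a(x) = a(x) :+ ((y, r))
      a(y) = a(y) :+ ((x, r))
    }
    a.toMap // immutable snapshot without the default value
  }
}
```

For example, AdjacencyList.build(Seq((1, 2, 5), (1, 3, 7))) maps node 1 to Vector((2,5), (3,7)).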

Scala: apply Map to a list of tuples

very simple question: I want to do something like this:
var arr1: Array[Double] = ...
var arr2: Array[Double] = ...
var arr3: Array[(Double,Double)] = arr1.zip(arr2)
arr3.foreach(x => {if (x._1 > threshold) {x._2 = x._2 * factor}})
I tried a lot of different syntax versions, but they all failed. How could I solve this? It cannot be very difficult ...
Thanks!
Multiple approaches to solve this, consider for instance the use of collect which delivers an immutable collection arr4, as follows,
val arr4 = arr3.collect {
case (x, y) if x > threshold => (x ,y * factor)
case v => v
}
With a for comprehension like this,
for ((x, y) <- arr3)
yield (x, if (x > threshold) y * factor else y)
I think you want to do something like
scala> val arr1 = Array(1.1, 1.2)
arr1: Array[Double] = Array(1.1, 1.2)
scala> val arr2 = Array(1.1, 1.2)
arr2: Array[Double] = Array(1.1, 1.2)
scala> val arr3 = arr1.zip(arr2)
arr3: Array[(Double, Double)] = Array((1.1,1.1), (1.2,1.2))
scala> arr3.filter(_._1> 1.1).map(_._2*2)
res0: Array[Double] = Array(2.4)
I think there are two problems:
You're using foreach, which returns Unit, where you want to use map, which returns an Array[B].
You're trying to update an immutable value, when you want to return a new, updated value. This is the difference between _._2 = _._2 * factor and _._2 * factor.
To filter the values not meeting the threshold:
arr1.zip(arr2).filter(_._1 > threshold).map(_._2 * factor)
To keep all values, but only multiply the ones meeting the threshold:
arr1.zip(arr2).map {
case (x, y) if x > threshold => y * factor
case (_, y) => y
}
You can do it with this,
arr3.map(x => if (x._1 > threshold) (x._1, x._2 * factor) else x)
How about this?
arr3.map { case(x1, x2) => // extract first and second value
if (x1 > threshold) (x1, x2 * factor) // if first value is greater than threshold, 'change' x2
else (x1, x2) // otherwise leave it as it is
}.toMap
Scala is generally functional, which means you do not change values, but create new values, for example you do not write x._2 = …, since tuple is immutable (you can't change it), but create a new tuple.
This will do what you need.
arr3.map(x => if (x._1 > threshold) (x._1, x._2 * factor) else x)
The key here is that you can return a tuple from the map lambda by putting the two values in parentheses.
Edit: If you want to change the elements of the existing array without creating a new one, assign to each index instead:
arr3.indices.foreach(i => if (arr3(i)._1 > threshold) arr3(i) = (arr3(i)._1, arr3(i)._2 * factor))
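The in-place variant can be sketched as a small helper. This is only an illustration under assumptions: the names ScaleInPlace and scaleInPlace are made up, and it relies on Array being mutable even though the tuples inside it are not.

```scala
object ScaleInPlace {
  // Replaces each pair whose first value exceeds the threshold
  // with a new pair whose second value is multiplied by factor.
  def scaleInPlace(arr: Array[(Double, Double)], threshold: Double, factor: Double): Unit =
    for (i <- arr.indices if arr(i)._1 > threshold) {
      val (x, y) = arr(i)
      arr(i) = (x, y * factor) // tuples are immutable, so store a new one
    }
}
```

The array slot is reassigned; the original tuple itself is never modified.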

Find min and max elements of array

I want to find the min and max elements of an array using for comprehension. Is it possible to do that with one iteration of array to find both min element and max element?
I am looking for a solution without using scala provided array.min or max.
You can get min and max values of an Array[Int] with reduceLeft function.
scala> val a = Array(20, 12, 6, 15, 2, 9)
a: Array[Int] = Array(20, 12, 6, 15, 2, 9)
scala> a.reduceLeft(_ min _)
res: Int = 2
scala> a.reduceLeft(_ max _)
res: Int = 20
See this link for more information and examples of reduceLeft method: http://alvinalexander.com/scala/scala-reduceleft-examples
Here is a concise and readable solution, that avoids the ugly if statements :
def minMax(a: Array[Int]) : (Int, Int) = {
if (a.isEmpty) throw new java.lang.UnsupportedOperationException("array is empty")
a.foldLeft((a(0), a(0)))
{ case ((min, max), e) => (math.min(min, e), math.max(max, e))}
}
Explanation: foldLeft is a standard method on many Scala collections. It lets you pass an accumulator to a callback function that is called for each element of the array.
Take a look at scaladoc for further details
def findMinAndMax(array: Array[Int]) = { // a non-empty array
val initial = (array.head, array.head) // a tuple representing min-max
// foldLeft takes an initial value of type of result, in this case a tuple
// foldLeft also takes a function of 2 parameters.
// the 'left' parameter is an accumulator (foldLeft -> accum is left)
// the other parameter is a value from the collection.
// the function2 should return a value which replaces accumulator (in next iteration)
// when the next value from collection will be picked.
// so on till all values are iterated, in the end accum is returned.
array.foldLeft(initial) { case (acc @ (min, max), x) => // acc names the whole (min, max) tuple
if (x < min) (x, max)
else if (x > max) (min, x)
else acc
}
}
Following on from the other answers - a more general solution is possible, that works for other collections as well as Array, and other contents as well as Int:
def minmax[B >: A, A](xs: Iterable[A])(implicit cmp: Ordering[B]): (A, A) = {
if (xs.isEmpty) throw new UnsupportedOperationException("empty.minmax")
val initial = (xs.head, xs.head)
xs.foldLeft(initial) { case ((min, max), x) =>
(if (cmp.lt(x, min)) x else min, if (cmp.gt(x, max)) x else max) }
}
For example:
minmax(List(4, 3, 1, 2, 5)) //> res0: (Int, Int) = (1,5)
minmax(Vector('Z', 'K', 'B', 'A')) //> res1: (Char, Char) = (A,Z)
minmax(Array(3.0, 2.0, 1.0)) //> res2: (Double, Double) = (1.0,3.0)
(It's also possible to write this a bit more concisely using cmp.min() and cmp.max(), but only if you remove the B >: A type bound, which makes the function less general).
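The more concise variant mentioned above might look like the following sketch. It assumes the B >: A bound is dropped so that Ordering[A] can be used directly; the object name MinMaxSimple is made up for illustration.

```scala
object MinMaxSimple {
  // Less general than the version with B >: A, but shorter:
  // Ordering[A] supplies min and max directly.
  def minmax[A](xs: Iterable[A])(implicit cmp: Ordering[A]): (A, A) = {
    if (xs.isEmpty) throw new UnsupportedOperationException("empty.minmax")
    xs.foldLeft((xs.head, xs.head)) { case ((lo, hi), x) =>
      (cmp.min(lo, x), cmp.max(hi, x))
    }
  }
}
```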
Consider this (for non-empty orderable arrays),
val ys = xs.sorted
val (min,max) = (ys.head, ys.last)
val xs: Array[Int] = ???
var min: Int = Int.MaxValue
var max: Int = Int.MinValue
for (x <- xs) {
if (x < min) min = x
if (x > max) max = x
}
I'm super late to the party on this one, but I'm new to Scala and thought I'd contribute anyway. Here is a solution using tail recursion:
import scala.annotation.tailrec
@tailrec
def max(list: List[Int], currentMax: Int = Int.MinValue): Int = {
if(list.isEmpty) currentMax
else if ( list.head > currentMax) max(list.tail, list.head)
else max(list.tail,currentMax)
}
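Since the question asks for both the minimum and the maximum in one pass, the same tail-recursive idea can be extended to carry two accumulators. A minimal sketch, assuming an Option result for the empty list instead of a sentinel default; the names TailRecMinMax, minMax and loop are made up for illustration.

```scala
import scala.annotation.tailrec

object TailRecMinMax {
  // Returns None for an empty list, otherwise Some((min, max)) in one traversal.
  def minMax(xs: List[Int]): Option[(Int, Int)] = {
    @tailrec
    def loop(rest: List[Int], lo: Int, hi: Int): (Int, Int) = rest match {
      case Nil    => (lo, hi)
      case h :: t => loop(t, lo min h, hi max h)
    }
    xs match {
      case Nil    => None
      case h :: t => Some(loop(t, h, h)) // seed both accumulators with the head
    }
  }
}
```

Seeding with the head avoids returning Int.MinValue for an empty list, which could be mistaken for a real element.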
Of all of the answers I reviewed for this question, DNA's solution was the closest to "Scala idiomatic" I could find. However, it can be slightly improved by:
Performing as few comparisons as needed (important for very large collections)
Providing consistent ordering by using only the Ordering.lt method
Avoiding throwing an Exception
Making the code more readable for those new to and learning Scala
The comments should help clarify the changes.
def minAndMax[B>: A, A](iterable: Iterable[A])(implicit ordering: Ordering[B]): Option[(A, A)] =
if (iterable.nonEmpty)
Some(
iterable.foldLeft((iterable.head, iterable.head)) {
case (minAndMaxTuple, element) =>
val (min, max) =
minAndMaxTuple //decode reference to tuple
if (ordering.lt(element, min))
(element, max) //if replacing min, it isn't possible max will change so no need for the max comparison
else
if (ordering.lt(max, element))
(min, element)
else
minAndMaxTuple //use original reference to avoid instantiating a new tuple
}
)
else
None
And here's the solution expanded to return the lower and upper bounds of a 2d space in a single pass, again using the above optimizations:
def minAndMax2d[B >: A, A](iterable: Iterable[(A, A)])(implicit ordering: Ordering[B]): Option[((A, A), (A, A))] =
if (iterable.nonEmpty)
Some(
iterable.foldLeft(((iterable.head._1, iterable.head._1), (iterable.head._2, iterable.head._2))) {
case ((minAndMaxTupleX, minAndMaxTupleY), (elementX, elementY)) =>
val ((minX, maxX), (minY, maxY)) =
(minAndMaxTupleX, minAndMaxTupleY) //decode reference to tuple
(
if (ordering.lt(elementX, minX))
(elementX, maxX) //if replacing minX, it isn't possible maxX will change so no need for the maxX comparison
else
if (ordering.lt(maxX, elementX))
(minX, elementX)
else
minAndMaxTupleX //use original reference to avoid instantiating a new tuple
, if (ordering.lt(elementY, minY))
(elementY, maxY) //if replacing minY, it isn't possible maxY will change so no need for the maxY comparison
else
if (ordering.lt(maxY, elementY))
(minY, elementY)
else
minAndMaxTupleY //use original reference to avoid instantiating a new tuple
)
}
)
else
None
You could always write your own foldLeft function - that will guarantee one iteration and known performance.
val array = Array(3,4,62,8,9,2,1)
if(array.isEmpty) throw new IllegalArgumentException // Just so we can safely call array.head
val (minimum, maximum) = array.foldLeft((array.head, array.head)) { // We start off with the first element as min and max
case ((min, max), next) =>
if(next > max) (min, next)
else if(next < min) (next, max)
else (min, max)
}
println((minimum, maximum)) // prints (1,62)
scala> val v = Vector(1,2)
scala> v.max
res0: Int = 2
scala> v.min
res1: Int = 1
You could use the min and max methods of Vector