scala array.product function in 2d array - scala

I have a 2D array:
val arr = Array(Array(2, 1), Array(3, 1), Array(4, 1))
I need to multiply the first elements of all inner arrays and sum the second elements of all inner arrays to get this result:
Array(24,3)
I'm looking for a way to use map here, something like (pseudocode):
arr.map(a => Array(a(1stElement).product, a(2ndElement).sum))
Any suggestions?
Regards.

The following works, but note that it is not safe: it throws a MatchError if arr contains any element that does not have exactly two elements. Add the missing cases to the pattern match as your use case requires.
val result = arr.fold(Array(1, 0)) {
case (Array(x1, x2), Array(y1, y2)) => Array(x1 * y1, x2 + y2)
}
Update
As @Luis suggested, if you change your original Array[Array[Int]] to an Array of tuples, another implementation could look like this:
val arr = Array((2, 1), (3, 1), (4, 1))
val (prdArr, sumArr) = arr.unzip
val result = (prdArr.product, sumArr.sum)
println(result) // (24, 3)
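The safety caveat on the fold-based version can be addressed by filtering out malformed rows before folding. This is only a sketch: the object and method names are made up for illustration, and skipping rows that do not have exactly two elements is an assumption, since the question does not say how such rows should be treated.

```scala
object ProductAndSum {
  // Multiplies the first elements and sums the second elements of all
  // well-formed rows. Rows without exactly two elements are skipped
  // (an assumption; the question does not specify this case).
  def productAndSum(rows: Array[Array[Int]]): Array[Int] = {
    val (product, sum) = rows
      .collect { case Array(a, b) => (a, b) }                       // keep 2-element rows only
      .foldLeft((1, 0)) { case ((p, s), (a, b)) => (p * a, s + b) } // (product, sum) accumulator
    Array(product, sum)
  }

  def main(args: Array[String]): Unit = {
    val arr = Array(Array(2, 1), Array(3, 1), Array(4, 1))
    println(productAndSum(arr).mkString("Array(", ", ", ")")) // Array(24, 3)
  }
}
```

Using a tuple as the accumulator keeps the pattern match total, so no row can trigger a MatchError.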

Related

How to trick Scala map method to produce more than one output per each input item?

A fairly complex algorithm is applied to a list of Spark Dataset rows (the list was obtained using groupByKey and flatMapGroups). Most rows are transformed 1:1 from input to output, but some scenarios require more than one output per input. The input row schema can change at any time. map() fits the requirements quite well for the 1:1 transformation, but is there a way to use it to produce 1:n output?
The only workaround I found relies on the foreach method, which has the unpleasant overhead of creating the initial empty list (remember that, unlike in the simplified example below, the real-life list structure changes randomly).
My original problem is too complex to share here, but this example demonstrates the concept. Let's take a list of integers. Each should be transformed into its square, and if the input is even it should also be transformed into one half of the original value:
val X = Seq(1, 2, 3, 4, 5)
val y = X.map(x => x * x) // map is intended for 1:1 transformation, so it works great here
val z = X.map(x => for(n <- 1 to 5) (n, x * x)) // this attempt FAILS: it generates a list of five Unit values ("empty tuples")
// this workaround works, but the newX definition is problematic
var newX = List[Int]() // in reality defined as the head of the input list, dropping the result's tail at the end
val za = X.foreach(x => {
newX = x*x :: newX
if(x % 2 == 0) newX = (x / 2) :: newX
})
newX
Is there a better way than foreach construct?
.flatMap produces any number of outputs from a single input.
val X = Seq(1, 2, 3, 4, 5)
X.flatMap { x =>
if (x % 2 == 0) Seq(x * x, x / 2) else Seq(x * x)
}
// => Seq[Int] = List(1, 4, 1, 9, 16, 2, 25)
flatMap in more detail
In X.map(f), f is a function that maps each input to a single output. By contrast, in X.flatMap(g), the function g maps each input to a sequence of outputs. flatMap then takes all the sequences produced (one for each element in X) and concatenates them.
The neat thing is .flatMap works not just for sequences, but for all sequence-like objects. For an option, for instance, Option(x)#flatMap(g) will allow g to return an Option. Similarly, Future(x)#flatMap(g) will allow g to return a Future.
Whenever the number of elements you return depends on the input, you should think of flatMap.
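As a small illustration of both points, here is a sketch of flatMap on a sequence (the square-and-half rule from the question) and on an Option. The names FlatMapSketch, expand and halfIfEven are invented for this example.

```scala
object FlatMapSketch {
  // 1:n on a sequence: every input yields its square,
  // and even inputs additionally yield half of the input.
  def expand(xs: Seq[Int]): Seq[Int] =
    xs.flatMap(x => if (x % 2 == 0) Seq(x * x, x / 2) else Seq(x * x))

  // For Option, the function passed to flatMap itself returns an Option.
  def halfIfEven(x: Int): Option[Int] =
    if (x % 2 == 0) Some(x / 2) else None

  def main(args: Array[String]): Unit = {
    println(expand(Seq(1, 2, 3, 4, 5)))    // List(1, 4, 1, 9, 16, 2, 25)
    println(Option(4).flatMap(halfIfEven)) // Some(2)
    println(Option(3).flatMap(halfIfEven)) // None
  }
}
```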

Adding Sparse Vectors 3.0.0 Apache Spark Scala

I am trying to create a function like the following to add
two org.apache.spark.ml.linalg.Vector instances, i.e. two sparse vectors.
Such a vector could look like this:
(28,[1,2,3,4,7,11,12,13,14,15,17,20,22,23,24,25],[0.13028398104008743,0.23648605632753023,0.7094581689825907,0.13028398104008743,0.23648605632753023,0.0,0.14218861229025295,0.3580566057240087,0.14218861229025295,0.13028398104008743,0.26056796208017485,0.0,0.14218861229025295,0.06514199052004371,0.13028398104008743,0.23648605632753023])
For example:
def add_vectors(x: org.apache.spark.ml.linalg.Vector,y:org.apache.spark.ml.linalg.Vector): org.apache.spark.ml.linalg.Vector = {
}
Let's look at a use case
val x = Vectors.sparse(2, Array(0), Array(1.0)) // [1, 0]
val y = Vectors.sparse(2, Array(1), Array(1.0)) // [0, 1]
I want the output to be
Vectors.sparse(2, Array(0, 1), Array(1.0, 1.0))
Here's another case where they share the same indices
val x = Vectors.sparse(2, Array(1), Array(1.0))
val y = Vectors.sparse(2, Array(1), Array(1.0))
The output should be
Vectors.sparse(2, Array(1), Array(2.0))
I've realized doing this is harder than it seems. I looked into one possible solution of converting the vectors into breeze, adding them in breeze and then converting it back to a vector. e.g Addition of two RDD[mllib.linalg.Vector]'s. So I tried implementing this.
def add_vectors(x: org.apache.spark.ml.linalg.Vector,y:org.apache.spark.ml.linalg.Vector) ={
val dense_x = x.toDense
val dense_y = y.toDense
val bv1 = new DenseVector(dense_x.toArray)
val bv2 = new DenseVector(dense_y.toArray)
val vectout = Vectors.dense((bv1 + bv2).toArray)
vectout
}
However, this gave me an error on the last line:
val vectout = Vectors.dense((bv1 + bv2).toArray)
Cannot resolve the overloaded method 'dense'.
I'm wondering why this error is occurring and how to fix it.
To answer my own question, I had to think about how sparse vectors are represented. A sparse vector requires three arguments: the number of dimensions, an array of indices, and an array of values. For example:
import org.apache.spark.ml.linalg.{Vector, Vectors}
val indices: Array[Int] = Array(1, 2)
val norms: Array[Double] = Array(0.5, 0.3)
val num_int = 4
val vector: Vector = Vectors.sparse(num_int, indices, norms)
If I convert this sparse vector to an array, I get the following.
Code:
val choiced_array = vector.toArray
println(choiced_array.mkString("[", ", ", "]"))
Output:
[0.0, 0.5, 0.3, 0.0]
This is the dense representation of the vector. Once you convert both vectors to arrays (here vector and a second vector of the same size, vector_2), you can add them with the following code:
val add: Array[Double] = (vector.toArray, vector_2.toArray).zipped.map(_ + _)
This gives you a new array holding the element-wise sum of both vectors. Next, to create the new sparse vector, you need an indices array like the one used in the construction above. Pair each value with its index and mark the zero entries with -1 (comparing with != 0.0 rather than > 0.0, so that negative values are kept):
val new_indices_pre = add.zipWithIndex.map { case (element, i) =>
if (element != 0.0) i else -1
}
Then filter out all -1 entries, which mark a zero at that index.
val new_indices = new_indices_pre.filter(element => element != -1)
Remember to also keep only the non-zero values from the array that holds the sum of the two vectors.
val final_add = add.filter(element => element != 0.0)
Lastly, we can make the new sparse vector:
Vectors.sparse(num_int, new_indices, final_add)
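The same addition can also be sketched without going through dense arrays, by merging the (index, value) pairs of both vectors. To keep the example free of a Spark dependency, SparseVec below is a hypothetical stand-in for a sparse vector (size plus parallel index and value arrays); it is not a Spark class, and the approach is only one possible way to do this.

```scala
object SparseAddSketch {
  // Hypothetical stand-in for a sparse vector: not a Spark type.
  final case class SparseVec(size: Int, indices: Array[Int], values: Array[Double])

  def add(x: SparseVec, y: SparseVec): SparseVec = {
    require(x.size == y.size, "vectors must have the same size")
    val summed = (x.indices.zip(x.values) ++ y.indices.zip(y.values))
      .groupBy(_._1)                                        // group pairs by index
      .map { case (i, pairs) => i -> pairs.map(_._2).sum }  // sum the values per index
      .filter { case (_, v) => v != 0.0 }                   // drop entries that cancel out
      .toSeq
      .sortBy(_._1)                                         // indices in increasing order
    SparseVec(x.size, summed.map(_._1).toArray, summed.map(_._2).toArray)
  }

  def main(args: Array[String]): Unit = {
    val r = add(SparseVec(2, Array(0), Array(1.0)), SparseVec(2, Array(1), Array(1.0)))
    println(r.indices.mkString(",") + " / " + r.values.mkString(",")) // 0,1 / 1.0,1.0
  }
}
```

With Spark on the classpath, the resulting size, indices and values could presumably be handed to Vectors.sparse(size, indices, values) in the same way as in the answer above.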

Scala - create a new list and update particular element from existing list

I am new to Scala and new to OOP too. How can I update a particular element of a list while creating a new list?
val numbers= List(1,2,3,4,5)
val result = numbers.map(_*2)
I need to update third element only -> multiply by 2. How can I do that by using map?
You can use zipWithIndex to map the list into a list of tuples, where each element is accompanied by its index. Then, using map with pattern matching - you single out the third element (index = 2):
val numbers = List(1,2,3,4,5)
val result = numbers.zipWithIndex.map {
case (v, i) if i == 2 => v * 2
case (v, _) => v
}
// result: List[Int] = List(1, 2, 6, 4, 5)
Alternatively - you can use patch, which replaces a sub-sequence with a provided one:
numbers.patch(from = 2, patch = Seq(numbers(2) * 2), replaced = 1)
I think the clearest way of achieving this is by using updated(index: Int, elem: Int). For your example, it could be applied as follows:
val result = numbers.updated(2, numbers(2) * 2)
list.zipWithIndex creates a list of pairs with the original element on the left and its index in the list on the right (indices are 0-based, so the "third element" is at index 2).
val result = numbers.zipWithIndex.map {
case (n, 2) => n*2
case (n, _) => n
}
This creates an intermediate list holding the pairs, and then maps over it to do your transformation. A slightly more efficient approach is to use an iterator. Iterators are 'lazy', so rather than creating an intermediate container, the iterator generates the pairs one by one and sends them straight to .map:
val result = numbers.iterator.zipWithIndex.map {
case (n, 2) => n*2
case (n, _) => n
}.toList
First and foremost, Scala leans towards functional programming with immutable data, although it supports OOP as well. A List is immutable, so you cannot change an element in place, but the updated method returns a new list with one element replaced. See the following example:
Signature: updated(index, value)
val numbers = List(1, 2, 3, 4, 5)
print(numbers.updated(2, 10))
Here the first argument is the index and the second argument is the new value. This code leaves numbers unchanged and prints the new list:
List(1, 2, 10, 4, 5)
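One detail worth keeping in mind is that updated throws an IndexOutOfBoundsException when the index is outside the list. A minimal sketch of a guarded variant follows; the helper name doubleAt and the choice to return the list unchanged for an invalid index are assumptions made for illustration.

```scala
object UpdateSketch {
  // Returns a new list with the element at `index` doubled.
  // For an out-of-range index the original list is returned unchanged
  // (an assumption; throwing or returning an Option would work as well).
  def doubleAt(xs: List[Int], index: Int): List[Int] =
    if (xs.isDefinedAt(index)) xs.updated(index, xs(index) * 2) else xs

  def main(args: Array[String]): Unit = {
    val numbers = List(1, 2, 3, 4, 5)
    println(doubleAt(numbers, 2)) // List(1, 2, 6, 4, 5)
    println(numbers)              // List(1, 2, 3, 4, 5), the original is untouched
  }
}
```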

How to traverse array from both left to right and from right to left?

Suppose I have an imperative algorithm that keeps two indices left and right and moves them from left to right and from right to left
var left = 0
var right = array.length - 1
while (left < right) { .... } // move left and right inside the loop
Now I would like to write this algorithm without mutable indices.
How can I do that ? Do you have any examples of such algorithms ? I would prefer a non-recursive approach.
You can map pairs of elements between your list and its reverse, then go from left to right through that list of pairs and keep taking as long as your condition is satisfied:
val list = List(1, 2, 3, 4, 5)
val zipped = list zip list.reverse
val filtered = zipped takeWhile { case (a, b) => (a < b) }
Value of filtered is List((1, 5), (2, 4)).
Now you can do whatever you need with those elements:
val result = filtered map {
case (a, b) =>
// do something with each left-right pair, e.g. sum them
a + b
}
println(result) // List(6, 6)
If you need some kind of context-dependent operation (that is, each iteration depends on the result of the previous one), then you have to use a more powerful abstraction (a monad), but let's not go there if this is enough for you. Even better would be to simply use recursion, as pointed out by others, but you said that's not an option.
EDIT:
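Version without the extra pass for reversing. It relies on indexed access to elem(length - 1 - index), which is constant-time only on an IndexedSeq such as Vector or Array; on a List each lookup is linear: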
val list = List(1, 2, 3, 4, 5)
val zipped = list.view.zipWithIndex
val filtered = zipped takeWhile { case (a, index) => (a < list(list.length - 1 - index)) }
println(filtered.toList) // List((1, 0), (2, 1))
val result = filtered map {
case (elem, index) => // do something with each left-right pair, e.g. sum them
val (a, b) = (elem, list(list.length - 1 - index))
a + b
}
println(result.toList) // List(6, 6)
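For the context-dependent case mentioned above, where the way left and right move depends on the current state, one non-recursive option is to carry the (left, right) pair through Iterator.iterate. The sketch below uses a two-sum search on a sorted array purely as an illustrative problem; it is not from the question, and the names are made up.

```scala
object TwoPointerSketch {
  // Finds indices (left, right) in a sorted array whose values add up to
  // `target`, moving one of the two indices at every step.
  def twoSum(sorted: Array[Int], target: Int): Option[(Int, Int)] =
    Iterator
      .iterate((0, sorted.length - 1)) { case (l, r) =>
        // the next state depends on the current one: advance left or retreat right
        if (sorted(l) + sorted(r) < target) (l + 1, r) else (l, r - 1)
      }
      .takeWhile { case (l, r) => l < r }                      // same stop condition as the while loop
      .find { case (l, r) => sorted(l) + sorted(r) == target } // early exit on the first hit

  def main(args: Array[String]): Unit = {
    println(twoSum(Array(1, 2, 4, 7, 11), 9)) // Some((1,3))
    println(twoSum(Array(1, 2, 3), 100))      // None
  }
}
```

Because the iterator is lazy, states are only computed as long as takeWhile and find keep asking for them, so the traversal stops early just like the imperative loop.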
Use reverseIterator:
scala> val arr = Array(1,2,3,4,5)
arr: Array[Int] = Array(1, 2, 3, 4, 5)
scala> arr.iterator.zip(arr.reverseIterator).foreach(println)
(1,5)
(2,4)
(3,3)
(4,2)
(5,1)
This function is efficient on IndexedSeq collections, which Array is implicitly convertible to.
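If only the pairs up to the middle are needed, the zipped iterator can be cut off with take. A small sketch, using a palindrome check as an example use; the helper names are invented here.

```scala
object MirrorPairsSketch {
  // Pairs each element of the first half with its mirror element from the end.
  def mirrorPairs[A](xs: IndexedSeq[A]): List[(A, A)] =
    xs.iterator.zip(xs.reverseIterator).take(xs.length / 2).toList

  // Example use: a sequence is a palindrome if every mirrored pair is equal.
  def isPalindrome[A](xs: IndexedSeq[A]): Boolean =
    xs.iterator.zip(xs.reverseIterator).take(xs.length / 2).forall { case (a, b) => a == b }

  def main(args: Array[String]): Unit = {
    println(mirrorPairs(Vector(1, 2, 3, 4, 5)))  // List((1,5), (2,4))
    println(isPalindrome(Vector(1, 2, 3, 2, 1))) // true
  }
}
```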
It really depends on what needs to be done at each iteration, but here's something to think about.
array.foldRight(0){case (elem, index) =>
if (index < array.length/2) {
/* array(index) and elem are opposite elements in the array */
/* do whatever (note: requires side effects) */
index+1
} else index // do nothing
} // ignore result
Upside: Traverse the array only once and no mutable variables.
Downside: Requires side effects (but that was implied in your example). Also, it'd be better if it traversed only half the array, but that would require early breakout and Scala doesn't offer an easy/elegant solution for that.
myarray = [1,2,3,4,5,6]
rmyarray = myarray[::-1]
Final_Result = []
for i in range(len(myarray)//2):
    Final_Result.append(myarray[i])
    Final_Result.append(rmyarray[i])
print(Final_Result)
# This is the simple approach I think 😉.

How to Map Partial Elements in Scala/Spark

I have a list of integers:
val mylist = List(1, 2, 3, 4)
What I want to do is pick the even elements of mylist and multiply them by 2.
Maybe the code should be something like:
mylist.map{ case x%2==2 => x*2 }
I expect the result to be List(4, 8) but it's not. What is the correct code?
I know I could achieve this with filter + map:
mylist.filter(_ % 2 == 0).map(_ * 2)
but is there some way to do it using only map()?
map does not reduce the number of elements in a transformation.
filter + map is the right approach.
But if a single method is needed, use collect:
mylist.collect{ case x if x % 2 == 0 => 2 * x }
Edit:
withFilter + map is more efficient than filter + map, as withFilter does not create an intermediate collection (it works lazily):
mylist.withFilter(_ % 2 == 0).map(_ * 2)
which is the same as this for comprehension:
for { e <- mylist if (e % 2 == 0) } yield 2 * e
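The variants discussed in this answer can be put side by side to check that they agree. The flatMap line is an extra alternative added here for comparison, not something the answer proposes.

```scala
object EvenDoublingSketch {
  val mylist = List(1, 2, 3, 4)

  // filter and transform in a single pass with a partial function
  val viaCollect = mylist.collect { case x if x % 2 == 0 => x * 2 }

  // lazy filter followed by map, no intermediate list
  val viaWithFilter = mylist.withFilter(_ % 2 == 0).map(_ * 2)

  // for comprehension with a guard
  val viaFor = for { e <- mylist if e % 2 == 0 } yield 2 * e

  // extra alternative: flatMap drops an element by returning an empty list
  val viaFlatMap = mylist.flatMap(x => if (x % 2 == 0) List(x * 2) else Nil)

  def main(args: Array[String]): Unit =
    println(List(viaCollect, viaWithFilter, viaFor, viaFlatMap).distinct) // List(List(4, 8))
}
```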