Find four elements from a list that sum to a given value - Scala

I am new to the Scala programming language and want to implement code for the scenario below.
Given a list sampleone of n integers and an integer samplethree, there are elements a, b, c and d in sampleone such that a + b + c + d = samplethree. Find all unique quadruplets in the list that sum to samplethree.
Example:
sampleone = [1, 0, -1, 0, -2, 2] and samplethree = 0
A solution set is:
[-1,0,0,1]
[-2,-1,1,2]
[-2,0,0,2]
The code that I have used is:
scala> def findFourElements(A: List[Int], n: Int, x: Int) = {
| {
| for(a <- 0 to A.length-3)
| {
| for(b <- a+1 to A.length-2)
| {
| for(c <- b+1 to A.length-1)
| {
| for(d <- c+1 to A.length)
| {
| if(A(a) + A(b) + A(c) + A(d) == x)
| {
| print(A(a)+A(b)+A(c)+A(d))
| }}}}}}
| }
findFourElements: (A: List[Int], n: Int, x: Int)Unit
scala> val sampleone = List(1,0,-1,0,-2,2)
sampleone: List[Int] = List(1, 0, -1, 0, -2, 2)
scala> val sampletwo = sampleone.length
sampletwo: Int = 6
scala> val samplethree = 0
samplethree: Int = 0
scala> findFourElements(sampleone,sampletwo,samplethree)
0java.lang.IndexOutOfBoundsException: 6
at scala.collection.LinearSeqOptimized$class.apply(LinearSeqOptimized.scala:65)
at scala.collection.immutable.List.apply(List.scala:84)
at $anonfun$findFourElements$1$$anonfun$apply$mcVI$sp$1$$anonfun$apply$mcVI$sp$2$$anonfun$apply$mcVI$sp$3.apply$mcVI$sp(<console>:33)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at $anonfun$findFourElements$1$$anonfun$apply$mcVI$sp$1$$anonfun$apply$mcVI$sp$2.apply$mcVI$sp(<console>:31)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at $anonfun$findFourElements$1$$anonfun$apply$mcVI$sp$1.apply$mcVI$sp(<console>:29)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at $anonfun$findFourElements$1.apply$mcVI$sp(<console>:27)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:160)
at findFourElements(<console>:25)
... 48 elided
But I am getting an IndexOutOfBoundsException.
Also, is there a way to write more optimized code for this in Scala?
Thanks for the help.

The IndexOutOfBoundsException comes from the loop bounds: to is inclusive in Scala, so for(d <- c+1 to A.length) eventually evaluates A(A.length), one past the last valid index (the other loops are off by one in the same way). Switching to until would fix the bounds, but there is a much shorter way to do what you want:
sampleone.combinations(4).filter(_.sum == samplethree)
The combinations method gives an iterator that delivers each possible combination of values in turn. If there is more than one way to construct the same sequence, only one will be returned.
The filter call removes any sequences that do not sum to the samplethree value.
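For reference, here is a minimal, self-contained sketch of that approach (the helper name findFourElements and the sorting are illustrative additions, not from the original answer; combinations already returns each distinct quadruplet only once):
// Every distinct quadruplet that sums to the target, sorted for readability.
def findFourElements(xs: List[Int], target: Int): List[List[Int]] =
  xs.combinations(4)
    .filter(_.sum == target)
    .map(_.sorted)
    .toList

val sampleone = List(1, 0, -1, 0, -2, 2)
val samplethree = 0
findFourElements(sampleone, samplethree).foreach(println)
// Prints List(-1, 0, 0, 1), List(-2, -1, 1, 2) and List(-2, 0, 0, 2), in some order.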

Related

Scala: Run a function that is written for arrays on a DataFrame that contains a column of arrays

So, I have the following functions that work perfectly when I use them on arrays:
def magnitude(x: Array[Int]): Double = {
  math.sqrt(x.map(i => i * i).sum)
}
def dotProduct(x: Array[Int], y: Array[Int]): Int = {
  (for ((a, b) <- x zip y) yield a * b).sum
}
def cosineSimilarity(x: Array[Int], y: Array[Int]): Double = {
  require(x.size == y.size)
  dotProduct(x, y) / (magnitude(x) * magnitude(y))
}
But I don't know how to run them on an array that I have in a Spark DataFrame column.
I know the problem is that the functions expect arrays, but I am giving them a column. I don't know how to solve that.
One way is to wrap your functions in UDFs. Yet UDFs are known to be suboptimal most of the time. You could therefore rewrite your functions using Spark's built-in column functions. To ease reuse of the expressions you write, you can define functions that take Column objects as parameters.
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{aggregate, arrays_zip, lit, transform}

def magnitude(x: Column) = {
  // sum of the squares of the array's elements
  aggregate(transform(x, c => c * c), lit(0), _ + _)
}
def dotProduct(x: Column, y: Column) = {
  val products = transform(arrays_zip(x, y), s => s(x.toString) * s(y.toString))
  aggregate(products, lit(0), _ + _)
}
def cosineSimilarity(x: Column, y: Column) = {
  dotProduct(x, y) / (magnitude(x) * magnitude(y))
}
Let's test this:
val df = spark.range(1).select(
  array(lit(1), lit(2), lit(3)) as "x",
  array(lit(1), lit(3), lit(5)) as "y"
)

df.select(
  'x, 'y,
  magnitude('x) as "magnitude_x",
  dotProduct('x, 'y) as "dot_prod_x_y",
  cosineSimilarity('x, 'y) as "cosine_x_y"
).show()
which yields:
+---------+---------+-----------+------------+--------------------+
| x| y|magnitude_x|dot_prod_x_y| cosine_x_y|
+---------+---------+-----------+------------+--------------------+
|[1, 2, 3]|[1, 3, 5]| 14| 22|0.044897959183673466|
+---------+---------+-----------+------------+--------------------+
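Note that, unlike the original Array-based magnitude, this Column version returns the sum of squares without taking the square root, which is why magnitude_x shows 14 rather than the 3.7416573867739413 that the UDF version further down prints, and why cosine_x_y is not the usual cosine similarity. A hedged sketch of a variant that restores the square root with Spark's built-in sqrt (the name magnitudeSqrt is ours, not from the original answer):
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{aggregate, lit, sqrt, transform}

// Same aggregation as above, wrapped in sqrt so it matches math.sqrt(x.map(i => i * i).sum).
def magnitudeSqrt(x: Column): Column =
  sqrt(aggregate(transform(x, c => c * c), lit(0), _ + _))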
To use your own functions within sparkSQL, you need to wrap them inside of a UDF (user defined function).
val df = spark.range(1)
  .withColumn("x", array(lit(1), lit(2), lit(3)))

// defining the user defined functions from the scala functions
val magnitude_udf = udf(magnitude _)
val dot_product_udf = udf(dotProduct(_, _))

df
  .withColumn("magnitude", magnitude_udf('x))
  .withColumn("dot_product", dot_product_udf('x, 'x))
  .show
+---+---------+------------------+-----------+
| id| x| magnitude|dot_product|
+---+---------+------------------+-----------+
| 0|[1, 2, 3]|3.7416573867739413| 14|
+---+---------+------------------+-----------+
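If you also want to call the function from SQL expressions rather than through Column objects, a hedged sketch (assuming the original Array-based magnitude is the one in scope; the registered name "magnitude" is just an example) is to register it:
// Register the Scala function so SQL strings can call it by name.
spark.udf.register("magnitude", magnitude _)
df.selectExpr("id", "x", "magnitude(x) as magnitude").show()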

Scala foldRight giving different results than reverse.foldLeft

I am writing a function to convert a byte array to an integer. It detects the endianness of the system to determine which direction the 4-byte array should be read from. I am using foldLeft with bit shifting and bitwise or; ideally, if the system is little endian, you would just fold right instead.
However, this is returning the wrong value.
scala> val bytes = Array[Byte](0xAA.toByte, 0xBB.toByte, 0xCC.toByte, 0xDD.toByte)
val bytes: Array[Byte] = Array(-86, -69, -52, -35)
scala> bytes.foldLeft(0)((accum, num) => {
| (num & 0x000000FF) | (accum << 8)
| }).toHexString
val res38: String = aabbccdd <- correct
scala> bytes.foldRight(0)((accum, num) => {
| (num & 0x000000FF) | (accum << 8)
| }).toHexString
val res36: String = ffffaa00 <- wat
scala> bytes.reverse.foldLeft(0)((accum, num) => {
| (num & 0x000000FF) | (accum << 8)
| }).toHexString
val res37: String = ddccbbaa <- correct
I'm a bit of a Scala noob; would someone mind pointing out why this is producing different values?
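The short version: foldRight passes its arguments as (element, accumulator), the mirror image of foldLeft's (accumulator, element), so in the foldRight call above accum is actually the current byte and num is the running total, which is how ffffaa00 falls out. A hedged sketch with the parameters in the order foldRight actually supplies them:
val bytes = Array[Byte](0xAA.toByte, 0xBB.toByte, 0xCC.toByte, 0xDD.toByte)

// foldLeft:  op(accumulator, element)
val big = bytes.foldLeft(0)((acc, b) => (acc << 8) | (b & 0xFF))

// foldRight: op(element, accumulator); note the swapped parameter order
val little = bytes.foldRight(0)((b, acc) => (acc << 8) | (b & 0xFF))

println(big.toHexString)    // aabbccdd
println(little.toHexString) // ddccbbaa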

Recursion in Scala

New to Scala and trying to figure out recursion.
Having the following definitions in my session:
def inc(n: Int) = n + 1
def dec(n: Int) = n - 1
How could I redefine the function below to use recursion with inc and dec?
def add(n: Int, m: Int) = n + m
I'm interested in learning both regular recursion and tail recursion.
Thanks
How about this:
scala> def inc(n: Int) = n + 1
inc: (n: Int)Int
scala> def dec(n: Int) = n - 1
dec: (n: Int)Int
scala> def add(n: Int, m: Int): Int = m match {
| case 0 => n
| case _ if m > 0 => add(inc(n), dec(m))
| case _ => add(dec(n), inc(m))
| }
add: (n: Int, m: Int)Int
scala> add(100, 99)
res0: Int = 199
scala> add(100, -99)
res1: Int = 1
Or there is another solution, which is an implementation of the Peano axioms.
scala> def add2(n: Int, m: Int): Int = m match {
| case 0 => n
| case _ if m > 0 => inc(add2(n, dec(m)))
| case _ => dec(add2(n, inc(m)))
| }
add2: (n: Int, m: Int)Int
Tail recursion has 3 parts, as far as I'm concerned:
a condition to end the recursion;
a return value if the condition is met (the returned value is one of, or derived from, the parameters of the tail-recursive function);
and the call to itself if the condition is not met.
sample:
def inc(n: Int) = n + 1
def dec(n: Int) = n - 1
def add(n: Int, m: Int, sum: Int): Int = {
  // condition to break/end the recursion
  if (m <= 0) {
    // returned value once the condition is met; this is the final output of the recursion
    sum
  } else {
    // call to itself while the condition is unmet
    add(inc(n), dec(m), n + m)
  }
}
As you can see, it feels like you are writing a while loop, but in a more functional way.
With plain recursion, each call is pushed onto the call stack, so the stack grows with the depth of the recursive calls (which can result in a StackOverflowError); with tail recursion, the compiler reuses a single frame, much like a while loop.
sample of recursion:
def addAllNumberFromNToZero(n: Int): Int = {
  if (n <= 0) {
    0
  } else {
    n + addAllNumberFromNToZero(n - 1)
  }
}
Using regular recursion, you could try something like:
def inc(n: Int) = n + 1
def dec(n: Int) = n - 1
def add(n: Int, m: Int): Int = {
  if (m == 0) n
  else add(inc(n), dec(m))
}
The add function recursively calls itself, each time incrementing n and decrementing m. The recursion stops when m reaches zero, at which point n is returned.
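Since the recursive call in add above is already in tail position, a possible refinement (a sketch, with the same assumption that m is non-negative) is to annotate it with scala.annotation.tailrec, so the compiler verifies that the call really is compiled into a loop:
import scala.annotation.tailrec

def inc(n: Int) = n + 1
def dec(n: Int) = n - 1

// Compilation fails if the recursive call ever stops being in tail position.
@tailrec
def add(n: Int, m: Int): Int =
  if (m == 0) n
  else add(inc(n), dec(m))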

Accessing the index of a particular cell in Scala

I have to write a method all() which returns a list of tuples; each tuple will contain the row, the column and the set relevant to that row and column, whenever the function meets a 0 in the list. I have already written the hyp function, which returns the set part I need, e.g. Set(1,2). I am using a list of lists:
| 0 | 0 | 9 |
| 0 | x | 0 |
| 7 | 0 | 8 |
If Set(1,2) refers to the cell marked as x, all() should return (1, 1, Set(1,2)), where 1, 1 are the indices of the row and column.
I wrote this method using zipWithIndex. Is there any simpler way to access an index in this case without zipWithIndex? Thanks in advance.
Code:
def all(): List[(Int, Int, Set[Int])] = {
  puzzle.list.zipWithIndex flatMap { rowAndIndex =>
    rowAndIndex._1.zipWithIndex.withFilter(_._1 == 0) map { colAndIndex =>
      (rowAndIndex._2, colAndIndex._2, hyp(rowAndIndex._2, colAndIndex._2))
    }
  }
}
The (_._1 == 0) is there because the function has to return an (Int, Int, Set[Int]) triple only when it finds a 0 in the grid.
It's fairly common to use zipWithIndex. You can minimise wrestling with tuples/pairs by pattern matching on variables within the tuple:
def all(grid: List[List[Int]]): List[(Int, Int, Set[Int])] =
  grid.zipWithIndex flatMap { case (row, r) =>
    row.zipWithIndex.withFilter(_._1 == 0) map { case (col, c) => (r, c, hyp(r, c)) }
  }
This can be converted to a 100% equivalent for-comprehension:
def all(grid: List[List[Int]]): List[(Int, Int, Set[Int])] =
  for {
    (row, r) <- grid.zipWithIndex
    (col, c) <- row.zipWithIndex if col == 0
  } yield (r, c, hyp(r, c))
Both of the above produce the same compiled code.
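A small usage sketch of either form, with an illustrative grid and a stub hyp standing in for the asker's real one:
// Stub for illustration only; the real hyp comes from the asker's puzzle.
def hyp(r: Int, c: Int): Set[Int] = Set(1, 2)

def all(grid: List[List[Int]]): List[(Int, Int, Set[Int])] =
  for {
    (row, r) <- grid.zipWithIndex
    (col, c) <- row.zipWithIndex if col == 0
  } yield (r, c, hyp(r, c))

val grid = List(
  List(0, 0, 9),
  List(0, 5, 0),
  List(7, 0, 8))

all(grid).foreach(println)
// One (row, column, Set) triple per zero cell, e.g. (0,0,Set(1, 2))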
Note that your requirement means every solution is at least O(n) = O(r*c): you must visit each and every cell. However, user60561's answer does more work than that, because for each cell there is a linear-time lookup in list(x)(y) (List has no constant-time indexed access):
for {
  x <- list.indices
  y <- list(0).indices
  if list(x)(y) == 0
} yield (x, y, hyp(x, y))
Here's similar (imperative!) logic, but O(n):
var r = -1
list flatMap { row =>
  r += 1
  var c = -1
  row flatMap { col =>
    c += 1
    if (col == 0) Some((r, c, hyp(r, c))) else None
  }
}
Recursive version (uses a results accumulator to enable tail recursion):
type Grid = List[List[Int]]
type GridHyp = List[(Int, Int, Set[Int])]

def all(grid: Grid): GridHyp = {
  def rowHypIter(row: List[Int], r: Int, c: Int, accum: GridHyp): GridHyp = row match {
    case Nil => accum
    case col :: othCols =>
      val nextAccum = if (col == 0) (r, c, hyp(r, c)) :: accum else accum
      rowHypIter(othCols, r, c + 1, nextAccum)
  }
  def gridHypIter(grid: Grid, r: Int, accum: GridHyp): GridHyp = grid match {
    case Nil => accum
    case row :: othRows => gridHypIter(othRows, r + 1, rowHypIter(row, r, 0, accum))
  }
  gridHypIter(grid, 0, Nil)
}
'Monadic' logic (flatMap/map/withFilter, or the equivalent for-comprehensions) is usually neater than recursion plus pattern matching, as is evident here.
The simplest way I can think of is just a classic for loop:
for {
  x <- list.indices
  y <- list(0).indices
  if list(x)(y) == 0
} yield (x, y, hyp(x, y))
It assumes that your second dimension is of a uniform size. With this code, I would also recommend you use an Array or Vector if your grid sizes are larger than 100 or so, because list(x)(y) is an O(n) operation.
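A hedged sketch of that suggestion, reusing list and hyp from above: convert the grid to Vector[Vector[Int]] once, so the repeated indexed lookups become effectively constant-time:
// One-off conversion; Vector gives effectively constant-time apply(i).
val vgrid: Vector[Vector[Int]] = list.map(_.toVector).toVector

for {
  x <- vgrid.indices
  y <- vgrid(0).indices
  if vgrid(x)(y) == 0
} yield (x, y, hyp(x, y))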

Scala methods ending in _=

I seem to remember Scala treating methods ending in _= specially, so something like this:
object X { var x: Int = 0; def y_=(n : Int) { x = n }}
X.y = 1
should call X.y_=(1). However, in 2.8.0 RC1, I get an error message:
<console>:6: error: value y is not a member of object X
X.y = 1
^
Interestingly, just trying to call the method without parentheses fails as well:
scala> X.y_= 1
<console>:1: error: ';' expected but integer literal found.
X.y_= 1
^
Am I misremembering something which does actually exist or did I just invent it out of whole cloth?
This is one of those corner cases in Scala: the assignment syntax X.y = 1 only works when the setter y_= is accompanied by a matching getter y.
The following works fine:
scala> object X {
| var x: Int = 0
| def y = x
| def y_=(n: Int) { x = n }
| }
defined module X
scala> X.y = 45
scala> X.y
res0: Int = 45
Without a getter, the assignment syntax is unavailable, but you can still call the setter method explicitly:
scala> object X { var x: Int = 0; def y_=(n : Int) { x = n }}
defined module X
scala>
scala> X y_= 1
scala> X.x
res1: Int = 1
With both a getter and a setter defined, the assignment syntax works as expected:
scala> object X { var x: Int = _; def y = x ; def y_=(n: Int) { x = n } }
defined module X
scala> X.y = 1
scala> X.y
res2: Int = 1
scala>
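For completeness, a plain var already generates the matching getter/setter pair, so the assignment syntax works without writing y_= by hand (a small sketch; the object name Z is ours):
object Z {
  var y: Int = 0 // the compiler generates both y and y_= for a var
}
Z.y = 1 // uses the generated setter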