Max subsequence sum in an array with no two adjacent elements in Scala

I am trying to solve the problem of computing the maximum subsequence sum of an array, where no two adjacent elements may both contribute to the sum.
For the element at index i, I take the maximum of the sums ending at indices i-2 and i-3 and add the ith element to it, so that no two adjacent elements are ever part of the same sum.
I solved it recursively in Scala as follows (ideone link):
/**
 * Question: Given an array of positive numbers, find the maximum sum of a subsequence
 * with the constraint that no 2 numbers in the sequence should be adjacent in the array.
 */
object Main extends App {
  val inputArray = Array(5, 15, 10, 40, 50, 35)

  print(getMaxAlternativeElementSum(0, 0, inputArray(0)))

  def getMaxAlternativeElementSum(tracker: Int, prevSum: Int, curSum: Int): Int = tracker match {
    case _ if tracker == 0                 => getMaxAlternativeElementSum(tracker + 1, 0, inputArray(tracker))
    case _ if tracker >= inputArray.length => curSum
    case _ =>
      val maxSum = curSum.max(prevSum)
      getMaxAlternativeElementSum(tracker + 1, maxSum, prevSum + inputArray(tracker))
  }
}
Each time, I carry the previous two sums into the next recursive call. Can I do this more elegantly using any Scala idioms?

Not sure if I understood correctly what you want to do, but maybe this will work for you:
def getMaxAlternativeElementSum(input: Array[Int]): Int = {
  val sums = input.zipWithIndex.fold((0, 0)) { (acc, elem) =>
    elem._2 % 2 match {
      case 0 => (acc._1 + elem._1, acc._2)
      case 1 => (acc._1, acc._2 + elem._1)
    }
  }
  if (sums._1 > sums._2) sums._1 else sums._2
}
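For an idiomatic take on the original recurrence itself, here is a minimal sketch of the standard pairwise fold (the "house robber" recurrence), assuming non-negative inputs as the question states. The name maxNonAdjacentSum is my own; incl is the best sum that includes the current element, excl the best sum that excludes it:
// A sketch, not the poster's code: carry (incl, excl) through a foldLeft
def maxNonAdjacentSum(xs: Seq[Int]): Int = {
  val (incl, excl) = xs.foldLeft((0, 0)) { case ((i, e), x) =>
    (e + x, i max e) // include x (previous element must then be excluded), or skip x
  }
  incl max excl
}

maxNonAdjacentSum(Array(5, 15, 10, 40, 50, 35)) // 90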

Related

Avoid ListBuffer while preparing an element-wise multiplication of two SparseVectors

I'm trying to implement an element-wise multiplication of two ml.linalg.SparseVector instances (also called a Hadamard product).
A SparseVector represents a vector, but rather than having space taken up by all the "0" values, they are omitted. The vector is represented as two lists of Indices and Values.
For example: SparseVector(indices: [0, 100, 100000], values: [0.25, 1, 0.8]) concisely represents an array of 100,000 elements, where only 3 values are non-zero.
I now need an element-wise multiplication of two of these, and there seems to be no built-in. Conceptually, it should be simple - any indices they don't have in common are dropped, and for the indices in common, the numbers are multiplied together.
For example: SparseVector(indices: [0, 500, 100000], values: [10, 1, 10]) when multiplied with the above should return: SparseVector(indices: [0, 100000], values: [2.5, 8])
Sadly, I've found no built-in for this. I have an approach that does it in a single pass, but it isn't very Scala-y: it has to build up the lists in a loop as it discovers which indices are in common, and then grab the corresponding value for each index (the values sit at the same position, but in a second array).
import org.apache.spark.ml.linalg._
import org.apache.spark.sql.functions.udf
import scala.collection.mutable.ListBuffer
// Return a new SparseVector whose values are the element-wise product (Hadamard product)
val multSparseVectors = udf((v1: SparseVector, v2: SparseVector) => {
  // val commonIndexes = v1.indices.intersect(v2.indices); // Missing scale factors are assumed to have a value of 0, so only common elements remain
  // TODO: No clear way to map common indices to the values that go with those indices. E.g. no "valueForIndex" method
  // new SparseVector(v1.size, commonIndexes, commonIndexes.map(i => v1.valueForIndex(i) * v2.valueForIndex(i)).toArray);
  val indices = ListBuffer[Int]() // TODO: Some way to do this without mutable lists?
  val values = ListBuffer[Double]()
  var v1Pos = 0 // Current position in v1.indices (we will be making a single pass)
  var v2Pos = 0 // Current position in v2.indices (we will be making a single pass)
  while (v1Pos < v1.indices.length && v2Pos < v2.indices.length) {
    // Advance our position in SparseVector 2 until we've matched or passed the current SparseVector 1 index
    while (v2Pos < v2.indices.length && v2.indices(v2Pos) < v1.indices(v1Pos))
      v2Pos += 1
    if (v2Pos < v2.indices.length && v1.indices(v1Pos) == v2.indices(v2Pos)) {
      indices += v1.indices(v1Pos)
      values += v1.values(v1Pos) * v2.values(v2Pos)
    }
    v1Pos += 1
  }
  new SparseVector(v1.size, indices.toArray, values.toArray)
})
spark.udf.register("multSparseVectors", multSparseVectors)
Can anyone think of a way to do this with map or similar? My main goal is to avoid making multiple O(N) passes over the second vector to look up the position of each value in its indices list (so I can grab the corresponding values entry), because that would take O(K + N²) time when I know an O(K + N) solution is possible.
I've come up with a solution by boiling this problem into a more general one:
Finding the indices at which two arrays intersect
Given an answer to the above question (the positions at which the two arrays v1.indices and v2.indices intersect), we can trivially use those positions to extract both the new SparseVector indices and the values from each vector to multiply together.
The solution is given below:
%scala
import scala.annotation.tailrec
import org.apache.spark.ml.linalg._
import org.apache.spark.sql.functions.udf

// This fanciness from https://stackoverflow.com/a/71928709/529618 finds the indices at which two lists intersect
@tailrec
def indicesOfIntersection(left: List[Int], right: List[Int], lidx: Int = 0, ridx: Int = 0, result: List[(Int, Int)] = Nil): List[(Int, Int)] = (left, right) match {
  case (Nil, _) | (_, Nil) => result.reverse
  case (l :: tail, r :: _) if l < r => indicesOfIntersection(tail, right, lidx + 1, ridx, result)
  case (l :: _, r :: tail) if l > r => indicesOfIntersection(left, tail, lidx, ridx + 1, result)
  case (l :: ltail, r :: rtail) => indicesOfIntersection(ltail, rtail, lidx + 1, ridx + 1, (lidx, ridx) :: result)
}
// Return a new SparseVector whose values are the element-wise product (Hadamard product)
val multSparseVectors = udf((v1: SparseVector, v2: SparseVector) => {
  val intersection = indicesOfIntersection(v1.indices.toList, v2.indices.toList)
  new SparseVector(v1.size,
    intersection.map { case (x1, _)  => v1.indices(x1) }.toArray,
    intersection.map { case (x1, x2) => v1.values(x1) * v2.values(x2) }.toArray)
})
spark.udf.register("multSparseVectors", multSparseVectors)
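For completeness, a usage sketch (the DataFrame df and its columns v1 and v2 are hypothetical, and import spark.implicits._ is assumed for the $ column syntax):
// Apply the UDF directly to two SparseVector columns
val withProduct = df.withColumn("hadamard", multSparseVectors($"v1", $"v2"))
// Or, via the registered name, from SQL:
// spark.sql("SELECT multSparseVectors(v1, v2) AS hadamard FROM vectors")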

Splitting a Scala Range into evenly-sized contiguous sub-Ranges

If I have a Range, how can I split it into a sequence of contiguous sub-ranges, where the number of sub-ranges (buckets) is specified? Empty buckets should be omitted if there are not enough items.
For example:
splitRange(1 to 6, 3) == Seq(Range(1,2), Range(3,4), Range(5,6))
splitRange(1 to 2, 3) == Seq(Range(1), Range(2))
Some additional constraints, that rule out some of the solutions I've seen:
Roughly even bucket size - the bucket size should vary by 1, at most
The length of the input range may sometimes be very large, so the ranges should not be materialized into sequences (e.g. can't use grouped)
This also implies that we don't allocate numbers to buckets in round-robin fashion, because then numbers in each bucket wouldn't be contiguous and so wouldn't form a Range
Ideally, the sub-ranges would be produced in order, i.e. (1,2), (3,4), not (3,4), (1,2)
A colleague found a solution here:
def splitRange(r: Range, chunks: Int): Seq[Range] = {
  if (r.step != 1)
    throw new IllegalArgumentException("Range must have step size equal to 1")
  val nchunks = scala.math.max(chunks, 1)
  val chunkSize = scala.math.max(r.length / nchunks, 1)
  val starts = r.by(chunkSize).take(nchunks)
  val ends = starts.map(_ - 1).drop(1) :+ r.end
  starts.zip(ends).map(x => x._1 to x._2)
}
but this can produce very uneven bucket sizes when N is small, e.g.:
splitRange(1 to 14, 5)
//> Vector(Range(1, 2), Range(3, 4), Range(5, 6),
//|        Range(7, 8), Range(9, 10, 11, 12, 13, 14))
Note the oversized final bucket.
Floating-point approaches
One way is to generate a fractional (floating-point) offset for each bucket boundary, round each to an integer, and zip consecutive boundaries into Ranges. Empty Ranges then need filtering out, using collect.
def splitRange(r: Range, chunks: Int): Seq[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")
  val m = r.length.toDouble
  val bins = (0 to chunks).map { x => math.round((x.toDouble * m) / chunks).toInt }
  val pairs = bins zip bins.tail
  pairs.collect { case (a, b) if b > a => r.drop(a).take(b - a) }
}
(The first version of this solution had a rounding problem such that it could not handle Int.MaxValue - this has now been fixed based on Rex Kerr's recursive floating-point solution below)
Another floating-point approach is to recurse down the range, splitting the head chunk off the range each time, so we cannot miss any elements. This version handles Int.MaxValue correctly.
def splitRange(r: Range, chunks: Int): Seq[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")
  val chunkSize = r.length.toDouble / chunks
  def go(i: Int, r: Range, delta: Double, acc: List[Range]): List[Range] = {
    if (i == chunks) r :: acc // ensures the last chunk has all remaining values, even if error accumulates
    else {
      val s = delta + chunkSize
      val (chunk, rest) = r.splitAt(s.toInt)
      go(i + 1, rest, s - s.toInt, if (chunk.length > 0) chunk :: acc else acc)
    }
  }
  go(1, r, 0.0D, Nil).reverse
}
One can also recurse to generate the (start,end) pairs, rather than zipping them. This is adapted from Rex Kerr's answer to a similar question
def splitRange(r: Range, chunks: Int): Seq[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")
  val m = r.length
  val bins = (0 to chunks).map { x => math.round((x.toDouble * m) / chunks).toInt }
  def snip(r: Range, ns: Seq[Int], got: Vector[Range]): Vector[Range] = {
    if (ns.length < 2) got
    else {
      val (i, j) = (ns.head, ns.tail.head)
      snip(r.drop(j - i), ns.tail, got :+ r.take(j - i))
    }
  }
  snip(r, bins, Vector.empty).filter(_.length > 0)
}
Integer approach
Finally, I realized that this can be done with purely integer arithmetic by adapting Bresenham's line-drawing algorithm, which solves a basically equivalent problem - how to allocate the x-pixels evenly across the y rows, using only integer operations!
I initially translated the pseudo-code into an imperative solution using var and ArrayBuffer, then converted it into a tail-recursive solution:
def splitRange(r: Range, chunks: Int): List[Range] = {
  require(r.step == 1, "Range must have step size equal to 1")
  require(chunks >= 1, "Must ask for at least 1 chunk")
  val dy = r.length
  val dx = chunks
  @tailrec
  def go(y0: Int, y: Int, d: Int, ch: Int, acc: List[Range]): List[Range] = {
    if (ch == 0) acc
    else if (d > 0) go(y0, y - 1, d - dx, ch, acc)
    else go(y - 1, y, d + dy, ch - 1, if (y > y0) acc else (y to y0) :: acc)
  }
  go(r.end, r.end, dy - dx, chunks, Nil)
}
Please see the Wikipedia link for a full explanation, but essentially the algorithm zig-zags up the slope of a line, alternately adding the y-range dy and subtracting the x-range dx. If these don't divide exactly, an error term accumulates until it does divide exactly, leading to an extra element in some sub-ranges.
splitRange(3 to 15, 5)
//> List(Range(3, 4), Range(5, 6, 7), Range(8, 9),
//| Range(10, 11, 12), Range(13, 14, 15))
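A quick sanity check of the constraints (a sketch; the floating-point, recursive, and integer versions above should all satisfy it):
val parts = splitRange(1 to 14, 5)
// Bucket sizes vary by at most 1...
assert(parts.map(_.length).max - parts.map(_.length).min <= 1)
// ...and concatenating the buckets reproduces the original range
assert(parts.flatten == (1 to 14).toList)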

Recursively accumulate elements of collection

I have a collection of elements with an integer range such as
case class Element(id: Int, from: Int, to: Int)
val elementColl: Traversable[Element]
and I want to accumulate them into
case class ElementAcc(ids: List[Int], from: Int, to: Int)
according to the following algorithm:
Take one Element from my elementColl and use it to create a new ElementAcc with the same from/to as the Element taken.
Iterate over remaining elements in elementColl to look for an Element that has an overlapping integer range with our ElementAcc.
If one is found, add it to ElementAcc and expand the integer range of ElementAcc to include the range of the new Element
If none is found, repeat the process above on the remaining elements of elementColl that have not yet been assigned to an ElementAcc
This should result in a collection of ElementAccs. While just recursively adding elements to an accumulator seems easy enough, I don't know how to handle the shrinking size of elementColl so that I don't add the same Element to multiple ElementAccs.
Edit: I think I was unclear regarding the extension of the range, so let me clarify with an example:
My accumulator currently has a range from 1 to 5. An Element with a range from 6 to 8 does not overlap with the accumulator range and thus will not be included. An Element with a range of 4 to 7 does overlap, will be included, and the resulting accumulator has a range from 1 to 7.
I'd go about it like this:
1) Write a function that takes an ElementAcc and an Element and returns an ElementAcc.
The function would look like:
def extend(acc: ElementAcc, e: Element): ElementAcc = {
  if (acc.from <= e.from && e.from <= acc.to)
    ElementAcc(e.id :: acc.ids, acc.from, math.max(acc.to, e.to))
  else if (acc.from <= e.to && e.to <= acc.to)
    ElementAcc(e.id :: acc.ids, math.min(acc.from, e.from), acc.to)
  else acc
}
foldLeft is often a good solution when accumulating objects.
It needs an initial value for the accumulator and a function that takes an accumulator and an element and returns an accumulator. It then folds every element of the traversable into the accumulator, as sketched below.
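A toy example of the shape (just to illustrate the mechanics, unrelated to the Element types above):
// Start from 0 and fold each element into the running sum
List(1, 2, 3).foldLeft(0)((acc, x) => acc + x) // 6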
EDIT:
2) To accumulate into different lists, you need another function that combines a List[ElementAcc] and an Element:
def overlap(acc: ElementAcc, e: Element): Boolean =
  (acc.from <= e.from && e.from <= acc.to) || (acc.from <= e.to && e.to <= acc.to)

def dispatch(accList: List[ElementAcc], e: Element): List[ElementAcc] = accList match {
  case Nil => List(ElementAcc(List(e.id), e.from, e.to))
  case acc :: tail =>
    if (overlap(acc, e)) extend(acc, e) :: tail
    else acc :: dispatch(tail, e)
}
3) And it's used with a foldLeft:
val a = Element(0, 0, 5)
val b = Element(1, 3, 8)
val c = Element(2, 20, 30)
val sorted = List(a, b, c).foldLeft(List[ElementAcc]())(dispatch)
sorted: List[ElementAcc] = List(ElementAcc(List(1, 0),0,8), ElementAcc(List(2),20,30))

Kadane's Algorithm in Scala

Does anyone have a Scala implementation of Kadane's algorithm done in a functional style?
Edit Note: The definition on the link has changed in a way that invalidated answers to this question -- which goes to show why questions (and answers) should be self-contained instead of relying on external links. Here's the original definition:
In computer science, the maximum subarray problem is the task of finding the contiguous subarray within a one-dimensional array of numbers (containing at least one positive number) which has the largest sum. For example, for the sequence of values −2, 1, −3, 4, −1, 2, 1, −5, 4; the contiguous subarray with the largest sum is 4, −1, 2, 1, with sum 6.
What about this, if an empty subarray is allowed or the input array is guaranteed not to be all negative:
numbers.scanLeft(0)((acc, n) => math.max(0, acc + n)).max
Or, failing the conditions above, this (which assumes the input is non-empty):
numbers.tail.scanLeft(numbers.head)((acc, n) => (acc + n).max(n)).max
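As a quick check on the example sequence from the question (both variants yield 6):
val numbers = List(-2, 1, -3, 4, -1, 2, 1, -5, 4)
numbers.scanLeft(0)((acc, n) => math.max(0, acc + n)).max             // 6
numbers.tail.scanLeft(numbers.head)((acc, n) => (acc + n).max(n)).max // 6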
I prefer the folding solution to the scan solution -- though there's certainly elegance to the latter. Anyway,
numbers.foldLeft(0 -> 0) {
  case ((maxUpToHere, maxSoFar), n) =>
    val maxEndingHere = 0 max maxUpToHere + n
    maxEndingHere -> (maxEndingHere max maxSoFar)
}._2
The following code returns the start and end index as well as the sum:
import scala.math.Numeric.Implicits.infixNumericOps
import scala.math.Ordering.Implicits.infixOrderingOps

type Index = Int

case class Sub[T: Numeric](start: Index, end: Index, sum: T)

def maxSubSeq[T](arr: collection.IndexedSeq[T])(implicit n: Numeric[T]) =
  arr
    .view
    .zipWithIndex
    .scanLeft(Sub(-1, -1, n.zero)) {
      case (p, (x, i)) if p.sum > n.zero => Sub(p.start, i, p.sum + x)
      case (_, (x, i)) => Sub(i, i, x)
    }
    .drop(1)
    .maxByOption(_.sum)
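For the example sequence from the question, a quick check (indices are zero-based and inclusive):
maxSubSeq(Vector(-2, 1, -3, 4, -1, 2, 1, -5, 4))
// Some(Sub(3, 6, 6)) -- the subarray 4, -1, 2, 1 with sum 6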

How do I populate a list of objects with new values

Apologies: I'm well noob
I have an items class
class item(ind:Int,freq:Int,gap:Int){}
I have an ordered list of ints
val listVar = a.toList
where a is an array
I want a list of items called metrics where
ind is the (unique) integer
freq is the number of times that ind appears in list
gap is the minimum gap between ind and the number in the list before it
so far I have:
def metrics = for {
n <- 0 until 255
listVar filter (x == n) count > 0
}
yield new item(n, (listVar filter == n).count,0)
It's crap and I know it - any clues?
Well, some of it is easy:
val freqMap = listVar groupBy identity mapValues (_.size)
This gives you ind and freq. To get gap I'd use a fold:
val gapMap = listVar.sliding(2).foldLeft(Map[Int, Int]()) {
  case (map, List(prev, ind)) =>
    map + (ind -> (map.getOrElse(ind, Int.MaxValue) min (ind - prev)))
}
Now you just need to unify them:
freqMap.keys.map( k => new item(k, freqMap(k), gapMap.getOrElse(k, 0)) )
Ideally you want to traverse the list only once, and in the process, for each distinct Int, increment a counter (the frequency) as well as keep track of the minimum gap.
You can use a case class to store the frequency and the minimum gap; the stored value will be immutable. Note that minGap may not be defined.
case class Metric(frequency: Int, minGap: Option[Int])
In the general case you can use a Map[Int, Metric] to look up the immutable Metric object. Finding the minimum gap is the harder part. To find the gaps, you can use the sliding(2) method: it traverses the list with a sliding window of size two, letting you compare each Int to the value before it so that you can compute the gap.
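To see what sliding(2) produces, a toy example:
List(2, 2, 4, 4, 0).sliding(2).toList
// List(List(2, 2), List(2, 4), List(4, 4), List(4, 0))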
Finally you need to accumulate and update the information as you traverse the list. This can be done by folding each element of the list into your temporary result until you traverse the whole list and get the complete result.
Putting things together:
listVar.sliding(2).foldLeft(
  Map[Int, Metric]().withDefaultValue(Metric(0, None))
) {
  case (map, List(a, b)) =>
    val metric = map(b)
    val newGap = metric.minGap match {
      case None => math.abs(b - a)
      case Some(gap) => math.min(gap, math.abs(b - a))
    }
    val newMetric = Metric(metric.frequency + 1, Some(newGap))
    map + (b -> newMetric)
  case (map, List(a)) =>
    map + (a -> Metric(1, None))
  case (map, _) =>
    map
}
Result for listVar: List[Int] = List(2, 2, 4, 4, 0, 2, 2, 2, 4, 4)
scala.collection.immutable.Map[Int,Metric] = Map(2 -> Metric(4,Some(0)),
4 -> Metric(4,Some(0)), 0 -> Metric(1,Some(4)))
You can then turn the result into your desired item class using map.toSeq.map { case (i, m) => new item(i, m.frequency, m.minGap.getOrElse(-1)) }.
You can also create directly your Item object in the process, but I thought the code would be harder to read.