Scala List.filter with two conditions, applied only once

Don't know if this is possible, but I have some code like this:
val list = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
val evens = list.filter { e => e % 2 == 0 }
if(someCondition) {
val result = evens.filter { e => e % 3 == 0 }
} else {
val result = evens.filter { e => e % 5 == 0 }
}
But I don't want to iterate over all elements twice, so is there a way that I can create a "generic pick-all-the-evens numbers on this collection" and apply some other function, so that it would only iterate once?

If you turn list into a lazy collection, such as an Iterator, then you can apply all the filter operations (or other things like map etc) in one pass:
val list = (1 to 12).toList
val doubleFiltered: List[Int] =
list.iterator
.filter(_ % 2 == 0)
.filter(_ % 3 == 0)
.toList
println(doubleFiltered)
When you convert the collection to an Iterator with .iterator, Scala will keep track of the operations to be performed (here, two filters), but will wait to perform them until the result is actually accessed (here, via the call to .toList).
So I might rewrite your code like this:
val list = (1 to 12).toList
val evens = list.iterator.filter(_ % 2 == 0)
val result =
if(someCondition)
evens.filter(_ % 3 == 0)
else
evens.filter(_ % 5 == 0)
result foreach println
Depending on exactly what you want to do, you might want an Iterator, a Stream, or a View. They are all lazily computed (so the one-pass aspect will apply), but they differ on things like whether they can be iterated over multiple times (Stream and View) or whether they keep the computed value around for later access (Stream).
To really see these different lazy behaviors, try running this bit of code and set <OPERATION> to either toList, iterator, view, or toStream:
val result =
(1 to 12).<OPERATION>
.filter { e => println("filter 1: " + e); e % 2 == 0 }
.filter { e => println("filter 2: " + e); e % 3 == 0 }
result foreach println
result foreach println
Here's the behavior you will see:
List (or any other non-lazy collection): Each filter requires a separate iteration through the collection. The resulting filtered collection is stored in memory so that each foreach can just display it.
Iterator: Both filters and the first foreach are done in a single iteration. The second foreach does nothing since the Iterator has been consumed. Results are not stored in memory.
View: Both foreach calls result in their own single-pass iteration over the collection to perform the filters. Results are not stored in memory.
Stream: Both filters and the first foreach are done in a single iteration. The resulting filtered collection is stored in memory so that each foreach can just display it.
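The Iterator behavior described above can be checked with a small counter. This is only a sketch: iteratorDemo is a made-up name, and the counter stands in for the println calls in the experiment above.

```scala
// Sketch: count how often the first predicate runs on a lazy Iterator.
// iteratorDemo is a hypothetical name used only for this illustration.
def iteratorDemo(): (Int, List[Int], Int, List[Int]) = {
  var calls = 0
  val filtered = (1 to 12).iterator
    .filter { e => calls += 1; e % 2 == 0 }
    .filter(_ % 3 == 0)
  val callsBeforeUse = calls        // nothing has been evaluated yet
  val firstPass = filtered.toList   // forces the single pass
  val callsAfterUse = calls         // one predicate call per source element
  val secondPass = filtered.toList  // the iterator is already exhausted
  (callsBeforeUse, firstPass, callsAfterUse, secondPass)
}
```

The first predicate should run once per source element, and the second toList should come back empty because the Iterator has been consumed.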

You could use function composition. someCondition here is only called once, when deciding which function to compose with:
def modN(n: Int)(xs: List[Int]) = xs filter (_ % n == 0)
val f = modN(2) _ andThen (if (someCondition) modN(3) else modN(5))
val result = f(list)
(This doesn't do what you want - it still traverses the list twice)
Just do this:
val f: Int => Boolean = if (someCondition) { _ % 3 == 0 } else { _ % 5 == 0 }
val result = list filter (x => x % 2 == 0 && f(x))
or maybe better:
val n = if (someCondition) 3 else 5
val result = list filter (x => x % 2 == 0 && x % n == 0)
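If you want the "pick the evens" part to stay reusable, one option is to combine predicates rather than filtered lists. A minimal sketch, assuming the helper names multipleOf, both and pick (none of them come from the question or the standard library):

```scala
// Hypothetical helpers, for illustration only.
def multipleOf(n: Int): Int => Boolean = _ % n == 0
def both[A](p: A => Boolean, q: A => Boolean): A => Boolean = x => p(x) && q(x)

def pick(list: List[Int], someCondition: Boolean): List[Int] = {
  // someCondition is evaluated once, when choosing the second predicate
  val second = if (someCondition) multipleOf(3) else multipleOf(5)
  list.filter(both(multipleOf(2), second)) // a single pass over list
}
```

The combined predicate is built once and the list is traversed once, which is the same idea as the two snippets above with the predicates given names.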

Wouldn't this work:
list.filter{e => e % 2 == 0 && (if (someCondition) e % 3 == 0 else e % 5 == 0)}
Also, FYI: e % 2 == 0 is going to give you all the even numbers, unless you're naming the val odds for another reason.

You just write two conditions in the filter:
val list = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
val someCondition = true
val result =
  if (someCondition) list.filter { e => e % 2 == 0 && e % 3 == 0 }
  else list.filter { e => e % 2 == 0 && e % 5 == 0 }

Related

Find last index where an element should be inserted in order to maintain order

I am trying to implement the searchsorted algorithm(side=right) in scala
that is, find last index where an element should be inserted in order to maintain the sorting order.
For ex.
val arr = List(1,2,2,3,4,4,6,6) // assume the list is sorted; it can also contain negative elements (still sorted)
if elem = 2, output index = 3
if elem = 3, output index = 4
if elem = 5, output index = 6
if elem = -3, output index = 0
I came up with this, iterating through each element in the list, and checking the smallest diff
val a = List(1,1,2,2,3,3,4,5,5)
val elem = 1
val (_, index) = a
.foldLeft(Int.MaxValue, 0) {
case ((smallestDiff, index), currNum) =>
val currDiff = (elem - currNum).abs
if (currDiff > smallestDiff) (smallestDiff, index)
else (currDiff, index + 1)
}
This works fine for the first two examples, but breaks completely for the last two.
indexWhere() can do most of the heavy lifting.
def insertHere(ns:Seq[Int], elem:Int):Int = {
val idx = ns.indexWhere(_ > elem)
if (idx < 0) ns.length
else idx
}
insertHere( List(1,2,2,3,4,4,6,6), 2) // 3
insertHere( Seq(1,2,2,3,4,4,6,6), 3) // 4
insertHere( Array(1,2,2,3,4,4,6,6), 5) // 6
insertHere(Vector(1,2,2,3,4,4,6,6),-3) // 0
insertHere( List(1,2,2,3,4,4,6,6),12) // 8
When no element is greater than elem, indexWhere finds nothing and returns -1, so the function returns ns.length, i.e. the index at which elem would be appended at the end.
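indexWhere scans linearly. Since the input is sorted, a binary search finds the same index in logarithmic time on an indexed collection. The following is a sketch of that different technique, not part of the answer above; insertHereBinary is a hypothetical name, and it assumes an IndexedSeq (such as Vector or Array) so that ns(mid) is cheap.

```scala
import scala.annotation.tailrec

// Hypothetical helper: rightmost insertion point in a sorted IndexedSeq.
def insertHereBinary(ns: IndexedSeq[Int], elem: Int): Int = {
  @tailrec
  def loop(lo: Int, hi: Int): Int =
    if (lo >= hi) lo
    else {
      val mid = lo + (hi - lo) / 2
      // elements <= elem must stay to the left of the insertion point
      if (ns(mid) <= elem) loop(mid + 1, hi)
      else loop(lo, mid)
    }
  loop(0, ns.length)
}
```

For the examples in the question it should agree with insertHere, including the append-at-the-end case.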

Scala: Problem with foldLeft with negative numbers in list

I am writing a Scala function that returns the sum of even elements in a list, minus sum of odd elements in a list. I cannot use mutables, recursion or for/while loops for my solution. The code below passes 2/3 tests, but I can't seem to figure out why it can't compute the last test correctly.
def sumOfEvenMinusOdd(l: List[Int]) : Int = {
if (l.length == 0) return 0
val evens = l.filter(_%2==0)
val odds = l.filter(_%2==1)
val evenSum = evens.foldLeft(0)(_+_)
val oddSum = odds.foldLeft(0)(_+_)
evenSum-oddSum
}
//BEGIN TESTS
val i1 = sumOfEvenMinusOdd(List(1,3,5,4,5,2,1,0)) //answer: -9
val i2 = sumOfEvenMinusOdd(List(2,4,5,6,7,8,10)) //answer: 18
val i3 = sumOfEvenMinusOdd(List(109, 19, 12, 1, -5, -120, -15, 30,-33,-13, 12, 19, 3, 18, 1, -1)) //answer -133
My code is outputting this:
defined function sumOfEvenMinusOdd
i1: Int = -9
i2: Int = 18
i3: Int = -200
I am extremely confused why these negative numbers are tripping up the rest of my code. I saw a post explaining the order of operations with foldLeft foldRight, but even changing to foldRight still yields i3: Int = -200. Is there a detail I'm missing? Any guidance / help would be greatly appreciated.
The problem isn't foldLeft or foldRight, the problem is the way you filter out odd values:
val odds = l.filter(_ % 2 == 1)
Should be:
val odds = l.filter(_ % 2 != 0)
The predicate _ % 2 == 1 only yields true for positive odd elements. In Scala, the result of % takes the sign of the dividend, so the expression -15 % 2 is equal to -1, not 1.
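A small sketch of that sign behavior, using only the % operator and java.lang.Math.floorMod. The names isOddNaive and isOdd are made up for this illustration:

```scala
// Hypothetical helper names, for illustration only.
def isOddNaive(n: Int): Boolean = n % 2 == 1 // misses negative odd numbers
def isOdd(n: Int): Boolean = n % 2 != 0      // works for any sign

val remainder = -15 % 2                      // keeps the sign of -15
val floorRemainder = Math.floorMod(-15, 2)   // non-negative for a positive divisor
```

Either n % 2 != 0 or a floorMod-based check treats -15 as odd; the == 1 comparison does not.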
As a side note, we can also make this a bit more efficient:
def sumOfEvenMinusOdd(l: List[Int]): Int = {
val (evenSum, oddSum) = l.foldLeft((0, 0)) {
case ((even, odd), element) =>
if (element % 2 == 0) (even + element, odd) else (even, odd + element)
}
evenSum - oddSum
}
Or even better by accumulating the difference only:
def sumOfEvenMinusOdd(l: List[Int]): Int = {
l.foldLeft(0) {
case (diff, element) =>
diff + element * (if (element % 2 == 0) 1 else -1)
}
}
The problem is the filter condition you apply to the list to find the odd numbers.
Your odd condition doesn't work for negative odd numbers, because % 2 returns -1 for them.
number % 2 == 0 if number is even
number % 2 != 0 if number is odd
So if you change the filter condition, everything works as expected.
Another suggestion:
Why use foldLeft for a simple sum when you can call sum directly?
test("Test sum Of even minus odd") {
def sumOfEvenMinusOdd(l: List[Int]) : Int = {
val evensSum = l.filter(_%2 == 0).sum
val oddsSum = l.filter(_%2 != 0).sum
evensSum-oddsSum
}
assert(sumOfEvenMinusOdd(List.empty[Int]) == 0)
assert(sumOfEvenMinusOdd(List(1,3,5,4,5,2,1,0)) == -9) //answer: -9
assert(sumOfEvenMinusOdd(List(2,4,5,6,7,8,10)) == 18) //answer: 18
assert(sumOfEvenMinusOdd(List(109, 19, 12, 1, -5, -120, -15, 30,-33,-13, 12, 19, 3, 18, 1, -1)) == -133)
}
With this solution your function is clearer, and you can remove the if at the start of the function.

Combine multiple sequential entries in Scala/Spark

I have an array of numbers separated by comma as shown:
a:{108,109,110,112,114,115,116,118}
I need the output something like this:
a:{108-110, 112, 114-116, 118}
I am trying to group the continuous numbers with "-" in between.
For example, 108,109,110 are continuous numbers, so I get 108-110. 112 is separate entry; 114,115,116 again represents a sequence, so I get 114-116. 118 is separate and treated as such.
I am doing this in Spark. I wrote the following code:
import scala.collection.mutable.ArrayBuffer
def Sample(x:String):ArrayBuffer[String]={
val x1 = x.split(",")
var a:Int = 0
var present=""
var next:Int = 0
var yrTemp = ""
var yrAr= ArrayBuffer[String]()
var che:Int = 0
var storeV = ""
var p:Int = 0
var q:Int = 0
var count:Int = 1
while(a < x1.length)
{
yrTemp = x1(a)
if(x1.length == 1)
{
yrAr+=x1(a)
}
else
if(a < x1.length - 1)
{
present = x1(a)
if(che == 0)
{
storeV = present
}
p = x1(a).toInt
q = x1(a+1).toInt
if(p == q)
{
yrTemp = yrTemp
che = 1
}
else
if(p != q)
{
yrTemp = storeV + "-" + present
che = 0
yrAr+=yrTemp
}
}
else
if(a == x1.length-1)
{
present = x1(a)
yrTemp = present
che = 0
yrAr+=yrTemp
}
a = a+1
}
yrAr
}
val SampleUDF = udf(Sample(_:String))
I am getting the output as follows:
a:{108-108, 109-109, 110-110, 112, 114-114, 115-115, 116-116, 118}
I am not able to figure out where I am going wrong. Can you please help me in correcting this. TIA.
Here's another way:
def rangeToString(a: Int, b: Int) = if (a == b) s"$a" else s"$a-$b"
def reduce(xs: Seq[Int], min: Int, max: Int, ranges: Seq[String]): Seq[String] = xs match {
case y +: ys if (y - max <= 1) => reduce(ys, min, y, ranges)
case y +: ys => reduce(ys, y, y, ranges :+ rangeToString(min, max))
case Seq() => ranges :+ rangeToString(min, max)
}
def output(xs: Array[Int]) = reduce(xs, xs.head, xs.head, Vector())//.toArray
Which you can test:
println(output(Array(108,109,110,112,114,115,116,118)))
// Vector(108-110, 112, 114-116, 118)
Basically this is a tail recursive function - i.e. you take your "variables" as the input, then it calls itself with updated "variables" on each loop. So here xs is your array, min and max are integers used to keep track of the lowest and highest numbers so far, and ranges is the output sequence of Strings that gets added to when required.
The first pattern (y being the first element, and ys being the rest of the sequence - because that's how the +: extractor works) is matched if there's at least one element (ys can be an empty list) and it follows on from the previous maximum.
The second is if it doesn't follow on, and needs to reset the minimum and add the completed range to the output.
The third case is where we've got to the end of the input and just output the result, rather than calling the loop again.
Internet karma points to anyone who can work out how to eliminate the duplication of ranges :+ rangeToString(min, max)!
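One possible way to avoid formatting in two places is to collect (min, max) pairs first and turn them into strings once at the end. This is a sketch of a different approach (a foldLeft rather than the recursion above), and toRanges is a hypothetical name:

```scala
// Hypothetical alternative to the recursive reduce above.
def toRanges(xs: Seq[Int]): Seq[String] =
  xs.foldLeft(Vector.empty[(Int, Int)]) { (acc, x) =>
    acc.lastOption match {
      // x continues (or repeats) the current range: widen its upper bound
      case Some((lo, hi)) if x - hi <= 1 => acc.init :+ ((lo, x))
      // otherwise start a new single-element range
      case _ => acc :+ ((x, x))
    }
  }.map { case (lo, hi) => if (lo == hi) s"$lo" else s"$lo-$hi" }
```

As a side effect, an empty input yields an empty result instead of failing on xs.head.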
Here is a solution:
def combineConsecutive(s: String): Seq[String] = {
  // Reverse the input so that prepending while folding restores ascending order.
  val ints: List[Int] = s.split(',').map(_.toInt).toList.reverse
  ints
    .drop(1)
    .foldLeft(List(List(ints.head))) { (acc, e) =>
      // acc.head.head is the next larger number that was added last.
      if ((acc.head.head - e) <= 1) (e :: acc.head) :: acc.tail
      else List(e) :: acc
    }
    .map(group => if (group.size > 1) s"${group.min}-${group.max}" else group.head.toString)
}
val in = "108,109,110,112,114,115,116,118"
val result = combineConsecutive(in)
println(result) // List(108-110, 112, 114-116, 118)
This solution partly uses code from this question: Grouping list items by comparing them with their neighbors

Scala: transform a collection, yielding 0..many elements on each iteration

Given a collection in Scala, I'd like to traverse this collection and for each object I'd like to emit (yield) from 0 to multiple elements that should be joined together into a new collection.
For example, I expect something like this:
val input = Range(0, 15)
val output = input.somefancymapfunction((x) => {
if (x % 3 == 0)
yield(s"${x}/3")
if (x % 5 == 0)
yield(s"${x}/5")
})
to build an output collection that will contain
(0/3, 0/5, 3/3, 5/5, 6/3, 9/3, 10/5, 12/3)
Basically, I want a superset of what filter (1 → 0..1) and map (1 → 1) allows to do: mapping (1 → 0..n).
Solutions I've tried
Imperative solutions
Obviously, it's possible to do so in a non-functional manner, like:
import scala.collection.mutable
val output = mutable.ListBuffer[String]()
input.foreach((x) => {
if (x % 3 == 0)
output += s"${x}/3"
if (x % 5 == 0)
output += s"${x}/5"
})
Flatmap solutions
I know of flatMap, but it either:
1) becomes really ugly if we're talking about arbitrary number of output elements:
val output = input.flatMap((x) => {
val v1 = if (x % 3 == 0) {
Some(s"${x}/3")
} else {
None
}
val v2 = if (x % 5 == 0) {
Some(s"${x}/5")
} else {
None
}
List(v1, v2).flatten
})
2) requires usage of mutable collections inside it:
val output = input.flatMap((x) => {
val r = ListBuffer[String]()
if (x % 3 == 0)
r += s"${x}/3"
if (x % 5 == 0)
r += s"${x}/5"
r
})
which is actually even worse than using a mutable collection from the very beginning, or
3) requires major logic overhaul:
val output = input.flatMap((x) => {
if (x % 3 == 0) {
if (x % 5 == 0) {
List(s"${x}/3", s"${x}/5")
} else {
List(s"${x}/3")
}
} else if (x % 5 == 0) {
List(s"${x}/5")
} else {
List()
}
})
which, IMHO, also looks ugly and requires duplicating the generating code.
Roll-your-own-map-function
Last, but not least, I can roll my own function of that kind:
def myMultiOutputMap[T, R](coll: TraversableOnce[T], func: (T, ListBuffer[R]) => Unit): List[R] = {
val out = ListBuffer[R]()
coll.foreach((x) => func.apply(x, out))
out.toList
}
which can be used almost like I want:
val output = myMultiOutputMap[Int, String](input, (x, out) => {
if (x % 3 == 0)
out += s"${x}/3"
if (x % 5 == 0)
out += s"${x}/5"
})
Am I overlooking something, or is there really no such functionality in the standard Scala collection libraries?
Similar questions
This question bears some similarity to Can I yield or map one element into many in Scala? — but that question discusses 1 element → 3 elements mapping, and I want 1 element → arbitrary number of elements mapping.
Final note
Please note that this is not the question about division / divisors, such conditions are included purely for illustrative purposes.
Rather than having a separate case for each divisor, put them in a container and iterate over them in a for comprehension:
val output = for {
n <- input
d <- Seq(3, 5)
if n % d == 0
} yield s"$n/$d"
Or equivalently in a collect nested in a flatMap:
val output = input.flatMap { n =>
Seq(3, 5).collect {
case d if n % d == 0 => s"$n/$d"
}
}
In the more general case where the different cases may have different logic, you can put each case in a separate partial function and iterate over the partial functions:
val output = for {
n <- input
f <- Seq[PartialFunction[Int, String]](
{case x if x % 3 == 0 => s"$x/3"},
{case x if x % 5 == 0 => s"$x/5"})
if f.isDefinedAt(n)
} yield f(n)
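A variation on the same idea, sketched here as an assumption rather than as part of the answer: PartialFunction.lift turns each case into a function returning Option, so the isDefinedAt guard is no longer needed. The names rules and liftedOutput are made up for this example.

```scala
val input = Range(0, 15)

// The same two rules as above, kept as partial functions.
val rules = Seq[PartialFunction[Int, String]](
  { case x if x % 3 == 0 => s"$x/3" },
  { case x if x % 5 == 0 => s"$x/5" }
)

// lift turns each partial function into Int => Option[String];
// flatMap then drops the None results.
val liftedOutput = input.flatMap(n => rules.flatMap(rule => rule.lift(n)))
```

This should produce the same elements, in the same order, as the for comprehension with isDefinedAt.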
You can also use some functional library (e.g. scalaz) to express this:
import scalaz._, Scalaz._
def divisibleBy(byWhat: Int)(what: Int): List[String] =
(what % byWhat == 0).option(s"$what/$byWhat").toList
(0 to 15) flatMap (divisibleBy(3) _ |+| divisibleBy(5))
This uses the semigroup append operation |+|. For Lists this operation means a simple list concatenation. So for functions Int => List[String], this append operation will produce a function that runs both functions and appends their results.
If you have a complex computation during which you sometimes need to add elements to a global accumulator, you can use a popular approach called the Writer monad.
The setup in Scala is somewhat bulky, but the results are extremely composable thanks to the Monad interface:
import scalaz.Writer
import scalaz.syntax.writer._
import scalaz.syntax.monad._
import scalaz.std.vector._
import scalaz.syntax.traverse._
type Messages[T] = Writer[Vector[String], T]
def yieldW(a: String): Messages[Unit] = Vector(a).tell
val output = Vector.range(0, 15).traverse { n =>
yieldW(s"$n / 3").whenM(n % 3 == 0) >>
yieldW(s"$n / 5").whenM(n % 5 == 0)
}.run._1
Here is my proposal for a custom function; it might be nicer with the pimp-my-library pattern:
def fancyMap[A, B](list: TraversableOnce[A])(fs: (A => Boolean, A => B)*) = {
def valuesForElement(elem: A) = fs collect { case (predicate, mapper) if predicate(elem) => mapper(elem) }
list flatMap valuesForElement
}
fancyMap[Int, String](0 to 15)((_ % 3 == 0, _ + "/3"), (_ % 5 == 0, _ + "/5"))
You can try collect:
val input = Range(0,15)
val output = input.flatMap { x =>
List(3,5) collect { case n if (x%n == 0) => s"${x}/${n}" }
}
System.out.println(output)
I would use a fold:
val input = Range(0, 15)
val output = input.foldLeft(List[String]()) {
case (acc, value) =>
val acc1 = if (value % 3 == 0) s"$value/3" :: acc else acc
val acc2 = if (value % 5 == 0) s"$value/5" :: acc1 else acc1
acc2
}.reverse
output contains
List(0/3, 0/5, 3/3, 5/5, 6/3, 9/3, 10/5, 12/3)
A fold takes an accumulator (acc), a collection, and a function. The function is called with the current accumulator (initially an empty List[String] in this case) and each value of the collection in turn. The function should return the updated accumulator.
On each iteration, we take the growing accumulator and, if the inside if statements are true, prepend the calculation to the new accumulator. The function finally returns the updated accumulator.
When the fold is done, it returns the final accumulator, but unfortunately, it is in reverse order. We simply reverse the accumulator with .reverse.
There is a nice paper on folds: A tutorial on the universality and expressiveness of fold, by Graham Hutton.
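If the final .reverse bothers you, a foldRight builds the list in the desired order directly, because it visits the elements from the right while prepending. A sketch under that assumption (foldRightOutput and withFive are names chosen for this example):

```scala
val foldRightOutput = Range(0, 15).foldRight(List.empty[String]) { (value, acc) =>
  // Prepend the /5 entry first so that the /3 entry ends up in front of it.
  val withFive = if (value % 5 == 0) s"$value/5" :: acc else acc
  if (value % 3 == 0) s"$value/3" :: withFive else withFive
}
```

Note that the two checks are applied in the opposite order from the foldLeft version, for the same reason the foldLeft version needs a reverse.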

How to Map Partial Elements in Scala/Spark

I have a list of integers:
val mylist = List(1, 2, 3, 4)
What I want to do is map only the elements of mylist that are even numbers, multiplying them by 2.
Maybe the code should be:
mylist.map{ case x%2==2 => x*2 }
I expect the result to be List(4, 8) but it's not. What is the correct code?
I know I could realize this function by using filter + map
mylist.filter(_ % 2 == 0).map(_ * 2)
but is there some way to realize this function by only using map()?
map never changes the number of elements in a collection.
filter + map is the right approach.
But if a single method is needed, use collect:
mylist.collect{ case x if x % 2 == 0 => 2 * x }
Edit:
withFilter + map is more efficient than filter + map, because withFilter does not create an intermediate collection (it applies the predicate lazily):
mylist.withFilter(_ % 2 == 0).map(_ * 2)
which is the same as the for comprehension:
for { e <- mylist if (e % 2 == 0) } yield 2 * e
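As a quick sanity check, the four variants mentioned in this thread can be put side by side. The val names are made up for this comparison:

```scala
val mylist = List(1, 2, 3, 4)

val viaFilterMap  = mylist.filter(_ % 2 == 0).map(_ * 2)
val viaCollect    = mylist.collect { case x if x % 2 == 0 => 2 * x }
val viaWithFilter = mylist.withFilter(_ % 2 == 0).map(_ * 2)
val viaFor        = for { e <- mylist if e % 2 == 0 } yield 2 * e
```

All four should yield the same list; they differ only in whether an intermediate collection is built along the way.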