Find Indexes *Where* - scala

There's an indexWhere function in Vector that finds the index of the first match.
def indexWhere(p: (A) ⇒ Boolean, from: Int): Int
> Finds index of the first element satisfying some predicate after or
> at some start index.
http://www.scala-lang.org/api/current/index.html#scala.collection.immutable.Vector
I wrote this function to find all indexes where such a match occurs.
def getAllIndexesWhere[A,B](as: List[A])(f: (B => Boolean))(g: A => B): Vector[B] = {
def go(y: List[A], acc: List[Option[B]]): Vector[B] = y match { // match on y, not as, so the recursion terminates
case x :: xs => val result = if (f(g(x))) Some(g(x)) else None
go(xs, acc :+ result)
case Nil => acc.flatten.toVector
}
go(as, Nil)
}
However, is there already a built-in function of a collection?

zipWithIndex, filter, and map are built-ins that can be combined to get all the indices of some predicate.
Get the indices of the even values in the list.
scala> List(1,2,3,4,5,6,7,8,9,10).zipWithIndex.filter(_._1 % 2 == 0).map(_._2)
res0: List[Int] = List(1, 3, 5, 7, 9)
You can also use collect, as @0__ notes.
scala> List(1,2,3,4,5,6,7,8,9,10).zipWithIndex.collect{ case(a,b) if a % 2 == 0 => b}
res1: List[Int] = List(1, 3, 5, 7, 9)
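The combination above can be wrapped in a small reusable helper. This is only a sketch: the names IndexSketch and indicesWhere are made up here and are not part of the standard library. It zips on an iterator so that no intermediate list of pairs is built.

```scala
object IndexSketch {
  // Hypothetical helper (not a standard library method):
  // returns every index whose element satisfies p, in ascending order.
  def indicesWhere[A](xs: Seq[A])(p: A => Boolean): Seq[Int] =
    xs.iterator.zipWithIndex.collect { case (a, i) if p(a) => i }.toList
}
```

For the list used above, IndexSketch.indicesWhere(List(1,2,3,4,5,6,7,8,9,10))(_ % 2 == 0) gives the same indices as the zipWithIndex/filter/map chain.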

Related

Drop a given number of positive items of a given list

Suppose I need a function List[Int] => Option[List[Int]] to drop exactly n elements of a given list if and only if all of them > 0. If the list size <= n the function should return None.
For instance:
def posn(n: Int): List[Int] => Option[List[Int]] = ???
val pos4: List[Int] => Option[List[Int]] = posn(4)
scala> pos4(Nil)
res18: Option[List[Int]] = None
scala> pos4(List(-1))
res19: Option[List[Int]] = None
scala> pos4(List(-1, 2, 3))
res20: Option[List[Int]] = None
scala> pos4(List(1, 2, 3))
res21: Option[List[Int]] = None
scala> pos4(List(1, 2, 3, 4, 5))
res22: Option[List[Int]] = Some(List(5))
scala> pos4(List(1, 2, 3, -4, 5))
res23: Option[List[Int]] = None
I am writing posn like that:
def posn(n: Int): List[Int] => Option[List[Int]] = xs =>
if (xs.size >= n && xs.take(n).forall(_ > 0)) Some(xs.drop(n)) else None
This function seems to work, but it doesn't seem elegant or idiomatic. How would you rewrite it?
Here's an (arguably) more idiomatic implementation using Pattern Matching and a recursive call to posn - but I'm not sure it's preferable to your suggested implementation:
def posn(n: Int): List[Int] => Option[List[Int]] = xs => (n, xs) match {
case (0, _) => Some(xs) // stop if enough objects dropped
case (_, head :: tail) if head > 0 => posn(n - 1)(tail) // drop positive and move on
case _ => None // found a non-positive item or reached the end of xs => "fail"
}
I don't know if there is an idiomatic or elegant way to do this. There seems to be no generic pattern that can be extracted from your logic beyond what you have already done (using drop and take), so I don't believe you will find a more useful predefined method.
However, you are traversing your list a few times, and this could be avoided:
def posn(n: Int): List[Int] => Option[List[Int]] = xs => {
val (head, tail) = xs.splitAt(n) //does take and drop in one run
if (head.lengthCompare(n) == 0 && head.forall(_ > 0)) Some(tail) // lengthCompare does not compute the whole length if there is no need to
else None
}
This is still not perfect, and more verbose than your version.
You could also do all of it at once, with tail recursion (here assuming n>=0):
def posn(n: Int): List[Int] => Option[List[Int]] = xs =>
if (n == 0) Some(xs)
else if (xs.isEmpty || xs.head <= 0) None
else posn(n - 1)(xs.tail)
This would be more efficient if List was naively implemented, but I really doubt you will see any improvement.
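If you want the compiler to check that the single-pass version really is tail recursive, one option is an inner loop annotated with @tailrec. This is a sketch under the same assumptions as above; loop and remaining are names introduced here for illustration. Like the version above, it returns Some(Nil) when the list holds exactly n positive elements.

```scala
import scala.annotation.tailrec

object PosnSketch {
  def posn(n: Int): List[Int] => Option[List[Int]] = { xs =>
    // Walks the list once; the compiler rejects the annotation
    // if the recursive call is not in tail position.
    @tailrec
    def loop(remaining: Int, rest: List[Int]): Option[List[Int]] =
      if (remaining <= 0) Some(rest) // enough elements dropped (also covers n <= 0)
      else rest match {
        case head :: tail if head > 0 => loop(remaining - 1, tail)
        case _ => None // non-positive element or list too short
      }
    loop(n, xs)
  }
}
```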
I would write a generic version and use that to define posn:
def dropWhen[T](n: Int, p: T => Boolean, l: List[T]): Option[List[T]] = {
val (f, s) = l.splitAt(n)
if (f.length >= n && f.forall(p)) { Some(s) } else { None }
}
def posn(n: Int): List[Int] => Option[List[Int]] = l => dropWhen(n, (i : Int) => i > 0, l)
Note that this method scans the prefix of length n twice.
Another (non-recursive) alternative: use zipWithIndex and dropWhile to drop the first N positive numbers, and then check head to see whether the first remaining item is exactly at position n: if it is, we got what we want, otherwise we can return None:
def posn(n: Int): List[Int] => Option[List[Int]] = xs =>
Some(xs.zipWithIndex.dropWhile { case (v, i) => v > 0 && i < n })
.find(_.headOption.exists(_._2 == n)) // first index should be n
.map(_.map(_._1)) // remove indices

How to remove 2 or more duplicates from list and maintain their initial order?

Let's assume we have a Scala list:
val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
We can easily remove duplicates using the following code:
l1.distinct
or
l1.toSet.toList
But what if we want to remove duplicates only if there are more than 2 of them? That is, if more than 2 elements have the same value, we keep only the first two and remove the rest.
I could achieve it with following code:
l1.groupBy(identity).mapValues(_.take(2)).values.toList.flatten
that gave me the result:
List(2, 2, 5, 1, 1, 3, 3)
The extra elements are removed, but the remaining elements no longer appear in the order they had in the initial list. How can I do this operation and preserve the order of the original list?
So the result for l1 should be:
List(1, 2, 3, 1, 3, 2, 5)
Not the most efficient.
scala> val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1: List[Int] = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
scala> l1.zipWithIndex.groupBy( _._1 ).map(_._2.take(2)).flatten.toList.sortBy(_._2).unzip._1
res10: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
My humble answer:
def distinctOrder[A](x:List[A]):List[A] = {
@scala.annotation.tailrec
def distinctOrderRec(list: List[A], covered: List[A]): List[A] = {
(list, covered) match {
case (Nil, _) => covered.reverse
case (lst, c) if c.count(_ == lst.head) >= 2 => distinctOrderRec(list.tail, covered)
case _ => distinctOrderRec(list.tail, list.head :: covered)
}
}
distinctOrderRec(x, Nil)
}
With the results:
scala> val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1: List[Int] = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
scala> distinctOrder(l1)
res1: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
On Edit: Right before I went to bed I came up with this!
l1.foldLeft(List[Int]())((total, next) => if (total.count(_ == next) >= 2) total else total :+ next)
With an answer of:
res9: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
Not the prettiest. I look forward to seeing the other solutions.
def noMoreThan(xs: List[Int], max: Int) =
{
def op(m: Map[Int, Int], a: Int) = {
m updated (a, m(a) + 1)
}
xs.scanLeft( Map[Int,Int]().withDefaultValue(0) ) (op).tail
.zip(xs)
.filter{ case (m, a) => m(a) <= max }
.map(_._2)
}
scala> noMoreThan(l1, 2)
res0: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
More straightforward version using foldLeft:
l1.foldLeft(List[Int]()){(acc, el) =>
if (acc.count(_ == el) >= 2) acc else el::acc}.reverse
Similar to how distinct is implemented, with a multiset instead of a set:
def noMoreThan[T](list : List[T], max : Int) = {
val b = List.newBuilder[T]
val seen = collection.mutable.Map[T,Int]().withDefaultValue(0)
for (x <- list) {
if (seen(x) < max) {
b += x
seen(x) += 1
}
}
b.result()
}
Based on experquisite's answer, but using foldLeft:
def noMoreThanBis(xs: List[Int], max: Int) = {
val initialState: (Map[Int, Int], List[Int]) = (Map[Int, Int]().withDefaultValue(0), Nil) // explicit type arguments so the key type is not inferred as Nothing
val (_, result) = xs.foldLeft(initialState) { case ((count, res), x) =>
if (count(x) >= max)
(count, res)
else
(count.updated(x, count(x) + 1), x :: res)
}
result.reverse
}
distinct is defined for SeqLike as
/** Builds a new $coll from this $coll without any duplicate elements.
* $willNotTerminateInf
*
* @return A new $coll which contains the first occurrence of every element of this $coll.
*/
def distinct: Repr = {
val b = newBuilder
val seen = mutable.HashSet[A]()
for (x <- this) {
if (!seen(x)) {
b += x
seen += x
}
}
b.result()
}
We can define our function in very similar fashion:
import scala.collection.mutable
def distinct2[A](ls: List[A]): List[A] = {
val b = List.newBuilder[A]
val seen1 = mutable.HashSet[A]()
val seen2 = mutable.HashSet[A]()
for (x <- ls) {
if (!seen2(x)) {
b += x
if (!seen1(x)) {
seen1 += x
} else {
seen2 += x
}
}
}
b.result()
}
scala> distinct2(l1)
res4: List[Int] = List(1, 2, 3, 1, 3, 2, 5)
This version uses internal mutable state, but is still pure from the caller's point of view. It is also quite easy to generalise to an arbitrary n (currently 2), but the specific version is more performant.
You can implement the same function with folds carrying the "what is seen once and twice" state with you. Yet the for loop and mutable state does the same job.
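As a rough sketch of the fold variant mentioned above (not the original author's code; FoldSketch and the tuple layout are choices made here), the two sets travel through foldLeft together with the result accumulated in reverse:

```scala
object FoldSketch {
  // State: (values seen once, values seen twice, kept elements in reverse order).
  def distinct2[A](ls: List[A]): List[A] = {
    val (_, _, kept) =
      ls.foldLeft((Set.empty[A], Set.empty[A], List.empty[A])) {
        case ((once, twice, out), x) =>
          if (twice(x)) (once, twice, out)              // third or later occurrence: drop
          else if (once(x)) (once, twice + x, x :: out) // second occurrence: keep
          else (once + x, twice, x :: out)              // first occurrence: keep
      }
    kept.reverse
  }
}
```

It keeps the first two occurrences of each value in their original positions, the same as the loop-based version, just without mutable collections.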
How about this:
list
.zipWithIndex
.groupBy(_._1)
.toSeq
.flatMap { _._2.take(2) }
.sortBy(_._2)
.map(_._1)
It's a bit ugly, but it's relatively fast:
val l1 = List(1, 2, 3, 1, 1, 3, 2, 5, 1)
l1.foldLeft((Map[Int, Int](), List[Int]())) { case ((m, ls), x) => {
val z = m + ((x, m.getOrElse(x, 0) + 1))
(z, if (z(x) <= 2) x :: ls else ls)
}}._2.reverse
Gives: List(1, 2, 3, 1, 3, 2, 5)
Here is a recursive solution (it will stack overflow for large lists):
def filterAfter[T](l: List[T], max: Int): List[T] = {
require(max > 1)
//keep the state of seen values
val seen = Map[T, Int]().withDefaultValue(0)//init to 0
def filterAfter(l: List[T], seen: Map[T, Int]): (List[T], Map[T, Int]) = {
l match {
case x :: xs =>
if (seen(x) < max) {
//Update the state and pass to next
val pair = filterAfter(xs, seen updated (x, seen(x) + 1))
(x::pair._1, pair._2)
} else {
//already seen more than max
filterAfter(xs, seen)
}
case _ => (l, seen)//empty, terminate recursion
}
}
//call inner recursive function
filterAfter(l, seen)._1 // the inner function takes two arguments; max comes from the enclosing scope
}
Here is canonical Scala code to reduce three or more in a row to two in a row:
def checkForTwo(candidate: List[Int]): List[Int] = {
candidate match {
case x :: y :: z :: tail if x == y && y == z =>
checkForTwo(y :: z :: tail)
case x :: tail =>
x :: checkForTwo(tail)
case Nil =>
Nil
}
}
It looks at the first three elements of the list, and if they are the same, drops the first one and repeats the process. Otherwise, it passes items on through.
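A quick check makes the behaviour concrete. The block below repeats the function unchanged inside a wrapper object (RunSketch is a name made up here) so that it can be run on its own. Note that it only collapses runs of adjacent equal elements, so the l1 from the question, which has no three equal values in a row, comes back unchanged.

```scala
object RunSketch {
  // Same function as above, copied so this block is self-contained.
  def checkForTwo(candidate: List[Int]): List[Int] = candidate match {
    case x :: y :: z :: tail if x == y && y == z =>
      checkForTwo(y :: z :: tail) // three equal in a row: drop the first
    case x :: tail =>
      x :: checkForTwo(tail)      // otherwise pass the element through
    case Nil =>
      Nil
  }

  val collapsed = checkForTwo(List(1, 1, 1, 1, 2, 2, 3, 3, 3)) // runs shortened to two
  val untouched = checkForTwo(List(1, 2, 3, 1, 1, 3, 2, 5, 1)) // no run of three
}
```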
Solution with groupBy and filter, without any sorting (so it's O(N), sorting will give you additional O(Nlog(N)) in typical case):
val li = l1.zipWithIndex
val pred = li.groupBy(_._1).flatMap(_._2.lift(1)) //1 is your "2", but - 1
for ((x, i) <- li if !pred.get(x).exists(_ < i)) yield x
I prefer approach with immutable Map:
def noMoreThan[T](list: List[T], max: Int): List[T] = {
def go(tail: List[T], freq: Map[T, Int]): List[T] = {
tail match {
case h :: t =>
if (freq(h) < max)
h :: go(t, freq + (h -> (freq(h) + 1)))
else go(t, freq)
case _ => Nil
}
}
go(list, Map[T, Int]().withDefaultValue(0))
}

Scala idiom to find first Some of Option from iterator

I have an iterator of Options, and would like to find the first member that is a Some and meets a predicate.
What's the best idiomatic way to do this?
Also: If an exception is thrown along the way, I'd like to ignore it and move on to the next member
optionIterator find { case Some(x) if predicate(x) => true case _ => false }
As for ignoring exceptions… Is it the predicate that could throw? 'Cause that's not really wise. Nonetheless…
import scala.util.Try
optionIterator find {
case Some(x) => Try(predicate(x)) getOrElse false
case _ => false
}
Adding a coat of best and idiomatic to the paint job:
scala> val vs = (0 to 10) map { case 3 => None case i => Some(i) }
vs: scala.collection.immutable.IndexedSeq[Option[Int]] = Vector(Some(0), Some(1), Some(2), None, Some(4), Some(5), Some(6), Some(7), Some(8), Some(9), Some(10))
scala> def p(i: Int) = if (i % 2 == 0) i > 5 else ???
p: (i: Int)Boolean
scala> import util._
import util._
scala> val it = vs.iterator
it: Iterator[Option[Int]] = non-empty iterator
scala> it collectFirst { case Some(i) if Try(p(i)) getOrElse false => i }
res2: Option[Int] = Some(6)
Getting the first even number over five that doesn't blow up the test.
Assuming that you can wrap your predicate so that any error returns false:
iterator.flatMap(x => x).find(yourSafePredicate)
flatMap takes a collection of collections and transforms it into a single collection. An iterator of Options qualifies, because an Option can be treated as a collection with at most one element:
scala> for { x <- 1 to 3; y <- 1 to x } yield x :: y :: Nil
res30: IndexedSeq[List[Int]] = Vector(List(1, 1), List(2, 1), List(2, 2), List(3, 1), List(3, 2), List(3, 3))
scala> res30.flatMap(x => x)
res31: IndexedSeq[Int] = Vector(1, 1, 2, 1, 2, 2, 3, 1, 3, 2, 3, 3)
find returns the first entry in your iterable that matches a predicate as an Option or None if there is no match:
scala> (1 to 10).find(_ > 3)
res0: Option[Int] = Some(4)
scala> (1 to 10).find(_ == 11)
res1: Option[Int] = None
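The wrapping assumed at the start of this answer could look roughly like the following. It is a sketch only: SafeFindSketch, safe and firstMatch are hypothetical names, and it assumes that the predicate is the only thing that can throw. Try catches non-fatal throwables, so a failing predicate simply counts as "no match".

```scala
import scala.util.Try

object SafeFindSketch {
  // Hypothetical wrapper: a predicate that throws is treated as returning false.
  def safe[A](p: A => Boolean): A => Boolean =
    a => Try(p(a)).getOrElse(false)

  // Drops the Nones, then finds the first value accepted by the wrapped predicate.
  // Iterators are lazy, so elements after the first match are never tested.
  def firstMatch[A](it: Iterator[Option[A]])(p: A => Boolean): Option[A] =
    it.flatMap(_.toList).find(safe(p))
}
```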
Some sample data
scala> val l = Seq(Some(1),None,Some(-7),Some(8))
l: Seq[Option[Int]] = List(Some(1), None, Some(-7), Some(8))
Using flatMap on a Seq of Options produces a Seq of the defined values; all the Nones are discarded.
scala> l.flatMap(a => a)
res0: Seq[Int] = List(1, -7, 8)
Then use find on the sequence to get the first value that satisfies the predicate. Note that the found value is wrapped in an Option, because find has to be able to return a valid value (None) when nothing is found.
scala> l.flatMap(a => a).find(_ < 0)
res1: Option[Int] = Some(-7)
As far as I know, this is an acceptable way to do it in Scala.
A more idiomatic way might be to use collect / collectFirst on the Seq:
scala> l.collectFirst { case a@Some(x) if x < 0 => a }
res2: Option[Some[Int]] = Some(Some(-7))
Note that here we get Some(Some(-7)), because collectFirst has to be able to produce a "not found" value: the outer Some comes from collectFirst, the inner Some from the source elements of the Seq of Options.
You can flatten the Some(Some(-7)) if you need the value itself.
scala> l.collectFirst({ case a@Some(x) if x < 0 => a }).flatten
res3: Option[Int] = Some(-7)
If nothing is found, you get None:
scala> l.collectFirst({ case a@Some(x) if x < -10 => a }).flatten
res9: Option[Int] = None

How to elegantly extract range of list based on specific criteria?

I want to extract range of elements from a list, meeting the following requirements:
First element of range has to be an element previous to element matching specific condition
Last element of range has to be an element next to element matching specific condition
Example: For list (1,1,1,10,2,10,1,1,1) and condition x >= 10 I want to get (1,10,2,10,1)
This is very simple to program imperatively, but I am wondering whether there is some smart, functional Scala way to achieve it. Is there one?
Keeping it in the scala standard lib, I would solve this using recursion:
def f(_xs: List[Int])(cond: Int => Boolean): List[Int] = {
def inner(xs: List[Int], res: List[Int]): List[Int] = xs match {
case Nil => Nil
case x :: y :: tail if cond(y) && res.isEmpty => inner(tail, res ++ (x :: y :: Nil))
case x :: y :: tail if cond(x) && res.nonEmpty => res ++ (x :: y :: Nil)
case x :: tail if res.nonEmpty => inner(tail, res :+ x)
case x :: tail => inner(tail, res)
}
inner(_xs, Nil)
}
scala> f(List(1,1,1,10,2,10,1,1,1))(_ >= 10)
res3: List[Int] = List(1, 10, 2, 10, 1)
scala> f(List(2,10,2,10))(_ >= 10)
res4: List[Int] = List()
scala> f(List(2,10,2,10,1))(_ >= 10)
res5: List[Int] = List(2, 10, 2, 10, 1)
Maybe there is something I did not think of in this solution, or I misunderstood something, but I think you will get the basic idea.
Good functional algorithm design practice is all about breaking complex problems into simpler ones.
The principle is called Divide and Conquer.
It's easy to extract two simpler subproblems from the subject problem:
Get a list of all elements after the matching one, preceded with this matching element,
preceded with an element before it.
Get a list of all elements up to the latest matching one, followed by the matching element and
the element after it.
The named problems are simple enough for the appropriate functions to be implemented, so no subdivision is required.
Here's the implementation of the first function:
def afterWithPredecessor
[ A ]
( elements : List[ A ] )
( test : A => Boolean )
: List[ A ]
= elements match {
case Nil => Nil
case a :: tail if test( a ) => Nil // since there is no predecessor
case a :: b :: tail if test( b ) => a :: b :: tail
case a :: tail => afterWithPredecessor( tail )( test )
}
Since the second problem can be seen as a direct inverse of the first one, it can be easily implemented by reversing the input and output:
def beforeWithSuccessor
[ A ]
( elements : List[ A ] )
( test : A => Boolean )
: List[ A ]
= afterWithPredecessor( elements.reverse )( test ).reverse
But here's an optimized version of this:
def beforeWithSuccessor
[ A ]
( elements : List[ A ] )
( test : A => Boolean )
: List[ A ]
= elements match {
case Nil => Nil
case a :: b :: tail if test( a ) =>
a :: b :: beforeWithSuccessor( tail )( test )
case a :: tail =>
beforeWithSuccessor( tail )( test ) match {
case Nil => Nil
case r => a :: r
}
}
Finally, composing the above functions together to produce the function solving your problem becomes quite trivial:
def range[ A ]( elements : List[ A ] )( test : A => Boolean ) : List[ A ]
= beforeWithSuccessor( afterWithPredecessor( elements )( test ) )( test )
Tests:
scala> range( List(1,1,1,10,2,10,1,1,1) )( _ >= 10 )
res0: List[Int] = List(1, 10, 2, 10, 1)
scala> range( List(1,1,1,10,2,10,1,1,1) )( _ >= 1 )
res1: List[Int] = List()
scala> range( List(1,1,1,10,2,10,1,1,1) )( _ == 2 )
res2: List[Int] = List(10, 2, 10)
The second test returns an empty list since the outermost elements satisfying the predicate have no predecessors (or successors).
def range[T](elements: List[T], condition: T => Boolean): List[T] = {
val first = elements.indexWhere(condition)
val last = elements.lastIndexWhere(condition)
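if (first < 0) Nil // nothing matches, so there is no range to extract
else elements.slice(first - 1, last + 2) // slice clamps out-of-range bounds to the list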
}
scala> range[Int](List(1,1,1,10,2,10,1,1,1), _ >= 10)
res0: List[Int] = List(1, 10, 2, 10, 1)
scala> range[Int](List(2,10,2,10), _ >= 10)
res1: List[Int] = List(2, 10, 2, 10)
scala> range[Int](List(), _ >= 10)
res2: List[Int] = List()
Zip and map to the rescue
val l = List(1, 1, 1, 10, 2, 1, 1, 1)
def test (i: Int) = i >= 10
((l.head :: l) zip (l.tail :+ l.last)) zip l filter {
case ((a, b), c) => (test (a) || test (b) || test (c) )
} map { case ((a, b), c ) => c }
That should work. I only have my smartphone and am miles from anywhere I could test this, so I apologise for any typos or minor syntax errors.
Edit: works now. I hope it's obvious that my solution shuffles the list to the right and to the left to create two new lists. When these are zipped together and zipped again with the original list, the result is a list of tuples, each containing the original element and a tuple of its neighbours. This is then trivial to filter and map back to a simple list.
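To make the shifting concrete, here is a small sketch of just the zipping step, under the same padding convention as above. NeighbourSketch and withNeighbours are names invented for this illustration. Each entry pairs (previous, next) with the element itself, and the two ends are padded with their own value.

```scala
object NeighbourSketch {
  // For every element c, produce ((previous, next), c).
  // The first element uses itself as "previous", the last uses itself as "next".
  def withNeighbours[A](l: List[A]): List[((A, A), A)] = l match {
    case Nil => Nil // head/last are undefined on an empty list
    case _   => ((l.head :: l) zip (l.tail :+ l.last)) zip l
  }

  val example = withNeighbours(List(1, 10, 2))
  // example == List(((1, 10), 1), ((1, 2), 10), ((10, 2), 2))
}
```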
Making this into a more general function (and using collect rather than filter -> map)...
def filterWithNeighbours[E](l: List[E])(p: E => Boolean) = l match {
case Nil => Nil
case li if li.size < 3 => if (l exists p) l else Nil
case _ => ((l.head :: l) zip (l.tail :+ l.last)) zip l collect {
case ((a, b), c) if (p (a) || p (b) || p (c) ) => c
}
}
This is less efficient than the recursive solution but makes the test much simpler and more clear. It can be difficult to match the right sequence of patterns in a recursive solution, as the patterns often express the shape of the chosen implementation rather than the original data. With the simple functional solution, each element is clearly and simply being compared to its neighbours.

Calculating differences of subsequent elements of a sequence in scala

I would like to do almost exactly this in scala. Is there an elegant way?
Specifically, I just want the difference of adjacent elements in a sequence. For example
input = 1,2,6,9
output = 1,4,3
How about this?
scala> List(1, 2, 6, 9).sliding(2).map { case Seq(x, y, _*) => y - x }.toList
res0: List[Int] = List(1, 4, 3)
Here is one that uses recursion and works best on Lists
def differences(l:List[Int]) : List[Int] = l match {
case a :: (rest @ b :: _) => (b - a) :: differences(rest)
case _ => Nil
}
And here is one that should be pretty fast on Vector or Array:
def differences(a:IndexedSeq[Int]) : IndexedSeq[Int] =
a.indices.tail.map(i => a(i) - a(i-1))
Of course there is always this:
def differences(a:Seq[Int]) : Seq[Int] =
a.tail.zip(a).map { case (x,y) => x - y }
Note that only the recursive version handles empty lists without an exception.
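If the zip form is preferred, a small variation sidesteps the empty-input problem: drop(1) returns an empty sequence where tail would throw. A minimal sketch follows, with DifferencesSketch being a name chosen here.

```scala
object DifferencesSketch {
  // Pairs each element with its successor; drop(1) is safe on empty input,
  // so empty and single-element sequences both yield an empty result.
  def differences(a: Seq[Int]): Seq[Int] =
    a.zip(a.drop(1)).map { case (x, y) => y - x }
}
```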