Sort tuples by first element reverse, second element regular - scala

I have tuples of the form (Boolean, Int, String).
I want to define an Ordering that sorts the tuples in the following order:
Boolean - reverse order
Int - reverse order
String - regular order
Example:
For the tuples: Array((false, 8, "zz"), (false,3, "bb"), (true, 5, "cc"),(false, 3,"dd")).
The ordering should be:
(true, 5, "cc")
(false, 8,"zz")
(false, 3, "bb")
(false, 3, "dd")
I couldn't find a way to reverse the ordering for some elements while keeping it regular for others.

The straightforward solution in this specific case is to use sortBy on the tuples, modified on the fly to "invert" the first and second elements so that in the end the ordering is reversed:
val a = Array((false, 8, "zz"), (false,3, "bb"), (true, 5, "cc"),(false, 3,"dd"))
a.sortBy{ case (x,y,z) => (!x, -y, z) }
For cases where you cannot easily "invert" a value (say it is a reference object and you only have an opaque ordering for it), you can instead use sorted and explicitly pass an ordering that is constructed to invert the order on the first and second elements (you can use Ordering.reverse to reverse an ordering):
val myOrdering: Ordering[(Boolean, Int, String)] = Ordering.Tuple3(Ordering.Boolean.reverse, Ordering.Int.reverse, Ordering.String)
a.sorted(myOrdering)
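The same ordering can also be assembled one key at a time. The following is only a sketch: it assumes Scala 2.13 or later (where Ordering has orElse), and the names OrderingSketch, Row and rowOrdering are made up for illustration.

```scala
object OrderingSketch {
  type Row = (Boolean, Int, String)

  // One ordering per tuple element; the first two are reversed.
  val byFlagDesc: Ordering[Row] = Ordering.by[Row, Boolean](_._1).reverse
  val byNumDesc: Ordering[Row]  = Ordering.by[Row, Int](_._2).reverse
  val byName: Ordering[Row]     = Ordering.by[Row, String](_._3)

  // orElse falls through to the next ordering only when the previous one reports a tie.
  val rowOrdering: Ordering[Row] = byFlagDesc.orElse(byNumDesc).orElse(byName)

  val rows: Array[Row] = Array((false, 8, "zz"), (false, 3, "bb"), (true, 5, "cc"), (false, 3, "dd"))
  val sortedRows: List[Row] = rows.sorted(rowOrdering).toList
}
```

This form is handy when the keys are not packed into a tuple in exactly the order you want to compare them.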

You could do something like this.
case class MyTuple(t: (Boolean, Int, String)) extends Ordered[MyTuple] {
  def compare(that: MyTuple): Int = {
    val (x, y, z) = t
    val (x1, y1, z1) = that.t
    if (x != x1) x1.compare(x)      // Boolean: reverse order
    else if (y != y1) y1.compare(y) // Int: reverse order
    else z.compareTo(z1)            // String: regular order
  }
}
val myList = Array((false, 8, "zz"), (false,3, "bb"), (true, 5, "cc"),(false, 3,"dd"))
// wrap each tuple so that the custom compare is used, then unwrap after sorting
myList.map(MyTuple(_)).sorted.map(_.t)

Decompose Scala sequence into member values

I'm looking for an elegant way of accessing two items in a Seq at the same time. I've checked earlier in my code that the Seq will have exactly two items. Now I would like to be able to give them names, so they have meaning.
records
.sliding(2) // makes sure we get `Seq` with two items
.map(recs => {
// Something like this...
val (former, latter) = recs
})
Is there an elegant and/or idiomatic way to achieve this in Scala?
I'm not sure if it is any more elegant, but you can also unpick the sequence like this:
val former +: latter +: _ = recs
You can access the elements by their index:
map { recs =>
  val (former, latter) = (recs(0), recs(1))
}
You can use pattern matching to decompose the structure of your list:
val records = List("first", "second")
records match {
case first +: second +: Nil => println(s"1: $first, 2: $second")
case _ => // Won't happen (you can omit this)
}
will output
1: first, 2: second
sliding yields each window as a List (when records is a List). Using a pattern match, you can give names to the elements like this:
map{ case List(former, latter) =>
...
}
Note that since it's a pattern match, you need to use {} instead of ().
For records of a known type (for example, Int):
records.sliding (2).map (_ match {
case List (former:Int, latter:Int) => former + latter })
Note that this pairs elements (0, 1), then (1, 2), (2, 3) and so on. To combine non-overlapping pairs, use sliding (2, 2):
val pairs = records.sliding (2, 2).map (_ match {
case List (former: Int, latter: Int) => former + latter
case List (one: Int) => one
}).toList
Note that you now need an extra case for a single element, in case the size of records is odd.
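To make the difference between the two window styles concrete, here is a small sketch. The sample data and the name SlidingSketch are made up; collect is used instead of map so that a window of unexpected size is skipped rather than causing a MatchError.

```scala
object SlidingSketch {
  val records = List(1, 2, 3, 4, 5)

  // Overlapping windows: (1,2), (2,3), (3,4), (4,5)
  val overlapping: List[Int] =
    records.sliding(2).collect { case Seq(former, latter) => former + latter }.toList

  // Non-overlapping windows: (1,2), (3,4), then the leftover (5)
  val pairwise: List[Int] =
    records.sliding(2, 2).collect {
      case Seq(former, latter) => former + latter
      case Seq(single)         => single
    }.toList
}
```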

Extract list to multiple distinct list

How can I split a Scala List into multiple lists, each containing only distinct values?
From
val l = List(1,2,6,3,5,4,4,3,4,1)
to
List(List(1,2,3,4,5,6),List(1,3,4),List(4))
Here's a (rather inefficient) way to do this: group by value, sort result by size of group, then use first group as a basis for per-index scan of the original groups to build the distinct lists:
scala> val l = List(1,2,6,3,5,4,4,3,4,1)
l: List[Int] = List(1, 2, 6, 3, 5, 4, 4, 3, 4, 1)
scala> val groups = l.groupBy(identity).values.toList.sortBy(- _.size)
groups: List[List[Int]] = List(List(4, 4, 4), List(1, 1), List(3, 3), List(5), List(6), List(2))
scala> groups.head.zipWithIndex.map { case (_, i) => groups.flatMap(_.drop(i).headOption) }
res9: List[List[Int]] = List(List(4, 1, 3, 5, 6, 2), List(4, 1, 3), List(4))
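A variation on the same grouping idea, shown only as a sketch (the names DistinctGroupsSketch and distinctGroups are made up): tag every element with its occurrence number inside its own group, then regroup by that number. Each output list is sorted so that it matches the form asked for in the question.

```scala
object DistinctGroupsSketch {
  def distinctGroups(xs: List[Int]): List[List[Int]] = {
    // (value, k) where k says this is the k-th occurrence of value (0-based)
    val tagged = xs.groupBy(identity).values.toList.flatMap(_.zipWithIndex)
    tagged
      .groupBy { case (_, occurrence) => occurrence }
      .toList
      .sortBy { case (occurrence, _) => occurrence }
      .map { case (_, pairs) => pairs.map(_._1).sorted }
  }

  val result: List[List[Int]] = distinctGroups(List(1, 2, 6, 3, 5, 4, 4, 3, 4, 1))
}
```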
An alternative approach after grouping like in the first answer by @TzachZohar is to keep taking one element from each list until all lists are empty:
val groups = l.groupBy(identity).values
Iterator
// continue removing the first element from every sublist, and discard empty tails
.iterate(groups)(_ collect { case _ :: (rest @ (_ :: _)) => rest } )
// stop when all sublists become empty and are removed
.takeWhile(_.nonEmpty)
// build and sort result lists
.map(_.map(_.head).toList.sorted)
.toList
And here's another option - scanning the input N times with N being the largest amount of repetitions of a single value:
// this function splits input list into two:
// all duplicate values, and the longest list of unique values
def collectDistinct[A](l: List[A]): (List[A], List[A]) = l.foldLeft((List[A](), List[A]())) {
case ((remaining, distinct), candidate) if distinct.contains(candidate) => (candidate :: remaining, distinct)
case ((remaining, distinct), candidate) => (remaining, candidate :: distinct)
}
// this recursive function takes a list of "remaining" values,
// and a list of distinct groups, and adds distinct groups to the list
// until "remaining" is empty
@annotation.tailrec
def distinctGroups[A](remaining: List[A], groups: List[List[A]]): List[List[A]] = remaining match {
case Nil => groups
case _ => collectDistinct(remaining) match {
case (next, group) => distinctGroups(next, group :: groups)
}
}
// call the second function with our input and an empty list of groups to begin with:
val result = distinctGroups(l, List())
Consider this approach:
trait Proc {
def process(v:Int): Proc
}
case object Empty extends Proc {
override def process(v:Int) = Processor(v, Map(0 -> List(v)), 0)
}
case class Processor(prev:Int, map:Map[Int, List[Int]], lastTarget:Int) extends Proc {
override def process(v:Int) = {
val target = if (prev==v) lastTarget+1 else 0
Processor(v, map + (target -> (v::map.getOrElse(target, Nil))), target)
}
}
l.sorted.foldLeft[Proc](Empty) {
case (acc, item) => acc.process(item)
}
Here we have a simple state machine. We iterate over the sorted list, starting from the initial state 'Empty'. Once 'Empty' processes an item, it produces the next state, a 'Processor'.
A Processor holds the previous value in 'prev' and the accumulated map of already grouped items. It also has lastTarget, the index of the list where the last write occurred.
The only thing 'Processor' does is calculate the target for the item being processed: if the item is the same as the previous one, it takes the next index; otherwise it starts again from index 0.
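The fold leaves you with the final state rather than with the lists themselves. A possible way to read the groups out of it is sketched below. It repeats the state machine so that it is self-contained; the wrapper StateMachineSketch and the helper groups are additions for illustration, not part of the approach as originally posted.

```scala
object StateMachineSketch {
  sealed trait Proc { def process(v: Int): Proc }

  case object Empty extends Proc {
    def process(v: Int): Proc = Processor(v, Map(0 -> List(v)), 0)
  }

  final case class Processor(prev: Int, map: Map[Int, List[Int]], lastTarget: Int) extends Proc {
    def process(v: Int): Proc = {
      // a repeated value goes to the next list, a new value starts again at list 0
      val target = if (prev == v) lastTarget + 1 else 0
      Processor(v, map + (target -> (v :: map.getOrElse(target, Nil))), target)
    }
  }

  def groups(xs: List[Int]): List[List[Int]] =
    xs.sorted.foldLeft[Proc](Empty)((acc, item) => acc.process(item)) match {
      case Empty                => Nil
      // lists were built by prepending, so reverse them to restore ascending order
      case Processor(_, map, _) => map.toList.sortBy(_._1).map(_._2.reverse)
    }

  val result: List[List[Int]] = groups(List(1, 2, 6, 3, 5, 4, 4, 3, 4, 1))
}
```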

Merge two lists which contains case class objects scala

I have two lists which contain case class objects
case class Balance(id: String, in: Int, out: Int)
val l1 = List(Balance("a", 0, 0), Balance("b", 10, 30), Balance("c", 20, 0))
val l2 = List(Balance("a", 10, 0), Balance("b", 40, 0))
I want to sum up the in and out values of the elements that share an id and combine the lists like below
List(Balance(a, 10, 0), Balance(b, 50, 30), Balance(c, 20, 0))
I came up with the following solution
// create list of tuples with 'id' as key
val a = l1.map(b => (b.id, (b.in, b.out)))
val b = l2.map(b => (b.id, (b.in, b.out)))
// combine the lists
val bl = (a ++ b).groupBy(_._1).mapValues(_.unzip._2.unzip match {
case (ll1, ll2) => (ll1.sum, ll2.sum)
}).toList.map(b => Balance(b._1, b._2._1, b._2._2))
// output
// List(Balance(a, 10, 0), Balance(b, 50, 30), Balance(c, 20, 0))
Is there a shorter way to do this?
You don't really need to create the tuple lists.
(l1 ++ l2).groupBy(_.id)
.mapValues(_.foldLeft((0,0)){
case ((a,b),Balance(id,in,out)) => (a+in,b+out)})
.map{
case (k,(in,out)) => Balance(k,in,out)}
.toList
// res0: List[Balance] = List(Balance(b,50,30), Balance(a,10,0), Balance(c,20,0))
You'll note that the result appears out of order because of the intermediate representation as a Map, which, by definition, has no order.
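If you are on Scala 2.13 or later (an assumption, since the question does not say which version is used), groupMapReduce does the grouping and the summing in one pass. This is only a sketch, and the final sortBy is there just to get a stable order back out of the Map. MergeSketch is a made-up wrapper name.

```scala
object MergeSketch {
  final case class Balance(id: String, in: Int, out: Int)

  val l1 = List(Balance("a", 0, 0), Balance("b", 10, 30), Balance("c", 20, 0))
  val l2 = List(Balance("a", 10, 0), Balance("b", 40, 0))

  val merged: List[Balance] =
    (l1 ++ l2)
      // key by id, keep each Balance as is, and add up balances that share a key
      .groupMapReduce(_.id)(identity)((x, y) => Balance(x.id, x.in + y.in, x.out + y.out))
      .values
      .toList
      .sortBy(_.id)
}
```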
Another approach would be to add a Semigroup instance for Balance and use that for the combine logic. The advantage of this is that the code is in one place only, rather than sprinkled wherever you need to combine lists or maps of Balances.
So, you first add the instance:
import cats.Semigroup
import cats.implicits._
implicit val semigroupBalance: Semigroup[Balance] = new Semigroup[Balance] {
  override def combine(x: Balance, y: Balance): Balance =
    if (x.id == y.id) // I am arbitrarily deciding this: you can adapt the logic to your
                      // use case, but if you only need it in the scenario you asked for,
                      // the case where y.id and x.id are different will never happen.
      Balance(x.id, x.in + y.in, x.out + y.out)
    else x
}
Then, the code to combine multiple lists becomes simpler (using your example data):
(l1 ++ l2).groupBy(_.id).mapValues(_.reduce(_ |+| _)) //Map(b -> Balance(b,50,30), a -> Balance(a,10,0), c -> Balance(c,20,0))
N.B. As @jwvh already noted, the result will not be in order, in this simple case, because of the default unordered Map that groupBy returns. That could be fixed, if needed.
N.B. You might want to use Monoid instead of Semigroup, if you have a meaningful empty value for Balance.
For those who need to merge two lists of case class objects while keeping the result sorted by id (which, in this example, matches the original order), here's my solution, which is based on jwvh's answer to this question and this answer.
import scala.collection.immutable.SortedMap
val mergedList: List[Balance] = l1 ++ l2
val sortedListOfBalances: List[Balance] =
SortedMap(mergedList.groupBy(_.id).toSeq:_*)
.mapValues(_.foldLeft((0,0)){
case ((a,b),Balance(id,in,out)) => (a+in,b+out)
})
.map{
case (k,(in,out)) => Balance(k,in,out)
}
.toList
This will return List(Balance(a,10,0), Balance(b,50,30), Balance(c,20,0)) while when not using SortedMap we get List(Balance(b,50,30), Balance(a,10,0), Balance(c,20,0)).
The default Map makes no guarantee about iteration order; to get a defined order we have to use a SortedMap explicitly.
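SortedMap orders the result by key. If what you actually need is the order in which each id first appears in the input, one option is to take the distinct ids in input order and look the groups up from there. This is a sketch under that assumption; mergeKeepingOrder and the sample data are made up for illustration.

```scala
object OrderedMergeSketch {
  final case class Balance(id: String, in: Int, out: Int)

  // Deliberately not sorted by id, to show that first-appearance order is kept.
  val l1 = List(Balance("c", 20, 0), Balance("a", 0, 0))
  val l2 = List(Balance("b", 40, 0), Balance("a", 10, 0))

  def mergeKeepingOrder(xs: List[Balance]): List[Balance] = {
    val byId = xs.groupBy(_.id)
    // distinct keeps the first occurrence of every id, in input order
    xs.map(_.id).distinct.map { id =>
      byId(id).reduce((x, y) => Balance(id, x.in + y.in, x.out + y.out))
    }
  }

  val merged: List[Balance] = mergeKeepingOrder(l1 ++ l2)
}
```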

How to use takeWhile with an Iterator in Scala

I have an Iterator of elements and I want to consume them until a condition is met in the next element, like:
val it = List(1,1,1,1,2,2,2).iterator
val res1 = it.takeWhile( _ == 1).toList
val res2 = it.takeWhile(_ == 2).toList
res1 gives the expected List(1,1,1,1), but res2 returns List(2,2) because the iterator had to consume the element at position 4 in order to check it.
I know that the list is ordered, so there is no point in traversing the whole list like partition does. I would like to stop as soon as the condition is no longer met. Is there a clever way to do this with Iterators? I cannot call toList on the iterator because it comes from a very big file.
The simplest solution I found:
val it = List(1,1,1,1,2,2,2).iterator
val (r1, it2) = it.span( _ == 1)
println(s"group taken is: ${r1.toList}\n rest is: ${it2.toList}")
output:
group taken is: List(1, 1, 1, 1)
rest is: List(2, 2, 2)
Very short, but from then on you have to use the new iterator.
With any immutable collection it would be similar:
use takeWhile when you want only some prefix of collection,
use span when you need rest also.
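Chaining span calls gives you one run after another. A small sketch (SpanSketch and the val names are made up): the prefix is turned into a List before the remainder is touched, and the original iterator is not used again after span.

```scala
object SpanSketch {
  val it = List(1, 1, 1, 1, 2, 2, 2).iterator

  val (ones, afterOnes) = it.span(_ == 1)
  val onesList: List[Int] = ones.toList // consume the prefix first

  val (twos, afterTwos) = afterOnes.span(_ == 2)
  val twosList: List[Int] = twos.toList

  val leftover: List[Int] = afterTwos.toList
}
```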
With my other answer (which I've left separate as they are largely unrelated), I think you can implement groupWhen on Iterator as follows:
def groupWhen[A](itr: Iterator[A])(p: (A, A) => Boolean): Iterator[List[A]] = {
@annotation.tailrec
def groupWhen0(acc: Iterator[List[A]], itr: Iterator[A])(p: (A, A) => Boolean): Iterator[List[A]] = {
val (dup1, dup2) = itr.duplicate
val pref = ((dup1.sliding(2) takeWhile { case Seq(a1, a2) => p(a1, a2) }).zipWithIndex collect {
case (seq, 0) => seq
case (Seq(_, a), _) => Seq(a)
}).flatten.toList
val newAcc = if (pref.isEmpty) acc else acc ++ Iterator(pref)
if (dup2.nonEmpty)
groupWhen0(newAcc, dup2 drop (pref.length max 1))(p)
else newAcc
}
groupWhen0(Iterator.empty, itr)(p)
}
When I run it on an example:
println( groupWhen(List(1,1,1,1,3,4,3,2,2,2).iterator)(_ == _).toList )
I get List(List(1, 1, 1, 1), List(2, 2, 2))
I had a similar need, but the solution from @oxbow_lakes does not take into account the situation when the list has only one element, or even if the list contains elements that are not repeated. Also, that solution doesn't lend itself well to an infinite iterator (it wants to "see" all the elements before it gives you a result).
What I needed was the ability to group sequential elements that match a predicate, but also include the single elements (I can always filter them out if I don't need them). I needed those groups to be delivered continuously, without having to wait for the original iterator to be completely consumed before they are produced.
I came up with the following approach which works for my needs, and thought I should share:
implicit class IteratorEx[+A](itr: Iterator[A]) {
def groupWhen(p: (A, A) => Boolean): Iterator[List[A]] = new AbstractIterator[List[A]] {
val (it1, it2) = itr.duplicate
val ritr = new RewindableIterator(it1, 1)
override def hasNext = it2.hasNext
override def next() = {
val count = (ritr.rewind().sliding(2) takeWhile {
case Seq(a1, a2) => p(a1, a2)
case _ => false
}).length
(it2 take (count + 1)).toList
}
}
}
The above is using a few helper classes:
abstract class AbstractIterator[A] extends Iterator[A]
/**
* Wraps a given iterator to add the ability to remember the last 'remember' values
* From any position the iterator can be rewound (can go back) at most 'remember' values,
* such that when calling 'next()' the memoized values will be provided as if they have not
* been iterated over before.
*/
class RewindableIterator[A](it: Iterator[A], remember: Int) extends Iterator[A] {
private var memory = List.empty[A]
private var memoryIndex = 0
override def next() = {
if (memoryIndex < memory.length) {
val next = memory(memoryIndex)
memoryIndex += 1
next
} else {
val next = it.next()
memory = memory :+ next
if (memory.length > remember)
memory = memory drop 1
memoryIndex = memory.length
next
}
}
def canRewind(n: Int) = memoryIndex - n >= 0
def rewind(n: Int) = {
require(memoryIndex - n >= 0, "Attempted to rewind past 'remember' limit")
memoryIndex -= n
this
}
def rewind() = {
memoryIndex = 0
this
}
override def hasNext = it.hasNext
}
Example use:
List(1,2,2,3,3,3,4,5,5).iterator.groupWhen(_ == _).toList
gives: List(List(1), List(2, 2), List(3, 3, 3), List(4), List(5, 5))
If you want to filter out the single elements, just apply a filter or withFilter after groupWhen
Stream.continually(Random.nextInt(100)).iterator
.groupWhen(_ + _ == 100).withFilter(_.length > 1).take(3).toList
gives: List(List(34, 66), List(87, 13), List(97, 3))
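For comparison, a shorter variant is possible with the standard BufferedIterator, whose head lets you peek at the next element without consuming it. This is a sketch of a different technique from the RewindableIterator above, not a drop-in copy of it; GroupWhenSketch is a made-up wrapper name. Groups are produced lazily, one per call to next().

```scala
object GroupWhenSketch {
  def groupWhen[A](source: Iterator[A])(p: (A, A) => Boolean): Iterator[List[A]] = {
    val buf = source.buffered
    new Iterator[List[A]] {
      def hasNext: Boolean = buf.hasNext
      def next(): List[A] = {
        var last = buf.next()
        val run = List.newBuilder[A]
        run += last
        // extend the current group while the predicate holds between neighbours
        while (buf.hasNext && p(last, buf.head)) {
          last = buf.next()
          run += last
        }
        run.result()
      }
    }
  }

  val grouped: List[List[Int]] =
    groupWhen(List(1, 2, 2, 3, 3, 3, 4, 5, 5).iterator)(_ == _).toList
}
```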
You could use the toStream method on Iterator.
Stream is a lazy equivalent of List.
As you can see from the implementation of toStream, it creates a Stream without traversing the whole Iterator.
A Stream keeps every element it has already evaluated in memory, so confine references to the Stream to a local scope to avoid memory leaks.
With Stream you should use span like this:
val (res1, rest1) = stream.span(_ == 1)
val (res2, rest2) = rest1.span(_ == 2)
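On Scala 2.13 and later, Stream is deprecated in favour of LazyList. Assuming that version, the same idea might look like the sketch below (LazyListSketch and the val names are made up); the memory caveat about holding on to the head still applies.

```scala
object LazyListSketch {
  val elems: LazyList[Int] = List(1, 1, 1, 1, 2, 2, 2).iterator.to(LazyList)

  // span on a LazyList leaves the source intact, so the remainder can be split again
  val (ones, afterOnes) = elems.span(_ == 1)
  val (twos, afterTwos) = afterOnes.span(_ == 2)
}
```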
I'm guessing a bit here but by the statement "until a condition is met in the next element", it sounds like you might want to look at the groupWhen method on ListOps in scalaz
scala> import scalaz.syntax.std.list._
import scalaz.syntax.std.list._
scala> List(1,1,1,1,2,2,2) groupWhen (_ == _)
res1: List[List[Int]] = List(List(1, 1, 1, 1), List(2, 2, 2))
Basically this "chunks" up the input sequence upon a condition (a (A, A) => Boolean) being met between an element and its successor. In the example above the condition is equality, so, as long as an element is equal to its successor, they will be in the same chunk.

Returning an element from a List in Scala

I've recently been working on a beginner's project in Scala, and have a beginner question about Scala's Lists.
Say I have a list of tuples ( List[Tuple2[String, String]], for example). Is there a convenience method to return the first occurrence of a specified tuple from the List, or is it necessary to iterate through the list by hand?
scala> val list = List(("A", "B", 1), ("C", "D", 1), ("E", "F", 1), ("C", "D", 2), ("G", "H", 1))
list: List[(java.lang.String, java.lang.String, Int)] = List((A,B,1), (C,D,1), (E,F,1), (C,D,2), (G,H,1))
scala> list find {e => e._1 == "C" && e._2 == "D"}
res0: Option[(java.lang.String, java.lang.String, Int)] = Some((C,D,1))
You could try using find.
As mentioned in a previous comment, find is probably the easiest way to do this. There are actually three different "linear search" methods in Scala's collections, each returning a slightly different value. Which one you use depends upon what you need the data for. For example, do you need an index, or do you just need a boolean true/false?
If you're learning scala, I'd take a good look at the Seq trait. It provides the basis for much of scala's functional goodness.
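The answer above does not name the three methods. A reasonable guess, and it is only a guess, is find, indexWhere and exists, which return the element, its index and a Boolean respectively. A sketch using the list from the earlier example (LinearSearchSketch and isCD are made-up names):

```scala
object LinearSearchSketch {
  val list = List(("A", "B", 1), ("C", "D", 1), ("E", "F", 1), ("C", "D", 2), ("G", "H", 1))

  // shared predicate: first two fields are "C" and "D"
  val isCD: ((String, String, Int)) => Boolean = { case (x, y, _) => x == "C" && y == "D" }

  val firstMatch: Option[(String, String, Int)] = list.find(isCD) // the element itself
  val firstIndex: Int = list.indexWhere(isCD)                     // its position, or -1
  val anyMatch: Boolean = list.exists(isCD)                       // just yes or no
}
```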
You could also do this, which doesn't require knowing the field names in the Tuple2 class--it uses pattern matching instead:
list find { case (x,y,_) => x == "C" && y == "D" }
"find" is good when you know you only need one; if you want to find all matching elements you could either use "filter" or the equivalent sugary for comprehension:
for ( (x,y,z) <- list if x == "C" && y == "D") yield (x,y,z)
Here's code that may help you.
I had a similar case, having a collection of base class entries (here, A) out of which I wanted to find a certain derived class's node, if any (here, B).
class A
case class B(name: String) extends A
object TestX extends App {
  val states: List[A] = List( B("aa"), new A, B("ccc") )
  // collectFirst matches on the type and narrows the result to Option[B]
  def findByName( name: String ): Option[B] =
    states.collectFirst { case x: B if x.name == name => x }
  println( findByName("ccc") ) // "Some(B(ccc))"
}
The important part here (for my app) is that findByName does not return Option[A] but Option[B].
You can easily modify the behaviour to return B instead, and throw an exception if none was found. Hope this helps.
Consider collectFirst which delivers Some[(String,String)] for the first matching tuple or None otherwise, for instance as follows,
scala> xs collectFirst { case t@(a,_) if a == "existing" => t }
Some((existing,str))
scala> xs collectFirst { case t@(a,_) if a == "nonExisting" => t }
None
Using @ we bind the value of the tuple to t so that a whole matching tuple can be collected.
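Since xs is not defined in the snippet above, here is a self-contained version with made-up sample data (CollectFirstSketch and its contents are for illustration only):

```scala
object CollectFirstSketch {
  val xs = List(("other", "a"), ("existing", "str"), ("existing", "later"))

  // t is bound to the whole tuple, a to its first field
  val hit: Option[(String, String)] =
    xs.collectFirst { case t @ (a, _) if a == "existing" => t }

  val miss: Option[(String, String)] =
    xs.collectFirst { case t @ (a, _) if a == "nonExisting" => t }
}
```

Only the first matching tuple is returned; later matches are never inspected.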