Scala: a cross between foldLeft and map that supports lazy evaluation

I have a collection that I want to map to a new collection, but each resulting value depends in some way on the value before it. I could solve this with a foldLeft:
val result:List[B] = (myList:List[A]).foldLeft(C -> List.empty[B]){
case ((c, list), a) =>
..some function returning something like..
C -> (B :: list)
}
The problem here is that I need to iterate through the entire list to retrieve the resulting list. Say I wanted a function that maps TraversableOnce[A] to TraversableOnce[B] and only evaluates members as I ask for them?
It seems to me to be a fairly conventional problem, so I'm wondering if there is a common approach to this. What I currently have is:
implicit class TraversableOnceEx[T](val self : TraversableOnce[T]) extends AnyVal {
def foldyMappyFunction[A, U](a:A)(func:(A,T) => (A,U)):TraversableOnce[U] = {
var currentA = a
self.map { t =>
val result = func(currentA, t)
currentA = result._1
result._2
}
}
}
As far as functional purity goes, you couldn't run it in parallel, but otherwise it seems sound.
An example would be:
Return each element together with a flag saying whether that element has already appeared earlier.
val elements:TraversableOnce[E]
val result = elements.foldyMappyFunction(Set.empty[E]) {
(s, e) => (s + e) -> (e -> s.contains(e))
}
result:TraversableOnce[(E,Boolean)]
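One way to get this behaviour with only the standard library is to go through an Iterator, whose scanLeft and collect are evaluated on demand. The following is a minimal sketch under that assumption, not a definitive implementation; the names LazyStatefulMap and statefulMap are hypothetical and do not come from the question.

```scala
object LazyStatefulMap {
  // Threads a state of type S through the source while emitting one U per element.
  // Iterator.scanLeft and Iterator.collect are both lazy, so `f` only runs when
  // the resulting iterator is advanced.
  def statefulMap[T, S, U](source: Iterator[T], init: S)(f: (S, T) => (S, U)): Iterator[U] =
    source
      .scanLeft((init, Option.empty[U])) { case ((state, _), t) =>
        val (nextState, out) = f(state, t)
        (nextState, Some(out))
      }
      // The first element produced by scanLeft is the seed, which carries no output.
      .collect { case (_, Some(out)) => out }
}
```

Because nothing is computed until the iterator is consumed, this also works on an unbounded source such as Iterator.from(1), as long as only a finite prefix is taken.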

You might be able to make use of the State Monad. Here is your example re-written using scalaz:
import scalaz._, Scalaz._
def foldyMappy(i: Int) = State[Set[Int], (Int, Boolean)](s => (s + i, (i, s contains(i))))
val r = List(1, 2, 3, 3, 6).traverseS(foldyMappy)(Set.empty[Int])._2
//List((1,false), (2,false), (3,false), (3,true), (6,false))
println(r)

It looks like you need a SeqView. Use the view or view(from: Int, until: Int) methods to create a non-strict view of the list.
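As a rough illustration of what a non-strict view buys you, the sketch below counts how often the mapping function runs. The object and value names (ViewLazinessDemo, evaluated, firstTwo) are made up for this example.

```scala
object ViewLazinessDemo {
  // Counts how many times the mapping function actually runs.
  var evaluated = 0

  val xs = List(1, 2, 3, 4, 5)

  // Nothing is computed here: the view only records the pending map.
  val mapped = xs.view.map { i => evaluated += 1; i * 10 }

  // Forcing just the first two elements leaves the rest of the list untouched.
  val firstTwo = mapped.take(2).toList
}
```

Note that a view re-runs the function on every traversal, so combining it with mutable state such as currentA from the question needs some care.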

I don't really understand your example, as your contains check will always evaluate to false.
foldLeft is different: it produces a single value by aggregating all elements of the list.
You clearly need map (List => List).
Anyway, to answer your question about laziness:
you should use Stream instead of List. A Stream doesn't evaluate its tail until the tail is actually accessed.
Stream API

Related

groupBy on List as LinkedHashMap instead of Map

I am processing XML using scala, and I am converting the XML into my own data structures. Currently, I am using plain Map instances to hold (sub-)elements, however, the order of elements from the XML gets lost this way, and I cannot reproduce the original XML.
Therefore, I want to use LinkedHashMap instances instead of Map, however I am using groupBy on the list of nodes, which creates a Map:
For example:
def parse(n:Node): Unit =
{
val leaves:Map[String, Seq[XmlItem]] =
n.child
.filter(node => { ... })
.groupBy(_.label)
.map((tuple:Tuple2[String, Seq[Node]]) =>
{
val items = tuple._2.map(node =>
{
val attributes = ...
if (node.text.nonEmpty)
XmlItem(Some(node.text), attributes)
else
XmlItem(None, attributes)
})
(tuple._1, items)
})
...
}
In this example, I want leaves to be of type LinkedHashMap to retain the order of n.child. How can I achieve this?
Note: I am grouping by label/tagname because elements can occur multiple times, and for each label/tagname, I keep a list of elements in my data structures.
Solution
As suggested by @jwvh, I am using foldLeft as a substitute for groupBy. Also, I decided to go with LinkedHashMap instead of ListMap.
def parse(n:Node): Unit =
{
val leaves:mutable.LinkedHashMap[String, Seq[XmlItem]] =
n.child
.filter(node => { ... })
.foldLeft(mutable.LinkedHashMap.empty[String, Seq[Node]])((m, sn) =>
{
m.update(sn.label, m.getOrElse(sn.label, Seq.empty[Node]) ++ Seq(sn))
m
})
.map((tuple:Tuple2[String, Seq[Node]]) =>
{
val items = tuple._2.map(node =>
{
val attributes = ...
if (node.text.nonEmpty)
XmlItem(Some(node.text), attributes)
else
XmlItem(None, attributes)
})
(tuple._1, items)
})
To get the rough equivalent to .groupBy() in a ListMap you could fold over your collection. The problem is that ListMap preserves the order of elements as they were appended, not as they were encountered.
import collection.immutable.ListMap
List('a','b','a','c').foldLeft(ListMap.empty[Char,Seq[Char]]){
case (lm,c) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}
//res0: ListMap[Char,Seq[Char]] = ListMap(b -> Seq(b), a -> Seq(a, a), c -> Seq(c))
To fix this you can foldRight instead of foldLeft. The result is the original order of elements as encountered (scanning left to right) but in reverse.
List('a','b','a','c').foldRight(ListMap.empty[Char,Seq[Char]]){
case (c,lm) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}
//res1: ListMap[Char,Seq[Char]] = ListMap(c -> Seq(c), b -> Seq(b), a -> Seq(a, a))
This isn't necessarily a bad thing since a ListMap is more efficient with last and init ops, O(1), than it is with head and tail ops, O(n).
To process the ListMap in the original left-to-right order you could .toList and .reverse it.
List('a','b','a','c').foldRight(ListMap.empty[Char,Seq[Char]]){
case (c,lm) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}.toList.reverse
//res2: List[(Char, Seq[Char])] = List((a,Seq(a, a)), (b,Seq(b)), (c,Seq(c)))
A purely immutable solution would be quite slow, so I'd go with:
import collection.mutable.{ArrayBuffer, LinkedHashMap}
implicit class ExtraTraversableOps[A](seq: collection.TraversableOnce[A]) {
def orderedGroupBy[B](f: A => B): collection.Map[B, collection.Seq[A]] = {
val map = LinkedHashMap.empty[B, ArrayBuffer[A]]
for (x <- seq) {
val key = f(x)
map.getOrElseUpdate(key, ArrayBuffer.empty) += x
}
map
}
}
To use, just change .groupBy in your code to .orderedGroupBy.
The returned Map can't be mutated using this type (though it can be cast to mutable.Map or to mutable.LinkedHashMap), so it's safe enough for most purposes (and you could create a ListMap from it at the end if really desired).
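To make the ordering guarantee concrete, here is a small standalone sketch of the same idea written as a plain function over Iterable rather than as an implicit class. The object name OrderedGroupByDemo and the sample data are assumptions for illustration only.

```scala
import scala.collection.mutable.{ArrayBuffer, LinkedHashMap}

object OrderedGroupByDemo {
  // Groups elements by key while remembering the order in which keys first appear.
  def orderedGroupBy[A, K](xs: Iterable[A])(f: A => K): LinkedHashMap[K, ArrayBuffer[A]] = {
    val groups = LinkedHashMap.empty[K, ArrayBuffer[A]]
    xs.foreach { x =>
      // Create the bucket on first sight of the key, then append to it.
      groups.getOrElseUpdate(f(x), ArrayBuffer.empty[A]) += x
    }
    groups
  }

  // Keys come out as b, a, c: the order of first occurrence, not sorted order.
  val byFirstChar = orderedGroupBy(List("b1", "a1", "b2", "c1"))(_.head)
}
```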

How to define case class with a list of tuples and access the tuples in scala

I have a case class with a parameter a, which is a list of Int tuples. I want to iterate over a and define operations on it.
I have tried the following:
case class XType (a: List[(Int, Int)]) {
for (x <- a) {
assert(x._2 >= 0)
}
def op(): XType = {
for ( x <- XType(a))
yield (x._1, x._2)
}
}
However, I am getting the error:
"Value map is not a member of XType."
How can I access the integers of tuples and define operations on them?
You're running into an issue with for comprehensions, which are really another way of expressing things like foreach and map (and flatMap and withFilter/filter). See here and here for more explanation.
Your first for comprehension (the one with asserts) is equivalent to
a.foreach(x => assert(x._2 >= 0))
a is a List, x is an (Int, Int), everything's good.
However, the second one (in op) translates to
XType(a).map(x => (x._1, x._2))
which doesn't make sense: XType doesn't know what to do with map, as the error says.
An instance of XType refers to its a as simply a (or this.a), so a.map(x => x) would be just fine in op (and then turn the result into a new XType).
As a general rule, for comprehensions are handy for nested maps (or flatMaps or whatever), rather than as a 1-1 equivalent for for loops in other languages--just use map instead.
You can access the tuple list with:
def op(): XType = {
XType(a.map(...))
}
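Putting the pieces together, a complete version might look like the sketch below. The question never says what op should compute, so incrementing the first component is purely a placeholder, and the wrapper object XTypeDemo is hypothetical.

```scala
object XTypeDemo {
  final case class XType(a: List[(Int, Int)]) {
    // Validate on construction: every second component must be non-negative.
    a.foreach { case (_, second) => assert(second >= 0) }

    // Map over the underlying list, then wrap the result in a new XType.
    // Incrementing the first component is a placeholder operation.
    def op(): XType =
      XType(a.map { case (first, second) => (first + 1, second) })
  }
}
```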

loop until a condition stands in scala

I'd like to write a generic loop that runs until a given condition holds, in a functional way.
I've come up with the following code:
def loop[A](a: A, f: A => A, cond: A => Boolean) : A =
if (cond(a)) a else loop(f(a), f, cond)
What are the alternatives? Is there anything in scalaz?
[update] It may be possible to use cats and to convert A => A into Reader and afterwards use tailRecM. Any help would be appreciated.
I agree with @wheaties's comment, but since you asked for alternatives, here you go:
You could represent the loop's steps as an iterator, then navigate to the first step where cond is true using .find:
val result = Iterator.iterate(a)(f).find(cond).get
I had originally misread, and answered as if the cond was the "keep looping while true" condition, as with C-style loops. Here's my response as if that was what you asked.
val steps = Iterator.iterate(a)(f).takeWhile(cond)
If all you want is the last A value, you can use steps.toIterable.last (oddly, Iterator doesn't have .last defined). Or you could collect all of the values to a list using steps.toList.
Example:
val steps = Iterator.iterate(0)(_ + 1).takeWhile(_ < 10)
// remember that an iterator is read-once, so if you call .toList, you can't call .last
val result = steps.toIterable.last
// result == 9
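For comparison, the "loop until cond holds" reading can be sketched both as a recursive function and as an iterator pipeline. The curried parameter lists and the names LoopUntilDemo and loopViaIterator are choices made for this example rather than anything from the question.

```scala
import scala.annotation.tailrec

object LoopUntilDemo {
  // Recursive form; @tailrec makes the compiler verify it runs in constant stack space.
  @tailrec
  def loop[A](a: A)(f: A => A)(cond: A => Boolean): A =
    if (cond(a)) a else loop(f(a))(f)(cond)

  // Iterator form: Iterator.iterate is infinite, so find either returns Some
  // or never terminates, which makes the .get safe here.
  def loopViaIterator[A](a: A)(f: A => A)(cond: A => Boolean): A =
    Iterator.iterate(a)(f).find(cond).get
}
```

Putting the start value in its own parameter list lets Scala infer A before it looks at the two functions, so calls like loop(1)(_ * 2)(_ > 100) need no type annotations.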
From your structure, I think what you are describing is closer to dropWhile than takeWhile. What follows is 100% educational and I don't suggest that this is useful or the proper way to solve this problem. Nevertheless, you might find it useful.
If you want to be generic over any container (List, Array, Option, etc.), you will need a way to access the first element of the container (a.k.a. the head):
trait HasHead[I[_]]{
def head[X](of: I[X]): X
}
object HasHead {
implicit val listHasHead = new HasHead[List] {
def head[X](of: List[X]) = of.head
}
implicit val arrayHasHead = new HasHead[Array] {
def head[X](of: Array[X]) = of.head
}
//...
}
Here is the generic loop adapted to work with any container:
def loop[I[_], A](
a: I[A],
f: I[A] => I[A],
cond: A => Boolean)(
implicit
hh: HasHead[I]): I[A] =
if(cond(hh.head(a))) a else loop(f(a), f, cond)
Example:
loop(List(1,2,3,4,5), (_: List[Int]).tail, (_: Int) > 2)
> List(3, 4, 5)

Scala: Grouping list of tuples

I need to group a list of tuples in a particular way.
For example, if I have
val l = List((1,2,3),(4,2,5),(2,3,3),(10,3,2))
then I need to group the list by the second value and map each group to the set of its first values.
So the result should be
Map(2 -> Set(1,4), 3 -> Set(2,10))
So far, I have come up with this:
l groupBy { p => p._2 } mapValues { v => (v map { vv => vv._1 }).toSet }
This works, but I believe there should be a much more efficient way...
This is similar to this question. Basically, as @serejja said, your approach is correct and also the most concise one. You could pass collection.breakOut as the builder factory argument to the last map and thereby save the additional iteration needed to get the Set type:
l.groupBy(_._2).mapValues(_.map(_._1)(collection.breakOut): Set[Int])
You probably shouldn't go beyond this unless you really need to squeeze out performance.
Otherwise, this is how a general toMultiMap function could look, which allows you to control the values collection type:
import collection.generic.CanBuildFrom
import collection.mutable
def toMultiMap[A, K, V, Values](xs: TraversableOnce[A])
(key: A => K)(value: A => V)
(implicit cbfv: CanBuildFrom[Nothing, V, Values]): Map[K, Values] = {
val b = mutable.Map.empty[K, mutable.Builder[V, Values]]
xs.foreach { elem =>
b.getOrElseUpdate(key(elem), cbfv()) += value(elem)
}
b.map { case (k, vb) => (k, vb.result()) } (collection.breakOut)
}
It uses a mutable Map during the building stage, with the values first gathered in a mutable Builder (the builder is provided by the CanBuildFrom instance). After the iteration over all input elements has completed, that mutable map of builders is converted into an immutable map of the values collection type (again using the collection.breakOut trick to get the desired output collection straight away).
Ex:
val l = List((1,2,3),(4,2,5),(2,3,3),(10,3,2))
val v = toMultiMap(l)(_._2)(_._1) // uses Vector for values
val s: Map[Int, Set[Int]] = toMultiMap(l)(_._2)(_._1) // uses Set for values
So your annotated result type directs the type inference of the values type. If you do not annotate the result, Scala will pick Vector as default collection type.
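Since collection.breakOut and CanBuildFrom are not available from Scala 2.13 onward, a version-independent alternative is a single foldLeft that builds the Map of Sets directly. This is a sketch of that swapped-in technique rather than the approach described above; the object name GroupTuplesDemo is hypothetical.

```scala
object GroupTuplesDemo {
  val l = List((1, 2, 3), (4, 2, 5), (2, 3, 3), (10, 3, 2))

  // One pass over the list: key on the second component, collect the first into a Set.
  val grouped: Map[Int, Set[Int]] =
    l.foldLeft(Map.empty[Int, Set[Int]]) { case (acc, (first, key, _)) =>
      acc.updated(key, acc.getOrElse(key, Set.empty[Int]) + first)
    }
}
```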

Scala: Generalised method to find a match and return match-dependent values in a collection

I wish to find a match within a List and return values dependent on the match. collectFirst works well for matching on the elements of the collection, but in this case I want to match on the member swEl of the element rather than on the element itself.
abstract class CanvNode (var swElI: Either[CSplit, VistaT])
{
private[this] var _swEl: Either[CSplit, VistaT] = swElI
def member = _swEl
def member_= (value: Either[CSplit, VistaT] ){ _swEl = value; attach}
def attach: Unit
attach
def findVista(origV: VistaIn): Option[Tuple2[CanvNode,VistaT]] = member match
{
case Right(v) if (v == origV) => Option(this, v)
case _ => None
}
}
def nodes(): List[CanvNode] = topNode :: splits.map(i => List(i.n1, i.n2)).flatten
//Is there a better way of implementing this?
val temp: Option[Tuple2[CanvNode, VistaT]] =
nodes.map(i => i.findVista(origV)).collectFirst{case Some (r) => r}
Do I need a View on that, or will the collectFirst method ensure the collection is only created as needed?
It strikes me that this must be a fairly general pattern. Another example could be if one had a List member of the main List's elements and wanted to return the fourth element if it had one. Is there a standard method I can call? Failing that I can create the following:
implicit class TraversableOnceRichClass[A](n: TraversableOnce[A])
{
def findSome[T](f: (A) => Option[T]) = n.map(f(_)).collectFirst{case Some (r) => r}
}
And then I can replace the above with:
val temp: Option[Tuple2[CanvNode, VistaT]] =
nodes.findSome(i => i.findVista(origV))
This uses implicit classes from 2.10, for pre 2.10 use:
class TraversableOnceRichClass[A](n: TraversableOnce[A])
{
def findSome[T](f: (A) => Option[T]) = n.map(f(_)).collectFirst{case Some (r) => r}
}
implicit final def TraversableOnceRichClass[A](n: List[A]):
TraversableOnceRichClass[A] = new TraversableOnceRichClass(n)
As an introductory side note: the operation you're describing (return the first Some if one exists, and None otherwise) is the sum of a collection of Options under the "first" monoid instance for Option. So for example, with Scalaz 6:
scala> Stream(None, None, Some("a"), None, Some("b")).map(_.fst).asMA.sum
res0: scalaz.FirstOption[java.lang.String] = Some(a)
Alternatively you could put something like this in scope:
implicit def optionFirstMonoid[A] = new Monoid[Option[A]] {
val zero = None
def append(a: Option[A], b: => Option[A]) = a orElse b
}
And skip the .map(_.fst) part. Unfortunately neither of these approaches is appropriately lazy in Scalaz, so the entire stream will be evaluated (unlike Haskell, where mconcat . map (First . Just) $ [1..] is just fine, for example).
Edit: As a side note to this side note: apparently Scalaz does provide a sumr that's appropriately lazy (for streams—none of these approaches will work on a view). So for example you can write this:
Stream.from(1).map(Some(_).fst).sumr
And not wait forever for your answer, just like in the Haskell version.
But assuming that we're sticking with the standard library, instead of this:
n.map(f(_)).collectFirst{ case Some(r) => r }
I'd write the following, which is more or less equivalent, and arguably more idiomatic:
n.flatMap(f(_)).headOption
For example, suppose we have a list of integers.
val xs = List(1, 2, 3, 4, 5)
We can make this lazy and map a function with a side effect over it to show us when its elements are accessed:
val ys = xs.view.map { i => println(i); i }
Now we can flatMap an Option-returning function over the resulting collection and use headOption to (safely) return the first element, if it exists:
scala> ys.flatMap(i => if (i > 2) Some(i.toString) else None).headOption
1
2
3
res0: Option[java.lang.String] = Some(3)
So clearly this stops when we hit a non-empty value, as desired. And yes, you'll definitely need a view if your original collection is strict, since otherwise headOption (or collectFirst) can't reach back and stop the flatMap (or map) that precedes it.
In your case you can skip findVista and get even more concise with something like this:
val temp = nodes.view.flatMap(
node => node.member.right.toOption.filter(_ == origV).map(node -> _)
).headOption
Whether you find this clearer or just a mess is a matter of taste, of course.
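If a view feels heavyweight, going through an iterator gives the same early exit for the findSome helper from the question. The sketch below assumes an Iterable input and uses a call counter (calls) purely to show where evaluation stops; FindSomeDemo is a made-up name.

```scala
object FindSomeDemo {
  // Counts calls to the Option-returning function.
  var calls = 0

  def findSome[A, T](xs: Iterable[A])(f: A => Option[T]): Option[T] =
    // Going through an iterator keeps the map lazy, so collectFirst can stop early.
    xs.iterator.map(f).collectFirst { case Some(r) => r }

  // Elements 4 and 5 are never passed to the function.
  val result = findSome(List(1, 2, 3, 4, 5)) { i =>
    calls += 1
    if (i > 2) Some(i.toString) else None
  }
}
```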