Scala: Find and update one element in a list - scala

I am trying to find an elegant way to do:
val l = List(1,2,3)
val (item, idx) = l.zipWithIndex.find(predicate)
val updatedItem = updating(item)
l.update(idx, updatedItem)
Can I do all in one operation ? Find the item, if it exist replace with updated value and keep it in place.
I could do:
l.map{ i =>
if (predicate(i)) {
updating(i)
} else {
i
}
}
but that's pretty ugly.
The other complexity is the fact that I want to update only the first element which match predicate .
Edit: Attempt:
implicit class UpdateList[A](l: List[A]) {
def filterMap(p: A => Boolean)(update: A => A): List[A] = {
l.map(a => if (p(a)) update(a) else a)
}
def updateFirst(p: A => Boolean)(update: A => A): List[A] = {
val found = l.zipWithIndex.find { case (item, _) => p(item) }
found match {
case Some((item, idx)) => l.updated(idx, update(item))
case None => l
}
}
}

I don't know any way to make this in one pass of the collection without using a mutable variable. With two passes you can do it using foldLeft as in:
def updateFirst[A](list:List[A])(predicate:A => Boolean, newValue:A):List[A] = {
list.foldLeft((List.empty[A], predicate))((acc, it) => {acc match {
case (nl,pr) => if (pr(it)) (newValue::nl, _ => false) else (it::nl, pr)
}})._1.reverse
}
The idea is that foldLeft allows passing additional data through iteration. In this particular implementation I change the predicate to the fixed one that always returns false. Unfortunately you can't build a List from the head in an efficient way so this requires another pass for reverse.
I believe it is obvious how to do it using a combination of map and var
Note: performance of the List.map is the same as of a single pass over the list only because internally the standard library is mutable. Particularly the cons class :: is declared as
final case class ::[B](override val head: B, private[scala] var tl: List[B]) extends List[B] {
so tl is actually a var and this is exploited by the map implementation to build a list from the head in an efficient way. The field is private[scala] so you can't use the same trick from outside of the standard library. Unfortunately I don't see any other API call that allows to use this feature to reduce the complexity of your problem to a single pass.

You can avoid .zipWithIndex() by using .indexWhere().
To improve complexity, use Vector so that l(idx) becomes effectively constant time.
val l = Vector(1,2,3)
val idx = l.indexWhere(predicate)
val updatedItem = updating(l(idx))
l.updated(idx, updatedItem)
Reason for using scala.collection.immutable.Vector rather than List:
Scala's List is a linked list, which means data are access in O(n) time. Scala's Vector is indexed, meaning data can be read from any point in effectively constant time.
You may also consider mutable collections if you're modifying just one element in a very large collection.
https://docs.scala-lang.org/overviews/collections/performance-characteristics.html

Related

groupBy on List as LinkedHashMap instead of Map

I am processing XML using scala, and I am converting the XML into my own data structures. Currently, I am using plain Map instances to hold (sub-)elements, however, the order of elements from the XML gets lost this way, and I cannot reproduce the original XML.
Therefore, I want to use LinkedHashMap instances instead of Map, however I am using groupBy on the list of nodes, which creates a Map:
For example:
def parse(n:Node): Unit =
{
val leaves:Map[String, Seq[XmlItem]] =
n.child
.filter(node => { ... })
.groupBy(_.label)
.map((tuple:Tuple2[String, Seq[Node]]) =>
{
val items = tuple._2.map(node =>
{
val attributes = ...
if (node.text.nonEmpty)
XmlItem(Some(node.text), attributes)
else
XmlItem(None, attributes)
})
(tuple._1, items)
})
...
}
In this example, I want leaves to be of type LinkedHashMap to retain the order of n.child. How can I achieve this?
Note: I am grouping by label/tagname because elements can occur multiple times, and for each label/tagname, I keep a list of elements in my data structures.
Solution
As answered by #jwvh I am using foldLeft as a substitution for groupBy. Also, I decided to go with LinkedHashMap instead of ListMap.
def parse(n:Node): Unit =
{
val leaves:mutable.LinkedHashMap[String, Seq[XmlItem]] =
n.child
.filter(node => { ... })
.foldLeft(mutable.LinkedHashMap.empty[String, Seq[Node]])((m, sn) =>
{
m.update(sn.label, m.getOrElse(sn.label, Seq.empty[Node]) ++ Seq(sn))
m
})
.map((tuple:Tuple2[String, Seq[Node]]) =>
{
val items = tuple._2.map(node =>
{
val attributes = ...
if (node.text.nonEmpty)
XmlItem(Some(node.text), attributes)
else
XmlItem(None, attributes)
})
(tuple._1, items)
})
To get the rough equivalent to .groupBy() in a ListMap you could fold over your collection. The problem is that ListMap preserves the order of elements as they were appended, not as they were encountered.
import collection.immutable.ListMap
List('a','b','a','c').foldLeft(ListMap.empty[Char,Seq[Char]]){
case (lm,c) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}
//res0: ListMap[Char,Seq[Char]] = ListMap(b -> Seq(b), a -> Seq(a, a), c -> Seq(c))
To fix this you can foldRight instead of foldLeft. The result is the original order of elements as encountered (scanning left to right) but in reverse.
List('a','b','a','c').foldRight(ListMap.empty[Char,Seq[Char]]){
case (c,lm) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}
//res1: ListMap[Char,Seq[Char]] = ListMap(c -> Seq(c), b -> Seq(b), a -> Seq(a, a))
This isn't necessarily a bad thing since a ListMap is more efficient with last and init ops, O(1), than it is with head and tail ops, O(n).
To process the ListMap in the original left-to-right order you could .toList and .reverse it.
List('a','b','a','c').foldRight(ListMap.empty[Char,Seq[Char]]){
case (c,lm) => lm.updated(c, c +: lm.getOrElse(c, Seq()))
}.toList.reverse
//res2: List[(Char, Seq[Char])] = List((a,Seq(a, a)), (b,Seq(b)), (c,Seq(c)))
Purely immutable solution would be quite slow. So I'd go with
import collection.mutable.{ArrayBuffer, LinkedHashMap}
implicit class ExtraTraversableOps[A](seq: collection.TraversableOnce[A]) {
def orderedGroupBy[B](f: A => B): collection.Map[B, collection.Seq[A]] = {
val map = LinkedHashMap.empty[B, ArrayBuffer[A]]
for (x <- seq) {
val key = f(x)
map.getOrElseUpdate(key, ArrayBuffer.empty) += x
}
map
}
To use, just change .groupBy in your code to .orderedGroupBy.
The returned Map can't be mutated using this type (though it can be cast to mutable.Map or to mutable.LinkedHashMap), so it's safe enough for most purposes (and you could create a ListMap from it at the end if really desired).

Complexity of linked-lists concatenation

I'm still pretty much new to functional programming and experimenting with algebraic data types. I implemented LinkedList as follows:
sealed abstract class LinkedList[+T]{
def flatMap[T2](f: T => LinkedList[T2]) = LinkedList.flatMap(this)(f)
def foreach(f: T => Unit): Unit = LinkedList.foreach(this)(f)
}
final case class Next[T](t: T, linkedList: LinkedList[T]) extends LinkedList[T]
case object Stop extends LinkedList[Nothing]
object LinkedList{
private def connect[T](left: LinkedList[T], right: LinkedList[T]): LinkedList[T] = left match {
case Stop => right
case Next(t, l) => Next(t, connect(l, right))
}
private def flatMap[T, T2](list: LinkedList[T])(f: T => LinkedList[T2]): LinkedList[T2] = list match {
case Stop => Stop
case Next(t, l) => connect(f(t), flatMap(l)(f))
}
private def foreach[T](ll: LinkedList[T])(f: T => Unit): Unit = ll match {
case Stop => ()
case Next(t, l) =>
f(t)
foreach(l)(f)
}
}
The thing is that I wanted to measure complexity of flatMap operation.
The complexity of flatMap is O(n) where n is the length of flatMaped list (as far as I could prove by induction).
The question is how to design the list that way so the concatenation has constant-time complexity? What kind of class extends LinkedList we should design to achieve that? Or maybe some another way?
One approach would be to keep track of the end of your list, and make your list nodes mutable so you can just modify the end to point to the start of the next list.
I don't see a way to do it with immutable nodes, since modifying what follows a tail node requires modifying everything before it. To get constant-time concatenation with immutable data structures, you need something other than a simple linked list.

Scala list foreach, update list while in foreach loop

I just started working with scala and am trying to get used to the language. I was wondering if the following is possible:
I have a list of Instruction objects that I am looping over with the foreach method. Am I able to add elements to my Instruction list while I am looping over it? Here is a code example to explain what I want:
instructions.zipWithIndex.foreach { case (value, index) =>
value match {
case WhileStmt() => {
---> Here I want to add elements to the instructions list.
}
case IfStmt() => {
...
}
_ => {
...
}
Idiomatic way would be something like this for rather complex iteration and replacement logic:
#tailrec
def idiomaticWay(list: List[Instruction],
acc: List[Instruction] = List.empty): List[Instruction] =
list match {
case WhileStmt() :: tail =>
// add element to head of acc
idiomaticWay(tail, CherryOnTop :: acc)
case IfStmt() :: tail =>
// add nothing here
idiomaticWay(tail, list.head :: acc)
case Nil => acc
}
val updatedList = idiomaticWay(List(WhileStmt(), IfStmt()))
println(updatedList) // List(IfStmt(), CherryOnTop)
This solution works with immutable list, returns immutable list which has different values in it according to your logic.
If you want to ultimately hack around (add, remove, etc) you could use Java ListIterator class that would allow you to do all operations mentioned above:
def hackWay(list: util.List[Instruction]): Unit = {
val iterator = list.listIterator()
while(iterator.hasNext) {
iterator.next() match {
case WhileStmt() =>
iterator.set(CherryOnTop)
case IfStmt() => // do nothing here
}
}
}
import collection.JavaConverters._
val instructions = new util.ArrayList[Instruction](List(WhileStmt(), IfStmt()).asJava)
hackWay(instructions)
println(instructions.asScala) // Buffer(CherryOnTop, IfStmt())
However in the second case you do not need scala :( So my advise would be to stick to immutable data structures in scala.

Scala: Grouping list of tuples

I need to group list of tuples in some unique way.
For example, if I have
val l = List((1,2,3),(4,2,5),(2,3,3),(10,3,2))
Then I should group the list with second value and map with the set of first value
So the result should be
Map(2 -> Set(1,4), 3 -> Set(2,10))
By so far, I came up with this
l groupBy { p => p._2 } mapValues { v => (v map { vv => vv._1 }).toSet }
This works, but I believe there should be a much more efficient way...
This is similar to this question. Basically, as #serejja said, your approach is correct and also the most concise one. You could use collection.breakOut as builder factory argument to the last map and thereby save the additional iteration to get the Set type:
l.groupBy(_._2).mapValues(_.map(_._1)(collection.breakOut): Set[Int])
You shouldn't probably go beyond this, unless you really need to squeeze the performance.
Otherwise, this is how a general toMultiMap function could look like which allows you to control the values collection type:
import collection.generic.CanBuildFrom
import collection.mutable
def toMultiMap[A, K, V, Values](xs: TraversableOnce[A])
(key: A => K)(value: A => V)
(implicit cbfv: CanBuildFrom[Nothing, V, Values]): Map[K, Values] = {
val b = mutable.Map.empty[K, mutable.Builder[V, Values]]
xs.foreach { elem =>
b.getOrElseUpdate(key(elem), cbfv()) += value(elem)
}
b.map { case (k, vb) => (k, vb.result()) } (collection.breakOut)
}
What it does is, it uses a mutable Map during building stage, and values gathered in a mutable Builder first (the builder is provided by the CanBuildFrom instance). After the iteration over all input elements has completed, that mutable map of builder values is converted into an immutable map of the values collection type (again using the collection.breakOut trick to get the desired output collection straight away).
Ex:
val l = List((1,2,3),(4,2,5),(2,3,3),(10,3,2))
val v = toMultiMap(l)(_._2)(_._1) // uses Vector for values
val s: Map[Int, Set[Int] = toMultiMap(l)(_._2)(_._1) // uses Set for values
So your annotated result type directs the type inference of the values type. If you do not annotate the result, Scala will pick Vector as default collection type.

Scala: Generalised method to find match and return match dependant values in collection

I wish to find a match within a List and return values dependant on the match. The CollectFirst works well for matching on the elements of the collection but in this case I want to match on the member swEl of the element rather than on the element itself.
abstract class CanvNode (var swElI: Either[CSplit, VistaT])
{
private[this] var _swEl: Either[CSplit, VistaT] = swElI
def member = _swEl
def member_= (value: Either[CSplit, VistaT] ){ _swEl = value; attach}
def attach: Unit
attach
def findVista(origV: VistaIn): Option[Tuple2[CanvNode,VistaT]] = member match
{
case Right(v) if (v == origV) => Option(this, v)
case _ => None
}
}
def nodes(): List[CanvNode] = topNode :: splits.map(i => List(i.n1, i.n2)).flatten
//Is there a better way of implementing this?
val temp: Option[Tuple2[CanvNode, VistaT]] =
nodes.map(i => i.findVista(origV)).collectFirst{case Some (r) => r}
Do I need a View on that, or will the collectFirst method ensure the collection is only created as needed?
It strikes me that this must be a fairly general pattern. Another example could be if one had a List member of the main List's elements and wanted to return the fourth element if it had one. Is there a standard method I can call? Failing that I can create the following:
implicit class TraversableOnceRichClass[A](n: TraversableOnce[A])
{
def findSome[T](f: (A) => Option[T]) = n.map(f(_)).collectFirst{case Some (r) => r}
}
And then I can replace the above with:
val temp: Option[Tuple2[CanvNode, VistaT]] =
nodes.findSome(i => i.findVista(origV))
This uses implicit classes from 2.10, for pre 2.10 use:
class TraversableOnceRichClass[A](n: TraversableOnce[A])
{
def findSome[T](f: (A) => Option[T]) = n.map(f(_)).collectFirst{case Some (r) => r}
}
implicit final def TraversableOnceRichClass[A](n: List[A]):
TraversableOnceRichClass[A] = new TraversableOnceRichClass(n)
As an introductory side node: The operation you're describing (return the first Some if one exists, and None otherwise) is the sum of a collection of Options under the "first" monoid instance for Option. So for example, with Scalaz 6:
scala> Stream(None, None, Some("a"), None, Some("b")).map(_.fst).asMA.sum
res0: scalaz.FirstOption[java.lang.String] = Some(a)
Alternatively you could put something like this in scope:
implicit def optionFirstMonoid[A] = new Monoid[Option[A]] {
val zero = None
def append(a: Option[A], b: => Option[A]) = a orElse b
}
And skip the .map(_.fst) part. Unfortunately neither of these approaches is appropriately lazy in Scalaz, so the entire stream will be evaluated (unlike Haskell, where mconcat . map (First . Just) $ [1..] is just fine, for example).
Edit: As a side note to this side note: apparently Scalaz does provide a sumr that's appropriately lazy (for streams—none of these approaches will work on a view). So for example you can write this:
Stream.from(1).map(Some(_).fst).sumr
And not wait forever for your answer, just like in the Haskell version.
But assuming that we're sticking with the standard library, instead of this:
n.map(f(_)).collectFirst{ case Some(r) => r }
I'd write the following, which is more or less equivalent, and arguably more idiomatic:
n.flatMap(f(_)).headOption
For example, suppose we have a list of integers.
val xs = List(1, 2, 3, 4, 5)
We can make this lazy and map a function with a side effect over it to show us when its elements are accessed:
val ys = xs.view.map { i => println(i); i }
Now we can flatMap an Option-returning function over the resulting collection and use headOption to (safely) return the first element, if it exists:
scala> ys.flatMap(i => if (i > 2) Some(i.toString) else None).headOption
1
2
3
res0: Option[java.lang.String] = Some(3)
So clearly this stops when we hit a non-empty value, as desired. And yes, you'll definitely need a view if your original collection is strict, since otherwise headOption (or collectFirst) can't reach back and stop the flatMap (or map) that precedes it.
In your case you can skip findVista and get even more concise with something like this:
val temp = nodes.view.flatMap(
node => node.right.toOption.filter(_ == origV).map(node -> _)
).headOption
Whether you find this clearer or just a mess is a matter of taste, of course.