Combining the elements of 2 lists - scala

Assume we have two lists :
val l1=List("a","b","c")
val l2 = List("1","2","3")
What I want is List("a1", "b2", "c3"), that is, concatenating the nth element of l1 with the nth element of l2.
A way to achieve it is :
(l1 zip l2).map (c => {c._1+c._2})
I just wonder if one could achieve it with an Applicative. I tried :
(l1 |@| l2) { _ + _ }
but it gives all the combinations :
List(a1, a2, a3, b1, b2, b3, c1, c2, c3)
Any idea?
Thank you
Benoit

You cannot do that with strict lists, so use lazy lists instead, i.e. streams. You have to define the Applicative[Stream] instance as shown below. (You'll find the same idea in the Haskell standard library under the name ZipList.)
scala> val s1 = Stream("a", "b", "c")
s1: scala.collection.immutable.Stream[java.lang.String] = Stream(a, ?)
scala> val s2 = Stream("1", "2", "3")
s2: scala.collection.immutable.Stream[java.lang.String] = Stream(1, ?)
scala> implicit object StreamApplicative extends Applicative[Stream] {
| def pure[A](a: => A) = Stream.continually(a)
| override def apply[A, B](f: Stream[A => B], xs: Stream[A]): Stream[B] = (f, xs).zipped.map(_ apply _)
| }
defined module StreamApplicative
scala> (s1 |@| s2)(_ + _)
res101: scala.collection.immutable.Stream[java.lang.String] = Stream(a1, ?)
scala> .force
res102: scala.collection.immutable.Stream[java.lang.String] = Stream(a1, b2, c3)
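The reason this cannot be done with strict lists is that it is impossible to define a pure on them that satisfies the applicative laws: for a zipping apply, pure(a) has to repeat a indefinitely so that zipping it against a list of any length never truncates that list, and a strict list cannot be infinite.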
As an aside, Scala lets you do this more concisely than the code you have used in OP:
scala> (l1, l2).zipped.map(_ + _)
res103: List[java.lang.String] = List(a1, b2, c3)
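For readers without Scalaz at hand, the zipping applicative described above can be sketched with the standard library alone. This is only an illustration, assuming Scala 2.13 or later (where LazyList replaces Stream); the object name ZipApplicativeSketch and the helpers pure, ap and map2 are made up for this example and are not the Scalaz API.

```scala
object ZipApplicativeSketch {
  // pure repeats the value forever, so zipping against it never truncates the other side.
  def pure[A](a: => A): LazyList[A] = LazyList.continually(a)

  // ap pairs functions with arguments by position, in the spirit of Haskell's ZipList.
  def ap[A, B](fs: LazyList[A => B], xs: LazyList[A]): LazyList[B] =
    fs.zip(xs).map { case (f, x) => f(x) }

  // map2 is derived from pure and ap: lift the curried function, then apply it twice.
  def map2[A, B, C](as: LazyList[A], bs: LazyList[B])(f: (A, B) => C): LazyList[C] =
    ap[B, C](ap[A, B => C](pure(f.curried), as), bs)

  // The example from the question, forced into a strict List at the end.
  val combined: List[String] =
    map2(LazyList("a", "b", "c"), LazyList("1", "2", "3"))(_ + _).toList
}
```

The result is cut to the length of the shorter input, which is why pure has to be infinite for the identity law to hold.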

The answer is that you can't achieve this with an applicative as far as I can see. The applicative for list will apply the function to all combinations, as you have found out. Not great for what you want but awesome for stuff like creating cartesian products.
A slightly less verbose method might use Tuple2W.fold supplied by scalaz:
(l1 zip l2).map (_ fold (_ + _))
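The difference between the two behaviours can also be seen without Scalaz. The sketch below uses only the standard library; the object name ZipVersusProduct is invented for the illustration.

```scala
object ZipVersusProduct {
  val l1 = List("a", "b", "c")
  val l2 = List("1", "2", "3")

  // Nested generators behave like the list applicative: every a meets every b.
  val allCombinations: List[String] = for (a <- l1; b <- l2) yield a + b

  // zip pairs elements by position instead.
  val pairwise: List[String] = l1.zip(l2).map { case (a, b) => a + b }
}
```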

Related

scala collections : map a list and carry some state?

I seem to run into this problem all the time. I want to modify some of the elements in a list, but I need to keep some state as I do it, so map doesn't work.
Here is an example :
scala> val l1 = List("a","b","c","d","e","f","b","c","e","b","a")
l1: List[String] = List(a, b, c, d, e, f, b, c, e, b, a)
I want to change the name of any duplicates. so I want to end up with this:
List(a1, b1, c1, d, e1, f, b2, c2, e2, b3, a2)
Getting the dupes is easy :
scala> val d = l1.diff(l1.distinct).distinct
d: List[String] = List(b, c, e, a)
Now I'm stuck. I made it work by converting d to a HashMap w/ a count, and writing a function to iterate over l1 and update it & the hash before recursing. Which works fine, but looks kinda ugly to me.
But I've always thought there should be a way to do w/ the collection classes.
Here is the rest of my solution which I don't like :
val m = d.map( _ -> 1).toMap
def makeIt(ms: Map[String, Int], ol: Iterator[String], res: List[String] = List[String]()): List[String] = {
  if (!ol.hasNext) return res
  val no = ol.next()
  val (p, nm) = ms.get(no) match {
    case Some(v) => (s"$no$v", ms.updated(no, v + 1))
    case None => (no, ms)
  }
  makeIt(nm, ol, res :+ p)
}
makeIt(m,l1.iterator)
Which gives me what I want
res2: List[String] = List(a1, b1, c1, d, e1, f, b2, c2, e2, b3, a2)
I feel like I want "mapWithState" where I can just pass something along. Like Fold-ish. Maybe it exists and I just haven't found it yet?
Thanks
-------UPDATE---------
@Aluan Haddad's comment pointed me in this direction. It destroys order, which is fine for my case, but the "state" is carried by zipWithIndex. I'm looking for a more general case where the state would require some computation at each element. For this simple case, though, I like it:
l1.groupBy(x => x).values.flatMap( v => {
  if (v.length <= 1) v else {
    v.zipWithIndex.map{ case (s, i) => s"$s${i+1}" }
  }
})
res7: Iterable[String] = List(e1, e2, f, a1, a2, b1, b2, b3, c1, c2, d)
The tricky part is that the "d" and "f" elements get no modification.
This is what I came up with. It's a bit more concise, code wise, but does involve multiple traversals.
val l1: List[String] = List("a","b","c","d","e","f","b","c","e","b","a")
l1.reverse.tails.foldLeft(List[String]()){
  case (res, Nil) => res
  case (res, hd::tl) =>
    val count = tl.count(_ == hd)
    if (count > 0) s"$hd${count+1}" +: res
    else if (res.contains(hd+2)) (hd+1) +: res
    else hd +: res
}
//res0: List[String] = List(a1, b1, c1, d, e1, f, b2, c2, e2, b3, a2)
By using tails, each element, hd, is able to see the future, tl, and the past, res.
A simple but slow version
l1.zipWithIndex.map{ case (elem, i) =>
  if (l1.count(_ == elem) == 1) {
    elem
  } else {
    val n = {l1.take(i+1).count(_ == elem)}
    s"$elem$n"
  }
}
The next version is longer, less pretty, and not functional, but should be faster in the unlikely case that you are processing a very long list.
import scala.collection.mutable

def makeUniq(in: Seq[String]): Seq[String] = {
  // Count occurrence of each element
  val m = mutable.Map.empty[String, Int]
  for (elem <- in) {
    m.update(elem, m.getOrElseUpdate(elem, 0) + 1)
  }
  // Remove elements with a single occurrence
  val dupes = m.filter(_._2 > 1)
  // Apply numbering to duplicate elements, walking backwards so the
  // last occurrence receives the highest number
  in.reverse.map(e => {
    val idx = dupes.get(e) match {
      case Some(i) =>
        dupes.update(e, i - 1)
        i.toString
      case _ =>
        ""
    }
    s"$e$idx"
  }).reverse
}
The code is simpler if you want to apply a count to every element rather than just the non-unique ones.
def makeUniq(in: Seq[String]): Seq[String] = {
  val m = mutable.Map.empty[String, Int]
  in.map{ e =>
    val i = m.getOrElseUpdate(e, 0) + 1
    m.update(e, i)
    s"$e$i"
  }
}
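The "mapWithState" the question asks about can be approximated with foldLeft and an immutable state value. The sketch below is one possible shape, not a standard library method: the names MapWithStateSketch and mapWithState are hypothetical, and the duplicate detection is done in a separate pass beforehand.

```scala
object MapWithStateSketch {
  // Hypothetical helper: maps each element while threading a state value left to right.
  def mapWithState[A, S, B](xs: List[A], init: S)(f: (S, A) => (S, B)): List[B] = {
    val (_, reversed) = xs.foldLeft((init, List.empty[B])) { case ((s, acc), a) =>
      val (next, b) = f(s, a)
      (next, b :: acc) // prepend is O(1); the result is reversed once at the end
    }
    reversed.reverse
  }

  val l1 = List("a", "b", "c", "d", "e", "f", "b", "c", "e", "b", "a")

  // Elements that occur more than once; only these get a number.
  val dupes: Set[String] =
    l1.groupBy(identity).collect { case (k, v) if v.size > 1 => k }.toSet

  // State: how many times each duplicate has been seen so far.
  val renamed: List[String] = mapWithState(l1, Map.empty[String, Int]) { (seen, e) =>
    if (dupes(e)) {
      val n = seen.getOrElse(e, 0) + 1
      (seen.updated(e, n), s"$e$n")
    } else (seen, e)
  }
}
```

Unlike the groupBy version, this keeps the original order, and the state can be any value computed from the elements seen so far.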

What is the inverse of intercalate, and how to implement it?

This question discusses how to interleave two lists in an alternating fashion, i.e. intercalate them.
What is the inverse of "intercalate" called?
Is there an idiomatic way to implement this in Scala?
The topic is discussed on this Haskell IRC session.
Possibilities include "deintercalate", "extracalate", "ubercalate", "outercalate", and "chocolate" ;-)
Assuming we go for "extracalate", it can be implemented as a fold:
def extracalate[A](a: List[A]) =
a.foldRight((List[A](), List[A]())){ case (b, (a1,a2)) => (b :: a2, a1) }
For example:
val mary = List("Mary", "had", "a", "little", "lamb")
extracalate(mary)
//> (List(Mary, a, lamb),List(had, little))
Note that the original lists can only be reconstructed if either:
the input lists were the same length, or
the first list was 1 longer than the second list
The second case actually turns out to be useful for the geohashing algorithm, where the latitude bits and longitude bits are intercalated, but there may be an odd number of bits.
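The reconstruction condition can be checked with a small round trip. The interleave function below is a hypothetical stand-in for the operation in the linked question (it alternates elements, starting with the first list, and appends whatever is left over); extracalate is repeated so the sketch is self-contained.

```scala
object RoundTripSketch {
  def extracalate[A](a: List[A]): (List[A], List[A]) =
    a.foldRight((List.empty[A], List.empty[A])) { case (b, (a1, a2)) => (b :: a2, a1) }

  // Hypothetical interleave: take from xs, then swap the roles of the two lists.
  def interleave[A](xs: List[A], ys: List[A]): List[A] = (xs, ys) match {
    case (x :: xt, _) => x :: interleave(ys, xt)
    case (Nil, _)     => ys
  }

  val first  = List("Mary", "a", "lamb")
  val second = List("had", "little")

  // First list is one longer than the second: the round trip restores both.
  val restored = extracalate(interleave(first, second))

  // Second list is much longer: the original split is lost.
  val lost = extracalate(interleave(List(1), List(2, 3, 4)))
}
```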
Note also that the definition of intercalate in the linked question is different from the definition in the Haskell libraries, which intersperses a list in between a list of lists!
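For comparison, the Haskell-style meaning of intercalate could be written in Scala roughly as follows. This is a sketch of the behaviour, not a port of the Haskell source; the object name IntercalateSketch is made up.

```scala
object IntercalateSketch {
  // Haskell-style intercalate: place sep between the inner lists, then concatenate.
  def intercalate[A](sep: List[A], xss: List[List[A]]): List[A] = xss match {
    case Nil          => Nil
    case head :: tail => head ++ tail.flatMap(sep ++ _)
  }

  val joined = intercalate(List(0), List(List(1, 2), List(3), List(4, 5)))
}
```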
Update: As for any fold, we supply a starting value and a function to apply to each value of the input list. This function modifies the starting value and passes it to the next step of the fold.
Here, we start with a pair of empty output lists: (List[A](), List[A]())
Then for each element in the input list, we add it onto the front of one of the output lists using cons (::). However, we also swap the order of the two output lists each time the function is invoked: (a1, a2) becomes (b :: a2, a1). This divides the input list between the two output lists in alternating fashion. Because it's a right fold, we start at the end of the input list, which is necessary to get each output list in the correct order. Proceeding from the starting value to the final value, we would get:
([], [])
([lamb], [])
([little],[lamb])
([a, lamb],[little])
([had, little],[a, lamb])
([Mary, a, lamb],[had, little])
Also, using standard methods
val mary = List("Mary", "had", "a", "little", "lamb")
//> mary : List[String] = List(Mary, had, a, little, lamb)
val (f, s) = mary.zipWithIndex.partition(_._2 % 2 == 0)
//> f : List[(String, Int)] = List((Mary,0), (a,2), (lamb,4))
//| s : List[(String, Int)] = List((had,1), (little,3))
(f.unzip._1, s.unzip._1)
//> res0: (List[String], List[String]) = (List(Mary, a, lamb),List(had, little))
I don't really recommend it, though: the fold will beat it hands down on performance.
Skinning the cat another way
val g = mary.zipWithIndex.groupBy(_._2 % 2)
//> g : scala.collection.immutable.Map[Int,List[(String, Int)]] = Map(1 -> List
//| ((had,1), (little,3)), 0 -> List((Mary,0), (a,2), (lamb,4)))
(g(0).unzip._1, g(1).unzip._1)
//> res1: (List[String], List[String]) = (List(Mary, a, lamb),List(had, little))
Also going to be slow
I think it's inferior to @DNA's answer as it's more code and it requires passing through the list twice.
scala> list
res27: List[Int] = List(1, 2, 3, 4, 5)
scala> val first = list.zipWithIndex.filter( x => x._2 % 2 == 0).map(x => x._1)
first: List[Int] = List(1, 3, 5)
scala> val second = list.zipWithIndex.filter( x => x._2 % 2 == 1).map(x => x._1)
second: List[Int] = List(2, 4)
scala> (first, second)
res28: (List[Int], List[Int]) = (List(1, 3, 5),List(2, 4))

In Scala, how do you combine the sequence of elements of the same List?

How do I transform this list:
List("a", "b", "c", "d")
into a list with the following elements
List(List("a"), List("a", "b"), List("a", "b", "c"), List("a", "b", "c", "d"))
My requirement is to build a list of relative directory paths from a list containing directory names, where a is the root directory and b is a leaf of a i.e. a/b
Example:
"fruit", "tropical", "mango"
transforms to:
"fruit", "fruit/tropical", "fruit/tropical/mango"
Edit: I can do this iteratively, but I'm looking for a functional solution.
You can use inits to achieve something similar to what you are looking for:
val xs = List("a", "b", "c", "d")
val ys = xs.inits.toList.reverse.drop(1)
Explanation:
xs.inits.toList will give you this result:
List(List(a, b, c, d), List(a, b, c), List(a, b), List(a), List())
Now you can reverse it and drop the first element and get this:
List(List(a), List(a, b), List(a, b, c), List(a, b, c, d))
Then, just make a String of results you got:
ys.map(_.mkString("/")) // results in List(a, a/b, a/b/c, a/b/c/d)
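If only the joined path strings are needed, scanLeft is another option, since it emits every intermediate accumulator. The sketch below assumes a non-empty input list; the object name ScanPathsSketch is invented for the example.

```scala
object ScanPathsSketch {
  val basket = List("fruit", "tropical", "mango")

  // Seed with the root directory, then append one more segment per step.
  // Assumption: basket is non-empty, otherwise head would throw.
  val paths: List[String] =
    basket.tail.scanLeft(basket.head)((acc, dir) => s"$acc/$dir")
}
```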
I think you should probably use inits (I would avoid relying on the order of the returned elements, although it is documented that the last element is the empty one):
val basket = List("fruit", "tropical", "mango")
basket.inits.toList filterNot (_.isEmpty) sortBy (_.length) map (_ mkString "/")
However, if you want an approach that doesn't use that library function, you could roll your own recursive function:
def paths(elems: List[String]): List[List[String]] = {
  elems match {
    case Nil => Nil
    case e :: es => List(e) :: (paths(es) map (e :: _))
  }
}
paths(basket) map (_ mkString "/")
This isn't tail-recursive, so it will blow the stack if your path is many elements deep. You could make it tail-recursive using an accumulating parameter (actually, two accumulating parameters is the best I can do):
@annotation.tailrec
final def paths(elems: List[String], path: List[String], acc: List[List[String]]): List[List[String]] = {
  elems match {
    case Nil => acc
    case e :: es => paths(es, path :+ e, acc :+ (path :+ e))
  }
}
paths(basket, Nil, Nil) map (_ mkString "/")
This solution uses the :+ operator (append element) on List a lot though, so it's not optimal with respect to time complexity. I'll leave fixing that as an exercise for the reader (hint: you would probably want to store the accumulating parameters in reverse order).
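One possible way to follow the hint is sketched below: both accumulators are kept in reverse so that each step only prepends. The names PathsSketch, pathsRev, revPath and revAcc are hypothetical, and this is just one reading of the exercise.

```scala
import scala.annotation.tailrec

object PathsSketch {
  @tailrec
  def pathsRev(elems: List[String],
               revPath: List[String],
               revAcc: List[List[String]]): List[List[String]] =
    elems match {
      case Nil => revAcc.reverse // restore shortest-first order once
      case e :: es =>
        val next = e :: revPath  // current path, stored deepest-first
        pathsRev(es, next, next.reverse :: revAcc)
    }

  val basket = List("fruit", "tropical", "mango")
  val result = pathsRev(basket, Nil, Nil).map(_.mkString("/"))
}
```

Each emitted path still has to be reversed, but that cost is proportional to the size of the output itself, so the appends to the accumulators no longer add extra work on top.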

What's the relation of fold on Option, Either etc and fold on Traversable?

Scalaz provides a method named fold for various ADTs such as Boolean, Option[_], Validation[_, _], Either[_, _] etc. This method basically takes functions corresponding to all possible cases for that given ADT. In other words, a pattern match shown below:
x match {
case Case1(a, b, c) => f(a, b, c)
case Case2(a, b) => g(a, b)
.
.
case CaseN => z
}
is equivalent to:
x.fold(f, g, ..., z)
Some examples:
scala> (9 == 8).fold("foo", "bar")
res0: java.lang.String = bar
scala> 5.some.fold(2 *, 2)
res1: Int = 10
scala> 5.left[String].fold(2 +, "[" +)
res2: Any = 7
scala> 5.fail[String].fold(2 +, "[" +)
res6: Any = 7
At the same time, there is an operation with the same name for the Traversable[_] types, which traverses over the collection performing certain operation on its elements, and accumulating the result value. For example,
scala> List(2, 90, 11).foldLeft("Contents: ")(_ + _.toString + " ")
res9: java.lang.String = "Contents: 2 90 11 "
scala> List(2, 90, 11).fold(0)(_ + _)
res10: Int = 103
scala> List(2, 90, 11).fold(1)(_ * _)
res11: Int = 1980
Why are these two operations identified with the same name - fold/catamorphism? I fail to see any similarities/relation between the two. What am I missing?
I think the problem you are having is that you see these things based on their implementation, not their types. Consider this simple representation of types:
List[A] = Nil
| Cons head: A tail: List[A]
Option[A] = None
| Some el: A
Now, let's consider Option's fold:
fold[B] = (noneCase: => B, someCase: A => B) => B
So, on Option, it reduces every possible case to some value in B, and returns that. Now, let's see the same thing for List:
fold[B] = (nilCase: => B, consCase: (A, List[A]) => B) => B
Note, however, that we have a recursive call there, on List[A]. We have to fold that somehow, but we know fold[B] on a List[A] will always return B, so we can rewrite it like this:
fold[B] = (nilCase: => B, consCase: (A, B) => B) => B
In other words, we replaced List[A] by B, because folding it will always return a B, given the type signature of fold. Now, let's see Scala's (use case) type signature for foldRight:
foldRight[B](z: B)(f: (A, B) ⇒ B): B
Say, does that remind you of something?
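The two signatures above can be turned into runnable code to make the parallel concrete. This is a sketch: foldOption and foldList are hypothetical names written for the illustration, not library methods.

```scala
object CatamorphismSketch {
  // One argument per constructor of Option: None and Some.
  def foldOption[A, B](o: Option[A])(noneCase: => B, someCase: A => B): B = o match {
    case None    => noneCase
    case Some(a) => someCase(a)
  }

  // One argument per constructor of List: Nil and Cons.
  // The recursive tail has already been folded down to a B.
  def foldList[A, B](xs: List[A])(nilCase: => B, consCase: (A, B) => B): B = xs match {
    case Nil    => nilCase
    case h :: t => consCase(h, foldList(t)(nilCase, consCase))
  }

  val doubled = foldOption(Some(5): Option[Int])(2, (x: Int) => 2 * x)
  val sum     = foldList(List(2, 90, 11))(0, (a: Int, b: Int) => a + b)

  // foldList agrees with the standard library's foldRight.
  val viaFoldRight = List(2, 90, 11).foldRight(0)(_ + _)
}
```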
If you think of "folding" as "condensing all the values in a container through an operation, with a seed value", and you think of an Option as a container that can have at most one value, then this starts to make sense.
In fact, foldLeft has the same signature and gives you exactly the same results if you use it on an empty list vs None, and on a list with only one element vs Some:
scala> val opt : Option[Int] = Some(10)
opt: Option[Int] = Some(10)
scala> val lst : List[Int] = List(10)
lst: List[Int] = List(10)
scala> opt.foldLeft(1)((a, b) => a + b)
res11: Int = 11
scala> lst.foldLeft(1)((a, b) => a + b)
res12: Int = 11
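fold is also available on both List and Option in the Scala standard library. In the Scala version used here, the call on Option goes through the implicit conversion from Option to Iterable, so both use the collection fold with the same signature (since Scala 2.10, Option defines its own fold(ifEmpty)(f), which has a different signature, so the Option call below no longer compiles as written). And again, you get the same results on a singleton list as on Some: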
scala> opt.fold(1)((a, b) => a * b)
res25: Int = 10
scala> lst.fold(1)((a, b) => a * b)
res26: Int = 10
I'm not 100% sure about the fold from Scalaz on Option/Either/etc, you raise a good point there. It seems to have quite a different signature and operation from the "folding" I'm used to.
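For what it's worth, the standard library in Scala 2.10 and later also has case-by-case folds of this kind: Option.fold(ifEmpty)(f) and Either.fold(fa, fb). The sketch below uses those instead of the Scalaz versions from the question; the object name StdlibFoldSketch is made up.

```scala
object StdlibFoldSketch {
  val some: Option[Int] = Some(5)
  val none: Option[Int] = None

  // One argument for the empty case, one function for the Some case.
  val fromSome = some.fold(2)(_ * 2)
  val fromNone = none.fold(2)(_ * 2)

  // One function per side of the Either.
  val e: Either[Int, String] = Left(5)
  val fromLeft = e.fold(_ + 2, s => s.length)
}
```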

Cartesian product of two lists

Given a map where a digit is associated to several characters
scala> val conversion = Map("0" -> List("A", "B"), "1" -> List("C", "D"))
conversion: scala.collection.immutable.Map[java.lang.String,List[java.lang.String]] =
Map(0 -> List(A, B), 1 -> List(C, D))
I want to generate all possible character sequences based on a sequence of digits. Examples:
"00" -> List("AA", "AB", "BA", "BB")
"01" -> List("AC", "AD", "BC", "BD")
I can do this with for comprehensions
scala> val number = "011"
number: java.lang.String = 011
Create a sequence of possible characters per index
scala> val values = number map { case c => conversion(c.toString) }
values: scala.collection.immutable.IndexedSeq[List[java.lang.String]] =
Vector(List(A, B), List(C, D), List(C, D))
Generate all the possible character sequences
scala> for {
| a <- values(0)
| b <- values(1)
| c <- values(2)
| } yield a+b+c
res13: List[java.lang.String] = List(ACC, ACD, ADC, ADD, BCC, BCD, BDC, BDD)
Here things get ugly and it will only work for sequences of three digits. Is there any way to achieve the same result for any sequence length?
The following suggestion does not use a fixed-size for-comprehension, because, as you noticed, that would tie you to a certain length of your cartesian product. Instead it recurses over the list of lists:
scala> def cartesianProduct[T](xss: List[List[T]]): List[List[T]] = xss match {
| case Nil => List(Nil)
| case h :: t => for(xh <- h; xt <- cartesianProduct(t)) yield xh :: xt
| }
cartesianProduct: [T](xss: List[List[T]])List[List[T]]
scala> val conversion = Map('0' -> List("A", "B"), '1' -> List("C", "D"))
conversion: scala.collection.immutable.Map[Char,List[java.lang.String]] = Map(0 -> List(A, B), 1 -> List(C, D))
scala> cartesianProduct("01".map(conversion).toList)
res9: List[List[java.lang.String]] = List(List(A, C), List(A, D), List(B, C), List(B, D))
Why not tail-recursive?
Note that the above recursive function is not tail-recursive. This isn't a problem, as xss will be short unless you have a lot of singleton lists in xss: the size of the result grows exponentially with the number of non-singleton elements of xss, so the result becomes too large long before the recursion depth does.
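If the recursion is still a concern, the same product can be expressed as a right fold over the list of lists. This is a sketch of an alternative formulation, applied to the three-digit example from the question; the object name CartesianSketch is hypothetical.

```scala
object CartesianSketch {
  // Start from the product of zero lists (a single empty combination)
  // and extend every existing combination with each element of xs.
  def cartesianProduct[T](xss: List[List[T]]): List[List[T]] =
    xss.foldRight(List(List.empty[T])) { (xs, acc) =>
      for (x <- xs; rest <- acc) yield x :: rest
    }

  val conversion = Map('0' -> List("A", "B"), '1' -> List("C", "D"))

  // Works for any number of digits, not just three.
  val result: List[String] =
    cartesianProduct("011".toList.map(conversion)).map(_.mkString)
}
```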
I could come up with this:
val conversion = Map('0' -> Seq("A", "B"), '1' -> Seq("C", "D"))
def permut(str: Seq[Char]): Seq[String] = str match {
  case Seq() => Seq.empty
  case Seq(c) => conversion(c)
  case Seq(head, tail @ _*) =>
    val t = permut(tail)
    conversion(head).flatMap(pre => t.map(pre + _))
}
permut("011")
I just did that as follows and it works
def cross(a: IndexedSeq[Tree], b: IndexedSeq[Tree]) = {
  a.map(p => b.map(o => (p, o))).flatten
}
Ignore the Tree type I am dealing with; it works for arbitrary collections too.
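A version that does not mention Tree at all might look like the sketch below. The type parameters and the object name CrossSketch are assumptions added for the illustration.

```scala
object CrossSketch {
  // Pair every element of as with every element of bs.
  def cross[A, B](as: Seq[A], bs: Seq[B]): Seq[(A, B)] =
    for (a <- as; b <- bs) yield (a, b)

  val pairs = cross(Seq(1, 2), Seq("x", "y"))
}
```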