Map versus FlatMap on String - scala

Listening to the Collections lecture from Functional Programming Principles in Scala, I saw this example:
scala> val s = "Hello World"
scala> s.flatMap(c => ("." + c)) // prepend each element with a period
res5: String = .H.e.l.l.o. .W.o.r.l.d
Then, I was curious why Mr. Odersky didn't use a map here. But, when I tried map, I got a different result than I expected.
scala> s.map(c => ("." + c))
res8: scala.collection.immutable.IndexedSeq[String] = Vector(.H, .e, .l, .l, .o, ". ", .W, .o, .r, .l, .d)
I expected that above call to return a String, since I'm map-ing, i.e. applying a function to each item in the "sequence," and then returning a new "sequence."
However, I could perform a map rather than flatMap on a List[Char]:
scala> val sList = s.toList
sList: List[Char] = List(H, e, l, l, o, , W, o, r, l, d)
scala> sList.map(c => "." + c)
res9: List[String] = List(.H, .e, .l, .l, .o, ". ", .W, .o, .r, .l, .d)
Why was an IndexedSeq[String] the return type of calling map on the String?

The reason for this behavior is that, in order to apply "map" to a String, Scala treats the string as a sequence of chars (IndexedSeq[Char]). The operation is applied to each element of that sequence, and since your function turns every Char into a String, the result is an IndexedSeq[String]. Since Scala treated the string as a sequence to apply map, that is what map returns.
flatMap then simply invokes flatten on that sequence afterwards, which then "converts" it back to a String.
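A minimal sketch of that relationship, assuming a reasonably recent Scala 2 or Scala 3 compiler. The object name DotPrefixDemo is made up for illustration. It checks that flatMap with a Char => String function gives the same String you get by mapping first and then gluing the pieces together with mkString:

```scala
object DotPrefixDemo {
  val s = "Hello World"

  // map with a Char => String function cannot build a String (a String holds Chars),
  // so the result is a sequence with one small String per character.
  val mapped: IndexedSeq[String] = s.map(c => "." + c)

  // flatMap concatenates those small strings into a single String.
  val flatMapped: String = s.flatMap(c => "." + c)   // ".H.e.l.l.o. .W.o.r.l.d"

  // Joining the mapped pieces by hand gives the same String.
  val joined: String = mapped.mkString
}
```

Here mkString stands in for the "flatten" step, because it is guaranteed to produce a String regardless of the Scala version.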

You also have an interesting "collection of Scala flatMap examples", the first of which illustrates that difference between flatMap and map:
scala> val fruits = Seq("apple", "banana", "orange")
fruits: Seq[java.lang.String] = List(apple, banana, orange)
scala> fruits.map(_.toUpperCase)
res0: Seq[java.lang.String] = List(APPLE, BANANA, ORANGE)
scala> fruits.flatMap(_.toUpperCase)
res1: Seq[Char] = List(A, P, P, L, E, B, A, N, A, N, A, O, R, A, N, G, E)
Quite a difference, right?
Because flatMap treats a String as a sequence of Char, it flattens the resulting list of strings into a sequence of characters (Seq[Char]).
flatMap is a combination of map and flatten, so it first runs map on the sequence, then runs flatten, giving the result shown.
You can see this by running map and then flatten yourself:
scala> val mapResult = fruits.map(_.toUpperCase)
mapResult: Seq[String] = List(APPLE, BANANA, ORANGE)
scala> val flattenResult = mapResult.flatten
flattenResult: Seq[Char] = List(A, P, P, L, E, B, A, N, A, N, A, O, R, A, N, G, E)

Your map function c => ("." + c) takes a char and returns a String. It's like taking a List and returning a List of Lists. flatMap flattens that back.
If you would return a char instead of a String you wouldn't need the result flattened, e.g. "abc".map(c => (c + 1).toChar) returns "bcd".

With map you are taking a list of characters and turning it into a list of strings. That's the result you see. A map never changes the length of a list – the list of strings has as many elements as the original string has characters.
With flatMap you are taking a list of characters and turning it into a list of strings and then you mush those strings together into a single string again. flatMap is useful when you want to turn one element in a list into multiple elements, without creating a list of lists. (This of course also means that the resulting list can have any length, including 0 – this is not possible with map unless you start out with the empty list.)
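A small sketch of the length argument, using a plain List[Int] instead of a String (the object and value names are illustrative only):

```scala
object LengthDemo {
  val xs = List(1, 2, 3)

  // map: exactly one output element per input element.
  val doubled: List[Int] = xs.map(_ * 2)                       // List(2, 4, 6)

  // flatMap: each element may expand into several elements...
  val repeated: List[Int] = xs.flatMap(n => List.fill(n)(n))   // List(1, 2, 2, 3, 3, 3)

  // ...or into none at all, so the result can even be empty.
  val dropped: List[Int] = xs.flatMap(_ => List.empty[Int])    // List()
}
```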

Use flatMap in situations where you run map followed by flatten. The specific situation is this:
• You’re using map (or a for/yield expression) to create a new collection from an existing collection.
• The resulting collection is a List of Lists.
• You call flatten immediately after map (or a for/yield expression).
When you’re in this situation, you can use flatMap instead.
Example: Add all the Integers from the bag
val bag = List("1", "2", "three", "4", "one hundred seventy five")
def toInt(in: String): Option[Int] = {
  try {
    Some(Integer.parseInt(in.trim))
  } catch {
    case e: Exception => None
  }
}
Using a flatMap method
bag.flatMap(toInt).sum // Int = 7
Using map method (3 steps needed)
bag.map(toInt) // List[Option[Int]] = List(Some(1), Some(2), None, Some(4), None)
bag.map(toInt).flatten //List[Int] = List(1, 2, 4)
bag.map(toInt).flatten.sum //Int = 7
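As a hedged variation on the same example, the parsing helper can be written with scala.util.Try, and the flatMap can be spelled as a for/yield expression. toIntSafe is a hypothetical name introduced here, not part of the original answer:

```scala
import scala.util.Try

object BagDemo {
  val bag = List("1", "2", "three", "4", "one hundred seventy five")

  // Hypothetical helper with the same contract as toInt above:
  // a failed parse becomes None instead of throwing.
  def toIntSafe(in: String): Option[Int] = Try(in.trim.toInt).toOption

  // flatMap drops the None values and unwraps the Some values in one step.
  val viaFlatMap: Int = bag.flatMap(s => toIntSafe(s)).sum   // 7

  // The same computation as a for/yield expression with two generators.
  val viaFor: Int = (for { s <- bag; n <- toIntSafe(s) } yield n).sum
}
```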

Related

Fold function scala's immutable list

I have created an immutable list and try to fold it to a map, where each element is mapped to a constant string "abc". I do it for practice.
While I do that, I am getting an error. I am not sure why the map (here, e1 which has mutable map type) is converted to Any.
val l = collection.immutable.List(1,2,3,4)
l.fold (collection.mutable.Map[Int,String]()) ( (e1,e2) => e1 += (e2,"abc") )
<console>:13: error: value += is not a member of Any
Expression does not convert to assignment because receiver is not assignable.
l.fold (collection.mutable.Map[Int,String]()) ( (e1,e2) => e1 += (e2,"abc") )
At least three different problem sources here:
• Map[...] is not a supertype of Int, so you probably want foldLeft, not fold (the fold acts more like the "banana brackets", it expects the first argument to act like some kind of "zero", and the binary operation as some kind of "addition" - this does not apply to mutable maps and integers).
• The binary operation must return something, both for fold and foldLeft. In this case, you probably want to return the modified map. This is why you need ; m (last expression is what gets returned from the closure).
• The m += (k, v) is not what you think it is. It attempts to invoke a method += with two separate arguments. What you need is to invoke it with a single pair. Try m += ((k, v)) instead (yes, those problems with arity are annoying).
Putting it all together:
l.foldLeft(collection.mutable.Map[Int, String]()){ (m, e) => m += ((e, "abc")); m }
But since you are using a mutable map anyway:
val l = (1 to 4).toList
val m = collection.mutable.Map[Int, String]()
for (e <- l) m(e) = "abc"
This looks arguably clearer to me, to be honest. In a foldLeft, I wouldn't expect the map to be mutated.
Folding is all about combining a sequence of input elements into a single output element. With fold, the output type must be the same as the element type, or a supertype of it. Here is the definition of fold:
def fold[A1 >: A](z: A1)(op: (A1, A1) => A1): A1
In your case the element type A is Int, but the output element (the accumulator) is a mutable.Map, so A1 is inferred as their common supertype Any, which has no += method. So if you want to build a Map through iteration, you can use foldLeft or any other alternative that allows different input and output types. Here is the definition of foldLeft:
def foldLeft[B](z: B)(op: (B, A) => B): B
Solution:
val l = collection.immutable.List(1, 2, 3, 4)
l.foldLeft(collection.immutable.Map.empty[Int, String]) { (e1, e2) =>
e1 + (e2 -> "abc")
}
Note: I'm not using a mutable Map.
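To make the difference between the two signatures concrete, here is a small sketch (names are illustrative). It uses fold where accumulator and elements share a type, foldLeft where they differ, and shows map plus toMap as a possible alternative when each element yields exactly one entry:

```scala
object FoldDemo {
  val l = List(1, 2, 3, 4)

  // fold: accumulator and elements share one type (here Int).
  val sum: Int = l.fold(0)(_ + _)   // 10

  // foldLeft: the accumulator may have any type, e.g. an immutable Map.
  val viaFoldLeft: Map[Int, String] =
    l.foldLeft(Map.empty[Int, String])((acc, e) => acc + (e -> "abc"))

  // One entry per element: map + toMap builds the same Map without a fold.
  val viaToMap: Map[Int, String] = l.map(e => e -> "abc").toMap
}
```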

Concatenate a List of Char into a String via a fold

For a list of string, I do this:
val l = List("a", "r", "e")
l.reduceLeft((x, z) => x + z)
I don't know how to do it for a list of Chars. The following is a compile error:
val chs = List('a', 'r', 'e')
chs.reduceLeft[String]( (x,y) => String.valueOf(x) + String.valueOf(y))
Here's the type signature for reduceLeft:
def reduceLeft[B >: A](f: (B, A) => B): B
It requires the type you're reducing to (B) to be a supertype of the type you're reducing from (A). In your case A is Char and B is String, and String is not a supertype of Char.
You can do a foldLeft which will reduce your list and doesn't require the output to be a subtype of the input:
chs.foldLeft("")((x,y) => x + String.valueOf(y))
If you just want to accomplish the result:
scala> List('a', 'r', 'e').mkString
res0: String = are
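A few equivalent ways to get there, sketched side by side. The object and value names are made up, and the StringBuilder variant is just one possible way to avoid building an intermediate String on every step:

```scala
object CharConcatDemo {
  val chs = List('a', 'r', 'e')

  // Convert each Char to a String first; then reduceLeft type-checks,
  // because the element type and the result type are both String.
  val viaReduce: String = chs.map(_.toString).reduceLeft(_ + _)   // "are"

  // foldLeft with an explicit String accumulator works on the Chars directly.
  val viaFold: String = chs.foldLeft("")((acc, c) => acc + c)     // "are"

  // Folding into a StringBuilder appends in place instead of copying.
  val viaBuilder: String =
    chs.foldLeft(new StringBuilder)((sb, c) => sb.append(c)).toString
}
```

Note that reduceLeft throws on an empty list, while the foldLeft variants return the empty string.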
If you really want to learn folds, you should do something a bit more applicable. While you can of course do this to a string, there are much better ways to create a string from a list of characters. Folds are very powerful, and creating a string doesn't quite do them justice.
Let's say, for example, you have
case class Employee(fname:String, lname:String, age:Int)
Let's say you also have a HashMap[String, List[Employee]] that organizes them by position. So you have Joe Shmoe who's a 25 year old Software Engineer, and Larry Larison who's a 37 year old accountant, etc. You can easily use a fold to organize this data into flat structures very simply. If you want to take it and create just a List of the names of your employees you can combine it with a flatMap to very simply return a List[String]
val employees = Map[String, List[Employee]](
  "Software Engineer" -> List(new Employee("Joe", "Shmoe", 25), new Employee("Larry", "Larrison", 37)),
  "Accountant" -> List(new Employee("Harry", "Harrison", 55))
)
// flatMap(_._2) flattens the per-position lists into one collection of Employee;
// foldLeft then prepends each full name to the accumulator list.
employees.flatMap(_._2).foldLeft[List[String]](Nil)(
  (li, emp) =>
    s"${emp.fname} ${emp.lname}" :: li
)
The flatMap function gives you a flat list of all Employee objects. It's passed a Tuple2 of String and List[Employee]. The _._2 returns the 2nd item of the Tuple2 which is the list of employees which flatMap joins together with the others.
From there, you can use foldLeft on a list of Employee objects to create a list of their names. Nil is an empty List (and will have List[String] inferred), that's your starting object.
foldLeft takes a function of two parameters: the first is the list formed so far, and the second is the next item in the collection you are iterating over. On the first pass, you'll have an empty list and Joe Shmoe.
In that function, you create the string of the Employee's first and last name and prepend that string to the accumulator, li.
This returns
List[String] = List(Harry Harrison, Larry Larrison, Joe Shmoe)
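The reversed order in that result comes from prepending inside foldLeft. A self-contained sketch of this, with illustrative names (byPosition, fullName) that are not from the answer above, compares foldLeft against foldRight and a plain map:

```scala
object EmployeeNamesDemo {
  case class Employee(fname: String, lname: String, age: Int)

  val byPosition: Map[String, List[Employee]] = Map(
    "Software Engineer" -> List(Employee("Joe", "Shmoe", 25), Employee("Larry", "Larrison", 37)),
    "Accountant" -> List(Employee("Harry", "Harrison", 55))
  )

  def fullName(e: Employee): String = s"${e.fname} ${e.lname}"

  // Every Employee, in whatever order the Map iterates its values.
  val all: List[Employee] = byPosition.values.flatten.toList

  // Prepending inside foldLeft reverses the iteration order...
  val viaFoldLeft: List[String] =
    all.foldLeft(List.empty[String])((acc, e) => fullName(e) :: acc)

  // ...while foldRight, or simply map, preserves it.
  val viaFoldRight: List[String] =
    all.foldRight(List.empty[String])((e, acc) => fullName(e) :: acc)
  val viaMap: List[String] = all.map(fullName)
}
```

If all you need is the list of names, the map version is probably the most direct; the folds are shown only to illustrate how the accumulator is built.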
Folds are a very useful tool. I've found this page to be very helpful in figuring them out: http://oldfashionedsoftware.com/2009/07/30/lots-and-lots-of-foldleft-examples/
To show the workings of foldLeft and foldRight (and map, along the way) with a little bit of a "real" operation applied, let's use toChar (of Int):
val iA: Int = 65
val cA: Char = iA.toChar //====> A
val cB: Char = 66.toChar //====> B
cA + cB
//====> 131 (Int), since char has no concatenation +, obviously
"" + cA + cB
//====> AB now forced to concatenation + of String
val li: List[Int] = List(65, 66, 67)
li map (i => i.toChar) //====> List(A, B, C)
The parameter of foldLeft and foldRight is the "zero element".
I make it explicitly visible here by using "0", you want to use "" for a decent concatenation.
The zero element should generally not be part of the result, but is needed to calculate that result.
In the following code:
i: Int because of li: List[Int]
acc: String because of "0" (accumulator)
+ is String concatenation
li.foldLeft("0")((acc, i) => acc + i.toChar)
//====> 0ABC 0 --> 0A --> 0AB --> 0ABC
li.foldLeft("0")((acc, i) => i.toChar + acc)
//====> CBA0 0 --> A0 --> BA0 --> CBA0
li.foldRight("0")((i, acc) => acc + i.toChar)
//====> 0CBA 0 --> 0C --> 0CB --> 0CBA
li.foldRight("0")((i, acc) => i.toChar + acc)
//====> ABC0 0 --> C0 --> BC0 --> ABC0
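The traces above follow directly from how the two folds nest their calls. The following is a simplified re-implementation for illustration only (myFoldLeft and myFoldRight are hypothetical names, and this is not how the standard library implements them):

```scala
object FoldOrderDemo {
  // foldLeft: combine the accumulator with the head, then continue with the tail.
  def myFoldLeft[A, B](xs: List[A], z: B)(op: (B, A) => B): B = xs match {
    case Nil    => z
    case h :: t => myFoldLeft(t, op(z, h))(op)
  }

  // foldRight: fold the tail first, then combine the head with that result.
  def myFoldRight[A, B](xs: List[A], z: B)(op: (A, B) => B): B = xs match {
    case Nil    => z
    case h :: t => op(h, myFoldRight(t, z)(op))
  }

  val li = List(65, 66, 67)
  val left: String  = myFoldLeft(li, "0")((acc, i) => acc + i.toChar)    // "0ABC"
  val right: String = myFoldRight(li, "0")((i, acc) => acc + i.toChar)   // "0CBA"
}
```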

Standard function to enumerate all strings of given length over given alphabet

Suppose, I have an alphabet of N symbols and want to enumerate all different strings of length M over this alphabet. Does Scala provide any standard library function for that?
Taking inspiration from another answer:
val letters = Seq("a", "b", "c")
val n = 3
Iterable.fill(n)(letters) reduceLeft { (a, b) =>
for(a<-a;b<-b) yield a+b
}
Seq[java.lang.String] = List(aaa, aab, aac, aba, abb, abc, aca, acb, acc, baa, bab, bac, bba, bbb, bbc, bca, bcb, bcc, caa, cab, cac, cba, cbb, cbc, cca, ccb, ccc)
To work with something other than strings:
val letters = Seq(1, 2, 3)
Iterable.fill(n)(letters).foldLeft(List(List[Int]())) { (a, b) =>
for (a<-a;b<-b) yield(b::a)
}
The need for the extra type annotation is a little annoying but it will not work without it (unless someone knows another way).
Another solution:
val alph = List("a", "b", "c")
val n = 3
// repeat each symbol n times so that a symbol may fill every position
alph.flatMap(List.fill(n)(_))
  .combinations(n)
  .flatMap(_.permutations).toList
Update: If you want to get a list of strings in the output, then alph should be a string.
val alph = "abcd"
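As far as the answers here show, there is no single ready-made library call for this, so a small helper is one option. The sketch below wraps the fold-based approach into a function; words is a hypothetical name, and the alphabet is passed as a Seq[Char] by assumption:

```scala
object WordsDemo {
  // Hypothetical helper: all strings of length m over the given alphabet.
  // Start with the single empty string and extend every string by every
  // symbol, m times.
  def words(alphabet: Seq[Char], m: Int): Seq[String] =
    (1 to m).foldLeft(Seq(""))((acc, _) => for (w <- acc; c <- alphabet) yield w + c)

  val ab2: Seq[String]  = words(Seq('a', 'b'), 2)        // Seq(aa, ab, ba, bb)
  val abc3: Seq[String] = words(Seq('a', 'b', 'c'), 3)   // 27 strings, aaa .. ccc
}
```

The result has N^M elements, so this is only practical for small alphabets and lengths.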

Manipulating tuples

Is there a way to manipulate multiple values of a tuple without using a temporary variable and starting a new statement?
Say I have a method that returns a tuple and I want to do something with those values inline.
e.g. if I want to split a string at a certain point and reverse the pieces
def backToFront(s: String, n:Int) = s.splitAt(n)...
I can do
val (a, b) = s.splitAt(n)
b + a
(requires temporary variables and new statement) or
List(s.splitAt(n)).map(i => i._2 + i._1).head
(works, but seems a bit dirty, creating a single element List just for this) or
s.splitAt(n).swap.productIterator.mkString
(works for this particular example, but only because there happens to be a swap method that does what I want, so it's not very general).
The zipped method on tuples seems just to be for tuples of Lists.
As another example, how could you turn the tuple ('a, 'b, 'c) into ('b, 'a, 'c) in one statement?
Tuples are just a convenient return type, and it is not assumed that you will perform complicated manipulations on them. There was also a similar discussion on the Scala forums.
About the last example, couldn't find anything better than pattern-matching.
('a, 'b, 'c) match { case (a, b, c) => (b, a ,c) }
Unfortunately, the built-in methods on tuples are pretty limited.
Maybe you want something like these in your personal library,
def fold2[A, B, C](x: (A, B))(f: (A, B) => C): C = f(x._1, x._2)
def fold3[A, B, C, D](x: (A, B, C))(f: (A, B, C) => D): D = f(x._1, x._2, x._3)
With the appropriate implicit conversions, you could do,
scala> "hello world".splitAt(5).swap.fold(_ + _)
res1: java.lang.String = " worldhello"
scala> (1, 2, 3).fold((a, b, c) => (b, c, a))
res2: (Int, Int, Int) = (2,3,1)
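The "appropriate implicit conversions" are not spelled out above. One way they might look, as a sketch with made-up class names (Tuple2Fold, Tuple3Fold), is a pair of implicit classes that add a fold method to 2- and 3-tuples:

```scala
object TupleFoldDemo {
  // Hypothetical enrichment classes; the names are invented for this sketch.
  implicit class Tuple2Fold[A, B](t: (A, B)) {
    def fold[C](f: (A, B) => C): C = f(t._1, t._2)
  }
  implicit class Tuple3Fold[A, B, C](t: (A, B, C)) {
    def fold[D](f: (A, B, C) => D): D = f(t._1, t._2, t._3)
  }

  // splitAt(5) gives ("hello", " world"); swap reverses the pair.
  val swapped: String = "hello world".splitAt(5).swap.fold(_ + _)         // " worldhello"
  val rotated: (Int, Int, Int) = (1, 2, 3).fold((a, b, c) => (b, c, a))   // (2,3,1)
}
```

Outside the object, an import such as import TupleFoldDemo._ would be needed to bring the implicit classes into scope.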
An alternative to the last expression would be the "pipe" operator |> (get it from Scalaz or here),
scala> ('a, 'b, 'c) |> (t => (t._2, t._3, t._1))
res3: (Symbol, Symbol, Symbol) = ('b,'c,'a)
This would be nice, if not for the required annotations,
scala> ("hello ", "world") |> (((_: String) + (_: String)).tupled)
res4: java.lang.String = hello world
How about this?
s.splitAt(n) |> Function.tupled(_ + _)
[ Edit: Just noticed your arguments to function are reversed. In that case, you will have to give up placeholder syntax and instead go for: s.splitAt(n) |> Function.tupled((a, b) => b + a) ]
For your last example, can't think of anything better than a pattern match (as shown by #4e6.)
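If you don't want to pull in Scalaz just for |>, a minimal version is easy to define yourself. The sketch below is a hypothetical stand-in, not Scalaz's implementation, and it combines the operator with a pattern-matching function literal so that no type annotations are needed:

```scala
object PipeDemo {
  // Hypothetical minimal pipe operator: feed the value on the left
  // into the function on the right.
  implicit class Pipe[A](a: A) {
    def |>[B](f: A => B): B = f(a)
  }

  val s = "abcdefgh"
  val n = 3

  // splitAt(3) gives ("abc", "defgh"); the case pattern destructures the pair.
  val backToFront: String = s.splitAt(n) |> { case (a, b) => b + a }   // "defghabc"

  // The same trick reorders a 3-tuple in a single expression.
  val reordered: (Char, Char, Char) = ('a', 'b', 'c') |> { case (a, b, c) => (b, a, c) }
}
```

Scala 2.13 and later also ship a pipe method in scala.util.chaining, which can be used in the same way.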
With the current development version of shapeless, you can achieve this without unpacking the tuple:
import shapeless.syntax.std.tuple._
val s = "abcdefgh"
val n = 3
s.splitAt(n).rotateRight[shapeless.Nat._1].mkString("", "", "") // "defghabc"
I think you shouldn't have to wait too long (matter of days I'd say) before the syntax of the methods of the last line get cleaned, and you can simply write
s.splitAt(n).rotateRight(1).mkString

Scala: How do I use fold* with Map?

I have a Map[String, String] and want to concatenate the values to a single string.
I can see how to do this using a List...
scala> val l = List("te", "st", "ing", "123")
l: List[java.lang.String] = List(te, st, ing, 123)
scala> l.reduceLeft[String](_+_)
res8: String = testing123
fold* or reduce* seem to be the right approach I just can't get the syntax right for a Map.
Folds on a map work the same way they would on a list of pairs. You can't use reduce because then the result type would have to be the same as the element type (i.e. a pair), but you want a string. So you use foldLeft with the empty string as the neutral element. You also can't just use _+_ because then you'd try to add a pair to a string. You have to instead use a function that adds the accumulated string, the first value of the pair and the second value of the pair. So you get this:
scala> val m = Map("la" -> "la", "foo" -> "bar")
m: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(la -> la, foo -> bar)
scala> m.foldLeft("")( (acc, kv) => acc + kv._1 + kv._2)
res14: java.lang.String = lalafoobar
Explanation of the first argument to fold:
As you know the function (acc, kv) => acc + kv._1 + kv._2 gets two arguments: the second is the key-value pair currently being processed. The first is the result accumulated so far. However what is the value of acc when the first pair is processed (and no result has been accumulated yet)? When you use reduce the first value of acc will be the first pair in the list (and the first value of kv will be the second pair in the list). However this does not work if you want the type of the result to be different than the element types. So instead of reduce we use fold where we pass the first value of acc as the first argument to foldLeft.
In short: the first argument to foldLeft says what the starting value of acc should be.
As Tom pointed out, you should keep in mind that maps don't necessarily maintain insertion order (Map2 and co. do, but hashmaps do not), so the string may list the elements in a different order than the one in which you inserted them.
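If the order of the concatenated string matters, one option is to impose an order yourself before folding. A sketch, assuming that sorting by key is acceptable for your use case (the names are illustrative):

```scala
object MapFoldDemo {
  val m = Map("la" -> "la", "foo" -> "bar")

  // Sort the entries by key first, so the result does not depend on
  // the map's iteration order.
  val sortedConcat: String =
    m.toSeq.sortBy(_._1).foldLeft("")((acc, kv) => acc + kv._1 + kv._2)   // "foobarlala"

  // The same fold with a pattern match instead of kv._1 and kv._2.
  val sortedConcat2: String =
    m.toSeq.sortBy(_._1).foldLeft("") { case (acc, (k, v)) => acc + k + v }
}
```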
The question has been answered already, but I'd like to point out that there are easier ways to produce those strings, if that's all you want. Like this:
scala> val l = List("te", "st", "ing", "123")
l: List[java.lang.String] = List(te, st, ing, 123)
scala> l.mkString
res0: String = testing123
scala> val m = Map(1 -> "abc", 2 -> "def", 3 -> "ghi")
m: scala.collection.immutable.Map[Int,java.lang.String] = Map((1,abc), (2,def), (3,ghi))
scala> m.values.mkString
res1: String = abcdefghi