Should x._1,x._2... syntax be avoided? - scala

I'm just starting out in Scala. I find myself using tuple variables a lot.
For example, here's some code I wrote:
/* Count each letter of a string and return in a list sorted by character
* countLetters("test") = List(('e',1),('s',1),('t',2))
*/
def countLetters(s: String): List[(Char, Int)] = {
val charsListMap = s.toList.groupBy((c:Char) => c)
charsListMap.map(x => (x._1, x._2.length)).toList.sortBy(_._1)
}
Is this tuple syntax (x._1, x._2 etc) frowned upon by Scala developers?

Are the tuple accessors frowned upon by Scala developers?
Short answer: no.
Slightly longer (by one character) answer: yes.
Too many _n's can be a code smell, and in your case the following is much clearer, in my opinion:
def countLetters(s: String): List[(Char, Int)] =
s.groupBy(identity).mapValues(_.length).toList.sortBy(_._1)
There are lots of methods like mapValues that are specifically designed to cut down on the need for the noisy tuple accessors, so if you find yourself writing _1, etc., a lot, that probably means you're missing some nice library methods. But occasionally they're the cleanest way to write something (e.g., the final _1 in my rewrite).
One other thing to note is that excessive use of tuple accessors should be treated as a nudge toward promoting your tuples to case classes. Consider the following:
val name = ("Travis", "Brown")
println("Hello, " + name._1)
As opposed to:
case class Name(first: String, last: String)
val name = Name("Travis", "Brown")
println("Hello, " + name.first)
The extra case class definition in the second version buys a lot of readability for a single line of code.

There is a better solution than x._N. A common way to work with tuples is pattern matching:
charsListMap.map{case (a, b) => (a, b.length)}
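As a rough sketch, the pattern-matching style applied to the whole function from the question might look like the following. The object name TupleMatching and the names char and occurrences are chosen only for illustration.

```scala
object TupleMatching {
  // Same behaviour as the question's countLetters, but every tuple
  // is taken apart with a pattern instead of _1 / _2.
  def countLetters(s: String): List[(Char, Int)] =
    s.toList
      .groupBy(identity)
      .map { case (char, occurrences) => (char, occurrences.length) }
      .toList
      .sortBy { case (char, _) => char }

  // Tuples can also be destructured directly in a val definition.
  val (letter, count) = countLetters("test").head
}
```

Naming both halves of the tuple at the point of use tends to read better than positional accessors, at the cost of a few extra characters.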
You may also want to take a look at Scalaz, which has some tools for working with tuples:
import scalaz._
import Scalaz._
scala> (1, "a") bimap (_ + 2, _ + 2)
res0: (Int, java.lang.String) = (3,a2)
scala> ('s, "abc") :-> { _.length }
res1: (Symbol, Int) = ('s,3)

Starting in Scala 3, with the parameter untupling feature, the following will become an alternative for .map(x => x._1 -> x._2.length):
.map(_ -> _.length)
and thus, your example becomes:
"test".toList.groupBy(identity).map(_ -> _.length).toList.sortBy(identity)
// List(('e', 1), ('s', 1), ('t', 2))
Concerning your example more specifically and starting in Scala 2.13, you could also use groupMapReduce which (as its name suggests) is an equivalent of a groupBy followed by mapValues and a reduce step:
"test".groupMapReduce(identity)(_ => 1)(_ + _).toList.sortBy(identity)

Related

Can I return Map collection in Scala using for-yield syntax?

I'm fairly new to Scala, so hopefully you tolerate this question in the case you find it noobish :)
I wrote a function that returns a Seq of elements using yield syntax:
def calculateSomeMetrics(names: Seq[String]): Seq[Long] = {
for (name <- names) yield {
// some auxiliary actions
val metrics = somehowCalculateMetrics()
metrics
}
}
Now I need to modify it to return a Map to preserve the original names against each of the calculated values:
def calculateSomeMetrics(names: Seq[String]): Map[String, Long] = { ... }
I've attempted to use the same yield-syntax but to yield a tuple instead of a single element:
def calculateSomeMetrics(names: Seq[String]): Map[String, Long] = {
for (name <- names) yield {
// Everything is the same as before
(name, metrics)
}
}
However, the compiler interprets it as Seq[(String, Long)], as per the compiler error message
type mismatch;
found : Seq[(String, Long)]
required: Map[String, Long]
So I'm wondering, what is the "canonical Scala way" to implement such a thing?
The efficient way of creating different collection types is using scala.collection.breakOut. It works with Maps and for comprehensions too:
import scala.collection.breakOut
val x: Map[String, Int] = (for (i <- 1 to 10) yield i.toString -> i)(breakOut)
x: Map[String,Int] = Map(8 -> 8, 4 -> 4, 9 -> 9, 5 -> 5, 10 -> 10, 6 -> 6, 1 -> 1, 2 -> 2, 7 -> 7, 3 -> 3)
In your case it should work too:
import scala.collection.breakOut
def calculateSomeMetrics(names: Seq[String]): Map[String, Long] = {
(for (name <- names) yield {
// Everything is the same as before
(name, metrics)
})(breakOut)
}
Comparison with the toMap solutions: toMap first builds an intermediate Seq of Tuple2s (which, incidentally, might itself be a Map in certain cases) and then builds the Map from it, whereas breakOut skips the intermediate Seq and builds the Map directly.
Usually this is not a huge difference in memory or CPU usage (+ GC pressure), but sometimes these things matter.
Either:
def calculateSomeMetrics(names: Seq[String]): Map[String, Long] = {
(for (name <- names) yield {
// Everything is the same as before
(name, metrics)
}).toMap
}
Or:
names.map { name =>
// doStuff
(name, metrics)
}.toMap
Several links here that either other people pointed me at or I managed to find out later on, just assembling them in a single answer for my future reference.
breakOut - suggested by Michał in his comment
toMap - in this thread
Great profound explanation on how breakOut works - in this answer
Note, though, that breakOut is going away, as noted by Karl
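Since breakOut was removed in Scala 2.13, here is a minimal sketch of one way to get a similar effect without it: map over an iterator (or a view) so that no strict intermediate Seq is built before toMap. The helper metricFor is a hypothetical stand-in for the question's metric calculation, not something from the original code.

```scala
object MetricsMap {
  // Hypothetical stand-in for the question's somehowCalculateMetrics().
  def metricFor(name: String): Long = name.length.toLong

  // The iterator is lazy, so pairs are fed straight into the Map
  // without first materialising a Seq[(String, Long)].
  def calculateSomeMetrics(names: Seq[String]): Map[String, Long] =
    names.iterator.map(name => name -> metricFor(name)).toMap

  // Same idea using a view instead of an iterator.
  def calculateViaView(names: Seq[String]): Map[String, Long] =
    names.view.map(name => name -> metricFor(name)).toMap
}
```

For small collections the plain names.map(...).toMap form is usually fine; the lazy variants only matter when the intermediate collection is a measurable cost.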

Scala: Apply same function to 2 lists in one call

Let's say I have
val list: List[(Int, String)] = List((1,"test"),(2,"test2"),(3,"sample"))
I need to partition this list in two, based on (Int, String) value. So far, so good.
For example it can be
def isValid(elem: (Int, String)) = elem._1 < 3 && elem._2.startsWith("test")
val (good, bad) = list.partition(isValid)
So now I have 2 lists of type List[(Int, String)], but I need only the Int part (some id). Of course I can write some function
def ids(list: List[(Int, String)]) = list.map(_._1)
and call it on both lists
val (ok, wrong) = (ids(good), ids(bad))
It works, but it looks a little boilerplate-heavy. I would prefer something like
val (good, bad) = list.partition(isValid).map(ids)
But that is obviously not possible. So is there a "nicer" way to do what I need?
I understand that it's not so bad, but I feel there must be some functional pattern or general solution for such cases, and I want to know it :) Thanks!
P.S. Thanks for all! Finally it's transformed to
private def handleGames(games:List[String], lastId:Int) = {
val (ok, wrong) = games.foldLeft(
(List.empty[Int], List.empty[Int])){
(a, b) => b match {
case gameRegex(d,w,e) => {
if(filterGame((d, w, e), lastId)) (d.toInt :: a._1, a._2)
else (a._1, d.toInt :: a._2 )
}
case _ => log.debug(s"not handled game template is: $b"); a
}
}
log.debug(s"not handled game ids are: ${wrong.mkString(",")}")
ok
}
You're looking for a foldLeft on the List:
myList.foldLeft((List.empty[Int], List.empty[Int])){
case ((good, bad), (id, value)) if predicate(id, value) => (id :: good, bad)
case ((good, bad), (id, _)) => (good, id :: bad)
}
This way, each step does both a transform and an accumulate. The returned type will be (List[Int], List[Int]), assuming predicate is the function which chooses between the "good" and "bad" states. The explicit List.empty[Int] is needed because Scala infers the accumulator type of a foldLeft from the initial value, and a bare Nil would be inferred as the overly narrow List[Nothing].
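A small sketch comparing the foldLeft approach with partitionMap, which the standard library offers from Scala 2.13 onward. It reuses list and isValid from the question; the object name and the other value names are illustrative. Note that prepending with :: inside foldLeft yields the ids in reverse order.

```scala
object PartitionIds {
  val list: List[(Int, String)] = List((1, "test"), (2, "test2"), (3, "sample"))

  def isValid(elem: (Int, String)): Boolean =
    elem._1 < 3 && elem._2.startsWith("test")

  // foldLeft: transform and accumulate in one pass.
  // Prepending with :: means the ids come out in reverse order.
  val (goodReversed, badReversed) =
    list.foldLeft((List.empty[Int], List.empty[Int])) {
      case ((good, bad), elem @ (id, _)) =>
        if (isValid(elem)) (id :: good, bad) else (good, id :: bad)
    }

  // partitionMap (Scala 2.13+): Left goes to the first list, Right to the
  // second, and the original order is kept.
  val (good, bad) = list.partitionMap[Int, Int] { elem =>
    if (isValid(elem)) Left(elem._1) else Right(elem._1)
  }
}
```

If order matters with the foldLeft version, calling reverse on each result (or using foldRight) restores it.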
An additional approach uses Cats, with Tuple2K and Foldable's foldMap. Note that this requires help from the kind-projector compiler plugin:
import cats.implicits._
import cats.Foldable
import cats.data.Tuple2K
val listTuple = Tuple2K(list, otherList)
val (good, bad) = Foldable[Tuple2K[List, List, ?]].foldMap(listTuple)(f =>
if (isValid(f)) (List(f), List.empty) else (List.empty, List(f)))

How to turn a list of objects into a map of two fields in Scala

I'm having a real brain fart here. I'm working with the Play Framework. I have a method which takes a map and turns it into a HTML select element. I had a one-liner to take a list of objects and convert it into a map of two of the object's fields, id and name. However, I'm a Java programmer and my Scala is weak, and I've only gone and forgotten the syntax of how I did it.
I had something like
organizations.all.map {org => /* org.prop1, org.prop2 */ }
Can anyone complete the commented part?
I would suggest:
map { org => (org.id, org.name) } toMap
e.g.
scala> case class T(val a : Int, val b : String)
defined class T
scala> List(T(1, "A"), T(2, "B"))
res0: List[T] = List(T(1,A), T(2,B))
scala> res0.map(t => (t.a, t.b))
res1: List[(Int, String)] = List((1,A), (2,B))
scala> res0.map(t => (t.a, t.b)).toMap
res2: scala.collection.immutable.Map[Int,String] = Map(1 -> A, 2 -> B)
You could also take an intermediary List out of the equation and go straight to the Map like this:
case class Org(prop1:String, prop2:Int)
val list = List(Org("foo", 1), Org("bar", 2))
val map:Map[String,Int] = list.map(org => (org.prop1, org.prop2))(collection.breakOut)
Using collection.breakOut as the implicit CanBuildFrom allows you to basically skip a step in the process of getting a Map from a List.

`doto` for Scala

Clojure offers a macro called doto that takes its argument and a list of functions and essentially calls each function, prepending the (evaluated) argument:
(doto (new java.util.HashMap) (.put "a" 1) (.put "b" 2))
-> {a=1, b=2}
Is there some way to implement something similar in Scala? I envision something with the following form:
val something =
doto(Something.getInstance) {
x()
y()
z()
}
which will be equivalent to
val something = Something.getInstance
something.x()
something.y()
something.z()
Might it be possible using scala.util.DynamicVariables?
Note that with factory methods, like Something.getInstance, it is not possible to use the common Scala pattern
val something =
new Something {
x()
y()
z()
}
I don't think there is such a thing built-in in the library but you can mimic it quite easily:
def doto[A](target: A)(calls: (A => A)*) =
calls.foldLeft(target) {case (res, f) => f(res)}
Usage:
scala> doto(Map.empty[String, Int])(_ + ("a" -> 1), _ + ("b" ->2))
res0: Map[String,Int] = Map(a -> 1, b -> 2)
scala> doto(Map.empty[String, Int])(_ + ("a" -> 1), _ - "a", _ + ("b" -> 2))
res10: Map[String,Int] = Map(b -> 2)
Of course, it works as long as your function returns the proper type. In your case, if the function has only side effects (which is not so "scalaish"), you can change doto and use foreach instead of foldLeft:
def doto[A](target: A)(calls: (A => Unit)*) =
calls foreach {_(target)}
Usage:
scala> import collection.mutable.{Map => M}
import collection.mutable.{Map=>M}
scala> val x = M.empty[String, Int]
x: scala.collection.mutable.Map[String,Int] = Map()
scala> doto(x)(_ += ("a" -> 1), _ += ("a" -> 2))
scala> x
res16: scala.collection.mutable.Map[String,Int] = Map(a -> 2)
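The foreach-based doto above returns Unit, so it cannot be used on the right-hand side of a val as in the question. A minimal sketch of a variant that hands the target back afterwards, which is closer to what Clojure's doto does (the object name Doto and the A => Any parameter type are choices made for this illustration):

```scala
object Doto {
  // Runs each side-effecting call against target, then returns target.
  // A => Any lets callers pass methods whatever their return type is.
  def doto[A](target: A)(calls: (A => Any)*): A = {
    calls.foreach(call => call(target))
    target
  }

  // Works with a factory-style expression, as in the question.
  val hm: java.util.HashMap[String, Int] =
    doto(new java.util.HashMap[String, Int]())(_.put("a", 1), _.put("b", 2))
}
```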
In Scala, the "typical" way to do this would be to chain "tap" or "pipe" methods. These were not in the standard library when this was written (Scala 2.13 later added them in scala.util.chaining), but they are frequently defined like so:
implicit class PipeAndTap[A](a: A) {
def |>[B](f: A => B): B = f(a)
def tap[B](f: A => B): A = { f(a); a }
}
Then you would
(new java.util.HashMap[String,Int]) tap (_.put("a",1)) tap (_.put("b",2))
This is not as compact as the Clojure version (or as compact as Scala can be), but it is about as close to canonical as one is likely to get.
(Note: if you want to minimize run-time overhead for adding these methods, you can make a a private val and have PipeAndTap extend AnyVal; then this will be a "value class" which is only converted into a real class when you need an object to pass around; just calling a method doesn't actually require class creation.)
(Second note: in older versions of Scala, implicit class does not exist. You have to separately write the class and an implicit def that converts a generic a to a PipeAndTap.)
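On Scala 2.13 and later, the same chaining can be sketched with the standard library's scala.util.chaining, which provides tap and pipe without a hand-written implicit class. The object and value names below are illustrative.

```scala
import scala.util.chaining._

object ChainingDemo {
  // tap runs a side effect and returns the original object,
  // so the calls can be chained on the freshly created map.
  val hm: java.util.HashMap[String, Int] =
    new java.util.HashMap[String, Int]()
      .tap(_.put("a", 1))
      .tap(_.put("b", 2))

  // pipe applies a function and returns its result instead.
  val size: Int = hm.pipe(_.size)
}
```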
I think the closest would be to import the object's members into scope:
val something = ...
import something._
x()
y()
z()
In this post you can find another example (in section "Small update about theoretical grounds"):
http://hacking-scala.posterous.com/side-effecting-without-braces
A small additional advantage of this approach is that you can import individual members and rename them:
import something.{x, y => doProcessing}
Simpler, I guess:
val hm = Map [String, Int] () + ("a"-> 1) + ("b"-> 2)
Your sample
val something =
doto (Something.getInstance) {
x()
y()
z()
}
doesn't look very functional, because what is the result? I assume you're relying on side effects.
Something.x().y().z()
could be a way, if each call returns a type on which the next method can act.
z(y(x(Something)))
is another way of producing a result.
There is also the andThen method for chaining function calls, which you might want to have a look at.
For your Map-example, a fold-left is another way to go:
val hm = Map [String, Int] () + ("a"-> 1) + ("b"-> 2)
val l = List (("a", 8), ("b", 7), ("c", 9))
(hm /: l)(_ + _)
// res8: scala.collection.immutable.Map[String,Int] = Map(a -> 8, b -> 7, c -> 9)
Well, I can think of two ways of doing it: passing strings as parameters, and having a macro change the string and compile it, or simply importing the methods. If Scala had untyped macros, maybe they could be used as well -- since it doesn't have them, I'm not going to speculate on it.
At any rate, I'm going to leave macro alternatives to others. Importing the methods is rather simple:
val map = collection.mutable.Map[String, Int]()
locally {
import map._
put("a", 1)
put("b", 2)
}
Note that locally doesn't do anything, except restrict the scope in which the members of map are imported.
One very basic way to chain several actions is function composition:
val f:Map[Int,String]=>Map[Int,String] = _ + (1 -> "x")
val g:Map[Int,String]=>Map[Int,String] = _ + (2 -> "y")
val h:Map[Int,String]=>Map[Int,String] = _ + (3 -> "z")
(h compose g compose f)(Map(42->"a"))
// Map[Int,String] = Map((42,a), (1,x), (2,y), (3,z))
In this case it's not very practical, though, as the type of the functions can't be inferred easily...
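One way to cut down on the repeated type annotations is a type alias plus Function.chain from the standard library, which composes a sequence of same-typed functions left to right. This is only a sketch; the alias Step and the object name are made up for the example.

```scala
object ComposeDemo {
  // Alias for the repeated function type.
  type Step = Map[Int, String] => Map[Int, String]

  // With the expected element type known, the lambdas need no annotations.
  val steps: Seq[Step] = Seq(_ + (1 -> "x"), _ + (2 -> "y"), _ + (3 -> "z"))

  // Function.chain applies the steps in order, first to last.
  val all: Step = Function.chain(steps)

  val result: Map[Int, String] = all(Map(42 -> "a"))
}
```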

Strange behavior with reduce and fold on identical code block

EDIT
OK, @dhg discovered that dot-method syntax is required if the code block passed to fold() is not bound to a val (why one can use space-method syntax with reduce() on the same code block, I don't know). At any rate, the end result is the nicely concise:
result.map { row =>
addLink( row.href, row.label )
}.fold(NodeSeq.Empty)(_++_)
Which negates to some degree the original question; i.e. in many cases one can higher-order away either/or scenarios and avoid "fat", repetitive if/else statements.
ORIGINAL
Trying to reduce if/else handling when working with possibly empty collections like List[T]
For example, let's say I need to grab the latest news articles to build up a NodeSeq of html news <li><a>links</a></li>:
val result = dao.getHeadlines // List[of model objects]
if(result.isEmpty) NodeSeq.Empty
else
result map { row =>
addLink( row.href, row.label ) // NodeSeq
} reduce(_ ++ _)
This is OK, pretty terse, but I find myself wanting to go ternary style to address these only-will-ever-be either/or cases:
result.isEmpty ? NodeSeq.Empty :
result map { row =>
addLink( row.href, row.label )
} reduce(_ ++ _)
I've seen some old postings on pimping ternary onto boolean, but curious to know what the alternatives are, if any, to streamline if/else?
match {...} is, IMO, a bit bloated for this scenario, and for {...} yield doesn't seem to help much either.
You don't need to check for emptiness at all. Just use fold instead of reduce since fold allows you to specify a default "empty" value:
scala> List(1,2,3,4).map(_ + 1).fold(0)(_+_)
res0: Int = 14
scala> List[Int]().map(_ + 1).fold(0)(_+_)
res1: Int = 0
Here's an example with a List of Seqs:
scala> List(1,2).map(Seq(_)).fold(Seq.empty)(_++_)
res14: Seq[Int] = List(1, 2)
scala> List[Int]().map(Seq(_)).fold(Seq.empty)(_++_)
res15: Seq[Int] = List()
EDIT: Looks like the problem in your sample has to do with the dropping of dot (.) characters between methods. If you keep them in, it all works:
scala> List(1,2,3).map(i => node).fold(NodeSeq.Empty)(_ ++ _)
res57: scala.xml.NodeSeq = NodeSeq(<li>Link</li>, <li>Link</li>, <li>Link</li>)
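To illustrate the difference without depending on scala.xml, here is a rough sketch that uses plain strings in place of NodeSeq; render and the other names are hypothetical. It shows fold with an empty default next to reduceOption, which is another way to avoid the exception that reduce throws on an empty collection.

```scala
object FoldVsReduce {
  // fold takes the "empty" value up front, so no isEmpty check is needed.
  def render(rows: List[String]): String =
    rows.map(row => s"<li>$row</li>").fold("")(_ + _)

  // reduceOption returns None for an empty list instead of throwing,
  // and the default is supplied afterwards.
  def renderViaOption(rows: List[String]): String =
    rows.map(row => s"<li>$row</li>").reduceOption(_ + _).getOrElse("")
}
```

Calling plain reduce on an empty list throws an UnsupportedOperationException, which is why the original code needed the if/else guard.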