Scala style: constant map vs pattern matching - scala

I need to declare a constant mapping in scala, and wounder what would be the proper way to do that.
The Java way is
private static final String[] numbers = {"zero","one","two","three"} //Java
val numbers = Array("zero","one","two","three") //Scala
val numbers = collection.immutable.HashMap(0 -> "zero", 1 -> "one", 2 => "two") //Scala maps
Another way to do that in Scala is
def array(i: Int) = i match {
case 0 => "zero"
case 1 => "one"
case 2 => "two"
}
Is there a standard/recommended way to do it in Scala?

Map provides features that a plain function does not. You can enumerate/scan/traverse/filter existing keys and values for example. Map/reduce/transform etc. (You can have a default value or generate an error on missing keys too, despite what the other answer suggests).
If you don't need any of that, there isn't much difference ... except, if the number of entries is fairly large, access to Map would generally be faster than evaluating the static pattern.

Not really. It depends on the purpose. Here's a version that generates the keys:
List("zero", "one", "two", "three").zipWithIndex.map(_.swap).toMap
(still a Map, assuming you can use the index)
I've seen both approaches used depending on the context.
If you need to serialize the mapping or pass it around or keep different versions of it, a Map would be better.
Otherwise, pattern matching might be better.

Related

HashMap only has one element instead of three

I am filling a HashMap in Scala like so:
val hashMap = new HashMap[P, List[T]]() { list.map(x => put(x.param1, x.param1.elements)) }
The problem is that hashMap will have only a size of 1 while list has a size of 3.
What am I doing wrong here?
You''re mixing imperative commands (put, new HashMap) with functional constructs (map). This cannot behave nicely.
What you should do (if I understand your goal correctly):
list.map(x => x.param1 -> x.param1.elements).toMap[P, List[T]]
Also, beware that if several elements in your list have the same param1, only the last one will be kept, since Map can only have one value for a given key.

Can I output a collection instead of a tuple in Scalding map method?

If you want to create a pipe with more than 22 fields from a smaller one in Scalding you are limited by Scala tuples, which cannot have more than 22 items.
Is there a way to use collections instead of tuples? I imagine something like in the following example, which sadly doesn't work:
input.read.mapTo('line -> aLotOfFields) { line: String =>
(1 to 24).map(_.toString)
}.write(output)
actually you can. It's in FAQ - https://github.com/twitter/scalding/wiki/Frequently-asked-questions#what-if-i-have-more-than-22-fields-in-my-data-set
val toFields = (1 to 24).map(f => Symbol("field_" + f)).toList
input
.read
.mapTo('line -> toFields) { line: String =>
new Tuple((1 to 24).map(_.toString).map(_.asInstanceOf[AnyRef]): _*)
}
the last map(_.asInstanceOf[AnyRef]) looks ugly so if you find better solution let me know please.
Wrap your tuples into case classes. It will also make your code more readable and type safe than using tuples and collections respectively.

Converting a Scala Map to a List

I have a map that I need to map to a different type, and the result needs to be a List. I have two ways (seemingly) to accomplish what I want, since calling map on a map seems to always result in a map. Assuming I have some map that looks like:
val input = Map[String, List[Int]]("rk1" -> List(1,2,3), "rk2" -> List(4,5,6))
I can either do:
val output = input.map{ case(k,v) => (k.getBytes, v) } toList
Or:
val output = input.foldRight(List[Pair[Array[Byte], List[Int]]]()){ (el, res) =>
(el._1.getBytes, el._2) :: res
}
In the first example I convert the type, and then call toList. I assume the runtime is something like O(n*2) and the space required is n*2. In the second example, I convert the type and generate the list in one go. I assume the runtime is O(n) and the space required is n.
My question is, are these essentially identical or does the second conversion cut down on memory/time/etc? Additionally, where can I find information on storage and runtime costs of various scala conversions?
Thanks in advance.
My favorite way to do this kind of things is like this:
input.map { case (k,v) => (k.getBytes, v) }(collection.breakOut): List[(Array[Byte], List[Int])]
With this syntax, you are passing to map the builder it needs to reconstruct the resulting collection. (Actually, not a builder, but a builder factory. Read more about Scala's CanBuildFroms if you are interested.) collection.breakOut can exactly be used when you want to change from one collection type to another while doing a map, flatMap, etc. — the only bad part is that you have to use the full type annotation for it to be effective (here, I used a type ascription after the expression). Then, there's no intermediary collection being built, and the list is constructed while mapping.
Mapping over a view in the first example could cut down on the space requirement for a large map:
val output = input.view.map{ case(k,v) => (k.getBytes, v) } toList

Map inside Map in Scala

I've this code :
val total = ListMap[String,HashMap[Int,_]]
val hm1 = new HashMap[Int,String]
val hm2 = new HashMap[Int,Int]
...
//insert values in hm1 and in hm2
...
total += "key1" -> hm1
total += "key2" -> hm2
....
val get = HashMap[Int,String] = total.get("key1") match {
case a : HashMap[Int,String] => a
}
This work, but I would know if exists a better (more readable) way to do this.
Thanks to all !
It looks like you're trying to re-implement tuples as maps.
val total : ( Map[Int,String], Map[Int,Int]) = ...
def get : Map[Int,String] = total._1
(edit: oh, sorry, I get it now)
Here's the thing: the code above doesn't work. Type parameters are erased, so the match above will ALWAYS return true -- try it with key2, for example.
If you want to store multiple types on a Map and retrieve them latter, you'll need to use Manifest and specialized get and put methods. But this has already been answers on Stack Overflow, so I won't repeat myself here.
Your total map, containing maps with non uniform value types, would be best avoided. The question is, when you retrieve the map at "key1", and then cast it to a map of strings, why did you choose String?
The most trivial reason might be that key1 and so on are simply constants, that you know all of them when you write your code. In that case, you probably should have a val for each of your maps, and dispense with map of maps entirely.
It might be that the calls made by the client code have this knowledge. Say that the client does stringMap("key1"), or intMap("key2") or that one way or another, the call implies that some given type is expected. That the client is responsible for not mixing types and names. Again in that case, there is no reason for total. You would have a map of string maps, a map of int maps (provided that you are previous knowledge of a limited number of value types)
What is your reason to have total?
First of all: this is a non-answer (as I would not recommend the approach I discuss), but it was too long for a comment.
If you haven't got too many different keys in your ListMap, I would suggest trying Malvolio's answer.
Otherwise, due to type erasure, the other approaches based on pattern matching are practically equivalent to this (which works, but is very unsafe):
val get = total("key1").asInstanceOf[HashMap[Int, String]]
the reasons why this is unsafe (unless you like living dangerously) are:
total("key1") is not returning an Option (unlike total.get("key1")). If "key1" does not exist, it will throw a NoSuchElementException. I wasn't sure how you were planning to manage the "None" case anyway.
asInstanceOf will also happily cast total("key2") - which should be a HashMap[Int, Int], but is at this point a HashMap[Int, Any] - to a HashMap[Int, String]. You will have problem later on when you try to access the Int value (which now scala believes is a String)

scala: map-like structure that doesn't require casting when fetching a value?

I'm writing a data structure that converts the results of a database query. The raw structure is a java ResultSet and it would be converted to a map or class which permits accessing different fields on that data structure by either a named method call or passing a string into apply(). Clearly different values may have different types. In order to reduce burden on the clients of this data structure, my preference is that one not need to cast the values of the data structure but the value fetched still has the correct type.
For example, suppose I'm doing a query that fetches two column values, one an Int, the other a String. The result then names of the columns are "a" and "b" respectively. Some ideal syntax might be the following:
val javaResultSet = dbQuery("select a, b from table limit 1")
// with ResultSet, particular values can be accessed like this:
val a = javaResultSet.getInt("a")
val b = javaResultSet.getString("b")
// but this syntax is undesirable.
// since I want to convert this to a single data structure,
// the preferred syntax might look something like this:
val newStructure = toDataStructure[Int, String](javaResultSet)("a", "b")
// that is, I'm willing to state the types during the instantiation
// of such a data structure.
// then,
val a: Int = newStructure("a") // OR
val a: Int = newStructure.a
// in both cases, "val a" does not require asInstanceOf[Int].
I've been trying to determine what sort of data structure might allow this and I could not figure out a way around the casting.
The other requirement is obviously that I would like to define a single data structure used for all db queries. I realize I could easily define a case class or similar per call and that solves the typing issue, but such a solution does not scale well when many db queries are being written. I suspect some people are going to propose using some sort of ORM, but let us assume for my case that it is preferred to maintain the query in the form of a string.
Anyone have any suggestions? Thanks!
To do this without casting, one needs more information about the query and one needs that information at compiole time.
I suspect some people are going to propose using some sort of ORM, but let us assume for my case that it is preferred to maintain the query in the form of a string.
Your suspicion is right and you will not get around this. If current ORMs or DSLs like squeryl don't suit your fancy, you can create your own one. But I doubt you will be able to use query strings.
The basic problem is that you don't know how many columns there will be in any given query, and so you don't know how many type parameters the data structure should have and it's not possible to abstract over the number of type parameters.
There is however, a data structure that exists in different variants for different numbers of type parameters: the tuple. (E.g. Tuple2, Tuple3 etc.) You could define parameterized mapping functions for different numbers of parameters that returns tuples like this:
def toDataStructure2[T1, T2](rs: ResultSet)(c1: String, c2: String) =
(rs.getObject(c1).asInstanceOf[T1],
rs.getObject(c2).asInstanceOf[T2])
def toDataStructure3[T1, T2, T3](rs: ResultSet)(c1: String, c2: String, c3: String) =
(rs.getObject(c1).asInstanceOf[T1],
rs.getObject(c2).asInstanceOf[T2],
rs.getObject(c3).asInstanceOf[T3])
You would have to define these for as many columns you expect to have in your tables (max 22).
This depends of course on that using getObject and casting it to a given type is safe.
In your example you could use the resulting tuple as follows:
val (a, b) = toDataStructure2[Int, String](javaResultSet)("a", "b")
if you decide to go the route of heterogeneous collections, there are some very interesting posts on heterogeneous typed lists:
one for instance is
http://jnordenberg.blogspot.com/2008/08/hlist-in-scala.html
http://jnordenberg.blogspot.com/2008/09/hlist-in-scala-revisited-or-scala.html
with an implementation at
http://www.assembla.com/wiki/show/metascala
a second great series of posts starts with
http://apocalisp.wordpress.com/2010/07/06/type-level-programming-in-scala-part-6a-heterogeneous-list%C2%A0basics/
the series continues with parts "b,c,d" linked from part a
finally, there is a talk by Daniel Spiewak which touches on HOMaps
http://vimeo.com/13518456
so all this to say that perhaps you can build you solution from these ideas. sorry that i don't have a specific example, but i admit i haven't tried these out yet myself!
Joschua Bloch has introduced a heterogeneous collection, which can be written in Java. I once adopted it a little. It now works as a value register. It is basically a wrapper around two maps. Here is the code and this is how you can use it. But this is just FYI, since you are interested in a Scala solution.
In Scala I would start by playing with Tuples. Tuples are kinda heterogeneous collections. The results can be, but not have to be accessed through fields like _1, _2, _3 and so on. But you don't want that, you want names. This is how you can assign names to those:
scala> val tuple = (1, "word")
tuple: ([Int], [String]) = (1, word)
scala> val (a, b) = tuple
a: Int = 1
b: String = word
So as mentioned before I would try to build a ResultSetWrapper around tuples.
If you want "extract the column value by name" on a plain bean instance, you can probably:
use reflects and CASTs, which you(and me) don't like.
use a ResultSetToJavaBeanMapper provided by most ORM libraries, which is a little heavy and coupled.
write a scala compiler plugin, which is too complex to control.
so, I guess a lightweight ORM with following features may satisfy you:
support raw SQL
support a lightweight,declarative and adaptive ResultSetToJavaBeanMapper
nothing else.
I made an experimental project on that idea, but note it's still an ORM, and I just think it may be useful to you, or can bring you some hint.
Usage:
declare the model:
//declare DB schema
trait UserDef extends TableDef {
var name = property[String]("name", title = Some("姓名"))
var age1 = property[Int]("age", primary = true)
}
//declare model, and it mixes in properties as {var name = ""}
#BeanInfo class User extends Model with UserDef
//declare a object.
//it mixes in properties as {var name = Property[String]("name") }
//and, object User is a Mapper[User], thus, it can translate ResultSet to a User instance.
object `package`{
#BeanInfo implicit object User extends Table[User]("users") with UserDef
}
then call raw sql, the implicit Mapper[User] works for you:
val users = SQL("select name, age from users").all[User]
users.foreach{user => println(user.name)}
or even build a type safe query:
val users = User.q.where(User.age > 20).where(User.name like "%liu%").all[User]
for more, see unit test:
https://github.com/liusong1111/soupy-orm/blob/master/src/test/scala/mapper/SoupyMapperSpec.scala
project home:
https://github.com/liusong1111/soupy-orm
It uses "abstract Type" and "implicit" heavily to make the magic happen, and you can check source code of TableDef, Table, Model for detail.
Several million years ago I wrote an example showing how to use Scala's type system to push and pull values from a ResultSet. Check it out; it matches up with what you want to do fairly closely.
implicit val conn = connect("jdbc:h2:f2", "sa", "");
implicit val s: Statement = conn << setup;
val insertPerson = conn prepareStatement "insert into person(type, name) values(?, ?)";
for (val name <- names)
insertPerson<<rnd.nextInt(10)<<name<<!;
for (val person <- query("select * from person", rs => Person(rs,rs,rs)))
println(person.toXML);
for (val person <- "select * from person" <<! (rs => Person(rs,rs,rs)))
println(person.toXML);
Primitives types are used to guide the Scala compiler into selecting the right functions on the ResultSet.