Scala: For Comprehension compile error (newbie question)

I am getting a type mismatch compile error for the following code:
case class MyClass(name: String)

def getMyClass(id: String) = {
  // For now ignore the id field
  Some(Seq(MyClass("test1"), MyClass("test2"), MyClass("test3"), MyClass("test4"), MyClass("test5")))
}

def getHeader() = {
  Map(
    "n" -> List(Map("s" -> "t"), Map("s" -> "t"), Map("s" -> "t")),
    "o" -> List(Map("s" -> "t"), Map("s" -> "t"), Map("s" -> "t")),
    "id" -> "12345"
  )
}

def castToString(any: Option[Any]): Option[String] = {
  any match {
    case Some(value: String) => Some(value)
    case _ => None
  }
}
val h = getHeader()

for {
  id <- castToString(h.get("id")) // I hate that I have to do this, but the map is a Map[String, Any]
  m <- getMyClass(id)             // This strips the Some from the Some(Seq[MyClass])
  item <- m                       // XXXXXXXX Compile errors
  oList <- h.get("o")
  nList <- h.get("n")
} yield {
  (oList, nList, item)
}
The error is:
C:\temp\s.scala:28: error: type mismatch;
found : Seq[(java.lang.Object, java.lang.Object, this.MyClass)]
required: Option[?]
item <- m
^
But m is of type Seq[MyClass]. I am trying to iterate through the sequence and bind each element to item.

You can't mix container types in this way, specifically given the signature of Option.flatMap (to which this expression is desugared - see the comment by pst). However, there's a pretty easy solution:
for {
  id <- castToString(h.get("id")).toSeq
  m <- getMyClass(id).toSeq
  item <- m
  oList <- h.get("o")
  nList <- h.get("n")
} yield {
  (oList, nList, item)
}
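To see why, here is roughly what the original comprehension desugars to, simplified by hand; it does not compile, which is the point:

// Simplified desugaring of the original for-comprehension (does not compile):
castToString(h.get("id")).flatMap { id =>   // Option.flatMap wants a function returning an Option
  getMyClass(id).flatMap { m =>             // still inside Option here
    m.flatMap { item =>                     // but m is a Seq, so everything from here on is a Seq,
      h.get("o").flatMap { oList =>         // which the Option.flatMap above cannot accept
        h.get("n").map { nList =>
          (oList, nList, item)
        }
      }
    }
  }
}
// Calling .toSeq on the first two generators, as above, makes the outer steps Seqs so the
// flatMap calls line up; the remaining Option generators are adapted by the standard
// Option-to-Iterable conversion.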

A better explanation of why the code you mentioned doesn't work can be found here:
What is Scala's yield?
You could change your code to the one Kris posted.


Future composition in Scala with chunked response

I think I understand how future composition works, but I am confused about how to invoke the next future on a chunk of the response from the first future.
Say the first future returns a list of integers and the list is huge. I want to apply some function to that list two elements at a time. How do I do that?
This example summarizes my dilemma:
val a = Future(List(1, 2, 3, 4, 5, 6))
def f(a: List[Int]) = Future(a map (_ + 2))

val res = for {
  list <- a
  chunked <- list.grouped(2).toList
} yield f(chunked)
<console>:14: error: type mismatch;
found : List[scala.concurrent.Future[List[Int]]]
required: scala.concurrent.Future[?]
chunked <- list.grouped(2).toList
^
The generator's type has to be Future[?], so I can fix it by moving the second future into the yield part:
val res = for {
  list <- a
} yield {
  val temp = for {
    chunked <- list.grouped(2).toList
  } yield f(chunked)
  Future.sequence(temp)
}
I feel it loses its elegance now, since it becomes nested (see two for comprehensions instead of one in the first approach). Is there a better way to achieve the same?
Consider
a.map { _.grouped(2).toList }.flatMap { Future.traverse(_)(f) }
Or, if you are set on using only a for comprehension for some reason, here is how to do it without "cheating" :)
for {
b <- a
c <- Future.traverse(b.grouped(2).toList)(f)
} yield c
Edit in response to the comment: it's not really that hard to add more processing to your chunked list if needed:
for {
  b <- a
  chunks = b.grouped(2).toList
  processedChunks = processChunks(chunks)
  c <- Future.traverse(processedChunks)(f)
} yield c
Or, without a for comprehension:
a
  .map { _.grouped(2).toList }
  .map(processChunks)
  .flatMap { Future.traverse(_)(f) }
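As a side note, Future.traverse(xs)(f) is essentially Future.sequence(xs map f) done in one pass, which is why both styles end up with the same result. A small self-contained sketch:

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

val chunks = List(List(1, 2), List(3, 4))
def f(xs: List[Int]): Future[List[Int]] = Future(xs map (_ + 2))

// Both produce a Future[List[List[Int]]] with the same contents.
val viaTraverse = Future.traverse(chunks)(f)
val viaSequence = Future.sequence(chunks map f)

println(Await.result(viaTraverse, 1.second)) // List(List(3, 4), List(5, 6))
println(Await.result(viaSequence, 1.second)) // List(List(3, 4), List(5, 6))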
You cannot mix Future with List in a for-comprehension. All involved objects have to be of the same type. Also, in your working example, your result value res is of type Future[Future[List[List[Int]]]], which is probably not what you want.
import scala.concurrent._
import scala.concurrent.ExecutionContext.Implicits.global

scala> val a = Future(List(1, 2, 3, 4, 5, 6))
a: scala.concurrent.Future[List[Int]] = scala.concurrent.impl.Promise$DefaultPromise@3bd3cdc8

scala> def f(a: List[Int]) = Future(a map (_ + 2))
f: (a: List[Int])scala.concurrent.Future[List[Int]]

scala> val b: Future[List[List[Int]]] = a.map(list => list.grouped(2).toList)
b: scala.concurrent.Future[List[List[Int]]] = scala.concurrent.impl.Promise$DefaultPromise@74db196c

scala> val res: Future[List[List[Int]]] = b.flatMap(lists => Future.sequence(lists.map(f)))
res: scala.concurrent.Future[List[List[Int]]] = scala.concurrent.impl.Promise$DefaultPromise@28f9873c
With a for-comprehension:
for {
  b ← a.map(list ⇒ list.grouped(2).toList)
  res ← Future.sequence(b.map(f))
} yield res

Tuple seen as Product, compiler rejects reference to element

Constructing phoneVector:
val phoneVector = (
  for (i <- 1 until 20) yield {
    val p = killNS(r.get("Phone %d - Value" format(i)))
    val t = killNS(r.get("Phone %d - Type" format(i)))
    if (p == None) None
    else if (t == None) (p, "Main") else (p, t)
  }
).filter(_ != None)
Consider this very simple snippet:
for (pTuple <- phoneVector) {
  println(pTuple.getClass.getName)
  println(pTuple)
  //val pKey = pTuple._1.replaceAll("[^\\d]","")
  associate() // stub prints "associate"
}
When I run it, I see output like this:
scala.Tuple2
((609) 954-3815,Mobile)
associate
When I uncomment the line with replaceAll(), compilation fails:
....scala:57: value _1 is not a member of Product with Serializable
[error] val pKey = pTuple._1.replaceAll("[^\\d]","")
[error] ^
Why does it not recognize pTuple as a Tuple2, treating it only as a Product?
OK, this compiles and produces the desired result. But it's too verbose. Can someone please demonstrate a more concise solution for dealing with this typesafe stuff?
for (pTuple <- phoneVector) {
  println(pTuple.getClass.getName)
  println(pTuple)
  val pPhone = pTuple match {
    case t: Tuple2[_, _] => t._1
    case _ => None
  }
  val pKey = pPhone match {
    case s: String => s.replaceAll("[^\\d]", "")
    case _ => None
  }
  println(pKey)
  associate()
}
You can do:
for (pTuple <- phoneVector) {
  val pPhone = pTuple match {
    case (key, value) => key
    case _ => None
  }
  val pKey = pPhone match {
    case s: String => s.replaceAll("[^\\d]", "")
    case _ => None
  }
  println(pKey)
  associate()
}
Or simply phoneVector.map(_._1.replaceAll("[^\\d]",""))
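As for why the compiler only sees Product with Serializable: each iteration of the for/yield that built phoneVector produces either None or a tuple, so the element type is inferred as their least upper bound. A minimal sketch of the effect, and of how yielding an Option of a tuple instead keeps the precise type:

// The least upper bound of None.type and (String, String) is Product with Serializable:
val mixed = for (i <- 1 to 3) yield {
  if (i == 2) None else (i.toString, "Main")
}
// mixed: IndexedSeq[Product with Serializable] -- no ._1 available

// Yielding Option[(String, String)] and flattening keeps the element type a tuple:
val tuples: IndexedSeq[(String, String)] = (for (i <- 1 to 3) yield {
  if (i == 2) None else Some((i.toString, "Main"))
}).flatten
// tuples.map(_._1) now compiles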
By changing the construction of phoneVector, as wrick's question implied, I've been able to eliminate the match/case stuff because Tuple is assured. Not thrilled by it, but Change is Hard, and Scala seems cool.
Now, it's still possible to slip a None value into either of the Tuple values. My match/case does not check for that, and I suspect that could lead to a runtime error in the replaceAll call. How is that allowed?
def killNS(s: Option[_]) = {
  (s match {
    case _: Some[_] => s.get
    case _ => None
  }) match {
    case None => None
    case "" => None
    case s => s
  }
}

val phoneVector = (
  for (i <- 1 until 20) yield {
    val p = killNS(r.get("Phone %d - Value" format(i)))
    val t = killNS(r.get("Phone %d - Type" format(i)))
    if (t == None) (p, "Main") else (p, t)
  }
).filter(_._1 != None)

println(phoneVector)
println(name)
println

// Create the Neo4j nodes:
for (pTuple <- phoneVector) {
  val pPhone = pTuple._1 match { case p: String => p }
  val pType = pTuple._2
  val pKey = pPhone.replaceAll(",.*", "").replaceAll("[^\\d]", "")
  associate(Map("target" -> Map("label" -> "Phone", "key" -> pKey,
                                "dial" -> pPhone),
                "relation" -> Map("label" -> "IS_AT", "key" -> pType),
                "source" -> Map("label" -> "Person", "name" -> name)
  ))
}
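As to how the non-exhaustive match in the last snippet is allowed: such matches only produce a compile-time warning, and if a None does slip into the tuple the match throws scala.MatchError at runtime, before replaceAll is ever reached. A tiny illustration:

// Non-exhaustive matches compile (with a warning at best) and fail at runtime:
val slippedIn: Any = None
val phone = slippedIn match { case p: String => p } // throws scala.MatchError(None)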

Error on return a Future[Boolean] from a for in Scala

I'm writing a Play 2.3.2 application in Scala.
I use ReactiveMongo as the driver for my MongoDB database.
I have a collection named "recommendation.tagsSimilarity" that contains the similarity values between my tags, where a tag is in the form "category:attribute".
An example document looks like the following:
{
  "_id" : ObjectId("5440ec6e4165e71ac4b53a71"),
  "id" : "10912199495810912197116116114-10912199581091219711611611450",
  "tag1" : "10912199495810912197116116114",
  "tag1Name" : "myc1:myattr",
  "tag2" : "10912199581091219711611611450",
  "tag2Name" : "myc:myattr2",
  "eq" : 0
}
A document represents an element of an n x n matrix, where n is the number of tags saved.
Now I've created a collection named "recommendation.correlation" in which I save the correlation between a "category" and a tag.
To do that I'm writing a method that iterates over the elements of the tags-similarity collection as a matrix.
def calculateCorrelation: Future[Boolean] = {
  def calculate(category: String, tag: String): Future[(Double, Double)] = {
    // calculate the correlation and return the tuple value
  }
  play.Logger.debug("Start Correlation")
  Similarity.all.toList flatMap { tagsMatch =>
    for (i <- tagsMatch) {
      val category = i.tag1Name.split(":")(0) // get the tag category
      for (j <- tagsMatch) {
        val productName = j.tag2Name // obtain the product tag
        calculate(category, productName) flatMap { value =>
          val correlation = Correlation(category, productName, value._1, value._2) // create the correlation object
          val query = Json.obj("category" -> category, "attribute" -> productName)
          Correlations.update(query, correlation, upsert = true) flatMap { status =>
            status match {
              case LastError(ok, _, _, _, _, _, _) => Future { true }
              case _ => Future { false }
            }
          }
        }
      }
    }
  }
}
But the compiler gives me the following error:
[error] /Users/alberto/git/bdrim/modules/recommendation-system/app/recommendationsystem/algorithms/Pearson.scala:313: type mismatch;
[error] found : Unit
[error] required: scala.concurrent.Future[Boolean]
[error] for(i <- tagsMatch) {
[error] ^
[error] one error found
What's wrong? I can't understand why the for statement doesn't return anything.
In addition, I want to ask why I can't write the code as a single for comprehension in Scala that iterates over the list twice.
You forgot to use yield with for:
for(i <- tagsMatch) { ... } gets translated into a foreach call.
Using for(i <- tagsMatch) yield { ... }, it will instead translate into map/flatMap and yield a result (remember to use yield on both of your fors).
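A rough, untested sketch of the restructured method, reusing the question's Similarity, Correlation, Correlations, Json and calculate identifiers and simplifying the LastError check to a plain success/failure mapping: with yield on both loops you collect a List[Future[Boolean]], and Future.sequence collapses it into the single Future[Boolean] the method must return.

def calculateCorrelation: Future[Boolean] = {
  Similarity.all.toList flatMap { tagsMatch =>
    // yield on both loops: one Future[Boolean] per (i, j) pair instead of Unit
    val updates: List[Future[Boolean]] = for {
      i <- tagsMatch
      j <- tagsMatch
    } yield {
      val category = i.tag1Name.split(":")(0)
      val productName = j.tag2Name
      calculate(category, productName) flatMap { case (v1, v2) =>
        val correlation = Correlation(category, productName, v1, v2)
        val query = Json.obj("category" -> category, "attribute" -> productName)
        Correlations.update(query, correlation, upsert = true)
          .map(_ => true)
          .recover { case _ => false }
      }
    }
    // Collapse List[Future[Boolean]] into a single Future[Boolean]
    Future.sequence(updates).map(_.forall(identity))
  }
}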

Filtering inside `for` with pattern matching

I am reading a TSV file using something like this:
case class Entry(entryType: Int, value: Int)

def filterEntries(): Iterator[Entry] = {
  for {
    line <- scala.io.Source.fromFile("filename").getLines()
  } yield new Entry(line.split("\t").map(x => x.toInt))
}
Now I am interested both in filtering out entries whose entryType is set to 0 and in ignoring lines with a column count greater or smaller than 2 (which do not match the constructor). I was wondering if there's an idiomatic way to achieve this, maybe using pattern matching and an unapply method in a companion object. The only thing I can think of is using .filter on the resulting iterator.
I will also accept a solution not involving a for loop, as long as it returns Iterator[Entry]. The solutions must be tolerant of malformed input.
This is more state-of-arty:
package object liner {
  implicit class R(val sc: StringContext) {
    object r {
      def unapplySeq(s: String): Option[Seq[String]] = sc.parts.mkString.r unapplySeq s
    }
  }
}

package liner {
  case class Entry(entryType: Int, value: Int)

  object I {
    def unapply(s: String): Option[Int] = util.Try(s.toInt).toOption
  }

  object Test extends App {
    def lines = List("1 2", "3", "", " 4 5 ", "junk", "0, 100000", "6 7 8")
    def entries = lines flatMap {
      case r"""\s*${I(i)}(\d+)\s+${I(j)}(\d+)\s*""" if i != 0 => Some(Entry(i, j))
      case __________________________________________________ => None
    }
    Console println entries
  }
}
Hopefully, the regex interpolator will make it into the standard distro soon, but this shows how easy it is to rig up. Also hopefully, a scanf-style interpolator will allow easy extraction with case f"$i%d".
I just started using the "elongated wildcard" in patterns to align the arrows.
There is a pupal or maybe larval regex macro:
https://github.com/som-snytt/regextractor
You can create variables in the head of the for-comprehension and then use a guard:
edit: ensure length of array
for {
  line <- scala.io.Source.fromFile("filename").getLines()
  arr = line.split("\t").map(x => x.toInt)
  if arr.size == 2 && arr(0) != 0
} yield new Entry(arr(0), arr(1))
I have solved it using the following code:
import scala.util.{Try, Success}

val lines = List(
  "1\t2",
  "1\t",
  "2",
  "hello",
  "1\t3"
)

case class Entry(val entryType: Int, val value: Int)

object Entry {
  def unapply(line: String) = {
    line.split("\t").map(x => Try(x.toInt)) match {
      case Array(Success(entryType: Int), Success(value: Int)) => Some(Entry(entryType, value))
      case _ =>
        println("Malformed line: " + line)
        None
    }
  }
}
for {
  line <- lines
  entryOption = Entry.unapply(line)
  if entryOption.isDefined
} yield entryOption.get
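Since Entry.unapply already returns an Option, the same result can also be had without the intermediate binding and guard, relying on the standard Option-to-Iterable conversion:

val entries = lines.flatMap(line => Entry.unapply(line)) // Nones are dropped, Somes are unwrapped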
The left hand side of a <- or = in a for-loop may be a fully-fledged pattern. So you may write this:
def filterEntries(): Iterator[Entry] = for {
  line <- scala.io.Source.fromFile("filename").getLines()
  arr = line.split("\t").map(x => x.toInt)
  if arr.size == 2
  // now you may use pattern matching to extract the array
  Array(entryType, value) = arr
  if entryType != 0
} yield Entry(entryType, value)
Note that this solution will throw a NumberFormatException if a field is not convertible to an Int. If you do not want that, you'll have to encapsulate x.toInt with a Try and pattern match again.
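A minimal, untested sketch of that Try-based variant, wrapping the whole row conversion so that malformed lines are simply dropped:

import scala.util.Try

def filterEntries(): Iterator[Entry] = for {
  line <- scala.io.Source.fromFile("filename").getLines()
  // Try converts the whole row; lines with the wrong column count or non-numeric
  // fields become None and are silently skipped by the Option generator.
  entry <- Try {
    val Array(entryType, value) = line.split("\t").map(_.toInt)
    Entry(entryType, value)
  }.toOption
  if entry.entryType != 0
} yield entry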

Scala Option return type

I am a newbie in the Scala programming world but loving it. Recently I started porting my research app to Scala, and one of the things I am still struggling with is the return keyword. For example, in the code below:
def readDocument(dbobj: MongoDBObject) = Option[ContainerMetaData]
{
  for (a <- dbobj.getAs[String]("classname");
       b <- dbobj.getAs[Long]("id");
       c <- dbobj.getAs[Long]("version");
       d <- dbobj.getAs[String]("description");
       e <- dbobj.getAs[String]("name");
       f <- dbobj.getAs[String]("tag");
       g <- dbobj.getAs[Int]("containertype");
       h <- dbobj.getAs[Date]("createddate")
  )
  {
    val ctype = ContainerType(g)
    val jodadt = new DateTime(h)
    val data = new ContainerMetaData(a, b, c, d, e, f, ctype, jodadt)
    Some(data)
  }
  None
}
In the above code I get the error message:
type mismatch; found : None.type required: om.domain.ContainerMetaData
So if I remove the explicit return type the code compiles, but then without an explicit return keyword I am not able to terminate my code at Some(data).
def readDocument(dbobj: MongoDBObject) =
{
  for (a <- dbobj.getAs[String]("classname");
       b <- dbobj.getAs[Long]("id");
       c <- dbobj.getAs[Long]("version");
       d <- dbobj.getAs[String]("description");
       e <- dbobj.getAs[String]("name");
       f <- dbobj.getAs[String]("tag");
       g <- dbobj.getAs[Int]("containertype");
       h <- dbobj.getAs[Date]("createddate")
  )
  {
    val ctype = ContainerType(g)
    val jodadt = new DateTime(h)
    val data = new ContainerMetaData(a, b, c, d, e, f, ctype, jodadt)
    Some(data)
  }
  None
}
And if I add a return keyword, then the compiler complains:
method readDocument has return statement; needs result type
Some additional info: this is the trait I am extending:
trait MongoDAOSerializer[T] {
  def createDocument(content: T): DBObject
  def readDocument(db: MongoDBObject): Option[T]
}
The problem is that you are missing the yield keyword in the for-comprehension. The None at the end is also unnecessary, as the for-comprehension will yield None if one of the values is missing, and the explicit creation of a Some inside the comprehension is not needed either, as it will produce an Option anyway. Your code has to look like this (not tested):
def readDocument(dbobj: MongoDBObject): Option[ContainerMetaData] = {
  for {
    a <- dbobj.getAs[String]("classname")
    b <- dbobj.getAs[Long]("id")
    c <- dbobj.getAs[Long]("version")
    d <- dbobj.getAs[String]("description")
    e <- dbobj.getAs[String]("name")
    f <- dbobj.getAs[String]("tag")
    g <- dbobj.getAs[Int]("containertype")
    h <- dbobj.getAs[Date]("createddate")
  } yield {
    val ctype = ContainerType(g)
    val jodadt = new DateTime(h)
    new ContainerMetaData(a, b, c, d, e, f, ctype, jodadt)
  }
}
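For intuition, a for-comprehension over Options desugars into nested flatMap/map calls, so the whole expression short-circuits to None as soon as one getAs returns None; no trailing None or explicit Some is needed. A minimal standalone sketch of the same pattern, independent of the MongoDB API:

import scala.util.Try

def parsePair(m: Map[String, String]): Option[(Int, Int)] =
  for {
    x <- m.get("x").flatMap(s => Try(s.toInt).toOption)
    y <- m.get("y").flatMap(s => Try(s.toInt).toOption)
  } yield (x, y)

parsePair(Map("x" -> "1", "y" -> "2")) // Some((1, 2))
parsePair(Map("x" -> "1"))             // None: the missing "y" short-circuits the comprehension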