Scala: .take(1) in for-comprehension? - scala

val SumABC = 1000
val Max = 468
val Min = 32
val p9 = for {
a <- Max to 250 by -1
b <- Min+(Max-a) to 249
if a*a+b*b == (SumABC-a-b)*(SumABC-a-b)
} yield a*b*(SumABC-a-b)
Can I .take(1) here? (I tried to translate it to flatMap, filter, etc., but since I failed, I guess it wouldn't be as readable anyway...)

If I understood your cryptic question correctly, what you would like to do is the following:
val p9 = (for {
a <- Max to 250 by -1
b <- Min+(Max-a) to 249
if a*a+b*b == (SumABC-a-b)*(SumABC-a-b)
} yield a*b*(SumABC-a-b)).take(1)
Just add parentheses before for and after the yield expression, so that take is called on the result of the whole for expression.
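One caveat: a for expression over a Range is strict, so all matches are computed before take(1) picks the first. The following is a minimal sketch of a lazy variant, assuming it is acceptable to iterate the outer range through .iterator so the search stops at the first hit. The object and method names are made up for illustration.

```scala
object FirstTriple {
  val SumABC = 1000
  val Max = 468
  val Min = 32

  // Iterating lazily: the body is only evaluated until the first match is found.
  def firstProduct: List[Int] =
    (for {
      a <- (Max to 250 by -1).iterator
      b <- Min + (Max - a) to 249
      if a * a + b * b == (SumABC - a - b) * (SumABC - a - b)
    } yield a * b * (SumABC - a - b)).take(1).toList
}
```

FirstTriple.firstProduct still returns a collection with at most one element, just like .take(1) on the strict version.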

Related

How to get Map with matching values

I have a file with values like this :
user id | item id | rating | timestamp
196 242 3 881250949
186 302 3 891717742
22 377 1 878887116
244 51 2 880606923
166 346 1 886397596
298 474 4 884182806
115 265 2 881171488
253 465 5 891628467
305 451 3 886324817
6 86 3 883603013
62 257 2 879372434
200 222 5 876042340
210 40 3 891035994
224 29 3 888104457
303 785 3 879485318
122 387 5 879270459
194 274 2 879539794
......
I want to find all values where item id = "560"
and make a Map from the rating values (1-5) to their counts, like this: {1->6, 2->5, 3->10, 4->6, 5->14}
object Parse {
def main(args: Array[String]): Unit = {
// extract the data from u.data
var a: List[(String, String, String, String)] = List()
for (line <- io.Source.fromFile("F:\\big data\\u.data").getLines) {
val newLine = line.replace("\t", ",")
if (newLine.split(",").length < 4) {
break
} else {
val asd = newLine.split(",")
val userId = asd(0)
val itemId = asd(1)
val rating = asd(2)
val timestamp = asd(3)
a = a :+ ((userId, itemId, rating, timestamp))
}
a = a.filter(_._2.equals("590")) // <- filters the list of tuples correctly
val empty: List[String] = a.map(_._2) // <- I tried to get a list of all ratings, but it does not work
}
}
How can I create a map of rating?
As far as I can see, a map of matching values can be generated as described here:
Scala groupBy for a list
If what you want is a Map of rating->count for a given "item id", this should do it.
util.Using(io.Source.fromFile("../junk.txt")) { file =>
val rec = raw"\d+\s+590\s+(\d+)\s+\d+".r //only this item id
file.getLines()
.collect { case rec(rating) => rating }
.foldLeft(Map.empty[String, Int]) {
case (m, r) => m + (r -> (m.getOrElse(r, 0) + 1))
}
}.getOrElse(Map.empty[String,Int])
Note that the Source returned by fromFile() is closed automatically at the end of the Using block (util.Using requires Scala 2.13 or later).
I don't think a for loop is the best choice here. Treat the problem as a data stream rather than an array: scala.io.Source.fromFile("F:\\big data\\u.data").getLines() returns an Iterator[String] over the lines, which is better processed as a stream than as an array of data. In your case it is better to combine map, filter, collect and groupBy to group the rows by rating.
Full correct code:
val sourceFile = scala.io.Source.fromFile("F:\\big data\\u.data")
val ratingCountsMap: Map[String, Int] =
  try {
    // u.data is tab-separated (the question replaces tabs with commas before splitting).
    // toList materializes the rows: an Iterator can only be traversed once,
    // and the rows are needed for both the validation and the counting.
    val linesOfArrays = sourceFile.getLines().map(_.split("\t")).toList
    require(!linesOfArrays.exists(_.length < 4)) // your data schema validation
    linesOfArrays.collect {
      // keep only the rating of each row with the wanted item id
      case rowValuesArray if rowValuesArray(1) == "590" => rowValuesArray(2)
    }
      .groupBy(identity)
      .map { case (rating, group) => rating -> group.length }
  } finally sourceFile.close()
And don't forget to release the resource (in your case, the file) by calling close in a finally block, or use the scala-arm library (more about resources here).
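To try the counting logic without a file, here is a minimal sketch that works on any Iterator[String], assuming whitespace-separated rows with exactly four numeric fields. The object and method names are hypothetical, not part of either answer above.

```scala
object RatingCounts {
  // Hypothetical helper: counts how often each rating occurs for one item id.
  // Rows that do not have exactly four fields are silently skipped.
  def ratingCounts(lines: Iterator[String], itemId: String): Map[Int, Int] =
    lines
      .map(_.trim.split("\\s+"))
      .collect { case Array(_, `itemId`, rating, _) => rating.toInt }
      .foldLeft(Map.empty[Int, Int]) { (counts, rating) =>
        counts.updated(rating, counts.getOrElse(rating, 0) + 1)
      }
}
```

With a real file, the iterator would come from getLines() inside a Using block or a try/finally, as in the answers above.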

Split Map to multiple Maps

I need to process a diff between two (huge) Maps. To parallelize the task, I would like to split these 2 Maps by Key hash value and create smaller Maps (by range of hash value).
How would I achieve that in (idiomatic) Scala?
Here's a rough sketch to get you started with the Scala syntax:
// create two (slightly different) maps, print them as table side by side
val rnd = new util.Random
val originalMap1 = (0 to 10).map(i => (i, i * i)).toMap
val originalMap2 = (0 to 10).map(i => (i, i * i + rnd.nextInt(2))).toMap
for (i <- 0 to 10) {
val a = originalMap1(i)
val b = originalMap2(i)
val marker = if (a == b) "" else " <-"
println(s"$i: $a $b $marker")
}
//subdivide into smaller maps
val numSubmaps = 5
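// floorMod keeps the bucket index in 0 until numSubmaps even for negative hash codes
val submaps1 = originalMap1.groupBy(kv => Math.floorMod(kv._1.hashCode, numSubmaps))
val submaps2 = originalMap2.groupBy(kv => Math.floorMod(kv._1.hashCode, numSubmaps))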
// compare each corresponding pair of maps separately, merge diffs
val diffs = (for (s <- 0 until numSubmaps) yield {
val m1 = submaps1(s)
val m2 = submaps2(s)
for {
k <- m1.keys
a = m1(k)
b = m2(k)
if a != b
} yield (k, (a, b))
}).reduce(_ ++ _)
println(diffs.toList.sortBy(_._1))
Input:
0: 0 1 <-
1: 1 2 <-
2: 4 4
3: 9 9
4: 16 16
5: 25 26 <-
6: 36 36
7: 49 49
8: 64 65 <-
9: 81 82 <-
10: 100 101 <-
Output:
List((0,(0,1)), (1,(1,2)), (5,(25,26)), (8,(64,65)), (9,(81,82)), (10,(100,101)))
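The sketch above assumes both maps have the same keys and that every bucket is non-empty. A slightly more defensive variant is sketched below, assuming a diff should also report keys that exist in only one of the maps. The names splitByHash and diff are made up for illustration.

```scala
object MapPartition {
  // Splits a map into at most `buckets` smaller maps, keyed by bucket index.
  def splitByHash[K, V](m: Map[K, V], buckets: Int): Map[Int, Map[K, V]] =
    m.groupBy { case (k, _) => java.lang.Math.floorMod(k.hashCode, buckets) }

  // Compares the maps bucket by bucket; a missing key shows up as None on that side.
  def diff[K, V](m1: Map[K, V], m2: Map[K, V], buckets: Int): Map[K, (Option[V], Option[V])] = {
    val s1 = splitByHash(m1, buckets).withDefaultValue(Map.empty[K, V])
    val s2 = splitByHash(m2, buckets).withDefaultValue(Map.empty[K, V])
    (0 until buckets).flatMap { b =>
      val (left, right) = (s1(b), s2(b))
      (left.keySet ++ right.keySet).toSeq.collect {
        case k if left.get(k) != right.get(k) => k -> (left.get(k), right.get(k))
      }
    }.toMap
  }
}
```

Each bucket is independent of the others, so the per-bucket comparison is the part that could be handed to separate threads or tasks.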

Dynamic For Comprehensions

I'm relatively new to Scala, so I'm not very confident with the language and I need your help to solve a problem. I know that a for-comprehension is just syntactic sugar that simplifies nested map/flatMap calls.
Now suppose you have 3 different Range intervals that should be combined to produce all possible combinations of values (one value from each interval).
Example:
Using the for-comprehension the problem can be solved as:
val intervalX = 1 to 5
val intervalY = 6 to 13
val intervalZ = 20 to 50
for {
x <- intervalX;
y <- intervalY;
z <- intervalZ
} yield (x,y,z)
The Scala compiler translates this into:
intervalX.flatMap{x =>
intervalY.flatMap{y =>
intervalZ.map{z => (x,y,z)}
}
}
However, the problem is harder if you are given a variable number d of intervals as input. Is it possible to perform the same operation, obtaining all the possible d-tuples? I think that it could be solved using the foldLeft operation, but I am not able to write it correctly at the moment. Can you help me?
Thanks
If you can live without tuples as a result then a version using foldLeft and returning lists representing combinations could be:
val intervalX = 1 to 5
val intervalY = 6 to 13
val intervalZ = 20 to 50
val ranges = intervalZ :: intervalY :: intervalX :: Nil
val combos = ranges.foldLeft(Iterable[Seq[Int]](Nil)) { case (c, e) =>
for {
i <- e
j <- c
} yield i +: j
}
combos foreach { println(_) }
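The same idea can be wrapped in a reusable function. This is a minimal sketch, assuming a List[List[A]] result is acceptable. It uses foldRight so the input sequences do not have to be listed in reverse order. The names Cartesian and product are hypothetical.

```scala
object Cartesian {
  // Builds every combination that takes one element from each input sequence,
  // in the same order as the sequences are given.
  def product[A](seqs: List[Seq[A]]): List[List[A]] =
    seqs.foldRight(List(List.empty[A])) { (seq, acc) =>
      for {
        x <- seq.toList
        rest <- acc
      } yield x :: rest
    }
}
```

For example, Cartesian.product(List(1 to 5, 6 to 13, 20 to 50)) yields 5 * 8 * 31 = 1240 lists of three elements each.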

How does yield expand to in multiple dimension loop in Scala?

From here we learn that an expression like:
for( i <- 1 to 10 ) yield i + 1
will expand into
( 1 to 10 ).map( _+1 )
But what does the following expression expand to?
for( i <- 1 to 50; j <- i to 50 ) yield List(1,i,j)
Is this correct?
( 1 to 50 ).map( x => (1 to 50 ).map(List(1,x,_))
I'm interested in this problem because I'd like to make a function which performs multiple Xi <- Xi-1 to 50 operations, as shown below:
for( X1 <- 1 to 50 X2 <- X1 to 50 X3 <- X2 to 50 ..... Xn <- Xn-1 to 50 )
yield List(1,X1,X2,X3,.....,Xn)
The function has one parameter: dimension which denotes the n in the above expression.
Its return type is IndexedSeq[List[Int]]
How can I achieve that?
Thank you for answering (:
It's well explained in a relevant doc. In particular:
for(x <- c1; y <- c2; z <- c3) yield {...}
will be translated into
c1.flatMap(x => c2.flatMap(y => c3.map(z => {...})))
I don't think there is a way to abstract over arbitrarily nested comprehensions (unless you're using voodoo magic, like macros).
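Applied to the two-generator expression from the question, the translation rule gives an outer flatMap and an inner map whose range starts at i. A small sketch that checks both forms agree, using a bound of 4 instead of 50 to keep the result short (the object and value names are for illustration only):

```scala
object Desugar {
  val n = 4

  // for-comprehension with a dependent inner range
  val viaFor = for (i <- 1 to n; j <- i to n) yield List(1, i, j)

  // its hand-written translation: flatMap for the outer generator, map for the last one
  val viaFlatMap = (1 to n).flatMap(i => (i to n).map(j => List(1, i, j)))
}
```

So the guess with two nested map calls is not the right expansion: it would produce a nested sequence, and its inner range would not depend on i.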
See om-nom-nom's answer for an explanation of what the for loops expand to. I'd like to answer the second part of the opening question, how to implement a function that can do:
for( X1 <- 1 to 50 X2 <- X1 to 50 X3 <- X2 to 50 ..... Xn <- Xn-1 to 50 )
yield List(1,X1,X2,X3,.....,Xn)
You can use:
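def upto50(dimension: Int) = {
  // n is the index of the current loop, start is the value chosen by the enclosing loop
  def loop(n: Int, start: Int): IndexedSeq[List[Int]] = {
    if (n > dimension)
      IndexedSeq(List())
    else {
      // each Xn ranges from the previous value up to 50, not from n
      (start to 50).flatMap(x => loop(n + 1, x).map(x :: _))
    }
  }
  loop(1, 1)
}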
We compute each of the loops recursively, working inside-out, starting with Xn to 50 and building up the solution.
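The same recursion can be sketched with the upper bound as a parameter and with the leading 1 from the question's List(1,X1,...,Xn) prepended. This is a variant under those assumptions, not the answerer's code; the names Nested and nonDecreasing are hypothetical.

```scala
object Nested {
  // Returns every List(1, X1, ..., Xn) with 1 <= X1 <= X2 <= ... <= Xn <= max.
  def nonDecreasing(dimension: Int, max: Int): IndexedSeq[List[Int]] = {
    def loop(remaining: Int, start: Int): IndexedSeq[List[Int]] =
      if (remaining == 0) IndexedSeq(Nil)
      else (start to max).flatMap(x => loop(remaining - 1, x).map(x :: _))

    loop(dimension, 1).map(1 :: _)
  }
}
```

Using a small max makes the result easy to check by hand: Nested.nonDecreasing(2, 3) has six elements, from List(1, 1, 1) to List(1, 3, 3).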
Solutions for the more general case of:
for( X1 <- S1 X2 <- S2 X3 <- S3 ..... Xn <- Sn )
yield List(1,X1,X2,X3,.....,Xn)
where S1..Sn are arbitrary sequences or monads, are also possible. See this gist for the necessary wall of code.

How to use yield with multistatement for?

The code is just for illustrative purposes, i.e. it is an example, not real code.
I tried this:
val results = for(i <- 1 to 20)
{
val x = i+1
println(x)
yield x
}
and this
val results = for {i <- 1 to 20;
val x = i+1;
println(x)
}
yield x
But neither of these works. I need a generator, a definition, and a statement. Is it possible to do this with yield? If yes, what is the correct syntax?
Hopefully, this will get you started:
val result = for (i <- 1 to 10 if i%2==0) yield {
println(i);
i
}
which is equivalent to
(1 to 10).filter(_%2==0).map(x => { println(x); x } )
You seem to think that for in Scala is similar to for in imperative languages. It's not! Behind the scenes, a for/yield expression is translated into calls to map, flatMap and withFilter. Every expression in the first section of the for/yield syntax must have a certain form: it must be a generator (a <- expression), a value definition (x = expression, written without the val keyword in current Scala) or a guard (if condition). You can hack it to get what you want:
for {
i <- 1 to 20
x = i + 1
_ <- {println(x); List(1)}
} yield x
But that is pretty horrible. Hacking the yield, as Jamil demonstrated, is also a possibility, though also pretty horrible.
The question is, what exactly are you trying to accomplish? foreach is best used for side-effecting loop code:
(1 to 10) foreach { i =>
val x = i+1
println(x)
}
map is best used for producing a new list of the same length:
(1 to 10) map (i => i + 1)
It is rather unusual, and somewhat ugly, to want to do both at the same time.
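If both the side effect and the result really are wanted, the pieces discussed in this thread can be combined: a generator and a definition in the head of the for, and the statement inside a yield block. A minimal sketch, using a shorter range than the question and a made-up object name:

```scala
object YieldBlock {
  val results = for {
    i <- 1 to 5   // generator
    x = i + 1     // definition (no val keyword)
  } yield {
    println(x)    // statement, executed once per element
    x             // the last expression of the block is the yielded value
  }
}
```

This keeps the head of the for free of dummy generators, at the cost of mixing a side effect into what is otherwise a plain map.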