Transforming/repacking the results of a Slick query

I have what I hope is a simple question about Slick. Apologies if this is well documented - I may have overlooked something in my searching.
I have an aggregate query built as follows:
def doQuery(/* ... */) = for {
  a <- Query(TableA)
  b <- a.relationship.where // ...
  c <- b.relationship.where // ...
} yield (a, b, c)
This returns me a Query[(A, B, C)].
I also have a case class:
case class Aggregate(a: A, b: B, c: C)
I'd like to transform my query to a Query[Aggregate] so my fellow developers can call .list() or .firstOption() and get a List or Option as appropriate.
I naturally went for the .map() method on Query, but it has an implicit Shape argument that I'm not sure how to handle.
Is this straightforward in Slick? We're using v1.0.1 at the moment but upgrading to 2.0 is also a possibility.
Best regards,
Dave

After a lot of playing around, I have concluded that this is not possible in Slick 1.
In Slick 2 you can use the <> operator to transform a projection assembled in the yield portion of the for comprehension:
def doQuery(/* ... */) = for {
  a <- Query(TableA)
  b <- a.relationship.where // ...
  c <- b.relationship.where // ...
} yield (a, b, c) <> (Aggregate.tupled, Aggregate.unapply)
This works as expected in conjunction with .list and .firstOption. I'm unsure what the consequences are of trying to use .insert, .update and .delete.
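For callers, usage would then look something like this (a sketch, assuming an implicit session is in scope):

val all: List[Aggregate] = doQuery(/* ... */).list
val first: Option[Aggregate] = doQuery(/* ... */).firstOption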

If you can modify doQuery, then you just want to do yield Aggregate(a, b, c) instead of yield (a, b, c).
Or, if you want to transform the result without modifying doQuery, then you can call .map { case (a, b, c) => Aggregate(a, b, c) } on the result of doQuery.
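For instance, a minimal sketch of that second approach, mapping over the materialized results rather than the query itself (again assuming an implicit session):

val aggregates: List[Aggregate] =
  doQuery(/* ... */).list.map { case (a, b, c) => Aggregate(a, b, c) }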

Related

Scala filter by nested Option/Try monads

In Scala, I have an Array[Option[(String,String,Try[String])]] and would like to find all the Failure error codes.
If the inner monad is an Option[String] instead, I can access the Some(x) contents with a clean little for comprehension, like so:
for {
  Some(row) <- rows
  (a, b, c) = row
  x <- c
} yield x
But if the inner monad is a Try, I'm struggling to see how to pattern match on a Failure, since I can't put Failure(x) <- c in the for statement. This feels like a really simple thing I'm missing, but any guidance would be very valuable.
Many thanks!
EDIT - Mis-specified the array. It's actually an array of option-tuple3s, not just tuple3s.
Will a.map(_._3).filter(_.isFailure) do?
EDIT: after having seen the edit and your comment, I think you can also do
val tries = for {
  x <- a
  z <- x
} yield z._3
tries.filter(_.isFailure)
In order to combine different types of "monads" you will need what's called a monad transformer. To put it simply, Scala doesn't let you mix different monad types within the same for comprehension - this makes sense, since a for comprehension is just syntactic sugar for combinations of map / flatMap / filter.
Assuming the first one is always an Option then you could transform the Try into an Option and get the desired result:
for {
  Some((a, b, c)) <- rows
  x <- c.toOption
} yield x
If you don't really care about what's inside that Try, that's fine, but if you do, be aware that .toOption discards the failure information: a Failure simply becomes a None and is filtered out.
I hope that helps you.
This returns an Array[Throwable].
for {
  Some((_, _, Failure(e))) <- rows
} yield e
Or, perhaps, an un-sugared version.
rows.collect { case Some((_, _, Failure(e))) => e }
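Putting it together, a small runnable sketch with hypothetical data matching the question's shape:

import scala.util.{Failure, Success, Try}

val rows: Array[Option[(String, String, Try[String])]] = Array(
  Some(("a", "b", Success("ok"))),
  Some(("c", "d", Failure(new RuntimeException("boom")))),
  None
)

// collect keeps only the elements whose pattern matches: Some tuples
// whose third component is a Failure.
val errors: Array[Throwable] =
  rows.collect { case Some((_, _, Failure(e))) => e }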

Future yielding with flatMap

Given Futures fa, fb, fc and a function f: Function1[(A, B, C), Future[D]], I can return a Future[D] either by:
(for {
  a <- fa
  b <- fb
  c <- fc
} yield (a, b, c)).flatMap(f)
which has the unenviable property of declaring the variables a,b,c twice.
or
fa.zip(fb).zip(fc).flatMap { case ((a, b), c) => f((a, b, c)) }
which is terser, but the nesting of the futures into pairs of pairs is weird.
It would be great to have a form of the for-expression where the yield returns a flattened result. Is there such a thing?
There's no reason to flatMap in the yield. It should be another line in the for-comprehension.
for {
  a <- fa
  b <- fb
  c <- fc
  d <- f((a, b, c))
} yield d
I don't think it can get more concise than that.
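For concreteness, a minimal runnable sketch of that shape, with hypothetical futures and a tupled f:

import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Hypothetical stand-ins for fa, fb, fc and f:
val fa = Future(1)
val fb = Future(2)
val fc = Future(3)
def f(abc: (Int, Int, Int)): Future[Int] = Future(abc._1 + abc._2 + abc._3)

val fd: Future[Int] = for {
  a <- fa
  b <- fb
  c <- fc
  d <- f((a, b, c))
} yield d

println(Await.result(fd, 1.second)) // prints 6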

Most idiomatic way to mix synchronous, asynchronous, and parallel computation in a scala for comprehension of futures

Suppose I have 4 future computations to do. The first two can be done in parallel, but the third must be done after the first two (even though the values of the first two are not used in the third -- think of each computation as a command that performs some db operation). Finally, there is a 4th computation that must occur after all of the first 3. Additionally, there is a side effect that can be started after the first 3 complete (think of this as kicking off a periodic runnable). In code, this could look like the following:
for {
  _ <- async1 // not done in parallel with async2 :( is there
  _ <- async2 // any way of achieving this cleanly inside of for?
  _ <- async3
  _ = sideEffect // do I need "=" here??
  _ <- async4
} yield ()
The comments show my doubts about the quality of the code:
What's the cleanest way to do two operations in parallel in a for comprehension?
Is there a way to achieve this result without so many "_" characters (and without assigning a named reference, at least in the case of sideEffect)?
what's the cleanest and most idiomatic way to do this?
You can use zip to combine two futures, including the result of zip itself. You'll end up with tuples holding tuples, but if you use infix notation for Tuple2 it is easy to take them apart. Below I define a synonym ~ for succinctness (this is what the parser combinator library does, except its ~ is a different class that behaves similarly to Tuple2).
As an alternative to _ = for the side effect, you can either move it into the yield, or combine it with the following statement using braces and a semicolon. I would still consider _ = more idiomatic, at least insofar as having a side-effecting statement in the for is idiomatic at all.
val ~ = Tuple2
for {
  a ~ b ~ c <- async1 zip
               async2 zip
               async3
  d <- { sideEffect; async4 }
} yield (a, b, c, d)
for-comprehensions represent monadic operations, and monadic operations are sequenced. There's a superclass of monad, applicative, where computations don't depend on the results of prior computations and thus may be run in parallel.
Scalaz has a |@| operator for combining applicatives, so you can use (future1 |@| future2)(proc(_, _)) to dispatch two futures in parallel and then run "proc" on the results of both of them, as opposed to the sequential computation of for {a <- future1; b <- future2(a)} yield b (or just future1 flatMap future2).
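For reference, a minimal sketch of that Scalaz pattern (assuming Scalaz 7 on the classpath; proc is a hypothetical function used only for illustration):

import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scalaz.syntax.apply._    // provides |@|
import scalaz.std.scalaFuture._ // Applicative/Monad instance for Future

def proc(a: Int, b: Int): Int = a + b

val future1 = Future(1)
val future2 = Future(2)

// Both futures are already running; |@| combines their results applicatively.
val combined: Future[Int] = (future1 |@| future2)(proc(_, _))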
There's already a method on stdlib Futures called .zip that combines Futures in parallel, and indeed the scalaz impl uses this: https://github.com/scalaz/scalaz/blob/scalaz-seven/core/src/main/scala/scalaz/std/Future.scala#L36
And .zip and for-comprehensions may be intermixed to have parallel and sequential parts, as appropriate.
So just using the stdlib syntax, your above example could be written as:
for {
  _ <- async1 zip async2
  _ <- async3
  _ = sideEffect
  _ <- async4
} yield ()
Alternatively, written without a for-comprehension:
async1 zip async2 flatMap (_ => async3) flatMap { _ => sideEffect; async4 }
Just as an FYI, it's really simple to get two futures to run in parallel and still process them via a for-comprehension. The suggested solutions of using zip can certainly work, but I find that when I want to handle a couple of futures and do something when they are all done, and I have two or more that are independent of each other, I do something like this:
val f1 = async1
val f2 = async2
// First two futures are now running in parallel
for {
  r1 <- f1
  r2 <- f2
  _ <- async3
  _ = sideEffect
  _ <- async4
} yield {
  ...
}
Now, the way the for comprehension is structured certainly waits on f1 before checking on the completion status of f2, but the logic behind these two futures is running at the same time. This is a little simpler than some of the suggestions but still might give you what you need.
Your code already looks well structured, apart from computing the first two futures in parallel. Taking your questions in turn:
To run operations in parallel, use helper functions; ideally, write a code generator to print out helpers for all tuple arities.
As far as I know, you do need to either name the result or assign it to _.
Example code with helpers:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object Example {
  def run: Future[Unit] = {
    for {
      (a, b, c) <- par(
        Future.successful(1),
        Future.successful(2),
        Future.successful(3)
      )
      constant = 100
      (d, e) <- par(
        Future.successful(a + 10),
        Future.successful(b + c)
      )
    } yield {
      println(constant)
      println(d)
      println(e)
    }
  }

  def par[A, B](a: Future[A], b: Future[B]): Future[(A, B)] = {
    for {
      a <- a
      b <- b
    } yield (a, b)
  }

  def par[A, B, C](a: Future[A], b: Future[B], c: Future[C]): Future[(A, B, C)] = {
    for {
      a <- a
      b <- b
      c <- c
    } yield (a, b, c)
  }
}
Example.run
Edit:
generated code for 1 to 20 futures: https://gist.github.com/nanop/c448db7ac1dfd6545967#file-parhelpers-scala
parPrinter script: https://gist.github.com/nanop/c448db7ac1dfd6545967#file-parprinter-scala

generating permutations with scalacheck

I have some generators like this:
val fooRepr = oneOf(a, b, c, d, e)
val foo = for (s <- choose(1, 5); c <- listOfN(s, fooRepr)) yield c.mkString("$")
This leads to duplicates ... I might get two a's, etc. What I really want is to generate a random permutation with exactly 0 or 1 of each of a, b, c, d, and e (with at least one of something), in any order.
I was thinking there must be an easy way, but I'm struggling to even find a hard way. :)
Edited: Ok, this seems to work:
val foo = for (s <- choose(1, 5);
               c <- permute(s, a, b, c, d, e)) yield c.mkString("$")

def permute[T](n: Int, gs: Gen[T]*): Gen[Seq[T]] = {
  val perm = Random.shuffle(gs.toList)
  for {
    is <- pick(n, 1 until gs.size)
    xs <- sequence[List, T](is.toList.map(perm(_)))
  } yield xs
}
...borrowing heavily from Gen.pick.
Thanks for your help, -Eric
Rex, thanks for clarifying exactly what I'm trying to do, and that's useful code, but perhaps not so nice with scalacheck, particularly if the generators in question are quite complex. In my particular case the generators a, b, c, etc. are generating huge strings.
Anyhow, there was a bug in my solution above; what worked for me is below. I put a tiny project demonstrating how to do this on GitHub.
The guts of it is below. If there's a better way, I'd love to know it...
package powerset

import org.scalacheck._
import org.scalacheck.Gen._
import org.scalacheck.Gen
import scala.util.Random

object PowersetPermutations extends Properties("PowersetPermutations") {
  def a: Gen[String] = value("a")
  def b: Gen[String] = value("b")
  def c: Gen[String] = value("c")
  def d: Gen[String] = value("d")
  def e: Gen[String] = value("e")

  val foo = for (s <- choose(1, 5);
                 c <- permute(s, a, b, c, d, e)) yield c.mkString

  def permute[T](n: Int, gs: Gen[T]*): Gen[Seq[T]] = {
    val perm = Random.shuffle(gs.toList)
    for {
      is <- pick(n, 0 until gs.size)
      xs <- sequence[List, T](is.toList.map(perm(_)))
    } yield xs
  }

  implicit def arbString: Arbitrary[String] = Arbitrary(foo)

  property("powerset") = Prop.forAll {
    a: String => println(a); true
  }
}
Thanks,
Eric
You're not describing a permutation, but the power set (minus the empty set). Edit: you're describing a combination of a power set and a permutation. The power set of an indexed set N is isomorphic to 2^N, so we can simply write (in Scala alone; maybe you want to alter this for use with ScalaCheck):
def powerSet[X](xs: List[X]) = {
  val xis = xs.zipWithIndex
  (for (j <- 1 until (1 << xs.length)) yield {
    for ((x, i) <- xis if (j & (1 << i)) != 0) yield x
  }).toList
}
to generate all possible subsets given a set. Of course, explicit generation of power sets is unwise if the original set contains more than a handful of elements. If you don't want to generate all of them, just pass in a random number from 1 until (1 << xs.length) and run the inner loop. (Switch to Long if there are 33-64 elements, and to BitSet if there are more yet.) You can then permute the result to switch the order around if you wish.
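A minimal sketch of that suggestion (plain Scala, names hypothetical):

import scala.util.Random

// Pick one random non-empty subset without materializing the whole
// power set (assumes xs.length <= 31 so the mask fits in an Int).
def randomSubset[X](xs: List[X]): List[X] = {
  val j = 1 + Random.nextInt((1 << xs.length) - 1) // j in 1 .. 2^n - 1
  for ((x, i) <- xs.zipWithIndex if (j & (1 << i)) != 0) yield x
}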
Edit: there's another way to do this if you can generate permutations easily and you can add a dummy argument: make your list one longer, with a Stop token. Then permute and .takeWhile(_ != Stop). Ta-da! Permutations of arbitrary length. (Filter out the zero-length answer if need be.)
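A quick sketch of that Stop-token trick (plain Scala, names hypothetical):

import scala.util.Random

sealed trait Token
case object Stop extends Token
case class Elem(s: String) extends Token

val elems = List("a", "b", "c", "d", "e").map(Elem(_))

// Shuffle with the Stop token included, then keep everything before it:
// a random permutation prefix of length 0 to 5.
val prefix: List[String] =
  Random.shuffle(Stop :: elems).takeWhile(_ != Stop).collect { case Elem(s) => s }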

Scala for comprehension efficiency?

In the book "Programming in Scala", chapter 23, the author gives an example like this:
case class Book(title: String, authors: String*)
val books: List[Book] = // list of books, omitted here
// find all authors who have published at least two books
for (b1 <- books; b2 <- books if b1 != b2;
     a1 <- b1.authors; a2 <- b2.authors if a1 == a2)
yield a1
The author says this will be translated into:
books flatMap (b1 =>
  books filter (b2 => b1 != b2) flatMap (b2 =>
    b1.authors flatMap (a1 =>
      b2.authors filter (a2 => a1 == a2) map (a2 =>
        a1))))
But if you look at the map and flatMap method definitions (TraversableLike.scala), you'll find they are themselves defined in terms of for loops:
def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That = {
  val b = bf(repr)
  b.sizeHint(this)
  for (x <- this) b += f(x)
  b.result
}

def flatMap[B, That](f: A => Traversable[B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
  val b = bf(repr)
  for (x <- this) b ++= f(x)
  b.result
}
Well, I guess this for will in turn be translated to foreach and then to a while statement, which is a construct, not an expression; Scala doesn't have a statement-level for construct, because it wants for to always yield something.
So, what I want to discuss with you is: why does Scala do this "for translation"?
The author's example used 4 generators, which will be translated into a 4-level nested loop in the end; I think this will have really horrible performance when the book list is large.
Scala encourages people to use this kind of "syntactic sugar"; you often see code that makes heavy use of filter, map and flatMap, and it seems programmers forget that what they are really doing is nesting one loop inside another, and that all this achieves is making the code look a bit shorter. What do you think?
For comprehensions are syntactic sugar for monadic transformation, and, as such, are useful in all sorts of places. At that, they are much more verbose in Scala than the equivalent Haskell construct (of course, Haskell is non-strict by default, so one can't talk about performance of the construct like in Scala).
Also important, this construct keeps what is being done clear, and avoids quickly escalating indentation or unnecessary private method nesting.
As to the final consideration, whether that hides the complexity or not, I'll posit this:
for {
  b1 <- books
  b2 <- books
  if b1 != b2
  a1 <- b1.authors
  a2 <- b2.authors
  if a1 == a2
} yield a1
It is very easy to see what is being done, and the complexity is clear: b^2 * a^2 (the filters won't alter the complexity), for b books and a authors per book. Now, write the same code in Java, either with deep indentation or with private methods, and try to ascertain, at a quick glance, what the complexity of the code is.
So, imho, this doesn't hide the complexity, but, on the contrary, makes it clear.
As for the map/flatMap/filter definitions you mention, they do not belong to List or any other class, so they won't be applied. Basically,
for(x <- List(1, 2, 3)) yield x * 2
is translated into
List(1, 2, 3) map (x => x * 2)
and that is not the same thing as
map(List(1, 2, 3), (x: Int) => x * 2)
which is how the definition you passed would be called. For the record, the actual implementation of map on List is:
def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That = {
  val b = bf(repr)
  b.sizeHint(this)
  for (x <- this) b += f(x)
  b.result
}
I write code so that it's easy to understand and maintain. I then profile. If there's a bottleneck that's where I devote my attention. If it's in something like you've described I'll attack the problem in a different manner. Until then, I love the "sugar." It saves me the trouble of writing things out or thinking hard about it.
There are actually 6 loops: one for each filter/flatMap/map call.
The filter->map pairs can be done in one loop by using lazy views of the collections (the iterator method), as sketched below.
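For illustration, a hypothetical sketch of that fusion: the strict version builds an intermediate list, while the iterator applies both steps in a single traversal:

val xs = List(1, 2, 3, 4, 5)

// Strict: filter materializes List(1, 3, 5), then map traverses it again.
val twoPasses = xs.filter(_ % 2 == 1).map(_ * 2)

// Lazy: the iterator applies predicate and function element by element,
// in one pass, with no intermediate collection.
val onePass = xs.iterator.filter(_ % 2 == 1).map(_ * 2).toList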
In general, it is running 2 nested loops over books to find all book pairs, and then two nested loops to find whether an author of one book is in the list of authors of the other.
Using simple data structures, you would do the same when coding explicitly.
And of course, the example here is to show a complex 'for' loop, not to write the most efficient code. E.g., instead of a sequence of authors, one could use a Set and then find whether the intersection is non-empty:
for (b1 <- books; b2 <- books if b1 != b2; a <- (b1.authors & b2.authors)) yield a
Note that in 2.8, the filter call was changed to withFilter which is lazy and would avoid constructing an intermediate structure. See guide to move from filter to withFilter?.
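For reference, a small sketch of what the withFilter desugaring looks like; the guard is applied lazily, element by element, with no intermediate collection:

val xs = List(1, 2, 3, 4)

// for (x <- xs if x % 2 == 0) yield x * 10
// desugars (since Scala 2.8) to:
val ys = xs.withFilter(_ % 2 == 0).map(_ * 10) // List(20, 40)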
I believe the reason that for is translated to map, flatMap and withFilter (as well as value definitions if present) is to make the use of monads easier.
In general I think if the computation you are doing involves looping 4 times, it is fine using the for loop. If the computation can be done more efficiently and performance is important then you should use the more efficient algorithm.
One follow-up to @IttayD's answer on the algorithm's efficiency. It's worth noting that the algorithm in the original post (and in the book) is a nested loop join. In practice, this isn't an efficient algorithm for large datasets, and most databases would use a hash aggregate here instead. In Scala, a hash aggregate would look something like:
(for (book <- books;
      author <- book.authors) yield (book, author)
).groupBy(_._2).filter(_._2.size > 1).keys
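A quick check of the hash-aggregate version against hypothetical data:

case class Book(title: String, authors: String*)

val books = List(
  Book("A", "smith", "jones"),
  Book("B", "jones"),
  Book("C", "lee")
)

// Group (book, author) pairs by author; authors appearing in more
// than one pair have published at least two books.
val prolific = (for (book <- books; author <- book.authors)
  yield (book, author)).groupBy(_._2).filter(_._2.size > 1).keys
// prolific contains exactly "jones"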