How do you flatten a Sequence of Sequences in a DBIOAction in Slick? - scala

Hey guys, I am new to Slick. How can I flatten this sequence of sequences so that the method can return the commented-out type?
def insertIfNotExists(mapCountryStates: Map[String, Iterable[StateUtil]]): Future[Seq[Seq[StateTable]]] /*: Future[Seq[StateTable]]*/ = {
  val interaction = DBIO.sequence(mapCountryStates.toSeq.map { case (alpha2Country, statesUtil) =>
    val codes = statesUtil.map(_.alpha3Code)
    for {
      countryId <- Countries.filter(_.alpha2Code === alpha2Country).map(_.id).result.head
      existing <- States.filter(s => (s.alpha3Code inSet codes) && s.countryId === countryId).result
      stateTables = statesUtil.map(x => StateTable(0L, x.name, x.alpha3Code, countryId))
      statesInserted <- StatesInsertQuery ++= stateTables.filter(s => !existing.exists(x => x.alpha3Code == s.alpha3Code && x.countryId == s.countryId))
    } yield existing ++ statesInserted
  })
  db.run(interaction.transactionally)
}
If I write .flatten here:
val interaction = DBIO.sequence(...).flatten
or here:
db.run(interaction.flatten.transactionally)
compilation fails with:
[error] Cannot prove that Seq[Seq[StateRepository.this.StateTableMapping#TableElementType]] <:< slick.dbio.DBIOAction[R2,S2,E2].
The IDE does not flag this as an error, so the failure only shows up when the application is built and run.
Update: I rewrote my definition with DBIO.fold, as suggested in the answer below.

It looks like you might be after DBIO.fold. This provides a way to take a number of actions and reduce them down to a single value. In this case, your single value is a Seq[StateTable] built up from a Seq[Seq[StateTable]].
A sketch of how this could look might be...
def insertIfNotExists(...): DBIO[Seq[StateTable]] = {
  val interaction: Seq[DBIO[Seq[StateTable]]] = ...

  val startingPoint: Seq[StateTable] = Seq.empty

  DBIO.fold(interaction, startingPoint) {
    (total, list) => total ++ list
  }
}
It looks like the types will line up using fold. Hope it's of some use in your case.
There's some more information about fold in Chapter 4 of Essential Slick.
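For completeness, here is a hedged sketch of how the fold could slot into the original method, reusing the question's own Countries, States, StatesInsertQuery, StateUtil and StateTable definitions and assuming an implicit ExecutionContext is in scope (untested):
def insertIfNotExists(mapCountryStates: Map[String, Iterable[StateUtil]]): Future[Seq[StateTable]] = {
  // One DBIO[Seq[StateTable]] per country, exactly as in the question
  val perCountry: Seq[DBIO[Seq[StateTable]]] = mapCountryStates.toSeq.map { case (alpha2Country, statesUtil) =>
    val codes = statesUtil.map(_.alpha3Code)
    for {
      countryId <- Countries.filter(_.alpha2Code === alpha2Country).map(_.id).result.head
      existing <- States.filter(s => (s.alpha3Code inSet codes) && s.countryId === countryId).result
      stateTables = statesUtil.map(x => StateTable(0L, x.name, x.alpha3Code, countryId))
      statesInserted <- StatesInsertQuery ++= stateTables.filter(s =>
        !existing.exists(x => x.alpha3Code == s.alpha3Code && x.countryId == s.countryId))
    } yield existing ++ statesInserted
  }

  // Fold the per-country actions down to a single DBIO[Seq[StateTable]]
  val interaction: DBIO[Seq[StateTable]] =
    DBIO.fold(perCountry, Seq.empty[StateTable])((total, list) => total ++ list)

  db.run(interaction.transactionally)
}
The fold concatenates each country's Seq[StateTable] onto the accumulator, so the overall action already yields a single flat sequence.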

A viable solution is simply to flatten the sequence once the Future has completed:
def insertIfNotExists(mapCountryStates: Map[String, Iterable[StateUtil]]): Future[Seq[StateTable]] = {
  val interaction = DBIO.sequence(mapCountryStates.toSeq.map { case (alpha2Country, statesUtil) =>
    val codes = statesUtil.map(_.alpha3Code)
    for {
      countryId <- Countries.filter(_.alpha2Code === alpha2Country).map(_.id).result.head
      existing <- States.filter(s => (s.alpha3Code inSet codes) && s.countryId === countryId).result
      stateTables = statesUtil.map(x => StateTable(0L, x.name, x.alpha3Code, countryId))
      statesInserted <- StatesInsertQuery ++= stateTables.filter(s => !existing.exists(x => x.alpha3Code == s.alpha3Code && x.countryId == s.countryId))
    } yield existing ++ statesInserted
  })
  db.run(interaction.transactionally).map(_.flatten)
}
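For reference, the .flatten attempts fail because DBIOAction.flatten expects a nested action (that is what the "Cannot prove that ... <:< slick.dbio.DBIOAction[R2,S2,E2]" message is saying), not a nested collection. If you would rather flatten inside the database action instead of on the Future, the last line of the method could instead read as below (map on a DBIO needs an implicit ExecutionContext, which the surrounding code already assumes):
db.run(interaction.map(_.flatten).transactionally)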

Related

Conditionally running Slick statements in a for comprehension

Given the following Slick code:
val doUpdate = true

val table1 = TableQuery[Table1DB]
val table2 = TableQuery[Table2DB]
val table3 = TableQuery[Table3DB]

val action = (for {
  _ <- table1.filter(_.id === id).update(table1Record)
  _ <- table2.filter(_.id === id).update(table2Record)
  _ <- table3.filter(_.id === id).update(table3Record)
} yield ()).transactionally
What I need is to update table2 and table3 ONLY if the variable doUpdate is true. This is my attempt:
val action = (for {
  _ <- table1.filter(_.id === id).update(table1Record)
  _ <- table2.filter(_.id === id).update(table2Record)
  if (doUpdate == true)
  _ <- table3.filter(_.id === id).update(table3Record)
  if (doUpdate == true)
} yield ()).transactionally
This compiles, but when I run it I get the error Action.withFilter failed. How can I add the condition?
As you noticed, if you use if inside a for comprehension it calls withFilter underneath. In the case of pipelines (DBIO, Future, Task), that filtering boils down to either continuing the pipeline (the happy path) or failing it (producing a Failure or something similar).
If all you want is the update, you could go down the if road and recover the Future. However, that would be ugly, since depending on how you write the recovery case it might also swallow errors you would like to be informed about.
The alternative is to simply use if-else to provide a different DBIOAction:
(for {
  _ <- {
    if (doUpdate) conditionalAction.map(_ => ())
    else DBIOAction.successful(())
  }
} yield ()).transactionally
In your case it could look something like:
val action = (for {
  _ <- table1.filter(_.id === id).update(table1Record)
  _ <- {
    if (doUpdate) DBIOAction.seq(
      table2.filter(_.id === id).update(table2Record),
      table3.filter(_.id === id).update(table3Record)
    )
    else DBIOAction.successful(())
  }
} yield ()).transactionally
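If this pattern shows up in several places, the if-else can be pulled out into a small combinator. A minimal sketch under the same assumptions as the answer above (implicit ExecutionContext in scope); the when helper is my own, not part of Slick:
// Hypothetical helper: run the given action only when the flag is set,
// otherwise succeed immediately with a no-op unit action.
def when(flag: Boolean)(action: DBIO[Unit]): DBIO[Unit] =
  if (flag) action else DBIOAction.successful(())

val action = (for {
  _ <- table1.filter(_.id === id).update(table1Record)
  _ <- when(doUpdate) {
    DBIOAction.seq(
      table2.filter(_.id === id).update(table2Record),
      table3.filter(_.id === id).update(table3Record)
    )
  }
} yield ()).transactionally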

How to map a sequence only if a condition applies, using an immutable approach in Scala?

Given a sequence of Price objects, I want to map it with an applyPromo function if a condition (promo == "FOO") applies, and otherwise return the sequence as is.
This is my applyPromo:
val applyPromo = (price: Price) => price.copy(amount = price.amount - someDiscount)
In a mutable way I probably would write it like this:
var prices: Seq[Price] = Seq(price1, price2, ...)
  .map(doStuff)
  .map(doSomeOtherStuff)
if (promo == "FOO") {
  prices = prices.map(applyPromo)
}
prices
I was wondering if I could do something similar like this while keeping the immutable approach of scala. Instead of creating a temp var, I prefer to keep the chain.
Pseudo-code:
val prices = Seq(price1, price2, ...)
prices
  .map(doStuff)
  .map(doSomeOtherStuff)
  .mapIf(promo == "FOO", applyPromo)
I don't want to check the condition within the map function in this case, as it applies for all elements:
prices.map(price => {
  if (promo == "FOO") {
    applyDiscount(price)
  } else
    price
})
You just need to use else to make it functional (and you can create an implicit class to add the mapIf method if you prefer):
val prices: Seq[Price] = Seq(price1, price2, ...).map(doStuff).map(doSomeOtherStuff)

/* val resultPrices = */ if (promo == "FOO") {
  prices.map(price => {
    price.copy(amount = price.amount - someDiscount)
  })
} else prices
Something like this:
implicit class ConditionalMap[T](val seq: Seq[T]) extends AnyVal {
  def mapIf(cond: => Boolean, f: T => T): Seq[T] = if (cond) seq.map(f) else seq
}
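With that implicit class in scope, the chained pseudo-code from the question should work essentially as written; a small hedged usage sketch (doStuff, doSomeOtherStuff and applyPromo are the question's placeholders):
val result = Seq(price1, price2)
  .map(doStuff)
  .map(doSomeOtherStuff)
  .mapIf(promo == "FOO", applyPromo)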
You can also map(x => x) in the else case:
val discountFunction: Price => Price =
  if (promo == "FOO") {
    price => price.copy(amount = price.amount - someDiscount)
  } else {
    x => x
  }

val prices: Seq[Price] = Seq(price1, price2, ...)
  .map(doStuff)
  .map(doSomeOtherStuff)
  .map(discountFunction)
I'd do it like this:
val maybePromo: (Price => Price) =
  if (promo == "FOO") applyPromo else identity _
prices.map(maybePromo)
Or you can inline it within map itself:
prices.map(if(promo == "FOO") applyPromo else identity)
In scalaz, a function A => A is called an endomorphism and is a Monoid whose associative binary operation is function composition and whose identity is the identity function. This is useful because there is a bunch of syntax available where monoids are concerned. For example, scalaz adds the ?? operation to boolean along these lines:
def ??[A: Monoid](a: A) = if (self) a else Monoid[A].zero
Thus:
prices
  .map(doStuff)
  .map(doSomeOtherStuff)
  .map(((promo === "FOO") ?? deductDiscount).run)
Where:
val deductDiscount: Endo[Price] = Endo(px => px.copy(amount = px.amount - someDiscount))
The above all requires
import scalaz._
import Scalaz._
Notes
=== is typesafe equals syntax
?? is boolean syntax
oxbow_lakes has an interesting answer
An easy way, to me, is to wrap the Seq in an Option context.
scala> case class Price(amount: Double)
defined class Price
When the condition matches:
scala> val promo = "FOO"
promo: String = FOO
scala> Some(Seq(Price(1), Price(2), Price(3))).collect{
case prices if promo == "FOO" => prices.map { p => p.copy(p.amount - 1 )}
case prices => prices}
res6: Option[Seq[Price]] = Some(List(Price(0.0), Price(1.0), Price(2.0)))
When the condition does not match:
scala> val promo = "NOT-FOO"
promo: String = NOT-FOO
scala> Some(Seq(Price(1), Price(2), Price(3))).collect{
case prices if promo == "FOO" => prices.map { p => p.copy(p.amount - 1 )}
case prices => prices}
res7: Option[Seq[Price]] = Some(List(Price(1.0), Price(2.0), Price(3.0)))

Combine conditions inside flatMap and return a result

val date2 = Option(LocalDate.parse("2017-02-01"))
// date1.compareTo(date2) >= 0

case class dummy(prop: Seq[Test])
case class Test(s: String)
case class Result(s: String)

val s = "11,22,33"
val t = Test(s)
val dt = Test("2017-02-06")
val list = dummy(Seq(t))
val list2 = dummy(Seq(dt))
val code = Option("22")

val f = date2.flatMap(c => list2.prop
    .find(d => LocalDate.parse(d.s).compareTo(c) >= 0))
  .map(_ => Result("Found"))
  .getOrElse(Result("Not Found"))

code.flatMap(c => list.prop
    .find(_.s.split(",").contains(c)))
  .map(_ => Result("Found"))
  .getOrElse(Result("Not Found"))
I want to && the two conditions below and return Result("Found") / Result("Not Found"):
d => LocalDate.parse(d.s).compareTo(c) >= 0
_.s.split(",").contains(c)
Is there any possible way to achieve this? In the actual scenario, list and list2 are Futures.
I tried to make a more realistic example based on Futures. Here is how I would do it:
// Imports assumed for a self-contained example:
import java.time.LocalDate
import scala.concurrent.{Await, Future}
import scala.concurrent.duration.Duration
import scala.concurrent.ExecutionContext.Implicits.global

val date2 = Option(LocalDate.parse("2017-02-01"))

case class Test(s: String)
case class Result(s: String)

val t = Test("11,22,33")
val dt = Test("2017-02-06")
val code = Option("22")

val f1 = Future(Seq(t))
val f2 = Future(Seq(dt))

// Wait for both futures to finish
val futureResult = Future.sequence(Seq(f1, f2)).map {
  case Seq(s1, s2) =>
    // Check the first part, this will be a Boolean
    val firstPart = code.nonEmpty && s1.exists(_.s.split(",").contains(code.get))
    // Check the second part, also a Boolean
    val secondPart = date2.nonEmpty && s2.exists(d => LocalDate.parse(d.s).compareTo(date2.get) >= 0)
    // Do the AND logic you wanted
    if (firstPart && secondPart) {
      Result("Found")
    } else {
      Result("Not Found")
    }
}

// This is just for testing, to see that we got the correct result
val result = Await.result(futureResult, Duration.Inf)
println(result)
As an aside, the code and date2 values in your example are Options. If this is also true in your production code, then we should first check whether they are both defined; if either is missing, there is no need to run the rest of the code:
val futureResult = if (date2.isEmpty || code.isEmpty) {
  Future.successful(Result("Not Found"))
} else {
  Future.sequence(Seq(f1, f2)).map {
    case Seq(s1, s2) =>
      val firstPart = s1.exists(_.s.split(",").contains(code.get))
      val secondPart = s2.exists(d => LocalDate.parse(d.s).compareTo(date2.get) >= 0)
      if (firstPart && secondPart) {
        Result("Found")
      } else {
        Result("Not Found")
      }
  }
}
Use pattern matching on the Option instead of using flatMap, e.g.:
val x = Some("20")
x match {
  case Some(i) => println(i) // do whatever you want with the value, then return a Result
  case None => Result("Not Found")
}
Looking at what you are trying to do, you would have to use pattern matching twice, nested.
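A minimal sketch of that check, collapsed here into a single match on a tuple of the two Options and reusing the question's list, list2, code and date2 values:
val result: Result = (code, date2) match {
  case (Some(c), Some(d)) =>
    // Both Options are defined, so the two conditions can be &&-ed directly.
    val codeMatches = list.prop.exists(_.s.split(",").contains(c))
    val dateMatches = list2.prop.exists(test => LocalDate.parse(test.s).compareTo(d) >= 0)
    if (codeMatches && dateMatches) Result("Found") else Result("Not Found")
  case _ =>
    // At least one of code / date2 is missing.
    Result("Not Found")
}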

Tuple seen as Product, compiler rejects reference to element

Constructing phoneVector:
val phoneVector = (
  for (i <- 1 until 20) yield {
    val p = killNS(r.get("Phone %d - Value" format (i)))
    val t = killNS(r.get("Phone %d - Type" format (i)))
    if (p == None) None
    else if (t == None) (p, "Main") else (p, t)
  }
).filter(_ != None)
Consider this very simple snippet:
for (pTuple <- phoneVector) {
  println(pTuple.getClass.getName)
  println(pTuple)
  //val pKey = pTuple._1.replaceAll("[^\\d]","")
  associate() // stub prints "associate"
}
When I run it, I see output like this:
scala.Tuple2
((609) 954-3815,Mobile)
associate
When I uncomment the line with replaceAll(), compilation fails:
....scala:57: value _1 is not a member of Product with Serializable
[error] val pKey = pTuple._1.replaceAll("[^\\d]","")
[error] ^
Why does it not recognize pTuple as a Tuple2, and instead treat it only as a Product?
OK, this compiles and produces the desired result, but it's too verbose. Can someone please demonstrate a more concise solution for dealing with this type-safety issue?
for (pTuple <- phoneVector) {
  println(pTuple.getClass.getName)
  println(pTuple)
  val pPhone = pTuple match {
    case t: Tuple2[_, _] => t._1
    case _ => None
  }
  val pKey = pPhone match {
    case s: String => s.replaceAll("[^\\d]", "")
    case _ => None
  }
  println(pKey)
  associate()
}
You can do:
for (pTuple <- phoneVector) {
  val pPhone = pTuple match {
    case (key, value) => key
    case _ => None
  }
  val pKey = pPhone match {
    case s: String => s.replaceAll("[^\\d]", "")
    case _ => None
  }
  println(pKey)
  associate()
}
Or simply phoneVector.map(_._1.replaceAll("[^\\d]",""))
By changing the construction of phoneVector, as wrick's question implied, I've been able to eliminate the match/case stuff because a Tuple2 is now assured (the original yield mixed None with tuples, so the element type was inferred as their common supertype, Product with Serializable). Not thrilled by it, but change is hard, and Scala seems cool.
Now, it's still possible to slip a None value into either of the Tuple values. My match/case does not check for that, and I suspect that could lead to a runtime error in the replaceAll call. How is that allowed?
def killNS(s: Option[_]) = {
  (s match {
    case _: Some[_] => s.get
    case _ => None
  }) match {
    case None => None
    case "" => None
    case s => s
  }
}

val phoneVector = (
  for (i <- 1 until 20) yield {
    val p = killNS(r.get("Phone %d - Value" format (i)))
    val t = killNS(r.get("Phone %d - Type" format (i)))
    if (t == None) (p, "Main") else (p, t)
  }
).filter(_._1 != None)
println(phoneVector)
println(name)
println
// Create the Neo4j nodes:
for (pTuple <- phoneVector) {
  val pPhone = pTuple._1 match { case p: String => p }
  val pType = pTuple._2
  val pKey = pPhone.replaceAll(",.*", "").replaceAll("[^\\d]", "")
  associate(Map("target" -> Map("label" -> "Phone", "key" -> pKey,
                                "dial" -> pPhone),
                "relation" -> Map("label" -> "IS_AT", "key" -> pType),
                "source" -> Map("label" -> "Person", "name" -> name)))
}
} // closes an enclosing block not shown in the snippet
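As for the follow-up about None still being able to sneak into the tuple: one way to rule that out at the type level is to keep everything in Option and flatMap, so that phoneVector ends up as a plain collection of string pairs. A hedged sketch, assuming r.get yields an Option[String] and reusing associate and name from the code above:
// Hypothetical rework (not the original code): killNS keeps the Option,
// treating an empty string as absent.
def killNS(s: Option[String]): Option[String] = s.filter(_.nonEmpty)

val phoneVector: IndexedSeq[(String, String)] =
  (1 until 20).flatMap { i =>
    val p = killNS(r.get("Phone %d - Value" format i))
    val t = killNS(r.get("Phone %d - Type" format i))
    p.map(phone => (phone, t.getOrElse("Main"))) // entries with no phone value are dropped
  }

// Both tuple elements are plain Strings now, so no match/case is needed:
for ((pPhone, pType) <- phoneVector) {
  val pKey = pPhone.replaceAll(",.*", "").replaceAll("[^\\d]", "")
  associate(Map(
    "target" -> Map("label" -> "Phone", "key" -> pKey, "dial" -> pPhone),
    "relation" -> Map("label" -> "IS_AT", "key" -> pType),
    "source" -> Map("label" -> "Person", "name" -> name)
  ))
}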

Apache Spark: dealing with Option/Some/None in RDDs

I'm mapping over an HBase table, generating one RDD element per HBase row. However, sometimes the row has bad data (throwing a NullPointerException in the parsing code), in which case I just want to skip it.
I have my initial mapper return an Option to indicate that it returns 0 or 1 elements, then filter for Some, then get the contained value:
// myRDD is RDD[(ImmutableBytesWritable, Result)]
val output = myRDD.
  map( tuple => getData(tuple._2) ).
  filter( { case Some(y) => true; case None => false } ).
  map( _.get ).
  // ... more RDD operations with the good data
def getData(r: Result) = {
  val key = r.getRow
  var id = "(unk)"
  var x = -1L
  try {
    id = Bytes.toString(key, 0, 11)
    x = Long.MaxValue - Bytes.toLong(key, 11)
    // ... more code that might throw exceptions
    Some( ( id, ( List(x),
      // more stuff ...
    ) ) )
  } catch {
    case e: NullPointerException => {
      logWarning("Skipping id=" + id + ", x=" + x + "; \n" + e)
      None
    }
  }
}
Is there a more idiomatic way to do this that's shorter? I feel like this looks pretty messy, both in getData() and in the map.filter.map dance I'm doing.
Perhaps a flatMap could work (generate 0 or 1 items in a Seq), but I don't want it to flatten the tuples I'm creating in the map function, just eliminate empties.
An alternative, and often overlooked, way would be to use collect(pf: PartialFunction), which is meant to 'select' or 'collect' the specific elements of the RDD that the partial function is defined at.
The code would look like this:
import scala.util.{Success, Try}

def getData(r: Result): Try[(String, List[Long])] = Try {
  val key = r.getRow
  val id = Bytes.toString(key, 0, 11)
  val x = Long.MaxValue - Bytes.toLong(key, 11)
  (id, List(x))
}

val output = myRDD.map(tuple => getData(tuple._2)).collect { case Success(data) => data }
If you change your getData to return a scala.util.Try then you can simplify your transformations considerably. Something like this could work:
def getData(r: Result) = {
  val key = r.getRow
  var id = "(unk)"
  var x = -1L
  val tr = util.Try {
    id = Bytes.toString(key, 0, 11)
    x = Long.MaxValue - Bytes.toLong(key, 11)
    // ... more code that might throw exceptions
    ( id, ( List(x)
      // more stuff ...
    ) )
  }
  tr.failed.foreach(e => logWarning("Skipping id=" + id + ", x=" + x + "; \n" + e))
  tr
}
Then your transform could start like so:
myRDD.
flatMap(tuple => getData(tuple._2).toOption)
If your Try is a Failure, it will be turned into a None via toOption and then removed as part of the flatMap logic. At that point, the next step in your transform will only be working with the successful cases, i.e. with whatever underlying type getData returns, without the Option wrapping.
If you are ok with dropping the data then you can just use mapPartitions. Here is a sample:
import scala.util._
val mixedData = sc.parallelize(List(1, 2, 3, 4, 0))
mixedData.mapPartitions(x => {
  val foo = for (y <- x)
    yield {
      Try(1 / y)
    }
  for {
    goodVals <- foo.partition(_.isSuccess)._1
  } yield goodVals.get
})
If you want to see the bad values, then you can use an accumulator or just log as you have been.
Your code would look something like this:
val output = myRDD.
  mapPartitions( tupleIter => getCleanData(tupleIter) )
  // ... more RDD operations with the good data

def getCleanData(iter: Iterator[???]) = {
  val triedData = getDataInTry(iter)
  for {
    goodVals <- triedData.partition(_.isSuccess)._1
  } yield goodVals.get
}

def getDataInTry(iter: Iterator[???]) = {
  for (r <- iter) yield {
    Try {
      val key = r._2.getRow
      var id = "(unk)"
      var x = -1L
      id = Bytes.toString(key, 0, 11)
      x = Long.MaxValue - Bytes.toLong(key, 11)
      // ... more code that might throw exceptions
    }
  }
}