Scala Slick - Convert TableQuery[E] to Query[E] to chain operations

I'm trying to apply a series of optional filtering operations to a query by using a list of the operations and folding over the list.
val table = TableQuery[Fizz]
val filters = List(filter1(option1)_, filter2(option2)_, filter3(option3)_)
val filteredQuery = filters.foldLeft(table){(q, filter) => filter(q)}
The partially applied filter functions have a signature of
Query[Fizz, FizzRow, Seq] => Query[Fizz, FizzRow, Seq]
Basically, in each function, I am optionally applying the filtering if the filter parameter option* is present. However, the compiler does not like the fact that I am passing in a TableQuery to a function that takes Query, even though TableQuery is a subtype of Query. Is there a way to convert a TableQuery to Query? Or a better way to go about chaining filter functions on a query?
The compiler error in question is
type mismatch;
found   : scala.slick.lifted.Query[generated.Tables.Farm,generated.Tables.FarmRow,Seq]
required: scala.slick.lifted.TableQuery[generated.Tables.Farm]
I can get it to compile by using table.drop(0) instead of table but obviously that seems like a poor workaround. I see that there's a to method on TableQuery that converts it to a Query but it also takes an implicit ctc: TypedCollectionTypeConstructor[D].
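For reference, since TableQuery is a subtype of Query, a plain type ascription on the fold's initial value should also work, with no implicits involved - an untested sketch using the types above:

val filteredQuery =
  filters.foldLeft(table: Query[Fizz, FizzRow, Seq]) { (q, filter) => filter(q) }

The ascription makes the compiler infer the accumulator type as Query[Fizz, FizzRow, Seq] rather than TableQuery[Fizz], so each filter's result fits.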
An example of one of the filterX functions listed above:
def filterCharacteristics(characteristics: Option[List[Int]])(table: Query[Farm, FarmRow, Seq]) = {
  characteristics.map(chars =>
    (for {
      (fc, f) <- Farmcharacteristic.filter(_.characteristicId inSet chars) join table on (_.farmId === _.farmId)
    } yield f)).getOrElse(table)
}

I think you can try another approach. Instead of using a fold, you can use a collect to get only the Some values.
Then you can apply a filter to each of the options you have:
val table = TableQuery[Fizz]
val filteredQueries = List(Some(option1), Some(option2), Some(option3)) collect {
  case Some(option) => option
} map { currentOption =>
  table.filter(currentOption)
}
// We need to get the last value, or fall back to the TableQuery
val lastValue = filteredQueries.reverse.headOption
// We have either Some(query) or None; if it is None, we use table
lastValue.getOrElse(table)
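If the goal, as in the question, is to apply all the defined filters cumulatively rather than keep only the last one, a hedged variant is to flatten away the None values and fold over the Query supertype (this assumes, as table.filter(currentOption) above already does, that each option holds a predicate filter can accept):

val chainedQuery =
  List(Some(option1), Some(option2), Some(option3)).flatten
    .foldLeft(table: Query[Fizz, FizzRow, Seq])((query, pred) => query.filter(pred))

Ascribing table as Query also resolves the original type mismatch, since the accumulator is no longer inferred as TableQuery.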

Related

How to avoid NullPointerException in Scala while storing query result in variable

Here is a code that requires a change:
val activityDate = validation.select("activity_date").first.get(0).toString
When we run a job, the query might return null for 'activityDate' since there might not be any data in the db. In this case we get a NullPointerException. I need to update this code to avoid the NPE.
I tried to do it in different ways but something is always missing. I should probably use a match expression here, but I ran into errors while setting it up.
The usual way to model some kind of data that might or might not be there in Scala is the Option type. The Option type has two concrete implementations: Some for a value which is there and the None singleton to represent an absent value. Option conveniently has a constructor that wraps a nullable value and turns it into either Some(value) for non-null values or None for nulls. You can use it as follows:
Option(validation.select("activity_date").first.get(0))
You can apply transformations to it using various combinators. If you want to transform the piece of data itself into something more meaningful for your application, map is usually a good call. The following applies the logic you had before:
val activityDate: Option[String] =
  Option(validation.select("activity_date").first.get(0)).
    map { activityDate => activityDate.toString }
Note that now activityDate is an Option itself, which means that you have to explicitly handle the case in which the data is not there. You can do so with a match on the concrete type of the option as follows:
activityDate match {
  case Some(date) => // `date` is there for sure!
  case None => // handle the case where the `select` returned nothing
}
Alternatively, if you want to apply a default value, you can use the getOrElse method on the Option:
val activityDate: String =
  Option(validation.select("activity_date").first.get(0)).
    map { activityDate => activityDate.toString }.
    getOrElse("No Data")
Another possibility, which applies a default value for the None case and a function to the value in the Some case in a single step, is fold:
val activityDate: String =
  Option(validation.select("activity_date").first.get(0)).
    fold("No Data")(activityDate => activityDate.toString)
As a final note, you can shorten anonymous functions in these cases as follows:
val activityDate: String =
  Option(validation.select("activity_date").first.get(0)).
    fold("No Data")(_.toString)
Here, _ refers to the only parameter.
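One caveat with all of the snippets above, assuming validation is a Spark DataFrame: Option only guards against a null cell, while first itself throws when the query returns no rows at all. A hedged sketch that also covers the empty case by using take and headOption:

val activityDate: String =
  validation.select("activity_date")
    .take(1)                              // Array[Row]; empty when there is no data
    .headOption                           // None instead of an exception on an empty result
    .flatMap(row => Option(row.get(0)))   // guard against a null cell
    .map(_.toString)
    .getOrElse("No Data")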

Grouping by generic parameter in Slick

I am trying to implement generic grouping using Slick 3.2.3. By generic grouping I mean grouping the same query by different parameters or sets thereof.
Supposing I have a table:
class MyTable(tag: Tag) extends Table[MyEntry](tag, "my_table") {
  def text1 = column[String]("text1")
  def text2 = column[Option[String]]("text2")
  def list = column[List[String]]("list") // I am using postgres+slick_pg
  ...
}
Then I have a complex query with several joins and I would like to be able to group it by text1, (text1, text2), list etc. One way to do it would be to define a generic function which performs grouping using extractor parameter:
private def getData[T](extractor: MyTable => T) = {
  // supposing MyTable comes second in the list
  // of joined tables in my complex query
  val groupedQuery = myComplexQuery.groupBy(x => extractor(x._2))
  ...
  // here go aggregation functions, mapping etc.
}
where one of extractor implementations may be defined as
val extractor: MyTable => (Rep[String], Rep[Option[String]]) = me => me.text1 -> me.text2
However, since extractor is generic, groupBy cannot find matching Shape for T type, and it means that I will have to provide it as well. My question is how exactly to define such Shapes? Documentation for slick.lifted package lacks examples, and it is not exactly obvious what generic types K, T, G and P mean in Query#groupBy definition (or FlatShapeLevel for that matter). I would appreciate if somebody provided examples of such extractor functions at least for a primitive type (String) and a tuple2 (say, (String, Option[String])). Or perhaps there is a better way to achieve the same result which I have overlooked? Thanks.
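A hedged sketch of one way to do this: take the key's Shape as an implicit parameter of the generic method, so the concrete Shape is resolved at each call site. In Query#groupBy's terms, K is the mixed key type the extractor returns, T its unpacked value type and G its packed (grouped) type; the element shape of myComplexQuery is concrete, so only the key shape needs to be abstracted (untested against 3.2.3):

import slick.lifted.{FlatShapeLevel, Shape}

private def getData[K, T, G](extractor: MyTable => K)(
    implicit kshape: Shape[_ <: FlatShapeLevel, K, T, G]) = {
  // as in the question, MyTable is assumed to come second in the joined tuple
  myComplexQuery.groupBy(x => extractor(x._2))
}

// Both of these should then find their Shape implicitly:
// getData(_.text1)                      // group by a single column
// getData(me => (me.text1, me.text2))   // group by a tuple of columns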

How does map work on Options in Scala?

I have these two functions:
import java.util.regex.{Pattern, PatternSyntaxException}

def pattern(s: String): Option[Pattern] =
  try {
    Some(Pattern.compile(s))
  } catch {
    case e: PatternSyntaxException => None
  }
and
def mkMatcher(pat: String): Option[String => Boolean] =
  pattern(pat) map (p => (s: String) => p.matcher(s).matches)
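To see what map does here, a small usage sketch (the pattern strings are just examples):

val isHex: Option[String => Boolean] = mkMatcher("[0-9a-f]+")
isHex.map(f => f("cafe"))  // Some(true): the pattern compiled, so map ran the function
mkMatcher("[unclosed")     // None: Pattern.compile threw, so there is nothing to map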
Map is the higher-order function that applies a given function to each element of a list.
Now I don't understand how map works here according to the above statement.
Map is the higher-order function that applies a given function to each element of a list.
This is an uncommonly restrictive definition of map.
At any rate, it works because map on Option was defined by someone who did not hold to that definition.
For example, that someone wrote something akin to
sealed trait Option[+A] {
  def map[B](f: A => B): Option[B] = this match {
    case Some(value) => Some(f(value))
    case None => None
  }
}
as part of the standard library. This makes map applicable to Option[A].
It was defined because it makes sense to map many kinds of data structures not just lists.
Mapping is a transformation applied to the elements held by the data structure.
It applies a function to each element.
Option[A] can be thought of as a trivial sequence. It either has zero or one elements. To map it means to apply the function on its element if it has one.
Now it may not make much sense to use this facility all of the time, but there are cases where it is useful.
For example, it is one of a few distinct methods that, when present, enable for expressions to operate on a type. Option[A] can be used in for expressions, which can be convenient.
For example
val option: Option[Int] = Some(2)
val squared: Option[Int] = for {
  n <- option
  if n % 2 == 0
} yield n * n
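For reference, the compiler expands that for expression into calls on Option itself, roughly:

val squared: Option[Int] =
  option.withFilter(n => n % 2 == 0).map(n => n * n)  // Some(4) when option = Some(2)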
Interestingly, this implies that filter is also defined on Option[A].
If you just have a simple value it may well be clearer to use a less general construct.
Map is working the same way that it does with other collections types like List and Vector. It applies your function to the contents of the collection, potentially changing the type but keeping the collection type the same.
In many cases you can treat an Option just like a collection with either 0 or 1 elements. You can do a lot of the same operations on Option that you can on other collections.
You can transform the value (each call returns a new Option):
val opt = Option(1)
opt.map(_ + 3)
opt.map(_ * math.Pi)
opt.filter(_ == 1)
opt.collect({case i if i > 0 => i.toString })
opt.foreach(println)
and you can test the value
opt.contains(3)
opt.forall(_ > 0)
opt.exists(_ > 0)
opt.isEmpty
Using these methods you rarely need to use a match statement to unpick an Option.

Create a Spark function which accepts key, value as arguments and returns RDD[String]?

I want to create a function which can later be used by three different RDD data sets.
The function takes a key and value and converts them to a Seq[String]:
def ConvertToMap(value: RDD[(String, (String,String,String,String,String,String))]): Seq[String] = {
  value.collect().toMap.values.toSeq.map(x => x.toString.replace("(","").replace(")",""))
}
When I applied it to one data set it was fine, because that set has one key with 6 values. Example:
val StatusRDD=ConvertToMap(FilterDataSet("1013").map(x => ((x(5)+x(4)),(x(5),x(4),x(1),x(6),x(7),x(8)))))
But when I tried to apply it to another data set, I had to rewrite the function, because that data set contains 7 values with one key. This means rewriting the same logic under a different name:
def ConvertToMap2(value: RDD[(String,(String,String,String,String,String,String,String))]): Seq[String] = {
  value.collect().toMap.values.toSeq.map(x => x.toString.replace("(","").replace(")",""))
}
val LuldRDD2=ConvertToMap2(FilterDataSet("1041").map(x => ((x(5)+x(4)),(x(5),x(4),x(1),x(6),x(7),x(8),x(9)))))
Is there a way to write one function for both, accepting 6 or 7 String values with just one key? Or can I extend my function?
TupleX classes inherit from Product, so I would define the function like this:
import scala.reflect.ClassTag

// a type parameter (rather than RDD[(String, Product)]) is needed because RDD is invariant
def convertToSeq[T <: Product : ClassTag](rdd: RDD[(String, T)]): Seq[String] = {
  rdd.values.map(x => x.productIterator.mkString(",")).collect().toSeq
}
Note that TupleX classes have a productIterator that I'm using here to create the string (I found your way somewhat verbose and more difficult to read) and I'm also delaying the collect call until after converting the values, so the map operation is run in parallel.
Finally, I have changed the name of the function, since it converts to a Seq and not a Map.
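A usage sketch with the calls from the question (FilterDataSet and the column indices are the question's own; both tuple arities now go through the one function):

val statusRDD = convertToSeq(FilterDataSet("1013")
  .map(x => (x(5) + x(4), (x(5), x(4), x(1), x(6), x(7), x(8)))))
val luldRDD = convertToSeq(FilterDataSet("1041")
  .map(x => (x(5) + x(4), (x(5), x(4), x(1), x(6), x(7), x(8), x(9)))))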
Yep, got the answer: I need to use the data type Any.
def ConvertToMap(value: RDD[(String, Any)]): Seq[String] = {
  value.collect().toMap.values.toSeq.map(x => x.toString.replace("(","").replace(")",""))
}

Scala's for-comprehension `if` statements

Is it possible in Scala to specialize on the conditions inside an if within a for-comprehension? I'm thinking along the lines of:
val collection: SomeGenericCollection[Int] = ...
trait CollectionFilter
case object Even extends CollectionFilter
case object Odd extends CollectionFilter
val evenColl = for { i <- collection if(Even) } yield i
//evenColl would be a SomeGenericEvenCollection instance
val oddColl = for { i <- collection if(Odd) } yield i
//oddColl would be a SomeGenericOddCollection instance
The gist is that by yielding i, I get a new collection of a potentially different type (hence my referring to it as "specialization"), as opposed to just a filtered-down version of the same GenericCollection type.
The reason I ask is that I saw something that I couldn't figure out (an example can be found on line 33 of this ScalaQuery example). What it does is create a query for a database (i.e. SELECT ... FROM ... WHERE ...), where I would have expected it to iterate over the results of said query.
So, I think you are asking if it is possible for the if statement in a for-comprehension to change the result type. The answer is "yes, but...".
First, understand how for-comprehensions are expanded. There are questions here on Stack Overflow discussing it, and there are parameters you can pass to the compiler so it will show you what's going on.
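For example, with Scala 2 you can print the parser-level expansion directly (the command line below is illustrative):

// scala -Xprint:parser -e "for (i <- List(1, 2, 3) if i % 2 == 0) yield i * i"
// prints a tree containing something like:
//   List(1, 2, 3).withFilter(((i) => i % 2 == 0)).map(((i) => i * i))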
Anyway, this code:
val evenColl = for { i <- collection if(Even) } yield i
Is translated as:
val evenColl = collection.withFilter(i => Even).map(i => i)
So, if the withFilter method changes the collection type, it will do what you want -- in this simple case. On more complex cases, that alone won't work:
for {
  x <- xs
  y <- ys
  if cond
} yield (x, y)
is translated as
xs.flatMap(x => ys.withFilter(y => cond).map(y => (x, y)))
In which case flatMap is deciding what type will be returned. If it takes the cue from what result was returned, then it can work.
Now, on Scala Collections, withFilter doesn't change the type of the collection. You could write your own classes that would do that, however.
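A minimal sketch of that idea (all names hypothetical): give a type a withFilter that returns a different type, and the mere presence of an if guard in the for comprehension changes the result type:

case class Box[A](value: A) {
  def map[B](f: A => B): Box[B] = Box(f(value))
  def withFilter(p: A => Boolean): FilteredBox[A] = FilteredBox(value, p)
}

case class FilteredBox[A](value: A, pred: A => Boolean) {
  def map[B](f: A => B): FilteredBox[B] = FilteredBox(f(value), _ => pred(value))
}

val plain: Box[Int] = for { i <- Box(2) } yield i                     // no guard: still a Box
val special: FilteredBox[Int] = for { i <- Box(2) if i > 0 } yield i  // guard: FilteredBox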
Yes, you can - please refer to this tutorial for an easy example. The ScalaQuery example you cited is also iterating over the collection; it is then using that data to build the query.