Given a Field and a Record, how to obtain the value of that Field within that Record? - lift

I have something like:
val jobs = Job.where(...).fetch()
val fieldsToDisplay = Seq(Job.status, Job._id, ...)
val header = fieldsToDisplay map { _.name }
val tbl = jobs map { j => fieldsToDisplay map { _.getValueIn(j) } }
renderTable(header, tbl)
...and it's that hypothetical getValueIn I'm looking for.
I've been unable to find anything, but perhaps more experienced Lift'ers know a trick.

Each Field has a name that is unique within the Record, so you can look the field up by name:
jobs map { j =>
  fieldsToDisplay map { f =>
    j.fieldByName(f.name)
  }
}
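For display purposes you usually want the field's value rather than the field itself, and fieldByName gives you the field wrapped in a Box. A minimal sketch of unwrapping it (the value accessor is assumed to be get here; depending on your Lift version it may be value, and the empty-string fallback is only illustrative):
jobs map { j =>
  fieldsToDisplay map { f =>
    j.fieldByName(f.name)   // Box of the field
      .map(_.get)           // assumed accessor for the field's value
      .openOr("")           // fall back when the Record has no such field
  }
}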

Scala future with unknown loop count

The case is to get the list of child comments for a comment, and it is easy to get all the child comments via a 'parentId' field in the database. However, taking the possibly long list into consideration, it's better to paginate the db query. We can get a specific range of rows like this:
def getChildCommentsByParentId(parentId: Int, offset: Int, rows: Int): Future[Seq[CommentsRow]] = {
  val query = Comments.filter(row => row.iparentid === parentId && row.istatus === statusNormal).drop(offset).take(rows)
  db.run(query.result)
}
To get all the rows by making several pagination calls with a fixed page size, it's easy to express this iteratively in other programming languages, for example:
$replyList = [];
$page = 0;
do {
    $ret = getChildCommentsByParentId($parentId, $page * $pageSize, $pageSize);
    if (!empty($ret)) {
        $replyList = array_merge($replyList, $ret);
    }
    if (count($ret) < $pageSize) {
        break;
    }
    $page++;
} while (true);
It's reasonable to use a for comprehension to handle multiple futures in Scala, but I don't know how to handle this case. The key point is that the current iteration depends on the previous one (the 'page' parameter), and when the count of rows is less than pageSize, it should return a full result with all the previously fetched rows.
Thanks!
I tried to solve it in a recursive way:
def getChildComments(nextPage: Int, commentId: Int): Future[CommentResult] = {
  val pageSize = 5
  val currentPageFuture = commentModel.getChildCommentsByParentId(commentId, nextPage * pageSize, pageSize).map {
    data => CommentResult(nextPage, data)
  }
  val nextPageFuture = currentPageFuture.flatMap { commentRes =>
    val nextPage = commentRes.nextPage + 1
    if (commentRes.data.size >= pageSize) {
      getChildComments(nextPage, commentId)
    } else {
      Future.successful(CommentResult(nextPage, Seq()))
    }
  }
  for {
    currentPageComments <- currentPageFuture
    nextPageComments    <- nextPageFuture
  } yield {
    CommentResult(currentPageComments.nextPage, currentPageComments.data ++ nextPageComments.data)
  }
}
case class CommentResult(nextPage: Int, data: Seq[TbcommentsRow])
But it still feels ugly. Is there a more elegant way?
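For what it's worth, one common way to tidy this up is to thread an accumulator through the recursion, so each step issues exactly one query and there is no separate currentPageFuture/nextPageFuture pair to recombine afterwards. A minimal sketch, assuming the same commentModel.getChildCommentsByParentId signature as above and an implicit ExecutionContext in scope:
import scala.concurrent.{ExecutionContext, Future}

def getAllChildComments(commentId: Int, page: Int = 0, acc: Seq[TbcommentsRow] = Seq.empty)
                       (implicit ec: ExecutionContext): Future[Seq[TbcommentsRow]] = {
  val pageSize = 5
  commentModel.getChildCommentsByParentId(commentId, page * pageSize, pageSize).flatMap { rows =>
    if (rows.size < pageSize) Future.successful(acc ++ rows)   // short page: we are done
    else getAllChildComments(commentId, page + 1, acc ++ rows) // full page: fetch the next one
  }
}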

tuple result in slick 3 DBIO

There is a table with key/value pairs.
There is another table with an auto-incremented PK.
Take the value for the key from the first table; if it is not present, insert a default and return the defaulted value.
Query the other table based on it.
Return the filtered result and the value.
So I could try:
def lastLogs(limit: Long = 666): Future[(Long, Seq[VLogEntry])] = {
  val q: DBIO[(Long, Seq[VLogEntry])] = {
    for {
      existing <- kvTable.filter(_.key === "log").result.headOption
      conf = existing getOrElse KV(key = "log", value = "0")
      last = conf.value.toLong
      rds = vLogTable.filter(_.id > last).take(limit).result
      _ <- kvTable.insertOrUpdate(conf)
    } yield {
      (last, rds)
    }
  }
  db.run(q)
}
This gives a compile error:
found : HipDAO.this.domain.dbConfig.profile.StreamingProfileAction[Seq[HipDAO.this.domain.VLogTable ...
required: Seq[db.types.Types.VLogEntry]
In Slick 2 I could call list or result on queries in a session.
How do I do this in Slick 3?
Going through the iterations of what could be done I came to:
def lastLogs(limit: Long = 666): Future[(Long, Seq[VLogEntry])] = {
  val q: DBIO[(Long, Seq[VLogEntry])] = {
    for {
      existing <- kvTable.filter(_.key === "log").result.headOption
      conf = existing getOrElse KV(key = "log", value = "0")
      _ <- kvTable.insertOrUpdate(conf)
      last = conf.value.toLong
      rds <- vLogTable.filter(_.id > last).take(limit).result
    } yield {
      (last, rds)
    }
  }
  db.run(q)
}
Since the difference is only two characters, this seems like an easy fix.
The internet is devoid of information on what StreamingProfileAction is and how to read these messages.
Some insight might come from reading Essential Slick 3.
Eventually, the way I saw it: it's a monad, so you are supposed to flatMap it (bind the result with <- instead of =).
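Concretely, the two-character fix is replacing = with <- on the rds line, which is exactly the flatMap step. The comments below sketch the types, assuming the same table definitions as above:
// rds  = vLogTable.filter(_.id > last).take(limit).result
//   binds the un-run action itself: rds is a DBIO[Seq[VLogEntry]] (the StreamingProfileAction in the error)
// rds <- vLogTable.filter(_.id > last).take(limit).result
//   flatMaps the action into the composed DBIO: rds is the materialized Seq[VLogEntry]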

Scala broadcast join with "one to many" relationship

I am fairly new to Scala and RDDs.
I have a very simple scenario yet it seems very hard to implement with RDDs.
Scenario:
I have two tables. One large and one small. I broadcast the smaller table.
I then want to join the tables and finally aggregate the values after the join into a final total.
Here is an example of the code:
val bigRDD = sc.parallelize(List(("A",1,"1Jan2000"),("B",2,"1Jan2000"),("C",3,"1Jan2000"),("D",3,"1Jan2000"),("E",3,"1Jan2000")))
val smallRDD = sc.parallelize(List(("A","Fruit","Apples"),("A","ZipCode","1234"),("B","Fruit","Apples"),("B","ZipCode","456")))
val broadcastVar = sc.broadcast(
  smallRDD.keyBy { a => (a._1, a._2) } // turn to pair RDD
          .collectAsMap()              // collect as Map
)
//first join
val joinedRDD = bigRDD.map( accs => {
  //get list of groups
  val groups = List("Fruit", "ZipCode")
  val i = "Fruit"
  //for each group
  //for(i <- groups) {
    if (broadcastVar.value.get(accs._1, i) != None) {
      ( broadcastVar.value.get(accs._1, i).get._1,
        broadcastVar.value.get(accs._1, i).get._2,
        accs._2, accs._3)
    } else {
      None
    }
  //}
})
//expected after this
//("A","Fruit","Apples",1, "1Jan2000"),("B","Fruit","Apples",2, "1Jan2000"),
//("A","ZipCode","1234", 1,"1Jan2000"),("B","ZipCode","456", 2,"1Jan2000")
//then group and sum
//cannot do anything with the joinedRDD!!!
//error == value copy is not a member of Product with Serializable
// Final Expected Result
//("Fruit","Apples",3, "1Jan2000"),("ZipCode","1234", 1,"1Jan2000"),("ZipCode","456", 2,"1Jan2000")
My questions:
Is this the best approach first of all with RDDs?
Disclaimer - I have done this whole task using dataframes successfully. The idea is to create another version using only RDDs to compare performance.
Why is the type of my joinedRDD not recognised after it was created so that I can continue to use functions like copy on it?
How can I get away with not doing a .collectAsMap() when broadcasting the variable? I currently have to include the first two items to enforce uniqueness and avoid dropping any values.
Thanks for the help in advance!
Final solution for anyone interested:
case class dt(group: String, group_key: String, count: Long, date: String)

val bigRDD = sc.parallelize(List(("A",1,"1Jan2000"),("B",2,"1Jan2000"),("C",3,"1Jan2000"),("D",3,"1Jan2000"),("E",3,"1Jan2000")))
val smallRDD = sc.parallelize(List(("A","Fruit","Apples"),("A","ZipCode","1234"),("B","Fruit","Apples"),("B","ZipCode","456")))

val broadcastVar = sc.broadcast(
  smallRDD.keyBy { a => a._1 } // turn to pair RDD
          .groupByKey()        // to not lose any data
          .collectAsMap()      // collect as Map
)

//first join
val joinedRDD = bigRDD.flatMap( accs => {
  if (broadcastVar.value.get(accs._1) != None) {
    val bc = broadcastVar.value.get(accs._1).get
    bc.map(p => {
      dt(p._2, p._3, accs._2, accs._3)
    })
  } else {
    None
  }
})

//expected after this
//("Fruit","Apples",1, "1Jan2000"),("Fruit","Apples",2, "1Jan2000"),
//("ZipCode","1234", 1,"1Jan2000"),("ZipCode","456", 2,"1Jan2000")

//then group and sum
val finalRDD = joinedRDD.map(s => {
    (s.copy(count = 0), s.count) //trick to keep code to minimum (count = 0)
  })
  .reduceByKey(_ + _)
  .map(pair => {
    pair._1.copy(count = pair._2)
  })
In your map statement you return either a tuple or None based on the if condition. These types do not match, so you fall back to a common supertype and joinedRDD becomes an RDD[Product with Serializable], which is not what you want at all (it's basically RDD[Any]). You need to make sure all paths return the same type. In this case, you probably want an Option[(String, String, Int, String)]. All you need to do is wrap the tuple result in a Some:
if (broadcastVar.value.get(accs._1, i) != None) {
  Some(( broadcastVar.value.get(accs._1, i).get._1,
         broadcastVar.value.get(accs._1, i).get._2,
         accs._2, accs._3))
} else {
  None
}
And now your types will match up. This will make joinedRDD an RDD[Option[(String, String, Int, String)]]. Now that the type is correct the data is usable; however, it means that you will need to map over the Option to work with the tuples. If you don't need the None values in the final result, you can use flatMap instead of map to create joinedRDD, which will unwrap the Options for you, filtering out all the Nones.
collectAsMap is the correct way to turn an RDD into a HashMap, but here you need multiple values for a single key. Before using collectAsMap, but after mapping the smallRDD into a key/value pair, use groupByKey to group all of the values for a single key together. Then, when you look up a key from your HashMap, you can map over the values, creating a new record for each one.
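A stripped-down sketch of that lookup pattern (the val names here are illustrative, and it assumes the same small/big RDD shapes as above):
// Group the small RDD's rows per id before collecting, so no duplicates are lost,
// then flatMap the lookup so each matching row yields one output record.
val lookup = sc.broadcast(
  smallRDD.keyBy(_._1)   // key by id only
          .groupByKey()  // Iterable of all (id, group, value) rows per id
          .collectAsMap()
)
val joined = bigRDD.flatMap { case (id, count, date) =>
  lookup.value.getOrElse(id, Iterable.empty).map { case (_, group, value) =>
    (group, value, count, date)
  }
}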

Do Aggregation with Slick

My database structure looks like this:
id | content
I want to get the entry with the max id (not just the id).
I read the answer How to make aggregations with slick, but I found there is no first method for the statement Query(Coffees.map(_.price).max).first. How do I do that now?
What if I need the content of the item with the max id?
To retrieve another column, you could do something like the following. The below example calculates the max of one column, finds the row with that maximum value, and returns the value of another column in that row:
val coffees = TableQuery[Coffees]
val mostExpensiveCoffeeQuery =
  for {
    maxPrice <- coffees.map(_.price).max.result
    c <- maxPrice match {
      case Some(p) => coffees.filter(_.price === p).result
      case None    => DBIO.successful(Seq())
    }
  } yield c.headOption.map(_.name)
val mostExpensiveCoffee = db.run(mostExpensiveCoffeeQuery)
// Future[Option[String]]
Alternatively, to return a full Coffees object:
val mostExpensiveCoffeeQuery =
  for {
    ...
  } yield c.headOption
val mostExpensiveCoffee = db.run(mostExpensiveCoffeeQuery)
// Future[Option[Coffees]]
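Another option worth mentioning (a sketch against the same Coffees table, not part of the original answer) is to skip the explicit max and sort descending instead, taking the first row:
val mostExpensiveCoffeeQuery =
  coffees.sortBy(_.price.desc).take(1).result.headOption

val mostExpensiveCoffee = db.run(mostExpensiveCoffeeQuery)
// Future[Option[...]] -- the row type of Coffees, as in the version above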

How to filter FrequentItemset?

I have a beginner question about filtering a FreqItemset with Scala.
The code starts out straight from the book:
import org.apache.spark.mllib.fpm.FPGrowth
import org.apache.spark.rdd.RDD
val transactions: RDD[Array[String]] = data.map(s => s.trim.split(','))
val fpg = new FPGrowth()
  .setMinSupport(0.04)
  .setNumPartitions(10)
val model = fpg.run(transactions)
Now I'd like to filter out items that start with 'aaa', for example "aaa_ccc", from the result.
I tried :
val filtered_result = model.freqItemsets.itemset.filter{ item => startwith("aaa")}
and
val filtered_result = model.freqItemsets.filter( itemset.items => startwith("aaa"))
and
val filtered_result = model.freqItemsets.filter( itemset => items.startwith("aaa"))
What did I do wrong?
None of the suggested snippets would compile, so I am not sure whether the problem you are talking about is that it does not compile or that you get wrong results.
Scala collections can be filtered using the filter method and passing a function as a parameter, which can be written in a couple of ways:
coll filter { item => filterLogic(item) }
so to filter freqItemsets you would use something like:
model.freqItemsets filter { itemSet => filterLogic(itemSet) }
If you wanted to filter all freqItemsets that contain at least one string that starts with "aaa":
model.freqItemsets filter { itemSet => itemSet.items exists { item => item.startsWith("aaa") } }
Or if your goal was to filter items inside the freqItemset then:
model.freqItemsets map { itemSet => itemSet.copy(items = itemSet.items filter { _.startsWith("aaa") }) }
Note that in Scala you can use:
someCollection filter { item => item.startsWith("string") } which is the same as: someCollection filter { _.startsWith("string") }
Hope this helps.
items is an Array[String]. If you want to keep any itemset which contains an item that starts with aaa, you'll need something like this:
model.freqItemsets.filter(_.items.exists(_.startsWith("aaa")))
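And if by "filter out" you mean removing those items from each itemset (rather than selecting itemsets that contain them), here is a sketch using filterNot; it constructs a new FreqItemset rather than relying on copy, since FreqItemset may not provide one:
val withoutAaaItems = model.freqItemsets.map { itemSet =>
  // keep the original frequency, drop only the items starting with "aaa"
  new FPGrowth.FreqItemset(itemSet.items.filterNot(_.startsWith("aaa")), itemSet.freq)
}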