Do Aggregation with Slick - scala

My database structure looks like this:
id | content
I want to get the entry with the max id (the whole entry, not just the id).
I read the answer to "How to make aggregations with slick", but I found there is no first method in the statement Query(Coffees.map(_.price).max).first. How do I do that now?
What if I need the content of the item with the max id?

To retrieve another column, you could do something like the following. The example below calculates the max of one column, finds the rows with that maximum value, and returns the value of another column from the first such row:
val coffees = TableQuery[Coffees]
val mostExpensiveCoffeeQuery =
  for {
    maxPrice <- coffees.map(_.price).max.result
    c <- maxPrice match {
      case Some(p) => coffees.filter(_.price === p).result
      case None => DBIO.successful(Seq())
    }
  } yield c.headOption.map(_.name)
val mostExpensiveCoffee = db.run(mostExpensiveCoffeeQuery)
// Future[Option[String]]
Alternatively, to return a full Coffees object:
val mostExpensiveCoffeeQuery =
  for {
    ...
  } yield c.headOption
val mostExpensiveCoffee = db.run(mostExpensiveCoffeeQuery)
// Future[Option[Coffees]]

Related

Grouping duplicates in a csv file

I have a CSV file which I've parsed into a list of case class instances.
The CSV file looks like this:
"user_id","age","liked_ad","location"
2145,34,true,USA
6786,25,true,UK
9025,21,false,USA
1145,40,false,UK
It goes on. Ultimately I am trying to find the user_ids with the most liked_ads (true values). I know that there are duplicates within the CSV file because I did this:
val origFile = processCSV("src/main/resources/advert-data.csv")
val origFileLength = origFile.length
val uniqueList = origFile.distinct
val uniqueListLength = uniqueList.length
The two lengths were different. I am thinking I need to group the entries by user_id, so that all the entries of the same user_id are in one group where I can then count how many 'trues' are in that user's entries. I am completely stuck on the right way to go about this.
This is my processCSV function at the moment:
final case class AdvertInfo(userId: Int, age: Int, likedAd: Boolean, location: String)
def processCSV(file: String): List[AdvertInfo] = {
  val data = io.Source.fromFile(file)
  data
    .getLines()
    .map(_.split(',').iterator.map(_.trim).toList)
    .flatMap {
      case userIdRaw :: ageRaw :: likedAdRaw :: locationRaw :: Nil =>
        for {
          userId <- userIdRaw.toIntOption
          age <- ageRaw.toIntOption
          likedAd <- likedAdRaw.toBooleanOption
          location <- Some(locationRaw)
        } yield AdvertInfo(userId, age, likedAd, location)
      case _ =>
        None
    }.toList
}
Your description is a bit confusing but I think what you want is:
origFile.filter(_.likedAd)
  .groupMapReduce(_.userId)(_ => 1)(_ + _) // Scala 2.13.x
The result is a Map with the user_ids as the keys and the count of all the liked_ad=="true" as the values.
From there you can .toList.sortBy(-_._2) in order to get the ranking-by-liked-count.
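As a minimal sketch of the whole pipeline, assuming Scala 2.13 or later and a small hand-made list in place of the parsed file (the sample rows and the names sample, likesPerUser and ranking are made up for illustration):

```scala
// AdvertInfo mirrors the case class from the question.
final case class AdvertInfo(userId: Int, age: Int, likedAd: Boolean, location: String)

// Hypothetical sample data standing in for processCSV(...); note the duplicate row.
val sample: List[AdvertInfo] = List(
  AdvertInfo(2145, 34, true, "USA"),
  AdvertInfo(6786, 25, true, "UK"),
  AdvertInfo(2145, 34, true, "USA"),
  AdvertInfo(9025, 21, false, "USA"),
  AdvertInfo(6786, 25, false, "UK")
)

// Keep only liked ads, then count them per user id.
val likesPerUser: Map[Int, Int] =
  sample.filter(_.likedAd).groupMapReduce(_.userId)(_ => 1)(_ + _)

// Rank users by like count, highest first.
val ranking: List[(Int, Int)] = likesPerUser.toList.sortBy(-_._2)

println(ranking)
```

Note that identical rows are counted once each here. If exact duplicates should only count once, calling .distinct on the list before the filter is one option.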

Passing result of one DBIO into another

I'm new to Slick and I am trying to rewrite the following two queries to work in one transaction. My goal is to
1. check if the element exists
2. return the existing element, or create it, handling the autoincrement id from MySQL
The two functions are:
def createEmail(email: String): DBIO[Email] = {
  // We create a projection of just the email column, since we're not inserting a value for the id column
  (emails.map(p => p.email)
    returning emails.map(_.id)
    into ((email, id) => Email(id, email))
  ) += email
}
def findEmail(email: String): DBIO[Option[Email]] =
  emails.filter(_.email === email).result.headOption
How can I safely chain them, i.e. first check for existence, return the object if it already exists, and otherwise create it and return the new element, all in one transaction?
You could use a for comprehension:
def findOrCreate(email: String) = {
  (for {
    found <- findEmail(email)
    em <- found match {
      case Some(e) => DBIO.successful(e)
      case None => createEmail(email)
    }
  } yield em).transactionally
}
val result = db.run(findOrCreate("batman@gotham.gov"))
// Future[Email]
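With a little help from the cats library (this assumes import cats.data.OptionT and a cats Monad instance for DBIO in scope, which Slick itself does not provide):
def findOrCreate(email: String): DBIO[Email] = {
  OptionT(findEmail(email)).getOrElseF(createEmail(email)).transactionally
}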

How to mix select and delete in a Slick transaction

Why does it not work to combine a SELECT and a DELETE statement in a Slick query, as in:
val query = (for {
  item <- SomeTable
  _ <- OtherTable.filter(_.id === item.id).delete
} yield ()).transactionally
"Cannot resolve symbol 'transactionally'"
(without .transactionally, it is a Query[Nothing, Nothing, Seq], if that helps)
while the two actions work separately:
val query = (for {
  item <- SomeTable
} yield ()).transactionally
and
val query = (for {
  _ <- OtherTable.filter(_.id === 2).delete
} yield ()).transactionally
OK so this is a classic example of mixing DBIO with Query.
In your first case:
val query = (for {
  item <- SomeTable // this is `Query`
  _ <- OtherTable.filter(_.id === item.id).delete // this is `DBIO`
} yield ()).transactionally
For DML you can only use actions (Query is for DQL, i.e. plain SELECT).
So the first thing is to change your code to use only DBIOs. The example below is still incorrect:
val query = (for {
  item <- SomeTable.result // this is `DBIO` now
  _ <- OtherTable.filter(_.id === item.id).delete // but this won't work !!
} yield ()).transactionally
OK, we are nearly there, but the problem is that it doesn't compile. You need to be aware that this part now:
item <- SomeTable.result
returns a Seq of your SomeTable case class (which, among other things, contains your id).
So let's take that into account:
val query = (for {
  items <- SomeTable.result // I changed the name to `items` to reflect its plural nature
  _ <- OtherTable.filter(_.id.inSet(items.map(_.id))).delete // I needed to change it to generate an `IN` query
} yield ()).transactionally

Slick/Scala: What is a Rep[Bind] and how do I turn it into a value?

I'm trying to figure out Slick (the Scala functional relational mapping library). I've started to build a prototype in Slick 3.0.0 but of course... most of the documentation is either out of date or incomplete.
I've managed to get to a point where I can create a schema and return an object from the database.
The problem is, what I'm getting back is a "Rep[Bind]" and not the object I would expect to get back. I can't figure out what to do with this value. For instance, if I try something like rep.countDistinct.result, I get a crash.
Here's a quick synopsis of the code... some removed for brevity:
class UserModel(tag: Tag) extends Table[User](tag, "app_dat_user_t") {
  def id = column[Long]("n_user_id", O.PrimaryKey)
  def created = column[Long]("d_timestamp_created")
  def * = (id.?, created) <> (User.tupled, User.unapply)
}
case class User(id: Option[Long], created: Long)
val users = TableQuery[UserModel]
(users.schema).create
db.run(users += User(Option(1), 2))
println("ID is ... " + users.map(_.id)) // prints "Rep[Bind]"... huh?
val users = for (user <- users) yield user
println(users.map(_.id).toString) // Also prints "Rep[Bind]"...
I can't find a way to "unwrap" the Rep object and I can't find any clear explanation of what it is or how to use it.
Rep[] is the replacement for the Column[] datatype used in earlier versions of Slick.
users.map(_.id) describes a query that selects the column n_user_id for all rows:
val idQuery: Query[Rep[Long], Long, Seq] = users.map(_.id)
// => select n_user_id from app_dat_user_t;
Each element of that query has type Column[Long] (which is now Rep[Long]).
You cannot directly print the values from the query, as it is not a Scala collection; it only describes the SQL to run.
In Slick 3 you first run it against the database to get a Scala collection and then print that, as below (this needs an implicit ExecutionContext in scope):
val idsFuture: Future[Seq[Long]] = db.run(users.map(_.id).result)
idsFuture.foreach(ids => println(ids)) // if you need to print all ids at once
Or you can simply use:
idsFuture.foreach(_.foreach(println)) // print each id
Also,
val users = TableQuery[UserModel] // => returns Query[UserModel, UserModel#TableElementType, Seq]
val users = for (user <- users) yield user // => returns Query[UserModel, UserModel#TableElementType, Seq]
both mean the same, so you can use the former directly and remove the latter.

SLICK: How to use query result in another query?

I'd like to perform something like the following:
I'd like to return a list of users sorted first by whether the current user is "following" them, and second by some additional point score.
The code below, which I wrote, doesn't work, because funder is the lifted Slick type and is therefore never found in the List.
//The following represents the query for only funders who we are following
val following_funders: List[User] = (
  for {
    funder <- all_funders
    f <- follows if f.followerId === id //get all the current users follower objects
    if f.followeeId === funder.id
  } yield funder
).list
val all_funders_sorted = for {
  funder <- all_funders
  following_funder = following_funders contains funder
} yield (funder, following_funder)
//sort the funders by whether or not they are following the funder and then map it to only the funders (i.e. remove the boolean)
all_funders_sorted.sortBy(_._2.desc).sortBy(_._1.score.desc).map( x => x._1 )
All help appreciated!
You need to work with ids (i.e. primary keys) in Slick. That's how objects are uniquely identified on the db side. You do not need to execute the first query. You can use it as a component of your second without executing it first using the in operator:
//The following represents the query for only funders who we are following
val following_funders_ids = (
  for {
    funder <- all_funders
    f <- follows if f.followerId === id //get all the current users follower objects
    if f.followeeId === funder.id
  } yield funder.id
)
val all_funders_sorted = for {
  funder <- all_funders
  following_funder = funder.id in following_funders_ids
} yield (funder, following_funder)
//sort the funders by whether or not they are following the funder and then map it to only the funders (i.e. remove the boolean)
all_funders_sorted.sortBy(_._1.impactPoints.desc).sortBy(_._2.desc).map( x => x._1 )
Be aware that your sort order was wrong, if you first want to sort by following. Slick translates .sortBy(_.a).sortBy(_.b) to ORDER BY B,A because that's how Scala collections work:
scala> List( (1,"b"), (2,"a") ).sortBy(_._1).sortBy(_._2)
res0: List[(Int, String)] = List((2,a), (1,b))
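The same effect can be seen on plain Scala collections with data shaped like the funders above. This is only a sketch: the tuples (name, following, score) and the names funders, chained and explicit are made up for illustration, and it demonstrates the collection behaviour, not Slick itself. Because sortBy is stable, the last sortBy in a chain acts as the primary sort key:

```scala
// Hypothetical data: (name, following, score)
val funders = List(
  ("a", false, 10),
  ("b", true, 5),
  ("c", true, 20),
  ("d", false, 30)
)

// First sort by score descending, then by "following" (followed funders first).
// The second sortBy wins; the first only breaks ties.
val chained = funders.sortBy(-_._3).sortBy(!_._2)

// The same ordering written as a single sort with an explicit tuple key.
val explicit = funders.sortBy { case (_, following, score) => (!following, -score) }

println(chained.map(_._1))
```

Writing the key as one tuple makes the priority of the sort columns explicit, which avoids having to remember that chained calls apply in reverse.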
Ended up figuring it out the following way by using 'inSet'
//The following represents the query for only funders who we are following
val following_funders_ids: List[Long] = (
  for {
    funder <- all_funders
    f <- follows if f.followerId === id //get all the current users follower objects
    if f.followeeId === funder.id
  } yield funder.id
).list
val all_funders_sorted = for {
  funder <- all_funders
  following_funder = funder.id inSet following_funders_ids
} yield (funder, following_funder)
//sort the funders by whether or not they are following the funder and then map it to only the funders (i.e. remove the boolean)
all_funders_sorted.sortBy(_._2.desc).sortBy(_._1.impactPoints.desc).map( x => x._1 )