Insert if not exists in Slick 3.0.0 - scala

I'm trying to do an insert-if-not-exists; I found this post covering 1.0.1 and 2.0.
I found this snippet using transactionally in the Slick 3.0.0 docs:
val a = (for {
  ns <- coffees.filter(_.name.startsWith("ESPRESSO")).map(_.name).result
  _ <- DBIO.seq(ns.map(n => coffees.filter(_.name === n).delete): _*)
} yield ()).transactionally

val f: Future[Unit] = db.run(a)
I'm struggling to write the logic from insert if not exists with this structure. I'm new to Slick and have little experience with Scala. This is my attempt to do insert if not exists outside the transaction...
val result: Future[Boolean] = db.run(products.filter(_.name === "foo").exists.result)

result.map { exists =>
  if (!exists) {
    products += Product(
      None,
      productName,
      productPrice
    )
  }
}
But how do I put this in the transactionally block? This is the furthest I can go:
val a = (for {
  exists <- products.filter(_.name === "foo").exists.result
  // ???
  // _ <- DBIO.seq(ns.map(n => coffees.filter(_.name === n).delete): _*)
} yield ()).transactionally
Thanks in advance

It is possible to use a single insert ... if not exists query. This avoids multiple database round-trips and race conditions (transactions may not be enough depending on isolation level).
def insertIfNotExists(name: String) = users.forceInsertQuery {
  val exists = (for (u <- users if u.name === name.bind) yield u).exists
  val insert = (name.bind, None) <> (User.apply _ tupled, User.unapply)
  for (u <- Query(insert) if !exists) yield u
}
Await.result(db.run(DBIO.seq(
  // create the schema
  users.schema.create,
  users += User("Bob"),
  users += User("Bob"),
  insertIfNotExists("Bob"),
  insertIfNotExists("Fred"),
  insertIfNotExists("Fred"),
  // print the users (select * from USERS)
  users.result.map(println)
)), Duration.Inf)
Output:
Vector(User(Bob,Some(1)), User(Bob,Some(2)), User(Fred,Some(3)))
Generated SQL:
insert into "USERS" ("NAME","ID") select ?, null where not exists(select x2."NAME", x2."ID" from "USERS" x2 where x2."NAME" = ?)
Here's the full example on GitHub.

This is the version I came up with:
val a = (
  products.filter(_.name === "foo").exists.result.flatMap { exists =>
    if (!exists) {
      products += Product(
        None,
        productName,
        productPrice
      )
    } else {
      DBIO.successful(None) // no-op
    }
  }
).transactionally
It is a bit lacking though; for example, it would be useful to return the inserted or existing object.
For completeness, here is the table definition:
case class DBProduct(id: Int, uuid: String, name: String, price: BigDecimal)

class Products(tag: Tag) extends Table[DBProduct](tag, "product") {
  def id = column[Int]("id", O.PrimaryKey, O.AutoInc) // This is the primary key column
  def uuid = column[String]("uuid")
  def name = column[String]("name")
  def price = column[BigDecimal]("price", O.SqlType("decimal(10, 4)"))
  def * = (id, uuid, name, price) <> (DBProduct.tupled, DBProduct.unapply)
}

val products = TableQuery[Products]
I'm using a mapped table; the solution also works for tuples, with minor changes.
Note also that it's not necessary to define the id as optional; according to the documentation, it's ignored in insert operations:
When you include an AutoInc column in an insert operation, it is silently ignored, so that the database can generate the proper value
And here is the method:
def insertIfNotExists(productInput: ProductInput): Future[DBProduct] = {
  val productAction = (
    products.filter(_.uuid === productInput.uuid).result.headOption.flatMap {
      case Some(product) =>
        mylog("product was there: " + product)
        DBIO.successful(product)
      case None =>
        mylog("inserting product")
        val productId =
          (products returning products.map(_.id)) += DBProduct(
            0,
            productInput.uuid,
            productInput.name,
            productInput.price
          )
        val product = productId.map { id =>
          DBProduct(
            id,
            productInput.uuid,
            productInput.name,
            productInput.price
          )
        }
        product
    }
  ).transactionally

  db.run(productAction)
}
(Thanks to Matthew Pocock from the Google group thread for orienting me to this solution.)

I've run into a solution that looks more complete. Section 3.1.7, "More Control over Inserts", of the Essential Slick book has the example.
At the end you get something like:
val entity = UserEntity(UUID.randomUUID, "jay", "jay@localhost")

val exists =
  users
    .filter(u =>
      u.name === entity.name.bind &&
      u.email === entity.email.bind
    )
    .exists

val selectExpression = Query(
  (
    entity.id.bind,
    entity.name.bind,
    entity.email.bind
  )
).filterNot(_ => exists)

val action = users
  .map(u => (u.id, u.name, u.email))
  .forceInsertQuery(selectExpression)
exec(action)
// res17: Int = 1
exec(action)
// res18: Int = 0

According to the Slick 3.0 manual's insert query section (http://slick.typesafe.com/doc/3.0.0/queries.html), the inserted values can be returned with the id as below:
def insertIfNotExists(productInput: ProductInput): Future[DBProduct] = {
  val productAction = (
    products.filter(_.uuid === productInput.uuid).result.headOption.flatMap {
      case Some(product) =>
        mylog("product was there: " + product)
        DBIO.successful(product)
      case None =>
        mylog("inserting product")
        (products returning products.map(_.id)
          into ((prod, id) => prod.copy(id = id))) += DBProduct(
            0,
            productInput.uuid,
            productInput.name,
            productInput.price
          )
    }
  ).transactionally

  db.run(productAction)
}

Related

Performance slick query one to many

I'm using Play 2.5 and Slick 3.1.1, and I'm trying to build an optimal query for multiple one-to-many and one-to-one relations. I have the following db model:
case class Accommodation(id: Option[Long], landlordId: Long, name: String)
case class LandLord(id: Option[Long], name: String)
case class Address(id: Option[Long], accommodationId: Long, street: String)
case class ExtraCharge(id: Option[Long], accommodationId: Long, title: String)
For data output:
case class AccommodationFull(accommodation: Accommodation, landLord: LandLord, extraCharges:Seq[ExtraCharge], addresses:Seq[Address])
I've created two queries to get an accommodation by id:
/** Retrieve an accommodation from the id. */
def findByIdFullMultipleQueries(id: Long): Future[Option[AccommodationFull]] = {
  val q = for {
    (a, l) <- accommodations join landLords on (_.landlordId === _.id)
    if a.id === id
  } yield (a, l)

  for {
    data <- db.run(q.result.headOption)
    ex <- db.run(extraCharges.filter(_.accommodationId === id).result)
    add <- db.run(addresses.filter(_.accommodationId === id).result)
  } yield data.map { accLord => AccommodationFull(accLord._1, accLord._2, ex, add) }
}
/** Retrieve an accommodation from the id. */
def findByIdFull(id: Long): Future[Option[AccommodationFull]] = {
  val qr = accommodations.filter(_.id === id)
    .join(landLords).on(_.landlordId === _.id)
    .joinLeft(extraCharges).on(_._1.id === _.accommodationId)
    .joinLeft(addresses).on(_._1._1.id === _.accommodationId)
    .result.map { res =>
      res.groupBy(_._1._1._1.id).headOption.map {
        case (k, v) =>
          val addresses = v.flatMap(_._2).distinct
          val extraCharges = v.flatMap(_._1._2).distinct
          val landLord = v.map(_._1._1._2).head
          val accommodation = v.map(_._1._1._1).head
          AccommodationFull(accommodation, landLord, extraCharges, addresses)
      }
    }
  db.run(qr)
}
After testing, the multiple-query version is about 5x faster than the join. How can I create a more optimal join query?
=== Update ===
I'm testing now on PostgreSQL 9.3 with this data:
private[bootstrap] object InitialData {
  def landLords = (1L to 10000L).map { id =>
    LandLord(Some(id), s"Good LandLord $id")
  }
  def accommodations = (1L to 10000L).map { id =>
    Accommodation(Some(id), s"Nice house $id", 100 * id, 3, 5, 500, 1L, None)
  }
  def extraCharge = (1L to 10000L).flatMap { id =>
    (1 to 100).map { nr =>
      ExtraCharge(None, id, s"Extra $nr", 100.0)
    }
  }
  def addresses = (1L to 1000L).flatMap { id =>
    (1 to 100).map { nr =>
      Address(None, id, s"Słoneczna 4 - $nr", "17-200", "", "PL")
    }
  }
}
and here are the results for multiple runs (ms):
JOIN: 367
MULTI: 146
JOIN: 306
MULTI: 110
JOIN: 300
MULTI: 103
=== Update 2 ===
After adding indexes it's better, but multi is still much faster:
def accommodationLandLordIdIndex = index("ACCOMMODATION__LANDLORD_ID__INDEX", landlordId, unique = false)
def addressAccommodationIdIndex = index("ADDRESS__ACCOMMODATION_ID__INDEX", accommodationId, unique = false)
def extraChargeAccommodationIdIndex = index("EXTRA_CHARGE__ACCOMMODATION_ID__INDEX", accommodationId, unique = false)
I made a test:
val multiResult = (1 to 1000).map { i =>
  val start = System.currentTimeMillis()
  Await.result(accommodationDao.findByIdFullMultipleQueries(i), Duration.Inf)
  System.currentTimeMillis() - start
}
println(s"MULTI AVG Result: ${multiResult.sum.toDouble / multiResult.length}")

val joinResult = (1 to 1000).map { i =>
  val start = System.currentTimeMillis()
  Await.result(accommodationDao.findByIdFull(i), Duration.Inf)
  System.currentTimeMillis() - start
}
println(s"JOIN AVG Result: ${joinResult.sum.toDouble / joinResult.length}")
Here are the results for 2 runs:
MULTI AVG Result: 3.287
JOIN AVG Result: 96.797
MULTI AVG Result: 3.206
JOIN AVG Result: 100.221
Postgres does not add indexes for foreign key columns. The multi-query is using an index on all three tables (the primary key), while the single join query will scan the join tables for the desired IDs.
Try adding indexes on your accommodationId columns.
Update
While indexes would help if this were a 1:1 relationship, it looks like these are all 1:many relationships. In that case, using joins and a later distinct filter is going to return a lot more data from the database than you need.
For your data model, doing multiple queries looks like the correct way to process the data.
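The size difference is easy to estimate from the test data above: the double joinLeft yields one row per (extraCharge, address) pair per accommodation, while the multi-query approach fetches each child table once. A plain-Scala sketch of the arithmetic (row counts taken from the test data above; no database involved):

```scala
// Child-row counts per accommodation, from the test data above.
val extraChargeRows = 100
val addressRows = 100

// The double joinLeft transfers the cross product of the child rows...
val joinRows = extraChargeRows * addressRows

// ...while the multi-query approach fetches the parent row plus each
// child table separately.
val multiRows = 1 + extraChargeRows + addressRows

println(s"join: $joinRows rows, multi: $multiRows rows")
```

That is roughly 10,000 rows versus 201 per lookup, which matches the benchmark gap reported above.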
I think it depends on your DB engine. Slick generates queries that may not be optimal (see the docs), but you need to profile queries at the database level to understand what's happening and to optimize.

Slick3 with SQLite - autocommit seems to not be working

I'm trying to write some basic queries with Slick for an SQLite database.
Here is my code:
class MigrationLog(name: String) {
  val migrationEvents = TableQuery[MigrationEventTable]

  lazy val db: Future[SQLiteDriver.backend.DatabaseDef] = {
    val db = Database.forURL(s"jdbc:sqlite:$name.db", driver = "org.sqlite.JDBC")
    val setup = DBIO.seq(migrationEvents.schema.create)
    val createFuture = for {
      tables <- db.run(MTable.getTables)
      createResult <- if (tables.length == 0) db.run(setup) else Future.successful(())
    } yield createResult
    createFuture.map(_ => db)
  }

  val addEvent: (String, String) => Future[String] = (aggregateId, eventType) => {
    val id = java.util.UUID.randomUUID().toString
    val command = DBIO.seq(migrationEvents += (id, aggregateId, None, eventType, "CREATED", System.currentTimeMillis, None))
    db.flatMap(_.run(command).map(_ => id))
  }

  val eventSubmitted: (String, String) => Future[Unit] = (id, batchId) => {
    val q = for { e <- migrationEvents if e.id === id } yield (e.batchId, e.status, e.updatedAt)
    val updateAction = q.update(Some(batchId), "SUBMITTED", Some(System.currentTimeMillis))
    db.map(_.run(updateAction))
  }

  val eventMigrationCompleted: (String, String, String) => Future[Unit] = (batchId, id, status) => {
    val q = for { e <- migrationEvents if e.batchId === batchId && e.id === id } yield (e.status, e.updatedAt)
    val updateAction = q.update(status, Some(System.currentTimeMillis))
    db.map(_.run(updateAction))
  }

  val allEvents = () => {
    db.flatMap(_.run(migrationEvents.result))
  }
}
Here is how I'm using it:
val migrationLog = new MigrationLog("test")

val events = for {
  id <- migrationLog.addEvent("aggregateUserId", "userAccessControl")
  _ <- migrationLog.eventSubmitted(id, "batchID_generated_from_idam")
  _ <- migrationLog.eventMigrationCompleted("batchID_generated_from_idam", id, "Successful")
  events <- migrationLog.allEvents()
} yield events

events.map(_.foreach {
  case (id, aggregateId, batchId, eventType, status, submitted, updatedAt) =>
    println(s"$id $aggregateId $batchId $eventType $status $submitted $updatedAt")
})
The idea is to add an event first, then update it with a batchId (which also updates the status), and then update the status when the job is done. events should contain events with status Successful.
What happens is that after running this code it prints events with status SUBMITTED. If I wait a while and run the same allEvents query, or just check the db from the command line using sqlite3, then it's updated correctly.
I'm properly waiting for futures to be resolved before starting the next operation, and auto-commit should be enabled by default.
Am I missing something?
It turns out the problem was with db.map(_.run(updateAction)), which returns Future[Future[Int]]; the outer Future completes before the update command has finished, so the next query could run too early.
Replacing it with db.flatMap(_.run(updateAction)) solved the issue.
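The same nesting can be reproduced with plain scala.concurrent Futures, no Slick involved; a minimal sketch (names like runUpdate are hypothetical stand-ins for `_.run(updateAction)`):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration.Duration
import scala.concurrent.ExecutionContext.Implicits.global

val db: Future[String] = Future.successful("db-handle")

// Stand-in for `_.run(updateAction)`: an asynchronous action returning Int.
def runUpdate(handle: String): Future[Int] = Future.successful(1)

// `map` nests the Futures: the outer Future completes once the inner
// action has been *created*, not once it has finished.
val nested: Future[Future[Int]] = db.map(runUpdate)

// `flatMap` flattens them: this Future completes only when the update does.
val flat: Future[Int] = db.flatMap(runUpdate)

println(Await.result(flat, Duration.Inf))
```

Awaiting `nested` only waits for the outer layer, which is exactly why the SUBMITTED status was still visible when the next query ran.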

Slick 3.0 bulk insert returning object's order

I want to do a bulk insert using the Slick 3.0 ++= function, and also use returning to return the inserted objects.
I am wondering whether the returned objects (Future[Seq[Something]]) have the same order as my argument Seq[Something] (without ids).
More specifically,
val personList: Seq[Person] = Seq(Person("name1"), Person("name2"), Person("name3"))
persons returning persons ++= personList
Will the result definitely be Future(Seq(Person(1, "name1"), Person(2, "name2"), Person(3, "name3"))), or can it come back in a different order?
Thanks.
Yes, I believe you are using an auto-incremented primary key.
I am also doing the same as you have mentioned:
case class Person(name: String, id: Option[Int] = None)

class PersonTable(tag: Tag) extends Table[Person](tag, "person") {
  val id = column[Int]("id", O.PrimaryKey, O.AutoInc)
  val name = column[String]("name")
  def * = (name, id.?) <> (Person.tupled, Person.unapply)
}

val personTableQuery = TableQuery[PersonTable]

def personTableAutoIncWithObject =
  (personTableQuery returning personTableQuery.map(_.id)).into((person, id) => person.copy(id = Some(id)))

// insert all persons without ids and return all persons with their ids.
def insertAll(persons: List[Person]): Future[Seq[Person]] =
  db.run { personTableAutoIncWithObject ++= persons }

// unit test for insertion order:
test("Add new persons ") {
  val response = insertAll(List(Person("A1"), Person("A2"), Person("A3"), Person("A4"), Person("A5")))
  whenReady(response) { persons =>
    assert(persons === List(Person("A1", Some(1)), Person("A2", Some(2)), Person("A3", Some(3)),
      Person("A4", Some(4)), Person("A5", Some(5))))
  }
}
As far as I know, the results of a batch insert come back in the same order as the rows you send to the database, and the plain "++=" function (without returning) returns the count of records inserted into the table.
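A hypothetical in-memory stand-in illustrates why the results line up with the input: if ids are assigned sequentially as the batch processes each row, mapping over the input preserves its order. This is only a sketch of the ordering argument, not of what the JDBC driver actually does:

```scala
case class Person(name: String, id: Option[Int] = None)

// Hypothetical stand-in for an auto-increment table: ids are handed out
// one per input row, in input order.
var nextId = 0
def insertAllInMemory(ps: Seq[Person]): Seq[Person] =
  ps.map { p => nextId += 1; p.copy(id = Some(nextId)) }

val inserted = insertAllInMemory(Seq(Person("A1"), Person("A2"), Person("A3")))

// Names come back in input order, with sequential ids attached.
assert(inserted.map(_.name) == Seq("A1", "A2", "A3"))
assert(inserted.flatMap(_.id) == Seq(1, 2, 3))
```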

How can I filter with inSetBind for multiple columns in Slick?

I have the following table definition (simplified):
class Houses(tag: Tag) extends Table[HouseRow](tag, "HOUSE") {
  def houseId = column[Long]("HOUSE_ID", O.NotNull, O.PrimaryKey, O.AutoInc)
  def houseName = column[String]("HOUSE_NAME", O.NotNull)
  def houseType = column[String]("HOUSE_TYPE", O.NotNull)
  def uniqueHouseName = index("UQ_HOUSE_NAME_HOUSE_TYPE", (houseName, houseType), true)
  def * = (houseId, houseName, houseType) <> (HouseRow.tupled, HouseRow.unapply)
}

val houses = TableQuery[Houses]
I'd like to select houses that match on a set of the uniqueHouseName index as follows.
case class HouseKey(houseName: String, houseType: String)
val houseKeys = Seq(HouseKey("name1", "type1"), HouseKey("name2", "type2"))
A naive inSetBind filter would incorrectly match e.g. HouseRow(id, "name1", "type2"), because it checks each column against the whole set independently.
In MySql I would do something like:
SELECT * FROM HOUSE h
WHERE (h.HOUSE_TYPE, h.HOUSE_NAME) IN
(
  SELECT 'type1' AS HOUSE_TYPE, 'name1' AS HOUSE_NAME
  UNION
  SELECT 'type2', 'name2'
);
Like @cvogt's version, but doesn't blow up on an empty list:
val filteredHouses =
  houses.filter(h =>
    houseKeys
      .map(hk => h.houseName === hk.houseName && h.houseType === hk.houseType)
      .reduceOption(_ || _)
      .getOrElse(false: Rep[Boolean])
  )
Tested in slick 3.1.0
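The difference between reduce and reduceOption is visible with plain collections: reduce throws on an empty sequence, while reduceOption(...).getOrElse(false) yields a safe "match nothing" default, which is what the query needs when houseKeys is empty. A plain-Scala analogue of the per-key predicates:

```scala
// Plain-collection analogue of the per-key predicates built in the filter.
def matchAny(predicates: Seq[Boolean]): Boolean =
  predicates.reduceOption(_ || _).getOrElse(false)

assert(matchAny(Seq(false, true, false)))  // at least one key matched
assert(!matchAny(Seq.empty))               // empty key list matches nothing

// `reduce`, by contrast, throws on the empty case:
// Seq.empty[Boolean].reduce(_ || _)  // => UnsupportedOperationException
```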
Adapting tuxdna's answer to allow arbitrary seqs. However, this query cannot currently be precompiled to SQL and has a runtime overhead.
val filteredHouses =
  houses.filter(h =>
    houseKeys
      .map(hk => h.houseName === hk.houseName && h.houseType === hk.houseType)
      .reduce(_ || _)
  )
This is not a complete answer, but for just two pairs of values you could do this:
val filteredHouses = for {
  h <- houses
  if (h.houseName === "name1" && h.houseType === "type1") ||
     (h.houseName === "name2" && h.houseType === "type2")
} yield h

Select rows based on MAX values of a column in ScalaQuery/SLICK

Say that I have a table such as:
UserActions
  UserId INT
  ActionDate TIMESTAMP
  Description TEXT
that holds the dates when users performed certain actions. If I wanted to get the last action that every user performed, I would do something like this in SQL:
SELECT *
FROM UserActions,
(
SELECT ua.UserId,
max(ua.ActionDate) AS lastActionDate
FROM UserActions ua
GROUP BY ua.UserId
) AS lastActionDateWithUserId
WHERE UserActions.UserId = lastActionDateWithUserId.UserId
AND UserActions.ActionDate = lastActionDateWithUserId.lastActionDate
Now, assume that I already have a table structure set up in ScalaQuery 0.9.5 for the UserActions, such as:
case class UserAction(userId: Int, actionDate: Timestamp, description: String)

object UserActions extends BasicTable[UserAction]("UserActions") {
  def userId = column[Int]("UserId")
  def actionDate = column[Timestamp]("ActionDate")
  def description = column[String]("Description")
  def * = userId ~ actionDate ~ description <> (UserAction, UserAction.unapply _)
}
My question is: how can I perform such a query in ScalaQuery/Slick?
I have used Slick 1.0.0 with Scala 2.10.
I defined the objects like this:
case class UserAction(userId: Int, actionDate: Timestamp, description: String)

object UserActions extends Table[UserAction]("UserActions") {
  def userId = column[Int]("UserId")
  def actionDate = column[Timestamp]("ActionDate")
  def description = column[String]("Description")
  def * = userId ~ actionDate ~ description <> (UserAction, UserAction.unapply _)
}
Within a session block:
Database.forURL("jdbc:h2:mem:test1", driver = "org.h2.Driver") withSession {
  //...
}
I inserted some sample data:
UserActions.insert(UserAction(10, timeStamp, "Action 1"))
UserActions.insert(UserAction(10, timeStamp, "Action 2"))
UserActions.insert(UserAction(10, timeStamp, "Action 3"))
UserActions.insert(UserAction(20, timeStamp, "Action 1"))
UserActions.insert(UserAction(20, timeStamp, "Action 2"))
UserActions.insert(UserAction(30, timeStamp, "Action 1"))
Query(UserActions).list foreach println
The first thing to do is create the max query:
// group by userId and select the userId and the max of the actionDate
val maxQuery =
  UserActions
    .groupBy { _.userId }
    .map {
      case (userId, ua) =>
        userId -> ua.map(_.actionDate).max
    }
The resulting query looks like this:
val result =
  for {
    ua <- UserActions
    m <- maxQuery
    if ua.userId === m._1 && ua.actionDate === m._2
  } yield ua

result.list foreach println
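The row-per-user logic this query implements can be checked against plain Scala collections, which also doubles as a way to sanity-test results. A sketch with made-up timestamps (note one difference: the SQL keeps all rows tied for the max date, while maxBy keeps exactly one):

```scala
import java.sql.Timestamp

case class UserAction(userId: Int, actionDate: Timestamp, description: String)

def ts(ms: Long) = new Timestamp(ms)

// Made-up sample rows, mirroring the inserts above.
val actions = Seq(
  UserAction(10, ts(1000), "Action 1"),
  UserAction(10, ts(3000), "Action 3"),
  UserAction(20, ts(2000), "Action 1")
)

// Group by user, then keep the row with the maximum ActionDate,
// mirroring the subquery-plus-join in the SQL.
val lastActions: Map[Int, UserAction] =
  actions.groupBy(_.userId).map { case (uid, as) =>
    uid -> as.maxBy(_.actionDate.getTime)
  }

assert(lastActions(10).description == "Action 3")
```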