I am currently incrementing a column (not an auto-increment PK column) in my database using the following:
def incrementLikeCount(thingId: Int)(implicit session: Session) = {
sqlu"update things set like_count = like_count + 1 where id = $thingId".first
}
Is this currently (slick 2.0.2) the best and fastest way to do this? (I'm using postgresql)
I was hoping for a more typesafe way of doing this e.g. if I rename my table or column I want compile time errors.
I don't want to read in the row and then update, because then I would have to wrap the call in a transaction during the read + write operation and that is not as efficient as I would want.
I would love if there was a way to do this using the normal slick api, and also be able to update/increment multiple counters at the same time in a single operation (but even one column increment/decrement at a time would be lovely)
I'm not on Slick, still in the lovely ScalaQuery stone ages here, but you should be able to use what was called a MutatingUnitInvoker to modify a DB row in place (i.e. perform a single query).
Something like:
val q = for{id <- Parameters[Int]; t <- Things if t.id is id} yield t
def incrementLikeCount(thingId: Int)(implicit session: Session) = {
q(thingId).mutate(r=> r.row.copy(like_count = r.row.like_count + 1))
}
Performance should be acceptable: the prepared statement is generated once at compile time, and a single query runs against the database. I'm not sure how you can improve on that in a type-safe manner with what Slick currently has on offer.
I'm currently using jOOQ to build my SQL (with code generation via the mvn plugin).
The generated query is not executed by jOOQ, though (I'm using the vert.X SqlClient for that).
Let's say I want to select all columns of two tables which share some identical column names, e.g. UserAccount(id,name,...) and Product(id,name,...). When executing the following code
val userTable = USER_ACCOUNT.`as`("u")
val productTable = PRODUCT.`as`("p")
create().select().from(userTable).join(productTable).on(userTable.ID.eq(productTable.AUTHOR_ID))
the build method query.getSQL(ParamType.NAMED) returns me a query like
SELECT "u"."id", "u"."name", ..., "p"."id", "p"."name", ... FROM ...
The problem here is, the resultset will contain the column id and name twice without the prefix "u." or "p.", so I can't map/parse it correctly.
Is there a way to tell jOOQ to alias these columns like the following, without any further manual effort?
SELECT "u"."id" AS "u.id", "u"."name" AS "u.name", ..., "p"."id" AS "p.id", "p"."name" AS "p.name" ...
I'm using the holy Postgres database :)
EDIT: My current approach would be something like
val productFields = productTable.fields().map { it.`as`(name("p.${it.name}")) }
val userFields = userTable.fields().map { it.`as`(name("u.${it.name}")) }
create().select(productFields,userFields,...)...
This feels really hacky though
How to correctly dereference tables from records
You should always use the column references that you passed to the query to dereference values from records in your result. If you didn't pass column references explicitly, then the ones from your generated table via Table.fields() are used.
In your code, that would correspond to:
userTable.NAME
productTable.NAME
So, in a resulting record, do this:
val rec = ...
rec[userTable.NAME]
rec[productTable.NAME]
Using Record.into(Table)
Since you seem to be projecting all the columns (do you really need all of them?) to the generated POJO classes, you can still do this intermediary step if you want:
val rec = ...
val userAccount: UserAccount = rec.into(userTable).into(UserAccount::class.java)
val product: Product = rec.into(productTable).into(Product::class.java)
Because the generated table has all the necessary meta data, it can decide which columns belong to it, and which ones don't. The POJO doesn't have this meta information, which is why it can't disambiguate the duplicate column names.
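Why the bare column names collide can be shown with a small plain-Scala sketch. It does not use jOOQ; the labels, values and object name below are hypothetical and only imitate a flat row in which two tables contribute id and name.

```scala
object ColumnCollision {
  // Hypothetical flat row: labels as they would appear without aliases.
  val labels: Seq[String] = Seq("id", "name", "id", "name")
  val values: Seq[Any]    = Seq(1, "alice", 42, "laptop")

  // Keyed by bare label, later columns overwrite earlier ones.
  val byLabel: Map[String, Any] = labels.zip(values).toMap

  // Keyed by (table alias, column name), every value stays addressable.
  val qualified: Seq[(String, String)] =
    Seq("u" -> "id", "u" -> "name", "p" -> "id", "p" -> "name")
  val byQualified: Map[(String, String), Any] = qualified.zip(values).toMap
}
```

A lookup by bare name only sees the last table's values, while a lookup that also carries the table keeps both. This is roughly the extra information a generated table reference provides and a POJO does not.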
Using nested records
You can always use nested records directly in SQL as well in order to produce one of these 2 types:
Record2<Record[N], Record[N]> (e.g. using DSL.row(table.fields()))
Record2<UserAccountRecord, ProductRecord> (e.g. using DSL.row(table.fields()).mapping(...), or starting from jOOQ 3.17 directly using a Table<R> as a SelectField<R>)
The second jOOQ 3.17 solution would look like this:
// Using an implicit join here, for convenience
create().select(productTable.userAccount(), productTable)
.from(productTable)
.fetch();
The above is using implicit joins, for additional convenience
Auto aliasing all columns
There are a ton of flavours that users could like to have when "auto-aliasing" columns in SQL. Any solution offered by jOOQ would be no better than the one you've already found, so if you still want to auto-alias all columns, then just do what you did.
But usually, the desire to auto-alias is a feature request derived from a misunderstanding of the best approach to doing something in jOOQ (see the options above), so ideally, you don't go down the auto-aliasing road.
I have a spark dataframe (let's call it "records") like the following one:
id    name
a1    john
b"2   alice
c3'   joe
If you notice, the primary key column (id) values may have single/double quotes in them (like the second and third row in the dataframe).
I wrote the following Scala code to check for quotes in primary key column values:
def checkForQuotesInPrimaryKeyColumn(primaryKey: String, records: DataFrame): Boolean = {
// Extract primary key column values
val pkcValues = records.select(primaryKey).collect().map(_(0)).toList
// Check for single and double quotes in the values
var checkForQuotes = false // indicates no quotes
breakable {
pkcValues.foreach(pkcValue => {
if (pkcValue.toString.contains("\"") || pkcValue.toString.contains("\'")) {
checkForQuotes = true
println("Value that has quotes: " + pkcValue.toString)
break()
}
})}
checkForQuotes
}
This code works. But it doesn't take advantage of spark functionalities. I wish to make use of spark executors (and other features) that can complete this task faster.
The updated function looks like the following:
def checkForQuotesInPrimaryKeyColumnsUpdated(primaryKey: String, records: DataFrame): Boolean = {
val findQuotes = udf((s: String) => if (s.contains("\"") || s.contains("\'")) true else false)
records
.select(findQuotes(col(primaryKey)) as "quotes")
.filter(col("quotes") === true)
.collect()
.nonEmpty
}
The unit tests give similar runtimes on my machine for both the functions when run on a dataframe with 100 entries.
Is the updated function any faster (and/or better) than the original function? Is there any way the function can be improved?
Your first approach collects the entire dataframe to the driver. If your data does not fit into the driver's memory, it is going to break. Also you are right, you do not take advantage of spark.
The second approach uses spark to detect quotes. That's better. The problem is that you then collect a dataframe containing one boolean per record containing a quote to the driver just to see if there is at least one. This is a waste of time, especially if many records contain quotes. It is also a shame to use a UDF for this, since they are known to be slower than spark SQL primitives.
You could simply use Spark to count the number of records containing a quote, without collecting anything.
records.where(col(primaryKey).contains("\"") || col(primaryKey).contains("'"))
.count > 0
Since you do not actually care about the number of records and just want to check whether there is at least one, you could use limit(1). SparkSQL will be able to further optimize the query:
records.where(col(primaryKey).contains("\"") || col(primaryKey).contains("'"))
.limit(1).count > 0
NB: it makes sense that in unit tests, with little data, both of your queries take the same time. Spark is meant for big data and has some overhead. With real data, your second approach should be faster than the first, and the one I propose faster still. Also, your first approach will get an OOM on the driver as soon as you add more data.
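For comparison, the per-value test itself can be written as a short-circuiting predicate in plain Scala. This is only a sketch on a local collection with hypothetical names; it does not run on executors and is not a replacement for the Spark query above. It shows the "stop at the first match" behaviour that limit(1) aims for, without breakable.

```scala
object QuoteCheck {
  // True if the value contains a single or a double quote.
  def hasQuote(value: String): Boolean =
    value.exists(c => c == '"' || c == '\'')

  // exists stops at the first matching element.
  def anyQuoted(values: Seq[String]): Boolean =
    values.exists(hasQuote)
}
```

Applied to the sample ids from the question (a1, b"2, c3'), anyQuoted returns true because of the second value.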
I am using Scala, Slick and Postgres to build an application. I have used Slick code generator to generate the slick tables.
I want to know if there is any way to validate if the database table schema and the slick table schema matches and do that for all slick tables in my application.
For example:
class DepartmentTable(_tableTag: Tag) extends Table[Department](_tableTag, Some("base"), "Department") {
val id: Rep[Long] = column[Long]("DepartmentId", O.AutoInc, O.PrimaryKey)
val name: Rep[String] = column[String]("Name", O.Length(50,varying=true))
val shortCode: Rep[String] = column[String]("ShortCode", O.Length(50,varying=true))
def * = ???
def ? = ???
}
Say I change the database table by adding a column parentDepartmentId, and then add the same column to the Slick table. Many times, the alter scripts have not been run on the test database, and we then get runtime exceptions.
To avoid such issues, I was trying to implement something that checks whether the Slick table matches the actual Postgres table. Is that achievable?
I tried reflection, but I was not able to get all the details from the Slick table, e.g. the actual column name.
Slick Version : 3.0
What I am trying to achieve?
On startup of the application, I want to compare the database schema with the slick schema.
My plan:
Get all the TableQuery / Slick Tables from my application
Get the actual database schema using the Slick Meta
Compare slick tablequery structure with the actual db
Now, as Maxim suggested, I can create a registry and add each table to it. I just want to check if there is any other way. The reason is that if I or someone else accidentally forgets to add a couple of table queries to the registry, the check on those tables will not be done. I am just trying to be safer, but I am not sure if any such method exists.
You can use slick.meta to achieve this. You are not saying which version of slick you are using so I am going to show an example using slick 3.0, but it should be really similar if you were using slick 2.x replacing the DBIO with the old withSession API and removing the reference to ExecutionContext and Future.
Here is how you can print all the columns of every table in the schema, assuming that you have an implicit ExecutionContext in scope, that you import YourDriver.api._, and that you replace the ??? with an actual Database instance:
val db: Database = ???
val tablesWithCols = for {
tables <- slick.jdbc.meta.MTable.getTables
withCols <- DBIO.sequence(tables.map(t => t.getColumns.map((t, _))))
} yield withCols
val printLines: DBIO[Seq[String]] = tablesWithCols.map {
_.map {
case (t, cs) => s"Table: ${t.name.name} - columns: ${cs.map(_.name).mkString(", ")}"
}
}
val res: Future[Seq[String]] = db.run(printLines)
res.foreach(println)
Also, please note that the last foreach invocation is performed on a Future, so you may want to wait for the future to complete or (better) chain it with the relevant computations; if your program terminates without waiting or chaining, you probably won't see any output.
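A minimal sketch of the "chain, then wait once" advice, using only the Scala standard library. fakeRun is a hypothetical stand-in for db.run(printLines), and the 10 second timeout is an arbitrary assumption.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object AwaitSketch {
  // Hypothetical stand-in for db.run(printLines); a real call would hit the database.
  def fakeRun(): Future[Seq[String]] =
    Future(Seq("Table: Department - columns: DepartmentId, Name, ShortCode"))

  // Chain the follow-up work on the Future, then block once at the edge of the program.
  def linesBlocking(): Seq[String] = {
    val formatted: Future[Seq[String]] = fakeRun().map(_.map(_.trim))
    Await.result(formatted, 10.seconds)
  }
}
```

Blocking with Await is usually only appropriate in a main method, a startup check or a test; elsewhere, keeping the result as a Future and composing on it is the more common choice.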
Surprisingly, a somewhat more complex matter is getting the information out of the slick table definitions; the only way I found to do it is something like this:
TableQuery[YourTable].toNode.getDumpInfo
That will give you an AST-like structure that you can traverse to get out the definitions you need; the structure itself is not that pleasant to traverse but it should contain everything you need.
Another approach that you could explore to avoid these troubles would be to create a layer that wraps the generation of Slick definitions and exposes the relevant metadata in a more accessible way; I am not sure whether this would get you into bigger trouble, though.
Here is an example of how to check, for a given Slick table, whether the number, names and SQL types of the columns in the corresponding database table equal the number, names and SQL types of the columns in the Slick table description:
def ?[AT <: AbstractTable[_]](tableQuery: profile.api.TableQuery[AT])
(implicit ec: ExecutionContext) = {
val table = tableQuery.baseTableRow.create_*.map(c =>
(c.name, profile.jdbcTypeFor(c.tpe).sqlType)).toSeq.sortBy(_._1)
MTable.getTables(tableQuery.baseTableRow.tableName).headOption.map(
_.map{_.getColumns.map(
_.sortBy(_.name).map(c => (c.name, c.sqlType)) == table
)}
) flatMap (_.head)
}
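The method above yields a single Boolean. When the check fails it can help to know which columns differ, so here is a plain-Scala sketch of such a comparison. It does not call Slick; the (name, SQL type code) pairs are hypothetical stand-ins for what create_* and getColumns would provide, and SchemaDiff and Diff are made-up names.

```scala
object SchemaDiff {
  // (column name, integer SQL type code); both inputs are assumed to be
  // extracted beforehand from the Slick table and from the database metadata.
  type Col = (String, Int)

  final case class Diff(
      missingInDb: Set[String],   // declared in code, absent in the database
      missingInCode: Set[String], // present in the database, absent in code
      typeMismatch: Set[String]   // present in both, with different type codes
  ) {
    def isEmpty: Boolean =
      missingInDb.isEmpty && missingInCode.isEmpty && typeMismatch.isEmpty
  }

  def diff(code: Seq[Col], db: Seq[Col]): Diff = {
    val c = code.toMap
    val d = db.toMap
    Diff(
      missingInDb = c.keySet -- d.keySet,
      missingInCode = d.keySet -- c.keySet,
      typeMismatch = (c.keySet & d.keySet).filter(n => c(n) != d(n))
    )
  }
}
```

With the question's scenario (a column added to the Slick table but the alter script not run), the new column would show up in missingInDb.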
You can also detect whether indexes, primary and foreign keys are identical to some extent. For that you can correspondingly combine
tableQuery.baseTableRow.indexes
tableQuery.baseTableRow.primaryKeys
tableQuery.baseTableRow.foreignKeys
with the following methods of MTable
getIndexInfo
getPrimaryKeys
getImportedKeys
as I did with tableQuery.baseTableRow.create_* and getColumns in the excerpt.
With this method you can easily check all the tables you have in your code. The only remaining question, and an easy one, is how to get their list. To tell the truth, I do not see how it can be a problem: it is just a matter of keeping a centralized registry in which you enlist each table as it is created in your code, and which you can query for the stored objects. Let's say you have such a registry with the methods enlistTable and listTables; then your workflow will look something like
val departmentTable = TableQuery[DepartmentTable]
registry.enlistTable(departmentTable)
...
val someTable = TableQuery[SomeTableStructureClass]
registry.enlistTable(someTable)
...
val anotherTable = TableQuery[AnotherTableStructureClass]
registry.enlistTable(anotherTable)
...
for(table <- registry.listTables)
db run ?(table) map ( columnsAndTypesAreIdentical => ... )
...
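One possible shape for such a registry, as a sketch only. The method names enlistTable and listTables come from the text above; everything else (the class name, the type parameter T standing in for the application's table-query type, the synchronization) is an assumption.

```scala
import scala.collection.mutable.ListBuffer

// T stands in for whatever table-query type the application uses.
final class TableRegistry[T] {
  private val tables = ListBuffer.empty[T]

  // Registers the table and returns it, so that declaring and enlisting
  // a table can be a single expression.
  def enlistTable(table: T): T = synchronized {
    tables += table
    table
  }

  // Snapshot of everything enlisted so far, in registration order.
  def listTables: List[T] = synchronized(tables.toList)
}
```

Because enlistTable returns its argument, a declaration could be written as val departmentTable = registry.enlistTable(TableQuery[DepartmentTable]), which makes it harder to declare a table and forget to enlist it.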
The Slick code generator you used "generates Table classes, corresponding TableQuery values,..., as well as case classes for holding complete rows of values" by default. The corresponding TableQuery values have exactly the form val someTable = TableQuery[SomeTableStructureClass].
One way to achieve it would be like this:
val now = DateTime.now
val today = now.toLocalDate
val tomorrow = today.plusDays(1)
val startOfToday = today.toDateTimeAtStartOfDay(now.getZone)
val startOfTomorrow = tomorrow.toDateTimeAtStartOfDay(now.getZone)
val todayLogItems = logItems.filter(logItem =>
logItem.MyDateTime >= startOfToday && logItem.MyDateTime < startOfTomorrow
).list
Is there any way to write the query in a more concise way? Something on the lines of:
logItems.filter(_.MyDateTime.toDate == DateTime.now.toDate).list
I'm asking this because in LINQ to NHibernate that is achievable (Fetching records by date with only day part comparison using nhibernate).
Unless the Slick joda mapper adds support for comparisons you are out of luck unless you add it yourself. For giving it a shot these may be helpful pointers:
* http://slick.typesafe.com/doc/2.0.0/userdefined.html
* http://slick.typesafe.com/doc/2.0.0/api/#scala.slick.lifted.ExtensionMethods
* https://github.com/slick/slick/blob/2.0.0/src/main/scala/scala/slick/lifted/ExtensionMethods.scala
I created a ticket to look into it in Slick at some point: https://github.com/slick/slick/issues/627
You're confusing matters by working with LocalDateTimes instead of using LocalDates directly:
val today = LocalDate.now
val todayLogItems = logItems.filter(_.MyDateTime.toLocalDate isEqual today)
UPDATE
A major clarification is needed on the question here: Slick was only mentioned in passing, by way of a tag.
However, Slick is central to this question, which hinges on the fact that the filter operation is actually translated into an SQL query by way of PlainColumnExtensionMethods.
I'm not overly familiar with the library, but this must surely mean that you're restricted to just operations which can be executed in SQL. As this is a Column[DateTime] you must therefore compare it to another DateTime.
As for the LINQ example, it seems to recommend first fetching everything and then proceeding as per my example above (performing the comparison in Scala and not in SQL). This is an option, but I suspect you won't want the performance cost that it entails.
UPDATE 2 (just to clarify)
There is no answer.
There's no guarantee that your underlying database can do an equality check between dates and timestamps, so Slick can't rely on such an ability existing.
You're stuck between a rock and a hard place. Either do the range check between timestamps as you already are, or pull everything from the query and filter it in Scala - with the heavy performance cost that this would likely involve.
FINAL UPDATE
To refer to the Linq/NHibernate question you referenced, here are a few quotes:
You can also use the date function from Criteria, via SqlFunction
It depends on the LINQ provider
I'm not sure if NHibernate LINQ provider supports...
So the answers there seem to be either:
Relying on NHibernate to push the date coercion logic into the DB, perhaps silently crippling performance (by fetching all records and filtering locally) if this is not possible
Relying on you to write custom SQL logic
The best-case scenario is that NHibernate could translate date/timestamp comparisons into timestamp range checks. Doing something like that is quite a deep question about how Slick (and slick-joda-mapper) can handle comparisons; the fact that you'd use it in a filter is incidental.
You'd need an extremely compelling use case to write a feature like this yourself, given the risk of creating complicated bugs. You'd be better off:
splitting the column into separate date/time columns
adding the date as a calculated column (maybe in a view)
using custom SQL (or a stored proc) for the query
sticking with the range check
using a helper function
In the case of a helper:
def equalsDate(dt: LocalDate) = {
val start = dt.toDateTimeAtStartOfDay()
val end = dt.plusDays(1).toDateTimeAtStartOfDay()
(col: Column[DateTime]) => {
col >= start && col < end
}
}
val isToday = equalsDate(LocalDate.now)
val todayLogItems = logItems.filter(x => isToday(x.MyDateTime))
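The half-open range used by the helper can be checked in isolation. The sketch below swaps in java.time for Joda-Time and works on plain values rather than Slick columns, so it is an illustration of the range logic only; DayRange, bounds and onDay are hypothetical names.

```scala
import java.time.{LocalDate, LocalDateTime}

object DayRange {
  // Half-open interval [start of day, start of next day).
  def bounds(day: LocalDate): (LocalDateTime, LocalDateTime) =
    (day.atStartOfDay(), day.plusDays(1).atStartOfDay())

  // True if ts falls on the given day: start <= ts < end.
  def onDay(day: LocalDate)(ts: LocalDateTime): Boolean = {
    val (start, end) = bounds(day)
    !ts.isBefore(start) && ts.isBefore(end)
  }
}
```

Using an exclusive upper bound at the start of the next day avoids having to pick a "last instant" of the day such as 23:59:59.999.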
I'm currently developing a small application in Scala using the Play framework, and I would like to persist a list of operations made by a user. Is it possible to store a simple list of ids (List[Long]) using just Anorm, as I'm doing?
Otherwise, what else could I use to make it work? Do I need to use an ORM, as explained in Scala Play! Using anorm or ORM?
If you're talking about persisting to a SQL database then Anorm can certainly handle that for you.
At the most basic level, you could create a table of long integers in your SQL database and then use Anorm to persist your list. Assume you store your integers in a single-column table called UserActions, with its sole column called action:
def saveList(list: List[Long]) = {
DB.withConnection { implicit connection =>
val insertQuery = SQL("insert into UserActions(action) values ({action})")
val batchInsert = (insertQuery.asBatch /: list)(
(sql, elem) => sql.addBatchParams(elem)
)
batchInsert.execute()
}
}
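The (insertQuery.asBatch /: list)(...) expression is a left fold written with the symbolic operator: (z /: xs)(op) means xs.foldLeft(z)(op), and /: is deprecated from Scala 2.13 on. The sketch below shows the same accumulation with foldLeft; Batch is a hypothetical stand-in that only records parameters and is not Anorm's BatchSql.

```scala
// Hypothetical stand-in for a batch statement; it only records the
// parameters added to it, in order.
final case class Batch(params: Vector[Long] = Vector.empty) {
  def addBatchParams(p: Long): Batch = copy(params = params :+ p)
}

object FoldSketch {
  // Start from an empty batch and add one parameter set per list element.
  def build(list: List[Long]): Batch =
    list.foldLeft(Batch())((batch, elem) => batch.addBatchParams(elem))
}
```

Each step returns a new batch with one more parameter set, so the final value carries the whole list in its original order.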
I threw together a little demo for you and I'm pushing it to Heroku, I'll update with the link soon (edit: Heroku and I aren't getting along tonight, sorry).
The code is in my Github at: https://github.com/ryantanner/anorm-batch-demo
Look in models/UserActions.scala to find that snippet specifically. The rest is just fluff to make the demo more interesting.
Now, I'd take a step back for a moment and ask yourself what information you need about these user operations. Semantically, what does that List[Long] mean? Do you need to store more information about those user actions? Should it actually be something like rows of (UserID, PageVisited, Timestamp)?
Untested
I think you need to create a batch insert statement like this:
val insertStatement =
SQL("""INSERT INTO UserOperations (id) VALUES ({id})""")
.asBatch
.addBatchParamsList(List(Seq(1), Seq(2)))
.execute()
Anorm's BatchSql has been updated recently. You may want to check out the latest version.