I have a simple method to retrieve a user from a DB with the Slick plain SQL method:
object Data {
implicit val getListStringResult = GetResult[List[String]] (
prs => (1 to prs.numColumns).map(_ => prs.nextString).toList
)
def getUser(id: Int): Option[List[String]] = DB.withSession {
sql"""SELECT * FROM "user" WHERE "id" = $id""".as[List[String]].firstOption
}
}
The result is List[String] but I would like it to be something like Map[String, String] - column name and value pair map. Is this possible? If so, how?
My stack is Play Framework 2.2.1, Slick 1.0.1, Scala 2.10.3, Java 8 64bit
import scala.slick.jdbc.meta._
val columns = MTable.getTables(None, None, None, None)
.list.filter(_.name.name == "USER") // <- upper case table name
.head.getColumns.list.map(_.column)
val user = sql"""SELECT * FROM "user" WHERE "id" = $id""".as[List[String]].firstOption
.map( columns zip _ toMap )
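The last step above, pairing column names with row values, can be tried without a database. This is only a minimal sketch: the column names and the row are hypothetical stand-ins for the metadata lookup and the query result.

```scala
object ZipColumnsExample {
  // Hypothetical column names, standing in for the MTable metadata lookup.
  val columns: List[String] = List("id", "name", "email")

  // Hypothetical row, standing in for the result of .as[List[String]].firstOption.
  val row: Option[List[String]] = Some(List("1", "alice", "alice@example.com"))

  // zip pairs names and values by position; toMap turns the pairs into a Map.
  // If the two lists differ in length, zip silently drops the extra elements.
  val user: Option[Map[String, String]] =
    row.map(values => columns.zip(values).toMap)
}
```

Because the pairing is positional, it relies on the metadata listing the columns in the same order as the query returns them.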
It's possible to do this without querying the table metadata as follows:
implicit val resultAsStringMap = GetResult[Map[String,String]] ( prs =>
(1 to prs.numColumns).map(_ =>
prs.rs.getMetaData.getColumnName(prs.currentPos+1) -> prs.nextString
).toMap
)
It's also possible to build a Map[String,Any] in the same way but that's obviously more complicated.
Related
I am trying to implement a simple DB query with optional pagination. My attempt:
def getEntities(limit: Option[Int], offset: Option[Int]) = {
// MyTable is a slick definition of the db table
val withLimit = limit.fold(MyTable)(l => MyTable.take(l)) // Error here:
// MyTable and MyTable.take(l)
// have different types
val withOffset = offset.fold(withLimit)(o => withLimit.drop(o))
val query = withOffset.result
db.run(query)
}
The problem is an error:
type mismatch:
found: slick.lifted.Query
required: slick.lifted.TableQuery
How can I make this code compile? And maybe make it a little bit prettier?
My current fix to get Query from TableQuery is to add .filter(_ => true), but IMHO this is not nice:
val withLimit = limit.fold(MyTable.filter(_ => true))(l => MyTable.take(l))
Try to replace
val MyTable = TableQuery[SomeTable]
with
val MyTable: Query[SomeTable, SomeTable#TableElementType, Seq] = TableQuery[SomeTable]
i.e. to specify the type (statically upcast TableQuery to Query).
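The underlying inference issue can be reproduced without Slick. In the sketch below, Query and TableQuery are hypothetical stand-ins for the Slick types; only the subtype relationship and the return type of take/drop matter. It shows a type ascription at the call site as an alternative to changing the declaration of MyTable.

```scala
object FoldUpcastExample {
  // Hypothetical stand-ins for Slick's Query and TableQuery.
  class Query(val description: String) {
    def take(n: Int): Query = new Query(s"$description.take($n)")
    def drop(n: Int): Query = new Query(s"$description.drop($n)")
  }
  class TableQuery(name: String) extends Query(name)

  val myTable = new TableQuery("MyTable")

  def paginate(limit: Option[Int], offset: Option[Int]): Query = {
    // With the Scala 2 compiler, limit.fold(myTable)(l => myTable.take(l)) is
    // rejected: the result type is fixed to TableQuery by the first argument,
    // but take returns the wider Query. Ascribing the wider type up front
    // gives fold the result type it needs.
    val withLimit = limit.fold(myTable: Query)(l => myTable.take(l))
    offset.fold(withLimit)(o => withLimit.drop(o))
  }
}
```

Writing limit.fold[Query](myTable)(...) or limit.map(myTable.take).getOrElse(myTable) should have the same effect, since each of them lets the wider type be chosen.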
How do I convert two columns from a data frame to Map(col1, col2) in Scala?
I tried :
val resultMap = df.select($"col1", $"col2")
.map ({
case Row(a:String, b: String) => Map(a.asInstanceOf[String] ->b.asInstanceOf[String] )
})
But I wasn't able to get the values from this map. Is there any other way to do this?
There is no Dataset Encoder for Map[String, String], and I'm not even sure you can actually make one at all.
Here are two versions, one unsafe and one safe, of what you want to do. Effectively you'll need to drop down to the RDD level to do the computation:
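import org.apache.spark.rdd.RDD
import org.apache.spark.sql.DataFrame
object OnFrame {
// Not shown in the original answer: assumed shape of the Entry type used by safeRDDMap below.
case class Entry(key: String, value: String) {
def toMap: Map[String, String] = Map(key -> value)
}
}
case class OnFrame(df: DataFrame) {
import df.sparkSession.implicits._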
/**
* If input columns don't match we'll fail at query evaluation.
*/
def unsafeRDDMap: RDD[Map[String, String]] = {
df.rdd.map(row => Map(row.getAs[String]("col1") -> row.getAs[String]("col2")))
}
/**
* Use Dataset-to-case-class mapping.
* If input columns don't match we'll fail before query evaluation.
*/
def safeRDDMap: RDD[Map[String, String]] = {
df
.select($"col1" as "key", $"col2" as "value")
.as[OnFrame.Entry]
.rdd
.map(_.toMap)
}
def unsafeMap(): Map[String, String] = {
unsafeRDDMap.reduce(_ ++ _)
}
def safeMap(): Map[String, String] = {
safeRDDMap.reduce(_ ++ _)
}
}
If you describe your goal more clearly, perhaps we could do this even more efficiently: collecting everything into a single map is a potential Spark anti-pattern, since it assumes all your data fits on the driver.
I am looking for a way to generate an UPDATE query over multiple columns that are only known at runtime.
For instance, given a List[(String, Int)], how would I go about generating a query in the form of UPDATE <table> SET k1=v1, k2=v2, kn=vn for all key/value pairs in the list?
I have found that, given a single key/value pair, a plain SQL query can be built as sqlu"UPDATE <table> SET #$key=$value" (where the key is from a trusted source to avoid injection), but I've been unsuccessful in generalizing this to a list of updates without running a query for each.
Is this possible?
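For the plain SQL route, one possible approach is to build the SET clause as text with ? placeholders and keep the values in a separate list for binding. The sketch below is only an illustration under assumptions: buildUpdate and its allowedColumns whitelist are hypothetical names, and the table name is assumed to come from a trusted source.

```scala
object DynamicUpdateExample {
  // Builds "UPDATE <table> SET k1 = ?, k2 = ?" plus the values to bind, in order.
  // Column names cannot be bound as parameters, so they are checked against a
  // whitelist (a hypothetical parameter) before being spliced into the SQL text.
  def buildUpdate(
      table: String,
      updates: List[(String, Int)],
      allowedColumns: Set[String]
  ): (String, List[Int]) = {
    require(updates.nonEmpty, "at least one column is required")
    val unknown = updates.map(_._1).filterNot(allowedColumns)
    val unknownNames = unknown.mkString(", ")
    require(unknown.isEmpty, s"unexpected columns: $unknownNames")

    val setClause = updates.map { case (column, _) => s"$column = ?" }.mkString(", ")
    (s"UPDATE $table SET $setClause", updates.map(_._2))
  }
}
```

The resulting string can be handed to a plain JDBC PreparedStatement (connection.prepareStatement(sql)), binding the values with setInt(index + 1, value) before calling executeUpdate(), so the whole update runs as one statement.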
This is one way to do it. I create a table definition T here with the table and column names (TableDesc) as implicit arguments. I would have thought that it should be possible to set them explicitly, but I couldn't find a way. For the example I create two table query instances, aTable and bTable. Then I insert and select some values, and at the end I update a value in bTable.
import slick.driver.H2Driver.api._
import scala.concurrent.Await
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration
import scala.util.{Failure, Success}
val db = Database.forURL("jdbc:h2:mem:test1;DB_CLOSE_DELAY=-1", "sa", "", null, "org.h2.Driver")
// Field order matches the calls below: TableDesc(table, string column, int column).
case class TableDesc(tableName: String, stringColumnName: String, intColumnName: String)
class T(tag: Tag)(implicit tableDesc: TableDesc) extends Table[(String, Int)](tag, tableDesc.tableName) {
def stringColumn = column[String](tableDesc.stringColumnName)
def intColumn = column[Int](tableDesc.intColumnName)
def * = (stringColumn, intColumn)
}
val aTable = {
implicit val tableDesc = TableDesc("TABLE_A", "sa", "ia")
TableQuery[T]
}
val bTable = {
implicit val tableDesc = TableDesc("TABLE_B", "sb", "ib")
TableQuery[T]
}
val future = for {
_ <- db.run(aTable.schema.create)
_ <- db.run(aTable += ("Hi", 1))
resultA <- db.run(aTable.result)
_ <- db.run(bTable.schema.create)
_ <- db.run(bTable ++= Seq(("Test1", 1), ("Test2", 2)))
_ <- db.run(bTable.filter(_.stringColumn === "Test1").map(_.intColumn).update(3))
resultB <- db.run(bTable.result)
} yield (resultA, resultB)
Await.result(future, Duration.Inf)
future.onComplete {
case Success(a) => println(s"OK $a")
case Failure(f) => println(s"DOH $f")
}
Thread.sleep(500)
I've got the sleep statement at the end to ensure that Future.onComplete gets time to finish before the application ends. Is there any other way?
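One way to avoid the sleep is to move the success/failure handling into the Future itself and block on that. Await.result only waits for the future it is given; a callback registered with onComplete runs asynchronously afterwards, and the global execution context uses daemon threads, so the JVM can exit before the callback prints. A minimal sketch, where compute is a hypothetical stand-in for the database work:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object AwaitInsteadOfSleep {
  // Hypothetical stand-in for the database work done in the for-comprehension.
  def compute(): Future[(Int, Int)] = Future((1, 3))

  // Turn the outcome into a message inside the Future (instead of a side effect
  // in onComplete), so that waiting on the result also waits for the handling.
  def describe(): Future[String] =
    compute()
      .map(result => s"OK $result")
      .recover { case e => s"DOH $e" }

  // Await.result blocks until the message is ready; no Thread.sleep is needed.
  def run(): String = Await.result(describe(), 10.seconds)
}
```

Calling println(AwaitInsteadOfSleep.run()) on the main thread then prints the message before the program can exit.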
Suppose I have a List[Map[String, String]] that represents a table in a database, and a List[String] that represents a list of column names. I'd like to implement the equivalent of a group by clause in SQL query:
def fun(table: List[Map[String, String]], keys: List[String]): List[List[Map[String, String]]]
For example:
val table = List(
Map("name"->"jade", "job"->"driver", "sex"->"male"),
Map("name"->"mike", "job"->"police", "sex"->"female"),
Map("name"->"jane", "job"->"clerk", "sex"->"female"),
Map("name"->"smith", "job"->"driver", "sex"->"male")
)
val keys = List("job", "sex")
And then fun(table,keys) should be:
List(
List(
Map("name"->"jade", "job"->"driver", "sex"->"male"),
Map("name"->"smith", "job"->"driver", "sex"->"male")
),
List(Map("name"->"mike", "job"->"police", "sex"->"female")),
List(Map("name"->"jane", "job"->"clerk", "sex"->"female"))
)
You're looking for groupBy:
table.groupBy(row => keys.map(key => row(key))) map {
case (group, values) => values
}
Or more concisely:
table.groupBy(keys.map(_)).map(_._2)
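The concise form works because a Map[String, String] is itself a function from String to String, so keys.map(row) looks up every key in the row. Put together as the requested fun, a sketch could look like this (the Row alias is just a local shorthand):

```scala
object GroupByKeysExample {
  type Row = Map[String, String]

  // Rows are grouped by the list of values they hold for the given keys.
  // row(key) throws NoSuchElementException if a row lacks one of the keys.
  def fun(table: List[Row], keys: List[String]): List[List[Row]] =
    table.groupBy(row => keys.map(key => row(key))).values.toList

  val table: List[Row] = List(
    Map("name" -> "jade", "job" -> "driver", "sex" -> "male"),
    Map("name" -> "mike", "job" -> "police", "sex" -> "female"),
    Map("name" -> "jane", "job" -> "clerk", "sex" -> "female"),
    Map("name" -> "smith", "job" -> "driver", "sex" -> "male")
  )

  val grouped: List[List[Row]] = fun(table, List("job", "sex"))
}
```

groupBy returns a Map, so the order of the groups in the resulting list is not guaranteed to match the order in the example above.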
I have methods in my Play app that query database tables with over a hundred columns. I can't define a case class for each such query, because it would be ridiculously big and would have to be changed every time the table is altered in the database.
I'm using this approach, where result of the query looks like this:
Map(columnName1 -> columnVal1, columnName2 -> columnVal2, ...)
Example of the code:
implicit val getListStringResult = GetResult[List[Any]] (
r => (1 to r.numColumns).map(_ => r.nextObject).toList
)
def getSomething(): Map[String, Any] = DB.withSession {
val columns = MTable.getTables(None, None, None, None).list.filter(_.name.name == "myTable").head.getColumns.list.map(_.column)
// The mapped result is the last expression, so it becomes the method's return value.
sql"""SELECT * FROM myTable LIMIT 1""".as[List[Any]].firstOption.map(columns zip _ toMap).get
}
This is not a problem when the query only runs on a single database and a single table. I need to be able to use multiple tables and databases in my query, like this:
def getSomething(): Map[String, Any] = DB.withSession {
//The line below is no longer valid because of multiple tables/databases
val columns = MTable.getTables(None, None, None, None).list.filter(_.name.name == "table1").head.getColumns.list.map(_.column)
sql"""
SELECT *
FROM db1.table1
LEFT JOIN db2.table2 ON db2.table2.col1 = db1.table1.col1
LIMIT 1
""".as[List[Any]].firstOption.map(columns zip _ toMap).get
}
The same approach can no longer be used to retrieve column names. This problem doesn't exist when using something like PHP PDO or Java JDBCTemplate - these retrieve column names without any extra effort needed.
My question is: how do I achieve this with Slick?
import scala.slick.jdbc.{GetResult,PositionedResult}
object ResultMap extends GetResult[Map[String,Any]] {
def apply(pr: PositionedResult) = {
val rs = pr.rs // <- jdbc result set
val md = rs.getMetaData
val res = (1 to pr.numColumns).map { i => md.getColumnName(i) -> rs.getObject(i) }.toMap
pr.nextRow // <- use Slick's advance method to avoid endless loop
res
}
}
val result = sql"select * from ...".as(ResultMap).firstOption
Another variant that produces a map containing only the non-null columns (keys in lowercase):
private implicit val getMap = GetResult[Map[String, Any]](r => {
val metadata = r.rs.getMetaData
(1 to r.numColumns).flatMap(i => {
val columnName = metadata.getColumnName(i).toLowerCase
val columnValue = r.nextObjectOption
columnValue.map(columnName -> _)
}).toMap
})
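With joined tables, two columns can share a name (col1 exists in both table1 and table2 in the question), and a Map keyed by the bare column name keeps only one of them. A possible workaround is to qualify ambiguous names with their table name. The sketch below is pure Scala and assumes the (table, column, value) triples were already read per column, for example via ResultSetMetaData.getTableName(i), getColumnLabel(i) and ResultSet.getObject(i); the Cell alias and the toMap helper are hypothetical, and some drivers return an empty table name.

```scala
object QualifiedColumnsExample {
  // One cell of a result row: (table name, column name, value).
  // Hypothetical shape; in real code these would come from JDBC metadata.
  type Cell = (String, String, Any)

  // Qualify a column with its table name only when the bare name is ambiguous
  // and a table name is actually available.
  def toMap(cells: Seq[Cell]): Map[String, Any] = {
    val counts = cells.groupBy(_._2).map { case (name, group) => name -> group.size }
    cells.map { case (table, column, value) =>
      val key =
        if (counts(column) > 1 && table.nonEmpty) s"$table.$column" else column
      key -> value
    }.toMap
  }
}
```

Columns that appear only once keep their plain name, so existing lookups for unambiguous columns continue to work.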