scala load config Map of Map - scala

I need to read from a config file and map the config to a case class .It works fine if i have one table as below
CONFIG
mapping {
target {
oracle = {
type = "oracle"
schema = "orcl"
tableName = "my_table"
query = "select key from my_table where dob='2020-01-01'
}
}
SCALA CODE SNIPPET
val targetConfig:Map[String,QueryEngine] = config.getObject("mapping.target")
.entrySet()
.asScala
.foldLeft(Map.empty[String , QueryEngine]) { case ( acc , entry ) =>
val target = entry.getKey
val targetConfig = entry.getValue match {
case validElement if validElement.valueType() == ConfigValueType.OBJECT => validElement.asInstanceOf[ConfigObject].toConfig
case invalidElement => sys.error("illegal syntax at $invalidElement")
}
targetConfig.getString("type") match {
case "oracle" => acc + (target -> new OracleQueryEngine(vars,target,targetConfig.getString("schema"),targetConfig.getString("tableName"),targetConfig.getString("query"),targetConfig.getString("param")))
case x => sys.error(s"unknow target not defined $targetConfig with $targetConfig")
}
}
NOW i updated CONFIG with MULTIPLE tables in the target mapping.
mapping {
target {
oracle =
emp = {
type = "oracle"
schema = "orcl"
tableName = "emp"
query = "select key from emp where dob='2020-01-01'
}
dept = {
type = "oracle"
schema = "orcl"
tableName = "dept"
query = "select key from dept where dob='2020-01-01'
}
}
}
CODE SNIPPET for the multiple table scenario
This is giving error
Error:(22, 28) type mismatch;
found : scala.collection.mutable.Buffer[scala.collection.immutable.Map[String,QueryEngine]]
required: scala.collection.immutable.Map[String,QueryEngine]
objs.asScala.map { obj =>
CODE SNIPPET
val sourcesConfig: Map[String, QueryEngine] =
config.getObject("mapping.target")
.entrySet()
.asScala
.foldLeft(Map.empty[String, QueryEngine]) { case (acc, entry) =>
val source = entry.getKey
entry.getValue match {
case objs:ConfigList =>
objs.asScala.map { obj =>
val sourceConfig = obj.asInstanceOf[ConfigObject].toConfig
sourceConfig.getString("type") match {
case "oracle" => acc + (source -> new OracleQueryEngine(vars,source,sourceConfig.getString("schema"), sourceConfig.getString("tableName"), sourceConfig.getString("query"), sourceConfig.getString("param")))
case x => sys.error(s"unknown source not defined $source with $sourceConfig")
}
}
}

Wrap "oracle" in curly brackets as it is a JsObject:
mapping {
target {
oracle = {
emp = {
type = "oracle"
schema = "orcl"
tableName = "emp"
query = "select key from emp where dob='2020-01-01'
}
dept = {
type = "oracle"
schema = "orcl"
tableName = "dept"
query = "select key from dept where dob='2020-01-01'
}
}}
}
Does it work now?
If not use for example PureConfig.
Your solution could be as simple as this:
import pureconfig._
import pureconfig.generic.auto._
case class Mappings(mapping: Mapping)
case class Mapping(target: Oracle)
case class Oracle(oracle: Map[String, Map[String, String]])
ConfigSource.default.load[Mappings]

Related

scala read config to Map of Map of case class

I need to read from a config file and map the config to a case class .It works fine if i have one table as below
CONFIG
mapping {
target {
oracle = {
type = "oracle"
schema = "orcl"
tableName = "my_table"
query = "select key from my_table where dob='2020-01-01'
}
}
SCALA CODE SNIPPET
val targetConfig:Map[String,QueryEngine] = config.getObject("mapping.target")
.entrySet()
.asScala
.foldLeft(Map.empty[String , QueryEngine]) { case ( acc , entry ) =>
val target = entry.getKey
val targetConfig = entry.getValue match {
case validElement if validElement.valueType() == ConfigValueType.OBJECT => validElement.asInstanceOf[ConfigObject].toConfig
case invalidElement => sys.error("illegal syntax at $invalidElement")
}
targetConfig.getString("type") match {
case "oracle" => acc + (target -> new OracleQueryEngine(vars,target,targetConfig.getString("schema"),targetConfig.getString("tableName"),targetConfig.getString("query"),targetConfig.getString("param")))
case x => sys.error(s"unknow target not defined $targetConfig with $targetConfig")
}
}
NOW i updated CONFIG with MULTIPLE tables in the target mapping.
mapping {
target {
oracle =
emp = {
type = "oracle"
schema = "orcl"
tableName = "emp"
query = "select key from emp where dob='2020-01-01'
}
dept = {
type = "oracle"
schema = "orcl"
tableName = "dept"
query = "select key from dept where dob='2020-01-01'
}
}
}
CODE SNIPPET for the multiple table scenario This is giving error Expression of mutable.Set[Map[String,QueryEngine]] doesnt confirm to Map[Query,String]
val targetConfig:Map[String,QueryEngine] = config.getObject("mapping.target")
.entrySet()
.asScala
.foldLeft(Map.empty[String , QueryEngine]) { case ( acc , entry ) =>
val target = entry.getKey
val targetConfig = entry.getValue match {
case validElement if validElement.valueType() == ConfigValueType.OBJECT => validElement.asInstanceOf[ConfigObject].toConfig
case invalidElement => sys.error("illegal syntax at $invalidElement")
}
targetConfig.getObject(s"mapping.target.$target").keySet().asScala.map { key =>
target match {
case "oracle" => acc + (target -> new OracleQueryEngine(vars,target,targetConfig.getString("schema"),targetConfig.getString("tableName"),targetConfig.getString("query"),targetConfig.getString("param")))
case x => sys.error(s"unknow target not defined $targetConfig with $targetConfig")
}
}
}
PURE CONFIG CODE
import pureconfig._
import pureconfig.generic.ProductHint
import pureconfig.generic.auto._
import com.typesafe.config.ConfigFactory
import pureconfig._
import pureconfig.generic.auto._
object PureconfigExample {
case class QueryEngine(`type`: String, schema: String, tableName: String, query: String)
type DatabaseConfig = Map[String, Map[String, QueryEngine]]
case class TargetConfig(target: DatabaseConfig)
case class ApplicationConfig(mapping: TargetConfig)
def main(args: Array[String]): Unit = {
val configString =
"""
|mapping {
| target {
| oracle {
| emp {
| type = "oracle"
| schema = "orcl"
| tableName = "emp"
| query = "select key from emp where dob='2020-01-01'"
| }
| dept {
| type = "oracle"
| schema = "orcl"
| tableName = "dept"
| query = "select key from dept where dob='2020-01-01'"
| }
| }
| }
|}
|""".stripMargin
import pureconfig.generic.auto._
// See for more details - https://pureconfig.github.io/docs/overriding-behavior-for-case-classes.html
implicit def hint[T]: ProductHint[T] = ProductHint[T](ConfigFieldMapping(CamelCase, CamelCase))
val config = ConfigSource.string(configString).load[ApplicationConfig]
println(config)
}
}
POM.XML
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>test2</groupId>
<artifactId>test2</artifactId>
<version>1.0-SNAPSHOT</version>
<dependencies>
<dependency>
<groupId>com.github.pureconfig</groupId>
<artifactId>pureconfig-generic-base_2.12</artifactId>
<version>0.12.2</version>
</dependency>
<dependency>
<groupId>com.github.pureconfig</groupId>
<artifactId>pureconfig-generic_2.12</artifactId>
<version>0.12.0</version>
</dependency>
</dependencies>
</project>
COMPILER ERROR
PureConfigExample.scala
Error:(43, 56) Cannot find an implicit instance of pureconfig.ConfigReader[PureconfigExample.ApplicationConfig].
If you are trying to read or write a case class or sealed trait consider using PureConfig's auto derivation by adding `import pureconfig.generic.auto._`
val config = ConfigSource.string(configString).load[ApplicationConfig]
Error:(43, 56) not enough arguments for method load: (implicit reader: pureconfig.Derivation[pureconfig.ConfigReader[PureconfigExample.ApplicationConfig]])pureconfig.ConfigReader.Result[PureconfigExample.ApplicationConfig].
Unspecified value parameter reader.
val config = ConfigSource.string(configString).load[ApplicationConfig]
Excuse me, for probably, not the answer you probably expected, but I strongly encourage avoid manual config parsing, because there are tools that doing it for you automatically, without boiler-plate code and type-safe at the same time. The most popular and probably the best library in Scala eco-system is PureConfig.
So in you case the solution would look like something like this:
import pureconfig._
import pureconfig.generic.ProductHint
import pureconfig.generic.auto._
object PureconfigExample {
case class QueryEngine(`type`: String, schema: String, tableName: String, query: String)
type DatabaseConfig = Map[String, Map[String, QueryEngine]]
case class TargetConfig(target: DatabaseConfig)
case class ApplicationConfig(mapping: TargetConfig)
def main(args: Array[String]): Unit = {
val configString =
"""
|mapping {
| target {
| oracle {
| emp {
| type = "oracle"
| schema = "orcl"
| tableName = "emp"
| query = "select key from emp where dob='2020-01-01'"
| }
| dept {
| type = "oracle"
| schema = "orcl"
| tableName = "dept"
| query = "select key from dept where dob='2020-01-01'"
| }
| }
| }
|}
|""".stripMargin
// See for more details - https://pureconfig.github.io/docs/overriding-behavior-for-case-classes.html
implicit def hint[T]: ProductHint[T] = ProductHint[T](ConfigFieldMapping(CamelCase, CamelCase))
val config = ConfigSource.string(configString).load[ApplicationConfig]
println(config)
}
}
which in my case produces next result:
Right(ApplicationConfig(TargetConfig(Map(oracle -> Map(emp -> QueryEngine(oracle,orcl,emp,select key from emp where dob='2020-01-01'), dept -> QueryEngine(oracle,orcl,dept,select key from dept where dob='2020-01-01'))))))
Hope this help!

MongoSpark duplicate key error only on massive datasets

Using MongoSpark, running the same code on 2 different datasets of differing sizes causing one to throw the E11000 duplicate key error.
Before we proceed, here is the code below:
object ScrapeHubCompanyImporter {
def importData(path: String, companyMongoUrl: String): Unit = {
val spark = SparkSession.builder()
.master("local[*]")
.config("spark.mongodb.input.uri", companyMongoUrl)
.config("spark.mongodb.output.uri", companyMongoUrl)
.config("spark.mongodb.input.partitionerOptions.partitionKey", "profileUrl")
.getOrCreate()
import spark.implicits._
val websiteToDomainTransformer = udf((website: String) => {
val tldExtract = SplitHost.fromURL(website)
if (tldExtract.domain == "") {
null
} else {
tldExtract.domain + "." + tldExtract.tld
}
})
val jsonDF =
spark
.read
.json(path)
.filter { row =>
row.getAs[String]("canonical_url") != null
}
.dropDuplicates(Seq("canonical_url"))
.select(
toHttpsUdf($"canonical_url").as("profileUrl"),
$"city",
$"country",
$"founded",
$"hq".as("headquartes"),
$"industry",
$"company_id".as("companyId"),
$"name",
$"postal",
$"size",
$"specialties",
$"state",
$"street_1",
$"street_2",
$"type",
$"website"
)
.filter { row => row.getAs[String]("website") != null }
.withColumn("domain", websiteToDomainTransformer($"website"))
.filter(row => row.getAs[String]("domain") != null)
.as[ScrapeHubCompanyDataRep]
val jsonColsSet = jsonDF.columns.toSet
val mongoData = MongoSpark
.load[LinkedinCompanyRep](spark)
.withColumn("companyUrl", toHttpsUdf($"companyUrl"))
.as[CompanyRep]
val mongoColsSet = mongoData.columns.toSet
val union = jsonDF.joinWith(
mongoData,
jsonDF("companyUrl") === mongoData("companyUrl"),
joinType = "left")
.map { t =>
val scrapeHub = t._1
val liCompanyRep = if (t._2 != null ) {
t._2
} else {
CompanyRep(domain = scrapeHub.domain)
}
CompanyRep(
_id = pickValue(liCompanyRep._id, None),
city = pickValue(scrapeHub.city, liCompanyRep.city),
country = pickValue(scrapeHub.country, liCompanyRep.country),
postal = pickValue(scrapeHub.postal, liCompanyRep.postal),
domain = scrapeHub.domain,
founded = pickValue(scrapeHub.founded, liCompanyRep.founded),
headquartes = pickValue(scrapeHub.headquartes, liCompanyRep.headquartes),
headquarters = liCompanyRep.headquarters,
industry = pickValue(scrapeHub.industry, liCompanyRep.industry),
linkedinId = pickValue(scrapeHub.companyId, liCompanyRep.companyId),
companyUrl = Option(scrapeHub.companyUrl),
name = pickValue(scrapeHub.name, liCompanyRep.name),
size = pickValue(scrapeHub.size, liCompanyRep.size),
specialties = pickValue(scrapeHub.specialties, liCompanyRep.specialties),
street_1 = pickValue(scrapeHub.street_1, liCompanyRep.street_1),
street_2 = pickValue(scrapeHub.street_2, liCompanyRep.street_2),
state = pickValue(scrapeHub.state, liCompanyRep.state),
`type` = pickValue(scrapeHub.`type`, liCompanyRep.`type`),
website = pickValue(scrapeHub.website, liCompanyRep.website),
updatedDate = None,
scraped = Some(true)
)
}
val idToMongoId = udf { st: String =>
if (st != null) {
ObjectId(st)
} else {
null
}
}
val saveReady =
union
.map { rep =>
rep.copy(
updatedDate = Some(new Timestamp(System.currentTimeMillis)),
scraped = Some(true),
headquarters = generateCompanyHeadquarters(rep)
)
}
.dropDuplicates(Seq("companyUrl"))
MongoSpark.save(
saveReady.withColumn("_id", idToMongoId($"_id")),
WriteConfig(Map(
"uri" -> companyMongoUrl
)))
}
def generateCompanyHeadquarters(companyRep: CompanyRep): Option[CompanyHeadquarters] = {
val hq = CompanyHeadquarters(
country = companyRep.country,
geographicArea = companyRep.state,
city = companyRep.city,
postalCode = companyRep.postal,
line1 = companyRep.street_1,
line2 = companyRep.street_2
)
CompanyHeadquarters
.unapply(hq)
.get
.productIterator.toSeq.exists {
case a: Option[_] => a.isDefined
case _ => false
} match {
case true => Some(hq)
case false => None
}
}
def pickValue(left: Option[String], right: Option[String]): Option[String] = {
def _noneIfNull(opt: Option[String]): Option[String] = {
if (opt != null) {
opt
} else {
None
}
}
val lOpt = _noneIfNull(left)
val rOpt = _noneIfNull(right)
lOpt match {
case Some(l) => Option(l)
case None => rOpt match {
case Some(r) => Option(r)
case None => None
}
}
}
}
This issue is around the companyUrl which is one of the unique keys in the collection, the other being the _id key. The issue is that there are tons of duplicates that Spark will attempt to save on a 700gb dataset, but if I run a very small dataset locally, Im never able to replicate the issue. Im trying to understand whats going on, and how can I make sure to group all the existing companies on the companyUrl, and make sure that duplicates really are removed globally across the dataset.
EDIT
Here are some scenarios that arise:
Company is in Mongo, the file thats read has updated data -> Duplicate key error can occur here
Company not in Mongo but in file -> Duplicate key error can occur here as well.
EDIT2
The duplication error occurs around companyUrl field.
EDIT 3
I've narrowed down this as being an issue the merging stage. Looking through records that have been marked as having duplicate companyUrl's, some of those records are not in the target collection yet somehow a duplicate record is still being written to the collection. In other situations, the _id field of the new record doesn't match the old record that had the same companyUrl.

Insert if not exists in Slick 3.0.0

I'm trying to insert if not exists, I found this post for 1.0.1, 2.0.
I found snippet using transactionally in the docs of 3.0.0
val a = (for {
ns <- coffees.filter(_.name.startsWith("ESPRESSO")).map(_.name).result
_ <- DBIO.seq(ns.map(n => coffees.filter(_.name === n).delete): _*)
} yield ()).transactionally
val f: Future[Unit] = db.run(a)
I'm struggling to write the logic from insert if not exists with this structure. I'm new to Slick and have little experience with Scala. This is my attempt to do insert if not exists outside the transaction...
val result: Future[Boolean] = db.run(products.filter(_.name==="foo").exists.result)
result.map { exists =>
if (!exists) {
products += Product(
None,
productName,
productPrice
)
}
}
But how do I put this in the transactionally block? This is the furthest I can go:
val a = (for {
exists <- products.filter(_.name==="foo").exists.result
//???
// _ <- DBIO.seq(ns.map(n => coffees.filter(_.name === n).delete): _*)
} yield ()).transactionally
Thanks in advance
It is possible to use a single insert ... if not exists query. This avoids multiple database round-trips and race conditions (transactions may not be enough depending on isolation level).
def insertIfNotExists(name: String) = users.forceInsertQuery {
val exists = (for (u <- users if u.name === name.bind) yield u).exists
val insert = (name.bind, None) <> (User.apply _ tupled, User.unapply)
for (u <- Query(insert) if !exists) yield u
}
Await.result(db.run(DBIO.seq(
// create the schema
users.schema.create,
users += User("Bob"),
users += User("Bob"),
insertIfNotExists("Bob"),
insertIfNotExists("Fred"),
insertIfNotExists("Fred"),
// print the users (select * from USERS)
users.result.map(println)
)), Duration.Inf)
Output:
Vector(User(Bob,Some(1)), User(Bob,Some(2)), User(Fred,Some(3)))
Generated SQL:
insert into "USERS" ("NAME","ID") select ?, null where not exists(select x2."NAME", x2."ID" from "USERS" x2 where x2."NAME" = ?)
Here's the full example on github
This is the version I came up with:
val a = (
products.filter(_.name==="foo").exists.result.flatMap { exists =>
if (!exists) {
products += Product(
None,
productName,
productPrice
)
} else {
DBIO.successful(None) // no-op
}
}
).transactionally
It's is a bit lacking though, for example it would be useful to return the inserted or existing object.
For completeness, here the table definition:
case class DBProduct(id: Int, uuid: String, name: String, price: BigDecimal)
class Products(tag: Tag) extends Table[DBProduct](tag, "product") {
def id = column[Int]("id", O.PrimaryKey, O.AutoInc) // This is the primary key column
def uuid = column[String]("uuid")
def name = column[String]("name")
def price = column[BigDecimal]("price", O.SqlType("decimal(10, 4)"))
def * = (id, uuid, name, price) <> (DBProduct.tupled, DBProduct.unapply)
}
val products = TableQuery[Products]
I'm using a mapped table, the solution works also for tuples, with minor changes.
Note also that it's not necessary to define the id as optional, according to the documentation it's ignored in insert operations:
When you include an AutoInc column in an insert operation, it is silently ignored, so that the database can generate the proper value
And here the method:
def insertIfNotExists(productInput: ProductInput): Future[DBProduct] = {
val productAction = (
products.filter(_.uuid===productInput.uuid).result.headOption.flatMap {
case Some(product) =>
mylog("product was there: " + product)
DBIO.successful(product)
case None =>
mylog("inserting product")
val productId =
(products returning products.map(_.id)) += DBProduct(
0,
productInput.uuid,
productInput.name,
productInput.price
)
val product = productId.map { id => DBProduct(
id,
productInput.uuid,
productInput.name,
productInput.price
)
}
product
}
).transactionally
db.run(productAction)
}
(Thanks Matthew Pocock from Google group thread, for orienting me to this solution).
I've run into the solution that looks more complete. Section 3.1.7 More Control over Inserts of the Essential Slick book has the example.
At the end you get smth like:
val entity = UserEntity(UUID.random, "jay", "jay#localhost")
val exists =
users
.filter(
u =>
u.name === entity.name.bind
&& u.email === entity.email.bind
)
.exists
val selectExpression = Query(
(
entity.id.bind,
entity.name.bind,
entity.email.bind
)
).filterNot(_ => exists)
val action = usersDecisions
.map(u => (u.id, u.name, u.email))
.forceInsertQuery(selectExpression)
exec(action)
// res17: Int = 1
exec(action)
// res18: Int = 0
according to the slick 3.0 manual insert query section (http://slick.typesafe.com/doc/3.0.0/queries.html), the inserted values can be returned with id as below:
def insertIfNotExists(productInput: ProductInput): Future[DBProduct] = {
val productAction = (
products.filter(_.uuid===productInput.uuid).result.headOption.flatMap {
case Some(product) =>
mylog("product was there: " + product)
DBIO.successful(product)
case None =>
mylog("inserting product")
(products returning products.map(_.id)
into ((prod,id) => prod.copy(id=id))) += DBProduct(
0,
productInput.uuid,
productInput.name,
productInput.price
)
}
).transactionally
db.run(productAction)
}

Slick code generation for only a single schema

Is there a way to have Slick's code generation generate code for only a single schema? Say, public? I have extensions that create a whole ton of tables (eg postgis, pg_jobman) that make the code that slick generates gigantic.
Use this code with appropriate values and schema name,
object CodeGenerator {
def outputDir :String =""
def pkg:String =""
def schemaList:String = "schema1, schema2"
def url:String = "dburl"
def fileName:String =""
val user = "dbUsername"
val password = "dbPassword"
val slickDriver="scala.slick.driver.PostgresDriver"
val JdbcDriver = "org.postgresql.Driver"
val container = "Tables"
def generate() = {
val driver: JdbcProfile = buildJdbcProfile
val schemas = createSchemaList
var model = createModel(driver,schemas)
val codegen = new SourceCodeGenerator(model){
// customize Scala table name (table class, table values, ...)
override def tableName = dbTableName => dbTableName match {
case _ => dbTableName+"Table"
}
override def code = {
//imports is copied right out of
//scala.slick.model.codegen.AbstractSourceCodeGenerator
val imports = {
"import scala.slick.model.ForeignKeyAction\n" +
(if (tables.exists(_.hlistEnabled)) {
"import scala.slick.collection.heterogenous._\n" +
"import scala.slick.collection.heterogenous.syntax._\n"
} else ""
) +
(if (tables.exists(_.PlainSqlMapper.enabled)) {
"import scala.slick.jdbc.{GetResult => GR}\n" +
"// NOTE: GetResult mappers for plain SQL are only generated for tables where Slick knows how to map the types of all columns.\n"
} else ""
) + "\n\n" //+ tables.map(t => s"implicit val ${t.model.name.table}Format = Json.format[${t.model.name.table}]").mkString("\n")+"\n\n"
}
val bySchema = tables.groupBy(t => {
t.model.name.schema
})
val schemaFor = (schema: Option[String]) => {
bySchema(schema).sortBy(_.model.name.table).map(
_.code.mkString("\n")
).mkString("\n\n")
}
}
val joins = tables.flatMap( _.foreignKeys.map{ foreignKey =>
import foreignKey._
val fkt = referencingTable.TableClass.name
val pkt = referencedTable.TableClass.name
val columns = referencingColumns.map(_.name) zip
referencedColumns.map(_.name)
s"implicit def autojoin${fkt + name.toString} = (left:${fkt} ,right:${pkt}) => " +
columns.map{
case (lcol,rcol) =>
"left."+lcol + " === " + "right."+rcol
}.mkString(" && ")
})
override def entityName = dbTableName => dbTableName match {
case _ => dbTableName
}
override def Table = new Table(_) {
table =>
// customize table value (TableQuery) name (uses tableName as a basis)
override def TableValue = new TableValue {
override def rawName = super.rawName.uncapitalize
}
// override generator responsible for columns
override def Column = new Column(_){
// customize Scala column names
override def rawName = (table.model.name.table,this.model.name) match {
case _ => super.rawName
}
}
}
}
println(outputDir+"\\"+fileName)
(new File(outputDir)).mkdirs()
val fw = new FileWriter(outputDir+File.separator+fileName)
fw.write(codegen.packageCode(slickDriver, pkg, container))
fw.close()
}
def createModel(driver: JdbcProfile, schemas:Set[Option[String]]): Model = {
driver.simple.Database
.forURL(url, user = user, password = password, driver = JdbcDriver)
.withSession { implicit session =>
val filteredTables = driver.defaultTables.filter(
(t: MTable) => schemas.contains(t.name.schema)
)
PostgresDriver.createModel(Some(filteredTables))
}
}
def createSchemaList: Set[Option[String]] = {
schemaList.split(",").map({
case "" => None
case (name: String) => Some(name)
}).toSet
}
def buildJdbcProfile: JdbcProfile = {
val module = currentMirror.staticModule(slickDriver)
val reflectedModule = currentMirror.reflectModule(module)
val driver = reflectedModule.instance.asInstanceOf[JdbcProfile]
driver
}
}
I encountered the same problem and I found this question. The answer by S.Karthik sent me in the right direction. However, the code in the answer is slightly outdated. And I think a bit over-complicated. So I crafted my own solution:
import slick.codegen.SourceCodeGenerator
import slick.driver.JdbcProfile
import slick.model.Model
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, ExecutionContext}
val slickDriver = "slick.driver.PostgresDriver"
val jdbcDriver = "org.postgresql.Driver"
val url = "jdbc:postgresql://localhost:5432/mydb"
val outputFolder = "/path/to/src/test/scala"
val pkg = "com.mycompany"
val user = "user"
val password = "password"
object MySourceCodeGenerator {
def run(slickDriver: String, jdbcDriver: String, url: String, outputDir: String,
pkg: String, user: Option[String], password: Option[String]): Unit = {
val driver: JdbcProfile =
Class.forName(slickDriver + "$").getField("MODULE$").get(null).asInstanceOf[JdbcProfile]
val dbFactory = driver.api.Database
val db = dbFactory.forURL(url, driver = jdbcDriver, user = user.orNull,
password = password.orNull, keepAliveConnection = true)
try {
// **1**
val allSchemas = Await.result(db.run(
driver.createModel(None, ignoreInvalidDefaults = false)(ExecutionContext.global).withPinnedSession), Duration.Inf)
// **2**
val publicSchema = new Model(allSchemas.tables.filter(_.name.schema.isEmpty), allSchemas.options)
// **3**
new SourceCodeGenerator(publicSchema).writeToFile(slickDriver, outputDir, pkg)
} finally db.close
}
}
MySourceCodeGenerator.run(slickDriver, jdbcDriver, url, outputFolder, pkg, Some(user), Some(password))
I'll explain what's going on here:
I copied the run function from the SourceCodeGenerator class that's in the slick-codegen library. (I used version slick-codegen_2.10-3.1.1.)
// **1**: In the origninal code, the generated Model was referenced in a val called m. I renamed that to allSchemas.
// **2**: I created a new Model (publicSchema), using the options from the original model, and using a filtered version of the tables set from the original model. It turns out tables from the public schema don't get a schema name in the model. Hence the isEmpty. Should you need tables from one or more other schemas, you can easily create a different filter expression.
// **3**: I create a SourceCodeGenerator with the created publicSchema model.
Of course, it would even be better if the Slick codegenerator could incorporate an option to select one or more schemas.

How to populate User defined objects from list of lists in scala

I am new to Scala.
Currently trying to write a program which fetches table metadata from a database as a list of lists in the below format, and convert it to Objects of user defined type TableFieldMetadata.
The getAllTableMetaDataAsVO function does this conversion.
Can you tell me if I can write this function much better in functional way.
| **Table Name** | **FieldName** | **Data type** |
| table xxxxxx | field yyyyyy | type zzzz |
| table xxxxxx | field wwwww| type mmm|
| table qqqqqq| field nnnnnn| type zzzz |
Note: Here table name can repeat as it usually has multiple columns.
User defined classes:
1. TableFieldMetadata:
/**
* Class to hold the meta data for a table
*/
class TableFieldMetadata (name: String){
var tableName: String = name
var fieldMetaDataList: List[FieldMetaData] = List()
def add(fieldMetadata: FieldMetaData) {
fieldMetaDataList = fieldMetadata :: fieldMetaDataList
}
}
2. FieldMetaData :
/**
* Class to hold the meta data for a field
*/
class FieldMetaData (fieldName: String, fieldsDataType: String) {
var name:String = fieldName
var dataType:String = fieldsDataType
}
Function:
/**
* Function to convert list of lists to user defined objects
*/
def getAllTableMetaDataAsVO(allTableMetaData: List[List[String]]):List[TableFieldMetadata] = {
var currentTableName:String = null
var currentTable: TableFieldMetadata = null;
var tableFieldMetadataList: List[TableFieldMetadata] = List()
allTableMetaData.foreach { tableFieldMetadataItem =>
var tableName = tableFieldMetadataItem.head
if (currentTableName == null || !currentTableName.equals(tableName)) {
currentTableName = tableName
currentTable = new TableFieldMetadata(tableName)
tableFieldMetadataList = currentTable :: tableFieldMetadataList
}
if (currentTableName.equals(tableName)) {
var tableField = tableFieldMetadataItem.tail
currentTable.add(new FieldMetaData(tableField(0), tableField(1)))
}
}
return tableFieldMetadataList
}
Here is one solution. NOTE: I just used Table and Field for easy of typing into the REPL. You should use case classes.
scala> case class Field( name: String, dType : String)
defined class Field
scala> case class Table(name : String, fields : List[Field])
defined class Table
scala> val rawData = List( List("table1", "field1", "string"), List("table1", "field2", "int"), List("table2", "field1", "string") )
rawData: List[List[String]] = List(List(table1, field1, string), List(table1, field2, int), List(table2, field1, string))
scala> val grouped = rawData.groupBy(_.head)
grouped: scala.collection.immutable.Map[String,List[List[String]]] = Map(table1 -> List(List(table1, field1, string), List(table1, field2, int)), table2 -> List(List(table2, field1, string)))
scala> val result = grouped.map { case(k,v) => { val fields = for { r <- v } yield { Field( r(1), r(2) ) }; Table( k, fields ) } }
result: scala.collection.immutable.Iterable[Table] = List(Table(table1,List(Field(field1,string), Field(field2,int))), Table(table2,List(Field(field1,string))))