Concatenating databases with Squeryl - scala

I'm trying to use Squeryl to take the contents of a table from one database, and append it to the equivalent table in another database. The primary key will have to be reassigned in the process, but I'm getting the error NULL not allowed for column "SIMID". Why is this?
object Concatenator {

  def main(args: Array[String]) {

    Class.forName("org.h2.Driver")

    val seshA = Session.create(
      java.sql.DriverManager.getConnection("jdbc:h2:file:data/resultsA", "sa", "password"),
      new H2Adapter
    )

    val seshB = Session.create(
      java.sql.DriverManager.getConnection("jdbc:h2:file:data/resultsB", "sa", "password"),
      new H2Adapter
    )

    using(seshA) {
      import Library._
      from(sims){s => select(s)}.foreach { item =>
        using(seshB) {
          sims.insert(item)
        }
      }
    }
  }

  case class Simulation(
    @Column("SIMID")
    var id: Long,
    val date: Date
  ) extends KeyedEntity[Long]

  object Library extends Schema {
    val sims = table[Simulation]

    on(sims)(s => declare(
      s.id is (unique, indexed, autoIncremented)
    ))
  }
}
Update:
I think it might be something to do with the DBs. They were created in a Java project using JPA/EclipseLink, and in addition to generating tables for my entities it also created a table called SEQUENCE, presumably for primary key generation.
I've found that I can create a brand new table in Squeryl and manually put the contents of both databases into it, thus achieving the same effect. Interestingly, this new table did not have any SEQUENCE table auto-generated. So I'm guessing it comes down to how JPA/EclipseLink was generating my primary keys?
Update 2:
As requested, I appended trace_level_file=3 to the URL; the files are here: resultsA.trace.db and resultsB.trace.db. B is the more interesting one, I think. Also, I've put a simplified version of the database here, which has had unnecessary tables removed (the same database is used for resultsA and resultsB).

Just got a moment to look at this more closely. It turns out you were on the right track. My guess is that EclipseLink uses sequences to generate the PK value, whereas Squeryl defines the column as something like:
simid bigint not null primary key auto_increment
Without the auto_increment flag, a value is never placed in the column and you end up with the constraint violation you mentioned. It sounds like you've already worked around the issue, but hopefully this will help you or someone else in the future.
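If you did want to copy straight between the two existing databases, one option (a sketch only, untested, and assuming the target table in resultsB starts empty) is to drop the autoIncremented declaration so Squeryl includes SIMID in its INSERT statements, and hand out fresh ids yourself while copying:

object Library extends Schema {
  val sims = table[Simulation]
  on(sims)(s => declare(
    s.id is (unique, indexed) // no autoIncremented: SIMID is supplied explicitly
  ))
}

using(seshA) {
  import Library._
  val items = from(sims){s => select(s)}.toList // materialize before switching sessions
  using(seshB) {
    var nextId = 0L // assumes the target table starts empty
    items.foreach { item =>
      nextId += 1
      sims.insert(new Simulation(nextId, item.date))
    }
  }
}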

Not really a solution, but my workaround is to create a new database
val seshNew = Session.create(
  java.sql.DriverManager.getConnection("jdbc:h2:file:data/resultsNew", "sa", "password"),
  new H2Adapter
)
and then just write all the data from the other databases into it
from(sims){s => select(s)}.foreach { item =>
  using(seshNew) {
    sims.insert(new Simulation(0, item.date))
  }
}
The placeholder primary key 0 gets overwritten with an auto-incremented value as appropriate.

Related

How to get correct type and nullability information for enum fields using jOOQ's metadata API?

I'm trying to use jOOQ's metadata API, and most columns behave the way I'd expect, but enum columns seem to be missing type and nullability information somehow.
For example, if I have a schema defined as:
CREATE TYPE public.my_enum AS ENUM (
  'foo',
  'bar',
  'baz'
);

CREATE TABLE public.my_table (
  id bigint NOT NULL,
  created_at timestamp with time zone DEFAULT now() NOT NULL,
  name text,
  my_enum_column public.my_enum NOT NULL
);
The following test passes:
// this is Kotlin, but hopefully pretty easy to decipher
test("something fishy going on here") {
  val jooq = DSL.using(myDataSource, SQLDialect.POSTGRES)
  val myTable = jooq.meta().tables.find { it.name == "my_table" }!!

  // This looks right...
  val createdAt = myTable.field("created_at")!!
  createdAt.dataType.nullability() shouldBe Nullability.NOT_NULL
  createdAt.dataType.typeName shouldBe "timestamp with time zone"

  // ...but none of this seems right
  val myEnumField = myTable.field("my_enum_column")!!
  myEnumField.dataType.typeName shouldBe "other"
  myEnumField.dataType.nullability() shouldBe Nullability.DEFAULT
  myEnumField.dataType.castTypeName shouldBe "other"
  myEnumField.type shouldBe Any::class.java
}
It's telling me that enum columns have Nullability.DEFAULT regardless of whether they are null or not null. For other types, Field.dataType.nullability will vary depending on whether the column is null or not null, as expected.
For any enum column, the type is Object (Any in Kotlin), and the dataType.typeName is "other". For non-enum columns, dataType.typeName gives me the correct SQL for the type.
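For what it's worth, the underlying catalog clearly still knows the real type; a quick check against information_schema (a throwaway sketch, in Scala rather than Kotlin, going through plain SQL instead of the metadata API, with the table names from the example above) reports both the enum type name and the nullability:

import org.jooq.SQLDialect
import org.jooq.impl.DSL

// Sketch: bypass jOOQ's meta API and ask Postgres' information_schema directly.
// udt_name holds the enum type name ("my_enum"); is_nullable is "YES"/"NO".
val jooq = DSL.using(myDataSource, SQLDialect.POSTGRES)
jooq.fetch(
  """select column_name, udt_name, is_nullable
    |from information_schema.columns
    |where table_schema = 'public' and table_name = 'my_table'""".stripMargin
).forEach { r =>
  println(s"${r.get("column_name")}: ${r.get("udt_name")} nullable=${r.get("is_nullable")}")
}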
I'm also using the jOOQ code generator, and it generates the correct types for enum columns. That is, it creates an enum class and uses that as the type for the corresponding fields, which are marked as not-nullable. The generated code for this field looks something like (reformatted to avoid long lines):
public final TableField<MyTableRecord, MyEnum> MY_ENUM_COLUMN =
    createField(
        DSL.name("my_enum_column"),
        SQLDataType.VARCHAR
            .nullable(false)
            .asEnumDataType(com.example.schema.enums.MyEnum.class),
        this,
        ""
    );
So it appears that jOOQ's code generator has the type information, but how can I access the type information via the metadata API?
I'm using postgres:11-alpine and org.jooq:jooq:3.14.11.
Update 1
I tried testing this with org.jooq:jooq:3.16.10 and org.jooq:jooq:3.17.4. They seem to fix the nullability issue, but the datatype is still "other", and the type is still Object. So it appears the nullability issue was a bug in jOOQ. I'll file an issue about the type+datatype.
Update 2
This is looking like it may be a bug, so I've filed an issue.

Alternatives for withFilterExpression for supporting composite key

I'm trying to query DynamoDB through withFilterExpression, but I get an error because the attribute is part of the composite primary key:
Filter Expression can only contain non-primary key attributes: Primary key attribute: question_id
The expression also uses the OR operator, so it cannot be passed to withKeyConditionExpression either.
The expression passed to withFilterExpression is similar to question_id = 1 OR question_id = 2. The entire code is as follows:
def getQuestionItems(conceptCode: String) = {
  val qIds = List("1", "2", "3")
  val hash_map: java.util.Map[String, Object] = new java.util.HashMap[String, Object]()
  var queries = ArrayBuffer[String]()

  hash_map.put(":c_id", conceptCode)

  for ((qId, index) <- qIds.zipWithIndex) {
    val placeholder = ":qId" + index
    hash_map.put(placeholder, qId)
    queries += "question_id = " + placeholder
  }

  val query = queries.mkString(" or ")

  val querySpec = new QuerySpec()
    .withKeyConditionExpression("concept_id = :c_id")
    .withFilterExpression(query)
    .withValueMap(hash_map)

  questionsTable.query(querySpec)
}
Apart from withFilterExpression and withKeyConditionExpression, are there any other methods in QuerySpec that I can use?
Let's raise things up a level. With a Query (as opposed to a GetItem or Scan) you provide a single PK value and optionally an SK condition. That's what a Query requires. You can't provide multiple PK values. If you want multiple PK values, you can do multiple Query calls. Or you might consider a Scan across all PK values.
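Here's a minimal sketch of the multiple-Query approach, using the same Document API as the question (assuming concept_id is the partition key and question_id the sort key, which is what the error message suggests; the method signature differs from the original):

import com.amazonaws.services.dynamodbv2.document.spec.QuerySpec
import com.amazonaws.services.dynamodbv2.document.utils.ValueMap
import scala.jdk.CollectionConverters._

// One Query per question_id: each call targets the key schema directly,
// so no FilterExpression on key attributes is needed.
def getQuestionItems(conceptCode: String, qIds: Seq[String]) =
  qIds.flatMap { qId =>
    val spec = new QuerySpec()
      .withKeyConditionExpression("concept_id = :c_id and question_id = :q_id")
      .withValueMap(new ValueMap()
        .withString(":c_id", conceptCode)
        .withString(":q_id", qId))
    questionsTable.query(spec).asScala
  }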
You can also consider having a GSI that presents the data in a format more suitable to efficient lookup.
Side note: With PartiQL you can actually specify multiple PK values, up to a limit. So if you really truly want this, that's a possibility. The downside is it raises things up to a new level of abstraction and can make inefficiencies hard to spot.

Anorm Scala insert list of objects with nested list

I find myself in need of inserting a sequence of elements with a sequence of nested elements into a PostgreSQL database, preferably with a single statement, because I am returning a Future. I am using Scala Play with Anorm.
My data looks something like below.
case class Question(id: Long, titel: String)
case class Answer(questionId: Long, text: String)
In db it looks like this:
CREATE TABLE questions (
  question_id SERIAL PRIMARY KEY NOT NULL,
  titel TEXT NOT NULL
);

CREATE TABLE answers (
  answer_id SERIAL PRIMARY KEY NOT NULL,
  question_id INT NOT NULL,
  text TEXT NOT NULL,
  FOREIGN KEY (question_id) REFERENCES questions(question_id) ON DELETE CASCADE
);
My function would look something like this:
def saveFormQuestions(questions: Seq[Question], answers: Seq[Answer]): Future[Long] = {
  Future {
    db.withConnection { implicit c =>
      SQL(
        // sql
      ).executeInsert()
    }
  }
}
Somehow, in Anorm, SQL or both, I have to do the following, preferably in a single transaction:
foreach question in questions
  insert question into questions
  foreach answer in answers, where answer.questionId == old question.id
    insert answer into answers with new question id gained from question insert
I am new with Scala Play, so I might have made some assumptions I shouldn't have. Any ideas to get me started would be appreciated.
I solved it with logic inside the db.withConnection block. Somehow I assumed that you had to have a single SQL statement inside db.withConnection, which turned out not to be true. So like this:
// structure to hold the mapping from old ids to new ones
val idMap = scala.collection.mutable.Map[Long, Long]()

db.withConnection { implicit conn =>
  // save all questions and record each new id against the old one
  for (q <- questions) {
    val id: Long = SQL("INSERT INTO questions (titel) VALUES ({titel})")
      .on('titel -> q.titel)
      .executeInsert(scalar[Long].single)
    idMap(q.id) = id
  }
  // save answers with the new question ids
  if (answers.nonEmpty) {
    for (a <- answers) {
      SQL("INSERT INTO answers (question_id, text) VALUES ({qid}, {text});")
        .on('qid -> idMap(a.questionId), 'text -> a.text)
        .execute()
    }
  }
}
As indicated by its name, Anorm is not an ORM, and won't generate the statements for you.
You will have to work out the statements appropriate to represent the data and relationships (e.g. my Acolyte tutorial).
As for transactions, Anorm is a thin/smart wrapper around JDBC, so the JDBC transaction semantics are kept. BTW, Play provides .withTransaction on its DB resolution utility.
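As a minimal sketch, the accepted loop above could be wrapped in a single transaction via that helper (assuming db is the same Play Database instance used earlier and the same Anorm imports are in scope; if any insert fails, everything rolls back):

db.withTransaction { implicit conn =>
  // insert questions first, collecting old id -> new id
  val idMap = questions.map { q =>
    val newId: Long = SQL("INSERT INTO questions (titel) VALUES ({titel})")
      .on('titel -> q.titel)
      .executeInsert(scalar[Long].single)
    q.id -> newId
  }.toMap

  // then insert answers, rewriting the foreign key to the new question ids
  answers.foreach { a =>
    SQL("INSERT INTO answers (question_id, text) VALUES ({qid}, {text})")
      .on('qid -> idMap(a.questionId), 'text -> a.text)
      .execute()
  }
}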

Entity Framework - Eager load two many-to-many relationships

Sorry for this being so long, but I think it contains all the info needed to understand the problem and maybe help.
I would like to load data from my database using eager loading.
The data is spread across five tables, forming two levels of m:n relations. There are three tables containing data (ordered hierarchically, top to bottom):
CREATE TABLE [dbo].[relations](
  [relation_id] [bigint] NOT NULL
)

CREATE TABLE [dbo].[ways](
  [way_id] [bigint] NOT NULL
)

CREATE TABLE [dbo].[nodes](
  [node_id] [bigint] NOT NULL,
  [latitude] [int] NOT NULL,
  [longitude] [int] NOT NULL
)
The first two really only consist of their own ID (to hook other data not relevant here into).
In between these three data tables are two m:n tables, with a sorting hint:
CREATE TABLE [dbo].[relations_ways](
  [relation_id] [bigint] NOT NULL,
  [way_id] [bigint] NOT NULL,
  [sequence_id] [smallint] NOT NULL
)

CREATE TABLE [dbo].[ways_nodes](
  [way_id] [bigint] NOT NULL,
  [node_id] [bigint] NOT NULL,
  [sequence_id] [smallint] NOT NULL
)
This is, essentially, a part of the OpenStreetMap data structure. I let Entity Framework build its objects from this database, and it set up the classes exactly as the tables are.
The m:n tables really do exist as classes. (I understand that in EF you can model an m:n relation without the explicit in-between class; should I try to change the object model in this way?)
What I want to do: my entry point is exactly one relation item.
I think it would be best to first eager-load the middle m:n relation, and then in a loop iterate over that and eager-load the lowest one. I try to do that in the following way:
IQueryable<relation> query = context.relations;
query = query.Where( ... ); // filters down to exactly one
query = query.Include(r => r.relation_members);
relation rel = query.SingleOrDefault();
That loads the relation and all its 1:n info in just one trip to the database - ok, good. But I noticed it only loads the 1:n table, not the middle data table "ways".
This does NOT change if I modify the line like so:
query = query.Include(r => r.relation_members.Select(rm => rm.way));
So I cannot get the middle level loaded here, it seems?
What I cannot get working at all is loading the node level of data eagerly. I tried the following:
foreach (relation_member rm in rel.relation_members) {
    IQueryable<way_node> query = rm.way.way_nodes.AsQueryable();
    query = query.Include(wn => wn.node);
    query.Load();
}
This does work and eagerly loads the middle-level way and all 1:n info of way_node in one statement per iteration, but not the information from node (latitude/longitude). If I access one of these values I trigger another trip to the database to load one single node object.
This last trip is deadly, since I want to load 1 relation -> 300 ways, where each way -> 2000 nodes. So in the end I am hitting the server 1 + 300 + 300*2000 = 600,301 times... room for improvement, I think.
But how? I cannot find a way to write this last statement that is both valid syntax AND loads eagerly.
Out of interest; is there a way to load the whole object graph in one trip, starting with one relation?
Loading the whole graph in one roundtrip would be:
IQueryable<relation> query = context.relations;
query = query.Where( ... ); // filters down to exactly one
query = query.Include(r => r.relation_members
    .Select(rm => rm.way.way_nodes
        .Select(wn => wn.node)));
relation rel = query.SingleOrDefault();
However, since you say that the Include up to ...Select(rm => rm.way) didn't work, it is unlikely that this will work either. (And even if it did, the performance probably wouldn't be fun, due to the complexity of the generated SQL and the amount of data and entities this query would return.)
The first thing you should investigate further is why .Include(r => r.relation_members.Select(rm => rm.way)) doesn't work, because it seems correct. Are your model and mapping to the database correct?
The loop to get the nodes via explicit loading should look like this:
foreach (relation_member rm in rel.relation_members) {
    context.Entry(rm).Reference(r => r.way).Query()
        .Include(w => w.way_nodes.Select(wn => wn.node))
        .Load();
}
Include() for some reason sometimes gets ignored when there is sorting/grouping/joining involved.
In most cases you can rewrite an Include() as a Select() into an anonymous intermediary object:
Before:
context.Invoices
    .Include(invoice => invoice.Positions)
    .ToList();
After:
context.Invoices
    .Select(invoice => new { invoice, invoice.Positions })
    .AsEnumerable()
    .Select(x => x.invoice)
    .ToList();
This way the query should never lose the Include() information.
// get the books associated with a set of authors
var datatable = _dataContext.Authors
    .Where(x => authorids.Contains(x.AuthorId))
    .SelectMany(x => x.Books)
    .Distinct();

Slick issue when going with PostgreSQL

I'm using slick in a scala project to query some tables.
// define the table
object Addresses extends Table[Address]("assetxs.address") {
  def id = column[Int]("id", O.PrimaryKey)
  def street = column[String]("street")
  def number = column[String]("number")
  def zipcode = column[String]("zipcode")
  def country = column[String]("country")
  def * = id ~ street ~ number ~ zipcode ~ country <> (Address, Address.unapply _)
}
If I run any query against this table it does not work (it says it cannot find my table), so I went further and printed out the query:
implicit val session = Database.forURL(
  "jdbc:postgresql://localhost:5432/postgres",
  driver = "org.postgresql.Driver", user = "postgres", password = "postgres"
).createSession()

session.withTransaction {
  val query = Query(Addresses)
  println("Addresses: " + query.selectStatement)
}
I noticed that the schema.table name appears in double quotes, so the statement is:
select x2."id", x2."street", x2."number", x2."zipcode", x2."country"
from "assetxs.address" x2
which of course does not work (I tried to run it in a PostgreSQL tool and had to remove the quotes around the table name to get it working).
Can you please tell me if there is any Slick option to not quote table names in queries?
You've put the schema into the table name. A (quoted) table name containing a dot character is valid in SQL but it's not what you want here. You have to specify the schema separately:
object Addresses extends Table[Address](Some("assetxs"), "address")
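Put together with the mapping from the question, the definition would look like this (a sketch; identical apart from the schema argument):

object Addresses extends Table[Address](Some("assetxs"), "address") {
  def id = column[Int]("id", O.PrimaryKey)
  def street = column[String]("street")
  def number = column[String]("number")
  def zipcode = column[String]("zipcode")
  def country = column[String]("country")
  def * = id ~ street ~ number ~ zipcode ~ country <> (Address, Address.unapply _)
}

The generated SQL should then quote the schema and the table separately (from "assetxs"."address"), which PostgreSQL accepts.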
In the end I was able to solve this issue.
I specified the table name only:
object Addresses extends Table[Address]("address")
and changed my postgresql.conf to include my schema in the search path (it seems that Slick only looks at the public schema):
search_path = '"$user",assetxs,public'
and now it works.
Here is the solution I found when I wanted to work with both H2 (testing) and Postgres (production) using Liquibase and Slick:
Stick with lowercase in your Slick Table objects
class MyTable(tag: Tag) extends Table[MyRecord](tag, Some("my_schema"), "my_table")
In your H2 url config you'll need to specify DATABASE_TO_UPPER=false (this prevents the table and column names from being upper cased) and put quotation marks around the INIT schema (this prevents the schema from being upper cased)
url = "jdbc:h2:mem:test;MODE=PostgreSQL;DATABASE_TO_UPPER=false;INIT=create schema if not exists \"my_schema\"\;SET SCHEMA \"my_schema\""
When specifying schema names in liquibase scripts it must also be quoted so that H2 won't try to capitalize it.
Since this problem still bothers Scala newcomers (like me), I did a little research and found that the following application.conf works with Slick 3.1.1 and PostgreSQL 9.5:
postgres.devenv = {
  url = "jdbc:postgresql://localhost:5432/dbname?currentSchema=customSchema"
  user = "user"
  password = "password"
  driver = org.postgresql.Driver
}
You're just using the wrong driver; check your imports:
import scala.slick.driver.PostgresDriver.simple._