Groupby on multiple objects generates invalid SQL in slick - scala

I am writing a query that calculates the possible score for a QuestionAnswer. When I execute the query, I get a PSQLException.
Info about the model
A QuestionAnswer can have several (at least one) questionAnswerPossibilities, since there are multiple ways to answer the question correctly.
Every questionAnswerPossibility has several questionAnswerParts. The query below calculates the score per questionAnswerPossibility.
The Problematic Query
The query itself does generate SQL, but the SQL cannot be executed.
def queryMogelijkePuntenByVragenViaOpenVragen()(implicit session: Session) = {
  (for {
    ovam <- OpenVraagAntwoordMogelijkheden                       // questionAnswerPossibilities
    ovad <- OpenVraagAntwoordOnderdelen if ovad.ovamId === ovam.id // questionAnswerParts
    ova  <- OpenVraagAntwoorden if ovam.ovaId === ova.id           // questionAnswers
  } yield ((ova, ovam), ovad.punten))
    .groupBy { case ((ova, ovam), punten) => (ova, ovam) }
    .map { case ((ova, ovam), query) => (ova, ovam, query.map(_._2).sum) }
}
Here is the generated SQL (PostgreSQL):
select x2."id", x2."vraag_id", x3."id", x3."volgorde", x3."ova_id", sum(x4."punten")
from "open_vraag_antwoord_mogelijkheden" x3, "open_vraag_antwoord_onderdelen" x4, "open_vraag_antwoorden" x2
where (x4."ovam_id" = x3."id") and (x3."ova_id" = x2."id")
group by (x2."id", x2."vraag_id"), (x3."id", x3."volgorde", x3."ova_id")
The problem is that the SQL cannot be executed. I get the following error:
play.api.Application$$anon$1:
Execution exception[[
PSQLException: ERROR: column "x2.id" must appear in the GROUP BY clause or be used in an aggregate function
Position: 8]]
The generated SQL contains too many parentheses. The last part of the SQL should be
group by x2."id", x2."vraag_id", x3."id", x3."volgorde", x3."ova_id"
However, Slick generates it with parentheses. Am I doing something wrong here, or is this a bug?

I solved the issue
...
} yield ((ova.id, ovam.id), ovad.punten))
Because I now yield only the necessary ids and not all the data, the generated SQL no longer contains the unnecessary parentheses that caused the SQL error.
I really wanted more data than just those ids, but I can work around this by using this query as a subquery; the outer query will then fetch all the needed data for me.
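The workaround described above (aggregate on the ids only, then fetch the remaining columns in an outer step) can be sketched with plain Scala collections standing in for the tables. This is only an in-memory analogy, not Slick code: the case classes are hypothetical, with field names borrowed from the generated SQL.

```scala
object GroupByIdsThenJoin {
  // Hypothetical in-memory stand-ins for the three tables in the question.
  final case class Ova(id: Int, vraagId: Int)
  final case class Ovam(id: Int, volgorde: Int, ovaId: Int)
  final case class Ovad(ovamId: Int, punten: Int)

  // Step 1: group on the ids only and sum the points (the "subquery").
  def puntenPerIds(ovams: Seq[Ovam], ovads: Seq[Ovad]): Map[(Int, Int), Int] = {
    val pairs = for {
      ovam <- ovams
      ovad <- ovads if ovad.ovamId == ovam.id
    } yield ((ovam.ovaId, ovam.id), ovad.punten)
    pairs.groupBy(_._1).map { case (ids, rows) => ids -> rows.map(_._2).sum }
  }

  // Step 2: join the sums back to the full rows (the "outer query").
  def withFullRows(ovas: Seq[Ova], ovams: Seq[Ovam], ovads: Seq[Ovad]): Seq[(Ova, Ovam, Int)] = {
    val sums = puntenPerIds(ovams, ovads)
    for {
      ovam <- ovams
      ova  <- ovas if ova.id == ovam.ovaId
      sum  <- sums.get((ova.id, ovam.id)).toList
    } yield (ova, ovam, sum)
  }
}
```

Grouping on a pair of plain ids keeps the grouping key flat, which mirrors why yielding only ova.id and ovam.id avoided the nested parentheses in the GROUP BY clause.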

Related

EF- two WHERE IN clauses in an INCLUDE table produce two EXISTS in SQL sent to server

I have an included child table in which I want to check a couple of values, in the same row, using IN clauses. The code below doesn't return the correct result set because it produces two EXISTS clauses with subqueries. As a result, the two values are checked independently and not strictly in the same child row. (Forgive any typos, as I'm typing this in from printed code.)
var db = new dbEntities();
IQueryable<dr> query = db.drs;
// filter the parent table
query = query.Where(p => DropDown1.KeyValue.ToString().Contains(p.system_id.ToString()));
// include the child table
query = query.Include(p => p.drs_versions);
// filter the child table using the other two dropdowns
query = query.Where(p => p.drs_versions.Any(c => DropDown2.KeyValue.ToString().Contains(c.version_id.ToString())) && c => DropDown3.KeyValue.ToString().Contains(c.status_id.ToString()));
// I tried removing the second c=> but received the error "'c' is inaccessible due to its protection level" and couldn't find a clear answer as to how this relates to Entity Framework
// query = query.Where(p => p.drs_versions.Any(c => DropDown2.KeyValue.ToString().Contains(c.version_id.ToString())) && DropDown3.KeyValue.ToString().Contains(c.status_id.ToString()));
This is an example of the query the code above produces...
SELECT *
FROM drs d
LEFT OUTER JOIN drs_versions v ON d.dr_id = v.dr_id
WHERE d.system_id IN (9,8,3)
AND EXISTS (SELECT 1 AS C1
FROM drs_versions sub1
WHERE d.tr_id = sub1.tr_id
AND sub1.version_id IN (9, 4, 1))
AND EXISTS (SELECT 1 AS C1
FROM drs_versions sub2
WHERE d.tr_id = sub2.tr_id
AND sub2.status_id IN (12, 7))
This is the query I actually want:
SELECT *
FROM drs d
LEFT OUTER JOIN drs_versions v ON d.dr_id = v.dr_id
WHERE d.system_id IN (9, 8, 3)
AND v.version_id IN (9, 4, 1)
AND v.status_id IN (12, 7)
How do I get Entity Framework to create a query that will give me the desired result set?
Thank you for your help
I'd drop all of the .ToString() calls and prepare your values ahead of the query to make it a lot easier to follow. If EF is generating SQL anything like what you transcribed, you are casting to String just to have EF convert it back to the appropriate type.
I'm also not sure how something like DropDown2.KeyValue.ToString() resolves back to what I'd expect to be a collection of numbers based on your SQL examples, so I've substituted it with a method called getSelectedIds().
From that, it just looks like your parentheses are a bit out of place:
IEnumerable<int> versions = getSelectedIds(DropDown2);
IEnumerable<int> statuses = getSelectedIds(DropDown3);
query = query
    .Where(p => p.drs_versions
        .Any(c => versions.Contains(c.version_id)
            && statuses.Contains(c.status_id)));
As a general bit of advice, I suggest simplifying the variables you want to use in a LINQ expression as much as possible ahead of time, so that the text inside the expression stays as easy to read as possible (with as few parentheses as possible). Make liberal use of line breaks and indentation to organize what falls under what, and use the editor's code highlighting to double-check that each closing parenthesis closes the opening one you expect.
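The difference between two separate Any() calls (two EXISTS subqueries) and one Any() with both conditions can be illustrated on plain collections. The sketch below uses Scala's exists as a stand-in for Any(); the Version case class and the function names are hypothetical and only model the version_id and status_id columns from the question.

```scala
object AnyVersusTwoAnys {
  // Hypothetical stand-in for a drs_versions row.
  final case class Version(versionId: Int, statusId: Int)

  // Two separate checks: the version and the status may match on different rows.
  def twoExists(rows: Seq[Version], versions: Set[Int], statuses: Set[Int]): Boolean =
    rows.exists(r => versions(r.versionId)) && rows.exists(r => statuses(r.statusId))

  // One check with both conditions: a single row has to satisfy both.
  def sameRow(rows: Seq[Version], versions: Set[Int], statuses: Set[Int]): Boolean =
    rows.exists(r => versions(r.versionId) && statuses(r.statusId))
}
```

For child rows (9, 1) and (2, 12), with versions {9, 4, 1} and statuses {12, 7}, twoExists is true while sameRow is false, because no single row matches both lists.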
I don't think your first example actually was input correctly as it would result in a compile error as you cannot && c => ... within an Any() block. My guess would be that you have:
query = query.Where(p => p.drs_versions.Any(c => DropDown2.KeyValue.ToString().Contains(c.version_id.ToString())) && p.drs_versions.Any(c => DropDown3.KeyValue.ToString().Contains(c.status_id.ToString())));
Your issue is closing off the inner .Any()
query.Where(p => p.drs_versions.Any(c => DropDown2.KeyValue.Contains(c.version_id))
&& DropDown3.KeyValue.Contains(c.status_id)); //<-- "c" is still outside the single .Any() condition so invalid.
Even then I'm not sure this will fully explain the difference in queries or results. It sounds like you've tried typing across code rather than pasting the actual statements and captured EF queries. It may help to copy the exact statements from the code because it's pretty easy to mistype something when trying to simplify an example only to find out you've accidentally excluded the smoking gun for your issue.

Slick:Insert into a Table from Raw SQL Select

Insert into a Table from Raw SQL Select
val rawSql: DBIO[Vector[(String, String)]] = sql"SELECT id, name FROM SomeTable".as[(String, String)]
val myTable :TableQuery[MyClass] // with columns id (String), name(String) and some other columns
Is there a way to use the forceInsert functions to insert the data from the select into the table?
If not, is there a way to generate a SQL string by using forceInsertStatements?
Something like:
db.run {
myTable.map{ t => (t.id, t.name)}.forceInsert????(rawSql)
}
P.S. I don't want to make two I/O calls because my RAW SQL might be returning thousands of records.
Thanks for the help.
If you can represent your rawSql query as a Slick query instead...
val query = someTable.map(row => (row.id, row.name))
...for example, then forceInsertQuery will do what you need. An example might be:
val action =
  myTable.map(row => (row.someId, row.someName))
    .forceInsertQuery(query)
However, I presume you're using raw SQL for a good reason. In that case, I don't believe you can use forceInsert (without a round-trip to the database) because the raw SQL is already an action (not a query).
But, as you're using raw SQL, why not do the whole thing in raw SQL? Something like:
val rawEverything =
sqlu" insert into mytable (someId, someName) select id, name from sometable "
...or similar.

Unable to figure out filter in slickdb

I am using Scala with Slick. I have a table called persons, and I am filtering persons by name as shown below:
table.Persons.filter({ row => {
  println("inside filter")
  req.personName.map(name => row.personName === name).getOrElse(true: Rep[Boolean])
}})
The table contains 3 rows. But still println() is executed only once. How is this filter working?
First of all, when you write something like
personTable.filter(p => { .... })
it evaluates to a Query, which can generate the SQL query when it is needed for actual DB querying. The generated SQL will be something like
SELECT ...
FROM persons
WHERE ...
Now this SQL query is submitted to the DB for execution.
So your code inside { ... } is evaluated to build the Query itself, and it has no relation to how many rows you have in your DB table.
The println in your example will therefore run just once, whether your DB table has 0 rows, 1 row or a million rows.
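The idea that the lambda runs once to describe the WHERE clause, rather than once per row, can be sketched without Slick. Everything below (Column, Condition, PersonRow, PersonQuery) is a hypothetical miniature written for illustration and is not Slick's actual API; the string quoting is deliberately naive.

```scala
object FilterOnce {
  // Hypothetical miniature of a lifted column: `===` records a condition
  // instead of comparing values. No escaping is done; illustration only.
  final case class Column(name: String) {
    def ===(value: String): Condition = Condition(s"$name = '$value'")
  }
  final case class Condition(sql: String)

  // Symbolic row handed to the filter lambda; it holds column descriptions, not data.
  final case class PersonRow(personName: Column)

  final class PersonQuery {
    // The lambda is invoked exactly once, here, to build the WHERE clause.
    def filter(f: PersonRow => Condition): String = {
      val condition = f(PersonRow(Column("person_name")))
      s"SELECT * FROM persons WHERE ${condition.sql}"
    }
  }

  var lambdaCalls = 0

  def buildSql(name: String): String =
    new PersonQuery().filter { row =>
      lambdaCalls += 1 // plays the role of println("inside filter")
      row.personName === name
    }
}
```

Calling FilterOnce.buildSql("Ann") increments lambdaCalls by one and returns a SQL string; no table rows are involved at any point, which is why the row count cannot influence how often the lambda body runs.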

SQL statement with Anorm gives me an other result than in PostgreSQL CLI

I want to check if something is present in my database before saving it in order to avoid key duplicate errors. I'm using Play! 2.2.6 with anorm and Postgresql 9.3.
So I wrote a little function (error checks omitted):
def testIfExist(fieldName: String, value: String): Boolean = {
  DB.withConnection { implicit connection =>
    SQL( """SELECT exists(SELECT 1 FROM events where {fieldName}={value} LIMIT 1)""")
      .on(
        'fieldName -> fieldName,
        'value -> value
      ).execute()
  }
}
But it always returns true, even though my database is completely empty.
So I tried replacing
SELECT exists(SELECT 1 FROM events where {fieldName}={value} LIMIT 1)
with
SELECT exists(SELECT 1 FROM events where name='aname' LIMIT 1)
and it still always returns true...
I also ran the same query directly in psql, and the response is what I expect: false...
execute returns true if the statement produced a result set, whatever that result set contains. This query always produces a one-row result set (holding true or false), so execute always returns true. It only returns false if the statement was an update (which returns no result set). You need to use as with a ResultSetParser to parse the results.
There's another problem with this code as well. You can't supply column names as parameters in prepared statements, so {fieldName}={value} gets turned into a comparison between two string values, which will probably always be false. Instead, you can use string interpolation to insert the field name into the query. Be wary, though: fieldName should not come from user-defined input, as that makes the query vulnerable to SQL injection. (Your users shouldn't need to know about your columns anyway.)
SQL(s"SELECT exists(SELECT 1 FROM events where ${fieldName} = {value} LIMIT 1)")
.on("value" -> value)
.as(scalar[Boolean].single)
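If the column name has to vary at runtime, one common safeguard is to check it against a fixed set of known columns before interpolating it. The following is a minimal sketch in plain Scala: the allowedColumns set and the existsQuery helper are hypothetical, and the returned string still has to be passed to SQL(...) with the value bound separately.

```scala
object ColumnWhitelist {
  // Hypothetical list of columns that callers may filter on.
  private val allowedColumns: Set[String] = Set("name", "id")

  // Returns the SQL text with the column name spliced in, or an error for unknown names.
  // The value itself stays a named placeholder ({value}) to be bound by the SQL library.
  def existsQuery(fieldName: String): Either[String, String] =
    if (allowedColumns.contains(fieldName))
      Right(s"SELECT exists(SELECT 1 FROM events WHERE $fieldName = {value} LIMIT 1)")
    else
      Left(s"Unknown column: $fieldName")
}
```

Only the column name is interpolated, and only after it has matched a known column, so arbitrary input never reaches the SQL text.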

EntityFramework counting of query results vs counting list

Should efQuery.ToList().Count and efQuery.Count() produce the same value?
How is it possible that efQuery.ToList().Count and efQuery.Count() don't produce the same value?
//GetQuery() returns a default IDbSet which is used in EntityFramework
using (var ds = _provider.DataSource())
{
//return GetQuery(ds, filters).Count(); //returns 0???
return GetQuery(ds, filters).ToList().Count; //returns 605 which is correct based on filters
}
Just ran into this myself. In my case the issue is that the query has a .Select() clause that causes further relationships to be established, which end up filtering the query further as the relationships' inner joins constrain the result.
It appears that .Count() doesn't process the .Select() part of the query.
So I have:
// projection created
var ordersData = orders.Select( ord => new OrderData() {
    OrderId = ord.OrderId,
    ... more simple 1 - 1 order maps
    // Related values that cause relations in SQL
    TotalItemsCost = ord.OrderLines.Sum(lin => lin.Qty*lin.Price),
    CustomerName = ord.Customer.Name,
});
var count = ordersData.Count(); // 207
var listCount = ordersData.ToList().Count; // 192
When I compare the SQL statements I find that Count() does a very simple SUM on the Orders table which returns all orders, while the second query is a monster of 100+ lines of SQL that has 10 inner joins that are triggered by the .Select() clause (there are a few more related values/aggregations retrieved than shown here).
Basically this seems to indicate that .Count() doesn't take the .Select() clause into account when it does its count, so those same relationships that cause further constraining of the result set are not fired for .Count().
I've been able to make this work by explicitly adding expressions to the .Count() method that pull in some of those aggregated result values which effectively force them into the .Count() query as well:
var count = ordersData.Count( o => o.TotalItemsCost != -999 &&
                                   o.CustomerName != "!##"); // 207
The key is to make sure that any of the fields that are calculated or pull in related data and cause a relationship to fire, are included in the expression which forces Count() to include the required relationships in its query.
I realize this is a total hack and I'm hoping there's a better way, but for the moment this has allowed us at least to get the right value without pulling massive data down with .ToList() first.
Assuming here that efQuery is IQueryable:
ToList() actually executes a query. If changes to data in the datastore, between calls to ToList() and .Count(), result in a different resultset, calling ToList() will repopulate the list. ToList().Count and .Count() should then match until the data in the store changes the resultset again.