Parsing different queries with same parser anorm - scala

I am trying to parse several different SQL queries using a single anorm parser.
Some of the queries return a,b,c as their result. One of the queries returns
a,b,c,d,e.
The result is tightly coupled to a return object called Obj, defined like this:
case class Obj(a:Int,b:String,c:String,d:Option[String],e:Option[Double])
The parsing is done using:
val parser = Macro.namedParser[Obj](ColumnNaming.SnakeCase)
SQL("""select a,b,c .....further query""").on("a"->a).as(parser *)
SQL("""select a,b,c,d,e .... further query""").on("a"->a).as(parser *)
Can I get the parser to skip the lookup of d and e in the first case?
I want both queries to work, with the results parsed into the same object.
As it's a legacy codebase and the object is used in multiple places for parsing, I would prefer not to map all the other queries to some other object and then copy the values over with None in the d and e fields.
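One direction that might be worth trying is a hand-written parser in place of Macro.namedParser. The sketch below is untested and rests on two assumptions: that anorm's `?` combinator yields None when the column is missing from the result set, and that get[Option[T]] covers a column that is present but NULL. The name lenientParser is made up for illustration.

```scala
import anorm.{RowParser, ~}
import anorm.SqlParser.{get, int, str}

case class Obj(a: Int, b: String, c: String, d: Option[String], e: Option[Double])

// Assumption: `.?` turns a "column not found" failure into None, while
// get[Option[T]] handles a column that exists but holds SQL NULL.
// Flattening merges both cases into a single Option.
val lenientParser: RowParser[Obj] =
  (int("a") ~ str("b") ~ str("c") ~
    get[Option[String]]("d").?.map(_.flatten) ~
    get[Option[Double]]("e").?.map(_.flatten)).map {
    case a ~ b ~ c ~ d ~ e => Obj(a, b, c, d, e)
  }
```

If this holds, both queries could keep returning Obj, with .as(lenientParser.*) used in place of .as(parser *).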

Selecting identical named columns in jOOQ

I'm currently using jOOQ to build my SQL (with code generation via the mvn plugin).
Executing the created query is not done by jOOQ, though (I'm using the Vert.x SqlClient for that).
Let's say I want to select all columns of two tables which share some identical column names, e.g. UserAccount(id,name,...) and Product(id,name,...). When executing the following code
val userTable = USER_ACCOUNT.`as`("u")
val productTable = PRODUCT.`as`("p")
create().select().from(userTable).join(productTable).on(userTable.ID.eq(productTable.AUTHOR_ID))
calling query.getSQL(ParamType.NAMED) returns a query like
SELECT "u"."id", "u"."name", ..., "p"."id", "p"."name", ... FROM ...
The problem here is that the result set will contain the columns id and name twice, without the prefix "u." or "p.", so I can't map/parse it correctly.
Is there a way to tell jOOQ to alias these columns like the following, without any further manual effort?
SELECT "u"."id" AS "u.id", "u"."name" AS "u.name", ..., "p"."id" AS "p.id", "p"."name" AS "p.name" ...
I'm using the holy Postgres Database :)
EDIT: My current approach would be something like
val productFields = productTable.fields().map { it.`as`(name("p.${it.name}")) }
val userFields = userTable.fields().map { it.`as`(name("u.${it.name}")) }
create().select(productFields,userFields,...)...
This feels really hacky, though.
How to correctly dereference tables from records
You should always use the column references that you passed to the query to dereference values from records in your result. If you didn't pass column references explicitly, then the ones from your generated table via Table.fields() are used.
In your code, that would correspond to:
userTable.NAME
productTable.NAME
So, in a resulting record, do this:
val rec = ...
rec[userTable.NAME]
rec[productTable.NAME]
Using Record.into(Table)
Since you seem to be projecting all the columns (do you really need all of them?) to the generated POJO classes, you can still do this intermediary step if you want:
val rec = ...
val userAccount: UserAccount = rec.into(userTable).into(UserAccount::class.java)
val product: Product = rec.into(productTable).into(Product::class.java)
Because the generated table has all the necessary meta data, it can decide which columns belong to it, and which ones don't. The POJO doesn't have this meta information, which is why it can't disambiguate the duplicate column names.
Using nested records
You can always use nested records directly in SQL as well in order to produce one of these 2 types:
Record2<Record[N], Record[N]> (e.g. using DSL.row(table.fields()))
Record2<UserAccountRecord, ProductRecord> (e.g. using DSL.row(table.fields()).mapping(...), or starting from jOOQ 3.17 directly using a Table<R> as a SelectField<R>)
The second jOOQ 3.17 solution would look like this:
// Using an implicit join here, for convenience
create().select(productTable.userAccount(), productTable)
.from(productTable)
.fetch();
The above is using implicit joins, for additional convenience
Auto aliasing all columns
There are a ton of flavours that users might want when "auto-aliasing" columns in SQL. Any solution offered by jOOQ would be no better than the one you've already found, so if you still want to auto-alias all columns, then just do what you did.
But usually, the desire to auto-alias is a feature request derived from a misunderstanding of the best approach to doing something in jOOQ (see the options above), so ideally, you don't go down the auto-aliasing road.

How to create a DataFrame from an array in Scala?

I have a use case where I need to create a DataFrame from an array.
I've created a DataFrame that reads a CSV then I am using a map to process/transform it further.
var mapTransform = df1.collect.map(
line => {
// line.split(",") logic for fields separation
//transformation logic here for various fields
(field1+","+field2+","+field3);
}
)
From this, I get an array (Array[String]) holding the transformed result.
I want to convert it further into a DataFrame with separate columns so that it can later be written to a DB or a file; however, I am facing an issue. Is it possible to do this? Any solutions?
This does the job:
spark.sparkContext.parallelize(mapTransform.toSeq)
But note that you should avoid methods that produce non-RDD results (such as collect), as they load the entire contents of the array onto a single node, which is inefficient in the general case.
Also, by convention, prefer vals over vars wherever possible.
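Note that parallelize on its own gives an RDD[String], not yet a DataFrame. One way to get separate columns could look like the sketch below. It assumes every string holds exactly three comma-separated fields (as in the question), uses made-up sample rows, and builds a local SparkSession only so the snippet is self-contained; the column names are placeholders.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Local session for illustration only; in a real job, reuse the existing one.
val spark: SparkSession = SparkSession.builder()
  .appName("array-to-dataframe")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in for the Array[String] produced by the map in the question.
val mapTransform: Array[String] = Array("1,alice,NY", "2,bob,LA")

val result: DataFrame = spark.sparkContext
  .parallelize(mapTransform.toSeq)
  .map(_.split(",", -1))                          // -1 keeps trailing empty fields
  .map(parts => (parts(0), parts(1), parts(2)))   // assumes exactly three fields
  .toDF("field1", "field2", "field3")

result.show()
```

Running the transformation on df1 directly (for example with df1.map) instead of collecting first would keep the data distributed, in line with the note above about avoiding non-RDD results.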

Update a String list column in Cassandra DB with Scala

I'm new to Cassandra and Scala. I'm working on a Kafka consumer (written in Scala) that has to update a field of a row in Cassandra from the data it receives.
So far, no problem.
One field in this row is a String list, and when I do the update this field must not change, so I have to assign the same String list to itself.
UPDATE keyspaceName.tableName
SET fieldToChange = newValue
WHERE id = idValue
AND fieldA = '${currentRow.getString("fieldA")}'
AND fieldB = ${currentRow.getInt("fieldB")}
...
AND fieldX = ${currentRow.getList("fieldX", classOf[String]).toString}
...
But I still get this exception:
com.datastax.driver.core.exceptions.SyntaxError: line 19:49 no viable alternative at input ']' (... 482 AND fieldX = [[listStringItem1]]...)
So far I haven't found anything on the web that could help me.
The problem is that Scala's string representation of the list doesn't match Cassandra's representation of a list, so it generates errors.
Instead of constructing the CQL statement directly in your code, it's better to use a PreparedStatement and bind variables to it:
first, it will speed up execution, as Cassandra won't parse every statement separately;
second, it will be easier to bind variables, as you won't need to care about the corresponding string representation.
But be very careful with Scala: the Java driver expects Java lists, sets, maps, and base types such as ints. You may want to look at the java-driver-scala-extras package, but you'll need to compile it yourself, as it's not available on Maven Central.
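To illustrate both points without touching the driver, here is a small standard-library-only sketch. It assumes Scala 2.13 or later (for scala.jdk.CollectionConverters); the sample values and the names items, interpolated, javaList and boxedInt are made up for illustration.

```scala
import scala.jdk.CollectionConverters._

// Hypothetical values standing in for the row's list column.
val items: List[String] = List("listStringItem1", "listStringItem2")

// What string interpolation would put into the CQL text: the Java list's
// toString, which has no quotes around the items and so is not a CQL list literal.
val interpolated: String = items.asJava.toString

// Values shaped the way a Java driver expects bind variables:
// a java.util.List instead of a Scala List, and a boxed Integer instead of Int.
val javaList: java.util.List[String] = items.asJava
val boxedInt: java.lang.Integer = Int.box(42)

println(interpolated)
```

Values converted like this are what would be handed to the prepared statement's bind call, so no string representation is involved at all.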

How to use CQLinq to get metrics of Methods and Fields within a single query

I am calculating average length of identifiers with CQLinq in NDepend, and I want to get the length of the names of classes, fields and methods. I walked through this page of CQlinq: http://www.ndepend.com/docs/cqlinq-syntax, and I have code like:
let id_m = Methods.Select(m => new { m.SimpleName, m.SimpleName.Length })
let id_f = Fields.Select(f => new { f.Name, f.Name.Length })
select id_m.Union(id_f)
It doesn't work, one error says:
'System.Collections.Generic.IEnumerable' does not
contain a definition for 'Union'...
The other one is:
cannot convert from
'System.Collections.Generic.IEnumerable' to
'System.Collections.Generic.HashSet'
However, according to MSDN, IEnumerable Interface defines Union() and Concat() methods.
It seems to me that I cannot use CQLinq in exactly the same way as LINQ. Anyway, is there a way to get the information from the Types, Methods and Fields domains within a single query?
Thanks a lot.
is there a way to get the information from the Types, Methods and Fields domains within a single query?
Not for now, because a CQLinq query can only match a sequence of types, a sequence of methods, or a sequence of fields, so you need 3 distinct code queries.
In the next version, CQLinq will be improved a lot, and you'll indeed be able to write things like:
from codeElement in Application.TypesAndMembers
select new { codeElement, codeElement.Name.Length }
Next version will be available before the end of the year 2016.

How are multiple result sets accessed in slick?

How does Slick handle a query that returns multiple result sets?
For example, suppose I wanted to retrieve information about a table using sp_help someTableName,
which returns a number of result sets. I can get the first result set simply by using scala.slick.jdbc.StaticQuery.queryNA[Tuple4[String, String, String,String]]("sp_help tblInbox_membership").first()
How do I get the second result set?
You must be using Sybase or maybe SQL Server.
I'm not familiar with Slick (yet), but the way to access subsequent ResultSets from a statement in JDBC is to call Statement.getMoreResults() and then, if that succeeds, Statement.getResultSet(). Slick gives you a Statement object with Session.withStatement, so you could at least use the JDBC API to get your result sets, or feed the ResultSet to Slick if there is a way to do that.
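The plain JDBC part of that could be sketched as follows. The helper name allResultSets and its signature are made up for illustration; it only relies on java.sql.Statement, however you obtain one, and makes no claim about Slick's own API.

```scala
import java.sql.{ResultSet, Statement}
import scala.collection.mutable.ListBuffer

/** Executes `sql` and applies `read` to every result set it produces, in order. */
def allResultSets[A](stmt: Statement, sql: String)(read: ResultSet => A): List[A] = {
  val out = ListBuffer.empty[A]
  // execute returns true when the first result is a ResultSet
  var isResultSet = stmt.execute(sql)
  // JDBC signals "no more results" by getMoreResults() == false
  // together with getUpdateCount() == -1
  while (isResultSet || stmt.getUpdateCount() != -1) {
    if (isResultSet) {
      val rs = stmt.getResultSet()
      try {
        out += read(rs)
      } finally {
        rs.close()
      }
    }
    // moves to the next result; true if that result is a ResultSet
    isResultSet = stmt.getMoreResults()
  }
  out.toList
}
```

With this, the second result set of sp_help would be the second element of the returned list, provided `read` extracts whatever is needed from each ResultSet before it is closed.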