jOOQ - how to use .whereNotExists() properly? - postgresql

I want to persist daily close entities, representing close prices of stocks.
public class DailyData {
    private Long id;
    private String ticker;
    private BigDecimal open;
    private BigDecimal high;
    private BigDecimal low;
    private BigDecimal close;
    private Timestamp date;
    // getters, setters
}
Because of the limited API of the data provider, I may get duplicate entries for certain dates (for example, even if I only need two days, I still have to request a month's worth of data). Obviously, I only want one record per date, so any date that already exists in the DB should not be persisted.
This may have already been answered here, but I am having trouble implementing it in practice. In particular, I don't understand how to pass actual values to be persisted. This is adapted from the example in the link:
Param<Timestamp> date = param("date", Timestamp.class);
create.insertInto(DATA, DATA.TICKER, DATA.OPEN, DATA.HIGH, DATA.LOW, DATA.CLOSE, DATA.DATE)
    .select(
        select(
            param("ticker", DATA.TICKER.getType()),
            param("open", DATA.OPEN.getType()),
            param("high", DATA.HIGH.getType()),
            param("low", DATA.LOW.getType()),
            param("close", DATA.CLOSE.getType()),
            date
        )
        .whereNotExists(
            selectOne()
                .from(DATA)
                .where(DATA.DATE.eq(date))
        )
    );
Where are the actual values passed on in the example? There is no call to .values() DSL command, which normally appears in jOOQ documentation to tell it what values are to be inserted.
Is .execute at the end not needed?
There is a batchInsert() command to persist many entities/rows at once. Is there a batch variety of the above mentioned example? Or do I simply have to iterate through all the entities and perform the uniqueness check on each one separately?

Where are the actual values passed on in the example? There is no call to .values() DSL command, which normally appears in jOOQ documentation to tell it what values are to be inserted.
Why are you using the named parameter API through DSL.param()? Just pass DSL.val() and you'll be fine. E.g.
select(
    val(ticker),
    val(open),
    val(high),
    val(low),
    val(close),
    val(date)
)
In fact, there is also a DSL.param(String, T) method, which you could use to pass actual values.
Probably there should be more overloads. I've created a feature request for this: https://github.com/jOOQ/jOOQ/issues/7136
However, this query is probably better implemented using INSERT .. ON CONFLICT in PostgreSQL. See also my answer to this question here.
Is .execute at the end not needed?
Yes it is.
There is a batchInsert() command to persist many entities/rows at once. Is there a batch variety of the above mentioned example? Or do I simply have to iterate through all the entities and perform the uniqueness check on each one separately?
You can batch any statement. The relevant documentation is here:
https://www.jooq.org/doc/latest/manual/sql-execution/batch-execution
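As a rough sketch of the INSERT .. ON CONFLICT approach combined with batching, here is a plain JDBC version (not jOOQ, so that it needs nothing beyond the JDK). The table name data, the column names, the Row class and the unique constraint on (ticker, date) are assumptions made for illustration; they are not taken from the question's schema.

```java
import java.math.BigDecimal;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.util.List;

class DailyDataBatchInsert {

    // Assumed table/column names; requires a UNIQUE constraint on (ticker, date).
    static final String INSERT_SQL =
        "INSERT INTO data (ticker, open, high, low, close, date) "
        + "VALUES (?, ?, ?, ?, ?, ?) "
        + "ON CONFLICT (ticker, date) DO NOTHING";

    // Hypothetical stand-in for the DailyData entity from the question.
    static final class Row {
        final String ticker;
        final BigDecimal open;
        final BigDecimal high;
        final BigDecimal low;
        final BigDecimal close;
        final Timestamp date;

        Row(String ticker, BigDecimal open, BigDecimal high,
            BigDecimal low, BigDecimal close, Timestamp date) {
            this.ticker = ticker;
            this.open = open;
            this.high = high;
            this.low = low;
            this.close = close;
            this.date = date;
        }
    }

    // Queues one INSERT per row and sends them to the database as one batch.
    // Rows that collide with an existing (ticker, date) are skipped by PostgreSQL.
    static int[] insertAll(Connection conn, List<Row> rows) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (Row r : rows) {
                ps.setString(1, r.ticker);
                ps.setBigDecimal(2, r.open);
                ps.setBigDecimal(3, r.high);
                ps.setBigDecimal(4, r.low);
                ps.setBigDecimal(5, r.close);
                ps.setTimestamp(6, r.date);
                ps.addBatch();
            }
            return ps.executeBatch();
        }
    }

    public static void main(String[] args) {
        // No database is needed to inspect the statement that would be batched.
        System.out.println(INSERT_SQL);
    }
}
```

With this shape the uniqueness check is left to the database, so there is no need to test each entity separately before inserting it. If only one ticker is ever stored, the conflict target could be (date) alone.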

Substring of column name in Copy Activity in ADF v2

Is there a way in the V2 Copy Activity to operate upon one of the input columns (of type string) with an expression? Before I load rows to the destination, I need to limit the number of characters in the column.
My hope was to simply switch from something like this:
"ColumnMappings": "inColumn: outColumn"
to something like this:
"ColumnMappings": "@substring(inColumn, 1, 300): outColumn"
If anyone can point me to where I can read-up on where & when string expressions can be used, I could use the guidance.
This is the official documentation on expressions and functions: https://learn.microsoft.com/en-us/azure/data-factory/control-flow-expression-language-functions
And this is the documentation on mappings: https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-schema-and-type-mapping
Also remember that if you are using a defined query in the copy activity, you can use sql functions like CAST([fieldName] as varchar(300)) to limit the amount of characters on a particular field.
Hope this helped!
When you don't have a SQL Source, but your destination is a SQL sink, you can use a Stored Procedure to insert your data into the final table. That way, you can define these kinds of transformations in the stored procedure. I don't think the Data Factory can handle these kinds of activities, it is more intended as an orchestrator.
Have a look here:
https://learn.microsoft.com/en-us/azure/data-factory/connector-sql-server#invoke-stored-procedure-from-sql-sink

How to use CQLinq to get metrics of Methods and Fields within a single query

I am calculating average length of identifiers with CQLinq in NDepend, and I want to get the length of the names of classes, fields and methods. I walked through this page of CQlinq: http://www.ndepend.com/docs/cqlinq-syntax, and I have code like:
let id_m = Methods.Select(m => new { m.SimpleName, m.SimpleName.Length })
let id_f = Fields.Select(f => new { f.Name, f.Name.Length })
select id_m.Union(id_f)
It doesn't work, one error says:
'System.Collections.Generic.IEnumerable' does not
contain a definition for 'Union'...
The other one is:
cannot convert from
'System.Collections.Generic.IEnumerable' to
'System.Collections.Generic.HashSet'
However, according to MSDN, IEnumerable Interface defines Union() and Concat() methods.
It seems to me that I cannot use CQLinq exactly the same way as LINQ. Anyway, is there a way to get the information from Types, Methods and Fields domains within a single query?
Thanks a lot.
is there a way to get the information from Types, Methods and Fields domains within a single query?
Not for now, because a CQLinq query can only match a sequence of types, a sequence of methods, or a sequence of fields, so you need 3 distinct code queries.
In the next version, CQLinq will be improved a lot, and you will indeed be able to write things like:
from codeElement in Application.TypesAndMembers
select new { codeElement, codeElement.Name.Length }
Next version will be available before the end of the year 2016.

ormlite select count(*) as typeCount group by type

I want to do something like this in OrmLite
SELECT *, COUNT(title) as titleCount from table1 group by title;
Is there any way to do this via QueryBuilder without the need for queryRaw?
The documentation states that the use of COUNT() and the like necessitates the use of selectRaw(). I hoped for a way around this - not having to write my SQL as strings is the main reason I chose to use ORMLite.
http://ormlite.com/docs/query-builder
selectRaw(String... columns):
Add raw columns or aggregate functions
(COUNT, MAX, ...) to the query. This will turn the query into
something only suitable for using as a raw query. This can be called
multiple times to add more columns to select. See section Issuing Raw
Queries.
Further information on the use of selectRaw() as I was attempting much the same thing:
Documentation states that if you use selectRaw() it will "turn the query into" one that is supposed to be called by queryRaw().
What it does not explain is that normally while multiple calls to selectColumns() or selectRaw() are valid (if you exclusively use one or the other),
use of selectRaw() after selectColumns() has a 'hidden' side-effect of wiping out any selectColumns() you called previously.
I believe that the ORMLite documentation for selectRaw() would be improved by a note that its use is not intended to be mixed with selectColumns().
QueryBuilder<EmailMessage, String> qb = emailDao.queryBuilder();
qb.selectColumns("emailAddress"); // This column is not selected due to later use of selectRaw()!
qb.selectRaw("COUNT (emailAddress)");
ORMLite examples are not as plentiful as I'd like, so here is a complete example of something that works:
QueryBuilder<EmailMessage, String> qb = emailDao.queryBuilder();
qb.selectRaw("emailAddress"); // This can also be done with a single call to selectRaw()
qb.selectRaw("COUNT (emailAddress)");
qb.groupBy("emailAddress");
GenericRawResults<String[]> rawResults = qb.queryRaw(); // Returns results with two columns
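Since queryRaw() hands back untyped String[] rows, the caller still has to convert them. A minimal sketch of that step, assuming each row holds the grouped value in column 0 and its count in column 1, as in the query above. The class and method names are hypothetical, and the list built in main stands in for rawResults.getResults():

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RawCountRows {

    // Converts rows of the form [value, count] into a value -> count map,
    // preserving the order in which the rows were returned.
    static Map<String, Long> toCounts(List<String[]> rows) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String[] row : rows) {
            counts.put(row[0], Long.parseLong(row[1]));
        }
        return counts;
    }

    public static void main(String[] args) {
        // Stand-in data; in real code this list would come from the raw query.
        List<String[]> rows = Arrays.asList(
            new String[] {"a@example.com", "3"},
            new String[] {"b@example.com", "1"});
        System.out.println(toCounts(rows));
    }
}
```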
Is there any way to do this via QueryBuilder without the need for queryRaw(...)?
The short answer is no because ORMLite wouldn't know what to do with the extra count value. If you had a Table1 entity with a DAO definition, what field would the COUNT(title) go into? Raw queries give you the power to select various fields but then you need to process the results.
With the code right now (v5.1), you can define a custom RawRowMapper and then use the dao.getRawRowMapper() method to process the results for Table1 and tack on the titleCount field by hand.
I've got an idea how to accomplish this in a better way in ORMLite. I'll look into it.

Is it correct that "ResultSet.getMetaData.getTableName(col)" of postgresql's jdbc driver is always returning an empty string?

When using PostgreSQL, I ran the following code:
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("select * from t");
String tableName = rs.getMetaData().getTableName(1);
System.out.println(tableName);
It prints an empty string.
So I checked the source code, and found the method org.postgresql.jdbc2.AbstractJdbc2ResultSetMetaData#getTableName always returns an empty string.
The source code is:
public abstract class AbstractJdbc2ResultSetMetaData implements PGResultSetMetaData {
    /*
     * @param column the first column is 1, the second is 2...
     * @return column name, or "" if not applicable
     * @exception SQLException if a database access error occurs
     */
    public String getTableName(int column) throws SQLException
    {
        return "";
    }
}
You can see it just returns "".
I found a discussion about this; see: http://archives.postgresql.org/pgsql-jdbc/2009-12/msg00100.php
Their view is that "rs.getMetaData.getTableName(col)" should return the alias used in the query, not the underlying table name. But that is hard to implement, so they decided it is better to leave it empty.
They also gave a way to get the table name:
PGResultSetMetaData.getBaseTableName()
Sample:
ResultSet rs = stmt.executeQuery("select * from x");
// convert it to PGResultSetMetaData
PGResultSetMetaData meta = (PGResultSetMetaData)rs.getMetaData();
String tableName = meta.getBaseTableName(1);
Now it can print the correct table name.
I don't know whether PostgreSQL's implementation is correct, but returning the underlying table name is much more useful than an empty string, and most other databases provide the underlying table name instead of an empty string.
I have a problem using Play2's anorm framework with PostgreSQL: anorm doesn't work on PostgreSQL, but it works well on other databases.
What do you think is the correct implementation for PostgreSQL's JDBC driver: return an empty string, the underlying table name, or something else?
I would say that returning an empty string is obviously an incorrect implementation of the interface, since the table name could never be considered to be an empty string.
The problem that I believe they are struggling with is that while their current implementation is wrong, once they choose an implementation they will be stuck with it until they decide that breaking dependencies on the behaviour is acceptable. Therefore, they choose to add a method whose name is unambiguous and provide the data that most users were expecting to come from getTableName, and leave an obviously broken implementation of the getTableName method until some consensus is reached on what it should return or until a patch is submitted that implements the consensus.
My gut reaction is that the method getTableName should return the alias being used for that table. A table could be joined with itself, and using the alias would allow you to identify which was being referenced. A table might have been generated in the query (such as unnesting an array), and therefore not even have a table name in the database. If you make the decision “absolutely always, getTableName returns the alias”, then at least users know what to expect; otherwise, you end up with it not being obvious what the method should return.
However, even if I assume that my gut reaction is “the correct implementation”, it raises the issue of compatibility. It is desirable that it be possible to switch from another DBMS to PostgreSQL with as little investment as possible, if one of PostgreSQL’s goals is to grow in popularity. Therefore, things like “how do other JDBCs implement the java.sql interfaces?” become relevant. As you say, a framework exists that has expectations of how the ResultSetMetaData should be implemented, and it is likely not the only one with certain expectations of how java.sql interfaces will be implemented.
Whichever implementation they end up choosing is going to be a tradeoff, so I can see why “kick the can down the road” is their choice. Once they choose the tradeoff they want to make, they are locked in.
EDIT: I would suggest that throwing a "not implemented" exception would be better than silently failing. I expect that frameworks that rely on a specific implementation of getTableName will not have much use for an empty string anyway, and will either raise an error or fail silently themselves.

ExecuteSprocAccessor does not function for CUD operations?

I have several stored procedures in my database. For example a delete stored procedure like:
alter procedure [dbo].[DeleteFactor]
    @Id uniqueidentifier
as
begin
    delete from Factors where Id = @Id
end
When I call this from code like this:
dc.ExecuteSprocAccessor("DeleteFactor", id);
then the row does not get deleted. However this code functions:
dc.ExecuteNonQuery("DeleteFactor", id);
id is a passed in parameter and of type Guid.
Can anyone explain why the second does work and the first approach does not? I find it quite strange as the first method is clearly to be used with stored procedures.
According to Retrieving Data as Objects, the ExecuteSprocAccessor method uses deferred execution (à la LINQ). So, in the first approach, since you are not accessing the results of the DeleteFactor stored procedure, the SQL call is never made.
I would use the second method anyway since you really are executing a non-query. Also, the first approach may lead to some confusion since the ExecuteSprocAccessor is designed to retrieve data. e.g. "Is data supposed to be returned here? Maybe something was missed?"
Just call ToArray or ToList on the result of your ExecuteSprocAccessor to make it execute.
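Deferred execution is easy to see in isolation. The following is only an analogy written with Java streams, not Enterprise Library code: a lazy pipeline carries a side effect (standing in for the DELETE) that does not run until something consumes the result, which is the role ToArray or ToList plays above. All names in it are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DeferredExecutionDemo {

    // Records which ids were "deleted"; stands in for the stored procedure's side effect.
    static final List<String> deleted = new ArrayList<>();

    // Builds a lazy pipeline; the side effect runs only when the stream is consumed.
    static Stream<String> deleteLazily(String id) {
        return Stream.of(id).peek(deleted::add);
    }

    public static void main(String[] args) {
        Stream<String> pending = deleteLazily("factor-1");
        System.out.println(deleted.size()); // prints 0: nothing has executed yet

        pending.collect(Collectors.toList()); // consuming the result triggers the work
        System.out.println(deleted.size()); // prints 1
    }
}
```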