Search on a Numeric Field in a keyword query with Hibernate Search 5 - hibernate-search

I've tried many ways to search, in a keyword query, for a Long field in my database. I always got errors because Hibernate Search will index the Long as a numeric field. I've tried using FieldBridge, but it just wouldn't work. Here is my field I want to search on.
@Field
@Column(unique = true)
private Long numericField;

I couldn't find anything online on this issue so I'm posting my answer here Q&A-style. I figured a way to do it with a dynamic field, see below:
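@Column(unique = true)
private Long numericField;

@Field(name = "numericField", analyze = Analyze.NO, index = Index.YES)
public String getNumericFieldAsString() {
    // Guard against null so indexing does not fail for entities without a value
    return numericField == null ? null : numericField.toString();
}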
The field is indexed dynamically as a String now and I can use it in my keyword query.

Keyword queries support long fields just fine: you just have to provide a long value when querying, because the field is of type long. So just use Long.parseLong to turn the text value provided by your user into a long before you pass it to Hibernate Search.
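A minimal sketch of that conversion step, using only the JDK. The helper name parseSearchTerm is hypothetical and not part of Hibernate Search; the resulting long is what you would pass as the value to match in the keyword query instead of the raw text.

```java
import java.util.OptionalLong;

public class NumericTermParser {

    // Hypothetical helper: turns raw user input into a long,
    // or returns empty if the input is not a valid number.
    static OptionalLong parseSearchTerm(String userInput) {
        if (userInput == null) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(userInput.trim()));
        } catch (NumberFormatException e) {
            // Not a number: the caller can skip the numeric field or report an error.
            return OptionalLong.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseSearchTerm(" 12345 ").isPresent());
        System.out.println(parseSearchTerm("abc").isPresent());
    }
}
```

Returning an OptionalLong rather than throwing lets the caller decide whether non-numeric input is an error or simply means the numeric field should be left out of the query.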
In some cases, you may really need to use a string field for your numeric value. For example if you're targeting multiple fields in the same keyword query, and some of the other fields are string fields.
Then your answer will work fine, though it unfortunately has some side effects that will cause Hibernate Search to reindex data more often (because it doesn't know exactly where getNumericFieldAsString gets its data from).
Alternatively, you could use bridges, either a custom one, or just this one which is built into Hibernate Search 5:
@Column(unique = true)
@Field(name = "numericField", analyze = Analyze.NO, bridge = @FieldBridge(impl = org.hibernate.search.bridge.builtin.LongBridge.class))
private Long numericField;
For information about custom bridges, see here for Hibernate Search 5 and here for Hibernate Search 6 (the newer version of Hibernate Search, with a different API).

jOOQ - how to use .whereNotExists() properly?

I want to persist daily close entities, representing close prices of stocks.
public class DailyData {
private Long id;
private String ticker;
private BigDecimal open;
private BigDecimal high;
private BigDecimal low;
private BigDecimal close;
private Timestamp date;
//getters, setters
}
Because of the limited API of the data provider, I may get duplicate entries for certain dates (if, for example, I only need two days, I still need to ask for a month worth of data). Obviously, I only want to have one record per date, so any date that already exists in the DB should not be persisted.
This may have already been answered here, but I am having trouble implementing it in practice. In particular, I don't understand how to pass actual values to be persisted. This is adapted from the example in the link:
Param<Timestamp> date = param("date", Timestamp.class);
create.insertInto(DATA, DATA.TICKER, DATA.OPEN, DATA.HIGH, DATA.LOW, DATA.CLOSE, DATA.DATE)
    .select(
        // One value per target column, in the same order as in insertInto()
        select(
            param("ticker", DATA.TICKER.getType()),
            param("open", DATA.OPEN.getType()),
            param("high", DATA.HIGH.getType()),
            param("low", DATA.LOW.getType()),
            param("close", DATA.CLOSE.getType()),
            date
        )
        .whereNotExists(
            selectOne()
                .from(DATA)
                .where(DATA.DATE.eq(date))
        )
    );
Where are the actual values passed on in the example? There is no call to .values() DSL command, which normally appears in jOOQ documentation to tell it what values are to be inserted.
Is .execute at the end not needed?
There is a batchInsert() command to persist many entities/rows at once. Is there a batch variety of the above mentioned example? Or do I simply have to iterate through all the entities and perform the uniqueness check on each one separately?
Where are the actual values passed on in the example? There is no call to .values() DSL command, which normally appears in jOOQ documentation to tell it what values are to be inserted.
Why are you using the named parameter API through DSL.param()? Just pass DSL.val() and you'll be fine. E.g.
select(
    val(ticker),
    val(open),
    val(high),
    val(low),
    val(close),
    val(date)
)
In fact, there is also a DSL.param(String, T) method, which you could use to pass actual values.
Probably there should be more overloads. I've created a feature request for this: https://github.com/jOOQ/jOOQ/issues/7136
However, this query is probably better implemented using INSERT .. ON CONFLICT in PostgreSQL. See also my answer to this question here.
Is .execute at the end not needed?
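Yes, it is needed. Without the call to .execute(), the statement is only built and never sent to the database.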
There is a batchInsert() command to persist many entities/rows at once. Is there a batch variety of the above mentioned example? Or do I simply have to iterate through all the entities and perform the uniqueness check on each one separately?
You can batch any statement. The relevant documentation is here:
https://www.jooq.org/doc/latest/manual/sql-execution/batch-execution

Best type for JPA version field for Optimistic locking

I have doubts about which type is best for a field annotated with @Version for optimistic locking in JPA.
The API javadoc (http://docs.oracle.com/javaee/7/api/javax/persistence/Version.html) says:
"The following types are supported for version properties: int, Integer, short, Short, long, Long, java.sql.Timestamp."
Another page (http://en.wikibooks.org/wiki/Java_Persistence/Locking#Optimistic_Locking) says:
"JPA supports using an optimistic locking version field that gets updated on each update. The field can either be numeric or a timestamp value. A numeric value is recommended as a numeric value is more precise, portable, performant and easier to deal with than a timestamp."
"Timestamp locking is frequently used if the table already has a last updated timestamp column, and is also a convenient way to auto update a last updated column. The timestamp version value can be more useful than a numeric version, as it includes the relevant information on when the object was last updated."
The questions I have are:
Is a Timestamp type better if you are going to have a lastUpdated field anyway, or is it better to have a numeric version field and keep the timestamp in a separate field?
Among the numeric types (int, Integer, short, Short, long, Long), which is the best choice, considering the size of each type? I think Long is the best, but it takes more space in each row.
What happens when the version field gets to the last number of a numeric type (for example 32,767 in a Short field)? Will it start from 1 again in the next increment?
Just go with Long or Integer.
BUT don't go with int or long.
Contrary to another comment here, a null value is expected when the entity has never been persisted.
With int or long, the version is 0 when unset, which might make Hibernate think that the entity has already been persisted and is in a detached state.
Just finished debugging a FK violation where "int" was the cause, so save your time and just go with Long or Integer.
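The difference comes down to Java's default field values, which can be seen without any persistence provider. In this sketch the two classes are hypothetical stand-ins for entities and the JPA annotations are left out so that it runs with the JDK alone.

```java
public class VersionDefaults {

    // Stand-in for an entity whose version field is a primitive.
    static class WithPrimitive {
        long version;   // Java initializes this to 0
    }

    // Stand-in for an entity whose version field is a wrapper type.
    static class WithWrapper {
        Long version;   // Java initializes this to null
    }

    public static void main(String[] args) {
        System.out.println(new WithPrimitive().version); // 0
        System.out.println(new WithWrapper().version);   // null
    }
}
```

A freshly constructed object with a primitive version is indistinguishable from one whose version really is 0, while null can only mean that no version has been assigned yet.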
First, know that locking is used to manage concurrent transactions.
1. Separate your concerns. If the lastUpdated field is specific to your business model, it should be separate from your versioning field, which exists only for versioning.
2. Primitives and objects are usually mapped to your db as the same type, except that a Boolean will be nullable by default and a boolean will be 'not nullable'. However, enforce nullability explicitly. In this case you want to use a primitive, as the version field can't be nullable.
Integer or long are better than timestamp. Hibernate recommends numeric versioning, and they don't take that much space.
If you use long, you might not live to find out.
Use this and you should be fine.
private long version;

@Version
public long getVersion() {
    return version;
}

public void setVersion(long version) {
    this.version = version;
}
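On the overflow question, a rough back-of-the-envelope sketch in plain Java. It shows only how Java integer arithmetic behaves and how large the long range is; it does not show what a particular JPA provider does when a version column reaches its maximum. The update rate is an assumed figure chosen for illustration.

```java
public class VersionRange {

    // Plain Java narrowing: incrementing the largest short wraps to the smallest one.
    static short nextShort(short current) {
        return (short) (current + 1);
    }

    // Whole years until a counter starting at 0 reaches Long.MAX_VALUE
    // at the given constant (assumed) update rate.
    static long yearsToExhaustLong(long updatesPerSecond) {
        long secondsPerYear = 365L * 24 * 60 * 60;
        return Long.MAX_VALUE / updatesPerSecond / secondsPerYear;
    }

    public static void main(String[] args) {
        System.out.println(nextShort(Short.MAX_VALUE));     // -32768
        System.out.println(yearsToExhaustLong(1_000_000L));
    }
}
```

Even at one million updates per second on a single row, a long counter lasts on the order of hundreds of thousands of years, whereas a short is exhausted after 32,767 increments.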
Don't use a time value like Timestamp (or derivatives like Instant or LocalDateTime etc.).
Especially if you have a Java < 15 application and hope to ever migrate to Java >= 15. They changed the precision of timestamps within Java to nano-seconds, but your database probably only stores up to microseconds, so it truncates the value, which will make you run into an OptimisticLockException all the time (1).
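The mismatch can be reproduced with java.time alone. This sketch assumes a column that keeps microseconds and simulates the database round trip by truncating; storedValue is a hypothetical helper, and no real database or JPA provider is involved.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

public class TimestampPrecision {

    // Simulates a column that keeps only microseconds (an assumption about the database).
    static Instant storedValue(Instant inMemory) {
        return inMemory.truncatedTo(ChronoUnit.MICROS);
    }

    public static void main(String[] args) {
        // Fixed instant with nanosecond digits, so the result does not depend on the clock.
        Instant inMemory = Instant.ofEpochSecond(1_600_000_000L, 123_456_789L);
        Instant reloaded = storedValue(inMemory);

        System.out.println(inMemory.equals(reloaded)); // false
        System.out.println(reloaded.getNano());        // 123456000
    }
}
```

Because the in-memory version and the stored version are no longer equal, a version comparison based on them fails even though nobody else touched the row.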
Don't use a primitive value either; see the answer from @Piotr: the version field must be null for new entities.
Just go with Long.

How to concat two strings in JPQL with MS SQL

I am trying to update a column (in an MS SQL table) which holds very long strings (text data type) appending it with a string from my application using JPQL. But the following query fails:
UPDATE entity e SET e.longText = CONCAT(e.longText, :textToAppend) WHERE e.id = :id
with message
The data types text and nvarchar are incompatible in the add operator.
The problem is that we also need to support databases other than MS SQL, so database-specific queries are out of the question (at least if there is another way).
With this query I was trying to bypass querying for the whole long text and concatenating it in the app and updating it back, so it's not slow (the query is called quite often).
Can I somehow append a string to a very long text column without doing it manually in the app and so it works in MS SQL? (I know there is no cast support in JPQL, sadly)
Using JPA with Hibernate.
It turned out I was using an old Hibernate dialect. There is a dialect for SQL Server 2008 (org.hibernate.dialect.SQLServer2008Dialect) which maps @Lob String properties correctly.

Using VARCHAR or TEXT as default String mapping in OpenJPA

By default, OpenJPA's postgres dictionary generates VARCHAR(255) for String fields without stated length. Can it be set up to generate VARCHAR or TEXT instead for all such fields, so that I don't need to repeat @Column(columnDefinition = "TEXT") everywhere? Of course, if the length is given explicitly, e.g. @Column(length = 128), the result should be VARCHAR(128). For that matter, do any other JPA providers allow this?
It seems that Hibernate supports this since 3.6: 6.5. Type Registry. Tracked by this JIRA issue: HHH-5138.

Is it correct that "ResultSet.getMetaData.getTableName(col)" of postgresql's jdbc driver is always returning an empty string?

When I use postgresql, I found following code:
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("select * from t");
String tableName = rs.getMetaData().getTableName(1);
System.out.println(tableName);
It prints an empty string.
So I checked the source code, and found the method org.postgresql.jdbc2.AbstractJdbc2ResultSetMetaData#getTableName always returns an empty string.
The source code is:
public abstract class AbstractJdbc2ResultSetMetaData implements PGResultSetMetaData {
    /*
     * @param column the first column is 1, the second is 2...
     * @return column name, or "" if not applicable
     * @exception SQLException if a database access error occurs
     */
    public String getTableName(int column) throws SQLException
    {
        return "";
    }
}
You can see it just returns "".
I found a discussion about this, please see: http://archives.postgresql.org/pgsql-jdbc/2009-12/msg00100.php
Their view is that "rs.getMetaData.getTableName(col)" should return the alias name used in the query, not the underlying table name. But that is hard to implement, so they consider it better to leave it empty.
Also they gave a method to get the table name, use:
PGResultSetMetaData.getBaseTableName()
Sample:
ResultSet rs = stmt.executeQuery("select * from x");
// convert it to PGResultSetMetaData
PGResultSetMetaData meta = (PGResultSetMetaData)rs.getMetaData();
String tableName = meta.getBaseTableName(1);
Now it can print the correct table name.
I don't know whether PostgreSQL's implementation is correct, but returning the underlying table name is much more useful than an empty string, and most other databases return the underlying table name instead of an empty string.
I ran into this while using Play 2's Anorm framework with PostgreSQL: Anorm doesn't work on PostgreSQL, although it works well on other databases.
What do you think the correct implementation for PostgreSQL's JDBC driver is: returning an empty string, the underlying table name, or something else?
I would say that returning an empty string is obviously an incorrect implementation of the interface, since the table name could never be considered to be an empty string.
The problem that I believe they are struggling with is that while their current implementation is wrong, once they choose an implementation they will be stuck with it until they decide that breaking dependencies on the behaviour is acceptable. Therefore, they choose to add a method whose name is unambiguous and provide the data that most users were expecting to come from getTableName, and leave an obviously broken implementation of the getTableName method until some consensus is reached on what it should return or until a patch is submitted that implements the consensus.
My gut reaction is that the method getTableName should return the alias being used for that table. A table could be joined with itself, and using the alias would allow you to identify which was being referenced. A table might have been generated in the query (such as unnesting an array), and therefore not even have a table name in the database. If you make the decision “absolutely always, getTableName returns the alias”, then at least users know what to expect; otherwise, you end up with it not being obvious what the method should return.
However, even if I assume that my gut reaction is “the correct implementation”, it raises the issue of compatibility. It is desirable that it be possible to switch from another DBMS to PostgreSQL with as little investment as possible, if one of PostgreSQL’s goals is to grow in popularity. Therefore, things like “how do other JDBCs implement the java.sql interfaces?” become relevant. As you say, a framework exists that has expectations of how the ResultSetMetaData should be implemented, and it is likely not the only one with certain expectations of how java.sql interfaces will be implemented.
Whichever implementation they end up choosing is going to be a tradeoff, so I can see why “kick the can down the road” is their choice. Once they choose the tradeoff they want to make, they are locked in.
EDIT: I would suggest that throwing an exception regarding not implemented would be better than just silently failing. I expect that frameworks that rely on a specific implementation of getTableName will not have much use for empty string anyway, and either error or themselves fail silently.