How can I reuse a field in several bridges in Hibernate Search 6? - hibernate-search

In Hibernate Search 5 I had several custom bridges that populated the same field. That was handy because I could query just one field. Now, if I try to do the same, I get this error:
HSEARCH600034: Duplicate index field definition: 'attributes'. Index field names must be unique. Look for two property mappings with the same field name, or two indexed-embeddeds with prefixes that lead to conflicting index field names, or two custom bridges declaring index fields with the same name.
I have not found a way to get an existing field from the PropertyBindingContext when implementing the PropertyBinder, only the documented way to add new fields:
IndexFieldReference<String> attributesField = schemaElement
        .field("attributes", f -> f.asString())
        .toReference();
Am I missing something, or is this no longer possible, so that I need to add new fields?

How can I reuse a field in several bridges in Hibernate Search 6?
At the moment, you cannot.
This limitation is a side effect of the (many) sanity checks that Hibernate Search 6 performs on startup, which prevent common mistakes and indirectly allow a more intuitive behavior in the Search DSL.
Your options are pretty much these:
Either you refactor your bridges to regroup all code that contributes to the same field in a single bridge (either a TypeBridge, or a PropertyBridge applied to a non-persisted getter that returns an aggregated list of all values you're interested in).
Or you change your bridges to each contribute to its own field, and change your search code to target all these fields at once; most (if not all) predicates allow targeting multiple fields in the same predicate.
The second solution is also the recommended way of indexing, since it produces more accurate relevance scores.
EDIT: If you go for solution 2, this (not yet implemented) feature might be of interest to you: https://hibernate.atlassian.net/browse/HSEARCH-3926
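A minimal sketch of the first option, assuming a hypothetical `Product` entity whose color and tags should all end up in the single `attributes` field. The class and property names are illustrative, and the Hibernate Search and JPA annotations are left out so the class compiles on its own; in a real mapping the aggregating getter would be the non-persisted property that the one remaining bridge is applied to.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical entity: names are illustrative, not taken from the question.
class Product {
    private final String color;
    private final List<String> tags;

    Product(String color, List<String> tags) {
        this.color = color;
        this.tags = tags;
    }

    String getColor() {
        return color;
    }

    List<String> getTags() {
        return tags;
    }

    // Non-persisted getter (it would be marked @Transient in a JPA entity).
    // It gathers every value that should go into the single "attributes"
    // index field, so only one bridge has to declare that field.
    List<String> getAllAttributes() {
        List<String> all = new ArrayList<>();
        if (color != null) {
            all.add(color);
        }
        all.addAll(tags);
        return all;
    }
}
```

With a getter like this, Hibernate Search would presumably also need to be told which persisted properties the value is derived from (for annotation mappings, `@IndexingDependency(derivedFrom = ...)`), so that changes to them trigger reindexing.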

Related

Adding fields transparently to all types

Is there any integration point allowing to add a meta-field on all indexed documents transparently, right before they are indexed, similarly to _hibernate_class?
Currently using Hibernate 5.11
As discussed over the chat, the only option in Search 5 is to use the programmatic mapping API to add a class bridge to every single indexed entity type.
In Search 6, you can use the new programmatic mapping API to add a type bridge to the Object type, and it will be applied to every type. It will also be applied to embedded types, though, so that may not be what you're after.

hibernate-search for one-directional associations

According to the spec, when @IndexedEmbedded points to an entity, the association has to be directional and the other side has to be annotated with @ContainedIn. If not, Hibernate Search has no way to update the root index when the associated entity is updated.
Am I right to assume the word directional should be bi-directional? I have exactly the problem that my index is not updated. I have one-directional relationships, e.g. person to order but the order does not know the person. Now when I change the order the index is not updated.
If changing the associations to become bi-directional is no option which possibilities would I have to still use hibernate-search? Would it be possible to create two separate indices and to combine queries?
Am I right to assume the word directional should be bi-directional?
Yes. I will fix this typo.
If changing the associations to become bi-directional is no option which possibilities would I have to still use hibernate-search?
If Person is indexed and embeds Order, but Order doesn't have an inverse association to Person, then Hibernate Search cannot retrieve the Persons that have to be reindexed when an Order changes.
Thus you will have to reindex manually: https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#manual-index-changes .
You can adopt one of two strategies:
The easy path: reindex all the Person entities periodically, e.g. every night.
The hard path: reindex the affected Person entities whenever an Order changes. This basically means adding code to your services so that whenever an order is created/updated/deleted, you run a query to retrieve all the corresponding persons, and reindex them manually.
The first solution is fairly simple, but has the big disadvantage that the Person index will be up to 24 hours out of date. Depending on your use case, that may be ok or that may not.
The second solution is prone to errors and you would basically be doing Hibernate Search's work.
All in all, you really have to ask yourself if adding the inverse side of the association to your model wouldn't be better.
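The hard path can be sketched as follows. This is plain Java with hypothetical `Person`, `Order` and `OrderService` classes, and an in-memory list standing in for the database. The `reindex` callback marks the spot where a real service would call the manual indexing API described in the linked section (in Search 5, `FullTextSession.index(entity)`).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical model: Person knows its orders, Order does not know its Person.
class Order {
    final long id;

    Order(long id) {
        this.id = id;
    }
}

class Person {
    final String name;
    final List<Order> orders = new ArrayList<>();

    Person(String name) {
        this.name = name;
    }
}

class OrderService {
    private final List<Person> allPersons;   // stand-in for a database query
    private final Consumer<Person> reindex;  // stand-in for the manual indexing call

    OrderService(List<Person> allPersons, Consumer<Person> reindex) {
        this.allPersons = allPersons;
        this.reindex = reindex;
    }

    // Must be called from every code path that creates, updates or deletes an Order.
    void onOrderChanged(Order changed) {
        for (Person person : allPersons) {
            // With a real database this loop would be a query such as
            // "select p from Person p join p.orders o where o.id = :id".
            boolean owns = person.orders.stream().anyMatch(o -> o.id == changed.id);
            if (owns) {
                reindex.accept(person);
            }
        }
    }
}
```

The sketch also shows why this path is error-prone: any code path that changes an order without going through `onOrderChanged` leaves the Person index stale.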
Would it be possible to create two separate indices and to combine queries?
Technically, if you are using the Lucene integration (not the Elasticsearch one), then yes, it would be possible.
But:
you would need above-average knowledge of Lucene.
you would have to bypass Hibernate Search APIs, and would need to write code to do what Hibernate Search usually does.
you would have to use experimental (read: unstable) Lucene APIs.
I am unsure as to how well that would perform, as I never tried it.
So I wouldn't recommend it if you're not familiar with Lucene's APIs. If you really want to take that path, here are a few pointers:
How to use the index readers directly: https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#IndexReaders
Lucene's documentation for joins (what you're looking for is query-time joins): https://lucene.apache.org/core/5_5_5/join/org/apache/lucene/search/join/package-summary.html

OData REST API where table has columns unique to customer

We would like to create an OData REST API. Our data model is such that each customer has their own database. All database objects have the same definition across all customer databases, with the exception of a single table.
The customer specific table we will call Contact. When a customer adds a column the system creates a column with a standardised name with a definition translated from options selected by the user in the UI. The user only refers to the column data by a field name they have specified to enable the user to be able to generate friendly queries.
It seems to me that the following approaches could be used to enable OData for the model described:
1) Create an OData open type to cater for the dynamic properties. This has the disadvantage of user requests for a customer not providing an indication of the dynamic properties that can be queried against. Even though they will be known for the user (via token authentication). Also, because dynamic properties are a dictionary, some data pivoting and inefficient query writing would be required. Not sure how to implement the IQueryable handling of query options for the dynamic properties to enable our own custom field querying.
2) Create a POCO class with e.g. 50 properties; CustomField1, CustomField2... Then somehow control which fields are exposed for use in OData calls. We would then include a separate API call to expose the custom field mapping. E.g. custom field friendly name of MobileNumber = CustomField12.
3) At runtime, check to see if the column definitions of the table have changed since the last check. If they have, generate a class specific to the customer using CodeDom and register it with OData, aiming for a unique URL for each customer. E.g. http://domain.name/{customer guid}/odata
I think the ideal for us is option 2. However, the fact the CustomField1 could be an underlying SQL data type of nvarchar, int, decimal, datetime, etc, there are added complications.
Has anyone a working example of how to achieve what has been described, satisfactorily?
Thanks in advance for any help.
Rik
We have run into a similar situation but with our entire dataset being unknown until runtime. Using the ODataConventionModelBuilder and EdmModel classes, you can add properties dynamically to the model at runtime.
I'm not sure whether you will have to manually add all of the properties for this object type even though only some of them are unknown or whether you can add your main object and then add your dynamic ones afterwards, but I guess either would be workable.
If you can get hold of which type of user it is on the server, you could then add only the properties that you are interested in (like option 3 but not having to CodeDom).
There is an example of this kind of untyped OData server in the OData samples here that should get you started: https://github.com/OData/ODataSamples/tree/master/WebApi/v4/ODataUntypedSample
The research we carried out actually pointed to Option 1 as the most suitable approach for some operations, i.e. create an SQL view that unpivots the data in a table to a key/value pair of column name/column value for each column in the table. This was suitable for queries returning small datasets. It was far less effort than Option 3 and less confusing for the user than Option 2. The unpivot query converted the field values to nvarchar (string) values, which meant that filtering in the UI by column value data types was not simple to achieve. (If we decide to implement this ability, I believe it can be achieved by creating a custom attribute that derives from EnableQueryAttribute, marking the controller action with it and manipulating the IQueryable before execution.)
However, we wanted to expose a /Contacts/Export endpoint that when called would output the columns from a table with a fixed schema joined on a table with a client specific schema and output to a CSV file. All the while utilising the OData supported filter syntax. One of our customer databases has more than 12 million rows of data and is made up of approximately 30 columns.
To achieve this, it looks like our best bet would have been to work with the Microsoft.OData.Core.UriParser.UriQueryExpressionParser class. Unfortunately, Microsoft in their wisdom have declared it internal, along with many of its dependents.
Walking an abstract syntax tree built from OData supported query options and applying our own visitor to each node to build some dynamic Linq query/SQL seems like a possible solution.
For the time being we will simply implement a cut-down set of supported $filter criteria, without support for grouping parentheses.
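The visitor idea mentioned above can be illustrated with a small sketch. It is written in Java rather than C#, it walks a hand-built tree instead of the OData parser's real node types, and every name in it (`FilterNode`, `Comparison`, `Logical`, `SqlVisitor`) is hypothetical. It also shows one place where the friendly-name-to-column mapping from Option 2 (e.g. MobileNumber = CustomField12) could be applied.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Minimal filter AST: comparisons combined with "and" / "or".
interface FilterNode {
    <R> R accept(FilterVisitor<R> visitor);
}

interface FilterVisitor<R> {
    R visitComparison(Comparison node);
    R visitLogical(Logical node);
}

final class Comparison implements FilterNode {
    final String field;     // friendly field name as typed by the user
    final String operator;  // "eq", "ne", "gt" or "lt"
    final Object value;

    Comparison(String field, String operator, Object value) {
        this.field = field;
        this.operator = operator;
        this.value = value;
    }

    public <R> R accept(FilterVisitor<R> visitor) {
        return visitor.visitComparison(this);
    }
}

final class Logical implements FilterNode {
    final String operator;  // "and" or "or"
    final FilterNode left;
    final FilterNode right;

    Logical(String operator, FilterNode left, FilterNode right) {
        this.operator = operator;
        this.left = left;
        this.right = right;
    }

    public <R> R accept(FilterVisitor<R> visitor) {
        return visitor.visitLogical(this);
    }
}

// Builds a parameterized SQL WHERE fragment; user values never enter the SQL text.
final class SqlVisitor implements FilterVisitor<String> {
    private static final Map<String, String> COMPARISONS =
            Map.of("eq", "=", "ne", "<>", "gt", ">", "lt", "<");
    private static final Map<String, String> CONNECTIVES =
            Map.of("and", "AND", "or", "OR");

    private final Map<String, String> columnsByFriendlyName;
    final List<Object> parameters = new ArrayList<>();

    SqlVisitor(Map<String, String> columnsByFriendlyName) {
        this.columnsByFriendlyName = columnsByFriendlyName;
    }

    public String visitComparison(Comparison node) {
        String column = columnsByFriendlyName.get(node.field);
        String sqlOperator = COMPARISONS.get(node.operator);
        if (column == null || sqlOperator == null) {
            throw new IllegalArgumentException(
                    "Unsupported filter: " + node.field + " " + node.operator);
        }
        parameters.add(node.value);
        return column + " " + sqlOperator + " ?";
    }

    public String visitLogical(Logical node) {
        String connective = CONNECTIVES.get(node.operator);
        if (connective == null) {
            throw new IllegalArgumentException("Unsupported operator: " + node.operator);
        }
        // Visit left before right so parameters line up with the placeholders.
        String left = node.left.accept(this);
        String right = node.right.accept(this);
        return "(" + left + " " + connective + " " + right + ")";
    }
}
```

Unknown field names and operators are rejected rather than passed through, since the column names and operators are concatenated into the SQL text while only the values are bound as parameters.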

JPA 2.0 Eclipselink OrderColumn support

I was reading over the docs regarding Eclipselink's support for @OrderColumn. It looks like this only applies to List and not Set. The reason I ask is because I have a ManyToMany bi-directional relationship (using a join table) which is a Set and is implemented with a HashSet because the collection can't have duplicates.
I wanted to order the entries in this set using @OrderColumn, but it appears I can only apply this to List, however using List will break my unique requirement. Is this understanding correct?
If so what is the recommended strategy for this case?
Thanks,
-Noah
This looks similar to the following question:
Why cannot a JPA mapping attribute be a LinkedHashset?
The Set interface does not define ordering of elements, so your set needs to be a concrete implementation like a TreeSet or LinkedHashSet implementation, not just any old Set. But your JPA provider is generally going to use its own collection implementations with special magic to handle lazy loading.
The above answer suggests that there may be some EclipseLink-specific workaround if you are willing to give up lazy loading.
I can think of two options, neither one perfect:
just use a List and rely on business logic to enforce uniqueness, with DB UNIQUE constraints as a backstop. Honestly, I end up using List for collections almost reflexively, even when Set would have been more appropriate; I admit it's sloppy but has yet to cause any significant problems for me in years of practice.
use a Set and change @ManyToMany to @OneToMany, and make your join table with its order column an actual entity that implements Comparable using the order column. Then, have your getter method do something like
if (!(this.set instanceof TreeSet)) {
    this.set = new TreeSet<T>(this.set);
}
return this.set;
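A fuller sketch of that second option, assuming a hypothetical join entity `Membership` with a `position` order column and a hypothetical owning class `Group`. The JPA annotations are omitted so the snippet compiles on its own. The comparison breaks ties on the id because a TreeSet silently drops elements that compare as equal.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical join entity; in JPA it would map the join table,
// including its order column.
class Membership implements Comparable<Membership> {
    final long id;
    final int position; // value of the order column

    Membership(long id, int position) {
        this.id = id;
        this.position = position;
    }

    @Override
    public int compareTo(Membership other) {
        int byPosition = Integer.compare(position, other.position);
        // Tie-break on id so two rows with the same position are both kept.
        return byPosition != 0 ? byPosition : Long.compare(id, other.id);
    }
}

class Group {
    // The persistence provider may replace this with its own Set implementation.
    private Set<Membership> memberships = new HashSet<>();

    void add(Membership membership) {
        memberships.add(membership);
    }

    // Returns the memberships sorted by position, copying them into a TreeSet
    // only if the current implementation is not already one.
    Set<Membership> getMemberships() {
        if (!(memberships instanceof TreeSet)) {
            memberships = new TreeSet<>(memberships);
        }
        return memberships;
    }
}
```

Swapping out the provider's collection like this may interfere with its lazy loading or change tracking, so it is worth testing against the actual provider before relying on it.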

Check if attribute exists

Is there a better way to check that an attribute value already exists when adding it to the managed context than fetching with a predicate and looking at the number of results? I'm trying to make an attribute unique for a given entity...
I think you may have scrambled your nomenclature. You don't add attributes to context. You add managed objects which are defined by entities which have attributes. You could be asking about two different types of test.
If you're asking whether a means exists of testing if a managed object already exists with the exact same attributes as the one you are planning on inserting, the answer is no. Since entities can be arbitrarily complex, and since it takes literally only one bit of difference to make them logically distinct, there is no means of testing whether two objects are logically identical, i.e. have the same attributes and relationships, without fetching them and testing them.
If you're asking whether you can test for a unique value of an attribute of a particular entity, then you can. First you fetch on a property using [NSFetchRequest setPropertiesToFetch:] and then set your predicate for the sought value. When walking relationships, you can use the Set and Array Operators to find managed objects with unique values.