How to dynamically create index on JSON Object properties (JSON Object props are also dynamic) - hibernate-search

I have a scenario where I want to dynamically create index on keys of JSON Object (JSON Object attributes will vary). I am able to store the JSON Object as index (by implementing FieldBridge).
eg1: preference:{"sport":"football", "music":"pop")
eg2: preference:{"sport":"cricket", "music":"jazz", "cuisine":"mexican"}
But I am unable to query the individual fields like:
preference.sport
or preference.cuisine
Is there any way / configuration in hibernate search through which we can achieve that?

If your fields are dynamic, there is no pre-defined schema and Hibernate Search is unable to determine how to query these fields. There are significant differences in how a match query should be executed on a text field or a date field, for example.
For that reason, you cannot use the Hibernate Search Query DSL to build your queries.
However, you can use native APIs.
If you're using the Lucene integration, just creating the relevant queries yourself will work fine (as long as you create the right one):
new TermQuery(new Term("sport", "value"))
If you're using the experimental Elasticsearch integration, you can use org.hibernate.search.elasticsearch.ElasticsearchQueries.fromJson( ... ). You will have to write the whole query as JSON, though, and will not be able to take advantage of the Hibernate Search QueryBuilder at all, even for queries on statically defined fields. See https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#_queries
Better support for native queries, as well as dynamic fields with pre-defined types, which would be targetable in the Query DSL, is planned for Hibernate Search 6, but it's not there yet. See HSEARCH-3273.

Related

How can I prevent SQL injection with arbitrary JSONB query string provided by an external client?

I have a basic REST service backed by a PostgreSQL database with a table with various columns, one of which is a JSONB column that contains arbitrary data. Clients can store data filling in the fixed columns and provide any JSON as opaque data that is stored in the JSONB column.
I want to allow the client to query the database with constraints on both the fixed columns and the JSONB. It is easy to translate some query parameters like ?field=value and convert that into a parameterized SQL query for the fixed columns, but I want to add an arbitrary JSONB query to the SQL as well.
This JSONB query string could contain SQL injection, how can I prevent this? I think that because the structure of the JSONB data is arbitrary I can't use a parameterized query for this purpose. All the documentation I can find suggests I use parameterized queries, and I can't find any useful information on how to actually sanitize the query string itself, which seems like my only option.
For example a similar question is:
How to prevent SQL Injection in PostgreSQL JSON/JSONB field?
But I can't apply the same solution as I don't know the structure of the JSONB or the query, I can't assume the client wants to query a particular path using a particular operator, the entire JSONB query needs to be freely provided by the client.
I'm using golang, in case there are any existing libraries or code fragments that I can use.
edit: some example queries on the JSONB that the client might do:
(content->>'company') is NULL
(content->>'income')::numeric>80000
content->'company'->>'name'='EA' AND (content->>'income')::numeric>80000
content->'assets'#>'[{"kind":"car"}]'
(content->>'DOB')::TIMESTAMP<'2000-01-30T10:12:18.120Z'::TIMESTAMP
EXISTS (SELECT FROM jsonb_array_elements(content->'assets') asset WHERE (asset->>'value')::numeric > 100000)
Note that these don't cover all possible types of queries. Ideally I want any query that PostgreSQL supports on the JSONB data to be allowed. I just want to check the query to ensure it doesn't contain sql injection. For example, a simplistic and probably inadequate solution would be to not allow any ";" in the query string.
You could allow the users to specify a path within the JSON document, and then parameterize that path within a call to a function like json_extract_path_text. That is, the WHERE clause would look like:
WHERE json_extract_path_text(data, $1) = $2
The path argument is just a string, easily parameterized, which describes the keys to traverse down to the given value, e.g. 'foo.bars[0].name'. The right-hand side of the clause would be parameterized along the same rules as you're using for fixed column filtering.

Mybatis dynamic select query with prevention of 'no column exits' errors

What's the best approch to have a dynamic query like
select $dynamic_columns from table
But also prevent error like column not found and get result with available columns. considering $dynamic_columns is given by end users.
One approach would be to store the schema in java object and filter it. Again if schema is update in DB we will need to update the schema java object cache. is there any better way to handle this?
Be careful with this as it is more vulnerable to SQL injection.
Never let the user type something into a text field, instead build a
list for them to select from.
For building the list, I think the best approach is to use the JDBC method DatabaseMetaData.getColumns(...) to retrieve a list of columns for a table. I don't think there's a need to cache anything.

Discard values while inserting and updating data using slick

I am using slick with play2.
I have multiple fields in the database which are managed by the database. I don't want to create or update them, however I want to get them while reading the values.
For example, suppose I have
case class MappedDummyTable(id: Int, .. 20 other fields, modified_time: Optional[Timestamp])
which maps Dummy in the database. modified_time is managed by the database.
The problem is during insert or update, I create an instance of MappedDummyTable without the modified time attribute and pass it to slick for create/update like
TableQuery[MappedDummyTable].insert(instanceOfMappedDummyTable)
For this, Slick creates query as
Insert INTO MappedDummyTable(id,....,modified_time) Values(1,....,null)
and updates the modified_time as NULL, which I don't want. I want Slick to ignore the fields while updating and creating.
For updating, I can do
TableQuery[MappedDummyTable].map(fieldsToBeUpdated).update(values)
but this leads to 20 odd fields in the map method which looks ugly.
Is there any better way?
Update:
The best solution that I found was using multiple projection. I created one projection to get the values and another to update and insert the data
maybe you need to write some triggers in table if you don't want to write code like row => (row.id,...other 20 fields)
or try use None instead of null?
I believe that the solution with mapping non-default field is the only way to do it with Slick. To make it less ugly you can define function ignoreDefaults on MappedDummyTable that will return only non default value and function in companion object to MappedDummyTable case class that returns projection
TableQuery[MappedDummyTable].map(MappedDummyTable.ignoreDefaults).insert(instanceOfMappedDummyTable.ignoreDefaults)

Spring CRUD repository: is there findOneByMaxXYZColumn()?

My requirement:
fetch ONE object (e.g RetainInfo ) from table RETAIN_INFO if VERSION column has max value
Does CRUD repository support for an interface method like
findOneByMaxRetVersionAndCountry("DEFAULT")
Equivalent db2 sql:
select RET_ID, max(ri.RET_VERSION) from RETAIN_INFO ri where ri. COUNTRY='DEFAULT' group by RET_ID fetch first 1 rows only;
This query selects an ID, but I would actually want the RetainInfo object corresponding the SINGLE row returned by the query.
I prefer to get that without using custom query, i.e using findBy or some other method/interface supported by Spring CRUD.
You could use limiting in combination with sorting (spring data reference:limit query results). Declare a method similar to the following in your CrudRepository interface :
RetainInfo findTopByCountryOrderByRetVersionDesc(String country);
You can also use findFirst to get the first result. Before getting the result, make sure to use Orderby and then the ascending(Asc) or descending(Desc). As an example if you want to order by version and retrieve based on productName
RetainInfo findFirstByProductNameOrderByVersionDesc(String productName);
Spring Data doesn't provide an expression to select a max value. All supported query parts could be found in the Spring 1.2.0.RELEASE docs: Appendix A. Namespace reference or line 182 of org.springframework.data.repository.query.parser.Part.
Also feel free to create a feature request at Spring's Jira page.

Rogue query orderAsc with variable field according to its name

I am using Rogue/Lift Mongo record to query MongoDb. I am trying to create different query according to the sort field name. I have therefore a string name of the field that I want to use to sort the results.
I have tried to use Record.fieldByName in OrderAsc:
...query.orderAsc (elem => elem.fieldByName(columnName).open_!)
but I obtain "no type parameter for orderAsc".
How can I make it working? Honestly all the type programming in Rogue is quite difficult to follow.
Thanks
The problem is that you cannot dynamically generate a query with Rogue easily. As solution I used Lift Mongo Db that allows the usage of strings (without compile checking) for these kind of operations that requires dynamic sorting.