MyBatis dynamic select query with prevention of 'no column exists' errors - mybatis

What's the best approach to have a dynamic query like
select $dynamic_columns from table
But also prevent errors like 'column not found' and still get a result for the columns that do exist, considering that $dynamic_columns is given by end users.
One approach would be to store the schema in a Java object and filter against it. But if the schema is updated in the DB, we would also need to update the cached schema object. Is there any better way to handle this?

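Be careful with this: column names can't be bound as statement parameters, so they are substituted directly into the SQL text, which makes the query vulnerable to SQL injection.
Never let the user type something into a text field; instead, build a list for them to select from.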
For building the list, I think the best approach is to use the JDBC method DatabaseMetaData.getColumns(...) to retrieve a list of columns for a table. I don't think there's a need to cache anything.
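A minimal sketch of that whitelist idea, written in Python with the standard-library sqlite3 module standing in for JDBC (PRAGMA table_info plays the role that DatabaseMetaData.getColumns(...) would play in Java). The table `person`, its columns, and the helper names `table_columns` and `build_select` are assumptions made for illustration only.

```python
import sqlite3


def table_columns(conn, table):
    """Return the column names of `table` as reported by the database itself."""
    # PRAGMA table_info is SQLite-specific; with JDBC the equivalent lookup
    # would be DatabaseMetaData.getColumns(...).
    # `table` must come from trusted code, never from the end user.
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return [row[1] for row in rows]  # row[1] holds the column name


def build_select(conn, table, requested):
    """Build a SELECT that keeps only the requested columns that really exist."""
    known = {name.lower(): name for name in table_columns(conn, table)}
    selected = []
    for col in requested:
        real = known.get(col.lower())
        # Unknown names (including injection attempts) are simply dropped.
        if real is not None and real not in selected:
            selected.append(real)
    if not selected:
        raise ValueError("none of the requested columns exist")
    column_list = ", ".join(f'"{name}"' for name in selected)
    return f'SELECT {column_list} FROM "{table}"'


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER, name TEXT, email TEXT)")
conn.execute("INSERT INTO person VALUES (1, 'Ann', 'ann@example.com')")

requested = ["name", "no_such_column", "id; DROP TABLE person"]
sql = build_select(conn, "person", requested)
rows = conn.execute(sql).fetchall()
```

Because the metadata is read on every call, a schema change in the database is picked up without any cache to refresh; whether that extra round trip is acceptable depends on how often the query runs.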

Related

How can I prevent SQL injection with arbitrary JSONB query string provided by an external client?

I have a basic REST service backed by a PostgreSQL database with a table with various columns, one of which is a JSONB column that contains arbitrary data. Clients can store data filling in the fixed columns and provide any JSON as opaque data that is stored in the JSONB column.
I want to allow the client to query the database with constraints on both the fixed columns and the JSONB. It is easy to translate some query parameters like ?field=value and convert that into a parameterized SQL query for the fixed columns, but I want to add an arbitrary JSONB query to the SQL as well.
This JSONB query string could contain SQL injection. How can I prevent this? I think that because the structure of the JSONB data is arbitrary, I can't use a parameterized query for this purpose. All the documentation I can find suggests using parameterized queries, and I can't find any useful information on how to actually sanitize the query string itself, which seems like my only option.
For example a similar question is:
How to prevent SQL Injection in PostgreSQL JSON/JSONB field?
But I can't apply the same solution as I don't know the structure of the JSONB or the query, I can't assume the client wants to query a particular path using a particular operator, the entire JSONB query needs to be freely provided by the client.
I'm using golang, in case there are any existing libraries or code fragments that I can use.
edit: some example queries on the JSONB that the client might do:
(content->>'company') is NULL
(content->>'income')::numeric>80000
content->'company'->>'name'='EA' AND (content->>'income')::numeric>80000
content->'assets' @> '[{"kind":"car"}]'
(content->>'DOB')::TIMESTAMP<'2000-01-30T10:12:18.120Z'::TIMESTAMP
EXISTS (SELECT FROM jsonb_array_elements(content->'assets') asset WHERE (asset->>'value')::numeric > 100000)
Note that these don't cover all possible types of queries. Ideally I want any query that PostgreSQL supports on the JSONB data to be allowed. I just want to check the query to ensure it doesn't contain sql injection. For example, a simplistic and probably inadequate solution would be to not allow any ";" in the query string.
You could allow the users to specify a path within the JSON document, and then pass that path as a parameter to a function like jsonb_extract_path_text (the JSONB counterpart of json_extract_path_text). That is, the WHERE clause would look like:
WHERE jsonb_extract_path_text(data, VARIADIC $1::text[]) = $2
The path argument is a text array, easily parameterized, which lists the keys and array indexes to traverse down to the given value, e.g. '{foo,bars,0,name}'. If clients send a dotted string such as 'foo.bars[0].name', the application has to split it into that array first. The right-hand side of the clause would be parameterized along the same rules as you're using for fixed column filtering.
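A rough sketch of how the path could be validated and passed purely as a parameter, in Python for brevity (the same shape carries over to Go's database/sql). PostgreSQL's path functions and the equivalent `#>>` operator take the path as a text array, so the sketch first splits a dotted path into its elements. The table name `items`, the helper names, and the `%s` placeholder style (as used by psycopg) are assumptions; no database connection is made here.

```python
import re

# One path step: a plain key, optionally followed by [index] parts.
_STEP = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)((?:\[\d+\])*)")


def parse_path(path):
    """Split 'foo.bars[0].name' into ['foo', 'bars', '0', 'name']."""
    elements = []
    for step in path.split("."):
        match = _STEP.fullmatch(step)
        if match is None:
            raise ValueError(f"invalid path step: {step!r}")
        elements.append(match.group(1))
        elements.extend(re.findall(r"\d+", match.group(2)))
    return elements


def build_filter(path, value):
    """Return (sql, params) for a parameterized JSONB path comparison."""
    # The SQL text is constant; client input travels only as parameters.
    sql = "SELECT * FROM items WHERE content #>> %s::text[] = %s"
    return sql, (parse_path(path), value)


sql, params = build_filter("company.offices[0].city", "Oslo")
```

This only covers equality on a single path. It does not give clients the free-form expressions listed in the question, which is the trade-off of keeping the SQL text fixed.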

Fetching only one field using sorm framework

Is it possible to fetch only one field from the database using the SORM Framework?
What I want in plain SQL would be:
SELECT node_id FROM messages
I can't seem to reproduce this in SORM. I know this might go against how SORM is supposed to work, but right now I have two huge tables with different kinds of messages. I was asked to get all the unique node_ids from both tables.
I know I could just query both tables using sorm and parse through all the data but I would like to put the database to work. Obviously, this would be even better if one can get only unique node_ids in a single db call.
Right now with just querying everything and parsing it, it takes way too long.
There doesn't seem to be ORM support for what you want to do, unless node_id happens to be the primary key of your Message object:
val messages = Db.query[Message].fetchIds()
In this case you shouldn't need to worry about it being UNIQUE, since primary keys are by definition unique. Alternatively, you can run a custom SQL query:
Db.fetchWithSql[Message]("SELECT DISTINCT node_id FROM messages")
Note that the latter might have the wrong type parameter: you'd have to try it against your database. You might need fetchWithSql[Int], or some other variation: it is unclear what SORM does when the primary key hasn't been queried.
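If a raw SQL query is acceptable, the de-duplication across both tables can be left to the database with UNION, which removes duplicate rows. Below is a small standalone sketch using Python's sqlite3 module; the second table name `other_messages` and the sample rows are hypothetical, and how best to run such a query through SORM is not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (id INTEGER PRIMARY KEY, node_id INTEGER, body TEXT);
    CREATE TABLE other_messages (id INTEGER PRIMARY KEY, node_id INTEGER, body TEXT);
    INSERT INTO messages (node_id, body) VALUES (1, 'a'), (2, 'b'), (2, 'c');
    INSERT INTO other_messages (node_id, body) VALUES (2, 'd'), (3, 'e');
""")

# UNION (without ALL) de-duplicates across both SELECTs in a single call.
query = """
    SELECT node_id FROM messages
    UNION
    SELECT node_id FROM other_messages
    ORDER BY node_id
"""
node_ids = [row[0] for row in conn.execute(query)]
```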

Talend Data Integration: Avoid nulls coming out of tExtractXMLField?

I have this simple flow in Talend DI 6 (simplified for posting on SO):
The last step crashes with a NullPointerException, because missing XML attributes are returned as null.
Is there a way to get empty string values instead of nulls?
For now I'm using a tReplace step to remove nulls as a work-around, but it's tedious and adds to the cost of maintenance by creating one more place where the list of attributes needs to be maintained.
In Talend DI 5.6.2 it is possible to add default data values to the schema. The column in the schema is called "Default". If you expect strings, you can set an empty string, which is set if the column value is null:
Talend schema view with Default column
This also works for other data types. Talend DI 6 should still be able to do this, although the field might be renamed.

How to optimize generic SQL to retrieve DDL information

I have a generic code that is used to retrieve DDL information from a Firebird database (FB2.1). It generates SQL code like
SELECT * FROM MyTable where 'c' <> 'c'
I cannot change this code. Actually, if that matters, it is inside Report Builder 10.
The fact is that some tables in my database are becoming a little too populated (>1M records), and that query is starting to take too long to execute.
If I try to execute
SELECT * FROM MyTable where SomeIndexedField = SomeImpossibleValue
it will obviously use that index and run very quickly.
Well, it wouldn't be that hard for the database to figure out that this is an impossible match, optimize accordingly, and avoid testing it against each row.
Is there any way to make my Firebird database optimize that search?
Because the filter condition is a negative proposition (and doesn't refer to a column to search, but only compares one value to another), Firebird needs to do a full table scan (without using any index) to confirm that no records meet your criteria.
If you can't change the query, you will need to wait for the upcoming 3.0 version, which will implement the Boolean data type and should therefore start evaluating such "constant" fake comparisons in advance (maybe the client library will do this evaluation before sending the statement to the server?).

Using table names as parameters in t-sql (eg from #tblname)

Is it possible to use the name of a table as a parameter in t-sql?
I want to insert data into a table, but I want one method in C# which has a parameter for the table.
Is this a good approach? I think if I have one form and I am choosing the table and fields to insert data into, I am essentially looking to write my own dynamic sql query built on the fly. This is another thing altogether which I am sure has its catches?
Thanks
Not directly. The only way to do this is through dynamic SQL - either EXEC or sp_executesql. The latter has the advantage of query plan caching/re-use, and of avoiding injection by passing the values as parameters - but you will have to concatenate the table name itself into the query (you can't parameterise it), so be sure to white-list it against a list of known-good table names.
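The white-listing step could look roughly like the sketch below. It is written in Python with sqlite3 rather than C# and T-SQL so that it runs standalone, but the split is the same: identifiers are checked against a known-good list before being concatenated, and values always go through placeholders. The `ALLOWED_TABLES` mapping, its table and column names, and the `build_insert` helper are hypothetical.

```python
import sqlite3

# Known-good table names and their insertable columns (hypothetical schema).
ALLOWED_TABLES = {
    "customers": {"name", "email"},
    "orders": {"customer_id", "total"},
}


def build_insert(table, values):
    """Return (sql, params) for inserting the dict `values` into `table`."""
    allowed_columns = ALLOWED_TABLES.get(table)
    if allowed_columns is None:
        raise ValueError(f"table not allowed: {table!r}")
    if not values:
        raise ValueError("no values given")
    unknown = set(values) - allowed_columns
    if unknown:
        raise ValueError(f"columns not allowed: {sorted(unknown)}")
    columns = list(values)
    # Identifiers are concatenated only after passing the whitelist;
    # the values themselves always go through placeholders.
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"
    return sql, [values[c] for c in columns]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, email TEXT)")
sql, params = build_insert("customers", {"name": "Ann", "email": "ann@example.com"})
conn.execute(sql, params)
stored = conn.execute("SELECT name, email FROM customers").fetchall()
```

A request for a table outside the mapping, such as `build_insert("customers; DROP TABLE customers", {...})`, raises ValueError before any SQL text is built.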