How to prevent SQL injection on Athena?

I am writing a Spring Java program that accesses data from Athena, but I found that the Athena JDBC driver does not support PreparedStatement. Does anyone have an idea how to avoid SQL injection on Athena?

Update: I originally answered this question in 2018; since then, Athena has added support for query parameters.
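For example, here is a rough sketch of the parameterized API from Python with boto3 (the database, table, and S3 output location are placeholders):

import boto3

athena = boto3.client("athena")

# The ? placeholder is bound server-side from ExecutionParameters,
# so user input is never concatenated into the SQL text.
response = athena.start_query_execution(
    QueryString="SELECT * FROM my_table WHERE user_id = ?",
    ExecutionParameters=["12345"],
    QueryExecutionContext={"Database": "my_database"},
    ResultConfiguration={"OutputLocation": "s3://my-bucket/athena-results/"},
)
print(response["QueryExecutionId"])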
Below is my original answer:
You'll have to format your SQL query as a string before you run it, including variables by string concatenation.
In other words, welcome to PHP programming circa 2005! :-(
This puts the responsibility on you and your application code to ensure the variables are safe, and don't cause SQL injection vulnerabilities.
For example, you can cast variables to numeric data types before you interpolate them into your SQL.
Or, when it's possible to declare a limited set of permitted values, you can create an allowlist. If you accept input, check it against the allowlist; if the input is not in the allowlist, don't use it as part of your SQL statement. A sketch of both approaches follows.
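Here is a minimal sketch in Python (the column names and allowlist values are made up):

ALLOWED_SORT_COLUMNS = {"created_at", "username", "score"}  # the allowlist

def build_query(user_id_input, sort_column_input):
    # Cast to a numeric type first; int() raises ValueError on anything
    # that isn't a number, so malicious text never reaches the SQL string.
    user_id = int(user_id_input)

    # Check free-form input against the allowlist and reject everything else.
    if sort_column_input not in ALLOWED_SORT_COLUMNS:
        raise ValueError("sort column not allowed: %r" % sort_column_input)

    # Only now is it reasonably safe to interpolate into the SQL text.
    return "SELECT * FROM events WHERE user_id = %d ORDER BY %s" % (
        user_id, sort_column_input)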
I recommend you give feedback to the AWS Athena project and ask them when they will provide support for SQL query parameters in their JDBC driver. Email them at athena-feedback@amazon.com
See also this related question: AWS Athena JDBC PreparedStatement

Athena now has support for prepared statements (this was not the case when the question was asked).
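For completeness, here is a sketch of the SQL-level flavor from Python with boto3, run as two separate executions since Athena accepts only one statement per call (the statement, database, workgroup, and bucket names are placeholders):

import boto3

athena = boto3.client("athena")
common = {
    "QueryExecutionContext": {"Database": "my_database"},
    "WorkGroup": "primary",  # prepared statements are scoped to a workgroup
    "ResultConfiguration": {"OutputLocation": "s3://my-bucket/athena-results/"},
}

# First execution: create the prepared statement.
athena.start_query_execution(
    QueryString="PREPARE my_select FROM SELECT * FROM my_table WHERE user_id = ?",
    **common,
)

# Second execution: run it, binding the parameter server-side.
athena.start_query_execution(QueryString="EXECUTE my_select USING 12345", **common)

Note that the values in EXECUTE ... USING are themselves SQL literals, so string values would still need quoting; the ExecutionParameters mechanism shown earlier avoids that.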
That being said, prepared statements aren't the only way to guard against SQL injection attacks in Athena, and SQL injection attacks aren't as serious in Athena as they are in a traditional database.
Athena is just a query engine, not a database. While dropping a table can be disruptive, tables are just metadata, and the data is not dropped along with it.
Athena's API does not allow multiple statements in the same execution, so you can't sneak a DROP TABLE foo into a statement without completely replacing the query.
Athena does not, by design, have any capability of deleting data. Athena has features that can create new data, such as CTAS, but it will refuse to write into an existing location and cannot overwrite existing data.

Related

Is it possible to evaluate a Postgres expression without connecting to a database?

PostgreSQL has excellent support for evaluating JSONPath expressions against JSON data.
For example, this query returns true because the value of the nested field is indeed "foo".
select '{"header": {"nested": "foo"}}'::jsonb #? '$.header ? (#.nested == "foo")'
Notably this query does not reference any schemas or tables. Ideally, I would like to use this functionality of PostgreSQL without creating or connecting to a full database instance. Is it possible to run PostgreSQL in such a way that it doesn't have schemas or tables, but is still able to evaluate "standalone" queries?
Some other context on the project: we need to evaluate JSONPath expressions against JSON data in both a Postgres database and a Python application. Unfortunately, Python does not have any JSONPath libraries that support enough of the spec to be useful to us.
Ideally, I would like to use this functionality of PostgreSQL without creating or connecting to a full database instance.
Well, it is open source. You can always pull out the source code for this functionality you want and adapt it to compile by itself. But that seems like a large and annoying undertaking, and I probably wouldn't do it. And short of that, no.
Why do you need this? Are you worried about scalability or ease of installation or performance or what? If you are already using PostgreSQL anyway, firing up a dummy connection to just fire some queries at the JSONB engine doesn't seem too hard.
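To illustrate, here is a minimal sketch with psycopg2 that evaluates the question's expression without touching any schema or table (the connection string is a placeholder; any reachable, even empty, database will do):

import psycopg2

conn = psycopg2.connect("dbname=postgres")
with conn, conn.cursor() as cur:
    # Both the document and the path are passed as parameters and cast
    # server-side; no table is referenced anywhere.
    cur.execute(
        "SELECT %s::jsonb @? %s::jsonpath",
        ('{"header": {"nested": "foo"}}', '$.header ? (@.nested == "foo")'),
    )
    print(cur.fetchone()[0])  # True
conn.close()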

db2look from SQL

Is it possible to get the table structure like db2look from SQL?
Or is the command line the only way? I could wrap db2look in an external stored procedure written in C and call it that way, but that is not what I am looking for.
Clarification added later:
I want to know, from SQL, which tables have the NOT LOGGED option.
It is possible to recreate the table structure from regular SQL and the public DB2 catalog; however, it is complex and requires some deeper skills.
The metadata is available in the DB2 catalog views in the SYSCAT schema. For a regular table you would first start off by looking into the values in SYSCAT.TABLES and SYSCAT.COLUMNS. From there you would need to branch off to other views depending on what table and column options you are after, whether time-travel tables, special partitioning rules, or many other options are involved.
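As a concrete but hedged example for the NOT LOGGED clarification: SYSCAT.COLUMNS has a LOGGED flag that records the LOGGED/NOT LOGGED option of LOB and XML columns; NOT LOGGED INITIALLY, by contrast, is a transient table state and, to my knowledge, is not persisted in the catalog. A sketch with the ibm_db Python driver (the connection details are placeholders):

import ibm_db

conn = ibm_db.connect(
    "DATABASE=mydb;HOSTNAME=myhost;PORT=50000;UID=myuser;PWD=mypwd;", "", "")

# List columns whose LOGGED option is 'N' (not logged).
stmt = ibm_db.exec_immediate(
    conn,
    "SELECT TABSCHEMA, TABNAME, COLNAME FROM SYSCAT.COLUMNS WHERE LOGGED = 'N'",
)
row = ibm_db.fetch_assoc(stmt)
while row:
    print(row["TABSCHEMA"], row["TABNAME"], row["COLNAME"])
    row = ibm_db.fetch_assoc(stmt)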
Serge Rielau published an article on developerWorks called Backup and restore SQL schemas for DB2 Universal Database that provides a set of stored procedures that will do exactly what you're looking for.
The article is quite old (2006), so you may need to put in some time to update the procedures to handle features added to DB2 since publication, but they may work for you as-is and are a nice jumping-off point.

SQL Azure TEXTPTR is not supported

According to Features that are not supported in SQL Database, TEXTPTR and UPDATETEXT aren't supported in Azure SQL Database. Is there another way to insert/update big text/image fields, so that we can avoid having the entire column contents in memory?
UPDATETEXT has been deprecated since SQL Server 2005. You should use UPDATE ... SET ... .WRITE instead.
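For example, here is a sketch of chunked writes with pyodbc (the DSN, table, and column are made up). Note that .WRITE works on the max types such as varchar(max) and varbinary(max), not on the legacy text/image types, and the target value must already be non-NULL before appending:

import pyodbc

conn = pyodbc.connect("DSN=myazuredb")
cur = conn.cursor()

with open("big_file.bin", "rb") as f:
    while True:
        chunk = f.read(64 * 1024)  # stream 64 KB at a time
        if not chunk:
            break
        # @Offset = NULL appends the chunk to the end of the existing value,
        # so the whole column never has to be held in client memory.
        cur.execute(
            "UPDATE dbo.Documents SET Content.WRITE(?, NULL, NULL) WHERE DocID = ?",
            chunk, 1)
conn.commit()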

Execute function returning data in remote PostgreSQL database from local PostgreSQL database

Postgres version: 9.3.4
I need to execute a function that resides in a remote database. The function returns a table of statistical data based on the parameters given.
I am in effect only mirroring the function in my local database to lock down access to this function using my database roles and grants.
I have found the following, which seem to provide only table-based access:
http://www.postgresql.org/docs/9.3/static/postgres-fdw.html
http://multicorn.org/foreign-data-wrappers/#idsqlalchemy-foreign-data-wrapper
First question: is that correct or are there ways to use these libraries for non-table based operations?
I have found the following, which seems to allow any SQL operation on the foreign database. The downside seems to be increased complexity and reduced performance due to the manual connection and error handling:
http://www.postgresql.org/docs/9.3/static/dblink.html
Second question: are these assumptions correct, and are there any ways to bypass these concerns or libraries/samples one can begin from?
The FDW interface provides a way to make a library that allows a PostgreSQL database to query any external data source as though it were a table. From that point of view, it could do what you want.
The built-in postgres_fdw driver, however, does not allow you to specify a function as a remote table.
You could write your own FDW driver, possibly using the multicorn library or some other language. That is likely to be a fair amount of work, though, and would have some specific disadvantages; in particular, I don't know how you would pass parameters to the function.
dblink is probably going to be the easiest solution. It allows you to execute arbitrary SQL on the remote server, returning a set of records.
-- dblink ships as an extension: CREATE EXTENSION IF NOT EXISTS dblink;
SELECT *
FROM dblink('dbname=mydb', 'SELECT * FROM thefunction(1,2,3)')
  AS t1(col1 INTEGER, col2 INTEGER);  -- the column list must match the function's output
There are other potential solutions but they would all be more effort to set up.

NoSQL Injection in Python

When I was about to write this question, I found this existing one; it uses Java, and the answer gives a Ruby example, and it seems the injection only happens when using JSON? I am preparing a presentation in which I will compare NoSQL and SQL, and I was going to say: be happy, NoSQL has no SQL injection since it's not SQL...
Can you please explain:
how injection happens when using the Python driver (pymongo),
how to avoid it,
and how it compares to old-style SQL injection using a comment in the login form.
There are a couple of concerns with injection in MongoDB:
$where JS injection - Building JavaScript functions from user input can result in a query that behaves differently than you expect. JavaScript functions in general are not a responsible way to write MongoDB queries, and it is highly recommended not to use them unless absolutely needed.
Operator injection - If you allow users to build a $or or similar from the front end, they could easily manipulate this ability to change your queries (see the pymongo sketch after this list). This of course does not apply if you just take data from a set of text fields and build the $or from that data yourself.
JSON injection - Quite a few people recently have been trying to convert a full JSON document sent from some client-side source (I first saw this in Java, ironically) into a document for insertion into MongoDB. I shouldn't need to go into why this is bad. A JSON value for a single field is fine since, of course, MongoDB speaks BSON.
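To make the operator-injection point concrete, here is a sketch in pymongo (the collection and field names are made up):

from pymongo import MongoClient

db = MongoClient().mydb

# An attacker posts {"password": {"$ne": null}} as the JSON request body.
malicious_password = {"$ne": None}

# VULNERABLE: the decoded dict is used as-is, so the query becomes
# {username: "admin", password: {$ne: null}} and matches the admin user.
db.users.find_one({"username": "admin", "password": malicious_password})

# SAFER: force inputs to plain scalars so operator objects are rejected.
def login(username, password):
    if not isinstance(username, str) or not isinstance(password, str):
        raise ValueError("credentials must be plain strings")
    return db.users.find_one({"username": username, "password": password})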
As @Burhan stated, injection comes from unsanitized input. Fortunately for MongoDB, it has object-oriented querying.
The problem with SQL injection comes from the word "SQL". SQL is a querying language built up of strings. MongoDB, on the other hand, uses a BSON document (an object) to specify a query. If you keep to the basic common-sense rules above, you should never have a problem with an attack vector like:
SELECT * FROM tbl_user WHERE name = ''; DROP TABLE tbl_user; --';
Also, MongoDB only supports one operation per command at the moment (without using eval; don't ever do that, though), so that wouldn't work anyway...
I should add that this covers injection only, not data validation.
SQL injection has nothing to do with the database. It is a type of vulnerability that allows for execution of arbitrary SQL commands because the target system does not sanitize the SQL that is given to the SQL server.
It doesn't matter whether you are on NoSQL or not. If you have a system running on MongoDB (or CouchDB, or XYZ DB), you provide a front end where users can enter records, and you don't correctly escape and sanitize the input coming from the front end, then you are open to injection.
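To round out the comparison the question asked for, here is the classic login-form injection sketched in Python with sqlite3 (the table and credentials are made up); the parameter binding shown at the end is the same data-not-code idea that MongoDB's document queries give you by default:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_user (username TEXT, password TEXT)")
conn.execute("INSERT INTO tbl_user VALUES ('admin', 'secret')")

username = "admin'-- "  # the classic comment trick typed into a login form
password = "whatever"

# VULNERABLE: concatenation lets the quote and comment rewrite the query as
# SELECT * FROM tbl_user WHERE username = 'admin'-- ' AND password = '...'
query = "SELECT * FROM tbl_user WHERE username = '%s' AND password = '%s'" % (
    username, password)
print(conn.execute(query).fetchone())  # ('admin', 'secret') - no password needed

# SAFE: parameter binding keeps the input as data, never as SQL text.
safe = conn.execute(
    "SELECT * FROM tbl_user WHERE username = ? AND password = ?",
    (username, password))
print(safe.fetchone())  # None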