Increase security of DB INSERT - sql-injection

Currently, I have a PostGIS DB, and I have the basics working. However, I am inserting directly into the database using pg_query. I have been recommended to use pg_query_params to help prevent against SQL injections, but am unsure how to implement this. Below is a cut-down example of my current insert statement for a site location. How would I, for example, utilise pg_query_params with this example? I know I will have to implement further security, but it is a starting point.
EDIT: I was going to use the drupal form API but it gave me headaches. I realize that would do a lot of this stuff automatically.
$sql = "INSERT INTO sites_tbl (river_id ,sitename ,the_geom) VALUES ('$_POST[river_id]','$_POST[sitename]',st_geomfromtext('POINT($geomstring)',27700))";
$result = pg_query($sql);

Because you are using strings rather than parameters, your example is vulnerable to SQL injection. It's best to avoid pg_ functions. In your case there are two things you need to take into account:
Learn the Drupal API (considering you are using Drupal this would be the best for code consistency
or
Use stored procedures
Use a library like PDO or pg_query_params which takes care of parameterized queries
Normally you use stored procedures in addition to PDO, unfortunately sometimes this is not manageable because you have too much code. My advice is to use as much stored procedures as possible.

Related

ADO.NET escaping SQL without parameterization

I'm trying to prove something to a friend of mine re: escaping SQL strings. What are the recommendations for escaping user-input parts of a SQL Server query when parameterization cannot be used (like when someone saves a chunk of SQL text in the database and it will be used as part of a where clause later)? Basically, it's a way to save pre-canned queries in a single text column. And even if he parameterized the where clause, it could potentially have an arbitrary number of parameters (that would still have to be stored in the database as text). Is there a way to do this that doesn't open up a SQL injection risk?
In this case, you can't really parameterize up front because you don't have a clean way of making sure that parameter names don't clash and the like. Is there some library in System.Data.SqlClient that just escapes strings to make them SQL safe without requiring parameterization? My buddy seems to think this is a thing and I don't, and I'm trying to keep him from stepping on himself. Oh, and to make things more fun, the SQL gets jammed into the database by .NET, but is executed dynamically by SQL Server, so there's no good way to rig it with EF or something like that either. For his approach to work, he'll have to sanitize the SQL some way.

How does data.stackexchange.com allow queries securely?

https://data.stackexchange.com/ lets me query some (all?) of
stackexchange's data/tables using arbitrary SQL queries, including
parametrization.
What program do they use to do this and is it published?
I want to create something like this myself (different data), but am
constantly worried that I'll miss an injection attack or set
permissions incorrectly.
Obviously, data.stackexchange.com has figured out how to do this
securely. How do I replicate what they've done?
This follows up my earlier question: Existing solution to share database data usefully but safely?

When to use t-SQL over the Entity Framework

Could someone tell me if there are any times when it is more advantageous to use t-SQL over the Entity Framework? I'm aware of the N+1 issue, but is there any other gotchas I should be aware of? For instance, do Linq-to-EF queries cache as well as stored procedures? Are there instances where the SQL generated by EF is less than optimal?
Thanks!
Whenever you need to do the work "inside" the DB server and not go back and forth between your code and Server.
Also - when you use stored procedures, you can alter the code without recompiling/deploying, it might be easier on production environments.
IMHO it sometimes easier to code complex SQL statements in T-SQL rather than using LINQ....

Accessing non-public schema in PostgreSQL with Pentaho

Let me start by saying, what I know about Pentaho wouldn't fill up a single paragraph. I'm more knowledgeable about PostgreSQL. I'm working with some contractors that are building a set of monthly reports in Pentaho (v. 4.5) for my company. Some of the data needs to go through a ETL process and get rolled up for reporting purposes. From a dba(ish) point of view, I would like to move these tables into a separate PostgreSQL schema.
I know that Pentaho is often times used with MySQL (which doesn't have schemas) and I'm concerned this might cause problems. I've done some "googlin'" and I don't turn up a lot of hits on the topic, but I did find a closed bug from a few years ago - thus implying that the functionality should be supported.
before I do this, I would like to see if anyone knows of a reason this will fail or be a bad idea. (or if you've done it an it works great, please let me know that, too).
Final notes: I'm using PostgreSQL 9.1.5, and I don't have access to a Pentaho instance to even test this myself. And I'm hoping the good folks in the Stackoverflow community will share their expertise and save me from having to install one and the hours of playing/testing to get an idea of this is a bad idea.
EDIT:
I sort of knew this question was a bit vague, but I was hoping that some one would read it and share any experience they have. So, Let me spell it out more clearly and ask more explicit questions.
I have not done anything. I don't know Pentaho. I don't want to learn Pentaho (not that there is anything wrong with Pentaho... It's just not where my interests are right now). My company hired contractors (I did not hire them). They have experience with Pentaho, but with MySQL. They don't really know anything about PostgreSQL. There are some important difference between PostgreSQL and MySQL. Including the fact that PostgreSQL supports schemas (whereas MySQL uses separate database... similar in concept be behave differently in some ways). Some ORMs (and tools) don't really like this... for example, the Django framework still doesn't really fully support schemas in Postgresql (I know this because I use Python and Django often and my life is much better when I keep things in the "public" schema). Because of my experience with Django and PostgreSQL schemas, I'm a bit leery of moving this data to a new schema.
I do understand that where ever the tables are, they will need permissions to be able to access the data.
My explicit questions:
Do you use Pentaho to access a PostgreSQL database to access tables in schemas other than "public" (the default).
If so, does it just work (no problems)?
If you had problems, would you please be willing to share with me (and the Stackoverflow community) any online resources that helped you? Or would you be willing to detail what you remember here?
Do you know of anything that just won't work correctly? For example, an open bug in Pentaho related to this topic.
Again, it's not your standard kind of question. I'm hoping that someone out there has experience and is willing to share it here and save me from having to spend time setting up a new Pentaho instance and trying to learn Pentaho well enough to test it, etc.
Thanks.
Two paths you can take:
1) What previous post said ("Pentaho steps (table inputs, outputs, etc.) usually allow you to specify a database schema.")
2) In database connection, advanced tab, "The preferred schema name".
If you're working with different schemas, you can create one database connection per schema. With this approach you can leave schema field in input/output steps empty.
We use MS SQL server and I can tell you that Pentaho does struggle with the idea of a schema. Many of their apps allow you to select a schema but Pentaho, like you said, is built to use something like mySQL.
Make you pentaho database user work like it would be working in mySQL.
We made the database user default to dbo then we structured our tables like dbo.dimDimension,
dbo.factFactTable etc. Basically, only use dbo for Pentaho purposes. (Or whatever schema you want to default to.)
I use PDI and PgSQL extensively every day with a bunch of different schemas. It works fine. The only trouble you might run into is Pg's troublesome practice of forcing unquoted identifiers to lower instead of upper case. I soon realized everything was easier when I set the Advanced connection property to "Quote all in database".
Yes, you have to quote everything when you type SQL if PDI doesn't do it for you, but it works quite well. Haven't experimented with forcing all identifiers to lower case, but I expect that would work as well.
And yes, use the "Preferred schema nanme" as well, but be aware that some steps use that option and others don't. You can't, for example, expect it to add schema names to SQL you type into a Table Input step.
The only other issues you might run into are the limits of Pg's JDBC driver. It's not as good as SQL Server's or DB2's, but the only thing I've every had trouble with was sending error rows from a Table Output step to another step when the Table Output step was in batch mode.
Have fun learning PDI. It makes a great complement to your DBA skills.
Brian
Pentaho steps (table inputs, outputs, etc.) usually allow you to specify a database schema.
I did a quick test using PDI and our 8.4 Postgres instance and was able to explore, read from and write to tables in different schemas.
So, I think this is a reasonable direction. Hope this helps.

NoSql Injection in Python

when trying to make this question, i got this one it is using Java, and in the answer it gave a Ruby example, and it seems that the injection happens only when using Json? because i've an expose where i'll try to compare between NoSQL and SQL and i was trying to said: be happy, nosql has no sql injection since it's not sql ...
can you please explain me:
how sql injection happens when using Python driver (pymongo).
how to avoid it.
the comparison using the old way sql injection using the comment in the login form.
There are a couple of concerns with injection in MongoDB:
$where JS injection - Building JavaScript functions from user input can result in a query that can behave differently to what you expect. JavaScript functions in general are not a responsible method to program MongoDB queries and it is highly recommended to not use them unless absolutely needed.
Operator injection - If you allow users to build (from the front) a $or or something they could easily manipulate this ability to change your queries. This of course does not apply if you just take data from a set of text fields and manually build a $or from that data.
JSON injection - Quite a few people recently have been trying to convert a full JSON document sent (saw this first in JAVA, ironically) from some client side source into a document for insertion into MongoDB. I shouldn't need to even go into why this is bad. A JSON value for a field is fine since, of course, MongoDB is BSON.
As #Burhan stated injection comes from none sanitized input. Fortunately for MongoDB it has object orientated querying.
The problem with SQL injection comes from the word "SQL". SQL is a querying language built up of strings. On the other hand MongoDB actually uses a BSON document to specify a query (an Object). If you keep to the basic common sense rules I gave you above you should never have a problem with an attack vector like:
SELECT * FROM tbl_user WHERE ='';DROP TABLE;
Also MongoDB only supports one operation per command atm (without using eval, don't ever do that though) so that wouldn't work anyway...
I should add that this does not apply to data validation only injection.
SQL injection has nothing to do with the database. It is a type of vulnerability that allows for execution of arbitrary SQL commands because the target system does not sanitize the SQL that is given to the SQL server.
It doesn't matter if you are on NoSQL or not. If you have a system running on mongodb (or couchdb, or XYZ db), and you provide a front end where users can enter records - and you don't correctly escape and sanitize the input coming from the front end; you are open to SQL injection.