Prepared statements on Select statements - mysqli

I’m just starting to convert all of my site’s code into prepared statements for that extra security cushion but I find myself running into the same questions.
After some reading, I’ve decided to use prepared statements on all select queries, however I’m not sure if all of variables in these queries require to be used as “parameters” in the prepared statement.
For example:
Where some_column IS NULL
Where some_column = $_SESSION[‘some-session-var’]
Where some_column IN ($someArray)
Also, is there some way to give each condition a “name” rather than using the question mark? I feel like I’ve seen this before in documentation, but I’ve had no luck finding it since.
For example: Where city_name = :cityName. If so, how would I go about binding the parameters here?
Thanks,
Evan

Yes. All data going to the query should be added via placeholders.
Otherwise there will be no security at all.
Though prepared statements extremely limited and support only scalar values, so, your first and third examples require extra coding (examples can be found in plenty under the tag)
Named placeholders you mentioned belongs to PDO, mysqli don't support them

Related

meta: why do I have to specify a group by clause

Just curious why I really have to specify a group by clause since if I use a function that requiers a group by clause(can't remember the general name of those functions), eg. SUM().
Because if I use one of those I have to specify every column that doesn't use one in the group by clause.
Why doesn't sql just automatically group on all columns that isn't using an aggregation function? It seems redundant since as soon as I'm using an aggregation I'm grouping on all other columns that is not using it.
Probably for the same reason a C compiler would not automatically assume and insert a variable declaration if you are using one that has not been previously declared. There are programming languages which do that sort of things, SQL is not one of them.
Editors, on the other hand, may be aware of this and at least auto-complete functionally dependent parts of the syntax for you. Oracle SQL developer will by default automatically append a GROUP BY clause as soon as it detects you're writing a select column list that needs it. IMO this is a pain, and I usually keep it turned off, but it will be as far as you get - on an IDE/editor level.
Edit: Based on your last comment, there is an option in MySQL (not Microsoft's T-SQL) meant to relax the rule by implementing optional feature T301 of the standard SQL99. I think this is exactly what you're after:
MySQL 5.7.5 and up implements detection of functional dependence. If the ONLY_FULL_GROUP_BY SQL mode is enabled (which it is by default), MySQL rejects queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are neither named in the GROUP BY clause nor are functionally dependent on them.
Source: https://dev.mysql.com/doc/refman/5.7/en/group-by-handling.html
Could not find much information on the status of this feature in future versions of T-SQL, though. The only reference is this, with the very cryptic remark that T-SQL would "partially support this feature".

Any one know how to hack mysql_real_escape_string(strip_tags($_GET['query']))?

I have one question about SQL injection, this is using strip_tags() inner mysql_real_escape_string().
have you ever try to inject mysql_real_escape_string(strip_tags()) code? so the full code would like this one
SELECT * FROM table WHERE query='".mysql_real_escape_string(strip_tags())."'";
I want to know if that can be safe, because I don't want to change the current code using PDO if it's safe :)
strip_tags() has absolutely nothing to do neither with sql injections nor with mysql_real_escape_string function, and utterly irrelevant to the question.
So, it adds nothing to sql security.
Hope your confusions now cleared and you are going to use prepared statements from now on.

CHECK CONSTRAINT text is changed by SQL Server. How do I avoid or work around this?

When I define a CHECK CONSTRAINT on a table, I find the condition clause stored can be different than what I entered.
Example:
Alter table T1 add constraint C1 CHECK (field1 in (1,2,3))
Looking at what is stored:
select cc.Definition from sys.check_constraints cc
inner join sys.objects o on o.object_id = cc.parent_object_id
where cc.type = 'C' and cc.name = 'T1';
I see:
([field1]=(3) OR [field1]=(2) OR [field1]=(1))
Whilst these are equivalent, they are not the same text.
(A similar behaviour occurs when using a BETWEEN clause).
My reason for wishing this did not happen is that I am trying to programatically ensure that all my CHECK constraints are correct by comparing the text I would use to define
the constraint with that stored in sys.check_constraints - and if different then drop and recreate the constraint.
However, in these cases, they are always different and so the program would always think it needs to recreate the constraint.
Question is:
Is there any known reason why SQL Server does this translation? Is it just removing a bit of syntactic sugar and storing the clause in a simpler form?
Is there a way to avoid the behaviour (other than to write my constraint clauses in the long form to match what SQL Server would change it to)?
Is there another way to tell if my check constraint is 'out of date' and needs recreating?
Is there any known reason why SQL Server does this translation? Is it just removing a bit of syntactic sugar and storing the clause in a simpler form?
I'm not aware of any reasons documented in the Books Online, or elsewhere. However, my guess is that it's normalized for some purposes that are internal to SQL Server. It might allow SQL Server to be a bit lenient in defining the expression (such as using Database for a column name), but guaranteeing that the column names are always appropriately escaped for whatever engine needs to parse the expression (ie, [Database]).
Is there a way to avoid the behaviour (other than to write my constraint clauses in the long form to match what SQL Server would change it to)?
Probably not. But if your constraints aren't terribly complicated, is re-writing the constraint clauses in the long form such a bad idea?
Is there another way to tell if my check constraint is 'out of date' and needs recreating?
Before I answer this directly, I'd point out that there's a bit of programming philosophy involved here. The API that SQL Server provides for the text of a CHECK constraint only guarantees that you'll get something equivalent to the original expression. While you could certainly build some fancy methods to try to ensure that you'll always be able to reproduce SQL Server's normalized version of the expression, there's no guarantee that Microsoft won't change its normalization rules in the future. And indeed, there's probably no guarantee that two equivalent expressions will always be normalized identically!
So, I'd first advise you to re-examine your architecture, and see if you can accomplish the same result without having to rely on undocumented API behavior.
Having said that, there are a number of methods outlined in this question (and answer).
Another alternative, which is a bit more brute-force but perhaps acceptable, would be to always assume that the expression is "out of date" and simply drop/re-create the constraint every time you check. Unless you're expecting these constraints to frequently become out-of-date (or the tables are quite large), it seems this would be a decent solution. You could probably even run it in a transaction, so that if the new constraint is already violated, simply roll-back the transaction and report the error.

In what scenarios would I need to use the CREATEREF, DEREF and REF keywords?

This question is about why I would use the above keywords. I've found plenty of MSDN pages that explain how. I'm looking for the why.
What query would I be trying to write that means I need them? I ask because the examples I have found appear to be achievable in other ways...
To try and figure it out myself, I created a very simple entity model using the Employee and EmployeePayHistory tables from the AdventureWorks database.
One example I saw online demonstrated something similar to the following Entity SQL:
SELECT VALUE
DEREF(CREATEREF(AdventureWorksEntities3.Employee, row(h.EmployeeID))).HireDate
FROM
AdventureWorksEntities3.EmployeePayHistory as h
This seems to pull back the HireDate without having to specify a join?
Why is this better than the SQL below (that appears to do exactly the same thing)?
SELECT VALUE
h.Employee.HireDate
FROM
AdventureWorksEntities3.EmployeePayHistory as h
Looking at the above two statements, I can't work out what extra the CREATEREF, DEREF bit is adding since I appear to be able to get at what I want without them.
I'm assuming I have just not found the scenarios that demostrate the purpose. I'm assuming there are scenarios where using these keywords is either simpler or is the only way to accomplish the required result.
What I can't find is the scenarios....
Can anyone fill in the gap? I don't need entire sets of SQL. I just need a starting point to play with i.e. a brief description of a scenario or two... I can expand on that myself.
Look at this post
One of the benefits of references is that it can be thought as a ‘lightweight’ entity in which we don’t need to spend resources in creating and maintaining the full entity state/values until it is really necessary. Once you have a ref to an entity, you can dereference it by using DEREF expression or by just invoking a property of the entity
TL;DR - REF/DEREF are similar to C++ pointers. It they are references to persisted entities (not entities which have not be saved to a data source).
Why would you use such a thing?: A reference to an entity uses less memory than having the DEFEF'ed (or expanded; or filled; or instantiated) entity. This may come in handy if you have a bunch of records that have image information and image data (4GB Files stored in the database). If you didn't use a REF, and you pulled back 10 of these entities just to get the image meta-data, then you'd quickly fill up your memory.
I know, I know. It'd be easier just to pull back the metadata in your query, but then you lose the point of what REF is good for :-D

variable table or column names in a function

I'm trying to search all tables and columns in a database, a la here. The suggested technique is to construct SQL query strings and then EXEC them. This works well, as a stored procedure. (Another example of variable table/column names is here. Again, EXEC is used to execute "dynamic SQL".)
However, my app requires that I do this in a function, not an SP. (Our development framework has trouble obtaining results from an SP.) But in a function, at least on SQL Server 2008 R2, you can't use EXEC; I get this error:
Invalid use of a side-effecting operator 'INSERT EXEC' within a function.
According to the answer to this post, apparently by a Microsoft developer, this is by design; it has nothing to do with the INSERT, only the fact that when you execute dynamically-constructed SQL code, the parser cannot guarantee a lack of side effects. Therefore it won't allow you to create such a function.
So... is there any way to iterate over many tables/columns within a function?
I see from BOL that
The following statements are valid in a function: ...
EXECUTE
statements calling extended stored procedures.
Huh - How could extended SP's be guaranteed side-effect free?
But that doesn't help me anyway:
The extended stored procedure, when it is called from inside a
function, cannot return result sets to the client. Any ODS APIs that
return result sets to the client will return FAIL. The extended stored
procedure could connect back to an instance of SQL Server; however, it
should not try to join the same transaction as the function that
invoked the extended stored procedure.
Since we need the function to return the results of the search, an ESP won't help.
I don't really want to get into extended SP's anyway: incrementing the number of programming languages in the environment would complicate our development environment more than it's worth.
I can think of a few solutions right now, none of which is very satisfactory:
First call an SP that produces the needed data and puts it in a table, then select from the function which merely reads the result from the table; this could be trouble if the search takes a while and two users' searches overlap. Or,
Have the application (not the function) generate a long query naming every table and column name from the db. I wonder if the JDBC driver can handle a query that long. Or,
Have the application (not the function) generate a long series of short queries naming every table and column name from the db. This will make the overall search a lot slower.
Thanks for any suggestions.
P.S. Upon further searching, I stumbled across this question which is closely related. It has no answers.
Update: No longer needed
I think this question is still valid, and we may again have a situation where we need it. However, I don't need an answer anymore for the present problem. After much trial-and-error I managed to get our application framework to retrieve row results from the RDBMS via the JDBC driver from the stored procedure. Therefore getting the thing to work as a function is unnecessary.
But if anyone posts an answer here that helps with the stated problem, I will be happy to upvote and/or accept it as appropriate.
An sp is basically a predefined sql statment with some add ons.
So if you had
PSEUDOCODE
Create SP_DoSomething As
Select * From MyTable
END
And you can't use the SP
Then you just execute the SQL as in "Select * From MyTable"
As for that naff sql code.
For start you could join table to column with a where clause, which would get rid of that line by line if stuff.
Ask another question. Like How could this be improved, there's lots of scope for more attempts than mine.