Azure Data Factory Expression Builder string formatting Error: unrecognised token (new line) - azure-data-factory

I have a Mapping Data Flow, where I want to use a custom SQL query for the Source, but I cannot break it on multiple lines, I get an error stating:
token recognition error at: ''
If I remove the newline and put the whole query on a single line it works, but it looks bearly readable. I would like to preserve the query formatting.
Does anyone have an idea how to do this?
LE the same happens with a simple statement like
select
1
This is how it looks in ADF:

You can enter a query directly into the SQL Query box as multi-lines (see image). You only need to use the expression builder or dynamic content if you are going to use expressions or parameters.

You cannot use CTEs in ADF data flow source queries, that's the issue.

Related

Substring of column name in Copy Activity in ADF v2

Is there a way in the V2 Copy Activity to operate upon one of the input columns (of type string) with an expression? Before I load rows to the destination, I need to limit the number of characters in the column.
My hope was to simply switch from something like this:
"ColumnMappings": "inColumn: outColumn"
to something like this:
"ColumnMappings": "#substring(inColumn, 1, 300): outColumn"
If anyone can point me to where I can read-up on where & when string expressions can be used, I could use the guidance.
This is the official documentation on expressions and functions: https://learn.microsoft.com/en-us/azure/data-factory/control-flow-expression-language-functions
And this is the documentation on mappings: https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-schema-and-type-mapping
Also remember that if you are using a defined query in the copy activity, you can use sql functions like CAST([fieldName] as varchar(300)) to limit the amount of characters on a particular field.
Hope this helped!
When you don't have a SQL Source, but your destination is a SQL sink, you can use a Stored Procedure to insert your data into the final table. That way, you can define these kinds of transformations in the stored procedure. I don't think the Data Factory can handle these kinds of activities, it is more intended as an orchestrator.
Have a look here:
https://learn.microsoft.com/en-us/azure/data-factory/connector-sql-server#invoke-stored-procedure-from-sql-sink

PHP and sanitizing strings for use in dynamicly created DB2 queries

I'm relatively new to DB2 for IBMi and am wondering the methods of how to properly cleanse data for a dynamically generated query in PHP.
For example if writing a PHP class which handles all database interactions one would have to pass table names and such, some of which cannot be passed in using db2_bind_param(). Does db2_prepare() cleanse the structured query on its own? Or is it possible a malformed query can be "executed" within a db2_prepare() call? I know there is db2_execute() but the db is doing something in db2_prepare() and I'm not sure what (just syntax validation?).
I know if the passed values are in no way effected by the result of user input there shouldn't be much of an issue, but if one wanted to cleanse data before using it in a query (without using db2_prepare()/db2_execute()) what is the checklist for db2? The only thing I can find is to escape single quotes by prefixing them with another single quote. Is that really all there is to watch out for?
There is no magic "cleansing" happening when you call db2_prepare() -- it will simply attempt to compile the string you pass as a single SQL statement. If it is not a valid DB2 SQL statement, the error will be returned. Same with db2_exec(), only it will do in one call what db2_prepare() and db2_execute() do separately.
EDIT (to address further questions from the OP).
Execution of every SQL statement has three stages:
Compilation (or preparation), when the statement is parsed, syntactically and semantically analyzed, the user's privileges are determined, and the statement execution plan is created.
Parameter binding -- an optional step that is only necessary when the statement contains parameter markers. At this stage each parameter data type is verified to match what the statement text expects based on the preparation.
Execution proper, when the query plan generated at step 1 is performed by the database engine, optionally using the parameter (variable) values provided at step 2. The statement results, if any, are then returned to the client.
db2_prepare(), db2_bind_param(), and db2_execute() correspond to steps 1, 2 and 3 respectively. db2_exec() combines steps 1 and 3, skipping step 2 and assuming the absence of parameter markers.
Now, speaking about parameter safety, the binding step ensures that the supplied parameter values correspond to the expected data type constraints. For example, in the query containing something like ...WHERE MyIntCol = ?, if I attempt to bind a character value to that parameter it will generate an error.
If instead I were to use db2_exec() and compose a statement like so:
$stmt = "SELECT * FROM MyTab WHERE MyIntCol=" . $parm
I could easily pass something like "0 or 1=1" as the value of $parm, which would produce a perfectly valid SQL statement that only then will be successfully parsed, prepared and executed by db2_exec().

Is there any logical reason to use CFQUERYPARAM in Query of Queries?

I primarily use CFQUERYPARAM to prevent SQL injection. Since Query-of-Queries (QoQ) does not touch the database, is there any logical reason to use CFQUERYPARAM in them? I know that values that do not match the cfsqltype and maxlength will throw an exception, but, these values should already be validated before that and display friendly messages (from a UX viewpoint).
Since Query-of-Queries (QoQ) does not touch the database, is there any logical reason to use CFQUERYPARAM in them? Actually, it does touch the database, the database that you currently have stored in memory. The data in that database could still theoretically be tampered with via some sort of injection from the user. Does that affect your physical database - no. Does that affect the use of the data within your application - yes.
You did not give any specific details but I would err on the side of caution. If ANY of the data you are using to build your query comes from the client then use cfqueryparam in them. If you can guarantee that none of the elements in your query comes from the client then I think it would be okay to not use the cfqueryparam.
As an aside, using cfqueryparam also helps optimize the query for the database although I'm not sure if that is true for query of queries. It also escapes characters for you like apostrophes.
Here is a situation where it's simpler, in my opinion.
<cfquery name="NoVisit" dbtype="query">
select chart_no, patient_name, treatment_date, pr, BillingCompareField
from BillingData
where BillingCompareField not in
(<cfqueryparam cfsqltype="cf_sql_varchar"
value="#ValueList(FinalData.FinalCompareField)#" list="yes">)
</cfquery>
The alternative would be to use QuotedValueList. However, if anything in that value list contained an apostrophe, cfqueryparam will escape it. Otherwise I would have to.
Edit starts here
Here is another example where not using query parameters causes an error.
QueryAddRow(x,2);
QuerySetCell(x,"dt",CreateDate(2001,1,1),1);
QuerySetCell(x,"dt",CreateDate(2001,1,11),2);
</cfscript>
<cfquery name="y" dbtype="query">
select * from x
<!---
where dt in (<cfqueryparam cfsqltype="cf_sql_date" value="#ValueList(x.dt)#" list="yes">)
--->
where dt in (#ValueList(x.dt)#)
</cfquery>
The code as written throws this error:
Query Of Queries runtime error.
Comparison exception while executing IN.
Unsupported Type Comparison Exception:
The IN operator does not support comparison between the following types:
Left hand side expression type = "DATE".
Right hand side expression type = "LONG".
With the query parameter, commented out above, the code executes successfully.

SQL injection? CHAR(45,120,49,45,81,45)

I just saw this come up in our request logs. What were they trying to achieve?
The full request string is:
properties?page=2side1111111111111 UNION SELECT CHAR(45,120,49,45,81,45),CHAR(45,120,50,45,81,45),CHAR(45,120,51,45,81,45),CHAR(45,120,52,45,81,45),CHAR(45,120,53,45,81,45),CHAR(45,120,54,45,81,45),CHAR(45,120,55,45,81,45),CHAR(45,120,56,45,81,45),CHAR(45,120,57,45,81,45),CHAR(45,120,49,48,45,81,45),CHAR(45,120,49,49,45,81,45),CHAR(45,120,49,50,45,81,45),CHAR(45,120,49,51,45,81,45),CHAR(45,120,49,52,45,81,45),CHAR(45,120,49,53,45,81,45),CHAR(45,120,49,54,45,81,45) -- /*
Edit: As a google search didn't return anything useful I wanted to ask the question for people who encounter the same thing.
This is just a test for injection. If an attacker can see xQs in the output then they'll know injection is possible.
There is no "risk" from this particular query.
A developer should pay no attention to whatever injection mechanisms, formats or meanings - these are none of his business.
There is only one cause for for all the infinite number of injections - an improperly formatted query. As long as your queries are properly formatted then SQL injections are not possible. Focus on your queries rather than methods of SQL injection.
The Char() function interprets each value as an integer and returns a string based on given the characters by the code values of those integers. With Char(), NULL values are skipped. The function is used within Microsoft SQL Server, Sybase, and MySQL, while CHR() is used by RDBMSs.
SQL's Char() function comes in handy when (for example) addslashes() for PHP is used as a precautionary measure within the SQL query. Using Char() removes the need of quotation marks within the injected query.
An example of some PHP code vulnerable to an SQL injection using Char() would look similar to the following:
$uname = addslashes( $_GET['id'] );
$query = 'SELECT username FROM users WHERE id = ' . $id;
While addslashes() has been used, the script fails properly sanitize the input as there is no trailing quotation mark. This could be exploited using the following SQL injection string to load the /etc/passwd file:
Source: http://hakipedia.com/index.php/SQL_Injection#Char.28.29

Parameterized SQL Columns?

I have some code which utilizes parameterized queries to prevent against injection, but I also need to be able to dynamically construct the query regardless of the structure of the table. What is the proper way to do this?
Here's an example, say I have a table with columns Name, Address, Telephone. I have a web page where I run Show Columns and populate a select drop-down with them as options.
Next, I have a textbox called Search. This textbox is used as the parameter.
Currently my code looks something like this:
result = pquery('SELECT * FROM contacts WHERE `' + escape(column) + '`=?', search);
I get an icky feeling from it though. The reason I'm using parameterized queries is to avoid using escape. Also, escape is likely not designed for escaping column names.
How can I make sure this works the way I intend?
Edit:
The reason I require dynamic queries is that the schema is user-configurable, and I will not be around to fix anything hard-coded.
Instead of passing the column names, just pass an identifier that you code will translate to a column name using a hardcoded table. This means you don't need to worry about malicious data being passed, since all the data is either translated legally, or is known to be invalid. Psudoish code:
#columns = qw/Name Address Telephone/;
if ($columns[$param]) {
$query = "select * from contacts where $columns[$param] = ?";
} else {
die "Invalid column!";
}
run_sql($query, $search);
The trick is to be confident in your escaping and validating routines. I use my own SQL escape function that is overloaded for literals of different types. Nowhere do I insert expressions (as opposed to quoted literal values) directly from user input.
Still, it can be done, I recommend a separate — and strict — function for validating the column name. Allow it to accept only a single identifier, something like
/^\w[\w\d_]*$/
You'll have to rely on assumptions you can make about your own column names.
I use ADO.NET and the use of SQL Commands and SQLParameters to those commands which take care of the Escape problem. So if you are in a Microsoft-tool environment as well, I can say that I use this very sucesfully to build dynamic SQL and yet protect my parameters
best of luck
Make the column based on the results of another query to a table that enumerates the possible schema values. In that second query you can hardcode the select to the column name that is used to define the schema. if no rows are returned then the entered column is invalid.
In standard SQL, you enclose delimited identifiers in double quotes. This means that:
SELECT * FROM "SomeTable" WHERE "SomeColumn" = ?
will select from a table called SomeTable with the shown capitalization (not a case-converted version of the name), and will apply a condition to a column called SomeColumn with the shown capitalization.
Of itself, that's not very helpful, but...if you can apply the escape() technique with double quotes to the names entered via your web form, then you can build up your query reasonably confidently.
Of course, you said you wanted to avoid using escape - and indeed you don't have to use it on the parameters where you provide the ? place-holders. But where you are putting user-provided data into the query, you need to protect yourself from malicious people.
Different DBMS have different ways of providing delimited identifiers. MS SQL Server, for instance, seems to use square brackets [SomeTable] instead of double quotes.
Column names in some databases can contain spaces, which mean you'd have to quote the column name, but if your database contains no such columns, just run the column name through a regular expression or some sort of check before splicing into the SQL:
if ( $column !~ /^\w+$/ ) {
die "Bad column name [$column]";
}