Airflow parameters to Postgres Operator - amazon-redshift

I am trying to pass the execution date as a runtime parameter to the Postgres operator:
class MyPostgresOperator(PostgresOperator):
    template_fields = ('sql', 'parameters')

task = MyPostgresOperator(
    task_id='test_date',
    postgres_conn_id='redshift',
    sql="test_file.sql",
    parameters={'crunch_date': '{{ ds }}'},
    dag=dag
)
Then I try to use this parameter in the SQL query to accept the value passed by the DAG:
select
{{ crunch_date }} as test1,
The DAG sends the parameter correctly; however, the query just takes a null value instead of the execution date that was passed. Is there a way to get the PostgresOperator with Redshift to accept the correct value for this parameter?

You will have to update your sql query as below:
select
{{ ds }} as test1,
You won't be able to use one templated field inside another. If you want to pass a param to a task and use it in a Jinja template, use the params parameter.
UPDATE:
But do note that params is not a templated field, and if you template it, it won't render, as nested templating won't work.
task = MyPostgresOperator(
    task_id='test_date',
    postgres_conn_id='redshift',
    sql="test_file.sql",
    params={'textstring': 'abc'},
    dag=dag
)
where test_file.sql is:
select
{{ params.textstring }} as test1,
Check the 4th point in https://medium.com/datareply/airflow-lesser-known-tips-tricks-and-best-practises-cf4d4a90f8f to understand more about params.

You can use the Airflow macros inside the query string that is passed to Redshift.
Example:
PostgresOperator(
    task_id="run_on_redshift",
    dag=dag,
    postgres_conn_id=REDSHIFT_CONN_ID,
    sql="""
        UNLOAD ('select * from abc.xyz')
        TO 's3://path/{{ ds }}/'
        iam_role 's3_iam_role'
        DELIMITER AS '^' ALLOWOVERWRITE ADDQUOTES ESCAPE HEADER PARALLEL OFF;
    """
)
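Alternatively, since the MyPostgresOperator subclass in the question already adds parameters to template_fields, the rendered value can be bound as a query parameter instead of a Jinja variable. A sketch of the idea, assuming the hook hands parameters through to the psycopg2 cursor (which is what PostgresHook.run does):

task = MyPostgresOperator(
    task_id='test_date',
    postgres_conn_id='redshift',
    sql="test_file.sql",
    parameters={'crunch_date': '{{ ds }}'},
    dag=dag
)

with test_file.sql using the psycopg2 named-placeholder style rather than Jinja:

select
%(crunch_date)s as test1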

Related

Azure Data Factory - Lookup Activity

I'm calling a procedure using a lookup activity in Azure Data Factory.
NOTE: The reason to use Lookup here is that I wanted to store the OUTPUT parameter value from the procedure into a variable in ADF for future use.
Below works:
DECLARE @ADFOutputMsg [VARCHAR](500);
EXEC Test.spAsRunTVA @ReportDate = '2022-06-01', @OutputMsg = @ADFOutputMsg OUTPUT;
SELECT @ADFOutputMsg As OutputMsg;
But when I want to pass dynamic parameters, it doesn't like this:
DECLARE @ADFOutputMsg [VARCHAR](500);
EXEC @{pipeline().parameters.SchemaName}.spAsRunTVA @ReportDate = @{substring(pipeline().parameters.FileName,8,10)}, @OutputMsg = @ADFOutputMsg OUTPUT;
SELECT @ADFOutputMsg As OutputMsg;
I also tried keeping the date as-is and only updating SchemaName to be dynamic, but I still get the error.
Please provide single quotes ' ' around your dynamic content:
'@{substring(pipeline().parameters.FileName,8,10)}'
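Applied to the procedure call from the question, that means wrapping the substring expression in single quotes (a sketch, keeping the question's names):

DECLARE @ADFOutputMsg [VARCHAR](500);
EXEC @{pipeline().parameters.SchemaName}.spAsRunTVA @ReportDate = '@{substring(pipeline().parameters.FileName,8,10)}', @OutputMsg = @ADFOutputMsg OUTPUT;
SELECT @ADFOutputMsg As OutputMsg;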
I tried to reproduce a similar approach in my environment and got the results below.
Use the below dynamic content in the query of the Lookup activity, with the dynamic content wrapped in single quotes ' ':
select * from for_date where date1='@{string(pipeline().parameters.Date)}'
With the Date parameter added at the pipeline level, the query returned the expected output.

using concat in ADF with a pipeline parameter value

I have a pipeline with a copy activity from storage.
I'm using the concat method to combine a number of parameters to create the folder path in the storage.
I have a wildcardFolderPath field which gets its data from the parameters file.
Part of the data is a literal string and the rest is a pipeline parameter:
"wildcardFolderPath": {
"value": "[concat(parameters('folderPath'), '/', parameters('folderTime')]",
"type": "Expression"
}
When the pipeline runs, the string param folderPath is retrieved as-is, but the value of folderTime is not evaluated: I see the literal text formatDateTime(pipeline().parameters.currentScheduleDateTime) instead of the datetime string.
I also tried using:
@concat(parameters('folderPath'), '/', parameters('folderTime'))
and
@{concat(parameters('folderPath'), '/', parameters('folderTime'))}
but I get: The workflow parameter 'folderPath' is not found.
Has anyone encountered such an issue?
Create a parameter at the pipeline level and reference it in the expression builder with the following syntax:
@pipeline().parameters.parametername
Example:
You can add the parameter inside Add dynamic content if it hasn't been created before, and select the created parameters to build an expression:
@concat(pipeline().parameters.Folderpath, '/', pipeline().parameters.Filedate)
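Applied to the question, the dataset property would then look something like the sketch below (parameter names follow the expression above). Note that a formatDateTime call has to sit inside the expression itself; an expression stored as a parameter's value arrives as a plain string and is never evaluated:

"wildcardFolderPath": {
    "value": "@concat(pipeline().parameters.Folderpath, '/', pipeline().parameters.Filedate)",
    "type": "Expression"
}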

Is there a way to use User Activity Variables to store SQL in Datastage

I am considering using RCP to run a generic DataStage job, but the initial SQL changes each time it's called. Is there a process by which I can use a User Activity Variable to inject SQL from a text file or something, so I can use the same DataStage job?
I know this Routine can read a file to look up parameters:
Routine = 'ReadFile'
vFileName = Arg1
vArray = ''
vCounter = 0
OPENSEQ vFileName TO vFileHandle
Else Call DSLogFatal("Error opening file list: ":vFileName, Routine)
Loop
While READSEQ vLine FROM vFileHandle
vCounter = vCounter + 1
vArray<vCounter,1> = Fields(vLine,',',1)
vArray<vCounter,2> = Fields(vLine,',',2)
vArray<vCounter,3> = Fields(vLine,',',3)
Repeat
CLOSESEQ vFileHandle
Ans = vArray
Return Ans
But does that mean I just store the SQL on one single line, even if it's long?
Thanks.
Why not just have the SQL within the routine itself and propagate parameters?
I have multiple queries within a single routine that do just that (one for the source and one for the AfterSQL statement).
This is an example and apologies I'm answering this on my mobile!
InputCol=Trim(pTableName)
If InputCol='Table1' then column='Day'
If InputCol='Table2' then column='Quarter, Day'
SQLCode = ' Select Year, Month, '
SQLCode := column:", Time, "
SQLCode := " to_date(current_timestamp, 'YYYY-MM-DD HH24:MI:SS'), "
SQLCode := \ "This is example text as output" \
SQLCode := "From DATE_TABLE"
crt SQLCode
I've used multiple encapsulations in the example above; when passing out to a parameter, make sure you check that the ' and " have either been escaped or are displaying correctly.
Again, apologies for the quality but I hope it gives you some ideas!
You can give this a try:
As you mentioned, maintain the SQL in a file (again, if the SQL keeps changing, you need to build logic to automate populating the new SQL).
In the DataStage Sequencer, use an Execute Command activity to open the SQL file,
e.g.: cat /home/bk/query.sql
In the Job Activity which calls your generic job, you should map the command output of your Execute Command activity to a job parameter,
so if the Execute Command activity's name is exec_query, then the job parameter will be
exec_query.$CommandOutput
When you run the sequence, your query will flow from
SQL file --> Execute Command activity --> parameter in Job Activity --> DB stage (query parameterised)
Have you thought about invoking a shell script from the sequence job that connects to the database and executes the SQL script? You could use sqlplus in the shell script to connect, read the file with the SQL, and run it. To execute the shell script from the sequence job, use an ExecCommand stage (sh, ./, ...); it depends on the interpreter.
Another way to solve this depends on how much your SQL changes: you could invoke a BASIC routine that handles the parameters and invokes your parallel job.
The main problem I think you could have is the limit on the length of the variable where you store the parameter.
Tell me which option you choose and I can help you more.
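A minimal sketch of such a wrapper, assuming an Oracle source with sqlplus on the path; the credentials, TNS alias and file path are placeholders:

#!/bin/sh
# Run the SQL kept in an external file against the database.
# -s suppresses the sqlplus banner so only the query output is returned.
sqlplus -s "$DB_USER/$DB_PASS@$DB_TNS" @/home/bk/query.sql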

Specifying a DB table dynamically? Is it possible?

I am writing a BSP and, based on user input, I need to select data from different DB tables. These tables are in different packages. Is it possible to specify the table I want to use based on its path, like this:
data: path1 type string value 'package1/DbTableName',
      path2 type string value 'package2/OtherDbTableName',
      table_to_use type string.

if <some condition>.
  table_to_use = path1.
elseif <some condition>.
  table_to_use = path2.
endif.

select *
  from table_to_use
  ...
endselect.
I am new to ABAP & Open SQL and am aware this could be an easy/silly question :) Any help at all would be very much appreciated!
You can define the name of the table to use in a variable, and then use the variable in the FROM clause of your query:
data tableName type tabname.
if <some condition>.
  tableName = 'PA0001'.
else.
  tableName = 'PA0002'.
endif.
select * from (tableName) where ...
There are a few limitations to this method: the table cannot contain fields of type RAWSTRING, STRING or SSTRING.
As for the fact that the tables are in different packages, I don't think it matters.
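If you also need an internal table whose structure matches whichever table was chosen at runtime, here is a sketch using a dynamically created data object (the table name and the omitted WHERE clause are illustrative):

DATA: tableName TYPE tabname,
      dref      TYPE REF TO data.
FIELD-SYMBOLS: <itab> TYPE STANDARD TABLE.

tableName = 'PA0001'.
* Create an internal table typed after the chosen DB table at runtime.
CREATE DATA dref TYPE STANDARD TABLE OF (tableName).
ASSIGN dref->* TO <itab>.

SELECT * FROM (tableName) INTO TABLE <itab>.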
Regards,

Replacing contents of xml attribute in xml column in DB2

I have an XML document saved in a DB2 table in a column of XML datatype, and I want to update the value of a node. I tried this:
XQUERY replace value of node db2-fn:sqlquery('select my_xml_column from myTable where someId = someValue')/some/xpath/with/@attribute with "foobar"
(and I tried several variants, everything that Google hinted could do the job).
But unfortunately I am just getting error messages, like this one:
SQL16002N An XQuery expression has an unexpected token "value" following "replace ". Expected tokens may include: "
What am I doing wrong?
As it turns out, the standalone XQUERY statement can't modify stored data in DB2, so the update has to be driven from SQL, with a transform expression inside XMLQUERY:
update myTable SET myXmlColumn = XMLQUERY('
    transform copy $copy := $original
    modify do replace value of node $copy/some/xpath/with/@attribute with "FOOBAR"
    return $copy
    '
    PASSING myXmlColumn AS "original"
) WHERE someId = someValue
This works and has the desired effect. I had hoped somebody would come up with a pure XQuery solution, but the problem is solved...
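To verify the change, the attribute can be read back with a plain, read-only XQuery; wrapping the node in data() lets the attribute value be serialized:

SELECT XMLQUERY('data($d/some/xpath/with/@attribute)' PASSING myXmlColumn AS "d")
FROM myTable WHERE someId = someValue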