Is there a way to use User Activity Variables to store SQL in DataStage

I am considering using RCP to run a generic DataStage job, but the initial SQL changes each time it's called. Is there a process in which I can use a User Activity Variable to inject SQL from a text file or something, so I can reuse the same DataStage job?
I know this routine can read a file to look up parameters:
Routine = 'ReadFile'
vFileName = Arg1
vArray = ''
vCounter = 0
OPENSEQ vFileName TO vFileHandle Else
   Call DSLogFatal('Error opening file list: ' : vFileName, Routine)
End
Loop
While READSEQ vLine FROM vFileHandle
   vCounter = vCounter + 1
   * Keep each comma-separated field of each line in the dynamic array
   vArray<vCounter, 1> = Field(vLine, ',', 1)
   vArray<vCounter, 2> = Field(vLine, ',', 2)
   vArray<vCounter, 3> = Field(vLine, ',', 3)
Repeat
CLOSESEQ vFileHandle
Ans = vArray
Return Ans
But does that mean I'd have to store the SQL on one single line, even if it's long?
Thanks.

Why not just have the SQL within the routine itself and propagate parameters?
I have multiple queries within a single routine that does just that (one for the source and one for the AfterSQL statement).
This is an example; apologies, I'm answering this on my mobile!
InputCol = Trim(pTableName)
If InputCol = 'Table1' Then column = 'Day'
If InputCol = 'Table2' Then column = 'Quarter, Day'
SQLCode = 'Select Year, Month, '
SQLCode := column : ', Time, '
SQLCode := " to_date(current_timestamp, 'YYYY-MM-DD HH24:MI:SS'), "
SQLCode := \"This is example text as output"\
SQLCode := ' From DATE_TABLE'
Crt SQLCode
I've used multiple encapsulations in the example above; when passing the value out to a parameter, make sure you check that the ' and " characters have either been escaped or display correctly.
Again, apologies for the quality but I hope it gives you some ideas!

You can give this a try:
As you mentioned, maintain the SQL in a file (again, if the SQL keeps changing, you need to build logic to automate populating the new SQL).
In the DataStage sequencer, use an Execute Command activity to open the SQL file,
e.g.: cat /home/bk/query.sql
In the Job Activity that calls your generic job, map the command output of your Execute Command activity to a job parameter;
so if the Execute Command activity's name is exec_query, the job parameter will be
exec_query.$CommandOutput
When you run the sequence, your query will flow from
SQL file --> Execute Command activity --> parameter in Job Activity --> DB stage (parameterised query)

Have you thought about invoking a shell script that connects to the database and executes the SQL script, from the sequence job? You could use sqlplus in the shell script to connect, read the file with the SQL, and execute it. To execute the shell script from the sequence job, use an Execute Command activity (sh, ./, ...); it depends on the interpreter.
Another way to solve this, depending on how much your SQL changes, is to invoke a BASIC routine that handles the parameters and invokes your parallel job.
The main problem I think you could have is the limit on the length of the variable where you store the parameter.
Tell me which option you choose and I can help you further.
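For illustration, here is a minimal sketch of that shell-script idea, written in Python rather than sh; the connect string, SQL file path, and sqlplus availability are all assumptions:

import subprocess

# Hypothetical connect string and SQL file -- replace with your own.
CONNECT = "scott/tiger@ORCLPDB1"
SQL_FILE = "/home/bk/query.sql"

# Read the SQL from the file, exactly as the shell script would.
with open(SQL_FILE) as f:
    sql = f.read()

# Feed the SQL to sqlplus on stdin; -S suppresses the banner.
result = subprocess.run(
    ["sqlplus", "-S", CONNECT],
    input=sql + "\nexit;\n",
    capture_output=True,
    text=True,
)
print(result.stdout)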

Related

Azure Data Factory - Lookup Activity

I'm calling a procedure using a lookup activity in Azure Data Factory.
NOTE: The reason to use Lookup here is that I wanted to store the OUTPUT parameter value from the procedure into a variable in ADF for future use.
Below works,
DECLARE @ADFOutputMsg [VARCHAR](500);
EXEC Test.spAsRunTVA @ReportDate = '2022-06-01', @OutputMsg = @ADFOutputMsg OUTPUT;
SELECT @ADFOutputMsg As OutputMsg;
But when I want to pass dynamic parameters, it doesn't like it:
DECLARE @ADFOutputMsg [VARCHAR](500);
EXEC @{pipeline().parameters.SchemaName}.spAsRunTVA @ReportDate = @{substring(pipeline().parameters.FileName,8,10)}, @OutputMsg = @ADFOutputMsg OUTPUT;
SELECT @ADFOutputMsg As OutputMsg;
I also tried to keep the date As-Is and just updated SchemaName to be dynamic but I still get the error.
Please put single quotes ' ' around your dynamic content:
'@{substring(pipeline().parameters.FileName,8,10)}'
I tried to reproduce a similar kind of approach in my environment and got the results below.
Use the below dynamic content in the query of the Lookup activity, with the dynamic content wrapped in single quotes ' ':
select * from for_date where date1='@{string(pipeline().parameters.Date)}'
I added the Date parameter and got the expected output.

Creating a job with an SSIS step using T-SQL

I would like to create a SQL Server job using a stored procedure and I can't seem to get it right.
Integration Services Catalogs -> SSISDB -> Cat1 -> Projects -> 999 -> Packages -> 999.dtsx
In the Step 1 properties of the script below, on the Package tab, "Server:" and "Package:" are empty; I need to populate these as well as set 32-bit to true.
Below is what I've got, thanks in advance.
DECLARE @JobId UNIQUEIDENTIFIER
EXECUTE msdb..sp_add_job @job_name = 'Job 1', @owner_login_name = SUSER_NAME(), @job_id = @JobId OUTPUT
EXECUTE msdb..sp_add_jobserver @job_id = @JobId, @server_name = @@SERVERNAME
EXECUTE msdb..sp_add_jobstep @job_id = @JobId, @step_name = 'Step1', @database_name = DB_NAME(), @on_success_action = 3, @subsystem = N'ssis'
, @command = N' "\SSISDB\Cat1\999\999.dtsx" @SERVER=N"@ServerName"'
EXECUTE msdb..sp_add_jobstep @job_id = @JobId, @step_name = 'Step2', @command = 'execute msdb..sp_delete_job @job_name=''Job 1'''
EXECUTE msdb..sp_start_job @job_id = @JobId
If anyone else comes across a similar situation: the easiest way to figure out how to create a job programmatically is to create it using the UI (SQL Server Agent -> New Job). Create everything you want to see, save it, then right-click the job, Script Job As -> CREATE To -> New query, and SQL Server will export the job as a query so you can see what you need to do.
While we wait for clarification on the existing syntax, the two arguments to msdb..sp_add_jobstep that you need to be concerned with are @subsystem and @command.
, @subsystem = N'SSIS'
, @command = N'/ISSERVER "\"\SSISDB\POC\SSISConfigMixAndMatch\Package.dtsx\"" /SERVER "\".\dev2014\"" /X86 /Par "\"$ServerOption::LOGGING_LEVEL(Int16)\"";1 /Par "\"$ServerOption::SYNCHRONIZED(Boolean)\"";True /CALLERINFO SQLAGENT /REPORTING E'
The GUI will build out these options happily but you can read the dtexec documentation and come to the same script.
/ISSERVER This specifies that we're using the fancy new execution engine built into the SSISDB
We pass in the package we want to execute to this option
/SERVER where will these packages be found
Specify the server name and optional instance
/X86 As the fine documentation notes, this option only works for invocation from SQL Agent but this is how you specify you need to use the 32 bit dtexec.exe
/Par Specify parameter values as needed
Indicates our standard, Basic, level of logging
The next instance of /Par specifies whether the caller should wait for the process to complete (synchronous versus asynchronous process). Yes, the job steps should wait for the process to complete.
/Reporting What information should be reported. This is odd because the useful information you used to get in an SQL Agent job report is no longer there. It will just say Consult the SSISDB reports for more information
E, report Errors only.
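For illustration only, a rough sketch (in Python, run outside SQL Agent) of assembling and launching the same dtexec invocation; the package and server paths are the ones from the example above, dtexec must be on the PATH, and the exact quoting dtexec expects may differ:

import subprocess

# Arguments mirror the job-step command above; /X86 is omitted because,
# as noted, that switch only applies when invoked from SQL Agent.
args = [
    "dtexec",
    "/ISSERVER", r"\SSISDB\POC\SSISConfigMixAndMatch\Package.dtsx",
    "/SERVER", r".\dev2014",
    "/Par", "$ServerOption::LOGGING_LEVEL(Int16);1",
    "/Par", "$ServerOption::SYNCHRONIZED(Boolean);True",
    "/CALLERINFO", "SQLAGENT",
    "/REPORTING", "E",
]
completed = subprocess.run(args, capture_output=True, text=True)
print(completed.stdout)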

MVS OS-390 - How do I Capture Job Information from CA-JOBTRAC programmatically

I am using REXX to invoke JOBTRAC programmatically, which works; however, I am unable to pass JOBNAME arguments using this approach. Can this be done using REXX?
The idea is to find the history of the job run using the program JOBTRAC. We use JOBTRAC's schedule to find the history of when job runs happened. We invoke JOBTRAC using
'TSO JOBTRAC' and supply the history command 'H XXXXXX' on the command line (XXXXXX = jobname).
I was thinking to route the JOBTRAC info to a flat file and parse it so that I can do some reporting in real time during the job run. The above problem is also linked to the following scenario:
If I give 'DSLIST A.B.C.*' in the ISPF panel
It gives the series of datasets ...
A.B.C.A,
A.B.C.D,
A.B.C.E
When I give
"SAVE ORANGE"
it stores this list under
MYUSERID.ORANGE.DATASETS.
I know this can be automated programmatically and I have seen it done, but I don't have the code base to do that right now. This is very similar to the JOBTRAC issue I have.
Here is some REXX code to help with understanding. I know this code is wrong... we cannot use OUTTRAP for this, as it is used to capture console output.
say 'No. of month end jobs considered for history: ' jobnames.0
if jobnames.0 > 0 then do
  do i = 1 to jobnames.0
    say jobnames.i
    jobname = Word(jobnames.i, 1)
    say 'jobname under consideration is ' jobname
    tsocmd = "JOBTRAC;ADDLOC=000;H " || strip(jobname)
    say 'tso command is ' tsocmd
    y = outtrap('jobdetails.')
    Address TSO tsocmd        /* wrong ... I believe I have to use ISPEXEC */
    say 'job details are ' jobdetails.6
  end
end

SQLAlchemy, Psycopg2 and Postgresql COPY

It looks like Psycopg has a custom command for executing a COPY:
psycopg2 COPY using cursor.copy_from() freezes with large inputs
Is there a way to access this functionality from within SQLAlchemy?
The accepted answer is correct, but if you want more than just EoghanM's comment to go on, the following worked for me in COPYing a table out to CSV...
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

eng = create_engine("postgresql://user:pwd@host:5432/db")
ses = sessionmaker(bind=eng)  # not needed for the COPY itself; see note below
dbcopy_f = open('/tmp/some_table_copy.csv', 'wb')
copy_sql = 'COPY some_table TO STDOUT WITH CSV HEADER'
fake_conn = eng.raw_connection()
fake_cur = fake_conn.cursor()
fake_cur.copy_expert(copy_sql, dbcopy_f)
The sessionmaker isn't necessary, but if you're in the habit of creating the engine and the session at the same time, then to use raw_connection you'll need to separate them (unless there is some way to access the engine through the session object that I don't know of). The SQL string provided to copy_expert is also not the only way to do it; there is a basic copy_to function that you can use with a subset of the parameters that you could pass to a normal COPY TO query. Overall performance of the command seems fast to me, copying out a table of ~20,000 rows.
http://initd.org/psycopg/docs/cursor.html#cursor.copy_to
http://docs.sqlalchemy.org/en/latest/core/connections.html#sqlalchemy.engine.Engine.raw_connection
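For reference, a minimal sketch of the simpler copy_to() mentioned above, reusing the eng engine and some_table from the example (the output path is hypothetical):

# Same idea with copy_to(); no SQL string needed.
conn = eng.raw_connection()
cur = conn.cursor()
with open('/tmp/some_table_copy.tsv', 'w') as f:
    # copy_to(file, table, sep=..., null=..., columns=...) mirrors a plain COPY TO
    cur.copy_to(f, 'some_table', sep='\t')
conn.close()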
If your engine is configured with a psycopg2 connection string (which is the default, so either "postgresql://..." or "postgresql+psycopg2://..."), you can create a psycopg2 cursor from an SQLAlchemy session using
cursor = session.connection().connection.cursor()
which you can use to execute
cursor.copy_from(...)
The cursor will be active in the same transaction as your session currently is. If a commit or rollback happens, any further use of the cursor will throw a psycopg2.InterfaceError; you would have to create a new one.
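A short usage sketch of that approach; the table name and input file are hypothetical:

# Assumes an existing SQLAlchemy session as described above.
cursor = session.connection().connection.cursor()
with open('/tmp/rows.tsv') as f:
    # copy_from reads tab-separated rows from the file into the table
    cursor.copy_from(f, 'some_table', sep='\t')
session.commit()  # commits the same transaction the cursor ran in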
You can use:
import cStringIO  # Python 2; on Python 3 use io.StringIO

def to_sql(engine, df, table, if_exists='fail', sep='\t', encoding='utf8'):
    # Create the table from the DataFrame's schema (no rows)
    df[:0].to_sql(table, engine, if_exists=if_exists)
    # Prepare data: dump the DataFrame to an in-memory CSV buffer
    output = cStringIO.StringIO()
    df.to_csv(output, sep=sep, header=False, encoding=encoding)
    output.seek(0)
    # Insert data via COPY on the raw psycopg2 connection
    connection = engine.raw_connection()
    cursor = connection.cursor()
    cursor.copy_from(output, table, sep=sep, null='')
    connection.commit()
    cursor.close()
I insert 200000 lines in 5 seconds instead of 4 minutes
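A hypothetical usage sketch of the function above, assuming a pandas DataFrame and a Postgres engine:

import pandas as pd
from sqlalchemy import create_engine

# Hypothetical engine and data
engine = create_engine('postgresql+psycopg2://user:pwd@localhost/db')
df = pd.DataFrame({'a': range(3), 'b': list('xyz')})

# Create the table and bulk-load the rows with COPY
to_sql(engine, df, 'my_table', if_exists='replace')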
It doesn't look like it.
You may have to just use psycopg2 to expose this functionality and forgo the ORM capabilities. I guess I don't really see the benefit of an ORM for such an operation anyway, since it's a straight bulk insert and dealing with individual objects à la an ORM would not really make a whole lot of sense.
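For what it's worth, a minimal psycopg2-only sketch of such a bulk COPY; the DSN, table, and input file are hypothetical:

import psycopg2

# Hypothetical DSN, table, and tab-separated input file
conn = psycopg2.connect("dbname=db user=me password=pw host=localhost")
cur = conn.cursor()
with open('/tmp/bulk.tsv') as f:
    cur.copy_from(f, 'some_table', sep='\t')
conn.commit()
conn.close()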
If you're starting from SQLAlchemy, you need to first get to the connection engine (also known by the property name bind on some SQLAlchemy objects):
engine = create_engine('postgresql+psycopg2://myuser:password@localhost/mydb')
# or
engine = session.get_bind()
# or any other way you know to get to the engine
From the engine you can isolate a psycopg2 connection:
# get a psycopg2 connection
connection = engine.connect().connection
# get a cursor on that connection
cursor = connection.cursor()
Here are some templates for the COPY statement to use with cursor.copy_expert(), a more complete and flexible option than copy_from() or copy_to(), as indicated here: https://www.psycopg.org/docs/cursor.html#cursor.copy_expert.
# to dump to a file
dump_to = """
COPY mytable
TO STDOUT
WITH (
FORMAT CSV,
DELIMITER ',',
HEADER
);
"""
# to copy from a file:
copy_from = """
COPY mytable
FROM STDIN
WITH (
FORMAT CSV,
DELIMITER ',',
HEADER
);
"""
Check out what the options above mean, and others that may be of interest to your specific situation, at https://www.postgresql.org/docs/current/static/sql-copy.html.
IMPORTANT NOTE: The documentation for cursor.copy_expert() linked above says to use STDOUT to write out to a file and STDIN to copy from a file. But if you look at the syntax in the PostgreSQL manual, you'll notice that you can also specify the file to write to or read from directly in the COPY statement. Don't do that; you're likely just wasting your time if you're not running as root (and who runs Python as root during development?). Just do what's indicated in psycopg2's docs and specify STDIN or STDOUT in your statement with cursor.copy_expert(); it should be fine.
# running the copy statement
with open('/path/to/your/data/file.csv') as f:
    cursor.copy_expert(copy_from, file=f)
# don't forget to commit the changes.
connection.commit()
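Similarly, a short sketch using the dump_to template above to write the table out; the output path is hypothetical:

# dump the table using the dump_to template above
with open('/tmp/mytable_dump.csv', 'w') as f:
    cursor.copy_expert(dump_to, file=f)
# no commit needed for a read-only COPY TO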
You don't need to drop down to psycopg2, use raw_connection, or a cursor.
Just execute the SQL as usual; you can even use bind parameters with text():
engine.execute(
    text("copy some_table from :csv delimiter ',' csv").execution_options(autocommit=True),
    csv='/tmp/a.csv',
)
You can drop the execution_options(autocommit=True) if this PR is accepted.

Using the SSIS Execute SQL Task with sp_send_dbmail

I have an SSIS package which loops through a number of people and then attaches a set of links to reports as attachments.
This all works fine with the Send Mail task until I hit the 4,000 character limit :(
So I am trying to get this to work with the Execute SQL task, using sp_send_dbmail.
I am trying something simple first but I cannot get it to work
Parameter: User::strPersonName
Direction: Input
Data Type: VarChar
Size: -1
SQL Statement =
DECLARE @bodytext AS VARCHAR(200)
SET @bodytext = 'Good Morning' + ?
EXEC msdb.dbo.sp_send_dbmail
@profile_name = 'Shoop',
@recipients = 'moonbase@hatstand.com',
@subject = '1',
@body = @bodytext
I am getting the generic "result set not properly set up" error.
Any ideas? :(
Instead of using sp_send_dbmail, you could still use the Send Mail task from SSIS. The 4,000 character limit that I believe you are talking about, when populating your message, is on the expression, not on the variable itself. If you are using a MessageSourceType of Variable in the Send Mail task, you can use a Script task to build your message body (allowing you to create a string larger than 4,000 characters).
Edit: Since the problem is with your attachments, I see one thing that could be a problem with the SQL for your Execute SQL task: according to http://technet.microsoft.com/en-us/library/ms140355.aspx, you should use "?" as the parameter marker for ADO connections and "@" for ADO.NET, but it seems like you're using both.
As another alternative, here's a blog on how to use the script task and .NET to send your email. http://www.mssqltips.com/sqlservertip/1753/sending-html-formatted-email-in-sql-server-using-the-ssis-script-task/
You shouldn't need a result set, but if you do, make sure it is set correctly on the General tab of the Execute SQL task. From my understanding you are just passing in parameter data and not returning anything, so I think you have it set to Single Row instead of None.