How to add Alter session statement in DataStage globally instead of touching each job and writing in before sql statement? #ibm-datastage - datastage

I have a requirement where I need to add alter session statement in Before sql statement for each job. It is impossible to touch each and every job and enter this considering the time constraint and number of job involved.
Is there a global variable or place where we could add this directly in Data Stage instead of touching each and every jon?
Any suggestions will be helpful.
Regards,
Raj

It is not impossible; it is only tedious. The answer to your direct question is "no". You could export the entire project and perform a search-and-replace exercise; XML format might be easier for this exercise.

Related

AZURE DATA FACTORY - Can I set a variable from within a CopyData task or by using the output?

I have simple pipeline that has a Copy activity to populate a table. That task is based on a query and will only ever return 1 row.
The problem I am having is that I want to reuse the value from one of the columns (batch number) to set a variable so that at the end of the pipeline I can use a Stored Procedure to log that the batch was processed. I would rather avoid running the query a second time in a lookup task so can I make use of the data already being returned?
I have tried duplicating the column in the Copy activity and then mapping that to something like #BatchNo but that fails and have even tried to add a Set Variable task but can't figure out how to take a single column #{activity('Populate Aleprstw').output} does not error but not sure what that will actually do in this case.
Thanks and sorry if its a silly question.
Cheers
Mark
I always do it like this:
Generate a batch number (usually with a proc)
Use a lookup to grab it into a variable
Use the batch number in all activities (might be multiple copes, procs etc.)
Write the batch completion
From your description it seems you have the batch embedded in the data copy from the start which is not typical.
If you must do it this way, is there really an issue with running a lookup again?
Copy activity doesn't return data like that, so you won't be able to capture the results that way. With this design, running the query again in a Lookup is the best option.
Is the query in the Source running on the same Server as the Sink? If so, you could collapse the entire operation into a Stored Procedure that returns the data point you are trying to capture.

Deploying DB2 user define functions in sequence of dependency

We have about 200 user define functions in DB2. These UDF are generated by datastudio into a single script file.
When we create a new DB, we need to run the script file several times because some UDF are dependent on other UDF and cannot be create until the precedent functions are created first.
Is there a way to generate a script file so that the order they are deployed take into account this dependency. Or is there some other technique to arrange the order efficiently?
Many thanks in advance.
That problem should only happen if the setting of auto_reval is not correct. See "Creating and maintaining database objects" for details.
Db2 allows to create objects in an "unsorted" order. Only when the object is used (accessed), the objects and its depending objects are checked. The behavior was introduced a long time ago. Only some old, migrated databases keep auto_reval=disabled. Some environments might set it based on some configuration scripts.
if you still run into issues, try setting auto_reval=DEFERRED_FORCE.
The db2look system command can generate DDL by by object creation time with the -ct option, so that can help if you don't want to use the auto_reval method.

Best way to track the progress of a long-running function (from outside) - PostgreSQL 11?

What is the best way to track progress of a
long-running function in PostgreSQL 11?
Since every function executes in a single transaction, even if the function writes to some "log" table no other session/transaction can see this output unless the function completes with SUCCESS.
I read about some attempts here but they are from 2010.
https://www.endpointdev.com/blog/2010/04/viewing-postgres-function-progress-from/
Also, this approach looks terribly inconvenient.
As of today what is the best way to track progress?
One approach that I know... is to turn the func to a procedure and then do partial commits in the SP. But what if I want to return some result set from the func... In that case I cannot turn it into a SP, right? So... how to proceed in that case?
Many thanks in advance.
NOTE: The function is written in PL/pgSQL, the most common procedural SQL language available in PostgreSQL.
I don't know that there's a great way to do it built-in to postgres yet, but there are a couple ways to achieve logging that will be visible outside of a function.
You can use the pg_background extension to run an insert in the background that will be visible outside of the function. This requires compiling and installing this extension.
Use dblink to connect to the same database and insert data. This will most likely require setting up some permissions.
Neither option is ideal, but hopefully one can work for you. Converting your function to a procedure may also work, but you won't be able to call the procedure from within a transaction.

Is it possible to prevent the SQL Producer from overwriting just one of the tables columns?

Scenario: A computed property needs to available for RAW methods. The IsComputed property set in the model will not work as its value will not be available to RAW methods.
Attempted Solution: Create a computed column directly on the SQL table as opposed to setting the IsComputed property in the model. Specify that CodefluentEntities not overwrite the computed column. I would than expect the BOM to read the computed SQL field no differently than if it was a normal database field.
Problem: I can't figure out how to prevent Codefluent Entities from overwriting the computed column. I attempted to use the production flags as well as setting produce="false" for the property in the .cfp. Neither worked.
Question: Is it possible to prevent Codefluent Entities from overwriting my computed column and if so, how?
The solution youre looking for is here
You can execute whatever custom T-SQL scripts you like, the only premise is to give the script a specific name so the Producer knows when to execute it.
i.e. if you want your custom script to execute after the tables are generated, name your script
after_[ProjectName]_tables.
Save your custom t-sql file alongside the codefluent generated files and build the project.
In my specific case, i had to enable full-text index in one of my table columns, i wrote the SQL script for the functionality, saved it as
`after_[ProjectName]_relations_add`
Heres how they look in my file directory
file directory
Alternate Solution: An alternate solution is to execute the following the TSQL script after the SQL Producer finishes generating.
ALTER TABLE PunchCard DROP COLUMN PunchCard_CompanyCodeCalculated
GO
ALTER TABLE PunchCard
ADD PunchCard_CompanyCodeCalculated AS CASE
WHEN PunchCard_CompanyCodeAdjusted IS NOT NULL THEN PunchCard_CompanyCodeAdjusted
ELSE PunchCard_CompanyCode
END
GO
Additional Configuration Needed to Make Solution Work: In order for this solution to work one must also configure the BOM so that it does not attempt to save the data associated with the computed columns. This can be done through Model using the advanced properties. In my case I selected the CompanyCodeCalculated property. Went to advanced settings. And set the Save setting to False.
Question: Somewhere in the Knowledge Center there is a passing reference on how to automate the execution SQL Scripts after the SQL Producer finishes but I can not find it. Anybody now how this is done?
Post Usage Comments: Just wanted to let people know I implemented this approach and am so far happy with the results.

PostgreSQL transaction variables

This question is sort of a follow up to this question, but it's different enough of a topic that I feel like it merits it's own discussion. For a bit of background, you can refer to it.
As a part of a new file importing system, I am building an audit system based on this wiki page. But, one of the things that I would like to include in the audit trail is the file name of the file that the data came from (these files are archived for long term storage so if there are questions, I can always go back).
One way I could go it is to create a import_batch record and record the name of the file there and then just stamp records when they update. Which is the path that I'm going down. But, it feels a bit clunky in a way. I'm been pondering the idea of trying to have the audit trigger be able to get the import_batch_id without it having to be in the NEW.* record. It seems like to me there are at least a couple of ways I might be able to accomplish this.
I could have a function that could create a temp table and store any information in it that I want (such as batch # or file name or whatever). This seem pretty clean and as I understand it would only live for the duration of the transaction. And as I understand it, it wouldn't have to worry about naming collisions. Each transaction would have a temp file named "tmp_import_info".
If I only care about the import_batch_id (which has a seq), I could probably just get the current value of the sequencer. I'm not a 100% sure how this would behave in a multi-user setting. I would think it would be possible for trans#1 to create import_batch_id #222 and then trans#2 to start and get #223. And then my audit trail would record the wrong data.
Are there other options that I'm not seeing here? Is there a way to add a transaction/session variable? Basically, something like pg_settings (but, that does allow for inserts, updates and deletes of values).
It feels like the best option might be the temp table.
The main good news for variant 2. is - quoting the manual here:
currval
Return the value most recently obtained by nextval for this sequence in the current session. (An error is reported if nextval has never been called for this sequence in this session.) Because this is returning a session-local value, it gives a predictable answer whether or not other sessions have executed nextval since the current session did.
Store your import file names in a table with a serial primary key. You can refer to your last value from the sequence with currval or lastval. Concurrent users cannot interfere. As long as you don't foil this path inside your own transaction yourself, this is safe.