DataStage - Loop through the file to read email IDs and send email

I have to read an input file to get the email ID of each employee and send each employee an email.
How can I do this using a DataStage job?
The file looks like this:
PERSON_ID|FName|LName|Email_ID

DataStage itself offers a Notification stage, but it is only available at the sequence level.
Since your information is in the data stream of a job, you could use a Wrapped Stage to send the mail from within the job.
A Wrapped Stage lets you call an OS command for each row in your stream; sendmail or a similar utility could be used to send the mails you want.
I implemented this recently. The Wrapped Stage is tricky, so I recommend using it in a very simple way: have it call bash (or any other shell), prepare the mail command upfront, and simply send it to that stage.
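For illustration, here is a minimal sketch of the kind of per-row helper such a wrapped command could call. It is only a sketch under assumptions: the script name, the fixed sender address, and the local SMTP relay on localhost are placeholders for whatever your environment provides, and nothing in it is DataStage-specific.

#!/usr/bin/env python3
# send_row_mail.py - hypothetical per-row helper; assumes a local SMTP relay on localhost.
import sys
import smtplib
from email.message import EmailMessage

def send_for_row(row):
    # Row layout from the question: PERSON_ID|FName|LName|Email_ID
    person_id, fname, lname, email_id = row.rstrip("\n").split("|")
    msg = EmailMessage()
    msg["From"] = "noreply@example.com"   # assumption: fixed sender address
    msg["To"] = email_id
    msg["Subject"] = "Notification for {} {}".format(fname, lname)
    msg.set_content("Hello {}, this mail was generated for person {}.".format(fname, person_id))
    with smtplib.SMTP("localhost") as smtp:
        smtp.send_message(msg)

if __name__ == "__main__":
    for line in sys.stdin:   # the wrapped command receives one row per line
        if line.strip() and not line.startswith("PERSON_ID"):   # skip the header row
            send_for_row(line)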

There are some more options.
The first is using the Wrapped Stage, as Michael mentioned. Another is writing a Parallel Routine to use in an ordinary parallel Transformer, which is quite similar.
The simplest way I know of to send an email per row is to use a server routine in a Transformer.
The drawback is that server routines are deprecated, and it is not yet clear how well they will migrate to future versions of DataStage (CP4D).
Keep that in mind before going this route.
In each project you should have a folder Routines/Built-In/Utilities containing the server routines DSSendMailAttachmentTester and DSSendMailTester. These were originally meant to be run from the Routine Editor just to test whether the backend is actually able to send mail.
But you can use them in a Transformer as well, as long as it's a BASIC Transformer. That means you can either write a server job using all the old-school stages (which is probably not what you want) or use the BASIC Transformer in a parallel job. (Follow the link on how to enable it.) It gives you access to BASIC transforms and functions.
I suggest copying the mentioned server routines to create your own custom routine and modifying it to your needs.

Related

Best practices for Informatica Webservice workflow

I have created an Informatica web service workflow which takes one parameter as input. A Web Services Provider source definition is used for this, and the mapping is a one-way type.
The workflow works fine when the parameter is passed. But when the same workflow is triggered from Informatica PowerCenter directly (in which case no parameters are passed), the mapping that contains the Web Services Provider source definition takes 3 minutes to complete (it gives a timeout-based commit point in the log).
Is it a good practice to run the web service workflow from PowerCenter directly? And is there a way to improve its performance when triggered from PowerCenter directly?
Note: I am trying to use one workflow for both cases: 1) passing the parameter from the web, and 2) scheduling the workflow in Informatica.
Answers to your questions below.
Is it a good practice to run the webservice workflow from power center directly?
Of course it depends on the requirement: whether or not you need to extract data from the web service automatically. If you pass the parameter using some session, then I don't see much of an issue here, and your session completes within a reasonable time.
So you can create a new session/command task/shell script to generate a parameter file and then use it in the original session so the value is passed on to the web service.
In a complex scenario you may have to pass multiple values; in such a case I would recommend using a parent workflow that calls the original workflow multiple times and changes the parameter before each call.
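As an illustration of the "generate a parameter file" step, here is a minimal sketch. The folder, workflow, session, and parameter names are placeholders to replace with your own; the layout (a scoped [Folder.WF:Workflow.ST:Session] header followed by $$ parameter assignments) is the usual PowerCenter parameter-file format.

#!/usr/bin/env python3
# make_param_file.py - hypothetical helper run from a command task before the session.
import sys

def write_param_file(path, value):
    # Scoped header followed by the parameter assignment.
    lines = [
        "[MyFolder.WF:wf_call_webservice.ST:s_m_call_webservice]",
        "$$INPUT_PARAM=" + value,
        "",
    ]
    with open(path, "w") as fh:
        fh.write("\n".join(lines))

if __name__ == "__main__":
    # Usage: make_param_file.py /infa/params/wf_call_webservice.par SOME_VALUE
    write_param_file(sys.argv[1], sys.argv[2])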
Is there a way to improve its performance when triggered from power center directly?
It really depends on a few factors.
The web service: make sure you are using the correct input and output columns. Most of the time web services are sensitive to outside calls, and you need to choose the right columns to extract data with good performance. You can work with the web service admin to identify the correct columns.
If the Informatica flow is complex, then depending on which transformations are the bottleneck (source, target, expression, lookup, aggregator, sorter), you can investigate and take action.
For a lookup, you can add a filter to exclude unwanted data, remove unwanted columns, etc.
For an aggregator, you can place a sorter before it to improve performance.
...and so on.

Need a repeatable method for SSIS package OnError Event Handler

All the SSIS packages here at work have an identical OnError event handler, and I'm looking for a way to avoid creating the same handler for every single package. The event handler first queries a table for a list of email addresses and then sends an email to the list of recipients, including in the body the package name, package error, error date and time, etc. The Execute SQL query and the email task are literally identical in every package's event handler. Is there some way to modularize this routine? Perhaps by calling another package that handles it all? I want to eliminate (or nearly eliminate) chances for developers to make a mistake while creating, recreating, and recreating yet again this identical process. The way it's done now, it will be a miserable task to make a simple change to our error-handling process in all our packages.
After extensive searching, I think I've settled on my best options: a custom SSIS task, a child package, or a stored procedure (in an Execute SQL task). I don't know how to make a custom task, so I'm going to opt for the child package and pass various execution state variables (like error description, error number, package name, package start time, etc.) into the child package as parameters. I assume I could make a custom task that only needs to be dropped into the error handler to work properly, but I don't have time to learn how.

Can Tableau return non-UI results programmatically?

Tableau is an excellent tool for visualizing data. However, it is designed to be the final stop in a data (ETL) pipeline.
My Tableau workbook uses a bunch of Table Calcs to generate a list of "recommended orders". Rather than view these, I want to automate and execute them. This would make Tableau the engine of a quasi-ML process.
In other words, I would like to make Tableau a part of my ETL pipeline and send data to another tier. How can I write a back-end program that executes my Tableau workbook and receives a results dataset?
See the end of this article for example data I want to automate:
http://robm26.blogspot.com/2015/10/keep-your-factory-humming-with-tableau.html
Any ideas?
You're not going to like the answer I'm going to give you: "Don't do this".
Tableau isn't meant to be a task in a larger ETL pipeline, and the reason you're having trouble making it behave the way you want is that it isn't designed to be used that way.
Above and beyond the fact that you've figured out how to get the result you want in Tableau ("the work is done"), Tableau isn't offering you any real value in the scenario you're describing. Use a tool (like Alteryx) that is really purpose-built for this sort of work.
The above answer is correct that tabcmd is the way to pull the data out. We use a function in Python to generate the tabcmd requests so that they can be batched.
import subprocess

run_tabcmd = 'yes'  # module-level switch: set to anything else to dry-run (print commands only)

def runTabCmd(cmd):
    # run the tableau command and display its output
    print(cmd)
    if run_tabcmd == 'yes':
        p = subprocess.Popen(
            cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        for line in p.stdout.readlines():
            print(line.decode().rstrip())
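For example, assuming a server URL, credentials, and workbook/view name that are only placeholders here, the function can be fed the usual tabcmd login and export calls:

runTabCmd('tabcmd login -s https://tableau.example.com -u svc_account -p "****"')
runTabCmd('tabcmd export "RecommendedOrders/SummaryView" --csv -f /tmp/recommended_orders.csv')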
You probably already knew that, but for us it was a way to completely automate pulling the data and loading it into another Python package like scikit-learn for a streamlined ML solution.
I'm editing this answer to agree with Russell's answer. Tableau is not an ETL tool and should not be used as such. If you absolutely have to do something, you can use what I provided. Otherwise, the best practice is to use a tool designed for the job.
You can easily use tabcmd to get the results of a view in CSV, which can be used later in your ETL process. If you need to automate it, you can write a script and execute it with a cron job. I, myself, have a few views that are exported to CSV and used later in my ETL stream to feed our CRM.
Just remember to create the view exactly as you want it to be exported to CSV - usually including the order of the fields. Another tip is that I don't let it use the default "Measure Names" and "Measure Values" - to make sure everything is good on my CSV, I have the fields added manually in the row/columns section.

How to send an email from IBM iSeries DB2 v7.1

I am trying to create a trigger that sends an email based on a database event. Specifically, when a record is INSERTed into a certain table, I want an email stating that fact to go to the SysAdmin.
I can successfully do the following from a SQL window in iSeries Navigator:
CL:SNDDST TYPE(*LMSG)
TOINTNET(('sysadmin@mycompany.com'))
DSTD('this is the Subject Line')
LONGMSG('This is an Email sent from iSeries box via Navigator')
...and an email gets sent, which means that the necessary SMTP stuff is there and working.
So all I'm trying to do is encapsulate this code, perhaps with some data changes (e.g. "A record has been added to the XYZ table on whatever-the-sysdate-is"). Navigator has some tantalizing examples that call CL to do some plain-vanilla things, but no clue as to how to make it work in a trigger. I know how to write triggers that do "database stuff", but not this CL stuff. And this is iSeries DB2, so I don't have access to UTL_MAIL.
I know next to nothing about CL, DDS or other iSeries internals... I would prefer not to have to create an external Java program, but will do that as a last resort...but even then, I'm having a hard time finding straightforward examples.
thanks in advance.
First off, note that SNDDST isn't the best choice for internet mail from the IBM i. Basically, SNDDST is a relic from the SNADS networking days that IBM hacked into supporting SMTP emails. There are free alternatives, or if you're reasonably current on fixes for 7.1 then you should have the Send SMTP E-mail (SNDSMTPEMM) command available.
The Run SQL Scripts window of iNav does indeed support CL commands using the CL: prefix. But that's not the same thing as having the query engine itself understand CL.
The CL: prefix isn't going to work inside an SQL trigger.
You could, however, use the QCMDEXC stored procedure to call a CL command. But I wouldn't necessarily call that the best option.
The IBM i supports using "external" stored procedures and triggers. Theoretically, you could use a CL program that invokes the SNDSMTPEMM command directly. But given your desire to include data from the table, I wouldn't recommend that approach, as you'd be tied to the table structure.
Instead, create your own UTLMAILSND CL program that invokes SNDSMTPEMM. Then define the UTLMAILSND program as an external stored procedure (you can even give it a longer SQL name of UTIL_MAIL_SEND).
Now you can call your UTIL_MAIL_SEND() procedure from your SQL trigger.
You need to try the SNDSMTPEMM command. It's like sliced bread compared to SNDDST TYPE(*LMSG). It supports HTML too, which makes for a lot of fun.
Yes, I used SNDSMTPEMM (skipping the HTML for now...).
One big note, however: using this command in a CL program doesn't work when being called from SQL. I had to change it to a CLLE program.
So the final answer is as follows: a) an INSERT trigger on the table in question, which calls: b) an (external) PROCEDURE created in the database, which in turn calls: c) the compiled CLLE program object. Works like a charm.
p.s. I create the whole body of the email in the INSERT trigger, and pass it along, eventually to the CLLE program. This allows me to have just this one CLLE program to report on any INSERT/UPDATE/DELETE anywhere in the database.

Should Commands / Handlers hold the full aggregate or only its ID?

I'm trying to play around with DDD and CQRS.
And I came up with these two solutions:
Add the AggregateId to my command / event. It's nice because I can use my command as my web service's parameter, and I can also return some instances of my command to my forms to say "you can do this command, this one and this one".
Add my full aggregate to my command / event. It's nice because I'm sure that I won't load my aggregate 100 times if there are a lot of events going on; I'll just pass the reference around (for instance, I won't load it in both my command's validator and my command handler). But I'd have to create a parameter class for each command with only the ID.
For now I have the ID in the commands and the full model in the events (I trust my unit of work to cache the Load(aggregateId) call so I won't execute the same request 100 times for one command).
Is there a right / better way?
Yes, your current approach is correct: reference the aggregate with an identity value on the command. A command is meant to be serialized and sent across process boundaries. Also, a command is normally constructed by a client that may not have enough information to create an entire aggregate instance, which is another reason an identity should be used. And yes, your unit of work should take care of caching an aggregate for the duration of a unit of work, if need be.
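A minimal sketch of that shape, with purely illustrative class and method names (not tied to any particular framework), might look like this:

from dataclasses import dataclass

@dataclass(frozen=True)
class RenameCustomerCommand:
    # The command carries only the identity plus the data needed for the change;
    # it stays small and easy to serialize across process boundaries.
    aggregate_id: str
    new_name: str

class RenameCustomerHandler:
    def __init__(self, repository):
        self._repository = repository  # the unit of work / repository can cache loads

    def handle(self, command):
        customer = self._repository.load(command.aggregate_id)  # load the full aggregate here
        customer.rename(command.new_name)                        # invoke domain behaviour
        self._repository.save(customer)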