I'm working with DataStage 11.3 with parallel jobs.
My input is a date, and I use the "DateOffsetByComponents" function in a Transformer stage to derive four dates under different rules. The four results are written to separate Sequential File stages.
The next step of my transformation is a query against Sybase, but its WHERE clause needs the four dates mentioned above as parameters to fetch the right data from the database.
Does anybody have an idea of how I can read the date in each sequential file and pass it as a parameter to the next step?
I read a similar question in which the answer suggested using the Execute Command activity in a Sequence job, but I'm new to DataStage and it isn't clear to me how to achieve this. I can see, though, that in this type of job (Sequence) you can select parameters that come from other Job Activity stages.
Thanks a lot in advance
Yes, an Execute Command activity to read the date from each file is the way to go. Figure out an operating system command to extract the information you need, perhaps some combination of grep and sed, and use this as the command. The command output is available in an activity variable called $CommandOutput.
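For example, if your file contains the line "Date: 2021-11-17" then a suitable command might be grep Date: filename | cut -d' ' -f2 (cut splits on tabs by default, so the space delimiter has to be given explicitly).
You might like to add a tr command to the pipeline, such as tr -d '\n', to remove the trailing newline character.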
You'll need one Execute Command activity per file. Access the required date from the $CommandOutput activity variable for each.
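To make it clear what that pipeline is expected to return, here is a minimal Python sketch of the same extraction. It is only an illustration of the logic, not something DataStage runs; the function name extract_date and the sample text are made up for the example, and unlike grep it stops at the first matching line.

```python
from typing import Optional


def extract_date(text: str, label: str = "Date:") -> Optional[str]:
    """Mimic `grep Date: file | cut -d' ' -f2 | tr -d '\\n'` for the first match."""
    for line in text.splitlines():
        if label in line:                  # grep: keep lines containing the label
            fields = line.split(" ")       # cut -d' ': split on single spaces
            if len(fields) >= 2:
                return fields[1].strip()   # -f2, with any trailing newline removed
    return None                            # no matching line found


# Hypothetical file content, following the "Date: 2021-11-17" example above.
sample = "Header line\nDate: 2021-11-17\n"
print(extract_date(sample))
```

The value printed here is what each Execute Command activity should end up exposing in $CommandOutput.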
I have multiple Date and Store values in an Excel file, and I need to loop a DataStage parallel job over each Date and Store pair. The parallel job runs an SQL query based on Date and Store, so I need to pass these values from the Sequence job.
I developed a Sequence job with a loop, but I was only able to loop over one column (either Date or Store). Is there any way I can pass both Date and Store to the parallel job?
I clubbed both Date and Store into a single column and tried to pass it to the parallel job, but I am not able to split the parameter value and run the SQL query.
Are there any suggestions on this, please?
I assume you 'clubbed' your values by concatenating them with a separator character of your choice. In the sequence job, use a User Variables activity in the loop (between the Start Loop activity and your parallel job) to split the pair into two variables, e.g. by using the Field function. You can then pass them as two separate parameters to the parallel job.
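As an illustration of what the split does, here is a small Python sketch that imitates a 1-based Field(string, delimiter, occurrence) lookup. The "|" separator and the sample value are assumptions for the example, and the helper is only an approximation of the DataStage function, not its implementation.

```python
def field(value: str, delimiter: str, occurrence: int) -> str:
    """Return the occurrence-th (1-based) delimited piece, or "" if it does not exist."""
    parts = value.split(delimiter)
    if 1 <= occurrence <= len(parts):
        return parts[occurrence - 1]
    return ""


# Hypothetical loop value: Date and Store joined with "|".
pair = "2021-11-17|S042"
date_param = field(pair, "|", 1)    # first piece: the date
store_param = field(pair, "|", 2)   # second piece: the store
print(date_param, store_param)
```

In the sequence job, the two user variables would each hold one of these pieces and be mapped to the Date and Store parameters of the Job Activity.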
I am trying to download data from a SOAP API. The supplier limits each request to 1000 records. I would like to set up a For Loop container that iterates over all the records and writes them to one file.
The URL expects two variables, Select and Skip. For this I have added two variables, #select and #skip, created a Web Service task, and set it up with #Select and #Skip.
The Web Service task runs OK, but I am stuck working out the logic to loop through and download all rows into one file.
For the first run #skip should be 0; for each subsequent run #skip should be the previous value of #skip + 1000.
Can someone please help to achieve this?
Thanks
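The skip arithmetic described in the question can be sketched in Python as follows. This is only a sketch of the loop logic, not the For Loop container configuration: fetch_page stands in for the web service call and make_fake_fetch is a hypothetical in-memory substitute, and the stop condition (a page shorter than the limit) is an assumption about how the API signals the end of the data.

```python
from typing import Callable, Dict, List

Row = Dict[str, int]


def download_all(fetch_page: Callable[[int, int], List[Row]],
                 page_size: int = 1000) -> List[Row]:
    """Call fetch_page(select, skip) repeatedly, advancing skip by page_size."""
    rows: List[Row] = []
    skip = 0                          # first run: skip = 0
    while True:
        page = fetch_page(page_size, skip)
        rows.extend(page)
        if len(page) < page_size:     # short (or empty) page: nothing left
            break
        skip += page_size             # next run: previous skip + 1000
    return rows


def make_fake_fetch(total: int) -> Callable[[int, int], List[Row]]:
    """Hypothetical stand-in for the SOAP call, serving `total` rows."""
    data = [{"id": i} for i in range(total)]
    return lambda select, skip: data[skip:skip + select]


print(len(download_all(make_fake_fetch(2500))))
```

In a For Loop container the same idea maps to an initial expression that sets the skip variable to 0, an assign expression that adds 1000 to it, and an evaluation expression that stops once a request returns fewer rows than the limit.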
I am creating a Data Comparison/Verification script using SQL and Spoon PDI. We're moving data between two servers, and to make sure we've got all the data we have SQL queries showing a date then the quantity of rows transferred.
Example:
Serv1: 20150522 | 100
Serv2: 20150522 | 100
The script will then try to union these values, and if it fails we'll get a fail email. However, we wish to change this setup to write the outcome to a text file, and based on that text file send either a pass or fail email.
The idea behind this is we have multiple tables we're comparing, so we wish to write all the outcomes of each comparison (eight) to a text file and based off the final text file, send the outcome - rather than spamming our email inbox if multiple steps fail.
The format of the text file we wish to have is either match -> send email or mismatch [step-name] [date] -> send email.
Usually I wouldn't ask a question if I haven't tried anything first, but I've searched everywhere on Google, tried the knowledge I currently have and nothing is going the way I wish it to. I believe this is due to the logic I am using.
I am not asking for a solution to this, or for someone to do it for me. I am simply asking for guidance along the correct path.
I would do this in a transformation where there are steps for each union where the result of each step is the comparison_name and the result. This would result in a data set at the end that looks something like this:
comparison_name | result
Union A | true
Union B | false
Union C | true
You would then be able to output those results to a text file in another step to get your result file to send out regardless of whether the job passed or failed.
Lastly you would loop through the result row in the stream, and if all are true, you could do an email step to send out a "pass" email, and if one is false, send out a "fail" email.
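The decision logic over that result set can be sketched in Python like this. It is an illustration only, not a PDI transformation: the summarize function is a made-up name, and the line format follows the "match" / "mismatch [step-name] [date]" layout from the question.

```python
from datetime import date
from typing import List, Tuple


def summarize(results: List[Tuple[str, bool]],
              run_date: date) -> Tuple[List[str], str]:
    """Turn (comparison_name, result) rows into report lines and an email outcome."""
    failures = [name for name, ok in results if not ok]
    if not failures:
        return ["match"], "pass"      # every comparison was true
    # One line per failed comparison, tagged with the date of the run.
    lines = [f"mismatch {name} {run_date:%Y%m%d}" for name in failures]
    return lines, "fail"


results = [("Union A", True), ("Union B", False), ("Union C", True)]
lines, outcome = summarize(results, date(2015, 5, 22))
print("\n".join(lines))
print(outcome)
```

The report lines go to the text file, and the single outcome value decides which of the two emails is sent, so only one email goes out however many comparisons fail.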
EDIT:
To get the date of the pass or fail you could either get the date from each individual union query result by adding it to the query like so:
SELECT CURRENT_DATE
Or you could use the Get System Info step in Spoon, which has multiple ways of injecting the current date into the data stream (system date fixed, start date range of the transformation, today 00:00:00, etc.).
1) I need to pass a date (billing) to the RFC, but I am not sure how to map it using tMap. How do I set it up (see screenshots)?
2) I need to run this job daily (M-F), and I am not sure how to automate the date input.
3) For the date input, I thought of using a Joblet, but I can't find it in Talend. Most screenshots show Joblets in the same window as Job Designs and Metadata, but I don't have it. See the Joblet image.
As you might have guessed, I am very new to Talend.
Use a tMap, and inside it use the function TalendDate.parseDate("yyyy-MM-dd", sap_data.date) in the expression field where you want the output. Also, note that the output type must be Date. The Date Pattern in the type definition (at the bottom of the tMap) is irrelevant.
Something like that:
I know nothing about Perl. I have looked at some online tutorials and am at a loss for the following.
I do a query in PostgreSQL that saves to a CSV file. However, one element needs to be changed after the CSV file is created, and I have no idea how to do it.
The existing query results are like this
phone date time staff email and customer ID -- my explanation
1112223333,10/21/2013,3:00 AM,sklund#myemail.comSMIB010170 -- data in csv
After query is completed, the data in the time field must be converted to:
1112223333,10/21/2013,03:00am,sklund#myemail.comSMIB010170
As you can see, the time needs to be amended to include a leading 0 if the hour is less than ten, and the AM must be changed to am.
Is there a simple Perl script that can do this? The lines of data, of course, will be different, as each line would reflect results of the query for the day.
If someone can point me to a tutorial, link, or help in this I'd be very grateful.
This will do what you need.
I assume you want the space before AM removed as well? You don't mention it in your question.
perl -pe 's/,(\d{1,2}):(\d\d)\s+([AP]M),/sprintf ",%02d:%02d%s,",$1,$2,lc $3/ei' mylogfile > newlogfile
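To check what the substitution does to a line, here is the same regular expression as a small Python sketch. It is not a replacement for the one-liner, just a way to see the transformation step by step; normalize_time is a made-up name, and count=1 mirrors the Perl command, which has no /g flag and so changes only the first match on each line.

```python
import re

# Same pattern as the Perl one-liner: ",H:MM AM," or ",HH:MM pm,", case-insensitive.
TIME_FIELD = re.compile(r",(\d{1,2}):(\d\d)\s+([AP]M),", re.IGNORECASE)


def normalize_time(line: str) -> str:
    """Zero-pad the hour, drop the space before AM/PM, and lowercase the suffix."""
    def repl(m: "re.Match[str]") -> str:
        hour, minute, suffix = int(m.group(1)), int(m.group(2)), m.group(3).lower()
        return ",%02d:%02d%s," % (hour, minute, suffix)
    return TIME_FIELD.sub(repl, line, count=1)


line = "1112223333,10/21/2013,3:00 AM,sklund#myemail.comSMIB010170"
print(normalize_time(line))
```

Hours that already have two digits pass through unchanged apart from the lowercase suffix and the removed space.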