How to convert the records in a column to Camel Case in DataStage?

I have a table with a name column. The names in that column are all lower case, and I want to convert them to camel case. The table is as follows:
| Id | Name        | Phone Number |
|----|-------------|--------------|
| 01 | bob wheeler | 999999999    |

If you are trying to do this in a server job, then use Oconv(InLink.MyString,"MCT").
There is no directly-available function in a parallel Transformer stage. You could:
(a) put a server Transformer in a server Shared Container and use that in your parallel job,
(b) write your own routine in C++ and refer to it (called a "parallel routine"; the logic is sketched after this list), or
(c) investigate whether the database in which the table resides has any kind of camel case conversion function.
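To illustrate option (b), here is the title-casing logic sketched in Python (the actual parallel routine must be written in C++; this only shows the algorithm):

def title_case(value):
    # Uppercase the first letter of each space-separated word, roughly
    # what Oconv(..., "MCT") produces in a server job for lower-case input
    return ' '.join(word[:1].upper() + word[1:] for word in value.split(' '))

print(title_case('bob wheeler'))  # -> 'Bob Wheeler'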

I don't think there is a specific function for that.
UpCase is available.
You might consider a workaround that converts the first character, and every first character after a blank, to upper case.

Assuming the name column only contains a first name and a last name. If there is a middle name as well, you can extend the formula to handle that.
The formula upper-cases the first character of the name and the first character after the first occurrence of a space. It concatenates the first name (with its initial upper-cased), the first character of the second name in upper case, and then the rest of the second name. You can play with the positions if you get wrong output, but this should work:
Upcase(NAME[1,1]) : NAME[2,Index(NAME,' ',1)-1] : Upcase(NAME[Index(NAME,' ',1)+1,1]) : NAME[Index(NAME,' ',1)+2,Len(NAME)-Index(NAME,' ',1)+1]
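For clarity, here is the same position arithmetic traced in Python (DataStage's Index() and substrings are 1-based, Python's are 0-based):

name = 'bob wheeler'
space = name.index(' ') + 1                      # Index(NAME,' ',1), 1-based
first = name[0].upper() + name[1:space]          # Upcase(NAME[1,1]) : NAME[2,space-1]
second = name[space].upper() + name[space + 1:]  # upper-case the char after the blank, then the rest
print(first + second)                            # -> 'Bob Wheeler'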

Related

Redshift how to update the last octet of an IP address column to another value

a. 123.12.1 -> 123.12.999
b. 123.12.100.0 -> 123.12.100.999
c. 123.123 -> 123.999
I have a Redshift table with one IP address column, with cases as above. I used the substring and position functions nested many times to meet the requirement, but I want to learn whether there is another, cleaner method.
A cleaner way is a Python UDF that splits the string on the dot symbol and returns all elements but the last one, with 999 appended. The body of the function is below (val is the parameter; see the official Redshift docs for how to create the function):
return '.'.join(val.split('.')[:-1])+'.999'
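Outside Redshift, the same body can be checked as a plain Python function (the wrapper name here is just for illustration):

def replace_last_octet(val):
    # drop the last dot-separated element, then append '.999'
    return '.'.join(val.split('.')[:-1]) + '.999'

print(replace_last_octet('123.12.100.0'))  # -> '123.12.100.999'
print(replace_last_octet('123.123'))       # -> '123.999'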
No need to use split. The easiest way to update your values is (using your_table as a placeholder for the table name):
update your_table set ip = regexp_replace(ip, '[.][0-9]{1,3}$', '.999');
See the Redshift regexp_replace function documentation.
The $ character makes sure that only the last octet is replaced.
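You can try the pattern in Python before running the update; re.sub behaves like regexp_replace here:

import re

# '[.][0-9]{1,3}$' matches the final dot plus a last octet of 1-3 digits;
# '$' anchors the match at the end of the string
print(re.sub(r'[.][0-9]{1,3}$', '.999', '123.12.100.0'))  # -> '123.12.100.999'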

Informatica SQ returns different result

I am trying to pull data from DB2 via Informatica. I have a SQ query that pulls a few fields based on joins across 4 different tables.
When I run the query directly in the database, it returns the expected result; however, when I run it in Informatica with a debugger, I see something else.
Please note all the columns' data match perfectly, except one single column.
The weird thing is, this is a calculated field from the table, based on a CASE statement:
CASE WHEN Column1='3' THEN 'N' ELSE 'Y' END.
Since this is a calculated field one character long, I have connected it from the source to the SQ using a source field of length 1.
The query returns 'Y' when executed in the database, but when I copy-paste the same query into the SQ in Informatica and run it, I get the value 'E', which should never be possible as I expect only an N or a Y. I have verified the column order and that it's in the right place. This is very strange; is something going wrong because of the CASE statement?
Save yourself the hassle: put an Expression transformation after the Source Qualifier, calculate the port value there, then forget about it.
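For example (assuming the incoming port is named Column1, as in the query), the output port's expression would be something like IIF(Column1 = '3', 'N', 'Y'), so the flag is computed inside PowerCenter instead of in the DB2 query.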
I think I got the issue. We use Informatica PowerExchange to connect to an AS400 system (DB2), and it seems that when we set a flag in AS400 and pass it to Informatica via PowerExchange, it converts it to binary; to solve this, there needs to be an entry in the PowerExchange configuration file.
Unfortunately, I myself was not aware that it could be related to PowerExchange instead of PowerCenter itself.
Thanks for your assistance! Below is the KB about it.
https://kb.informatica.com/solution/4/Pages/17498.aspx

Transpose data using Talend

I have this kind of data:
I need to transpose this data into something like this using Talend:
Help would be much appreciated.
dbh's suggestion should indeed work, but I did not try it.
However, I have another solution which doesn't require changing the input format and is not too complicated to implement. Indeed, the job has only 2 transformation components (tDenormalize and tMap).
The job looks like the following:
Explanation:
Your input is read from a CSV file (could be a database or any other kind of input).
The tDenormalize component will denormalize your value column (column 2), grouped by the id column (column 1), separating fields with a specific delimiter (";" in my case), resulting in 2 rows.
tMap: split the aggregated column into multiple columns, using Java's String.split() method and spreading the resulting array over multiple columns. The tMap should look like this:
Since Talend doesn't accept storing Array objects, make sure to store the split String in Object format. Then cast that object into an Array on the right side of the map.
That approach should give you the expected result.
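As a rough sketch of those two steps outside Talend (sample ids and values invented), the same transformation in Python:

from collections import defaultdict

rows = [(1, 'a'), (1, 'b'), (1, 'c'), (2, 'd'), (2, 'e'), (2, 'f')]

# tDenormalize step: aggregate the values per id into one ';'-delimited string
grouped = defaultdict(list)
for row_id, value in rows:
    grouped[row_id].append(value)

# tMap step: split the aggregate back into columns (String.split() in Java)
for row_id, values in grouped.items():
    aggregated = ';'.join(values)
    columns = aggregated.split(';')
    print(row_id, columns)  # -> 1 ['a', 'b', 'c'] / 2 ['d', 'e', 'f']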
IMPORTANT:
tDenormalize might shuffle the rows, meaning that for bigger inputs you might encounter unsorted output. Make sure to sort it if needed, or use tDenormalizeSortedRow instead.
tDenormalize is similar to an aggregation component, meaning it scans the whole input before processing, which can cause performance issues with particularly big inputs (tens of millions of records).
Your input is probably wrong (you have 5 entries with id 1 and 6 entries with id 2). 6 columns are expected, meaning you should always have 6 lines per id. If not, then you should implement dbh's solution, and you probably HAVE TO add a column with a key.
You can use Talend's tPivotToColumnsDelimited component to achieve this. You will most likely need an additional column in your data to represent the field name, like "identifier, field name, value".
Then you can use this component to pivot the data and write a file as output. If you need to process the data further, read the resulting file with tFileInputDelimited.
See docs and an example at
https://help.talend.com/display/TalendOpenStudioComponentsReferenceGuide521EN/13.43+tPivotToColumnsDelimited
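To illustrate the idea behind the pivot (with invented sample rows), the "identifier, field name, value" layout turns into one row per identifier like this in Python:

rows = [
    ('01', 'name', 'alice'),   # (identifier, field name, value)
    ('01', 'city', 'paris'),
    ('02', 'name', 'bob'),
    ('02', 'city', 'london'),
]

# pivot: one output row per identifier, one column per field name
pivoted = {}
for identifier, field, value in rows:
    pivoted.setdefault(identifier, {})[field] = value

print(pivoted)  # -> {'01': {'name': 'alice', 'city': 'paris'}, '02': {...}}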

GROUP BY CLAUSE using SYNCSORT

I have some content in a file on which I must generate statistics, such as how many records are of type 1, type 2, etc. The number of types can change and is unknown to the code until the file arrives. In a SQL system, I can do this using COUNT and the GROUP BY clause, but I am not sure if I can do it using SYNCSORT or a COBOL program. Would anyone here have an idea how I can implement a 'GROUP BY'-type query on a file using SYNCSORT?
Sample Data:
TYPE001 SUBTYPE001 TYPE01-DESC
TYPE001 SUBTYPE002 TYPE01-DESC
TYPE001 SUBTYPE003 TYPE01-DESC
TYPE002 SUBTYPE001 TYPE02-DESC
TYPE002 SUBTYPE004 TYPE02-DESC
TYPE002 SUBTYPE008 TYPE02-DESC
I want to get information such as TYPE001 ==> 3 Records, TYPE002 ==> 3 Records. What the code doesn't know until runtime is the TYPENNN values.
You show data already in sequence, so there is no need to sort the data itself, which makes SUM FIELDS= with SORT a poor solution if anyone suggests it (plus code for the formatting).
MERGE with a single input file and SUM FIELDS= would be better, but still require the code for formatting.
The simplest way to produce output which may suit you is to use OUTFIL reporting functions:
  OPTION COPY
  OUTFIL NODETAIL,
         REMOVECC,
         SECTIONS=(1,7,
           TRAILER3=(1,7,
             ' ==> ',
             COUNT=(M10,LENGTH=3),
             ' Records'))
NODETAIL says "remove all the data lines". REMOVECC says "although it is a report, don't put printer-control characters in position one of the output records". SECTIONS says "we're going to use control-breaks, and here they are" (only one in this case); the control-field here is 1,7. TRAILER3 defines the output produced at each control-break: COUNT is the number of records in that particular break, M10 is an editing mask which changes leading zeros to blanks, and LENGTH gives a length to the output of COUNT. Three is chosen from your sample data, with sub-types being unique and having three digits as the unique part of the data; change it to whatever suits your actual data.
You've not been clear, and perhaps you want the output "floating" (3bb instead of bb3, where b represents a blank)? That would require more code...
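For comparison, here is the same GROUP BY count expressed in Python (the input file name is made up):

from collections import Counter

# group records by the 7-byte type field (positions 1-7) and count each group
with open('input.txt') as f:
    counts = Counter(line[:7] for line in f if line.strip())

for record_type, count in sorted(counts.items()):
    print(f'{record_type} ==> {count:3d} Records')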

Autoincrement using Sequences is not working as expected

I am currently working on a job something like this:
The design is to extract some data from customers (say first name, last name) to one Excel file, and other data (say address) to another Excel file. I added an identity in each tMap with Numeric.sequence("s1",1,1), but one Excel file is getting 1,3,5,7,9,11,13,... and the other is getting 2,4,6,8,10,12,...
But I need both Excel files to have the same identity 1,2,3,4,5,6,...,N
so that I can map the records.
Can somebody guide me on this?
Edit:
The autoincrement returns 1,2,3,4,5,6,... This is fine when there is only one tMap component in the job, but not when 2 tMaps are used?
This is because the numeric sequence is static. Since you have only one sequence, called "s1", it is incremented twice at every iteration (once for each tMap it's invoked in).
Just use unique labels (e.g. "s1" and "s2") to force the use of two independent sequences, which solves your problem.
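A minimal Python model of this behaviour (imitating Talend's one-counter-per-label sequences; the helper name is made up):

from collections import defaultdict
from itertools import count

sequences = defaultdict(lambda: count(1))  # one independent counter per label

def sequence(label):
    # mimics Numeric.sequence: same label -> same shared counter
    return next(sequences[label])

# one shared label called from two places: values interleave, 1,3,5 vs 2,4,6
print([(sequence('s1'), sequence('s1')) for _ in range(3)])  # [(1, 2), (3, 4), (5, 6)]

# two distinct labels: each caller gets its own 1,2,3,...
print([(sequence('s2'), sequence('s3')) for _ in range(3)])  # [(1, 1), (2, 2), (3, 3)]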