I have input 123 and the output I am looking for is 123 123 123 123 123. How can I achieve this in DataStage? - datastage

I have input 123 and the output I am looking for is 123 123 123 123 123. How can I achieve this in DataStage?
input: 123
output: 123
123
123
123
123

The Str() function repeats the input string the specified number of times.
Str(InLink.MyString, 5)
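For illustration only, the same idea sketched in Python (not DataStage code): Str(InLink.MyString, 5) concatenates the copies back to back, so a separator has to be added explicitly if the output should be space-separated.

# Minimal Python sketch of the repetition logic (illustration only, not DataStage code).
value = "123"

# Plain repetition, analogous to Str(InLink.MyString, 5): copies concatenated back to back.
repeated = value * 5
print(repeated)            # 123123123123123

# If the copies should be separated by spaces, join them explicitly.
spaced = " ".join([value] * 5)
print(spaced)              # 123 123 123 123 123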

Related

Split String Alphanumeric 'AAA111BBB222' into specific pattern AAA 111 BBB 222

I have the string 'AAA111BBB222' and am trying to get the output for each cell as below:
AAA
111
BBB
222
Please help.
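One common way to do this kind of split is a regular expression that separates runs of letters from runs of digits; a minimal Python sketch, assuming the split can be done outside DataStage:

import re

# Split an alphanumeric string into alternating runs of letters and digits.
s = "AAA111BBB222"
parts = re.findall(r"[A-Za-z]+|[0-9]+", s)

for part in parts:
    print(part)
# AAA
# 111
# BBB
# 222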

Remove table duplicates under certain conditions

I have a table like the one below that shows PnL by instrument (code) for some shifts, maturity, etc.
Instrument 123 appears twice (two sets of shift, booknumber, and insmat, but different pnl). I would like to clean the table to keep only the first set (the first three rows).
code | shift | pnl | booknumber | insmat
123  | -20%  | 5   | 1234       | 2021.01.29
123  | -0%   | 7   | 1234       | 2021.01.29
123  | +20%  | 9   | 1234       | 2021.01.29
123  | -20%  | 4   | 1234       | 2021.01.29
123  | -0%   | 6   | 1234       | 2021.01.29
123  | +20%  | 8   | 1234       | 2021.01.29
456  | -20%  | 1   | 1234       | 2021.01.29
456  | -0%   | 2   | 1234       | 2021.01.29
456  | +20%  | 3   | 1234       | 2021.01.29
If there were no shifts involved I would do something like this:
select first code, first pnl, first booknumber, first insmat by code from t
Would love to hear if you have a solution!
Thanks!
If the shift pattern is consistently 3 shifts, you could use:
q)select from t where 0=i mod 3
code shift pnl booknumber insmat
------------------------------------
123  -20   5   1234       2021.01.29
123  -20   4   1234       2021.01.29
456  -20   1   1234       2021.01.29
Alternative solution with an fby:
q)select from t where shift=(first;shift)fby code
code shift pnl booknumber insmat
------------------------------------
123  -20   5   1234       2021.01.29
123  -20   4   1234       2021.01.29
456  -20   1   1234       2021.01.29
However, this will only work if the first shift value is unique within the shift pattern.
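For comparison only, a rough pandas analogue of the first approach (keep every third row, which mirrors `0=i mod 3` when each code always has exactly three shift rows); the DataFrame here is just a toy copy of the sample table:

import pandas as pd

# Toy copy of the sample table above.
t = pd.DataFrame({
    "code":       [123, 123, 123, 123, 123, 123, 456, 456, 456],
    "shift":      ["-20%", "-0%", "+20%", "-20%", "-0%", "+20%", "-20%", "-0%", "+20%"],
    "pnl":        [5, 7, 9, 4, 6, 8, 1, 2, 3],
    "booknumber": [1234] * 9,
    "insmat":     ["2021.01.29"] * 9,
})

# Keep rows 0, 3, 6, ... -- the pandas counterpart of `select from t where 0=i mod 3`.
print(t.iloc[::3])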

Flag data when value from one column is in another column

I'm trying to create a flag in my dataset based on two conditions. The first is simple: does CheckingCol = CheckingCol2?
The second is more complicated. I have a column called TranID and a column called RevID.
For any row, if the value of RevID appears anywhere in the TranID column AND CheckingCol = CheckingCol2, then the flag should return "Yes". Otherwise the flag should return "No".
My data looks like this:
TranID RevID CheckingCol CheckingCol2
1 2 ABC ABC
2 1 ABC ABC
3 6 ABCDE ABCDE
4 3 ABCDE ABC
5 7 ABCDE ABC
The expected result would be:
TranID RevID CheckingCol CheckingCol2 Flag
1 2 ABC ABC Yes
2 1 ABC ABC Yes
3 6 ABCDE ABCDE No
4 3 ABCDE ABC No
5 7 ABCDE ABC No
I've tried using:
df.withColumn("TotalMatch", when((col("RevID").contains(col("TranID"))) & (col("CheckingColumn") == col("CheckingColumn2")), "Yes").otherwise("No"))
But it didn't work, and I've not been able to find anything online about how to do this.
Any help would be great!
Obtain the unique values from the TranID column as an array, then check whether RevID is in that array using the isin() function:
from pyspark.sql import functions as sf
unique_values = df1.agg(sf.collect_set("TranID").alias("uniqueIDs"))
unique_values.show()
+---------------+
| uniqueIDs|
+---------------+
|[3, 1, 2, 5, 4]|
+---------------+
required_array = unique_values.take(1)[0].uniqueIDs
['3', '1', '2', '5', '4']
df2 = df1.withColumn("Flag", sf.when(sf.col("RevID").isin(required_array) & (sf.col("CheckingCol") == sf.col("CheckingCol2")), "Yes").otherwise("No"))
Note: check for nulls and None values in both the RevID and TranID columns, since they will affect the results.
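Putting the pieces together, here is a self-contained sketch of the same approach (assuming a local SparkSession; df1 and df2 are just the names used above):

from pyspark.sql import SparkSession
from pyspark.sql import functions as sf

spark = SparkSession.builder.master("local[*]").appName("flag-demo").getOrCreate()

# Sample data from the question.
df1 = spark.createDataFrame(
    [("1", "2", "ABC", "ABC"),
     ("2", "1", "ABC", "ABC"),
     ("3", "6", "ABCDE", "ABCDE"),
     ("4", "3", "ABCDE", "ABC"),
     ("5", "7", "ABCDE", "ABC")],
    ["TranID", "RevID", "CheckingCol", "CheckingCol2"],
)

# Collect the distinct TranID values into a plain Python list.
required_array = df1.agg(sf.collect_set("TranID").alias("uniqueIDs")).take(1)[0].uniqueIDs

# Flag rows where RevID appears among the TranID values AND the checking columns match.
df2 = df1.withColumn(
    "Flag",
    sf.when(
        sf.col("RevID").isin(required_array) & (sf.col("CheckingCol") == sf.col("CheckingCol2")),
        "Yes",
    ).otherwise("No"),
)
df2.show()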

Add unique rows for each group when similar group repeats after certain rows

Hi, can anyone please help me get a unique group number?
I need to assign a unique number to each group of rows, even when the same product repeats after some other groups.
I have following data:
id version product startdate enddate
123 0 2443 2010/09/01 2011/01/02
123 1 131 2011/01/03 2011/03/09
123 2 131 2011/08/10 2012/09/10
123 3 3009 2012/09/11 2014/03/31
123 4 668 2014/04/01 2014/04/30
123 5 668 2014/05/01 2016/01/01
123 6 668 2016/01/02 2017/09/08
123 7 131 2017/09/09 2017/10/10
123 8 131 2018/10/11 2019/01/01
123 9 550 2019/01/02 2099/01/01
select *,
    dense_rank() over (partition by id order by id, product)
from table
Expected results:
id version product startdate enddate count
123 0 2443 2010/09/01 2011/01/02 1
123 1 131 2011/01/03 2011/03/09 2
123 2 131 2011/08/10 2012/09/10 2
123 3 3009 2012/09/11 2014/03/31 3
123 4 668 2014/04/01 2014/04/30 4
123 5 668 2014/05/01 2016/01/01 4
123 6 668 2016/01/02 2017/09/08 4
123 7 131 2017/09/09 2017/10/10 5
123 8 131 2018/10/11 2019/01/01 5
123 9 550 2019/01/02 2099/01/01 6
Try the following
SELECT
    id, version, product, startdate, enddate,
    1 + SUM(v) OVER (PARTITION BY id ORDER BY version) AS n
FROM
(
    SELECT
        *,
        IIF(LAG(product) OVER (PARTITION BY id ORDER BY version) <> product, 1, 0) AS v
    FROM TestTable
) q
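The idea is the usual gaps-and-islands trick: flag a row whenever product differs from the previous row (per id, ordered by version), then take a running sum of those flags. The same logic sketched in pandas, purely for illustration, using the sample data above (dates omitted):

import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "id":      [123] * 10,
    "version": list(range(10)),
    "product": [2443, 131, 131, 3009, 668, 668, 668, 131, 131, 550],
})

df = df.sort_values(["id", "version"])

# 1 whenever product differs from the previous row within the same id
# (the first row of each id counts as a change), 0 otherwise.
changed = (df["product"] != df.groupby("id")["product"].shift()).astype(int)

# Running sum of the change flags numbers the groups: 1, 2, 2, 3, 4, 4, 4, 5, 5, 6.
df["count"] = changed.groupby(df["id"]).cumsum()
print(df)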

SQL Server: FAILING extra records

I have a table tableA (ID int, Match varchar, code char, status char):
ID Match code Status
101 123 A
102 123 B
103 123 C
104 234 A
105 234 B
106 234 C
107 234 B
108 456 A
109 456 B
110 456 C
I want to populate status with 'FAIL' when:
for the same Match, there exists a code other than A, B, or C,
or the same code exists multiple times.
In other words, code can only be A, B, or C, and each code should exist only once for the same Match; otherwise, fail. So the expected result would be:
ID Match code Status
101 123 A NULL
102 123 B NULL
103 123 C NULL
104 234 A NULL
105 234 B NULL
106 234 C NULL
107 234 B FAIL
108 456 A NULL
109 456 B NULL
110 456 C NULL
Thanks
No guarantees on efficiency here...
update tableA
set status = 'FAIL'
where ID not in (
    select min(ID)
    from tableA
    group by Match, code
)
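Note that this update only catches the duplicate (Match, code) rows; the "code must be one of A, B, or C" part of the requirement would still need its own check. For illustration, both conditions sketched together in pandas on the sample data:

import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "ID":    [101, 102, 103, 104, 105, 106, 107, 108, 109, 110],
    "Match": ["123", "123", "123", "234", "234", "234", "234", "456", "456", "456"],
    "code":  ["A", "B", "C", "A", "B", "C", "B", "A", "B", "C"],
})

# Condition 1: code is something other than A, B, or C.
bad_code = ~df["code"].isin(["A", "B", "C"])

# Condition 2: the same (Match, code) pair appears more than once; keep the first occurrence.
duplicate = df.duplicated(subset=["Match", "code"], keep="first")

df["Status"] = None
df.loc[bad_code | duplicate, "Status"] = "FAIL"
print(df)   # only ID 107 gets FAIL for this sample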