How to remove double Quotes In DataStage using a transformer stage? - datastage

We receiving Input data like below
“VENKATA,KRISHNA”
I want output like below
VENKATA,KRISHNA
Can anyone help me with this

Check out the Ereplace function - it allows to replace certain characters so you could rplace " with '' (empty string).
An alternative is TRIM - you can specify which character the command should trim and also if All occurrences or Both (from both sides of the string) plus more.

Related

How to remove special characters from a string in postgresql

I am trying to remove using REGEXP_REPLACE the following special characters: "[]{}
from the following text field: [{"x":"y","s":"G_1","cn":"C8"},{"cn":"M2","gn":"G_2","cn":"CA99"},{"c":"ME3","gn":"G_3","c":"CA00"}]
and replace them with nothing, not even a space.
*Needless to say, this is just an example string, and I need to find a consistent solution for similar but different strings.
I was trying to run the following: SELECT REGEXP_REPLACE('[{"x":"y","s":"G_1","cn":"C8"},{"cn":"M2","gn":"G_2","cn":"CA99"},{"c":"ME3","gn":"G_3","c":"CA00"}] ','[{[}]":]','')
But received pretty much the same string..
Thanks in advance!
You need to escape the special characters (\), and to specify that you want to repeat the operation for every characters ('g') else it will stop at the 1st match
SELECT REGEXP_REPLACE(
'[{"x":"y","s":"G_1","cn":"C8"},{"cn":"M2","gn":"G_2","cn":"CA99"},{"c":"ME3","gn":"G_3","c":"CA00"}] ',
'[{\[}\]":]',
'',
'g');
regexp_replace
--------------------------------------------------
xy,sG_1,cnC8,cnM2,gnG_2,cnCA99,cME3,gnG_3,cCA00
(1 row)

pyspark dataframe returning different characters \"\" instead of nulls

I was reading a fixed with file from hadoop and doing substr and converting it to delimiter file. Code is working fine but instead of emply values in case of null it is returning \"\". Could you please suggest?
snippet
df.select(
df.value.substr(31, 1).alias('status'),
df.value.substr(32, 1).alias('tin_cert'),
df.value.substr(116, 1).alias('c_notice_flg'),
df.value.substr(117, 2).alias('nbr_non_prime_trlrs'),
df.value.substr(119, 3).alias('aw_related')
).write.option("delimiter", "|").csv(unixFile)
output
|\"\"|0|N|00|\"\"|199|
desired output
||0|N|00||199|
no quotes in the input file
000000000014999999999 281AAAA AAAAAAA AAAA 1NN00
000000000024 200BBBBB BBBBBBBBBBBBBBBBB 0NN00
000000000034 200 0NN00
000000000044 200 0NN00
I think escaped quotes are added because of default arguments for pyspark.sql.DataFrameWriter.csv method. In fact, as you can see from the docs:
quote – sets a single character used for escaping quoted values where the separator can be part of the value. If None is set, it uses the default value, ". If an empty string is set, it uses u0000 (null character).
escape – sets a single character used for escaping quotes inside an already quoted value. If None is set, it uses the default value, \

pyspark replace regex with regex

I am trying to replaces a regex (in this case a space with a number) with
I have a Spark dataframe that contains a string column. I want to replace a regex (space plus a number) with a comma without losing the number. I have tried both of these with no luck:
df.select("A", f.regexp_replace(f.col("A"), "\s+[0-9]", ' ,
').alias("replaced"))
df.select("A", f.regexp_replace(f.col("A"), "\s+[0-9]", '\s+[0-9] ,
').alias("replaced"))
Any help is appreciated.
What you need is another function, regex_extract
So, you have to divide the regex and get the part you need. It could be something like this:
df.select("A", f.regexp_extract(f.col("A"), "(\s+)([0-9])", 2).alias("replaced"))

how to remove # character from national data type in cobol

i am facing issue while converting unicode data into national characters.
When i convert the Unicode data into national using national-of function, some junk character like # is appended after the string.
E.g
Ws-unicode pic X(200)
Ws-national pic N(600)
--let the value in Ws-Unicode is これらの変更は. getting from java end.
move function national-of ( Ws-unicode ,1208 ) to Ws-national.
--after converting value is like これらの変更は #.
i do not want the extra # character added after conversion.
please help me to find out the possible solution, i have tried to replace N'#' with space using inspect clause.
it worked well but failed in some specific scenario like if we have # in input from user end. in that case genuine # also converted to space.
Below is a snippet of code I used to convert EBCDIC to UTF. Before I was capturing string lengths, I was also getting # symbols:
STRING
FUNCTION DISPLAY-OF (
FUNCTION NATIONAL-OF (
WS-EBCDIC-STRING(1:WS-XML-EBCDIC-LENGTH)
WS-EBCDIC-CCSID
)
WS-UTF8-CCSID
)
DELIMITED BY SIZE
INTO WS-UTF8-STRING
WITH POINTER WS-XML-UTF8-LENGTH
END-STRING
SUBTRACT 1 FROM WS-XML-UTF8-LENGTH
What this code does is string the UTF8 representation of the EBCIDIC string into another variable. The WITH POINTER clause will capture the new length of the string + 1 (+ 1 because the pointer is positioned to the next position after the string ended).
Using this method, you should be able to know exactly how long second string is and use that string with the exact length.
That should remove the unwanted #s.
EDIT:
One thing I forgot to mention, in my case, the # signs were actually EBCDIC low values when viewing the actual hex on the mainframe
Use inspect with reverse and stop after first occurence of #

list trigger no system ending with "_BI"

I want to list the trigger no system ending with "_BI" in firebird database,
but no result with this
select * from rdb$triggers
where
rdb$trigger_source is not null
and (coalesce(rdb$system_flag,0) = 0)
and (rdb$trigger_source not starting with 'CHECK' )
and (rdb$trigger_name like '%BI')
but with this syntaxs it gives me a "_bi" and "_BI0U" and "_BI0U" ending result
and (rdb$trigger_name like '%BI%')
but with this syntaxs it gives me null result
and (rdb$trigger_name like '%#_BI')
thank you beforehand
The problem is that the Firebird system tables use CHAR(31) for object names, this means that they are padded with spaces up to the declared length. As a result, use of like '%BI') will not yield results, unless BI are the 30th and 31st character.
There are several solutions
For example you can trim the name before checking
trim(rdb$trigger_name) like '%BI'
or you can require that the name is followed by at least one space
rdb$trigger_name || ' ' like '%BI %'
On a related note, if you want to check if your trigger name ends in _BI, then you should also include the underscore in your condition. And as an underscore in like is a single character matcher, you need to escape it:
trim(rdb$trigger_name) like '%\_BI' escape '\'
Alternatively you could also try to use a regular expressions, as you won't need to trim or otherwise mangle the lefthand side of the expression:
rdb$trigger_name similar to '%\_BI[[:SPACE:]]*' escape '\'