Get the ASCII value of a character in Redshift

I am trying to find the ASCII value of a character in a string. I am essentially looking for the opposite of CHR: CHR(65) = A, so similarly fn(A) should return 65. I could not find any such function in the Redshift SQL manual.

The closest I could find is:
ASCII is a deprecated leader node–only function.
You could write a Python User-Defined Function that returns the ASCII value of a given character.
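
As a minimal sketch of what such a function could look like: Python's built-in ord is the inverse of chr, so the body of a Python UDF would presumably reduce to a single call. The function name ascii_value is hypothetical, and the handling of empty or NULL input is an assumption, not something the Redshift documentation prescribes.

```python
def ascii_value(text):
    """Return the code point of the first character of `text`.

    Returns None for empty or missing input (an assumed convention).
    """
    if not text:
        return None
    # ord() is the inverse of chr(): chr(65) == "A" and ord("A") == 65.
    return ord(text[0])


print(ascii_value("A"))  # 65
```

Only the first character is inspected, so a longer string yields the value of its leading character.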

Related

Azure Data Factory - Dynamic Skip Lines Expression

I am attempting to import a CSV into ADF; however, the file header is not the first line of the file. Its position is dynamic, so I need to match it based on the first column (e.g. "TestID,"), which is a string.
Example Data (Header is on Line 4)
Date:,01/05/2022
Time:,00:30:25
Test Temperature:,25C
TestID,StartTime,EndTime,Result
TID12345-01,00:45:30,00:47:12,Pass
TID12345-02,00:46:50,00:49:12,Fail
TID12345-03,00:48:20,00:52:17,Pass
TID12345-04,00:49:12,00:49:45,Pass
TID12345-05,00:50:22,00:51:55,Fail
I found this article, which addresses this issue; however, I am struggling to rewrite the expression to use a string instead of an integer.
https://kromerbigdata.com/2019/09/28/adf-dynamic-skip-lines-find-data-with-variable-headers
First Expression
iif(!isNull(toInteger(left(toString(byPosition(1)),1))),toInteger(rownum),toInteger(0))
As the article states, this expression looks at the first character of each row and, if it is an integer, returns the row number (rownum).
How do I perform this action for a string (e.g. "TestID,")?
Many Thanks
Jonny

I think you want to treat the first line that starts with a string as your header, and the preceding lines that start with numbers should not be considered headers. You can use the isNan function to check whether the first character is not a number (i.e. a string), as in the modified expression below:
iif(isNan(left(toString(byPosition(1)),1))
,toInteger(rownum)
,toInteger(0)
)
The expression breaks down as follows:
left(toString(byPosition(1)),1): gets the first character from the left side of the first column.
isNan: checks whether that character is "not a number".
iif: if the character is not a number, returns rownum; otherwise returns 0.
Alternatively, you can use functions like isInteger() to check whether the first character is an integer and act accordingly.
Later on, as explained in the cited article, you need to find the minimum rownum to determine how many lines to skip.
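
The flag-then-take-the-minimum logic can be sketched outside ADF. The following is a plain Python illustration, not data flow expression syntax: it flags each row with its row number when the first column equals the header key and with 0 otherwise, then takes the smallest non-zero flag. The function name, the header_key parameter, and the SAMPLE list (copied from the question) are hypothetical.

```python
SAMPLE = [
    "Date:,01/05/2022",
    "Time:,00:30:25",
    "Test Temperature:,25C",
    "TestID,StartTime,EndTime,Result",
    "TID12345-01,00:45:30,00:47:12,Pass",
]


def header_row_number(lines, header_key="TestID"):
    """Return the 1-based row number of the header row, or 0 if none matches."""
    # Flag each row: its row number if the first column matches, else 0.
    flagged = [
        rownum if line.split(",", 1)[0] == header_key else 0
        for rownum, line in enumerate(lines, start=1)
    ]
    # The header is the first (minimum) flagged row.
    matches = [n for n in flagged if n > 0]
    return min(matches) if matches else 0


header_row = header_row_number(SAMPLE)
lines_to_skip = max(header_row - 1, 0)
print(header_row, lines_to_skip)  # 4 3
```

Matching on the whole first column rather than on a single character avoids flagging the "Date:" and "Time:" lines, which also start with letters.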
Hope it helps.

pyspark dataframe returning different characters \"\" instead of nulls

I am reading a fixed-width file from Hadoop, extracting fields with substr, and writing the result out as a delimited file. The code works fine, but for null values it returns \"\" instead of empty values. Could you please suggest a fix?
snippet
df.select(
df.value.substr(31, 1).alias('status'),
df.value.substr(32, 1).alias('tin_cert'),
df.value.substr(116, 1).alias('c_notice_flg'),
df.value.substr(117, 2).alias('nbr_non_prime_trlrs'),
df.value.substr(119, 3).alias('aw_related')
).write.option("delimiter", "|").csv(unixFile)
output
|\"\"|0|N|00|\"\"|199|
desired output
||0|N|00||199|
no quotes in the input file
000000000014999999999 281AAAA AAAAAAA AAAA 1NN00
000000000024 200BBBBB BBBBBBBBBBBBBBBBB 0NN00
000000000034 200 0NN00
000000000044 200 0NN00

I think the escaped quotes are added because of the default arguments of the pyspark.sql.DataFrameWriter.csv method. As you can see from the docs:
quote – sets a single character used for escaping quoted values where the separator can be part of the value. If None is set, it uses the default value, ". If an empty string is set, it uses u0000 (null character).
escape – sets a single character used for escaping quotes inside an already quoted value. If None is set, it uses the default value, \
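
One option that may help, assuming Spark 2.4 or later: DataFrameWriter.csv also accepts an emptyValue argument, which controls how empty strings are written and defaults to a pair of double quotes. The sketch below passes an empty string for it. The helper name write_pipe_delimited and the CSV_WRITE_OPTIONS dict are hypothetical, and whether this removes the quotes in a given Spark version is an assumption to verify.

```python
# Keyword arguments for DataFrameWriter.csv (emptyValue assumes Spark 2.4+).
CSV_WRITE_OPTIONS = {
    "sep": "|",        # pipe-delimited output, as in the question
    "emptyValue": "",  # write empty strings as nothing instead of ""
}


def write_pipe_delimited(df, path):
    """Write `df` as pipe-delimited text, emitting empty strings as empty fields."""
    df.write.csv(path, **CSV_WRITE_OPTIONS)
```

With the DataFrame from the question, the call would be write_pipe_delimited(selected_df, unixFile), where selected_df stands for the result of the df.select(...) expression.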

how to remove # character from national data type in cobol

I am facing an issue while converting Unicode data into national characters.
When I convert the Unicode data to national using the NATIONAL-OF function, a junk character such as # is appended after the string.
E.g
Ws-unicode pic X(200)
Ws-national pic N(600)
-- Suppose Ws-unicode holds これらの変更は, received from the Java side.
move function national-of ( Ws-unicode ,1208 ) to Ws-national.
-- After the conversion the value looks like これらの変更は #.
I do not want the extra # character added after the conversion.
Please help me find a possible solution. I have tried replacing N'#' with a space using the INSPECT statement.
That worked well but fails in some scenarios, for example when the user's input itself contains #: in that case the genuine # is also converted to a space.

Below is a snippet of code I used to convert EBCDIC to UTF. Before I was capturing string lengths, I was also getting # symbols:
STRING
FUNCTION DISPLAY-OF (
FUNCTION NATIONAL-OF (
WS-EBCDIC-STRING(1:WS-XML-EBCDIC-LENGTH)
WS-EBCDIC-CCSID
)
WS-UTF8-CCSID
)
DELIMITED BY SIZE
INTO WS-UTF8-STRING
WITH POINTER WS-XML-UTF8-LENGTH
END-STRING
SUBTRACT 1 FROM WS-XML-UTF8-LENGTH
What this code does is string the UTF-8 representation of the EBCDIC string into another variable. The WITH POINTER clause captures the new length of the string + 1 (+ 1 because the pointer is positioned at the next position after the end of the string).
Using this method, you should know exactly how long the second string is and can use that string with its exact length.
That should remove the unwanted #s.
EDIT:
One thing I forgot to mention: in my case, the # signs were actually EBCDIC low values when viewing the actual hex on the mainframe.
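
The idea of carrying the exact converted length, instead of converting the whole fixed-size field, can be illustrated in Python. This is a sketch only: it assumes the padding consists of low values (0x00), as in the edit above, and a 200-byte field as in the question. The function names are hypothetical.

```python
def meaningful_length(field, pad=b"\x00"):
    """Length of the field once trailing pad bytes are ignored."""
    return len(field.rstrip(pad))


def decode_fixed_field(field, length, encoding="utf-8"):
    """Decode only the first `length` bytes of a fixed-size field."""
    return field[:length].decode(encoding)


text = "これらの変更は"
raw = text.encode("utf-8")      # 7 characters, 3 bytes each in UTF-8
field = raw.ljust(200, b"\x00")  # fixed 200-byte field padded with low values

length = meaningful_length(field)
print(length)                             # 21
print(decode_fixed_field(field, length))  # これらの変更は
```

Decoding the full 200 bytes would instead carry the pad bytes into the result, which is where unprintable trailing characters come from.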

Use INSPECT with REVERSE and stop after the first occurrence of #.
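
The effect of scanning from the end and touching only the last # can be sketched in Python. This is an illustration of the idea rather than COBOL; the function name is hypothetical, and it assumes the junk character, when present, is the last non-space character of the field.

```python
def strip_trailing_marker(text, marker="#"):
    """Remove a single trailing marker (and surrounding spaces) from the end only."""
    trimmed = text.rstrip(" ")
    if trimmed.endswith(marker):
        # Drop just the final marker; any earlier markers are left untouched.
        trimmed = trimmed[: -len(marker)]
    return trimmed.rstrip(" ")


print(strip_trailing_marker("これらの変更は #   "))  # これらの変更は
print(strip_trailing_marker("a#b #"))               # a#b
```

A genuine # inside the text survives, but a genuine # at the very end of the user's input would still be removed if no junk character was appended, so this only works when the junk character is always present.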

replace double backslash with single backslash in haskell

I want to replace "\\" in a bytestring (Data.ByteString)
with "\", but due to the internal escaping of "\" it won't work.
Consider the following example:
The original bytestring:
"\159\DEL*\150\222/\129vr\205\241=mA\192\184"
After storing it in and re-reading it from a database I obtain the following
bytestring:
"\"\\159\\DEL*\\150\\222/\\129vr\\205\\241=mA\\192\\184\""
Imagine that the bytestring is used as a cryptographic key, which
is now a wrong key due to the invalid characters in the sequence.
This problem actually arises from the wrong database representation
(varchar instead of bytea) because it's otherwise considered as an invalid utf-8 sequence.
I have tried to replace the invalid characters using some sort of
split-modify-concat approach, but all I get is something without
any backslash inside the sequence, because I can't insert a single backslash
into a bytestring.
I really ask for your help.

Perhaps using read will work for you:
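import qualified Data.ByteString.Char8 as BS
-- The stored value looks like the output of show: quoted, with escaped backslashes.
bad = BS.pack "\"\\159\\DEL*\\150\\222/\\129vr\\205\\241=mA\\192\\184\""
-- read parses the quoted, escaped text back into the original bytes.
good = read (BS.unpack bad) :: BS.ByteString
-- returns: "\159\DEL*\150\222/\129vr\205\241=mA\192\184"
You can also use readMaybe (from Text.Read) instead for safer parsing.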
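
To make visible what read undoes here, the following is a rough Python sketch of the unescaping step. It is deliberately partial and is not Haskell's actual parser: it only handles decimal escapes such as \159, the names DEL and NUL, and a few single-character escapes, and it ignores the rest of Haskell's escape syntax (for example \& and the other ASCII control names). The function name is hypothetical.

```python
import re

# Small subset of Haskell string escapes, enough for the example in the question.
_NAMED = {"DEL": 127, "NUL": 0, "n": 10, "t": 9, "r": 13, "\\": 92, '"': 34}
_ESCAPE = re.compile(r'\\(\d+|DEL|NUL|[ntr\\"])')


def unescape_show(text):
    """Turn a show-style quoted string into the bytes it denotes."""
    # Drop the surrounding double quotes, if present.
    if len(text) >= 2 and text[0] == text[-1] == '"':
        text = text[1:-1]
    out = bytearray()
    pos = 0
    for match in _ESCAPE.finditer(text):
        # Copy the literal characters before this escape unchanged.
        out += text[pos:match.start()].encode("latin-1")
        token = match.group(1)
        out.append(int(token) if token.isdigit() else _NAMED[token])
        pos = match.end()
    out += text[pos:].encode("latin-1")
    return bytes(out)


stored = r'"\159\DEL*\150\222/\129vr\205\241=mA\192\184"'
key = unescape_show(stored)
print(len(key))  # 16
```

The result is the 16 raw bytes of the original key, with \159 as a single byte of value 159 rather than the four characters backslash, 1, 5, 9.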

Possibly you want the PostgreSQL expression:
substring(ByteString from e'^\\"(.*)\\"$')::bytea
that will give a bytea result that can be used in queries or in an alter table-using DDL

Perl autoincrement of string not working as before

I have some code where I am converting some data elements in a flat file. I save the old:new values to a hash which is written to a file at the end of processing. On subsequent executions, I reload the file into a hash so I can reuse previously converted values on additional data files. I also save the last conversion value so that if I encounter an unconverted value, I can assign it a new converted value and add it to the hash.
I had used this code before (back in Feb) on six files with no issues. I have a variable that is set to ZCKL0 (last character is a zero) which is retrieved from a file holding the last used value. I apply the increment operator
...
$data{$olddata} = ++$dataseed;
...
and the resultant value in $dataseed is 1 instead of ZCKL1. The original starting seed value was ZAAA0.
What am I missing here?

Do you use the $dataseed variable in a numeric context in your code?
From perlop:
If you increment a variable that is numeric, or that has ever been used in a numeric context, you get a normal increment. If, however, the variable has been used in only string contexts since it was set, and has a value that is not the empty string and matches the pattern /^[a-zA-Z]*[0-9]*\z/ , the increment is done as a string, preserving each character within its range.
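
To make the string-increment rule concrete, here is a Python sketch of the behavior perlop describes: each character is incremented within its own range (a-z, A-Z, 0-9), with carry to the left. It models the documented rule only, not Perl's handling of variables that have been used numerically, and the function name is hypothetical.

```python
import re


def magic_increment(value):
    """Increment a string the way Perl's ++ does for /^[a-zA-Z]*[0-9]*\\z/ values."""
    if value == "" or not re.fullmatch(r"[a-zA-Z]*[0-9]*", value):
        raise ValueError("value does not qualify for string increment")
    chars = list(value)
    for i in range(len(chars) - 1, -1, -1):
        c = chars[i]
        if c == "z":
            chars[i] = "a"   # wrap and carry to the left
        elif c == "Z":
            chars[i] = "A"
        elif c == "9":
            chars[i] = "0"
        else:
            chars[i] = chr(ord(c) + 1)
            return "".join(chars)
    # Every position wrapped: extend with the first value of the leading range.
    first = value[0]
    prefix = "1" if first.isdigit() else ("a" if first.islower() else "A")
    return prefix + "".join(chars)


print(magic_increment("ZCKL0"))  # ZCKL1
print(magic_increment("ZCKL9"))  # ZCKM0
```

Once the seed has been used as a number, Perl no longer applies this rule and ZCKL0 is treated as 0, which is how the result 1 arises.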

As previously mentioned, ++ on strings is "magic" in that it operates differently based on the content of the string and the context in which the string is used.
To illustrate the problem and assuming:
my $s='ZCL0';
then
print ++$s;
will print:
ZCL1
while
$s+=0; print ++$s;
prints
1
NB: In other popular programming languages, the ++ is legal for numeric values only.
Using non-intuitive, "magic" features of Perl is discouraged as they lead to confusing and possibly unsupportable code.

You can write this almost as succinctly without relying on the magic ++ behavior:
s/(\d+)$/ $1 + 1 /e
The e flag makes Perl evaluate the replacement as an expression, so the trailing digits are replaced by their value plus one.
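
For comparison, the same trailing-number approach can be sketched in Python with re.sub and a replacement function. The function name is hypothetical. Note that this approach treats the trailing digits as a plain number, so a seed ending in 9 grows by a digit instead of carrying into the letters.

```python
import re


def bump_trailing_number(seed):
    """Add 1 to the run of digits at the end of `seed`, leaving the prefix alone."""
    return re.sub(
        r"(\d+)$",
        lambda match: str(int(match.group(1)) + 1),
        seed,
        count=1,
    )


print(bump_trailing_number("ZCKL0"))  # ZCKL1
print(bump_trailing_number("ZCKL9"))  # ZCKL10
```

A seed without trailing digits is returned unchanged, and leading zeros in the numeric part are not preserved.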
