Convert the MD5 output into 32 bit integer in Redshift

I have tried the following in Redshift
SELECT STRTOL(MD5('345793260804895811'), 10);
but I got the following DBCException:
SQL Error [22023]: ERROR: The input cf82576a6dbf9ff63cf9828f990f0673 is not valid to be converted to base 10
org.postgresql.util.PSQLException: PSQLException: ERROR: The input cf82576a6dbf9ff63cf9828f990f0673 is not valid to be converted to base 10
How may I get this done in Redshift?

You have two problems:
First, you need to specify the conversion as base 16, because an MD5 string is hexadecimal.
Second, a full MD5 string will massively overflow a 64-bit BIGINT value.
This works nicely:
SELECT STRTOL(LEFT(MD5('345793260804895811'),15), 16);
It shortens the MD5 hex value to its 15 leftmost characters (60 bits) and converts that to a BIGINT using base 16.
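
To see why a 15-character prefix is safe, here is a minimal Python sketch of the same arithmetic. It is not Redshift code; the helper name md5_prefix_to_int is made up for illustration, and the 7-digit variant is only an assumption about what a 32-bit result could look like.

```python
import hashlib


def md5_prefix_to_int(text: str, hex_digits: int = 15) -> int:
    """Return the leftmost `hex_digits` of MD5(text) parsed as a base-16 integer."""
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()  # 32 hex characters
    return int(digest[:hex_digits], 16)


# 15 hex digits carry 60 bits, which fits a signed 64-bit BIGINT.
as_bigint = md5_prefix_to_int("345793260804895811")
assert 0 <= as_bigint < 2**60

# 7 hex digits carry 28 bits, which fits a signed 32-bit integer.
as_int32 = md5_prefix_to_int("345793260804895811", hex_digits=7)
assert 0 <= as_int32 < 2**31
```

Keeping fewer digits means more collisions, so the prefix length is a trade-off between column width and uniqueness.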

I came up with this to store an MD5 in two BIGINT fields instead of CHAR(32), which halves the storage (16 bytes instead of 32):
select
-- each half is 16 hex digits, i.e. an unsigned 64-bit value;
-- subtracting 2^63 shifts it into the signed BIGINT range
convert(bigint,
strtol(substring(hash,1,8),16) * 4294967296.0 +
strtol(substring(hash,9,8),16) - 9223372036854775808
) as hash_part1
,convert(bigint,
strtol(substring(hash,17,8),16) * 4294967296.0 +
strtol(substring(hash,25,8),16) - 9223372036854775808
) as hash_part2
Hope it helps someone.
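
The split into two signed 64-bit halves can be checked outside the database. The following is a rough Python sketch, not the author's code: the function names are hypothetical, and it assumes an offset of 2^63 so that every possible 16-digit half lands inside the signed BIGINT range and the original hex string can be rebuilt.

```python
from typing import Tuple

OFFSET = 2**63  # assumed shift from unsigned 64-bit to signed 64-bit range


def split_md5_hex(hex_digest: str) -> Tuple[int, int]:
    """Split a 32-character hex digest into two signed 64-bit integers."""
    if len(hex_digest) != 32:
        raise ValueError("expected 32 hex characters")
    part1 = int(hex_digest[:16], 16) - OFFSET
    part2 = int(hex_digest[16:], 16) - OFFSET
    return part1, part2


def join_md5_halves(part1: int, part2: int) -> str:
    """Rebuild the 32-character hex digest from the two signed halves."""
    return f"{part1 + OFFSET:016x}{part2 + OFFSET:016x}"


digest = "cf82576a6dbf9ff63cf9828f990f0673"  # the hash quoted in the question
halves = split_md5_hex(digest)
assert join_md5_halves(*halves) == digest
```

Because the mapping is reversible, nothing is lost compared with CHAR(32); only the representation changes.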

The result of MD5 is 128 bits long, so it cannot fit into a 32-bit integer.

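You can convert using base 16 instead of 10, but only on a prefix of the hash, because the full 32-digit value overflows BIGINT. Eight hex digits are exactly 32 bits:
SELECT STRTOL(LEFT(MD5('345793260804895811'), 8), 16);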

Related

How to extract numbers appearing after decimal from a money format column in tsql?

I have an amount column in money format. I have tried parsename, and I have tried converting the column to varchar so I could use substring, but I am unable to extract the exact digits that appear after the decimal point. Attaching the screenshot for reference.
select home_currency_amount,
cast(home_currency_amount as varchar(50)) as amt_varchar,
parsename(home_currency_amount, 1) as amt_prsnm
from #temptbl;
---Below is the output:
home_currency_amount  amt_varchar  amt_prsnm
39396.855             39396.86     86
1112.465              1112.47      47
5635.1824             5635.18      18
E.g. if value is 39396.855, desired output would be 855.
Thanks in advance for the help.

First take the value modulo 1 to get the decimal part. That gives a result of the form '0.decimal_part'. Since we only want the digits after the point, cast it to varchar, remove the leading '0.' with replace, and finally cast the remainder to an integer. Hope it helps in your case.
select cast(replace(cast( 1.23 % 1 as varchar),'0.','') as int)
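
The same modulo-then-strip idea, sketched in Python with Decimal rather than T-SQL. The function name fractional_digits is hypothetical. The sketch returns the digits as text, because a final cast to an integer would drop leading zeros (a fraction of .05 would become 5).

```python
from decimal import Decimal


def fractional_digits(amount: str) -> str:
    """Return the digits after the decimal point, keeping leading zeros."""
    fraction = Decimal(amount) % 1           # e.g. Decimal('0.855')
    text = format(abs(fraction), "f")        # plain notation, e.g. '0.855'
    return text.split(".", 1)[1] if "." in text else ""


assert fractional_digits("39396.855") == "855"
assert fractional_digits("1.05") == "05"     # int("05") would give 5
```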

How to convert bit type to bytea type in Postgresql

I want to convert bit type into bytea in postgresql.
Like this.
select (b'1010110011001100' & b'1011000011110000')::bytea;
However, an error occurred:
ERROR: cannot cast type bit to bytea
LINE 1: select (b'1010110011001100' & b'1011000011110000')::bytea;
I just wanted to do an operation on bit strings and convert to bytea type.

Convert the bit value to hex and use decode():
select decode(to_hex((b'1010110011001100' & b'1011000011110000')::int), 'hex')
decode
--------
\xa0c0
(1 row)

select decode((b'1010110011001100' & b'1011000011110000')::text,'escape')
or
select ((b'1010110011001100' & b'1011000011110000')::text)::bytea

How to truncate a 2's complement output

I have data written into short data type. The data written is of 2's complement form.
Now when I try to print the data using %04x, the data with MSB=0 is printed fine for eg if data=740, print I get is 0740
But when the MSB=1, I am unable to get a proper print. For eg if data=842, print I get is fffff842
I want the data truncated to 4 bytes so expected output is f842

Either declare your data as a type which is 16 bits long, or make sure the printing function uses the right format for 16 bits value. Or use your current type, but do a bitwise AND with 0xffff. What you can do depends on the language you're doing it in really.
But whichever way you go, check your assumptions again. There seem to be a few issues in your question:
2s-complement applies to signed numbers only. There are no negative numbers in your question.
Assuming you mean C's short - it doesn't have to be 16 bits long.
"I get is fffff842 I want the data truncated to 4 bytes" - fffff842 is 4 bytes long. f842 is 2 bytes long.
2-bytes long value 842 does not have the MSB set.

I'm assuming C (or possibly C++) as the language here.
Because of the default argument promotions involved when calling a variable argument function (such as printf), your use of a short will result in an integer promotion, which states that "If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int".
A short is converted to an int by means of sign-extension, and 0xf842 sign-extended to 32 bits is 0xfffff842.
You can use a bitwise AND to mask off the most significant word:
printf("%04x", data & 0xffff);
You could also add the h length specifier to state that you only want to print an (unsigned) short worth of bits from an int:
printf("%04hx", data);
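
The effect of the mask can be reproduced with plain integer arithmetic. This is a small Python sketch rather than C, with a hypothetical helper name; it treats -1982 as the signed 16-bit value whose bit pattern is 0xf842.

```python
def to_hex16(value: int) -> str:
    """Format the low 16 bits of a (possibly negative) integer as 4 hex digits."""
    return format(value & 0xFFFF, "04x")


negative_short = 0xF842 - 0x10000   # -1982, the signed reading of 0xf842

# Widening to 32 bits keeps the sign, which is where the extra f's come from.
assert format(negative_short & 0xFFFFFFFF, "08x") == "fffff842"

# Masking to 16 bits restores the expected four digits.
assert to_hex16(negative_short) == "f842"
assert to_hex16(0x740) == "0740"
```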

decode hex in PostgreSQL - got error "odd number of digits"

I have a problem using this query:
select decode(to_hex(ascii('ل')::int),'hex')
When I execute it, I get:
ERROR: invalid hexadecimal data: odd number of digits

decode(..., 'hex') doesn't mean convert this hexadecimal number to something. Hex encoding is a particular encoding format for bytes, and it requires two hexadecimal digits per octet. On the other hand, to_hex converts an integer to a hexadecimal representation, and that could have an even or odd number of digits.
So the answer is, you can't do that (without some manual fixups). And it's not clear why you would want to, either. It looks like you could just do 'ل'::bytea, but that might not be what you wanted either.
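
The digit-count mismatch is easy to reproduce outside PostgreSQL. Here is a Python sketch, with a hypothetical helper that shows the kind of manual fixup mentioned above (left-padding to an even number of digits). Note that the padded result holds the code point's bytes, which differs from the character's UTF-8 encoding.

```python
def hex_to_bytes_padded(digits: str) -> bytes:
    """Left-pad to an even number of hex digits, then decode two digits per byte."""
    if len(digits) % 2:
        digits = "0" + digits
    return bytes.fromhex(digits)


lam = "\u0644"                       # the Arabic letter from the question
digits = format(ord(lam), "x")       # code point 1604 -> '644', three digits

try:
    bytes.fromhex(digits)            # odd number of digits is rejected
    odd_rejected = False
except ValueError:
    odd_rejected = True

assert odd_rejected
assert hex_to_bytes_padded(digits) == b"\x06\x44"
assert lam.encode("utf-8") == b"\xd9\x84"   # not the same bytes as above
```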

Maybe it's simpler to use something like this:
select encode('ل','escape');

bytea type & nulls, Postgres

I'm using a bytea type in PostgreSQL, which, to my understanding, contains just a series of bytes. However, I can't get it to play well with nulls. For example:
=# select length(E'aa\x00aa'::bytea);
length
--------
2
(1 row)
I was expecting 5. Also:
=# select md5(E'aa\x00aa'::bytea);
md5
----------------------------------
4124bc0a9335c27f086f24ba207a4912
(1 row)
That's the MD5 of "aa", not "aa\x00aa". Clearly, I'm Doing It Wrong, but I don't know what I'm doing wrong. I'm also on an older version of Postgres (8.1.11) for reasons outside of my control. (I'll see if this behaves the same on the latest Postgres as soon as I get home...)

Try this:
# select length(E'aa\\000aa'::bytea);
length
--------
5
Updated: Why the original didn't work? First, understand the difference between one slash and two:
pg=# select E'aa\055aa', length(E'aa\055aa') ;
?column? | length
----------+--------
aa-aa | 5
(1 row)
pg=# select E'aa\\055aa', length(E'aa\\055aa') ;
?column? | length
----------+--------
aa\055aa | 8
In the first case, I'm writing a literal string with four unescaped characters ('a') and one escaped character. The backslash is consumed by the parser in a first pass, which converts the full \055 to a single char ('-' in this case).
In the second case, the first backslash just escapes the second; the pair \\ is translated by the parser to a single \, and the 055 is seen as three characters.
Now, when converting a text to a bytea, escape characters (in an already parsed or produced text) are parsed/interpreted again! (Yes, this is confusing).
So, when I write
select E'aa\000aa'::bytea;
in the first parsing, the literal E'aa\000aa' is converted to an internal text with a null character in the third position (and depending on your postgresql version, the null character is interpreted as an EOS, and the text is assumed to be of length two - or in other versions an illegal string error is thrown).
Instead, when I write
select E'aa\\000aa'::bytea;
in the first parsing, the literal string "aa\000aa" (eight characters) is seen and assigned to a text; then, in the cast to bytea, it is parsed again, and the character sequence '\000' is interpreted as a null byte.
IMO postgresql kind of sucks here.
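
The second parsing pass described above can be mimicked in a few lines of Python. This is only a simulation under simplifying assumptions (it handles \\ and three-digit octal escapes and nothing else), not PostgreSQL's actual input routine, and the function name is made up. Python raw strings stand in for the text value left after the first pass.

```python
import re

# A backslash followed by either another backslash or three octal digits.
_ESCAPE = re.compile(rb"\\(\\|[0-3][0-7]{2})")


def decode_bytea_escape(text: str) -> bytes:
    """Simulate the text-to-bytea pass: turn \\ooo and \\\\ into single bytes."""
    def replace(match):
        token = match.group(1)
        return b"\\" if token == b"\\" else bytes([int(token, 8)])
    return _ESCAPE.sub(replace, text.encode("utf-8"))


after_first_pass = r"aa\000aa"               # eight characters of text
assert len(after_first_pass) == 8
assert decode_bytea_escape(after_first_pass) == b"aa\x00aa"
assert len(decode_bytea_escape(after_first_pass)) == 5
```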

You can use regular strings or dollar-quoted strings instead of escaped strings:
# select length('aa\000aa'::bytea);
length
════════
5
(1 row)
# select length($$aa\000aa$$::bytea);
length
════════
5
(1 row)
I think that dollar-quoted strings are a better option because, if the configuration parameter standard_conforming_strings is off, then PostgreSQL recognizes backslash escapes in both regular and escape string constants. However, as of PostgreSQL 9.1, the default is on, meaning that backslash escapes are recognized only in escape string constants.
