How can the PBKDF2 function be done in PostgreSQL? There does not appear to be a native implementation.
Moved Answer out of Question to adhere to Stack Overflow guidelines. See original revision of post.
Original post (Revision link)
Not able to find it natively, and based on PHP code found on the 'net, I came up with this PBKDF2 function for PostgreSQL. Enjoy.
create or replace function PBKDF2
(salt bytea, pw text, count integer, desired_length integer, algorithm text)
returns bytea
immutable
language plpgsql
as $$
declare
hash_length integer;
block_count integer;
output bytea;
the_last bytea;
xorsum bytea;
i_as_int32 bytea;
i integer;
j integer;
k integer;
begin
algorithm := lower(algorithm);
case algorithm
when 'md5' then
hash_length := 16;
when 'sha1' then
hash_length = 20;
when 'sha256' then
hash_length = 32;
when 'sha512' then
hash_length = 64;
else
raise exception 'Unknown algorithm "%"', algorithm;
end case;
block_count := ceil(desired_length::real / hash_length::real);
for i in 1 .. block_count loop
i_as_int32 := E'\\000\\000\\000'::bytea || chr(i)::bytea;
i_as_int32 := substring(i_as_int32, length(i_as_int32) - 3);
the_last := salt::bytea || i_as_int32;
xorsum := HMAC(the_last, pw::bytea, algorithm);
the_last := xorsum;
for j in 2 .. count loop
the_last := HMAC(the_last, pw::bytea, algorithm);
--
-- xor the two
--
for k in 1 .. length(xorsum) loop
xorsum := set_byte(xorsum, k - 1, get_byte(xorsum, k - 1) # get_byte(the_last, k - 1));
end loop;
end loop;
if output is null then
output := xorsum;
else
output := output || xorsum;
end if;
end loop;
return substring(output from 1 for desired_length);
end $$;
I've tested against other implementations without deviation, but be sure to test it yourself.
Skip32 encrypts or decrypts a single int4 (32 bits) value with a 10 bytes (80 bits) key of bytea type.
Is it possible to limit the output to 100,000,000 - 999,999,999 so we always get 9 digits positive integers?
There is pseudo encrypt constrained to an arbitrary range example in the wiki but it doesn't work with a key and only support setting the upper limit.
Do I need to modify the following lines in the original implementation?
...
wl := (val & -65536) >> 16;
wr := val & 65535;
...
RETURN (wr << 16) | (wl & 65535);
...
and wrap in a loop that ensures the result is between the min,max range?
CREATE FUNCTION bounded_skip32(VALUE int4, CR_KEY bytea, ENCRYPT bool, MIN int, MAX int) returns int AS $$
BEGIN
LOOP
VALUE := skip32(VALUE, CR_KEY, ENCRYPT);
EXIT WHEN VALUE <= MAX AND VALUE >= MIN;
END LOOP;
RETURN VALUE;
END
$$ LANGUAGE plpgsql strict immutable;
CREATE OR REPLACE FUNCTION array_replace(INT[]) RETURNS float[] AS $$
DECLARE
arrFloats ALIAS FOR $1;
J int=0;
x int[]=ARRAY[2,4];
-- xx float[]=ARRAY[2.22,4.33];
b float=2.22;
c float=3.33;
retVal float[];
BEGIN
FOR I IN array_lower(arrFloats, 1)..array_upper(arrFloats, 1) LOOP
FOR K IN array_lower(x, 1)..array_upper(x, 1) LOOP
IF (arrFloats[I]= x[K])THEN
retVal[j] :=b;
j:=j+1;
retVal[j] :=c;
j:=j+1;
ELSE
retVal[j] := arrFloats[I];
j:=j+1;
END IF;
END LOOP;
END LOOP;
RETURN retVal;
END;
$$ LANGUAGE plpgsql STABLE RETURNS NULL ON NULL INPUT;
When I run this query
SELECT array_replace(array[1,20,2,5]);
it give me output like this
"[0:8]={1,1,20,20,2.22,3.33,2,5,5}"
Now I do not know why it is coming this duplicate values. I mean it is straight away a nested loop ...
I need a output like this one
"[0:8]={1,20,2.22,3.33,5}"
You have a double loop with the x array having two elements. On every iteration you push elements onto the result array, hence you get twice as many values.
If I understand you logic correctly, you want to scan the input array for values of another array in that same order. If the same, then replace these values with another array, leaving other values intact. There are no built-in functions to help you here, so you have to do this from scratch:
CREATE FUNCTION array_replace(arrFloats float[]) RETURNS float[] AS $$
DECLARE
searchArr float[] := ARRAY[1.,20.];
replaceArr float[] := ARRAY[1.11,1.,111.,20.2,20.222];
retVal float[];
i int;
ndx int;
len int;
upp int;
low int
BEGIN
low := array_lower(searchArr, 1)
upp := array_upper(searchArr, 1);
len := upp - low + 1;
i := array_lower(arrFloats, 1);
WHILE i <= array_upper(arrFloats, 1) LOOP -- Use WHILE LOOP so can update i
ndx := i; -- index into arrFloats for inner loop
FOR j IN low .. upp LOOP
IF arrFloats[ndx] != searchArr[j] THEN
-- No match so put current element of arrFloats in the result and update i
retVal := retVal || arrFloats[i];
i := i + 1;
EXIT; -- No need to look further, break out of inner loop
END IF;
ndx := ndx + 1;
IF j = upp THEN
-- We have a match so append the replaceArr to retVal and
-- increase i by length of search_array
retVal := retVal || replaceArr;
i := i + len;
END IF;
END LOOP;
END LOOP;
RETURN retVal;
END;
$$ LANGUAGE plpgsql STABLE STRICT;
This function would become much more flexible if you made searchArr and replaceArr into parameters as well.
Test
patrick#puny:~$ psql -d test
psql (9.5.0, server 9.4.5)
Type "help" for help.
test=# select array_replace(array[1,20,2,5]);
array_replace
------------------------------
{1.11,1,111,20.2,20.222,2,5}
(1 row)
test=# select array_replace(array[1,20,2,5,1,20.1,1,20]);
array_replace
------------------------------------------------------------
{1.11,1,111,20.2,20.222,2,5,1,20.1,1.11,1,111,20.2,20.222}
(1 row)
As you can see it works for multiple occurrences of the search array.
Because I can't test this easily with billions of table insertions, I want to get help on figuring out how to use the pseudo_encrypt (https://wiki.postgresql.org/wiki/Pseudo_encrypt) function for my table ids that already have sequential ids in them. For example, our users table has approx 10,000 users. Ids go from 1..10,000.
Now I want to use the pseudo_encrypt function to get the 10,001 ID which would look something like this: 1064621387932509969
The problem is that there is a chance that the "random" pseudo encrypt return value may collide at one point with my early 1-10,000 user IDs.
I do not want to change the first 10,000 user IDs as that would cause some pain for the current users (have to re-login again, urls broken, etc.).
My idea was to use some sort of recursive function to handle this... would something like this work, or am I mission something?
CREATE OR REPLACE FUNCTION "pseudo_encrypt"("VALUE" int) RETURNS int IMMUTABLE STRICT AS $function_pseudo_encrypt$
DECLARE
l1 int;
l2 int;
r1 int;
r2 int;
return_value int;
i int:=0;
BEGIN
l1:= ("VALUE" >> 16) & 65535;
r1:= "VALUE" & 65535;
WHILE i < 3 LOOP
l2 := r1;
r2 := l1 # ((((1366.0 * r1 + 150889) % 714025) / 714025.0) * 32767)::int;
r1 := l2;
l1 := r2;
i := i + 1;
END LOOP;
return_value = ((l1::int << 16) + r1); // NEW CODE
// NEW CODE - RECURSIVELY LOOP UNTIL VALUE OVER 10,000
WHILE return_value <= 10000
return_value = pseudo_encrypt(nextval('SEQUENCE_NAME'))
END LOOP
RETURN return_value;
END;
$function_pseudo_encrypt$ LANGUAGE plpgsql;
Great comments! This seems to do the job:
CREATE OR REPLACE FUNCTION get_next_user_id() returns int AS $$
DECLARE
return_value int:=0;
BEGIN
WHILE return_value < 10000 LOOP
return_value := pseudo_encrypt(nextval('test_id_seq')::int);
END LOOP;
RETURN return_value ;
END;
$$ LANGUAGE plpgsql strict immutable;
I updated a bunch of fields in my db using this Feistel Cipher. According to the documentation, the cipher can be undone to get the original value. How can I undo the values if need be?
Here is the original cipher function:
CREATE OR REPLACE FUNCTION pseudo_encrypt(VALUE int) returns bigint AS $$
DECLARE
l1 int;
l2 int;
r1 int;
r2 int;
i int:=0;
BEGIN
l1:= (VALUE >> 16) & 65535;
r1:= VALUE & 65535;
WHILE i < 3 LOOP
l2 := r1;
r2 := l1 # ((((1366.0 * r1 + 150889) % 714025) / 714025.0) * 32767)::int;
l1 := l2;
r1 := r2;
i := i + 1;
END LOOP;
RETURN ((l1::bigint << 16) + r1);
END;
$$ LANGUAGE plpgsql strict immutable;
You could use this self-reversible variant in the first place:
CREATE FUNCTION rev_pseudo_encrypt(VALUE bigint) returns bigint AS $$
DECLARE
l1 int;
l2 int;
r1 int;
r2 int;
i int:=0;
BEGIN
l1:= (VALUE >> 16) & 65535;
r1:= VALUE & 65535;
WHILE i < 3 LOOP
l2 := r1;
r2 := l1 # ((((1366.0 * r1 + 150889) % 714025) / 714025.0) * 32767)::int;
l1 := l2;
r1 := r2;
i := i + 1;
END LOOP;
RETURN ((r1::bigint<<16) + l1);
END;
$$ LANGUAGE plpgsql strict immutable;
It differs from the original version by taking bigint instead of int as input (but the input should still less than 2^32), and by the fact that the two 16 bits blocks are swapped in the 32 bits final result.
It has the property that rev_pseudo_encrypt(rev_pseudo_encrypt(x)) = x in addition to the uniquess of results.
Also there's the advantage that the input type is the same as the output type.
On the other hand, to reverse the values generated by the original version, their 16 bits blocks need to be swapped before being fed to the algorithm, and the result again swapped:
create function swap16(bigint) returns bigint as
'select (($1&65535)<<16)+(($1)>>16)'
language sql stable;
select pseudo_encrypt(1234);
pseudo_encrypt
----------------
223549288
select swap16(pseudo_encrypt(swap16(223549288)::int));
swap16
--------
1234