Which method does Postgres round(v numeric, s int) use?
Round half up
Round half down
Round half away from zero
Round half towards zero
Round half to even
Round half to odd
I'm looking for a documentation reference.
It's not documented, so it can change.
Here is my round_half_even(numeric, integer):
create or replace function round_half_even(val numeric, prec integer)
returns numeric
as $$
declare
retval numeric;
difference numeric;
even boolean;
begin
retval := round(val,prec);
difference := retval-val;
if abs(difference)*(10::numeric^prec) = 0.5::numeric then
even := (retval * (10::numeric^prec)) % 2::numeric = 0::numeric;
if not even then
retval := round(val-difference,prec);
end if;
end if;
return retval;
end;
$$ language plpgsql immutable strict;
And round_half_odd(numeric,integer):
create or replace function round_half_odd(val numeric, prec integer)
returns numeric
as $$
declare
retval numeric;
difference numeric;
even boolean;
begin
retval := round(val,prec);
difference := retval-val;
if abs(difference)*(10::numeric^prec) = 0.5::numeric then
even := (retval * (10::numeric^prec)) % 2::numeric = 0::numeric;
if even then
retval := round(val-difference,prec);
end if;
end if;
return retval;
end;
$$ language plpgsql immutable strict;
They manage about 500000 invocations per second, 6 times slower than a standard round(numeric,integer). They also work for zero and for negative precision.
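The tie-fixing idea behind both functions can be sketched outside the database. This is only an illustration in Python, assuming the decimal module as a stand-in for numeric; the name round_half_odd_sketch is made up here, and ROUND_HALF_UP in that module breaks ties away from zero, which is what the functions above rely on round() doing.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_odd_sketch(val: Decimal, prec: int) -> Decimal:
    """Round half to odd by rounding half away from zero, then fixing up exact ties."""
    quantum = Decimal(1).scaleb(-prec)  # 10^-prec, also valid for zero or negative prec
    retval = val.quantize(quantum, rounding=ROUND_HALF_UP)
    difference = retval - val
    if abs(difference) * 2 == quantum:  # val was exactly halfway between two candidates
        if (retval / quantum) % 2 == 0:  # last kept digit is even, so take the other one
            retval = (val - difference).quantize(quantum, rounding=ROUND_HALF_UP)
    return retval

print(round_half_odd_sketch(Decimal("0.15"), 1))  # tie, resolved to the odd neighbour
print(round_half_odd_sketch(Decimal("0.25"), 1))  # tie, already odd after rounding away
```

Swapping the parity test (fix up when the last kept digit is odd) gives the half-even variant.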
Sorry, I don't see any hint of this in the documentation, but a look at the code indicates it's using round half away from zero; the carry is always added to digits, thereby increasing the absolute value of the variable, regardless of what its sign is. A simple experiment (psql 9.1) confirms this:
test=# CREATE TABLE nvals (v numeric(5,2));
CREATE TABLE
test=# INSERT INTO nvals (v) VALUES (-0.25), (-0.15), (-0.05), (0.05), (0.15), (0.25);
INSERT 0 6
test=# SELECT v, round(v, 1) FROM nvals;
v | round
-------+-------
-0.25 | -0.3
-0.15 | -0.2
-0.05 | -0.1
0.05 | 0.1
0.15 | 0.2
0.25 | 0.3
(6 rows)
Interesting, because round(v double precision) uses round half to even:
test=# create table vals (v double precision);
CREATE TABLE
test=# insert into vals (v) VALUES (-2.5), (-1.5), (-0.5), (0.5), (1.5), (2.5);
INSERT 0 6
test=# select v, round(v) from vals;
v | round
------+-------
-2.5 | -2
-1.5 | -2
-0.5 | -0
0.5 | 0
1.5 | 2
2.5 | 2
(6 rows)
The latter behavior is almost certainly platform-dependent, since it looks like it uses rint(3) under the hood.
You could always implement a different rounding scheme if necessary. See Tometzky's answer for examples.
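To compare the two tie-breaking rules side by side, here is a small Python sketch. It assumes Python's decimal module as a stand-in for numeric and Python's built-in round() on floats as a stand-in for the double precision case; neither is what Postgres runs internally, and round_decimal is a name invented for this example.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_HALF_EVEN

def round_decimal(value: str, places: int, mode: str) -> Decimal:
    """Round a decimal string to `places` fractional digits using `mode`."""
    quantum = Decimal(1).scaleb(-places)  # places=1 gives Decimal('0.1')
    return Decimal(value).quantize(quantum, rounding=mode)

# Same inputs as the numeric experiment: half away from zero vs. half to even.
for v in ("-0.25", "-0.15", "-0.05", "0.05", "0.15", "0.25"):
    away = round_decimal(v, 1, ROUND_HALF_UP)    # ties go away from zero
    even = round_decimal(v, 1, ROUND_HALF_EVEN)  # ties go to the even digit
    print(v, away, even)

# Same inputs as the double precision experiment; Python 3 rounds floats half to even.
print([round(x) for x in (-2.5, -1.5, -0.5, 0.5, 1.5, 2.5)])  # [-2, -2, 0, 0, 2, 2]
```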
I have a function written in PL/Python. It is a database function that runs in Python, which is permitted because of a procedural language installed via:
CREATE PROCEDURAL LANGUAGE 'plpythonu' HANDLER plpython_call_handler
(As an aside, I found a nice trick that lets non-admin users run it: register the language as trusted under a unique name. It has little to do with my question, but I'm sure some of you will wonder how I am doing this, so here it is.)
CREATE TRUSTED PROCEDURAL LANGUAGE 'plpythonu2' HANDLER plpython_call_handler
GRANT USAGE ON LANGUAGE plpythonu2 TO admin;
Now to the question at hand, my "hack" above works for me, but if I want to use Amazon's RDS service, I cannot install languages, and PL/Python is not available. SQL however, is.
Therefore, I need help translating the following function, written in Python into pure SQL.
CREATE OR REPLACE FUNCTION "public"."human_readable_bits" (
"b" bigint = 0
)
RETURNS varchar AS
$body$
import math
if b:
exponent = math.floor(math.log(b)/math.log(1024))
val = b/pow(1024, math.floor(exponent))
val = round(val*2)/2 # This rounds to the nearest HALF (X.5) B, Kb, Mb, Gb, etc.
return "%.2f %s" % (val, ('B','Kb','Mb','Gb','Tb','Pb','Eb','Zb','Yb')[int(exponent)])
else:
return "0 Gb"
$body$
LANGUAGE 'plpythonu2'
VOLATILE
RETURNS NULL ON NULL INPUT
SECURITY INVOKER
COST 100;
This function allows me to perform queries such as:
=> SELECT human_readable_bits(3285824466906);
human_readable_bits
---------------------
3.00 Tb
(1 row)
OR
=> SELECT human_readable_bits(5920466906);
human_readable_bits
---------------------
5.50 Gb
(1 row)
Also, as a side-note/secondary question, after I created the function, when I look at the DDL, it has a line in it that says "SECURITY INVOKER," does anyone know what that means/does?
A conversion to a plain PL/pgSQL function would be:
CREATE OR REPLACE FUNCTION public.human_readable_bits(b NUMERIC)
RETURNS VARCHAR AS
$BODY$
declare
exponent integer;
val float;
arr varchar[];
sz VARCHAR(10);
result varchar(20);
BEGIN
if b is null or b = 0 then
return '0 B';
end if;
if b < 1024 then
return b::varchar || ' Bits';
end if;
arr := ARRAY['B','Kb','Mb','Gb','Tb','Pb','Eb','Zb','Yb'];
exponent := floor( log(b) / log(1024));
val := b/power(1024,exponent);
val := round(val*2)/2;
sz := arr[exponent + 1]; -- Postgres arrays are 1-based, so exponent 0 maps to arr[1]
if strpos(val::varchar,'.') > 0 then
result := substr(val::varchar, 1, strpos(val::varchar,'.')-1);
result := result || '.' || rpad( substr(val::varchar, strpos(val::varchar,'.')+1), 2, '0' ) || ' ' || sz;
else
result := val::varchar || '.00 ' || sz;
end if;
return result;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
Calling the function like this:
select human_readable_bits(328582446690656456434534453) hrb0,
human_readable_bits(3285824466906) hrb1,
human_readable_bits(5920466906) hrb2,
human_readable_bits(1024) hrb3,
human_readable_bits(512) hrb4,
human_readable_bits(null) hrb5;
returns:
hrb0 hrb1 hrb2 hrb3 hrb4 hrb5
272.00 Yb 3.00 Tb 5.50 Gb 1.00 Kb 512 Bits 0 B
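As a cross-check of the unit labels expected in the question (3.00 Tb and 5.50 Gb), the same arithmetic can be run as ordinary Python 3. This is only a sketch: it picks the exponent from the integer bit length instead of floating-point logarithms (my substitution, to avoid rounding surprises at exact powers of 1024), clamps the exponent to the last unit, and the name human_readable_bits_sketch is hypothetical. Python 3's round() breaks ties to even, unlike Python 2, which only matters when val*2 lands exactly on .5.

```python
UNITS = ("B", "Kb", "Mb", "Gb", "Tb", "Pb", "Eb", "Zb", "Yb")

def human_readable_bits_sketch(b: int) -> str:
    """Scale b by powers of 1024 and round to the nearest half unit."""
    if not b:
        return "0 Gb"  # kept from the original PL/Python function
    # floor(log_1024(b)) via bit length: 1024 == 2**10 (assumption: b > 0)
    exponent = min((b.bit_length() - 1) // 10, len(UNITS) - 1)
    val = b / 1024 ** exponent
    val = round(val * 2) / 2  # nearest half: X.0 or X.5
    return "%.2f %s" % (val, UNITS[exponent])

print(human_readable_bits_sketch(3285824466906))  # 3.00 Tb
print(human_readable_bits_sketch(5920466906))     # 5.50 Gb
```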
As for your side question, the answer can be found in the CREATE FUNCTION documentation:
SECURITY INVOKER indicates that the function is to be executed with the privileges of the user that calls it. That is the default. SECURITY DEFINER specifies that the function is to be executed with the privileges of the user that created it.
I'm new to writing triggers and functions in Postgres.
I have written a function that changes prices to .99 or .00 whenever a price is put into the database.
CREATE OR REPLACE FUNCTION public.cents(price numeric)
RETURNS numeric
LANGUAGE plpgsql
LEAKPROOF
AS $function$
DECLARE
dollars text;
cents text;
new_price numeric;
BEGIN
dollars := split_part(price::text, '.', 1);
cents := split_part(price::text, '.', 2);
IF cents != '00' THEN cents := '99';
ENDIF;
new_price := dollars || '.' || cents;
RETURN new_price;
END;
$function$
I've read the doc on triggers and these examples seem more complex.
I'm not sure I understand how to create a trigger that will run this function specifically whenever a record in the price column is updated.
Does this look correct? The price column isn't mentioned in the trigger..
CREATE OR REPLACE FUNCTION public.cents()
RETURNS trigger
LANGUAGE plpgsql
LEAKPROOF
AS $tr_cents$
DECLARE
dollars text;
cents text;
new_price numeric;
BEGIN
dollars := split_part(OLD::text, '.', 1);
cents := split_part(OLD::text, '.', 2);
IF cents != '00' THEN cents := '99';
ENDIF;
new_price := dollars || cents;
RETURN new_price;
END;
$tr_cents$;
CREATE TRIGGER tr_cents BEFORE INSERT OR UPDATE ON microwaves
FOR EACH ROW EXECUTE PROCEDURE cents();
The examples in the docs also use RETURN NEW but I'm not exactly sure how that would work with my code, or if it's necessary?
Assuming missing information:
price is defined numeric NOT NULL.
There is a CHECK constraint enforcing positive prices.
Postgres 9.5. (Solution should work for Postgres 9.0+.)
I read your objective like this:
Leave numbers without (significant) fractional digits (.00) and change all others to .99.
See below about "without (significant) fractional digits" or .00 ...
If that's all the trigger does, the most efficient way is to place the condition in a WHEN clause to the trigger itself. The manual:
In a BEFORE trigger, the WHEN condition is evaluated just before the
function is or would be executed, so using WHEN is not materially
different from testing the same condition at the beginning of the trigger function.
(There is more, read the manual.)
This way, the trigger function is not even called if not needed. The logic can be radically simplified:
CREATE OR REPLACE FUNCTION tr_cents()
RETURNS trigger AS
$tr_cents$
BEGIN
-- only called WHEN NEW.price % 1 is neither .00 nor .99
NEW.price := trunc(NEW.price) + .99;
RETURN NEW;
END
$tr_cents$ LANGUAGE plpgsql LEAKPROOF;
CREATE TRIGGER microwaves_cents
BEFORE INSERT OR UPDATE ON microwaves
FOR EACH ROW
WHEN ((NEW.price % 1) <> ALL ('{.00,.99}'::numeric[]))
EXECUTE PROCEDURE tr_cents();
Note that the trigger kicks in for INSERT and UPDATE with illegal price values, not just "whenever a record in the price column is updated."
You need RETURN NEW; at the end of the trigger function or the operation on the row will be cancelled. The manual:
A trigger function must return either NULL or a record/row value having exactly the structure of the table the trigger was fired for.
You don't need the function public.cents() at all for this.
Test case
CREATE TABLE microwaves (m_id serial PRIMARY KEY, price numeric);
INSERT INTO microwaves (m_id, price) VALUES
(1, 0.00)
, (2, 0.01)
, (3, 0.02)
, (4, 0.99)
, (5, 1.00)
, (6, 1.01)
, (7, 1.02)
, (8, 1.99)
, (9, 12.34);
UPDATE microwaves SET price = 2.0 WHERE m_id = 7;
UPDATE microwaves SET price = 2.5 WHERE m_id = 8;
UPDATE microwaves SET price = 5.99 WHERE m_id = 9;
TABLE microwaves;
Result:
m_id | price
------+-------
1 | 0.00
2 | 0.99
3 | 0.99
4 | 0.99
5 | 1.00
6 | 1.99
7 | 2.0
8 | 2.99
9 | 5.99
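The rule the trigger enforces can be checked against the result table with a few lines of Python. This is a sketch under the same assumptions as stated earlier (positive, non-null prices); it uses Python's Decimal in place of numeric, and normalize_price is a name made up for the illustration.

```python
from decimal import Decimal, ROUND_DOWN

def normalize_price(price: Decimal) -> Decimal:
    """Leave prices ending in .00 or .99 alone; push everything else to .99."""
    whole = price.to_integral_value(rounding=ROUND_DOWN)  # like trunc(price)
    fraction = price - whole                               # like price % 1 for positive prices
    if fraction in (Decimal("0"), Decimal("0.99")):        # the WHEN condition is false
        return price
    return whole + Decimal("0.99")

for p in ("0.00", "0.01", "0.99", "1.00", "1.01", "2.0", "2.5", "12.34"):
    print(p, "->", normalize_price(Decimal(p)))
```

Like numeric, Decimal compares by value, so 2.0 passes the check untouched and keeps its scale.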
Data type numeric and scale
.. and why your function public.cents(price numeric) is a trap.
Scale being the number of decimal fractional digits.
numeric is an arbitrary precision type. It preserves literal digits exactly as entered - unless you specify precision and scale for the type. Like: numeric(10,2). The manual:
Specifying:
NUMERIC
without any precision or scale creates a column in which numeric
values of any precision and scale can be stored, up to the
implementation limit on precision. A column of this kind (numeric
without precision and scale) will not coerce input values to any
particular scale, whereas numeric columns with a declared scale will
coerce input values to that scale.
Leading zeroes are never stored, but trailing zeroes in the fractional part are kept this way, even if insignificant. The manual can easily be misread in this respect, further down:
Numeric values are physically stored without any extra leading or trailing zeroes.
Note the word "extra". Meaning, Postgres will not add trailing zeros, but it will keep the ones you added - even if those are completely insignificant for the numeric value.
You need to be aware of this when converting numeric to text. A check for "00" in the fractional part will work for numeric with a configured scale like numeric (9,2). But it is unreliable for plain numeric like you use in your function. Consider:
SELECT (numeric(9,2) '1')::numeric AS num_cast_from_num_with_scale
, numeric '1.00' AS num_with_scale
, numeric '1' AS num_without_scale;
num_cast_from_num_with_scale | num_with_scale | num_without_scale
------------------------------+----------------+-------------------
1.00 | 1.00 | 1
This way, insignificant trailing zeros become significant. And I seriously doubt that's how it's supposed to be. The test IF cents != '00' ... in your function public.cents(price numeric) is a loaded footgun. It may work as expected while you pass values from a numeric(9,2) column, but "suddenly" break once you use values from other sources.
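Python's Decimal also preserves trailing zeros exactly as entered, so it can serve as a rough analogy for why the text comparison is fragile while a numeric comparison is not. The two helper names are invented for this sketch; the textual one mimics the split_part() logic of the function in the question.

```python
from decimal import Decimal

def has_cents_textual(price: Decimal) -> bool:
    """Mimics the text test: fractional part as a string compared with '00'."""
    parts = str(price).split(".")
    cents = parts[1] if len(parts) > 1 else ""  # split_part() yields '' when there is no '.'
    return cents != "00"

def has_cents_numeric(price: Decimal) -> bool:
    """Value-based test: independent of how many trailing zeros were entered."""
    return price % 1 != 0

for literal in ("1.00", "1.0", "1", "1.50"):
    p = Decimal(literal)
    print(literal, has_cents_textual(p), has_cents_numeric(p))
```

Only the last value really has a fractional part, yet the textual test also flags "1.0" and "1".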
You declared the return type as numeric, but you actually return a string. The repeated type conversions are not a good idea either. There is an easier way, for example:
CREATE OR REPLACE FUNCTION cents(price numeric) RETURNS numeric AS
$BODY$
begin
IF price IS NOT NULL then
IF price % 1 != 0 then
price := floor(price) + 0.99;
end IF;
END IF;
RETURN price;
end;
$BODY$ LANGUAGE plpgsql;
To apply this on every insert or update, you need a trigger:
CREATE TABLE test (
price numeric
);
CREATE FUNCTION price_update() RETURNS trigger AS $price_update$
BEGIN
IF NEW.price IS NOT NULL THEN
NEW.price = public.cents(NEW.price);
END IF;
RETURN NEW;
END;
$price_update$ LANGUAGE plpgsql;
CREATE TRIGGER on_price_update BEFORE INSERT OR UPDATE ON test
FOR EACH ROW EXECUTE PROCEDURE price_update();
Let's check:
=# insert into test (price) values (2), (1.1), (5);
INSERT 0 3
=# select * from test;
price
-------
2
1.99
5
(3 rows)
=# update test set price = 5.01 where price = 5;
UPDATE 1
=# select * from test;
price
-------
2
1.99
5.99
(3 rows)
=# update test set price = 3 where price = 1.99;
UPDATE 1
=# select * from test;
price
-------
2
5.99
3
(3 rows)
I have a table with a column of type real, with example values:
123456,12
0,12345678
And this code in a stored procedure:
CREATE OR REPLACE FUNCTION test3()
RETURNS integer AS
$BODY$
DECLARE
rec RECORD;
BEGIN
FOR rec IN
SELECT
gme.abs_km as km,
CAST(gme.abs_km as numeric) as cast,
round(gme.abs_km:: numeric(16,2), 2) as round
FROM gps_entry gme
LOOP
RAISE NOTICE 'Km: % , cast: % , round: %', rec.km, rec.cast, rec.round;
INSERT INTO test (km, casting, rounding) VALUES (rec.km, rec.cast, rec.round);
END LOOP;
RETURN 1;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
Here is the output:
2014-02-05 12:49:53 CET NOTICE: Km: 0.12345678 , cast: 0.123457 , round: 0.12
2014-02-05 12:49:53 CET NOTICE: Km: 123456.12 , cast: 123456 , round: 123456.00
DB table with columns NUMERIC(19,2):
km casting rounding
0.12 0.12 0.12
123456.00 123456.00 123456.00
Why do cast and round functions not work for the value 123456.12?
real is a lossy, inexact floating-point type. It only uses 4 bytes for storage and may not store the presented numeric literals precisely to begin with. In addition, implementation details depend on your platform. Consider the chapter "Floating-Point Types" in the manual.
There is nothing wrong with either round() or cast(). For (more) exact results, use numeric.
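The loss can be reproduced without a database. The Python sketch below round-trips the two example values through 4-byte IEEE 754 storage and then formats them with six significant digits. That last step is an assumption on my part: it matches the NOTICE output in the question, but the exact number of digits Postgres keeps when converting real to numeric is not something this sketch proves. The helper name as_real is hypothetical.

```python
import struct

def as_real(x: float) -> float:
    """Round-trip a Python float through 4-byte IEEE 754 storage, like a real column."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

for literal in (123456.12, 0.12345678):
    stored = as_real(literal)     # what a 4-byte float can actually hold
    six_digits = "%.6g" % stored  # assumed: about six significant digits survive
    print(literal, repr(stored), six_digits)
```

With only about six significant digits available, 123456.12 has none left for the fractional part, while 0.12345678 keeps six of its fractional digits.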
Function audit
CREATE OR REPLACE FUNCTION test3()
RETURNS void
LANGUAGE plpgsql AS
$func$
DECLARE
r record;
BEGIN
FOR r IN
SELECT abs_km AS km
, cast(abs_km AS numeric) AS km_cast
, round(abs_km::numeric, 2) AS km_round
FROM gps_entry
LOOP
RAISE NOTICE 'km: % , km_cast: % , km_round: %'
, r.km, r.km_cast, r.km_round;
INSERT INTO test (km, casting, rounding)
VALUES (r.km, r.km_cast, r.km_round);
END LOOP;
END
$func$;
Of course, it would be more efficient to replace the loop with a single multi-row INSERT statement.
Do not quote the language name plpgsql. It's an identifier.
It makes no sense to round to 2 fractional digits after casting to numeric(16,2), which forcibly rounds already. Instead of:
round(abs_km:: numeric(16,2), 2) as round
use either one of:
round(abs_km::numeric, 2) as round
abs_km::numeric(16,2) as round
I have a PostgreSQL table of this form:
base_id int | mods smallint[]
3 | {7,15,48}
I need to populate a table of this form:
combo_id int | base_id int | mods smallint[]
1 | 3 |
2 | 3 | {7}
3 | 3 | {7,15}
4 | 3 | {7,48}
5 | 3 | {7,15,48}
6 | 3 | {15}
7 | 3 | {15,48}
8 | 3 | {48}
I think I could accomplish this using a function that does almost exactly this, iterating over the first table and writing combinations to the second table:
Generate all combinations in SQL
But, I'm a Postgres novice and cannot for the life of me figure out how to do this using plpgsql. It doesn't need to be particularly fast; it will only be run periodically on the backend. The first table has approximately 80 records and a rough calculation suggests we can expect around 2600 records for the second table.
Can anybody at least point me in the right direction?
Edit: Craig: I've got PostgreSQL 9.0. I was successfully able to use UNNEST():
FOR messvar IN SELECT * FROM UNNEST(mods) AS mod WHERE mod BETWEEN 0 AND POWER(2, #n) - 1
LOOP
RAISE NOTICE '%', messvar;
END LOOP;
but then didn't know where to go next.
Edit: For reference, I ended up using Erwin's solution, with a single line added to add a null result ('{}') to each set and the special case Erwin refers to removed:
CREATE OR REPLACE FUNCTION f_combos(_arr integer[], _a integer[] DEFAULT '{}'::integer[], _z integer[] DEFAULT '{}'::integer[])
RETURNS SETOF integer[] LANGUAGE plpgsql AS
$BODY$
DECLARE
i int;
j int;
_up int;
BEGIN
IF array_length(_arr,1) > 0 THEN
_up := array_upper(_arr, 1);
IF _a = '{}' AND _z = '{}' THEN RETURN QUERY SELECT '{}'::int[]; END IF;
FOR i IN array_lower(_arr, 1) .. _up LOOP
FOR j IN i .. _up LOOP
CASE j-i
WHEN 0,1 THEN
RETURN NEXT _a || _arr[i:j] || _z;
ELSE
RETURN NEXT _a || _arr[i:i] || _arr[j:j] || _z;
RETURN QUERY SELECT *
FROM f_combos(_arr[i+1:j-1], _a || _arr[i], _arr[j] || _z);
END CASE;
END LOOP;
END LOOP;
ELSE
RETURN NEXT _arr;
END IF;
END;
$BODY$
Then, I used that function to populate my table:
INSERT INTO e_ecosystem_modified (ide_ecosystem, modifiers)
(SELECT ide_ecosystem, f_combos(modifiers) AS modifiers FROM e_ecosystem WHERE ecosystemgroup <> 'modifier' ORDER BY ide_ecosystem, modifiers);
From 79 rows in my source table with a maximum of 7 items in the modifiers array, the query took 250ms to populate 2630 rows in my output table. Fantastic.
After sleeping on it, I had a completely new, simpler, faster idea:
CREATE OR REPLACE FUNCTION f_combos(_arr anyarray)
RETURNS TABLE (combo anyarray) LANGUAGE plpgsql AS
$BODY$
BEGIN
IF array_upper(_arr, 1) IS NULL THEN
combo := _arr; RETURN NEXT; RETURN;
END IF;
CASE array_upper(_arr, 1)
-- WHEN 0 THEN -- does not exist
WHEN 1 THEN
RETURN QUERY VALUES ('{}'), (_arr);
WHEN 2 THEN
RETURN QUERY VALUES ('{}'), (_arr[1:1]), (_arr), (_arr[2:2]);
ELSE
RETURN QUERY
WITH x AS (
SELECT f.combo FROM f_combos(_arr[1:array_upper(_arr, 1)-1]) f
)
SELECT x.combo FROM x
UNION ALL
SELECT x.combo || _arr[array_upper(_arr, 1)] FROM x;
END CASE;
END
$BODY$;
Call:
SELECT * FROM f_combos('{1,2,3,4,5,6,7,8,9}'::int[]) ORDER BY 1;
512 rows, total runtime: 2.899 ms
Explain
Treat special cases with NULL and empty array.
Build combinations for a primitive array of two.
Any longer array is broken down into:
the combinations for same array of length n-1
plus all of those combined with element n .. recursively.
Really simple, once you got it.
Works for 1-dimensional arrays starting with subscript 1 (see below).
2-3 times as fast as old solution, scales better.
Works for any element type again (using polymorphic types).
Includes the empty array in the result as is displayed in the question (and as @Craig pointed out to me in the comments).
Shorter, more elegant.
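The recursion described in the list above is language-independent and can be sketched in a few lines of Python. This is an illustration of the idea only, not a port of the PL/pgSQL function: it has a single base case (the empty array) instead of the special cases for lengths 1 and 2, and the name f_combos_sketch is made up.

```python
def f_combos_sketch(arr):
    """All combinations of arr: those of the first n-1 elements, with and without element n."""
    if not arr:
        return [[]]                         # the empty array has one combination: itself
    shorter = f_combos_sketch(arr[:-1])     # combinations for the array of length n-1
    return shorter + [combo + [arr[-1]] for combo in shorter]

for combo in sorted(f_combos_sketch([7, 15, 48])):
    print(combo)
```

Each level doubles the number of rows, which gives 2^n combinations for n elements.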
This assumes array subscripts starting at 1 (Default). If you are not sure about your values, call the function like this to normalize:
SELECT * FROM f_combos(_arr[array_lower(_arr, 1):array_upper(_arr, 1)]);
Not sure if there is a more elegant way to normalize array subscripts. I posted a question about that:
Normalize array subscripts for 1-dimensional array so they start with 1
Old solution (slower)
CREATE OR REPLACE FUNCTION f_combos2(_arr int[], _a int[] = '{}', _z int[] = '{}')
RETURNS SETOF int[] LANGUAGE plpgsql AS
$BODY$
DECLARE
i int;
j int;
_up int;
BEGIN
IF array_length(_arr,1) > 0 THEN
_up := array_upper(_arr, 1);
FOR i IN array_lower(_arr, 1) .. _up LOOP
FOR j IN i .. _up LOOP
CASE j-i
WHEN 0,1 THEN
RETURN NEXT _a || _arr[i:j] || _z;
WHEN 2 THEN
RETURN NEXT _a || _arr[i:i] || _arr[j:j] || _z;
RETURN NEXT _a || _arr[i:j] || _z;
ELSE
RETURN NEXT _a || _arr[i:i] || _arr[j:j] || _z;
RETURN QUERY SELECT *
FROM f_combos2(_arr[i+1:j-1], _a || _arr[i], _arr[j] || _z);
END CASE;
END LOOP;
END LOOP;
ELSE
RETURN NEXT _arr;
END IF;
END;
$BODY$;
Call:
SELECT * FROM f_combos2('{7,15,48}'::int[]) ORDER BY 1;
Works for 1-dimensional integer arrays.
This could be further optimized, but that's certainly not needed for the scope of this question.
ORDER BY to impose the order displayed in the question.
Provide for NULL or empty array, as NULL is mentioned in the comments.
Tested with PostgreSQL 9.1, but should work with any halfway modern version.
array_lower() and array_upper() have been around for at least since PostgreSQL 7.4. Only parameter defaults are new in version 8.4. Could easily be replaced.
Performance is decent.
SELECT DISTINCT * FROM f_combos2('{1,2,3,4,5,6,7,8,9}'::int[]) ORDER BY 1;
511 rows, total runtime: 7.729 ms
Explanation
It builds on this simple form that only creates all combinations of neighboring elements:
CREATE FUNCTION f_combos(_arr int[])
RETURNS SETOF int[] LANGUAGE plpgsql AS
$BODY$
DECLARE
i int;
j int;
_up int;
BEGIN
_up := array_upper(_arr, 1);
FOR i in array_lower(_arr, 1) .. _up LOOP
FOR j in i .. _up LOOP
RETURN NEXT _arr[i:j];
END LOOP;
END LOOP;
END;
$BODY$;
But this will fail for sub-arrays with more than two elements. So:
For any sub-array with 3 elements, one array with just the outer two elements is added. This is a shortcut for this special case that improves performance and is not strictly needed.
For any sub-array with more than 3 elements I take the outer two elements and fill in with all combinations of inner elements built by the same function recursively.
One approach is a recursive CTE. Erwin's updated recursive function is significantly faster and scales better, though, so this is really only useful as an interesting alternative approach. Erwin's updated version is much more practical.
I tried a bit counting approach (see the end), but without a fast way to pluck arbitrary elements from an array it proved slower than either recursive approach.
Recursive CTE combinations function
CREATE OR REPLACE FUNCTION combinations(anyarray) RETURNS SETOF anyarray AS $$
WITH RECURSIVE
items AS (
SELECT row_number() OVER (ORDER BY item) AS rownum, item
FROM (SELECT unnest($1) AS item) unnested
),
q AS (
SELECT 1 AS i, $1[1:0] arr
UNION ALL
SELECT (i+1), CASE x
WHEN 1 THEN array_append(q.arr,(SELECT item FROM items WHERE rownum = i))
ELSE q.arr END
FROM generate_series(0,1) x CROSS JOIN q WHERE i <= array_upper($1,1)
)
SELECT q.arr AS mods
FROM q WHERE i = array_upper($1,1)+1;
$$ LANGUAGE 'sql';
It's a polymorphic function, so it'll work on arrays of any type.
The logic is to iterate over each item in the unnested input set, using a working table. Start with an empty array in the working table, with a generation number of 1. For each entry in the input set insert two new arrays into the working table with an incremented generation number. One of the two is a copy of the input array from the previous generation and the other is the input array with the (generation-number)'th item from the input set appended to it. When the generation number exceeds the number of items in the input set, return the last generation.
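The generation-by-generation logic may be easier to follow as an ordinary loop. The Python sketch below mirrors it under a simplifying assumption: the working table is a plain list that is replaced on each generation, whereas the CTE keeps every generation and filters for the last one. The function name is hypothetical.

```python
def combinations_by_generation(items):
    """Build all combinations by processing one input item per generation."""
    ordered = sorted(items)   # stands in for row_number() OVER (ORDER BY item)
    working = [[]]            # generation 1: a single empty array
    for item in ordered:
        # Each array yields two arrays in the next generation:
        # an unchanged copy, and a copy with the current item appended.
        working = [arr + extra for arr in working for extra in ([], [item])]
    return working            # the last generation holds every combination

for combo in sorted(combinations_by_generation([7, 15, 48])):
    print(combo)
```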
Usage
You can use the combinations(smallint[]) function to produce the results you desire, using it as a set-returning function in combination with the row_number window function.
-- assuming table structure
regress=# \d comb
Table "public.comb"
Column | Type | Modifiers
---------+------------+-----------
base_id | integer |
mods | smallint[] |
SELECT base_id, row_number() OVER (ORDER BY mod) AS mod_id, mod
FROM (SELECT base_id, combinations(mods) AS mod FROM comb WHERE base_id = 3) x
ORDER BY mod;
Results
regress=# SELECT base_id, row_number() OVER (ORDER BY mod) AS mod_id, mod
regress-# FROM (SELECT base_id, combinations(mods) AS mod FROM comb WHERE base_id = 3) x
regress-# ORDER BY mod;
base_id | mod_id | mod
---------+--------+-----------
3 | 1 | {}
3 | 2 | {7}
3 | 3 | {7,15}
3 | 4 | {7,15,48}
3 | 5 | {7,48}
3 | 6 | {15}
3 | 7 | {15,48}
3 | 8 | {48}
(8 rows)
Time: 2.121 ms
Zero element arrays produce a null result. If you want combinations({}) to return one row {} then a UNION ALL with {} will do the job.
Theory
It appears you want the k-combinations for all k in a k-multicombination, rather than simple combinations. See number of combinations with repetition.
In other words, you want all k-combinations of elements from your set, for all k from 0 to n where n is the set size.
Related SO question: SQL - Find all possible combination, which has the really interesting answer about bit counting.
Bit operations exist in Pg, so a bit counting approach should be possible. You'd expect it to be more efficient, but because it's so slow to select a scattered subset of elements from an array it actually works out slower.
CREATE OR REPLACE FUNCTION bitwise_subarray(arr anyarray, elements integer)
RETURNS anyarray AS $$
SELECT array_agg($1[n+1])
FROM generate_series(0,array_upper($1,1)-1) n WHERE ($2>>n) & 1 = 1;
$$ LANGUAGE sql;
COMMENT ON FUNCTION bitwise_subarray(anyarray,integer) IS 'Return the elements from $1 where the corresponding bit in $2 is set';
CREATE OR REPLACE FUNCTION comb_bits(anyarray) RETURNS SETOF anyarray AS $$
SELECT bitwise_subarray($1, x)
FROM generate_series(0,pow(2,array_upper($1,1))::integer-1) x;
$$ LANGUAGE 'sql';
If you could find a faster way to write bitwise_subarray then comb_bits would be very fast. Like, say, a small C extension function, but I'm only crazy enough to write one of those for an SO answer.
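For comparison, here is the same bit counting idea in Python, where picking scattered elements from a list is cheap. It is a sketch of the technique rather than a statement about Postgres performance; the function names simply reuse the ones above.

```python
def bitwise_subarray(arr, mask):
    """Return the elements of arr whose position bit is set in mask (bit 0 = first element)."""
    return [arr[n] for n in range(len(arr)) if (mask >> n) & 1]

def comb_bits(arr):
    """Enumerate every bitmask from 0 to 2^n - 1 and map each to a sub-array."""
    return [bitwise_subarray(arr, mask) for mask in range(2 ** len(arr))]

for combo in comb_bits([7, 15, 48]):
    print(combo)
```

Mask 0 produces the empty list, and the highest mask produces the full input.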
I am trying to convert hex to decimal using PostgreSQL 9.1
with this query:
SELECT to_number('DEADBEEF', 'FMXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX');
I get the following error:
ERROR: invalid input syntax for type numeric: " "
What am I doing wrong?
Ways without dynamic SQL
There is no cast from hex numbers in text representation to a numeric type, but we can use bit(n) as waypoint. There are undocumented casts from bit strings (bit(n)) to integer types (int2, int4, int8) - the internal representation is binary compatible. Quoting Tom Lane:
This is relying on some undocumented behavior of the bit-type input
converter, but I see no reason to expect that would break. A possibly
bigger issue is that it requires PG >= 8.3 since there wasn't a text
to bit cast before that.
integer for max. 8 hex digits
Up to 8 hex digits can be converted to bit(32) and then coerced to integer (standard 4-byte integer):
SELECT ('x' || lpad(hex, 8, '0'))::bit(32)::int AS int_val
FROM (
VALUES
('1'::text)
, ('f')
, ('100')
, ('7fffffff')
, ('80000000') -- overflow into negative number
, ('deadbeef')
, ('ffffffff')
, ('ffffffff123') -- too long
) AS t(hex);
int_val
------------
1
15
256
2147483647
-2147483648
-559038737
-1
-1
Postgres uses a signed integer type, so hex numbers above '7fffffff' overflow into negative integer numbers. This is still a valid, unique representation but the meaning is different. If that matters, switch to bigint; see below.
For more than 8 hex digits the least significant characters (excess to the right) get truncated.
4 bits in a bit string encode 1 hex digit. Hex numbers of known length can be cast to the respective bit(n) directly. Alternatively, pad hex numbers of unknown length with leading zeros (0) as demonstrated and cast to bit(32). Example with 7 hex digits and int or 8 digits and bigint:
SELECT ('x'|| 'deafbee')::bit(28)::int
, ('x'|| 'deadbeef')::bit(32)::bigint;
int4 | int8
-----------+------------
233503726 | 3735928559
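The signed overflow and the truncation of excess digits can be reproduced with plain integer arithmetic. The Python sketch below imitates the lpad() plus bit(32) plus int chain; it says nothing about how Postgres implements the casts, and hex_to_int32 is a name chosen for the example.

```python
def hex_to_int32(hex_digits: str) -> int:
    """Interpret up to 8 hex digits as a signed 32-bit integer (two's complement)."""
    padded = hex_digits.rjust(8, "0")[:8]  # like lpad(hex, 8, '0'): pad left, truncate right
    value = int(padded, 16)                # unsigned value in 0 .. 2**32 - 1
    # Values with the top bit set wrap around into the negative range.
    return value - 2 ** 32 if value >= 2 ** 31 else value

for h in ("1", "f", "100", "7fffffff", "80000000", "deadbeef", "ffffffff", "ffffffff123"):
    print(h, hex_to_int32(h))
```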
bigint for max. 16 hex digits
Up to 16 hex digits can be converted to bit(64) and then coerced to bigint (int8, 8-byte integer) - overflowing into negative numbers in the upper half again:
SELECT ('x' || lpad(hex, 16, '0'))::bit(64)::bigint AS int8_val
FROM (
VALUES
('ff'::text)
, ('7fffffff')
, ('80000000')
, ('deadbeef')
, ('7fffffffffffffff')
, ('8000000000000000') -- overflow into negative number
, ('ffffffffffffffff')
, ('ffffffffffffffff123') -- too long
) t(hex);
int8_val
---------------------
255
2147483647
2147483648
3735928559
9223372036854775807
-9223372036854775808
-1
-1
uuid for max. 32 hex digits
The Postgres uuid data type is not a numeric type. But it's the most efficient type in standard Postgres to store up to 32 hex digits, only occupying 16 bytes of storage. There is a direct cast from text to uuid (no need for bit(n) as waypoint), but exactly 32 hex digits are required.
SELECT lpad(hex, 32, '0')::uuid AS uuid_val
FROM (
VALUES ('ff'::text)
, ('deadbeef')
, ('ffffffffffffffff')
, ('ffffffffffffffffffffffffffffffff')
, ('ffffffffffffffffffffffffffffffff123') -- too long
) t(hex);
uuid_val
--------------------------------------
00000000-0000-0000-0000-0000000000ff
00000000-0000-0000-0000-0000deadbeef
00000000-0000-0000-ffff-ffffffffffff
ffffffff-ffff-ffff-ffff-ffffffffffff
ffffffff-ffff-ffff-ffff-ffffffffffff
As you can see, standard output is a string of hex digits with typical separators for UUID.
md5 hash
This is particularly useful to store md5 hashes:
SELECT md5('Store hash for long string, maybe for index?')::uuid AS md5_hash;
md5_hash
--------------------------------------
02e10e94-e895-616e-8e23-bb7f8025da42
See:
What is the optimal data type for an MD5 field?
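The same packing works in application code. As a sketch, Python's standard uuid module accepts exactly 32 hex digits as well, so an md5 digest fits into a 16-byte UUID value; the two helper names are hypothetical.

```python
import hashlib
import uuid

def hex_to_uuid(hex_digits: str) -> uuid.UUID:
    """Left-pad to 32 hex digits (truncating any excess on the right) and build a UUID."""
    padded = hex_digits.rjust(32, "0")[:32]  # mirrors lpad(hex, 32, '0')
    return uuid.UUID(hex=padded)

def md5_as_uuid(text: str) -> uuid.UUID:
    """Store an md5 digest (32 hex digits) as a 16-byte UUID value."""
    return uuid.UUID(hex=hashlib.md5(text.encode("utf-8")).hexdigest())

print(hex_to_uuid("deadbeef"))                  # 00000000-0000-0000-0000-0000deadbeef
print(len(md5_as_uuid("any long string").bytes))  # 16
```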
You have two immediate problems:
to_number doesn't understand hexadecimal.
X doesn't have any meaning in a to_number format string and anything without a meaning apparently means "skip a character".
I don't have an authoritative justification for (2), just empirical evidence:
=> SELECT to_number('123', 'X999');
to_number
-----------
23
(1 row)
=> SELECT to_number('123', 'XX999');
to_number
-----------
3
The documentation mentions how double quoted patterns are supposed to behave:
In to_date, to_number, and to_timestamp, double-quoted strings skip the number of input characters contained in the string, e.g. "XX" skips two input characters.
but the behavior of non-quoted characters that are not formatting characters appears to be unspecified.
In any case, to_number isn't the right tool for converting hex to numbers, you want to say something like this:
select x'deadbeef'::int;
so perhaps this function will work better for you:
CREATE OR REPLACE FUNCTION hex_to_int(hexval varchar) RETURNS integer AS $$
DECLARE
result int;
BEGIN
EXECUTE 'SELECT x' || quote_literal(hexval) || '::int' INTO result;
RETURN result;
END;
$$ LANGUAGE plpgsql IMMUTABLE STRICT;
Then:
=> select hex_to_int('DEADBEEF');
hex_to_int
------------
-559038737 **
(1 row)
** To avoid negative numbers like this from integer overflow error, use bigint instead of int to accommodate larger hex numbers (like IP addresses).
pg-bignum
Internally, pg-bignum uses the SSL library for big numbers. This method has none of the drawbacks mentioned in the other answers with numeric. Nor is it slowed down by plpgsql. It's fast and it works with a number of any size. Test case taken from Erwin's answer for comparison,
CREATE EXTENSION bignum;
SELECT hex, bn_in_hex(hex::cstring)
FROM (
VALUES ('ff'::text)
, ('7fffffff')
, ('80000000')
, ('deadbeef')
, ('7fffffffffffffff')
, ('8000000000000000')
, ('ffffffffffffffff')
, ('ffffffffffffffff123')
) t(hex);
hex | bn_in_hex
---------------------+-------------------------
ff | 255
7fffffff | 2147483647
80000000 | 2147483648
deadbeef | 3735928559
7fffffffffffffff | 9223372036854775807
8000000000000000 | 9223372036854775808
ffffffffffffffff | 18446744073709551615
ffffffffffffffff123 | 75557863725914323415331
(8 rows)
You can convert the result to numeric using bn_in_hex('deadbeef')::text::numeric.
If anybody else is stuck with PG8.2, here is another way to do it.
bigint version:
create or replace function hex_to_bigint(hexval text) returns bigint as $$
select
(get_byte(x,0)::int8<<(7*8)) |
(get_byte(x,1)::int8<<(6*8)) |
(get_byte(x,2)::int8<<(5*8)) |
(get_byte(x,3)::int8<<(4*8)) |
(get_byte(x,4)::int8<<(3*8)) |
(get_byte(x,5)::int8<<(2*8)) |
(get_byte(x,6)::int8<<(1*8)) |
(get_byte(x,7)::int8)
from (
select decode(lpad($1, 16, '0'), 'hex') as x
) as a;
$$
language sql strict immutable;
int version:
create or replace function hex_to_int(hexval text) returns int as $$
select
(get_byte(x,0)::int<<(3*8)) |
(get_byte(x,1)::int<<(2*8)) |
(get_byte(x,2)::int<<(1*8)) |
(get_byte(x,3)::int)
from (
select decode(lpad($1, 8, '0'), 'hex') as x
) as a;
$$
language sql strict immutable;
Here is a version which uses numeric, so it can handle arbitrarily large hex strings:
create function hex_to_decimal(hex_string text)
returns text
language plpgsql immutable as $pgsql$
declare
bits bit varying;
result numeric := 0;
exponent numeric := 0;
chunk_size integer := 31;
start integer;
begin
execute 'SELECT x' || quote_literal(hex_string) INTO bits;
while length(bits) > 0 loop
start := greatest(1, length(bits) - chunk_size + 1);
result := result + (substring(bits from start for chunk_size)::bigint)::numeric * pow(2::numeric, exponent);
exponent := exponent + chunk_size;
bits := substring(bits from 1 for greatest(0, length(bits) - chunk_size));
end loop;
return trunc(result, 0);
end
$pgsql$;
For example:
=# select hex_to_decimal('ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff');
32592575621351777380295131014550050576823494298654980010178247189670100796213387298934358015
Here is a proper way to convert hex to a string; you can then check whether it's a numeric type or not:
SELECT convert_from('\x7468697320697320612076657279206C6F6E672068657820737472696E67','utf8')
returns
this is a very long hex string
Here is another version which uses numeric, so it can handle arbitrarily large hex strings:
create OR REPLACE function hex_to_decimal2(hex_string text)
returns text
language plpgsql immutable as $pgsql$
declare
bits bit varying;
result numeric := 0;
begin
execute 'SELECT x' || quote_literal(hex_string) INTO bits;
while length(bits) > 0 loop
result := result + (substring(bits from 1 for 1)::bigint)::numeric * pow(2::numeric, length(bits) - 1);
bits := substring(bits from 2 for length(bits) - 1);
end loop;
return trunc(result, 0);
end
$pgsql$;
For example:
=# select hex_to_decimal2('ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff');
32592575621351777380295131014550050576823494298654980010178247189670100796213387298934358015
=# select hex_to_decimal2('5f68e8131ecf80000');
110000000000000000000
Here is another implementation:
CREATE OR REPLACE FUNCTION hex_to_decimal3(hex_string text)
RETURNS numeric
LANGUAGE plpgsql
IMMUTABLE
AS $function$
declare
hex_string_lower text := lower(hex_string);
i int;
digit int;
s numeric := 0;
begin
for i in 1 .. length(hex_string) loop
digit := position(substr(hex_string_lower, i, 1) in '0123456789abcdef') - 1;
if digit < 0 then
raise '"%" is not a valid hexadecimal digit', substr(hex_string_lower, i, 1) using errcode = '22P02';
end if;
s := s * 16 + digit;
end loop;
return s;
end
$function$;
It is a straightforward one that works digit by digit, using the position() function to compute the numeric value of each character in the input string. Its benefit over hex_to_decimal2() is that it seems to be much faster (4x or so for md5()-generated hex strings).
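The digit-by-digit accumulation is Horner's scheme, and it translates directly to other languages. The Python sketch below mirrors the loop for illustration; Python integers are arbitrary precision, so they play the role of numeric here, and the function name is made up.

```python
HEX_DIGITS = "0123456789abcdef"

def hex_to_decimal_sketch(hex_string: str) -> int:
    """Accumulate the value one hex digit at a time: s = s * 16 + digit."""
    total = 0
    for ch in hex_string.lower():
        digit = HEX_DIGITS.find(ch)  # -1 when ch is not a hex digit
        if digit < 0:
            raise ValueError('"%s" is not a valid hexadecimal digit' % ch)
        total = total * 16 + digit
    return total

print(hex_to_decimal_sketch("5f68e8131ecf80000"))  # 110000000000000000000
```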
CREATE OR REPLACE FUNCTION numeric_from_bytes(bytea)
RETURNS numeric
LANGUAGE plpgsql
AS $$
declare
bits bit varying;
result numeric := 0;
exponent numeric := 0;
bit_pos integer;
begin
execute 'SELECT x' || quote_literal(substr($1::text,3)) into bits;
bit_pos := length(bits) + 1;
exponent := 0;
while bit_pos >= 56 loop
bit_pos := bit_pos - 56;
result := result + substring(bits from bit_pos for 56)::bigint::numeric * pow(2::numeric, exponent);
exponent := exponent + 56;
end loop;
while bit_pos >= 8 loop
bit_pos := bit_pos - 8;
result := result + substring(bits from bit_pos for 8)::bigint::numeric * pow(2::numeric, exponent);
exponent := exponent + 8;
end loop;
return trunc(result);
end;
$$;
In a future PostgreSQL version, when/if Dean Rasheed's patch 0001-Add-non-decimal-integer-support-to-type-numeric.patch gets committed, this can be simplified:
CREATE OR REPLACE FUNCTION numeric_from_bytes(bytea)
RETURNS numeric
LANGUAGE sql
AS $$
SELECT ('0'||right($1::text,-1))::numeric
$$;