Postgres SQL round to 2 decimal places - postgresql

I am trying to round my division results to 2 decimal places in PostgreSQL. I have tried the below, but it rounds them to whole numbers.
round((total_sales / total_customers)::numeric,2) as SPC,
round((total_sales / total_orders)::numeric,2) as AOV
How can I round the results to 2 decimal places please?
Kind Regards,
JB78

Assuming total_sales, total_customers and total_orders are integer columns, an expression like total_sales / total_orders is evaluated with integer division, so the fractional part is lost before round() ever sees it.
You need to cast at least one operand to numeric before the division, e.g. total_sales / total_orders::numeric, to get a decimal result:
round(total_sales / total_customers::numeric, 2) as SPC,
round(total_sales / total_orders::numeric, 2) as AOV
Example:
create table data
(
    total_sales integer,
    total_customers integer,
    total_orders integer
);
insert into data values (97, 12, 7), (5000, 20, 30);
select total_sales / total_orders as int_result,
       total_sales / total_orders::numeric as numeric_result,
       round(total_sales / total_customers::numeric, 2) as spc,
       round(total_sales / total_orders::numeric, 2) as aov
from data;
returns:
 int_result |    numeric_result    |  spc   |  aov
------------+----------------------+--------+--------
         13 |  13.8571428571428571 |   8.08 |  13.86
        166 | 166.6666666666666667 | 250.00 | 166.67
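It doesn't matter which operand you cast, and there are several equivalent spellings. A quick sketch against the data table above (the column aliases are just illustrative):
select round(total_sales::numeric / total_orders, 2) as cast_numerator,
       round(total_sales * 1.0 / total_orders, 2) as numeric_literal,
       round(cast(total_sales as numeric) / total_orders, 2) as standard_cast
from data;
All three force the division itself to be done in numeric, so round(..., 2) has a fraction to work with.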

Related

postgresql: datatype numeric with limited digits

I am looking for a numeric datatype with a limited number of digits
(before and after the decimal point combined).
The function below only kills digits after the decimal point. (PG version >= 13, since it uses trim_scale().)
create function num_flex(v numeric, d int) returns numeric as
$$
select case
           when v = 0 then 0
           when v < 1 and v > -1 then trim_scale(round(v, d - 1))
           else trim_scale(round(v, d - 1 - least(log(abs(v))::int, d - 1)))
       end;
$$
language sql;
For testing:
select num_flex( 0, 6)
union all
select num_flex( 1.22000, 6)
union all
select num_flex( (-0.000000123456789*10^x)::numeric,6)
from generate_series(1,15,3) t(x)
union all
select num_flex( (0.0000123456789*10^x)::numeric,6)
from generate_series(1,15,3) t(x) ;
It runs, but does anyone have a better idea, or can anyone find a bug (a case that is not handled)?
The next step is to integrate this into PG, so that I can write
select 12.123456789::num_flex6;
select 12.123456789::num_flex7;
for a num_flex datatype with 6 or 7 digits, with types from num_flex2 to num_flex9. Is this possible?
There are a few problems with your function:
1. It accepts negative digit counts (parameter d): num_flex(1234, -2) returns 1200, but you specified that the function should only kill digits after the decimal point, so 1234 would be expected.
2. It returns incorrect results between -1 and 1: num_flex(0.123, 3) returns 0.12 instead of 0.123. This might be the desired effect if you do want to count the 0 to the left of the decimal point; normally, that 0 is ignored when a number's precision and scale are considered.
3. The count of digits to the left of the decimal point is incorrect because of how the ::int cast rounds: log(abs(11))::int is 1 but log(abs(51))::int is 2. ceil(log(abs(v)))::int returns 2 in both cases, while keeping the int type so it still works as the 2nd parameter of round().
create or replace function num_flex(
    input_number numeric,
    digit_count int,
    is_counting_unit_zero boolean default false)
returns numeric as
$$
select trim_scale(
    case
        when input_number = 0
            then 0
        when digit_count <= 0 --avoids negative rounding
            then round(input_number, 0)
        when (input_number between -1 and 1) and is_counting_unit_zero
            then round(input_number, digit_count - 1)
        when (input_number between -1 and 1)
            then round(input_number, digit_count)
        else
            round(input_number,
                  greatest( --avoids negative rounding
                      digit_count - (ceil(log(abs(input_number))))::int,
                      0)
                 )
    end
);
$$
language sql;
Here's a test:
select *,"result"="should_be"::numeric as "is_correct" from
(values
('num_flex(0.1234 ,4)',num_flex(0.1234 ,4), '0.1234'),
('num_flex(1.234 ,4)',num_flex(1.234 ,4), '1.234'),
('num_flex(1.2340000 ,4)',num_flex(1.2340000 ,4), '1.234'),
('num_flex(0001.234 ,4)',num_flex(0001.234 ,4), '1.234'),
('num_flex(123456 ,5)',num_flex(123456 ,5), '123456'),
('num_flex(0 ,5)',num_flex(0 ,5), '0'),
('num_flex(00000.00000 ,5)',num_flex(00000.00000 ,5), '0'),
('num_flex(00000.00001 ,5)',num_flex(00000.00001 ,5), '0.00001'),
('num_flex(12345678901 ,5)',num_flex(12345678901 ,5), '12345678901'),
('num_flex(123456789.1 ,5)',num_flex(123456789.1 ,5), '123456789'),
('num_flex(1.234 ,-4)',num_flex(1.234 ,-4), '1')
) as t ("operation","result","should_be");
-- operation                 |   result    |  should_be  | is_correct
-- --------------------------+-------------+-------------+------------
-- num_flex(0.1234 ,4)       | 0.1234      | 0.1234      | t
-- num_flex(1.234 ,4)        | 1.234       | 1.234       | t
-- num_flex(1.2340000 ,4)    | 1.234       | 1.234       | t
-- num_flex(0001.234 ,4)     | 1.234       | 1.234       | t
-- num_flex(123456 ,5)       | 123456      | 123456      | t
-- num_flex(0 ,5)            | 0           | 0           | t
-- num_flex(00000.00000 ,5)  | 0           | 0           | t
-- num_flex(00000.00001 ,5)  | 0.00001     | 0.00001     | t
-- num_flex(12345678901 ,5)  | 12345678901 | 12345678901 | t
-- num_flex(123456789.1 ,5)  | 123456789   | 123456789   | t
-- num_flex(1.234 ,-4)       | 1           | 1           | t
--(11 rows)
You can declare the precision (the total number of digits) of your numeric data type in the column definition. Only digits after the decimal point will be rounded; if there are too many digits before the decimal point, you'll get an error.
The downside is that numeric(n) is actually numeric(n,0), which is dictated by the SQL standard. So if by limiting the column's number of digits to 5 you want to allow both 12345.0 and 0.12345, there's no way to configure numeric to hold both: numeric(5) will round 0.12345 to 0, while numeric(5,5) will dedicate all digits to the right of the decimal point and reject 12345.
create table test (numeric_column numeric(5));
insert into test values (12345.123);
table test;
-- numeric_column
-- ----------------
--          12345
--(1 row)
insert into test values (123456.123);
--ERROR: numeric field overflow
--DETAIL: A field with precision 5, scale 0 must round to an absolute value less than 10^5.
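For contrast, the numeric(5,5) case described above can be checked the same way (a quick sketch; test2 is just a throwaway table):
create table test2 (numeric_column numeric(5,5));
insert into test2 values (0.12345);
-- OK, stored as 0.12345
insert into test2 values (12345);
--ERROR: numeric field overflow
--DETAIL: A field with precision 5, scale 5 must round to an absolute value less than 1.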

Update Numeric field without Decimal point and Zeros

I am trying to update a numeric field, but the field cannot have zeros after the decimal point. The table I am pulling values from contains data like 87.00, 90.00, 100.00 etc. How do I update without the decimal point and zeros?
Example: percentage is a numeric field.
Update values available: 100.00, 90.00 etc.
update table1
set percent = (tmpercent as integer)
from table2
where table2.custid = table1.custoid;
gives an error.
Table1:
CustID | Percent (numeric)
     1 | 90
     2 | 80
Table2:
CustomID | tmpPercent (varchar)
       1 | 87.00
       2 | 90.00
I often use the typecast ::FLOAT::NUMERIC to get rid of the extra fractional zeros of numerics,
or you can use the TRUNC() function to force truncation of the fraction.
try
update table1
set percent = tmpercent::FLOAT::NUMERIC
from table2
where table2.custid=table1.custoid;
or
update table1
set percent = TRUNC(tmpercent::NUMERIC)
from table2
where table2.custid=table1.custoid;
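On PostgreSQL 13 or later there's a third option: trim_scale() strips trailing fractional zeros without the detour through FLOAT, which can lose precision for very large values. A sketch:
update table1
set percent = trim_scale(tmpercent::NUMERIC)
from table2
where table2.custid=table1.custoid;
Note that if percent is declared with a fixed scale, e.g. numeric(5,2), the stored value is padded back to that scale no matter which expression you use; see the next answer.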
It is going to depend on how the numeric field is specified in the table. From here:
https://www.postgresql.org/docs/current/datatype-numeric.html
We use the following terms below: The precision of a numeric is the total count of significant digits in the whole number, that is, the number of digits to both sides of the decimal point. The scale of a numeric is the count of decimal digits in the fractional part, to the right of the decimal point. So the number 23.5141 has a precision of 6 and a scale of 4. Integers can be considered to have a scale of zero.
NUMERIC(precision, scale)
So if your field has a scale > 0, you will see zeros to the right of the decimal point unless you set the scale to 0. As an example:
create table numeric_test (num_fld numeric(5,2), num_fld_0 numeric(5,0));
insert into numeric_test (num_fld, num_fld_0) values ('90.0', '90.0');
select * from numeric_test ;
num_fld | num_fld_0
---------+-----------
90.00 | 90
insert into numeric_test (num_fld, num_fld_0) values ('90.5', '90.5');
select * from numeric_test ;
num_fld | num_fld_0
---------+-----------
90.00 | 90
90.50 | 91
insert into numeric_test (num_fld, num_fld_0) values ('90.0'::float, '90.0'::float);
select * from numeric_test ;
num_fld | num_fld_0
---------+-----------
90.00 | 90
90.50 | 91
90.00 | 90
Using scale 0 means you have basically created an integer field. If you have a scale > 0 then you are going to get decimals in the field.
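So if the trailing zeros come from a column declared with scale > 0 and whole percentages are acceptable, one option is to change the column's declared type (a sketch, assuming you can alter the schema; precision 5 is an assumption, pick what fits your data):
alter table table1 alter column percent type numeric(5,0);
-- existing values are rounded to 0 decimal places during the rewrite;
-- plain 'numeric' (no precision/scale) would instead store values exactly as assigned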

Generate random values from table

I'd like to generate random values in order to fill a table.
First, I have a city table:
CREATE TABLE t_city_ci (
    ci_id SERIAL PRIMARY KEY,
    ci_name VARCHAR(100) NOT NULL
);
So I insert random values like this:
INSERT INTO t_city_ci ("ci_name")
SELECT DISTINCT d.str
FROM (
    SELECT (
        SELECT string_agg(x, '') AS str
        FROM (
            SELECT chr(ascii('A') + (random() * 25)::integer)
            -- reference 'b' so the subquery is correlated and re-evaluated per row
            FROM generate_series(1, 10 + b * 0)
        ) AS y(x)
    )
    FROM generate_series(1, 10000) AS a(b)
) AS d;
Now, I have a temperature table that looks like this:
CREATE TABLE dw_core.t_temperatures_te (
    te_id SERIAL PRIMARY KEY,
    ci_id INTEGER,
    te_temperature FLOAT NOT NULL,
    te_date TIMESTAMP NOT NULL DEFAULT NOW()
);
How can I fill the temperature table with:
a random date from last year,
a random temperature between -30 and 50,
random city values from the t_city table?
I tried this, but the date never changes:
INSERT INTO dw_core.t_temperatures_te ("ci_id","te_temperature","te_date")
SELECT *
FROM (
    SELECT (random() * (SELECT MAX(ci_id) FROM dw_core.t_city_ci) + 1)::integer
    FROM generate_series(1, 100000)
) AS y,
(SELECT random() * -60 + 45 FROM generate_series(1, 1005)) d(f),
(SELECT timestamp '2014-01-10 20:00:00' +
        random() * (timestamp '2014-01-20 20:00:00' -
                    timestamp '2016-01-10 10:00:00')) dza(b)
LIMIT 1000000;
Thanks a lot
Something like this? (In your attempt, the random timestamp comes from an uncorrelated subquery in the FROM clause: it is evaluated only once and cross-joined against every other row, which is why the date never changes.)
select *
from (
    select
        (random() * 100000)::integer as ci_id,
        -30 + (random() * 80) as temp,
        '2014-01-01'::date + (random() * 365 * '1 day'::interval) as time_2014
    from generate_series(1, 1000000) s
) foo
inner join t_city_ci c on c.ci_id = foo.ci_id;
Here's a sample of the generated data:
select
(random() * 100000)::integer as ci_id,
-30 + (random() * 80) as temp,
'2014-01-01'::date + (random() * 365 * '1 day'::interval) as time_2014
from generate_series(1,10);
ci_id | temp | time_2014
-------+-------------------+----------------------------
84742 | 31.6278865475337 | 2014-10-16 21:36:45.371176
16390 | 10.665458049935 | 2014-11-13 19:59:54.148177
87067 | 43.2082599369847 | 2014-06-01 16:14:43.021094
25718 | -7.78245567240867 | 2014-07-23 05:53:10.036914
99538 | -5.82924078024423 | 2014-06-08 06:44:02.081918
71720 | 22.3102275898262 | 2014-06-15 08:24:00.327841
24740 | 4.65809369210996 | 2014-05-19 02:20:58.804213
56861 | -20.750980894431 | 2014-10-01 06:09:54.117367
47929 | -24.4018202994027 | 2014-11-24 13:39:54.096337
30772 | 46.7239395141247 | 2014-08-27 04:50:46.785239
(10 rows)
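To actually fill dw_core.t_temperatures_te, wrap the generating query in an INSERT ... SELECT. A sketch reusing the query above (note the join drops generated ci_id values with no match in t_city_ci, so slightly fewer than 1000000 rows come out):
INSERT INTO dw_core.t_temperatures_te ("ci_id", "te_temperature", "te_date")
SELECT c.ci_id, foo.temp, foo.time_2014
FROM (
    select
        (random() * 100000)::integer as ci_id,
        -30 + (random() * 80) as temp,
        '2014-01-01'::date + (random() * 365 * '1 day'::interval) as time_2014
    from generate_series(1, 1000000) s
) foo
INNER JOIN t_city_ci c ON c.ci_id = foo.ci_id;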

Calculate length of a series of line segments

I have a table like the following:
X | Y | Z | node
----------------
1 | 2 | 3 | 100
2 | 2 | 3 |
2 | 2 | 4 |
2 | 2 | 5 | 200
3 | 2 | 5 |
4 | 2 | 5 |
5 | 2 | 5 | 300
X, Y, Z are 3D space coordinates of some points, a curve passes through all the corresponding points from the first row to the last row. I need to calculate the curve length between two adjacent points whose "node" column aren't null.
It would be great if I could directly insert the result into another table that has three columns: "first_node", "second_node", "curve_length".
I don't need to interpolate extra points into the curve; I just need to accumulate the lengths of all the straight lines. For example, to calculate the curve length between nodes 100 and 200, I need to sum the lengths of 3 straight lines: (1,2,3)<->(2,2,3), (2,2,3)<->(2,2,4), (2,2,4)<->(2,2,5)
EDIT
The table has an ID column, which is in increasing order from the first row to the last row.
To get a previous value in SQL, use the lag window function, e.g.
SELECT
x,
lag(x) OVER (ORDER BY id) as prev_x, ...
FROM ...
ORDER BY id;
That lets you get the previous and next points in 3-D space for a given segment. From there you can trivially calculate the line segment length using regular geometric maths.
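For example, a minimal sketch of the per-segment length using lag() and the 3-D Euclidean distance (column names from the question, table name Table1 as in the full query below; the first row yields NULL, since it has no predecessor):
SELECT
    id,
    sqrt(
        (x - lag(x) OVER (ORDER BY id)) ^ 2 +
        (y - lag(y) OVER (ORDER BY id)) ^ 2 +
        (z - lag(z) OVER (ORDER BY id)) ^ 2
    ) AS seg_length
FROM Table1
ORDER BY id;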
You'll now have the lengths of each segment (sqlfiddle query). You can use this as input into other queries, using SELECT ... FROM (SELECT ...) subqueries or a CTE (WITH ....) term.
It turns out to be pretty awkward to go from the per-segment lengths to node-to-node lengths. You need to span the null entries, using either a recursive CTE or a window function.
I landed up with this monstrosity:
SELECT
    array_agg(from_id) AS seg_ids,
    -- 'max' is used here like 'coalesce' for an aggregate,
    -- since non-null is greater than null
    max(from_node) AS from_node,
    max(to_node) AS to_node,
    sum(seg_length) AS seg_length
FROM (
    -- lengths of all sub-segments with the null last segment
    -- removed and a partition counter added
    SELECT
        *,
        -- A running counter that increments when the
        -- node ID changes. Allows us to group by series
        -- of nodes in the outer query.
        sum(CASE WHEN from_node IS NULL THEN 0 ELSE 1 END) OVER (ORDER BY from_id) AS partition_id
    FROM (
        -- lengths of all sub-segments
        SELECT
            id AS from_id,
            lead(id, 1) OVER (ORDER BY id) AS to_id,
            -- length of sub-segment
            sqrt(
                (x - lead(x, 1) OVER (ORDER BY id)) ^ 2 +
                (y - lead(y, 1) OVER (ORDER BY id)) ^ 2 +
                (z - lead(z, 1) OVER (ORDER BY id)) ^ 2
            ) AS seg_length,
            node AS from_node,
            lead(node, 1) OVER (ORDER BY id) AS to_node
        FROM Table1
    ) sub
    -- filter out the last row
    WHERE to_id IS NOT NULL
) seglengths
-- Group into series of sub-segments between two nodes
GROUP BY partition_id;
Credit to How do I efficiently select the previous non-null value? for the partition trick.
Result:
 seg_ids | from_node | to_node | seg_length
---------+-----------+---------+------------
 {1,2,3} |       100 |     200 |          3
 {4,5,6} |       200 |     300 |          3
(2 rows)
To insert directly into another table, use INSERT INTO ... SELECT ....
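A sketch of that final step, assuming a target table named node_lengths (the name is hypothetical; the columns follow the question) and the aggregate query from above:
CREATE TABLE node_lengths (first_node int, second_node int, curve_length float8);

INSERT INTO node_lengths (first_node, second_node, curve_length)
SELECT max(from_node), max(to_node), sum(seg_length)
FROM (
    SELECT *,
           sum(CASE WHEN from_node IS NULL THEN 0 ELSE 1 END)
               OVER (ORDER BY from_id) AS partition_id
    FROM (
        SELECT id AS from_id,
               lead(id) OVER (ORDER BY id) AS to_id,
               sqrt((x - lead(x) OVER (ORDER BY id)) ^ 2 +
                    (y - lead(y) OVER (ORDER BY id)) ^ 2 +
                    (z - lead(z) OVER (ORDER BY id)) ^ 2) AS seg_length,
               node AS from_node,
               lead(node) OVER (ORDER BY id) AS to_node
        FROM Table1
    ) sub
    WHERE to_id IS NOT NULL
) seglengths
GROUP BY partition_id;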

sum every 3 rows of a table

I have the following query to count all data every minute.
$sql= "SELECT COUNT(*) AS count, date_trunc('minute', date) AS momento
FROM p WHERE fk_id_b=$id_b GROUP BY date_trunc('minute', date)
ORDER BY momento ASC";
What I need is, for each row, the sum of its count and the counts of the 2 previous minutes.
For example, with the result of the $sql query above:
|-------date---------|----count----|
|2012-06-21 05:20:00 | 12 |
|2012-06-21 05:21:00 | 14 |
|2012-06-21 05:22:00 | 10 |
|2012-06-21 05:23:00 | 20 |
|2012-06-21 05:24:00 | 25 |
|2012-06-21 05:25:00 | 30 |
|2012-06-21 05:26:00 | 10 |
I want this result:
|-------date---------|----count----|
|2012-06-21 05:20:00 | 12 |
|2012-06-21 05:21:00 | 26 | 12+14
|2012-06-21 05:22:00 | 36 | 12+14+10
|2012-06-21 05:23:00 | 44 | 14+10+20
|2012-06-21 05:24:00 | 55 | 10+20+25
|2012-06-21 05:25:00 | 75 | 20+25+30
|2012-06-21 05:26:00 | 65 | 25+30+10
Here's a more general solution for the sum of values from current and N previous rows (N=2 in your case).
SELECT "date",
sum("count") OVER (order by "date" ROWS BETWEEN 2 preceding AND current row)
FROM t
ORDER BY "date";
You can change N anywhere from 0 to "unbounded". This approach also lets you expose "count of the N past minutes" as a parameter in your app, and there is no need to handle default values for rows near the edge of the frame.
You can find more on this in the PostgreSQL docs (4.2.8. Window Function Calls).
This is not so tricky with the lag() window function (also on SQL Fiddle):
CREATE TABLE t ("date" timestamptz, "count" int4);
INSERT INTO t VALUES
('2012-06-21 05:20:00',12),
('2012-06-21 05:21:00',14),
('2012-06-21 05:22:00',10),
('2012-06-21 05:23:00',20),
('2012-06-21 05:24:00',25),
('2012-06-21 05:25:00',30),
('2012-06-21 05:26:00',10);
SELECT *,
"count"
+ coalesce(lag("count", 1) OVER (ORDER BY "date"), 0)
+ coalesce(lag("count", 2) OVER (ORDER BY "date"), 0) AS "total"
FROM t;
I've double-quoted the date and count columns, as these are reserved words;
lag(field, distance) gives the value of the field column from the row distance rows before the current one, so the first call gives the previous row's value and the second call the value from the row before that;
coalesce() is required to avoid a NULL result from lag() (for the first row there is no "previous" one, so lag() yields NULL); without it the total would also be NULL.
#vyegorov's answer covers it mostly. But I have more gripes than fit into a comment.
Don't use reserved words like date and count as identifiers at all. PostgreSQL allows those two particular key words as identifiers, unlike the SQL standard, but it's still bad practice. The fact that you can use anything inside double-quotes as an identifier, even "; DELETE FROM tbl;", does not make it a good idea. The name "date" for a timestamp is misleading on top of that.
Wrong data type. The example displays timestamp, not timestamptz. It does not make a difference here, but it's still misleading.
You don't need COALESCE(). The window functions lag() and lead() accept a default value as 3rd parameter:
Building on this setup:
CREATE TABLE tbl (ts timestamp, ct int4);
INSERT INTO tbl VALUES
('2012-06-21 05:20:00', 12)
, ('2012-06-21 05:21:00', 14)
, ('2012-06-21 05:22:00', 10)
, ('2012-06-21 05:23:00', 20)
, ('2012-06-21 05:24:00', 25)
, ('2012-06-21 05:25:00', 30)
, ('2012-06-21 05:26:00', 10);
Query:
SELECT ts, ct + lag(ct, 1, 0) OVER (ORDER BY ts)
+ lag(ct, 2, 0) OVER (ORDER BY ts) AS total
FROM tbl;
Or better yet: use a single sum() as window aggregate function with a custom window frame:
SELECT ts, sum(ct) OVER (ORDER BY ts ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
FROM tbl;
Same result.
Related:
Group by end of period instead of start date