Update Numeric field without Decimal point and Zeros - postgresql

I am trying to update a numeric field, but the field cannot have zeros after the decimal point. The table I am pulling values from contains data such as 87.00, 90.00, 100.00, etc. How do I update without the decimal point and trailing zeros?
Example: percent is a numeric field.
The values available for the update are 100.00, 90.00, etc.
update table1
set percent = (tmppercent as integer)
from table2
where table2.customid = table1.custid;
This gives an error.
Table1:
 CustID | Percent (numeric)
--------+-------------------
      1 | 90
      2 | 80
Table2:
 CustomID | tmpPercent (varchar)
----------+----------------------
        1 | 87.00
        2 | 90.00

I often use the typecast ::FLOAT::NUMERIC to get rid of the extra fractional zeros of numerics, or you can use the TRUNC() function to force truncation of the fractional part.
Try
update table1
set percent = tmppercent::FLOAT::NUMERIC
from table2
where table2.customid = table1.custid;
or
update table1
set percent = TRUNC(tmppercent::NUMERIC)
from table2
where table2.customid = table1.custid;

It is going to depend on how the numeric field is specified in the table. From here:
https://www.postgresql.org/docs/current/datatype-numeric.html
We use the following terms below: The precision of a numeric is the total count of significant digits in the whole number, that is, the number of digits to both sides of the decimal point. The scale of a numeric is the count of decimal digits in the fractional part, to the right of the decimal point. So the number 23.5141 has a precision of 6 and a scale of 4. Integers can be considered to have a scale of zero.
NUMERIC(precision, scale)
So if your field has a scale > 0, then you will see zeros to the right of the decimal point unless you set the scale to 0. As an example:
create table numeric_test (num_fld numeric(5,2), num_fld_0 numeric(5,0));
insert into numeric_test (num_fld, num_fld_0) values ('90.0', '90.0');
select * from numeric_test ;
num_fld | num_fld_0
---------+-----------
90.00 | 90
insert into numeric_test (num_fld, num_fld_0) values ('90.5', '90.5');
select * from numeric_test ;
num_fld | num_fld_0
---------+-----------
90.00 | 90
90.50 | 91
insert into numeric_test (num_fld, num_fld_0) values ('90.0'::float, '90.0'::float);
select * from numeric_test ;
num_fld | num_fld_0
---------+-----------
90.00 | 90
90.50 | 91
90.00 | 90
Using scale 0 means you have basically created an integer field. If you have a scale > 0 then you are going to get decimals in the field.
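If the goal is for table1.percent itself to never carry fractional zeros, another option is to give the column a scale of 0 (a sketch; it assumes the column may be retyped and that a precision of 5 is enough for your data):
alter table table1 alter column percent type numeric(5,0);
-- from now on, values assigned to percent are rounded to whole numbers,
-- so 87.00 is stored and displayed as 87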

postgresql: datatype numeric with limited digits

I am looking for a numeric datatype with a limited number of digits (before and after the decimal point).
The function below kills only digits after the decimal point. (It needs PG version >= 13 for trim_scale().)
create function num_flex(v numeric, d int) returns numeric as
$$
select case when v = 0 then 0
            when v < 1 and v > -1 then trim_scale(round(v, d - 1))
            else trim_scale(round(v, d - 1 - least(log(abs(v))::int, d - 1)))
       end;
$$
language sql;
For testing:
select num_flex( 0, 6)
union all
select num_flex( 1.22000, 6)
union all
select num_flex( (-0.000000123456789*10^x)::numeric,6)
from generate_series(1,15,3) t(x)
union all
select num_flex( (0.0000123456789*10^x)::numeric,6)
from generate_series(1,15,3) t(x) ;
It runs, but does someone have a better idea, or can you find a bug (a situation that is not handled)?
The next step is to integrate this in PG, so that I can write
select 12.123456789::num_flex6 ;
select 12.123456789::num_flex7 ;
for a num_flex datatype with 6 or 7 digits, with types ranging from num_flex2 to num_flex9. Is this possible?
There are a few problems with your function:
Accepting negative digit counts (parameter d). num_flex(1234,-2) returns 1200 - you specified you want the function to only kill digits after the decimal point, so 1234 would be expected.
Incorrect results between -1 and 1. num_flex(0.123,3) returns 0.12 instead of 0.123. I guess this might also be the desired effect if you do want to count the 0 to the left of the decimal point. Normally, that 0 is ignored when a number's precision and scale are considered.
Your counting of digits to the left of the decimal point is incorrect due to how ::int rounding works. log(abs(11))::int is 1 but log(abs(51))::int is 2. ceil(log(abs(v)))::int returns 2 in both cases, while keeping the int type so it still works as the 2nd parameter of round().
create or replace function num_flex(
    input_number numeric,
    digit_count int,
    is_counting_unit_zero boolean default false)
returns numeric as
$$
select trim_scale(
    case
        when input_number = 0
            then 0
        when digit_count <= 0  -- avoids negative rounding
            then round(input_number, 0)
        when (input_number between -1 and 1) and is_counting_unit_zero
            then round(input_number, digit_count - 1)
        when (input_number between -1 and 1)
            then round(input_number, digit_count)
        else
            round(input_number,
                  greatest(  -- avoids negative rounding
                      digit_count - (ceil(log(abs(input_number))))::int,
                      0)
            )
    end
);
$$
language sql;
Here's a test
select *,"result"="should_be"::numeric as "is_correct" from
(values
('num_flex(0.1234 ,4)',num_flex(0.1234 ,4), '0.1234'),
('num_flex(1.234 ,4)',num_flex(1.234 ,4), '1.234'),
('num_flex(1.2340000 ,4)',num_flex(1.2340000 ,4), '1.234'),
('num_flex(0001.234 ,4)',num_flex(0001.234 ,4), '1.234'),
('num_flex(123456 ,5)',num_flex(123456 ,5), '123456'),
('num_flex(0 ,5)',num_flex(0 ,5), '0'),
('num_flex(00000.00000 ,5)',num_flex(00000.00000 ,5), '0'),
('num_flex(00000.00001 ,5)',num_flex(00000.00001 ,5), '0.00001'),
('num_flex(12345678901 ,5)',num_flex(12345678901 ,5), '12345678901'),
('num_flex(123456789.1 ,5)',num_flex(123456789.1 ,5), '123456789'),
('num_flex(1.234 ,-4)',num_flex(1.234 ,4), '1.234')
) as t ("operation","result","should_be");
-- operation | result | should_be | is_correct
----------------------------+-------------+-------------+------------
-- num_flex(0.1234 ,4) | 0.1234 | 0.1234 | t
-- num_flex(1.234 ,4) | 1.234 | 1.234 | t
-- num_flex(1.2340000 ,4) | 1.234 | 1.234 | t
-- num_flex(0001.234 ,4) | 1.234 | 1.234 | t
-- num_flex(123456 ,5) | 123456 | 123456 | t
-- num_flex(0 ,5) | 0 | 0 | t
-- num_flex(00000.00000 ,5) | 0 | 0 | t
-- num_flex(00000.00001 ,5) | 0.00001 | 0.00001 | t
-- num_flex(12345678901 ,5) | 12345678901 | 12345678901 | t
-- num_flex(123456789.1 ,5) | 123456789 | 123456789 | t
-- num_flex(1.234 ,-4) | 1.234 | 1.234 | t
--(11 rows)
You can declare the precision (total number of digits) of your numeric data type in the column definition. Only digits after the decimal point will be rounded. If there are too many digits before the decimal point, you'll get an error.
The downside is that numeric(n) is actually numeric(n,0), which is dictated by the SQL standard. So if, by limiting the column's number of digits to 5, you want to have 12345.0 as well as 0.12345, there's no way you can configure numeric to hold both. numeric(5) will round 0.12345 to 0, while numeric(5,5) will dedicate all digits to the right of the decimal point and reject 12345.
create table test (numeric_column numeric(5));
insert into test values (12345.123);
table test;
-- numeric_column
------------------
-- 12345
--(1 row)
insert into test values (123456.123);
--ERROR: numeric field overflow
--DETAIL: A field with precision 5, scale 0 must round to an absolute value less than 10^5.
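For comparison, the numeric(5,5) side of that trade-off behaves like this (a sketch; test55 is a hypothetical table name):
create table test55 (numeric_column numeric(5,5));
insert into test55 values (0.12345);
-- OK: all five digits sit to the right of the decimal point
insert into test55 values (12345);
-- fails with: ERROR: numeric field overflow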

Retain only 3 highest positive and negative records in a table

I am new to databases and postgres as such.
I have a table called names which has 2 columns name and value which gets updated every x seconds with new name value pairs. My requirement is to retain only 3 positive and 3 negative values at any point of time and delete the rest of the rows during each table update.
I use the following query to delete the old rows and retain the 3 positive and 3 negative values ordered by value.
delete from names
using (select *,
row_number() over (partition by value > 0, value < 0 order by value desc) as rn
from names ) w
where w.rn >=3
I am skeptical about using a conditional like value > 0 in a partition clause. Is this approach correct?
For example,
A table like this prior to delete :
name | value
--------------
test | 10
test1 | 11
test1 | 12
test1 | 13
test4 | -1
test4 | -2
My table after delete should look like :
name | value
--------------
test1 | 13
test1 | 12
test1 | 11
test4 | -1
test4 | -2
demo:db<>fiddle
This works generally as expected: value > 0 clusters the values into all numbers > 0 and all numbers <= 0. The ORDER BY value orders these two groups as expected as well.
So, the only things I would change are:
row_number() over (partition by value >= 0 order by value desc)
remove , value < 0 (because: why partition the positive group again by negativity? You don't have any negative numbers in your positive group and vice versa)
change: value > 0 to value >= 0 to ignore the 0 as long as possible
For deleting: If you want to keep the top 3 values of each direction:
you should change w.rn >= 3 into w.rn > 3 (so the 3rd element is kept as well)
you need to connect the subquery to the table records. In real cases you should use id columns for that; in your example you could take the value column: where n.value = w.value AND w.rn > 3 (a sketch using an id column follows after the query below)
So, finally:
delete from names n
using (select *,
row_number() over (partition by value >= 0 order by value desc) as rn
from names ) w
where n.value = w.value AND w.rn > 3
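If the table has a unique id column (hypothetical here, since the example table defines none), it is safer to key the join on that instead of value:
delete from names n
using (select id,
              row_number() over (partition by value >= 0 order by value desc) as rn
       from names) w
where n.id = w.id and w.rn > 3;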
If it's not a hard requirement to delete the other rows, you could instead select only the rows you're interested in:
WITH largest AS (
SELECT name, value
FROM names
ORDER BY value DESC
LIMIT 3),
smallest AS (
SELECT name, value
FROM names
ORDER BY value ASC
LIMIT 3)
SELECT * FROM largest
UNION
SELECT * FROM smallest
ORDER BY value DESC

Multiply a column with each values of an array

How can I multiply a column value by each element present in an array without using a loop?
I have tried using a FOREACH loop that iterates over the array and multiplies each element by the column value.
CREATE OR REPLACE FUNCTION
public.test_p_offer_type_simulation1(offers numeric[])
RETURNS TABLE(sku character varying, cannibalisationrevenue double
precision, cannibalisationmargin double precision)
LANGUAGE plpgsql
AS $function$ declare a numeric []:= offers;
i numeric;
begin
foreach i in array a loop return QUERY
select
base.sku,
i * base.similar_sku,
.................
Suppose I have a column named 'baseline' and an array [1,2,3]; I want to multiply the baseline column value where id = 1 by each element of the array.
Example:
id | baseline
----+----------
1 | 3
Suppose I have an array with the values [2,3,4]; I want to multiply baseline = 3 by each element: (3*2), (3*3), (3*4), and return 3 rows after multiplication with the values 6, 9, 12.
The output should be:
 id | result | number
----+--------+--------
  1 |      6 |      2
  1 |      9 |      3
  1 |     12 |      4
OK, according to your description, just use the unnest function; example SQL below:
with tmp_table as (
select 1 as id, 3 as baseline, '{2,3,4}'::int[] as arr
)
select id,baseline*unnest(arr) as result,unnest(arr) as number from tmp_table;
id | result | number
----+--------+--------
1 | 6 | 2
1 | 9 | 3
1 | 12 | 4
(3 rows)
You can just replace the CTE tmp_table above with your real table name.
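Applied to an actual table, a sketch (baselines, id and baseline are placeholder names for your real table and columns) could look like:
select id,
       baseline * m as result,
       m as number
from baselines
cross join unnest('{2,3,4}'::int[]) as t(m)
where id = 1;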

Postgres Average calculation ignores null

This is my postgres table
name | revenue
--------+---------
John | 100
Will | 100
Tom | 100
Susan | 100
Ben |
(5 rows)
Here, when I calculate the average of revenue, it returns 100, which is clearly not sum/count: 400/5 is 80. Is this behaviour by design, or am I missing the point?
I know I could change NULL to 0 and proceed as usual. But, given the default behaviour, is this the intentional and preferred way of calculating the average?
This is both intentional and perfectly logical. Remember that NULL means that the value is unknown.
It might, for instance, represent a value which will be filled in at some future date. If the future value turns out to be 0, the average will be 400 / 5 = 80, as you say; but if the future value turns out to be 200, the average value will be 600 / 5 = 120 instead. All we can know right now is that the average of known values is 400 / 4 = 100.
If you actually know that you have 0 revenue for this item, you should store 0 in that column. If you don't know what revenue you have for that item, you should exclude it from your calculations, which is what Postgres, following the SQL Standard, does for you.
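To see both figures side by side (a minimal sketch, using the tbl table defined in the second answer below):
select avg(revenue)            as avg_of_known_values, -- 400 / 4 = 100, NULL rows ignored
       sum(revenue) / count(*) as avg_over_all_rows    -- 400 / 5 = 80, count(*) counts every row
from tbl;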
If you can't fix the data, but it is in fact a case that all NULLs in this table should be treated as 0 - or as some other fixed value - you can use a COALESCE inside the aggregate:
SELECT AVG(COALESCE(revenue, 0)) as forced_average
You should force a 0 value for null revenues.
create table tbl (name varchar(10), revenue int);
insert into tbl values
('John', 100), ('Will', 100), ('Tom', 100), ('Susan', 100), ('Ben', null);
5 rows affected
select avg(case when revenue is null then 0 else revenue end) from tbl;
| avg |
| ------------------: |
| 80.0000000000000000 |
select avg(coalesce(revenue,0)) from tbl;
| avg |
| ------------------: |
| 80.0000000000000000 |
dbfiddle here

Calculate length of a series of line segments

I have a table like the following:
X | Y | Z | node
----------------
1 | 2 | 3 | 100
2 | 2 | 3 |
2 | 2 | 4 |
2 | 2 | 5 | 200
3 | 2 | 5 |
4 | 2 | 5 |
5 | 2 | 5 | 300
X, Y, Z are 3D space coordinates of some points; a curve passes through all the corresponding points from the first row to the last row. I need to calculate the curve length between two adjacent points whose "node" column isn't null.
It would be great if I could directly insert the result into another table that has three columns: "first_node", "second_node", "curve_length".
I don't need to interpolate extra points into the curve; I just need to accumulate the lengths of all the straight lines. For example, in order to calculate the curve length between node 100 and 200, I need to sum the lengths of 3 straight lines: (1,2,3)<->(2,2,3), (2,2,3)<->(2,2,4), (2,2,4)<->(2,2,5).
EDIT
The table has an ID column, which is in increasing order from the first row to the last row.
To get a previous value in SQL, use the lag window function, e.g.
SELECT
x,
lag(x) OVER (ORDER BY id) as prev_x, ...
FROM ...
ORDER BY id;
That lets you get the previous and next points in 3-D space for a given segment. From there you can trivially calculate the line segment length using regular geometric maths.
You'll now have the lengths of each segment (sqlfiddle query). You can use this as input into other queries, using SELECT ... FROM (SELECT ...) subqueries or a CTE (WITH ....) term.
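In isolation, that per-segment length calculation is roughly the following (a sketch; table1 and its id column are assumed from the question's edit):
select id,
       node,
       sqrt(
           (x - lag(x) over (order by id)) ^ 2 +
           (y - lag(y) over (order by id)) ^ 2 +
           (z - lag(z) over (order by id)) ^ 2
       ) as length_from_previous_point
from table1
order by id;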
It turns out to be pretty awkward to go from the node segment lengths to node-to-node lengths. You need to create a table that spans the null entries, using a recursive CTE or with a window function.
I ended up with this monstrosity:
SELECT
    array_agg(from_id) AS seg_ids,
    -- 'max' is used here like 'coalesce' for an aggregate,
    -- since non-null is greater than null
    max(from_node) AS from_node,
    max(to_node) AS to_node,
    sum(seg_length) AS seg_length
FROM (
    -- lengths of all sub-segments with the null last segment
    -- removed and a partition counter added
    SELECT
        *,
        -- A running counter that increments when the
        -- node ID changes. Allows us to group by series
        -- of nodes in the outer query.
        sum(CASE WHEN from_node IS NULL THEN 0 ELSE 1 END) OVER (ORDER BY from_id) AS partition_id
    FROM
    (
        -- lengths of all sub-segments
        SELECT
            id AS from_id,
            lead(id, 1) OVER (ORDER BY id) AS to_id,
            -- length of sub-segment
            sqrt(
                (x - lead(x, 1) OVER (ORDER BY id)) ^ 2 +
                (y - lead(y, 1) OVER (ORDER BY id)) ^ 2 +
                (z - lead(z, 1) OVER (ORDER BY id)) ^ 2
            ) AS seg_length,
            node AS from_node,
            lead(node, 1) OVER (ORDER BY id) AS to_node
        FROM
            Table1
    ) sub
    -- filter out the last row
    WHERE to_id IS NOT NULL
) seglengths
-- Group into series of sub-segments between two nodes
GROUP BY partition_id;
Credit to How do I efficiently select the previous non-null value? for the partition trick.
Result:
 seg_ids | from_node | to_node | seg_length
---------+-----------+---------+------------
 {1,2,3} |       100 |     200 |          3
 {4,5,6} |       200 |     300 |          3
(2 rows)
To insert directly into another table, use INSERT INTO ... SELECT ....
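For the target table described in the question (first_node, second_node, curve_length), a condensed sketch of the same approach wrapped in an insert could look like this (node_lengths is a hypothetical table name; table1 as above):
with sub as (
    select node,
           lead(node) over (order by id) as to_node,
           lead(id)   over (order by id) as to_id,
           sqrt(
               (x - lead(x) over (order by id)) ^ 2 +
               (y - lead(y) over (order by id)) ^ 2 +
               (z - lead(z) over (order by id)) ^ 2
           ) as seg_length,
           -- counter that starts a new group at every row with a non-null node
           sum(case when node is null then 0 else 1 end) over (order by id) as partition_id
    from table1
)
insert into node_lengths (first_node, second_node, curve_length)
select max(node), max(to_node), sum(seg_length)
from sub
where to_id is not null  -- drop the dangling last row
group by partition_id;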