Multiply a column with each values of an array - postgresql

How to multiply a column value with each element present in the array without using loop?
I have tried using the for each loop which iterates over the loop and multiplying each element with the column value.
CREATE OR REPLACE FUNCTION
public.test_p_offer_type_simulation1(offers numeric[])
RETURNS TABLE(sku character varying, cannibalisationrevenue double
precision, cannibalisationmargin double precision)
LANGUAGE plpgsql
AS $function$ declare a numeric []:= offers;
i numeric;
begin
foreach i in array a loop return QUERY
select
base.sku,
i * base.similar_sku,
.................
Suppose I have a column name 'baseline', and have an array [1,2,3], I want to multiply a baseline column value where its id =1 with each element of the array.
Example:::
id | baseline
----+----------
1 | 3
suppose I have an array with values [2,3,4]; I want to multiply baseline= 3 with (3 *2) , (3*3), (3*4). and return 3 rows after multiplication with values 6, 9, 12.
The output should be:
id | result| number
----+-------+---------
1 6 2
1 9 3
1 12 4

OK, according to your description, just use unnest function, the example SQL as below:
with tmp_table as (
select 1 as id, 3 as baseline, '{2,3,4}'::int[] as arr
)
select id,baseline*unnest(arr) as result,unnest(arr) as number from tmp_table;
id | result | number
----+--------+--------
1 | 6 | 2
1 | 9 | 3
1 | 12 | 4
(3 rows)
You can just replace the CTE table_tmp above to your real table name.

Related

Adding days to sysdate based off of values in a column

I am trying to create a manual table based off of a currently built views table.
The structure of this current table is as follows:
ID | Column1 | Column2 | Buffer Days
1 | Asdf | Asdf1 | 91
2 | Qwert | Qwert1 | 11
3 | Zxcv | Zxcv1 | 28
The goal is to add a 4th column after Buffer Days that lists the sys date + the number in buffer days
So the outcome would look like:
ID | Column1 | Column2 | Buffer Days | Lookout Date
1 | Asdf | Asdf1 | 91 | 02-Jan-18
That requirement smells like a virtual column candidate. However, it won't work:
SQL> create table test
2 (id number,
3 column1 varchar2(10),
4 buffer_days number,
5 --
6 lookout_date as (SYSDATE + buffer_days) --> virtual column
7 );
lookout_date as (SYSDATE + buffer_days)
*
ERROR at line 6:
ORA-54002: only pure functions can be specified in a virtual column expression
Obviously, as SYSDATE is a non-deterministic function (doesn't return the same value when invoked).
Why not an "ordinary" column in existing table? Because you shouldn't store values that are calculated using other table columns anyway. For example, good old Scott's EMP table contains SAL and COMM columns. It doesn't (and shouldn't) contain TOTAL_SAL column (as SAL + COMM) because - when SAL and/or COMM changes, you have to remember to update TOTAL as well.
Therefore, a view is what could help here. For example:
SQL> create table test
2 (id number,
3 column1 varchar2(10),
4 buffer_days number
5 );
Table created.
SQL> create or replace view v_test as
2 select id,
3 column1,
4 buffer_days,
5 sysdate + buffer_days lookout_date
6 from test;
View created.
SQL> insert into test (id, column1, buffer_days) values (1, 'asdf', 5);
1 row created.
SQL> select sysdate, v.* from v_test v;
SYSDATE ID COLUMN1 BUFFER_DAYS LOOKOUT_DA
---------- ---------- ---------- ----------- ----------
23.12.2017 1 asdf 5 28.12.2017
SQL>

How to sum rows in groups then filter rows based on the sum

I want to find the sum and count of all positive values in a table grouped by a type column. I am only interested in cases where sum >= 10 and count >= 2
For example,
type | value
------+-------
A | 10
B | 5
C | 7
B | 5
C | 6
C | -1
D | 3
D | 4
I want the result
type | sum | count
------+------+------
B | 10 | 2
C | 13 | 2
There should not be a row for A because count would only be 1. There should not be a row for D because the sum would only be 7. The negative value should be completely ignored.
I think the answer should be something like:
select type, sum(value) as sum, count(value) as count
from my_table
where value > 0
group by type
having sum >= 10 and count >= 2
However I am not sure how to correctly combine all of the relevant conditions.
select type, sum(value) as sum, count(type) as count
from my_table
where value > 0
group by type
having sum(value) >= 10 and count(type) >= 2

Coalesce overlapping time ranges in PostgreSQL

I have a PostgreSQL (9.4) table that contains time stamp ranges and user IDs, and I need to collapse any overlapping ranges (with the same user ID) into a single record.
I've tried a complicated set of CTEs to accomplish this, but there are some edge cases in our (40,000+ rows) real table that complicate matters. I've come to the conclusion that I probably need a recursive CTE, but I haven't had any luck writing it.
Here's some code to create a test table and populate it with data. This isn't the exact layout of our table, but it's close enough for an example.
CREATE TABLE public.test
(
id serial,
sessionrange tstzrange,
fk_user_id integer
);
insert into test (sessionrange, fk_user_id)
values
('[2016-01-14 11:57:01-05,2016-01-14 12:06:59-05]', 1)
,('[2016-01-14 12:06:53-05,2016-01-14 12:17:28-05]', 1)
,('[2016-01-14 12:17:24-05,2016-01-14 12:21:56-05]', 1)
,('[2016-01-14 18:18:00-05,2016-01-14 18:42:09-05]', 2)
,('[2016-01-14 18:18:08-05,2016-01-14 18:18:15-05]', 1)
,('[2016-01-14 18:38:12-05,2016-01-14 18:48:20-05]', 1)
,('[2016-01-14 18:18:16-05,2016-01-14 18:18:26-05]', 1)
,('[2016-01-14 18:18:24-05,2016-01-14 18:18:31-05]', 1)
,('[2016-01-14 18:18:12-05,2016-01-14 18:18:20-05]', 3)
,('[2016-01-14 19:32:12-05,2016-01-14 23:18:20-05]', 3)
,('[2016-01-14 18:18:16-05,2016-01-14 18:18:26-05]', 4)
,('[2016-01-14 18:18:24-05,2016-01-14 18:18:31-05]', 2);
I have found that I can do this to get the sessions sorted by the time they started:
select * from test order by fk_user_id, sessionrange
I could use this to determine whether an individual record overlaps with the previous, using window functions:
SELECT *, sessionrange && lag(sessionrange) OVER (PARTITION BY fk_user_id ORDER BY sessionrange)
FROM test
ORDER BY fk_user_id, sessionrange
But this only detects whether the single previous record overlaps the current one (see the record where id = 6). I need to detect all the way back to the beginning of the partition.
After that, I'd need to group any records that overlap together, to find the beginning of the earliest session and the end of the last session to terminate.
I'm sure there's a way to do this that I'm overlooking. How can I collapse these overlapping records?
It is relatively easy to merge overlapping ranges as elements of an array. For simplicity the following function returns set of tstzrange:
create or replace function merge_ranges(tstzrange[])
returns setof tstzrange language plpgsql as $$
declare
t tstzrange;
r tstzrange;
begin
foreach t in array $1 loop
if r && t then r:= r + t;
else
if r notnull then return next r;
end if;
r:= t;
end if;
end loop;
if r notnull then return next r;
end if;
end $$;
Just aggregate the ranges for a user and use the function:
select fk_user_id, merge_ranges(array_agg(sessionrange))
from test
group by 1
order by 1, 2
fk_user_id | merge_ranges
------------+-----------------------------------------------------
1 | ["2016-01-14 17:57:01+01","2016-01-14 18:21:56+01"]
1 | ["2016-01-15 00:18:08+01","2016-01-15 00:18:15+01"]
1 | ["2016-01-15 00:18:16+01","2016-01-15 00:18:31+01"]
1 | ["2016-01-15 00:38:12+01","2016-01-15 00:48:20+01"]
2 | ["2016-01-15 00:18:00+01","2016-01-15 00:42:09+01"]
3 | ["2016-01-15 00:18:12+01","2016-01-15 00:18:20+01"]
3 | ["2016-01-15 01:32:12+01","2016-01-15 05:18:20+01"]
4 | ["2016-01-15 00:18:16+01","2016-01-15 00:18:26+01"]
(8 rows)
Alternatively, the algorithm can be applied to the entire table in one function loop. I'm not sure but for a large dataset this method should be faster.
create or replace function merge_ranges_in_test()
returns setof test language plpgsql as $$
declare
curr test;
prev test;
begin
for curr in
select *
from test
order by fk_user_id, sessionrange
loop
if prev notnull and prev.fk_user_id <> curr.fk_user_id then
return next prev;
prev:= null;
end if;
if prev.sessionrange && curr.sessionrange then
prev.sessionrange:= prev.sessionrange + curr.sessionrange;
else
if prev notnull then
return next prev;
end if;
prev:= curr;
end if;
end loop;
return next prev;
end $$;
Results:
select *
from merge_ranges_in_test();
id | sessionrange | fk_user_id
----+-----------------------------------------------------+------------
1 | ["2016-01-14 17:57:01+01","2016-01-14 18:21:56+01"] | 1
5 | ["2016-01-15 00:18:08+01","2016-01-15 00:18:15+01"] | 1
7 | ["2016-01-15 00:18:16+01","2016-01-15 00:18:31+01"] | 1
6 | ["2016-01-15 00:38:12+01","2016-01-15 00:48:20+01"] | 1
4 | ["2016-01-15 00:18:00+01","2016-01-15 00:42:09+01"] | 2
9 | ["2016-01-15 00:18:12+01","2016-01-15 00:18:20+01"] | 3
10 | ["2016-01-15 01:32:12+01","2016-01-15 05:18:20+01"] | 3
11 | ["2016-01-15 00:18:16+01","2016-01-15 00:18:26+01"] | 4
(8 rows)
The problem is very interesting. I've tried to find a recursive solution but it seems the procedural attempt is most natural and efficient.
I have finally found a recursive solution. The query deletes overlapping rows and inserts their compacted equivalent:
with recursive cte (user_id, ids, range) as (
select t1.fk_user_id, array[t1.id, t2.id], t1.sessionrange + t2.sessionrange
from test t1
join test t2
on t1.fk_user_id = t2.fk_user_id
and t1.id < t2.id
and t1.sessionrange && t2.sessionrange
union all
select user_id, ids || t.id, range + sessionrange
from cte
join test t
on user_id = t.fk_user_id
and ids[cardinality(ids)] < t.id
and range && t.sessionrange
),
list as (
select distinct on(id) id, range, user_id
from cte, unnest(ids) id
order by id, upper(range)- lower(range) desc
),
deleted as (
delete from test
where id in (select id from list)
)
insert into test
select distinct on (range) id, range, user_id
from list
order by range, id;
Results:
select *
from test
order by 3, 2;
id | sessionrange | fk_user_id
----+-----------------------------------------------------+------------
1 | ["2016-01-14 17:57:01+01","2016-01-14 18:21:56+01"] | 1
5 | ["2016-01-15 00:18:08+01","2016-01-15 00:18:15+01"] | 1
7 | ["2016-01-15 00:18:16+01","2016-01-15 00:18:31+01"] | 1
6 | ["2016-01-15 00:38:12+01","2016-01-15 00:48:20+01"] | 1
4 | ["2016-01-15 00:18:00+01","2016-01-15 00:42:09+01"] | 2
9 | ["2016-01-15 00:18:12+01","2016-01-15 00:18:20+01"] | 3
10 | ["2016-01-15 01:32:12+01","2016-01-15 05:18:20+01"] | 3
11 | ["2016-01-15 00:18:16+01","2016-01-15 00:18:26+01"] | 4
(8 rows)

Calculate length of a series of line segments

I have a table like the following:
X | Y | Z | node
----------------
1 | 2 | 3 | 100
2 | 2 | 3 |
2 | 2 | 4 |
2 | 2 | 5 | 200
3 | 2 | 5 |
4 | 2 | 5 |
5 | 2 | 5 | 300
X, Y, Z are 3D space coordinates of some points, a curve passes through all the corresponding points from the first row to the last row. I need to calculate the curve length between two adjacent points whose "node" column aren't null.
If would be great if I can directly insert the result into another table that has three columns: "first_node", "second_node", "curve_length".
I don't need to interpolate extra points into the curve, just need to accumulate lengths all the straight lines, for example, in order to calculate the curve length between node 100 and 200, I need to sum the lengths of 3 straight lines: (1,2,3)<->(2,2,3), (2,2,3)<->(2,2,4), (2,2,4)<->(2,2,5)
EDIT
The table has an ID column, which is in increasing order from the first row to the last row.
To get a previous value in SQL, use the lag window function, e.g.
SELECT
x,
lag(x) OVER (ORDER BY id) as prev_x, ...
FROM ...
ORDER BY id;
That lets you get the previous and next points in 3-D space for a given segment. From there you can trivially calculate the line segment length using regular geometric maths.
You'll now have the lengths of each segment (sqlfiddle query). You can use this as input into other queries, using SELECT ... FROM (SELECT ...) subqueries or a CTE (WITH ....) term.
It turns out to be pretty awkward to go from the node segment lengths to node-to-node lengths. You need to create a table that spans the null entries, using a recursive CTE or with a window function.
I landed up with this monstrosity:
SELECT
array_agg(from_id) AS seg_ids,
-- 'max' is used here like 'coalese' for an aggregate,
-- since non-null is greater than null
max(from_node) AS from_node,
max(to_node) AS to_node,
sum(seg_length) AS seg_length
FROM (
-- lengths of all sub-segments with the null last segment
-- removed and a partition counter added
SELECT
*,
-- A running counter that increments when the
-- node ID changes. Allows us to group by series
-- of nodes in the outer query.
sum(CASE WHEN from_node IS NULL THEN 0 ELSE 1 END) OVER (ORDER BY from_id) AS partition_id
FROM
(
-- lengths of all sub-segments
SELECT
id AS from_id,
lead(id, 1) OVER (ORDER BY id) AS to_id,
-- length of sub-segment
sqrt(
(x - lead(x, 1) OVER (ORDER BY id)) ^ 2 +
(y - lead(y, 1) OVER (ORDER BY id)) ^ 2 +
(z - lead(z, 1) OVER (ORDER BY id)) ^ 2
) AS seg_length,
node AS from_node,
lead(node, 1) OVER (ORDER BY id) AS to_node
FROM
Table1
) sub
-- filter out the last row
WHERE to_id IS NOT NULL
) seglengths
-- Group into series of sub-segments between two nodes
GROUP BY partition_id;
Credit to How do I efficiently select the previous non-null value? for the partition trick.
Result:
seg_ids | to_node | from_node | seg_length
---------+---------+---------+------------
{1,2,3} | 100 | 200 | 3
{4,5,6} | 200 | 300 | 3
(2 rows)
To insert directly into another table, use INSERT INTO ... SELECT ....

Get integer part of number

So I have a table with numbers in decimals, say
id value
2323 2.43
4954 63.98
And I would like to get
id value
2323 2
4954 63
Is there a simple function in T-SQL to do that?
SELECT FLOOR(value)
http://msdn.microsoft.com/en-us/library/ms178531.aspx
FLOOR returns the largest integer less than or equal to the specified numeric expression.
Assuming you are OK with truncation of the decimal part you can do:
SELECT Id, CAST(value AS INT) INTO IntegerTable FROM NumericTable
FLOOR,CAST... do not return integer part with negative numbers, a solution is to define an internal procedure for the integer part:
DELIMITER //
DROP FUNCTION IF EXISTS INTEGER_PART//
CREATE FUNCTION INTEGER_PART(n DOUBLE)
RETURNS INTEGER
DETERMINISTIC
BEGIN
IF (n >= 0) THEN RETURN FLOOR(n);
ELSE RETURN CEILING(n);
END IF;
END
//
MariaDB [sidonieDE]> SELECT INTEGER_PART(3.7);
+-------------------+
| INTEGER_PART(3.7) |
+-------------------+
| 3 |
+-------------------+
1 row in set (0.00 sec)
MariaDB [sidonieDE]> SELECT INTEGER_PART(-3.7);
+--------------------+
| INTEGER_PART(-3.7) |
+--------------------+
| -3 |
+--------------------+
1 row in set (0.00 sec)
after you can use the procedure in a query like that:
SELECT INTEGER_PART(value) FROM table;
if you do not want to define an internal procedure in the database you can put an IF in a query like that:
select if(value < 0,CEILING(value),FLOOR(value)) from table ;