How to create multiple rows from a single row in Redshift SQL - amazon-redshift

I want to expand a single row to multiple rows in my table based on a column in the table in AWS Redshift.
Here is my example table schema and rows:
CREATE TABLE test (
start timestamp, -- start time of the first slot
slot_length int, -- the length of the slots in minutes
repeat int -- how many slots will be there
);
INSERT INTO test (start, slot_length, repeat) VALUES
('2019-09-22T00:00:00', 90, 2),
('2019-09-21T15:30:00', 60, 3);
I want to expand these two rows into 5 based on the value of the "repeat" column. So any row will be expanded "repeat" times. The first expansion won't change anything. The subsequent expansions need to add "slot_length" to the "start" column. Here is the final list of rows I want to have in the end:
'2019-09-22 00:00:00', 90, 2 -- expanded from the first row
'2019-09-22 01:30:00', 90, 2 -- expanded from the first row
'2019-09-21 15:30:00', 60, 3 -- expanded from the second row
'2019-09-21 16:30:00', 60, 3 -- expanded from the second row
'2019-09-21 17:30:00', 60, 3 -- expanded from the second row
Can this be done via pure SQL in Redshift?

This SQL should solves your purpose. Kindly Up-vote if it does.
select t.start
, case when rpt.repeat>1
then dateadd(min,t.slot_length*(rpt.repeat-1),t.start)
else t.start
end as new_start
, t.slot_length
, t.repeat
from schema.test t
join (select row_number() over() as repeat from schema.random_table) rpt
on t.repeat>=rpt.repeat
order by t.slot_length desc,rpt.repeat;
Please note that the "random_table" in your schema should have at least as many rows as the maximum value in your "repeat" column.

Related

Fast new row insertion if a value of a column depends on previous value in existing row

I have a table cusers with a primary key:
primary key(uid, lid, cnt)
And I try to insert some values into the table:
insert into cusers (uid, lid, cnt, dyn, ts)
values
(A, B, C, (
select C - cnt
from cusers
where uid = A and lid = B
order by ts desc
limit 1
), now())
on conflict do nothing
Quite often (with the possibility of 98%) a row cannot be inserted to cusers because it violates the primary key constraint, so hard select queries do not need to be executed at all. But as I can see PostgreSQL first counts the select query as a result of dyn column and only then rejects row because of uid, lid, cnt violation.
What is the best way to insert rows quickly in such situation?
Another explanation
I have a system where one row depends on another. Here is an example:
(x, x, 2, 2, <timestamp>)
(x, x, 5, 3, <timestamp>)
Two columns contain an absolute value (2 and 5) and relative value (2, 5 - 2). Each time I insert new row it should:
avoid same rows (see primary key constraint)
if new row differs, it should count a difference and put it into the dyn column (so I take the last inserted row for the user according to the timestamp and subtract values).
Another solution I've found is to use returning uid, lid, ts for inserts and get user ids which were really inserted - this is how I know they have differences from existing rows. Then I update inserted values:
update cusers
set dyn = (
select max(cnt) - min(cnt)
from (
select cnt
from cusers
where uid = A and lid = B
order by ts desc
limit 2) Table
)
where uid = A and lid = B and ts = TS
But it is not a fast approach either, as it seeks all over the ts column to find the two last inserted rows for each user. I need a fast insert query as I insert millions of rows at a time (but I do not write duplicates).
What the solution can be? May be I need a new index for this? Thanks in advance.

Make rows to Columns in Postgresql [duplicate]

Does any one know how to create crosstab queries in PostgreSQL?
For example I have the following table:
Section Status Count
A Active 1
A Inactive 2
B Active 4
B Inactive 5
I would like the query to return the following crosstab:
Section Active Inactive
A 1 2
B 4 5
Is this possible?
Install the additional module tablefunc once per database, which provides the function crosstab(). Since Postgres 9.1 you can use CREATE EXTENSION for that:
CREATE EXTENSION IF NOT EXISTS tablefunc;
Improved test case
CREATE TABLE tbl (
section text
, status text
, ct integer -- "count" is a reserved word in standard SQL
);
INSERT INTO tbl VALUES
('A', 'Active', 1), ('A', 'Inactive', 2)
, ('B', 'Active', 4), ('B', 'Inactive', 5)
, ('C', 'Inactive', 7); -- ('C', 'Active') is missing
Simple form - not fit for missing attributes
crosstab(text) with 1 input parameter:
SELECT *
FROM crosstab(
'SELECT section, status, ct
FROM tbl
ORDER BY 1,2' -- needs to be "ORDER BY 1,2" here
) AS ct ("Section" text, "Active" int, "Inactive" int);
Returns:
Section | Active | Inactive
---------+--------+----------
A | 1 | 2
B | 4 | 5
C | 7 | -- !!
No need for casting and renaming.
Note the incorrect result for C: the value 7 is filled in for the first column. Sometimes, this behavior is desirable, but not for this use case.
The simple form is also limited to exactly three columns in the provided input query: row_name, category, value. There is no room for extra columns like in the 2-parameter alternative below.
Safe form
crosstab(text, text) with 2 input parameters:
SELECT *
FROM crosstab(
'SELECT section, status, ct
FROM tbl
ORDER BY 1,2' -- could also just be "ORDER BY 1" here
, $$VALUES ('Active'::text), ('Inactive')$$
) AS ct ("Section" text, "Active" int, "Inactive" int);
Returns:
Section | Active | Inactive
---------+--------+----------
A | 1 | 2
B | 4 | 5
C | | 7 -- !!
Note the correct result for C.
The second parameter can be any query that returns one row per attribute matching the order of the column definition at the end. Often you will want to query distinct attributes from the underlying table like this:
'SELECT DISTINCT attribute FROM tbl ORDER BY 1'
That's in the manual.
Since you have to spell out all columns in a column definition list anyway (except for pre-defined crosstabN() variants), it is typically more efficient to provide a short list in a VALUES expression like demonstrated:
$$VALUES ('Active'::text), ('Inactive')$$)
Or (not in the manual):
$$SELECT unnest('{Active,Inactive}'::text[])$$ -- short syntax for long lists
I used dollar quoting to make quoting easier.
You can even output columns with different data types with crosstab(text, text) - as long as the text representation of the value column is valid input for the target type. This way you might have attributes of different kind and output text, date, numeric etc. for respective attributes. There is a code example at the end of the chapter crosstab(text, text) in the manual.
db<>fiddle here
Effect of excess input rows
Excess input rows are handled differently - duplicate rows for the same ("row_name", "category") combination - (section, status) in the above example.
The 1-parameter form fills in available value columns from left to right. Excess values are discarded.
Earlier input rows win.
The 2-parameter form assigns each input value to its dedicated column, overwriting any previous assignment.
Later input rows win.
Typically, you don't have duplicates to begin with. But if you do, carefully adjust the sort order to your requirements - and document what's happening.
Or get fast arbitrary results if you don't care. Just be aware of the effect.
Advanced examples
Pivot on Multiple Columns using Tablefunc - also demonstrating mentioned "extra columns"
Dynamic alternative to pivot with CASE and GROUP BY
\crosstabview in psql
Postgres 9.6 added this meta-command to its default interactive terminal psql. You can run the query you would use as first crosstab() parameter and feed it to \crosstabview (immediately or in the next step). Like:
db=> SELECT section, status, ct FROM tbl \crosstabview
Similar result as above, but it's a representation feature on the client side exclusively. Input rows are treated slightly differently, hence ORDER BY is not required. Details for \crosstabview in the manual. There are more code examples at the bottom of that page.
Related answer on dba.SE by Daniel Vérité (the author of the psql feature):
How do I generate a pivoted CROSS JOIN where the resulting table definition is unknown?
SELECT section,
SUM(CASE status WHEN 'Active' THEN count ELSE 0 END) AS active, --here you pivot each status value as a separate column explicitly
SUM(CASE status WHEN 'Inactive' THEN count ELSE 0 END) AS inactive --here you pivot each status value as a separate column explicitly
FROM t
GROUP BY section
You can use the crosstab() function of the additional module tablefunc - which you have to install once per database. Since PostgreSQL 9.1 you can use CREATE EXTENSION for that:
CREATE EXTENSION tablefunc;
In your case, I believe it would look something like this:
CREATE TABLE t (Section CHAR(1), Status VARCHAR(10), Count integer);
INSERT INTO t VALUES ('A', 'Active', 1);
INSERT INTO t VALUES ('A', 'Inactive', 2);
INSERT INTO t VALUES ('B', 'Active', 4);
INSERT INTO t VALUES ('B', 'Inactive', 5);
SELECT row_name AS Section,
category_1::integer AS Active,
category_2::integer AS Inactive
FROM crosstab('select section::text, status, count::text from t',2)
AS ct (row_name text, category_1 text, category_2 text);
DB Fiddle here:
Everything works: https://dbfiddle.uk/iKCW9Uhh
Without CREATE EXTENSION tablefunc; you get this error: https://dbfiddle.uk/j8W1CMvI
ERROR: function crosstab(unknown, integer) does not exist
LINE 4: FROM crosstab('select section::text, status, count::text fro...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
Solution with JSON aggregation:
CREATE TEMP TABLE t (
section text
, status text
, ct integer -- don't use "count" as column name.
);
INSERT INTO t VALUES
('A', 'Active', 1), ('A', 'Inactive', 2)
, ('B', 'Active', 4), ('B', 'Inactive', 5)
, ('C', 'Inactive', 7);
SELECT section,
(obj ->> 'Active')::int AS active,
(obj ->> 'Inactive')::int AS inactive
FROM (SELECT section, json_object_agg(status,ct) AS obj
FROM t
GROUP BY section
)X
Sorry this isn't complete because I can't test it here, but it may get you off in the right direction. I'm translating from something I use that makes a similar query:
select mt.section, mt1.count as Active, mt2.count as Inactive
from mytable mt
left join (select section, count from mytable where status='Active')mt1
on mt.section = mt1.section
left join (select section, count from mytable where status='Inactive')mt2
on mt.section = mt2.section
group by mt.section,
mt1.count,
mt2.count
order by mt.section asc;
The code I'm working from is:
select m.typeID, m1.highBid, m2.lowAsk, m1.highBid - m2.lowAsk as diff, 100*(m1.highBid - m2.lowAsk)/m2.lowAsk as diffPercent
from mktTrades m
left join (select typeID,MAX(price) as highBid from mktTrades where bid=1 group by typeID)m1
on m.typeID = m1.typeID
left join (select typeID,MIN(price) as lowAsk from mktTrades where bid=0 group by typeID)m2
on m1.typeID = m2.typeID
group by m.typeID,
m1.highBid,
m2.lowAsk
order by diffPercent desc;
which will return a typeID, the highest price bid and the lowest price asked and the difference between the two (a positive difference would mean something could be bought for less than it can be sold).
There's a different dynamic method that I've devised, one that employs a dynamic rec. type (a temp table, built via an anonymous procedure) & JSON. This may be useful for an end-user who can't install the tablefunc/crosstab extension, but can still create temp tables or run anon. proc's.
The example assumes all the xtab columns are the same type (INTEGER), but the # of columns is data-driven & variadic. That said, JSON aggregate functions do allow for mixed data types, so there's potential for innovation via the use of embedded composite (mixed) types.
The real meat of it can be reduced down to one step if you want to statically define the rec. type inside the JSON recordset function (via nested SELECTs that emit a composite type).
dbfiddle.uk
https://dbfiddle.uk/N1EzugHk
Crosstab function is available under the tablefunc extension. You'll have to create this extension one time for the database.
CREATE EXTENSION tablefunc;
You can use the below code to create pivot table using cross tab:
create table test_Crosstab( section text,
status text,
count numeric)
insert into test_Crosstab values ( 'A','Active',1)
,( 'A','Inactive',2)
,( 'B','Active',4)
,( 'B','Inactive',5)
select * from crosstab(
'select section
,status
,count
from test_crosstab'
)as ctab ("Section" text,"Active" numeric,"Inactive" numeric)

Convert rows to columns in SQL Server table

My table has the below sample data:
DECLARE #FHTable table (PK_ID int,FK_ID int,P_ID int, T_ID int, A_day int,A_hour TIME,D_day int,D_hour time)
INSERT INTO #FHtable VALUES (129,194,252,1005322,NULL,NULL,1,'02:30:00.0000000')
INSERT INTO #FHtable VALUES (130,194,311,1000891,3,'04:30:00.0000000',null,null)
INSERT INTO #FHtable VALUES (131,194,311,1000129,NULL,NULL,4,'03:30:00.0000000')
INSERT INTO #FHtable VALUES (132,194,252,1000025,6,'03:00:00.0000000',null,null)
SELECT * FROM #FHtable
My final result Should be of the below table:
DECLARE #FinalResultTable TABLE (FK_ID int,P_IDFrom int,P_IDTO INT, T_IDFrom int,T_IDTo INT, A_day int,A_hour TIME,D_day int,D_hour time)
INSERT INTO #FinalResultTable VALUES (194,252,311,1005322,1000891,3,'04:30:00.0000000',1,'02:30:00.0000000')
INSERT INTO #FinalResultTable VALUES (194,311,252,1000129,1000025,6,'03:00:00.0000000',4,'03:30:00.0000000')
select * from #FinalResultTable
The logic is there will be 4 rows for each FK_ID. The first and the second row is a source to destination and the 3rd and 4th row is again a source to destination.
Can you please help
As I pointed out in the comments, it's not very clear what is the logic that joins each arrival with the corresponding departure.
What it's clear to me it's that your title is wrong: you are not talking about converting rows to columns. Instead, you are just grouping 2 by 2 the rows of the table. So, you have 2 approaches, using a GROUP BY or just using a JOIN, it depends on what is exactly your logic.
Here is an example:
select A.FK_ID,
A.PK_ID as P_IDFrom, B.PK_ID as P_IDTO,
A.T_ID as T_IDFrom int, B.T_ID as T_IDTo,
B.A_day, B.A_hour,
A.D_day, A.D_hour
from #FHTable A
join #FHTable B on B.PK_ID+1=A.PK_ID
where A.A_day is null
(this assumes that an arrivals's PK is always +1 w.r.t. its departure).

Multiply rows by difference of numbers in columns, with sequence list

I need to create a table using postgres that multiplies a row by the difference of the numbers in 2 columns, and provides the corresponding sequence. It's hard to explain, I'll leave a picture to save us a thousand words:
I have found a partial answer to this question in SQL, but it only multiplies by one column, and I'm having trouble with using it in Posgresql:
How to multiply a single row with a number from column in sql.
You can use the generate_series function: https://www.postgresql.org/docs/current/static/functions-srf.html
create table table_a(
a integer primary key,
start_a integer,
end_a integer
);
insert into table_a values
(1, 1, 3),
(2, 2, 5);
create table table_b as
select a, start_a, end_a, g as start_b, g+1 as end_b
from table_a, lateral generate_series(start_a, end_a-1) g;
select * from table_b;
You can try it here: http://rextester.com/RTZWK4070

Using two different rows from the same table in an expression

I'm using PostgreSQL + PostGIS.
In table I have a point and line geometry in the same column of the same table, in different rows. To get the line I run:
SELECT the_geom
FROM filedata
WHERE id=3
If i want to take point I run:
SELECT the_geom
FROM filedata
WHERE id=4
I want take point and line together, like they're shown in this WITH expression, but using a real query against the table instead:
WITH data AS (
SELECT 'LINESTRING (50 40, 40 60, 50 90, 30 140)'::geometry AS road,
'POINT (60 110)'::geometry AS poi)
SELECT ST_AsText(
ST_Line_Interpolate_Point(road, ST_Line_Locate_Point(road, poi))) AS projected_poi
FROM data;
You see in this example data comes from a hand-created WITH expression. I want take it from my filedata table. My problem is i dont know how to work with data from two different rows of one table at the same time.
One possible way:
A subquery to retrieve another value from a different row.
SELECT ST_AsText(
ST_Line_Interpolate_Point(
the_geom
,ST_Line_Locate_Point(
the_geom
,(SELECT the_geom FROM filedata WHERE id = 4)
)
)
) AS projected_poi
FROM filedata
WHERE id = 3;
Use a self-join:
SELECT ST_AsText(
ST_Line_Interpolate_Point(fd_road.the_geom, ST_Line_Locate_Point(
fd_road.the_geom,
fd_poi.the_geom
)) AS projected_poi
FROM filedata fd_road, filedata fd_poi
WHERE fd_road.id = 3 AND fd_poi.id = 4;
Alternately use a subquery to fetch the other row, as Erwin pointed out.
The main options for using multiple rows from one table in a single expression are:
Self-join the table with two different aliases as shown above, then filter the rows;
Use a subquery expression to get a value for all but one of the rows, as Erwin's answer shows;
Use a window function like lag() and lead() to get a row relative to the current row within the query result; or
JOIN on a subquery that returns a table
The latter two are more advanced options that solve problems that're difficult or inefficient to solve with the simpler self-join or subquery expression.