PostgreSQL select, show fixed count rows

Simple question. I have a table "tablename" with 3 rows. I need my select to show 5 rows whenever the table has fewer than 5 rows.
select * from tablename
+------------------+
|colname1 |colname2|
+---------+--------+
|1 |AAA |
|2 |BBB |
|3 |CCC |
+---------+--------+
This query shows all rows in the table.
But I need to show 5 rows, 2 of which are empty.
For example (what I need):
+------------------+
|colname1 |colname2|
+---------+--------+
|1 |AAA |
|2 |BBB |
|3 |CCC |
| | |
| | |
+---------+--------+
The last 2 rows are empty.
Is this possible?

Something like this:
with num_rows (rn) as (
select i
from generate_series(1,5) i -- adjust here the desired number of rows
), numbered_table as (
select colname1,
colname2,
row_number() over (order by colname1) as rn
from tablename
)
select t.colname1, t.colname2
from num_rows r
left outer join numbered_table t on r.rn = t.rn
order by r.rn; -- keeps the empty rows at the end
This assigns a number to each row in tablename and joins that to a fixed number of generated rows. If you know that the values in colname1 are always sequential and without gaps (which is highly unlikely), you can drop the row_number() step in the second CTE and join on colname1 directly.
If you don't care which rows are returned, you can leave out the order by inside row_number(), but then the rows that get matched are arbitrary. Leaving out the order by will be a bit more efficient.
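The above will always return exactly 5 rows, regardless of how many rows tablename contains. If you want at least 5 rows (every row of tablename, padded with empty rows up to 5), you need a full outer join instead. Merely flipping the left join would return exactly the rows of tablename, with no padding:
....
select t.colname1, t.colname2
from num_rows r
full outer join numbered_table t on r.rn = t.rn
order by coalesce(r.rn, t.rn);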
SQLFiddle example: http://sqlfiddle.com/#!15/e5770/3
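For anyone who wants to try the padding idea without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. It is an approximation, not the query above: SQLite's library build has no generate_series by default, so a recursive CTE stands in for it, and the function name pad_rows and its target parameter are made up for this example. It assumes an SQLite version with window function support (3.25 or later).

```python
import sqlite3


def pad_rows(rows, target=5):
    """Return the table rows, padded with (None, None) up to `target` rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tablename (colname1 INTEGER, colname2 TEXT)")
    con.executemany("INSERT INTO tablename VALUES (?, ?)", rows)
    query = """
        WITH RECURSIVE num_rows(rn) AS (
            -- stand-in for generate_series(1, :target)
            SELECT 1
            UNION ALL
            SELECT rn + 1 FROM num_rows WHERE rn < :target
        ),
        numbered_table AS (
            SELECT colname1, colname2,
                   row_number() OVER (ORDER BY colname1) AS rn
            FROM tablename
        )
        SELECT t.colname1, t.colname2
        FROM num_rows r
        LEFT JOIN numbered_table t ON r.rn = t.rn
        ORDER BY r.rn
    """
    result = con.execute(query, {"target": target}).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in pad_rows([(1, "AAA"), (2, "BBB"), (3, "CCC")]):
        print(row)
```

The unmatched generated rows come back as NULLs (None in Python), which is what shows up as the empty rows in the desired output.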

Related

SQL Join multiple table without repetition

I've got 3 tables
Table A
----------------------
| ID| Data1 | Data2 |
---------------------
| 1 |John | 2021 |
| 2 |Steve | 2020 |
Table B
----------------------
|Row|ID|Value1|Value2|
----------------------
|1 |1 |iR3000|0.5 |
|2 |1 |iRC252|0.7 |
|3 |2 |Dr2000|0.4 |
Table C
----------------------
|Row|ID|Value3|Value4|
----------------------
|1 |1 |aaaaaa|12345 |
|2 |1 |bbbbbb|6789 |
My goal is to get a result like this:
-------------------------------------------------
| ID| Data1 | Data2 |Value1|Value2|Value3|Value4|
-------------------------------------------------
| 1 |John | 2021 |iR3000|0.5 |aaaaaa|12345 |
| 1 |John | 2021 |iRC252|0.7 |bbbbbb|6789 |
| 2 |Steve | 2020 |Dr2000|0.4 |null |null |
Currently, with my query, ID 1 is duplicated 4 times.
Here is my query:
SELECT
a.id, a.data1,a.data2
,b.value1, b.value2
,c.value3,c.value4
FROM TableA a
JOIN TableB b
ON b.ID=a.ID
JOIN TableC c
ON c.ID=a.ID
What you had was close; only the JOIN to TableC was wrong. It needs to be an OUTER JOIN and also match on the Row column:
SELECT a.ID, a.Data1, a.Data2, b.Value1, b.Value2, c.Value3, c.Value4
FROM TableA a
INNER JOIN TableB b on b.ID = a.ID
LEFT JOIN TableC c on c.ID = b.ID AND c.Row = b.Row
Update based on the comment:
I cannot use row column cause they are not always match with the same number.
Okay. If the Row column at least exists, we can still work with that to create projections that might be more consistent between tables:
With TableB2 AS (
SELECT *, row_number() over (partition by ID order by row) As Row2
FROM TableB
),
TableC2 As (
SELECT *, row_number() over (partition by ID order by row) As Row2
FROM TableC
)
SELECT a.ID, a.Data1, a.Data2, b.Value1, b.Value2, c.Value3, c.Value4
FROM TableA a
INNER JOIN TableB2 b on b.ID = a.ID
LEFT JOIN TableC2 c on c.ID = b.ID AND c.Row2 = b.Row2
What we cannot do is rely on the order of the records on disk or the insertion order. There MUST be some field to indicate, e.g. the iR3000 row in TableB relates to the aaaaaa row in TableC rather than the bbbbbb row.
The order in which records appear in the table is not good enough. Databases are based on relational set theory, so what we think of as "Tables" are more formally defined as "Unordered Relations". Note the word "unordered" in that definition. While table order may seem to be stable over stretches, databases are free to re-order the rows on disk after insertion. They can and will do this to make queries more efficient, conform better with indexes, fill up pages, etc.
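As a quick way to check the per-ID numbering idea, here is a small sketch using Python's sqlite3 module as a stand-in for the real database (the original engine is not stated, so treat dialect details as assumptions). The sample rows are the ones from the question, except that the Row values in TableC are deliberately changed to 7 and 9 so they no longer line up with TableB. The function name join_by_position is made up for the example, and window functions require SQLite 3.25 or later.

```python
import sqlite3


def join_by_position():
    """Pair TableB and TableC rows by their position within each ID."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE TableA (ID INTEGER, Data1 TEXT, Data2 INTEGER);
        CREATE TABLE TableB ("Row" INTEGER, ID INTEGER, Value1 TEXT, Value2 REAL);
        CREATE TABLE TableC ("Row" INTEGER, ID INTEGER, Value3 TEXT, Value4 INTEGER);
        INSERT INTO TableA VALUES (1, 'John', 2021), (2, 'Steve', 2020);
        INSERT INTO TableB VALUES (1, 1, 'iR3000', 0.5), (2, 1, 'iRC252', 0.7),
                                  (3, 2, 'Dr2000', 0.4);
        -- Row numbers here intentionally do not match TableB
        INSERT INTO TableC VALUES (7, 1, 'aaaaaa', 12345), (9, 1, 'bbbbbb', 6789);
    """)
    query = """
        WITH TableB2 AS (
            SELECT *, row_number() OVER (PARTITION BY ID ORDER BY "Row") AS Row2
            FROM TableB
        ),
        TableC2 AS (
            SELECT *, row_number() OVER (PARTITION BY ID ORDER BY "Row") AS Row2
            FROM TableC
        )
        SELECT a.ID, a.Data1, a.Data2, b.Value1, b.Value2, c.Value3, c.Value4
        FROM TableA a
        INNER JOIN TableB2 b ON b.ID = a.ID
        LEFT JOIN TableC2 c ON c.ID = b.ID AND c.Row2 = b.Row2
        ORDER BY a.ID, b.Row2
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in join_by_position():
        print(row)
```

The pairing still depends on the Row column giving a meaningful order inside each ID; the numbering only makes the two sequences comparable.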

Given a row representing a path, union a total column

Say I have a table like the following that represents a path a -> b -> c -> d -> e -> f:
+------+----+--------+
| from | to | weight |
+------+----+--------+
| a | b | 1 |
| b | c | 2 |
| c | d | 1 |
| d | e | 1 |
| e | f | 3 |
+------+----+--------+
Each row knows where it came from and where it is going
I would like to union a total row that takes the starting name, ending name, and a total weight like so:
+------+----+--------+
| from | to | weight |
+------+----+--------+
| a | f | 8 |
+------+----+--------+
The first table is a result of a CTE expression, and I can easily get the total of the previous query with SUM, but I'm unable to get the LAST_VALUE to work in a similar way to:
WITH RECURSIVE cte AS (
...
)
SELECT *
FROM cte
UNION ALL
SELECT 'total', FIRST_VALUE(from), LAST_VALUE(to), SUM(weight)
FROM cte
The FIRST_VALUE and LAST_VALUE functions require OVER clauses which seem to add unnecessary complications to what I would expect, so I think I am going the wrong direction with that. Any ideas on how to achieve this?
So I made a strange solution that:
Selects the first from value (partitioned by TRUE)
Selects the last to value (partitioned by TRUE again)
Cross joins the sum of all weights, limited to 1
WITH RECURSIVE cte AS (
...
)
SELECT *
FROM cte
UNION ALL (
SELECT FIRST_VALUE(from) OVER (PARTITION BY TRUE), LAST_VALUE(to) OVER (PARTITION BY TRUE), total
FROM cte
CROSS JOIN (
SELECT SUM(weight) as total
FROM cte
) tmp
LIMIT 1
);
Is it hacky? Yes. Does it work? Also yes. I'm sure there are better solutions, and I would love to hear them.
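One possible alternative that avoids window functions altogether, sketched here with Python's sqlite3 module: in a simple path, the start node is the only "from" value that never appears as a "to", and the end node is the only "to" value that never appears as a "from". This assumes the rows form a single path with no cycles or branches. The table name path stands in for the elided CTE and the function name rows_with_total is invented for the example. The column names are double-quoted because from and to are reserved words.

```python
import sqlite3


def rows_with_total():
    """Return the path rows plus one (start, end, total weight) row."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE path ("from" TEXT, "to" TEXT, weight INTEGER);
        INSERT INTO path VALUES ('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 1),
                                ('d', 'e', 1), ('e', 'f', 3);
    """)
    query = """
        SELECT "from", "to", weight FROM path
        UNION ALL
        SELECT
            -- start: a node that is never a destination
            (SELECT "from" FROM path WHERE "from" NOT IN (SELECT "to" FROM path)),
            -- end: a node that is never a source
            (SELECT "to" FROM path WHERE "to" NOT IN (SELECT "from" FROM path)),
            (SELECT SUM(weight) FROM path)
    """
    result = con.execute(query).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in rows_with_total():
        print(row)
```

Because the start and end are derived from the data itself, the result does not depend on the order in which the CTE happens to return its rows.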

Left Join two tables - don't include the joins where second table has more than 1 row for value from first table; rejects

As title said, I want to reject rows, so I will not create duplicates.
And first step is not to join on values that have more rows in second table.
Here is an example if needed:
Table a:
aa |bb |
---|----|
1 |111 |
2 |222 |
Table h:
hh |kk |
---|----|
1 |111 |
2 |111 |
3 |222 |
Using Normal Left join:
SELECT
*
FROM a
LEFT JOIN h
ON a.bb = h.kk
;
I get:
aa |bb |hh |kk |
---|----|---|----|
1 |111 |1 |111 |
1 |111 |2 |111 |
2 |222 |3 |222 |
I want to get rid of first two rows, where aa = 1.
...
The second step would be another query, probably with some case, that keeps only those rows of table a which have more than 1 matching row in table h.
Therefore I want to create table c, where i will have:
aa |bb |
---|----|
1 |111 |
Can someone help me please?
Thank you.
To get only the 1:1 joins
SELECT a.aa, a.bb, MIN(h.hh) AS hh, MIN(h.kk) AS kk
FROM a
LEFT JOIN h ON a.bb = h.kk
GROUP BY a.aa, a.bb
HAVING COUNT(h.kk) = 1
To get only the 1:n joins
SELECT a.aa, a.bb
FROM a
LEFT JOIN h ON a.bb = h.kk
GROUP BY a.aa, a.bb
HAVING COUNT(h.kk) > 1
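If you would rather keep the real columns from h without grouping the outer query, another option is to count the matches per key in a derived table and filter on that count. The sketch below tries this on the sample data with Python's sqlite3 module; the alias cnt and the function name split_matches are made up for the example, and the engine in the question is unknown, so take the dialect as an assumption.

```python
import sqlite3


def split_matches():
    """Return (rows with exactly one match, rows of a with several matches)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE a (aa INTEGER, bb INTEGER);
        CREATE TABLE h (hh INTEGER, kk INTEGER);
        INSERT INTO a VALUES (1, 111), (2, 222);
        INSERT INTO h VALUES (1, 111), (2, 111), (3, 222);
    """)
    # Keys of h together with how many rows carry each key.
    counts = "(SELECT kk, COUNT(*) AS n FROM h GROUP BY kk)"
    one_to_one = con.execute(f"""
        SELECT a.aa, a.bb, h.hh, h.kk
        FROM a
        JOIN h ON a.bb = h.kk
        JOIN {counts} AS cnt ON cnt.kk = h.kk
        WHERE cnt.n = 1
        ORDER BY a.aa
    """).fetchall()
    rejected = con.execute(f"""
        SELECT a.aa, a.bb
        FROM a
        JOIN {counts} AS cnt ON cnt.kk = a.bb
        WHERE cnt.n > 1
        ORDER BY a.aa
    """).fetchall()
    con.close()
    return one_to_one, rejected


if __name__ == "__main__":
    print(split_matches())
```

The second list corresponds to the table c from the question: the rows of a that would have produced duplicates.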

Aggregating a table based on one column and then joining it with another table

I am working with the following two tables;
Table 1
Key |Clicks |Impressions
-------------+-------+-----------
USA-SIM-CARDS|55667 |544343
DE-SIM-CARDS |4563 |234829
AU-SIM-CARDS |3213 |232242
UK-SIM-CARDS |3213 |1333223
CA-SIM-CARDS |4321 |8883111
MX-SIM-CARDS |3193 |3291023
Table 2
Key |Conversions |Final Conversions|Active Sims
-----------------+------------+-----------------+-----------
USA-SIM-CARDS |456 |43 |4
USA-SIM-CARDS |65 |2 |1
UK-SIM-CARDS |123 |4 |3
UK-SIM-CARDS |145 |34 |5
The goal is to get the following output;
Key |Clicks |Impressions|Conversions|Final Conversions|Active Sims
-------------+-------+-----------+-----------+-----------------+-----------
USA-SIM-CARDS|55667 |544343 |521 |45 |5
DE-SIM-CARDS |4563 |234829 | | |
AU-SIM-CARDS |3213 |232242 | | |
UK-SIM-CARDS |3213 |1333223 |268 |38 |8
CA-SIM-CARDS |4321 |8883111 | | |
MX-SIM-CARDS |3193 |3291023 | | |
The most crucial part is aggregating the second table by key, summing the conversions.
I imagine I would then combine the result with the first table using an inner join.
Thank you.
Take this in two steps then:
1) Aggregate the second table:
SELECT Key, sum(Conversions) as Conversions, sum("Final Conversions") as FinalConversions, Sum("Active Sims") as ActiveSims FROM Table2 GROUP BY key
2) Use that as a subquery/derived table joining to your first table:
SELECT
t1.key,
t1.clicks,
t1.impressions,
t2.conversions,
t2.finalConversions,
t2.ActiveSims
From Table1 t1
LEFT OUTER JOIN (SELECT Key, sum(Conversions) as Conversions, sum("Final Conversions") as FinalConversions, Sum("Active Sims") as ActiveSims FROM Table2 GROUP BY Key) t2
ON t1.key = t2.key;
As an alternative, you could join and then group by as well since there isn't any need to aggregate twice or anything:
SELECT
t1.key,
t1.clicks,
t1.impressions,
sum(Conversions) as Conversions,
sum("Final Conversions") as FinalConversions,
Sum("Active Sims") as ActiveSims
From Table1 t1
LEFT OUTER JOIN table2 t2
ON t1.key = t2.key
GROUP BY t1.key, t1.clicks, t1.impressions
The only other important thing here is that we are using a LEFT OUTER JOIN, since we want all records from Table1 and only those records from Table2 that match on the key. An inner join would drop the keys that have no conversions.
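To check the aggregate-then-join approach against the sample data, here is a small sketch using Python's sqlite3 module in place of the actual database (which is not named in the question). The function name clicks_with_conversions is invented for the example, and Key is double-quoted only as a precaution because it is a keyword in some engines.

```python
import sqlite3


def clicks_with_conversions():
    """Map each key to (clicks, impressions, conversions, final, active)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Table1 ("Key" TEXT, Clicks INTEGER, Impressions INTEGER);
        CREATE TABLE Table2 ("Key" TEXT, Conversions INTEGER,
                             "Final Conversions" INTEGER, "Active Sims" INTEGER);
        INSERT INTO Table1 VALUES
            ('USA-SIM-CARDS', 55667, 544343), ('DE-SIM-CARDS', 4563, 234829),
            ('AU-SIM-CARDS', 3213, 232242),   ('UK-SIM-CARDS', 3213, 1333223),
            ('CA-SIM-CARDS', 4321, 8883111),  ('MX-SIM-CARDS', 3193, 3291023);
        INSERT INTO Table2 VALUES
            ('USA-SIM-CARDS', 456, 43, 4), ('USA-SIM-CARDS', 65, 2, 1),
            ('UK-SIM-CARDS', 123, 4, 3),   ('UK-SIM-CARDS', 145, 34, 5);
    """)
    query = """
        SELECT t1."Key", t1.Clicks, t1.Impressions,
               t2.Conversions, t2.FinalConversions, t2.ActiveSims
        FROM Table1 t1
        LEFT OUTER JOIN (
            -- step 1: one row per key in Table2
            SELECT "Key",
                   SUM(Conversions)         AS Conversions,
                   SUM("Final Conversions") AS FinalConversions,
                   SUM("Active Sims")       AS ActiveSims
            FROM Table2
            GROUP BY "Key"
        ) t2 ON t1."Key" = t2."Key"
    """
    result = {row[0]: row[1:] for row in con.execute(query)}
    con.close()
    return result


if __name__ == "__main__":
    for key, values in clicks_with_conversions().items():
        print(key, values)
```

Keys without any rows in Table2 come back with NULLs (None) in the three aggregated columns, matching the blank cells in the desired output.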

T-SQL generate sequence from string and count

I need to generate a sequence starting from a CSV string and a maximum count.
When the sequence is exhausted, I need to start it again and continue until I reach the COUNT variable.
I have the following CSV:
A,B,C,D
In order to get 4 rows out of this CSV I am using XML and the following statement:
SET @xml_csv = N'<root><r>' + replace('A, B, C, D',',','</r><r>') + '</r></root>'
SELECT
REPLACE(t.value('.','varchar(max)'), ' ', '') AS [delimited items]
FROM
@xml_csv.nodes('//root/r') AS a(t)
Now my SELECT returns the following output:
|-------------|
| A |
| B |
| C |
| D |
Assuming I have a @count variable set to 9, I need to output the following:
|--|-----------|
|1 |A |
|2 |B |
|3 |C |
|4 |D |
|5 |A |
|6 |B |
|7 |C |
|8 |D |
|9 |A |
I tried to join a table called master..[spt_values], but for a COUNT of 10 I get 10 rows for A, 10 rows for B and so on, while I need the sequence ordered and repeated until it reaches the count.
Basically you are on the correct path. Joining the split result with a numbers table will get you the correct output.
I've chosen to use a different function for splitting the csv data since it's using a numbers table for the split as well. (taken from this great article)
First, if you don't already have a numbers table, create one. Here is the script used in the article I've linked to:
SET NOCOUNT ON;
DECLARE @UpperLimit INT = 1000;
WITH n AS
(
SELECT
x = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
FROM sys.all_objects AS s1
CROSS JOIN sys.all_objects AS s2
CROSS JOIN sys.all_objects AS s3
)
SELECT Number = x
INTO dbo.Numbers
FROM n
WHERE x BETWEEN 1 AND @UpperLimit;
GO
CREATE UNIQUE CLUSTERED INDEX n ON dbo.Numbers(Number)
WITH (DATA_COMPRESSION = PAGE);
GO
Then, create the split function:
CREATE FUNCTION dbo.SplitStrings_Numbers
(
@List NVARCHAR(MAX),
@Delimiter NVARCHAR(255)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
SELECT Item = SUBSTRING(@List, Number,
CHARINDEX(@Delimiter, @List + @Delimiter, Number) - Number)
FROM dbo.Numbers
WHERE Number <= CONVERT(INT, LEN(@List))
AND SUBSTRING(@Delimiter + @List, Number, LEN(@Delimiter)) = @Delimiter
);
GO
Next step: Join the split results with the numbers table:
DECLARE @Csv varchar(20) = 'A,B,C,D'
SELECT TOP 10 Item
FROM dbo.SplitStrings_Numbers(@Csv, ',')
CROSS JOIN Numbers
ORDER BY Number
Output:
Item
----
A
B
C
D
A
B
C
D
A
B
Many thanks to Aaron Bertrand for sharing his knowledge.
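One caveat with the cross join: ORDER BY Number does not by itself fix the order of the items that share the same Number, so the A, B, C, D order within each block is not guaranteed. A variant that makes the cycle explicit is to give each item a position and join on the running number modulo the item count. The sketch below illustrates that with Python's sqlite3 module rather than T-SQL; the split is done in Python for brevity, a recursive CTE stands in for the numbers table, and the names repeat_items, items and pos are made up for the example.

```python
import sqlite3


def repeat_items(csv, count):
    """Cycle through the CSV items until `count` numbered rows are produced."""
    items = [part.strip() for part in csv.split(",")]
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE items (pos INTEGER, item TEXT)")
    # pos is the zero-based position of each item in the CSV string
    con.executemany("INSERT INTO items VALUES (?, ?)", enumerate(items))
    query = """
        WITH RECURSIVE numbers(n) AS (
            -- stand-in for a numbers table holding 1..:count
            SELECT 1
            UNION ALL
            SELECT n + 1 FROM numbers WHERE n < :count
        )
        SELECT nums.n, i.item
        FROM numbers AS nums
        JOIN items AS i
          ON i.pos = (nums.n - 1) % (SELECT COUNT(*) FROM items)
        ORDER BY nums.n
    """
    result = con.execute(query, {"count": count}).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in repeat_items("A, B, C, D", 9):
        print(row)
```

Each number matches exactly one item, so the ordering comes entirely from the number column and the output also carries the running index asked for in the question.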