Is there a query that would push previous ids into an array, that would result in:
id c_array
-----------------
1 {}
2 {1}
3 {1,2}
4 {1,2,3}
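I think you want to use the array_agg aggregate as a window function. With only ORDER BY, the default frame includes the current row, so restrict the frame to the preceding rows and turn the NULL of the first row into an empty array:
SELECT
  id,
  COALESCE(
    array_agg(id) OVER (ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),
    '{}'
  ) AS c_array
FROM the_table;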
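With only ORDER BY in the window, the default frame runs up to and including the current row, so each array also contains the row's own id. Here is a minimal sketch of the difference, assuming generate_series(1, 4) as a stand-in for the_table (the column aliases are made up for illustration):

```sql
-- generate_series(1, 4) stands in for the_table (assumption).
SELECT
  id,
  -- default frame: everything up to and including the current row
  array_agg(id) OVER (ORDER BY id) AS with_current_row,
  -- explicit frame: only the rows before the current one;
  -- the empty frame of the first row yields NULL, hence the COALESCE
  COALESCE(
    array_agg(id) OVER (ORDER BY id
                        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),
    '{}'::int[]
  ) AS previous_ids
FROM generate_series(1, 4) AS t(id)
ORDER BY id;
```

For id 3 this should give {1,2,3} in with_current_row and {1,2} in previous_ids.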
Try it like this, with a correlated subquery:
select i.id,
       array(select l.id from table1 l where l.id < i.id order by l.id) as c_arr
from table1 i;
How do I find the number of records in both tables using a join?
I have two tables, table1 and table2, with the same structure.
table1
id | item
1 | A
1 | B
1 | C
2 | A
2 | B
table2
id | item
1 | A
1 | B
2 | A
2 | B
2 | C
2 | D
The output should look like this:
id | table1.itemcount | table2.itemcount
1 | 3 | 2
2 | 2 | 4
SELECT DISTINCT id, (
SELECT COUNT(*) FROM table1 AS table1_2 WHERE table1_2.id=table1.id
) AS "table1.itemcount", (
SELECT COUNT(*) FROM table2 AS table2_2 WHERE table2_2.id=table1.id
) AS "table2.itemcount"
FROM table1;
Assuming that each id is guaranteed to exist in both tables, the following would work (t1 and t2 stand for table1 and table2):
select
t1.id,
count(distinct t1.item) t1count,
count(distinct t2.item) t2count
from t1
join t2 on t1.id = t2.id
group by 1;
But if that is not guaranteed, then we'll have to use a full outer join to get the unique ids from both tables:
select
coalesce(t1.id, t2.id) id,
count(distinct t1.item) t1count,
count(distinct t2.item) t2count
from t1
full outer join t2 on t1.id = t2.id
group by 1;
We also wrap id in coalesce here because, for an id that exists only in t2, t1.id would be null.
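Joining first and counting distinct items afterwards relies on items being unique within each id. As a sketch of an alternative, the query below counts each table separately and then joins the per-id counts. The sample rows from the question are inlined as CTEs, and the names c1 and c2 are made up here:

```sql
WITH table1(id, item) AS (
  VALUES (1, 'A'), (1, 'B'), (1, 'C'), (2, 'A'), (2, 'B')
),
table2(id, item) AS (
  VALUES (1, 'A'), (1, 'B'), (2, 'A'), (2, 'B'), (2, 'C'), (2, 'D')
),
-- one row per id with its row count, per table
c1 AS (SELECT id, count(*) AS itemcount FROM table1 GROUP BY id),
c2 AS (SELECT id, count(*) AS itemcount FROM table2 GROUP BY id)
SELECT
  COALESCE(c1.id, c2.id)    AS id,
  COALESCE(c1.itemcount, 0) AS table1_itemcount,  -- 0 when the id is missing from table1
  COALESCE(c2.itemcount, 0) AS table2_itemcount   -- 0 when the id is missing from table2
FROM c1
FULL OUTER JOIN c2 ON c1.id = c2.id
ORDER BY id;
```

With the sample data this returns 3 and 2 for id 1, and 2 and 4 for id 2. Because each table is aggregated before the join, duplicate items within a table are counted rather than collapsed.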
@DeeStark's answer also works if ids are guaranteed to be in both tables, but it's quite inefficient because count is essentially run twice for every distinct id in the table. Here's the fiddle where you can test out the different approaches. I've prefixed each query with explain, which shows the cost.
Hope this helps
I have a Car table. A Car has is_sold and is_shipped columns and belongs to a dealership through dealership_id (FK).
I want to run a query that tells me the count of sold cars and the count of shipped cars for a given dealership all in one result.
sold_count | shipped_count
10 | 4
The single queries I have look like this:
select count(*) as sold_count
from car
where dealership_id=25 and is_sold=true;
and
select count(*) as shipped_count
from car
where dealership_id=25 and is_shipped=true;
How do I combine the two to get both counts in one result?
This will do:
select dealership_id,
       sum(case when is_sold is true then 1 else 0 end) as sold_count,
       sum(case when is_shipped is true then 1 else 0 end) as shipped_count
from car
group by dealership_id;
You can use the FILTER clause of an aggregate function:
select dealership_id
     , count(*) filter (where is_sold) as cars_sold
     , count(*) filter (where is_shipped) as cars_shipped
from car
where dealership_id = 25
group by dealership_id;
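To try the FILTER approach without creating a table, here is a small self-contained sketch. The rows in the car CTE are invented for illustration; only the column names come from the question:

```sql
-- Hypothetical sample rows; column names follow the question.
WITH car(dealership_id, is_sold, is_shipped) AS (
  VALUES (25, true,  true),
         (25, true,  false),
         (25, false, false),
         (30, true,  true)
)
SELECT
  count(*) FILTER (WHERE is_sold)    AS sold_count,    -- rows of dealership 25 with is_sold
  count(*) FILTER (WHERE is_shipped) AS shipped_count  -- rows of dealership 25 with is_shipped
FROM car
WHERE dealership_id = 25;
```

With these rows the result is sold_count = 2 and shipped_count = 1. Leaving out GROUP BY is a deliberate choice in this sketch: an aggregate query without GROUP BY always returns exactly one row, so a dealership with no cars yields 0 and 0 instead of an empty result.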
You can also use a cross join.
select 'hello' as col1, 'world' as col2;
returns:
col1 | col2
-------+-------
hello | world
(1 row)
similarly,
with a as
(
select count(*) as a1 from emp where empid> 5),
b as (
select count(*) as a2 from emp where salary > 6000)
select * from a, b;
Or you can even apply it to different tables, like:
with a as
(select count(*) as a1 from emp where empid> 5),
b as
(select count(*) as a2 from ab )
select * from a, b;
Applied to the car table from the question:
with a as
(
select count(*) as sold_count
from car
where dealership_id=25 and is_sold=true
),
b as
(
select count(*) as shipped_count
from car
where dealership_id=25 and is_shipped=true
)
select * from a, b;
Further reading: https://www.postgresql.org/docs/current/queries-table-expressions.html
https://stackoverflow.com/a/26369295/15603477
Input data
I have the following association table:
AssociationTable
- Item ID: Integer
- Tag ID: Integer
Referring to the following example data
Item Tag
1 1
1 2
1 3
2 1
and some input list of tags T (e.g. [1, 2])
What I want
For each item, I would like to know which tags were not provided in the input list T.
With our sample data, we'd get:
Item Num missing
1 1
2 0
My thoughts
The best I've done so far is:
select "ItemId", count("TagId") as "Num missing" from "AssociationTab" where "TagId" not in (1) group by "ItemId";
The problem here is that items where all tags match will not be included in the output.
You could use a calendar table with an anti-join approach:
WITH cte AS (
SELECT t1.Item, t2.Tag
FROM (SELECT DISTINCT Item FROM AssociationTable) t1
CROSS JOIN (SELECT 1 AS Tag UNION ALL SELECT 2) t2
)
SELECT
t1.Item,
COUNT(*) FILTER (WHERE t2.Item IS NULL) AS num_missing
FROM cte t1
LEFT JOIN AssociationTable t2
ON t1.Item = t2.Item AND
t1.Tag = t2.Tag AND
t2.Tag IN (1, 2)
GROUP BY
t1.Item;
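The strategy here is to build a calendar/reference table in the first CTE which contains all combinations of items and the tags from the input list T. Then, we left join this CTE to your association table, aggregate by item, and count how many of the input tags have no match for each item. Note that this counts the tags from T that an item lacks (0 for item 1 and 1 for item 2 with the sample data), not the item's own tags that are absent from T.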
The simplest solution is:
SELECT
  ItemId,
  count(*) FILTER (WHERE TagId NOT IN (1,2)) AS num_missing
FROM AssociationTab
GROUP BY ItemId;
Alternatively, if you already have an Items table with the item list, you could do this:
SELECT
i.ItemId,
count(a.TagId)
FROM Items i
LEFT JOIN AssociationTab a ON a.ItemId = i.ItemId AND a.TagId NOT IN (1,2)
GROUP BY i.ItemId
The key is that LEFT JOIN does not remove the Items row if no tags match.
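A quick way to check the FILTER variant against the sample data from the question is to inline the rows as a CTE. This is only a sketch: the input list T is written as an array literal here (an assumption, so that it could later be passed as a single parameter), and num_missing is a made-up alias:

```sql
WITH AssociationTab(ItemId, TagId) AS (
  VALUES (1, 1), (1, 2), (1, 3), (2, 1)
)
SELECT
  ItemId,
  -- count the item's tags that are not in the input list T = [1, 2]
  count(*) FILTER (WHERE TagId <> ALL (ARRAY[1, 2])) AS num_missing
FROM AssociationTab
GROUP BY ItemId
ORDER BY ItemId;
```

This returns 1 for item 1 (tag 3 is not in T) and 0 for item 2. Because the condition sits in FILTER rather than in WHERE, item 2 keeps its row even though none of its tags qualify.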
Imagine a table that looks like this:
The SQL to get this data was just SELECT *
The first column is "row_id", the second is "id" (the order ID), and the third is "total" (the revenue).
I'm not sure why there are duplicate rows in the database, but SUM(total) includes every duplicate entry even though the order ID is the same. That makes my numbers larger than if I select distinct(id), total, export to Excel, and then sum the values manually.
So my question is: how can I SUM over just the distinct order IDs, so that I get the same revenue as if I had exported every distinct order ID row to Excel?
Thanks in advance!
Easy - just divide by the count:
select id, sum(total) / count(id)
from orders
group by id
This also handles any level of duplication (e.g. triplicates), as long as every duplicate of an id carries the same total.
You can try something like this (with your example):
Table
create table test (
row_id int,
id int,
total decimal(15,2)
);
insert into test values
(6395, 1509, 112), (22986, 1509, 112),
(1393, 3284, 40.37), (24360, 3284, 40.37);
Query
with distinct_records as (
select distinct id, total from test
)
select a.id, b.actual_total, array_agg(a.row_id) as row_ids
from test a
inner join (select id, sum(total) as actual_total from distinct_records group by id) b
on a.id = b.id
group by a.id, b.actual_total
Result
| id | actual_total | row_ids |
|------|--------------|------------|
| 1509 | 112 | 6395,22986 |
| 3284 | 40.37 | 1393,24360 |
Explanation
We do not know the reason why orders and totals appear more than once with different row_ids. So, using a common table expression (CTE, the with ... clause), we get the distinct id and total pairs.
Below the CTE, we use this distinct data to compute the totals. We join the original table to the aggregation over the distinct values on id, and then collect the row_ids of each order with array_agg so that the information looks cleaner.
SQLFiddle example
http://sqlfiddle.com/#!15/72639/3
Create custom aggregate:
CREATE OR REPLACE FUNCTION sum_func (
double precision, pg_catalog.anyelement, double precision
)
RETURNS double precision AS
$body$
SELECT case when $3 is not null then COALESCE($1, 0) + $3 else $1 end
$body$
LANGUAGE 'sql';
CREATE AGGREGATE dist_sum (
pg_catalog."any",
double precision)
(
SFUNC = sum_func,
STYPE = float8
);
And then calculate the distinct sum like this:
select dist_sum(distinct id, total)
from orders
You can use DISTINCT in your aggregate functions:
SELECT id, SUM(DISTINCT total) FROM orders GROUP BY id
Documentation here: https://www.postgresql.org/docs/9.6/static/sql-expressions.html#SYNTAX-AGGREGATES
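One thing to keep in mind is that SUM(DISTINCT total) removes duplicate totals, not duplicate orders. Grouped by id this makes no difference, but for a grand total two different orders with the same amount would collapse into one. Here is a small sketch with invented rows (orders 100 and 200 both have a total of 50, and order 100 is duplicated):

```sql
-- Hypothetical rows: two distinct orders that happen to share the same total.
WITH orders(row_id, id, total) AS (
  VALUES (1, 100, 50), (2, 100, 50), (3, 200, 50)
)
SELECT
  -- distinct amounts only: the two orders collapse into a single 50
  sum(DISTINCT total) AS sum_distinct_total,
  -- distinct (id, total) pairs first, then the sum: 50 + 50
  (SELECT sum(total)
   FROM (SELECT DISTINCT id, total FROM orders) AS d) AS per_order_total
FROM orders;
```

Here sum_distinct_total is 50 while per_order_total is 100, so for an overall revenue figure it seems safer to deduplicate on the order id before summing.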
If we can trust that the total for one order is fully contained in a single row, we can eliminate the duplicates in a sub-query by selecting the MAX of the PK id column. An example:
CREATE TABLE test2 (id int, order_id int, total int);
insert into test2 values (1,1,50);
insert into test2 values (2,1,50);
insert into test2 values (5,1,50);
insert into test2 values (3,2,100);
insert into test2 values (4,2,100);
select order_id, sum(total)
from test2 t
join (
select max(id) as id
from test2
group by order_id) as sq
on t.id = sq.id
group by order_id
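In difficult cases, for example when a join has multiplied the rows, you can deduplicate by row_id, because jsonb_object_agg keeps a single value per key. The cast is to numeric so that fractional totals such as 40.37 do not fail:
select
  id,
  (
    SELECT SUM(value::numeric)
    FROM jsonb_each_text(jsonb_object_agg(row_id, total))
  ) as total
from orders
group by id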
I would suggest just using a subquery:
SELECT "a"."id", SUM("a"."total")
FROM (SELECT DISTINCT ON ("id") * FROM "Database"."Schema"."Table") AS "a"
GROUP BY "a"."id"
The above will give you the total for each id.
Use the query below if you want the overall total with the duplicates removed:
SELECT SUM("a"."total")
FROM (SELECT DISTINCT ON ("id") * FROM "Database"."Schema"."Table") AS "a"
Using a subselect (http://sqlfiddle.com/#!7/cef1c/51); the alias on the subquery is required by PostgreSQL versions before 16:
select sum(total) from (
  select distinct id, total
  from orders
) as distinct_records;
Using a CTE (http://sqlfiddle.com/#!7/cef1c/53):
with distinct_records as (
select distinct id, total from orders
)
select sum(total) from distinct_records;
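To see the effect of deduplicating before summing, the following sketch inlines the four sample rows from the earlier test table as a CTE named orders (the table name used in the question's answers) and puts the naive sum next to the deduplicated one. The column aliases are made up for illustration:

```sql
WITH orders(row_id, id, total) AS (
  VALUES (6395,  1509, 112.00),
         (22986, 1509, 112.00),
         (1393,  3284, 40.37),
         (24360, 3284, 40.37)
)
SELECT
  -- every row counted, duplicates included
  (SELECT sum(total) FROM orders) AS naive_total,
  -- one row per (id, total) pair, then the sum
  (SELECT sum(total)
   FROM (SELECT DISTINCT id, total FROM orders) AS d) AS deduplicated_total;
```

With these rows naive_total is 304.74 and deduplicated_total is 152.37, which matches summing the distinct order rows by hand.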
I have a query
SELECT id_anything FROM table1 JOIN table2 USING (id_tables)
Now I have the following situation:
If the join returns two rows from table2, I want to show the id_anything from table1 (one row only),
and if the join returns one row from table2, I want to show the id_anything from table2.
PS: id_anything in table2 holds different values than in table1.
Example data:
table1
id_tables | id_anything
1 | 1
table2
id_tables | id_anything
1 | 10
1 | 100
Expected result: 1
First, bring the values you may want to return, and the basis for deciding between them, together into one row:
SELECT table1.id_tables, table1.id_anything AS table1_id, MIN(table2.id_anything) AS table2_id, COUNT(*)
FROM table1 JOIN table2 USING (id_tables)
GROUP BY table1.id_tables, table1.id_anything
The aggregate function you use doesn't really matter since you'll only be using the value if there is only one.
You can then pick the relevant value:
WITH join_summary AS (
SELECT table1.id_tables, table1.id_anything AS table1_id, MIN(table2.id_anything) AS table2_id, COUNT(*) AS match_count
FROM table1 JOIN table2 USING (id_tables)
GROUP BY table1.id_tables, table1.id_anything
)
SELECT id_tables, CASE WHEN (match_count > 1) THEN table1_id ELSE table2_id END AS id_anything
FROM join_summary
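The query above can be tried on inlined data. In this sketch the rows for id_tables = 1 come from the question, while the rows for id_tables = 2 are invented to exercise the single-match branch:

```sql
WITH table1(id_tables, id_anything) AS (
  VALUES (1, 1),
         (2, 2)            -- hypothetical extra row
),
table2(id_tables, id_anything) AS (
  VALUES (1, 10), (1, 100),
         (2, 20)           -- hypothetical: a single match for id_tables = 2
),
join_summary AS (
  SELECT id_tables,
         t1.id_anything      AS table1_id,
         MIN(t2.id_anything) AS table2_id,   -- only used when there is exactly one match
         COUNT(*)            AS match_count
  FROM table1 t1
  JOIN table2 t2 USING (id_tables)
  GROUP BY id_tables, t1.id_anything
)
SELECT id_tables,
       CASE WHEN match_count > 1 THEN table1_id ELSE table2_id END AS id_anything
FROM join_summary
ORDER BY id_tables;
```

This returns id_anything = 1 for id_tables = 1 (two matches, so the table1 value wins) and id_anything = 20 for id_tables = 2 (one match, so the table2 value is used).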