I am trying to produce an aggregated output table using aggregates from two different tables. I am unclear on how to join the two outcomes.
The two tables are shown below: the first holds the daily price of each product, the second lists the products available in each store.
| product_id | daily_price | date |
|------------|-------------|------------|
| 1 | 1.25$ | 01-01-2000 |
| 1 | ... | ... |
| 1 | 1$ | 31-12-2000 |
| 2 | 4.5$ | 01-01-2000 |
| 2 | ... | ... |
| 2 | 4.25$ | 31-12-2000 |
| store_id | product_id |
|----------|------------|
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 2 | 3 |
| 3 | 2 |
The first aggregation gets the average daily price (it varies) of each product:
SELECT product_id, ROUND(AVG(daily_price), 2) AS average_price FROM product_dailyprices
GROUP BY product_id;
| product_id | average_price |
|------------|---------------|
| 1 | 50 |
| 2 | 100 |
| 3 | 250 |
The second query gets the number of different products available in each store:
SELECT store_id, COUNT(product_id) AS product_count FROM products
GROUP BY store_id;
| store_id | product_count |
|----------|---------------|
| 1 | 200 |
| 2 | 250 |
| 3 | 225 |
I am a bit lost on how to perform a query to produce the following:
| store_id | product_count | average_price_at_store |
|----------|---------------|------------------------|
| 1 | 34 | 6.51$ |
| 2 | 45 | 3.23$ |
| 3 | 36 | 5.37$ |
Thanks for the help!
Since you did not provide the SQL for the tables, let's use the following bare-bones structure:
CREATE TABLE products
(
id SERIAL NOT NULL,
name text NOT NULL,
CONSTRAINT products_pk PRIMARY KEY (id)
);
CREATE TABLE stores
(
id SERIAL NOT NULL,
name text NOT NULL,
CONSTRAINT stores_pk PRIMARY KEY (id)
);
CREATE TABLE daily_prices
(
product_id INTEGER NOT NULL,
daily_price DOUBLE PRECISION NOT NULL,
date timestamptz,
CONSTRAINT daily_prices_product FOREIGN KEY (product_id) REFERENCES products (id)
);
CREATE TABLE locations
(
store_id INTEGER NOT NULL,
product_id INTEGER NOT NULL,
CONSTRAINT products_product_fk FOREIGN KEY (product_id) REFERENCES products (id),
CONSTRAINT products_store_fk FOREIGN KEY (store_id) REFERENCES stores (id)
);
And let's enter some sample data to help us verify that the query works:
INSERT INTO products(name)
VALUES ('product 1');
INSERT INTO products(name)
VALUES ('product 2');
INSERT INTO products(name)
VALUES ('product 3');
INSERT INTO stores(name)
VALUES ('store 1');
INSERT INTO stores(name)
VALUES ('store 2');
insert into locations (store_id, product_id)
values (1, 1),
(1, 2),
(2, 2),
(2, 3);
INSERT INTO daily_prices(product_id, daily_price, date)
VALUES (1, 2.0, '01-01-2020');
INSERT INTO daily_prices(product_id, daily_price, date)
VALUES (1, 4.0, '02-01-2020');
INSERT INTO daily_prices(product_id, daily_price, date)
VALUES (2, 3.0, '01-01-2020');
INSERT INTO daily_prices(product_id, daily_price, date)
VALUES (2, 5.0, '02-01-2020');
INSERT INTO daily_prices(product_id, daily_price, date)
VALUES (3, 10.0, '01-01-2020');
INSERT INTO daily_prices(product_id, daily_price, date)
VALUES (3, 20.0, '02-01-2020');
Then the query to produce your desired table would look like:
select l.store_id as store_id,
count(distinct l.product_id) as number_of_products,
avg(dp.daily_price) as average_price
from locations l
join daily_prices dp on dp.product_id = l.product_id
group by l.store_id;
And we can manually verify that it calculated the expected result:
+----------+--------------------+---------------+
| store_id | number_of_products | average_price |
+----------+--------------------+---------------+
| 1        | 2                  | 3.5           |
| 2        | 2                  | 9.5           |
+----------+--------------------+---------------+
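To double-check the numbers above, here is a minimal sketch that runs the same query against an in-memory SQLite database through Python's `sqlite3` module. SQLite is only a stand-in for PostgreSQL here, and the trimmed-down schema is an assumption made for brevity. The second query is a variant that is not part of the original answer: it averages per product first and then per store, which gives the same numbers on this sample only because every product has the same number of price rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified versions of the answer's tables (assumption: no FKs needed here)
    CREATE TABLE locations (store_id INTEGER NOT NULL, product_id INTEGER NOT NULL);
    CREATE TABLE daily_prices (product_id INTEGER NOT NULL,
                               daily_price REAL NOT NULL,
                               date TEXT);
    INSERT INTO locations VALUES (1, 1), (1, 2), (2, 2), (2, 3);
    INSERT INTO daily_prices VALUES
        (1, 2.0, '01-01-2020'), (1, 4.0, '02-01-2020'),
        (2, 3.0, '01-01-2020'), (2, 5.0, '02-01-2020'),
        (3, 10.0, '01-01-2020'), (3, 20.0, '02-01-2020');
""")

# Same query as in the answer: the average is taken over every daily price row.
per_row = conn.execute("""
    SELECT l.store_id,
           COUNT(DISTINCT l.product_id) AS number_of_products,
           AVG(dp.daily_price)          AS average_price
    FROM locations l
    JOIN daily_prices dp ON dp.product_id = l.product_id
    GROUP BY l.store_id
    ORDER BY l.store_id
""").fetchall()

# Variant (assumption, not from the answer): average each product first,
# then average those per-product averages within each store.
per_product = conn.execute("""
    SELECT l.store_id,
           COUNT(*)             AS number_of_products,
           AVG(p.average_price) AS average_price
    FROM locations l
    JOIN (SELECT product_id, AVG(daily_price) AS average_price
          FROM daily_prices
          GROUP BY product_id) p ON p.product_id = l.product_id
    GROUP BY l.store_id
    ORDER BY l.store_id
""").fetchall()

print(per_row)
print(per_product)
```

If products can have different numbers of price rows, the two queries will diverge, so pick the one that matches what "average price at store" is supposed to mean.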
Data table:
| WINNER | FOOT CLUB   |
| ------ | ----------- |
| 1      | Beşiktaş    |
| 2      | Beşiktaş    |
| 3      | Galatasaray |
| 4      | Galatasaray |
| 5      | Beşiktaş    |
| 6      | Istanbul    |
| 7      | Istanbul    |
| 8      | Istanbul    |
| 9      | Galatasaray |
| 10     | Galatasaray |
| 11     | Fenerbahçe  |
| 12     | Fenerbahçe  |
| 13     | Fenerbahçe  |
| 14     | Istanbul    |
Help, please. I need to count the length of each run of consecutive identical values, keeping the runs in their original order. Any SQL dialect is fine. I need this result:
Beşiktaş 2
Galatasaray 2
Beşiktaş 1
Istanbul 3
Galatasaray 2
Fenerbahçe 3
Istanbul 1
CREATE TABLE football (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL
);
INSERT INTO football VALUES (1, 'Beşiktaş');
INSERT INTO football VALUES (2, 'Beşiktaş');
INSERT INTO football VALUES (3, 'Galatasaray');
INSERT INTO football VALUES (4, 'Galatasaray');
INSERT INTO football VALUES (5, 'Beşiktaş');
INSERT INTO football VALUES (6, 'Istanbul');
INSERT INTO football VALUES (7, 'Istanbul');
INSERT INTO football VALUES (8, 'Istanbul');
INSERT INTO football VALUES (9, 'Galatasaray');
INSERT INTO football VALUES (10, 'Galatasaray');
INSERT INTO football VALUES (11, 'Fenerbahçe');
INSERT INTO football VALUES (12, 'Fenerbahçe');
INSERT INTO football VALUES (13, 'Fenerbahçe');
INSERT INTO football VALUES (14, 'Istanbul');
SELECT name,
RANK() OVER()
FROM football
I tried the query above, but it returned 1 for every row:
Beşiktaş|1
Beşiktaş|1
Galatasaray|1
Galatasaray|1
Beşiktaş|1
Istanbul|1
Istanbul|1
Istanbul|1
Galatasaray|1
Galatasaray|1
Fenerbahçe|1
Fenerbahçe|1
Fenerbahçe|1
Istanbul|1
The query below was adapted from this solution.
Dbfiddle for your solution if desired
select name, count(*) as cnt
from (select t.*,
(row_number() over (order by id) - row_number() over (partition by name order by id)
) as grp
from football t
) t
group by name, grp
order by min(id) asc
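To see why the difference of the two row numbers identifies each run, here is a small sketch that runs the query on the sample data using Python's `sqlite3` module (SQLite 3.25 or later is assumed for window function support; it only stands in for whatever database you use). It also prints the intermediate `grp` value for every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE football (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
names = ["Beşiktaş", "Beşiktaş", "Galatasaray", "Galatasaray", "Beşiktaş",
         "Istanbul", "Istanbul", "Istanbul", "Galatasaray", "Galatasaray",
         "Fenerbahçe", "Fenerbahçe", "Fenerbahçe", "Istanbul"]
conn.executemany("INSERT INTO football VALUES (?, ?)",
                 list(enumerate(names, start=1)))

# Inner query only: within one run of the same name both row numbers grow
# by 1 per row, so their difference stays constant for the whole run.
groups = conn.execute("""
    SELECT id, name,
           ROW_NUMBER() OVER (ORDER BY id)
         - ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS grp
    FROM football
    ORDER BY id
""").fetchall()

# Full query: one output row per (name, grp) pair, in order of first appearance.
runs = conn.execute("""
    SELECT name, COUNT(*) AS cnt
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (ORDER BY id)
               - ROW_NUMBER() OVER (PARTITION BY name ORDER BY id) AS grp
          FROM football t) t
    GROUP BY name, grp
    ORDER BY MIN(id)
""").fetchall()

for row in groups:
    print(row)
print(runs)
```

Note that `grp` alone is not unique: rows 3, 4 (Galatasaray) and row 5 (Beşiktaş) all get `grp = 2`, which is why the outer query groups by `name, grp` rather than by `grp` only.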
I need to turn a 1:n relationship into a 1:1 relationship with the data remaining the same.
I want to know whether it is possible to achieve this with a single pure SQL statement (no plpgsql, no external language).
More details, an MWE, and some extra context follow.
To illustrate, if I have
+------+--------+ +------+----------+--------+
| id | name | | id | foo_id | name |
|------+--------| |------+----------+--------|
| 1 | foo1 | | 1 | 1 | baz1 |
| 2 | foo2 | | 2 | 1 | baz2 |
| 3 | foo3 | | 3 | 2 | baz3 |
+------+--------+ | 4 | 2 | baz4 |
| 5 | 3 | baz5 |
+------+----------+--------+
I want to get to
+------+--------+ +------+----------+--------+
| id | name | | id | foo_id | name |
|------+--------| |------+----------+--------|
| 4 | foo1 | | 1 | 4 | baz1 |
| 5 | foo1 | | 2 | 5 | baz2 |
| 6 | foo2 | | 3 | 6 | baz3 |
| 7 | foo2 | | 4 | 7 | baz4 |
| 8 | foo3 | | 5 | 8 | baz5 |
+------+--------+ +------+----------+--------+
Here is some code to set up the tables if needed:
drop table if exists baz;
drop table if exists foo;
create table foo(
id serial primary key,
name varchar
);
insert into foo (name) values
('foo1'),
('foo2'),
('foo3');
create table baz(
id serial primary key,
foo_id integer references foo (id),
name varchar
);
insert into baz (foo_id, name) values
(1, 'baz1'),
(1, 'baz2'),
(2, 'baz3'),
(2, 'baz4'),
(3, 'baz5');
I managed to work out the following query, which updates only one entry (i.e., the
pair <baz id, foo id> has to be provided):
with
existing_foo_values as (
select name from foo where id = 1
),
new_id as (
insert into foo(name)
select name from existing_foo_values
returning id
)
update baz
set foo_id = (select id from new_id)
where id = 1;
The real-world case (a db migration in a nodejs environment) was solved using
something similar to
const existingPairs = await runQuery(`
select id, foo_id from baz
`);
await Promise.all(existingPairs.map(({
id, foo_id
}) => runQuery(`
with
existing_foo_values as (
select name from foo where id = ${foo_id}
),
new_id as (
insert into foo(name)
select name from existing_foo_values
returning id
)
update baz
set foo_id = (select id from new_id)
where id = ${id};
`)));
// Then delete all the orphan entries from `foo`
Here's a solution that works by first putting together what we want foo to look like (using values from the sequence), and then making the necessary changes to the two tables based on that.
WITH new_ids AS (
SELECT nextval('foo_id_seq') as foo_id, baz.id as baz_id, foo.name as foo_name
FROM foo
JOIN baz ON (foo.id = baz.foo_id)
),
inserts AS (
INSERT INTO foo (id, name)
SELECT foo_id, foo_name
FROM new_ids
),
updates AS (
UPDATE baz
SET foo_id = new_ids.foo_id
FROM new_ids
WHERE new_ids.baz_id = baz.id
)
DELETE FROM foo
WHERE id < (SELECT min(foo_id) FROM new_ids);
I have a couple of tables in a Postgres database. I have joined and merged the tables. However, I would like rows that share a value in a specific column to appear together in the final table (in the end, I would like to perform a GROUP BY and a maximum value calculation on the table).
The schema of the test tables looks like this:
Schema (PostgreSQL v11)
CREATE TABLE table1 (
id CHARACTER VARYING NOT NULL,
seq CHARACTER VARYING NOT NULL
);
INSERT INTO table1 (id, seq) VALUES
('UA502', 'abcdef'), ('UA503', 'ghijk'),('UA504', 'lmnop')
;
CREATE TABLE table2 (
id CHARACTER VARYING NOT NULL,
score FLOAT
);
INSERT INTO table2 (id, score) VALUES
('UA502', 2.2), ('UA503', 2.6),('UA504', 2.8)
;
CREATE TABLE table3 (
id CHARACTER VARYING NOT NULL,
seq CHARACTER VARYING NOT NULL
);
INSERT INTO table3 (id, seq) VALUES
('UA502', 'qrst'), ('UA503', 'uvwx'),('UA504', 'yzab')
;
CREATE TABLE table4 (
id CHARACTER VARYING NOT NULL,
score FLOAT
);
INSERT INTO table4 (id, score) VALUES
('UA502', 8.2), ('UA503', 8.6),('UA504', 8.8)
;
I performed join and union operations on the tables to get the desired columns.
Query #1
SELECT table1.id, table1.seq, table2.score
FROM table1 INNER JOIN table2 ON table1.id = table2.id
UNION
SELECT table3.id, table3.seq, table4.score
FROM table3 INNER JOIN table4 ON table3.id = table4.id
;
The output looks like this:
| id | seq | score |
| ----- | ------ | ----- |
| UA502 | qrst | 8.2 |
| UA502 | abcdef | 2.2 |
| UA504 | yzab | 8.8 |
| UA503 | uvwx | 8.6 |
| UA504 | lmnop | 2.8 |
| UA503 | ghijk | 2.6 |
However, the desired output should be:
| id | seq | score |
| ----- | ------ | ----- |
| UA502 | qrst | 8.2 |
| UA502 | abcdef | 2.2 |
| UA504 | yzab | 8.8 |
| UA504 | lmnop | 2.8 |
| UA503 | uvwx | 8.6 |
| UA503 | ghijk | 2.6 |
View on DB Fiddle
How should I modify my query to get the desired output?
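One possible direction, sketched here under assumptions: a `UNION` gives no ordering guarantee, so an explicit `ORDER BY` on the combined result is what keeps equal ids together. The sketch below runs the queries through Python's `sqlite3` module as a stand-in for PostgreSQL; the column aliases and the `merged` subquery name are my additions. It sorts ids in ascending order (so UA503 comes before UA504, unlike the desired output shown above), and also shows the GROUP BY / maximum step mentioned as the end goal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id TEXT NOT NULL, seq TEXT NOT NULL);
    CREATE TABLE table2 (id TEXT NOT NULL, score REAL);
    CREATE TABLE table3 (id TEXT NOT NULL, seq TEXT NOT NULL);
    CREATE TABLE table4 (id TEXT NOT NULL, score REAL);
    INSERT INTO table1 VALUES ('UA502', 'abcdef'), ('UA503', 'ghijk'), ('UA504', 'lmnop');
    INSERT INTO table2 VALUES ('UA502', 2.2), ('UA503', 2.6), ('UA504', 2.8);
    INSERT INTO table3 VALUES ('UA502', 'qrst'), ('UA503', 'uvwx'), ('UA504', 'yzab');
    INSERT INTO table4 VALUES ('UA502', 8.2), ('UA503', 8.6), ('UA504', 8.8);
""")

# The original join + union, with explicit aliases so ORDER BY can name columns.
UNION_SQL = """
    SELECT table1.id AS id, table1.seq AS seq, table2.score AS score
    FROM table1 INNER JOIN table2 ON table1.id = table2.id
    UNION
    SELECT table3.id, table3.seq, table4.score
    FROM table3 INNER JOIN table4 ON table3.id = table4.id
"""

# Sort the combined result: equal ids end up adjacent, highest score first.
ordered = conn.execute(UNION_SQL + " ORDER BY id, score DESC").fetchall()

# The stated end goal: the maximum score per id over the combined rows.
max_per_id = conn.execute(
    f"SELECT id, MAX(score) FROM ({UNION_SQL}) AS merged GROUP BY id ORDER BY id"
).fetchall()

print(ordered)
print(max_per_id)
```

If only the maximum per id is needed, the ordering step is not required at all, since `GROUP BY` does not depend on the row order of its input.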
I have a query that I'm attempting to optimize, and I am running into some unexpected behavior.
For example, take a table like this:
CREATE TABLE objects
(id int, external_id int, timestamp timestamp);
INSERT INTO objects
(id, external_id, timestamp)
VALUES
(1, 1, '2019-08-16 12:00:00'),
(2, 1, NULL),
(3, 2, '2019-08-16 12:00:00'),
(4, 2, NULL);
I use a query like this to partition the objects by their external_id and then select the maximum value for timestamp across the partition:
SELECT *,
max(timestamp) OVER (PARTITION BY external_id) as max_timestamp,
row_number() OVER (PARTITION BY external_id ORDER BY timestamp ASC NULLS FIRST, id ASC)
FROM objects;
This produces the following (which is what I'm looking for):
[Results]:
| id | external_id | timestamp | max_timestamp | row_number |
|----|-------------|----------------------|----------------------|------------|
| 2 | 1 | (null) | 2019-08-16T12:00:00Z | 1 |
| 1 | 1 | 2019-08-16T12:00:00Z | 2019-08-16T12:00:00Z | 2 |
| 4 | 2 | (null) | 2019-08-16T12:00:00Z | 1 |
| 3 | 2 | 2019-08-16T12:00:00Z | 2019-08-16T12:00:00Z | 2 |
I want to remove the multiple window functions in favor of a single one. For example:
SELECT *,
max(timestamp) OVER w as max_timestamp,
row_number() OVER w
FROM objects
WINDOW w AS (PARTITION BY external_id ORDER BY timestamp ASC NULLS FIRST, id ASC);
However, doing this produces a different result with the max_timestamp set to NULL for half the results:
[Results]:
| id | external_id | timestamp | max_timestamp | row_number |
|----|-------------|----------------------|----------------------|------------|
| 2 | 1 | (null) | (null) | 1 |
| 1 | 1 | 2019-08-16T12:00:00Z | 2019-08-16T12:00:00Z | 2 |
| 4 | 2 | (null) | (null) | 1 |
| 3 | 2 | 2019-08-16T12:00:00Z | 2019-08-16T12:00:00Z | 2 |
Why would introducing a sort order to the window function have any effect on the return of max?
http://sqlfiddle.com/#!17/1fcc4/4
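A likely explanation is the window frame: when a window definition has an ORDER BY and no explicit frame, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so `max` only sees rows up to the current one (and its peers). The sketch below reproduces this with Python's `sqlite3` module, which follows the same default; SQLite 3.25+ is assumed, and `NULLS FIRST` is left out because SQLite already sorts NULLs first in ascending order. The `{frame}` placeholder and the `run` helper are just scaffolding for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (id INT, external_id INT, timestamp TEXT);
    INSERT INTO objects (id, external_id, timestamp) VALUES
        (1, 1, '2019-08-16 12:00:00'),
        (2, 1, NULL),
        (3, 2, '2019-08-16 12:00:00'),
        (4, 2, NULL);
""")

QUERY = """
    SELECT id,
           max(timestamp) OVER w AS max_timestamp,
           row_number()   OVER w AS rn
    FROM objects
    WINDOW w AS (PARTITION BY external_id ORDER BY timestamp ASC, id ASC {frame})
    ORDER BY external_id, rn
"""

def run(frame):
    """Run the shared-window query with the given frame clause (may be empty)."""
    return conn.execute(QUERY.format(frame=frame)).fetchall()

# Default frame: the first row of each partition only sees itself (a NULL).
default_frame = run("")

# Explicit frame covering the whole partition: max sees every row.
full_frame = run("ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING")

print(default_frame)
print(full_frame)
```

`row_number()` ignores the frame, so widening it to the whole partition changes only the `max` column.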
I'm asking this question with reference to the study material available at How to convert columns to rows and rows to columns. My query is similar to the one explained in the UNPIVOTING section. Here is my setup.
Table definition
CREATE TABLE MYTABLE (
ID INTEGER,
CODE_1 VARCHAR,
CODE_2 VARCHAR,
CODE_3 VARCHAR,
CODE_1_DT DATE,
CODE_2_DT DATE,
CODE_3_DT DATE,
WHO COLUMNS
);
Table Data
ID | CODE_1 | CODE_2 | CODE_3 | CODE_1_DT | CODE_2_DT | CODE_3_DT | UPDATED_BY
1 | CD1 | CD2 | CD3 | 20100101 | 20160101 | 20170101 | USER1
2 | CD1 | CD2 | CD3 | 20100101 | 20160101 | 20170101 | USER2
3 | CD1 | CD2 | CD3 | 20100101 | 20160101 | 20170101 | USER3
My SQL to convert columns to rows
SELECT Q.CODE, Q.CODE_DT FROM MYTABLE AS MT,
TABLE VALUES(
(MT.CODE_1, MT.CODE_1_DT),
(MT.CODE_2, MT.CODE_2_DT),
(MT.CODE_3, MT.CODE_3_DT),
) AS Q(CODE, CODE_DT)
WHERE MT.ID=1;
Expected output is
CODE | CODE_DT
CD1 | 20100101
CD2 | 20160101
CD3 | 20170101
I'm not able to get the expected result; instead I get an error related to cardinality or a cardinality multiplier. I don't know what is going wrong or whether the SQL is correct. Any pointers?
Try this:
select id1, code, date
from mytable t,
lateral (values (t.id, t.code_1, t.code_1_dt),
(t.id, t.code_2, t.code_2_dt),
(t.id, t.code_3, t.code_3_dt)
) as q (id1, code, date)
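If `LATERAL` is not available in your database, a plain `UNION ALL` is a more portable way to get the same rows. This is a different technique from the query above, not a rewrite of it. The sketch below uses Python's `sqlite3` module as a test bed; the `pos` column, the text-typed dates, and the `updated_by` column (taken from the sample data header) are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (
        id INTEGER,
        code_1 TEXT, code_2 TEXT, code_3 TEXT,
        code_1_dt TEXT, code_2_dt TEXT, code_3_dt TEXT,
        updated_by TEXT
    );
    INSERT INTO mytable VALUES
        (1, 'CD1', 'CD2', 'CD3', '20100101', '20160101', '20170101', 'USER1'),
        (2, 'CD1', 'CD2', 'CD3', '20100101', '20160101', '20170101', 'USER2'),
        (3, 'CD1', 'CD2', 'CD3', '20100101', '20160101', '20170101', 'USER3');
""")

# One SELECT per (code, date) column pair; pos keeps the original column order.
rows = conn.execute("""
    SELECT code, code_dt
    FROM (
        SELECT id, 1 AS pos, code_1 AS code, code_1_dt AS code_dt FROM mytable
        UNION ALL
        SELECT id, 2, code_2, code_2_dt FROM mytable
        UNION ALL
        SELECT id, 3, code_3, code_3_dt FROM mytable
    ) AS q
    WHERE id = 1
    ORDER BY pos
""").fetchall()

for code, code_dt in rows:
    print(code, code_dt)
```

The trade-off is that the table is scanned once per column pair, which is usually fine for a handful of columns.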