PostgreSQL Group By not working as expected - wants too many inclusions - postgresql

I have a simple PostgreSQL table that I'm trying to query. Imagine a table like this...
| ID | Account_ID | Iteration |
|----|------------|-----------|
| 1 | 100 | 1 |
| 2 | 101 | 1 |
| 3 | 100 | 2 |
I need to get the ID column for each Account_ID where Iteration is at its maximum value. So, you'd think something like this would work
SELECT "ID", "Account_ID", MAX("Iteration")
FROM "Table_Name"
GROUP BY "Account_ID"
And I expect to get:
| ID | Account_ID | MAX(Iteration) |
|----|------------|----------------|
| 2 | 101 | 1 |
| 3 | 100 | 2 |
But when I do this, Postgres complains:
ERROR: column "ID" must appear in the GROUP BY clause or be used in an aggregate function
But when I add "ID" to the GROUP BY, it destroys the grouping altogether and gives me the whole table!
Is the best way to approach this using the following?
SELECT DISTINCT ON ("Account_ID") "ID", "Account_ID", "Iteration"
FROM "Table_Name"
ORDER BY "Account_ID" ASC, "Iteration" DESC;

The GROUP BY clause collapses all rows that share the same values in the grouped columns into a single row. Because that row no longer corresponds to any one original row, you can't select a column that is neither in the GROUP BY nor wrapped in an aggregate function. To get what you want, you will probably have to aggregate without the ID column, then join the result back to the original table. I don't know PostgreSQL syntax, but I assume it would be something like the following.
SELECT "Table_Name"."ID", aggregate."Account_ID", aggregate.MIteration
FROM (SELECT "Account_ID", MAX("Iteration") AS MIteration
FROM "Table_Name"
GROUP BY "Account_ID") aggregate
LEFT JOIN "Table_Name" ON aggregate."Account_ID" = "Table_Name"."Account_ID" AND
aggregate.MIteration = "Table_Name"."Iteration"
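For anyone who wants to try the aggregate-then-join idea without a PostgreSQL server, here is a minimal sketch that runs the same shape of query through Python's built-in sqlite3 module. SQLite is only a stand-in here (it has no DISTINCT ON), and the alias names agg, t and max_iteration are my own, not from the question.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "Table_Name" ("ID" INTEGER, "Account_ID" INTEGER, "Iteration" INTEGER)'
)
conn.executemany(
    'INSERT INTO "Table_Name" VALUES (?, ?, ?)',
    [(1, 100, 1), (2, 101, 1), (3, 100, 2)],  # sample rows from the question
)

# Step 1 (subquery): highest Iteration per Account_ID.
# Step 2 (join): look up the row(s) that carry that Iteration to recover "ID".
query = """
SELECT t."ID", agg."Account_ID", agg.max_iteration
FROM (SELECT "Account_ID", MAX("Iteration") AS max_iteration
      FROM "Table_Name"
      GROUP BY "Account_ID") AS agg
JOIN "Table_Name" AS t
  ON t."Account_ID" = agg."Account_ID"
 AND t."Iteration" = agg.max_iteration
ORDER BY t."ID"
"""
latest_rows = conn.execute(query).fetchall()
print(latest_rows)
```

One thing to keep in mind: if two rows of the same account tie on the maximum Iteration, the join returns both of them, whereas DISTINCT ON keeps exactly one row per account.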

Related

PostgreSQL How to merge two tables row to row without condition

I have two tables
The first table contains three text fields (username, email, num); the second has only one column, birth_date DATE, filled with random dates.
I need to merge tables without condition
For example
first table:
+----------+--------------+-----------+
| username | email | num |
+----------+--------------+-----------+
| 'user1' | 'user1#mail' | '+794949' |
| 'user2' | 'user2#mail' | '+799999' |
+----------+--------------+-----------+
second table:
+--------------+
| birth_date |
+--------------+
| '2001-01-01' |
| '2002-02-02' |
+--------------+
And I need result like
+----------+--------------+-----------+--------------+
| username | email | num | birth_date |
+----------+--------------+-----------+--------------+
| 'user1' | 'user1#mail' | '+794949' | '2001-01-01' |
| 'user2' | 'user2#mail' | '+799999' | '2002-02-02' |
+----------+--------------+-----------+--------------+
The result table should have 100 rows too.
I tried different JOINs, but there is no condition to join on.
Sure there is a join condition, about the simplest there is: join on true, or use a cross join. Either one merges the tables without a condition. However, this does not give you what you want, because with 100 rows in each table it generates a result set of 10,000 rows. You can then cut that down with LIMIT:
select *
from table1
join table2 on true
order by random()
limit 100;
select *
from table1
cross join table2
order by random()
limit 100;
There is another option, which I think may be closer to what you want. Assign a row number to each row of each table, then join on this assigned value:
select <column list>
from (select *, row_number() over() rn from table1) t1
join (select *, row_number() over() rn from table2) t2
on (t1.rn = t2.rn);
To eliminate the assigned value you must specifically list each column desired in the result. But that is the way it should be done anyway.
See demo here. (The demo uses just 3 rows instead of 100.)
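A small runnable sketch of the row-number pairing, using Python's built-in sqlite3 as a stand-in for PostgreSQL (window functions need SQLite 3.25 or newer). The ORDER BY inside each OVER clause is my addition: with a bare over() the numbering order is not guaranteed, so which birth date lands on which user is arbitrary.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (username TEXT, email TEXT, num TEXT)")
conn.execute("CREATE TABLE table2 (birth_date TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [("user1", "user1#mail", "+794949"), ("user2", "user2#mail", "+799999")],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?)",
    [("2001-01-01",), ("2002-02-02",)],
)

# Number the rows of each table, then join row 1 to row 1, row 2 to row 2, ...
# Listing the columns explicitly keeps the helper column rn out of the result.
query = """
SELECT t1.username, t1.email, t1.num, t2.birth_date
FROM (SELECT *, row_number() OVER (ORDER BY username) AS rn FROM table1) AS t1
JOIN (SELECT *, row_number() OVER (ORDER BY birth_date) AS rn FROM table2) AS t2
  ON t1.rn = t2.rn
ORDER BY t1.rn
"""
paired_rows = conn.execute(query).fetchall()
for row in paired_rows:
    print(row)
```

If the tables have different row counts, the inner join silently drops the unmatched rows of the longer table.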

Fetch records with distinct value of one column while replacing another col's value when multiple records

I have 2 tables that I need to join on distinct rid, replacing the code column's value when a rid has different code values across multiple rows. It is better explained with the example set below.
CREATE TABLE usr (rid INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(12) NOT NULL,
email VARCHAR(20) NOT NULL);
CREATE TABLE usr_loc
(rid INT NOT NULL,
code CHAR NOT NULL,
loc_id INT NOT NULL,
PRIMARY KEY (rid, code, loc_id));
INSERT INTO usr VALUES
(1,'John','john#product'),
(2,'Linda','linda#product'),
(3,'Greg','greg#product'),
(4,'Kate','kate#product'),
(5,'Johny','johny#product'),
(6,'Mary','mary#test');
INSERT INTO usr_loc VALUES
(1,'A',4532),
(1,'I',4538),
(1,'I',4545),
(2,'I',3123),
(3,'A',4512),
(3,'A',4527),
(4,'I',4567),
(4,'A',4565),
(5,'I',4512),
(6,'I',4567),
(6,'I',4569);
Required Result Set
+-----+-------+------+-----------------+
| rid | name | Code | email |
+-----+-------+------+-----------------+
| 1 | John | B | 'john#product' |
| 2 | Linda | I | 'linda#product' |
| 3 | Greg | A | 'greg#product' |
| 4 | Kate | B | 'kate#product' |
| 5 | Johny | I | 'johny#product' |
| 6 | Mary | I | 'mary#test' |
+-----+-------+------+-----------------+
I have tried some queries with joins and some with counts, but I can't find one that satisfies the whole scenario.
The query I came up with is
SELECT distinct(a.rid) as rid, a.name, a.email, 'B' as code
FROM usr a
JOIN usr_loc b ON a.rid=b.rid
WHERE a.rid IN (SELECT rid FROM usr_loc GROUP BY rid HAVING COUNT(*) > 1);
You need to group by the users and count how many distinct codes each one has in usr_loc. If there is more than one, replace the code with B. (Counting rows with count(*) instead would also turn rid 3 and rid 6 into B, which the required result set does not show.) See below:
select
rid,
name,
case when cnt > 1 then 'B' else min_code end as code,
email
from (
select u.rid, u.name, u.email, min(l.code) as min_code, count(distinct l.code) as cnt
from usr u
join usr_loc l on l.rid = u.rid
group by u.rid, u.name, u.email
) x;
Seems to me that you are using MySQL, rather than IBM DB2. Is that so?
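To check the grouping logic against the sample data, here is a minimal sketch run through Python's built-in sqlite3 (a stand-in, not the asker's actual database). It assumes the rule implied by the required result set: a user gets 'B' only when the usr_loc rows carry more than one distinct code, which is why it uses COUNT(DISTINCT code) rather than a plain row count. The table definitions are simplified for the demo.

```python
import sqlite3

# In-memory SQLite database as a stand-in (assumption); simplified DDL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usr (rid INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("CREATE TABLE usr_loc (rid INTEGER, code TEXT, loc_id INTEGER)")
conn.executemany(
    "INSERT INTO usr VALUES (?, ?, ?)",
    [(1, "John", "john#product"), (2, "Linda", "linda#product"),
     (3, "Greg", "greg#product"), (4, "Kate", "kate#product"),
     (5, "Johny", "johny#product"), (6, "Mary", "mary#test")],
)
conn.executemany(
    "INSERT INTO usr_loc VALUES (?, ?, ?)",
    [(1, "A", 4532), (1, "I", 4538), (1, "I", 4545), (2, "I", 3123),
     (3, "A", 4512), (3, "A", 4527), (4, "I", 4567), (4, "A", 4565),
     (5, "I", 4512), (6, "I", 4567), (6, "I", 4569)],
)

# One output row per user: 'B' when the user has mixed codes,
# otherwise the single code they have (MIN picks it).
query = """
SELECT u.rid, u.name,
       CASE WHEN COUNT(DISTINCT l.code) > 1 THEN 'B' ELSE MIN(l.code) END AS code,
       u.email
FROM usr u
JOIN usr_loc l ON l.rid = u.rid
GROUP BY u.rid, u.name, u.email
ORDER BY u.rid
"""
result_rows = conn.execute(query).fetchall()
for row in result_rows:
    print(row)
```

Users without any usr_loc row drop out because of the inner join; a LEFT JOIN would keep them with a NULL code.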

selecting multiple columns but group by only one in postgres

I have a simple table in postgres:
remoteaddr count
142.4.218.156 592
158.69.26.144 613
167.114.209.28 618
Which I pulled using the following:
select remoteaddr,
count (remoteaddr)
from domain_visitors
group by remoteaddr
having count (remoteaddr) > 500
How do I select additional columns and still only group by remoteaddr?
Option 1: You could use the array_agg() function to concatenate the additional column values into a grouped list:
SELECT
remoteaddr,
array_agg(DISTINCT username) AS unique_users,
array_agg(username) AS repeated_users,
count(remoteaddr) as remote_count
FROM domain_visitors
GROUP BY remoteaddr;
See this SQL Fiddle. This query would return something like the below:
+----------------+---------------------------------+-----------------------------------------------------------------------------------------------------+--------------+
| remoteaddr | unique_users | repeated_users | remote_count |
+----------------+---------------------------------+-----------------------------------------------------------------------------------------------------+--------------+
| 142.4.218.156 | anotheruser,user9688766,vistor1 | user9688766,anotheruser,vistor1,vistor1,vistor1,vistor1,vistor1,anotheruser,anotheruser,anotheruser | 10 |
| 158.69.26.144 | anotheruser,user9688766 | anotheruser,user9688766,user9688766,user9688766,user9688766 | 5 |
| 167.114.209.28 | vistor1 | vistor1 | 1 |
+----------------+---------------------------------+-----------------------------------------------------------------------------------------------------+--------------+
Option 2: You could put your first query in a common table expression (aka a "WITH" clause), and join it against the original table, like this:
WITH grouped_addr AS (
SELECT remoteaddr, count(remoteaddr) AS remote_count
FROM domain_visitors
GROUP BY remoteaddr
)
SELECT ga.remoteaddr, dv.username, ga.remote_count
FROM grouped_addr ga
INNER JOIN domain_visitors dv
ON ga.remoteaddr = dv.remoteaddr
WHERE remote_count > 500;
Here is a SQL Fiddle.
Bear in mind that this will return repeated results for any additional columns (in this example, username). This is not usually what you want. Note each of the SELECT examples in the Fiddles and see which best suits your purpose.
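A small sketch of Option 2 that can be run locally, using Python's built-in sqlite3 as a stand-in for PostgreSQL. The usernames and rows are made up for illustration, the threshold is lowered from 500 to 2 to fit the tiny sample, and the DISTINCT is my addition to collapse the repeated username rows mentioned above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE domain_visitors (remoteaddr TEXT, username TEXT)")
conn.executemany(
    "INSERT INTO domain_visitors VALUES (?, ?)",
    [("142.4.218.156", "alice"),   # hypothetical sample rows
     ("142.4.218.156", "bob"),
     ("142.4.218.156", "alice"),
     ("158.69.26.144", "carol")],
)

# The CTE does the grouping on remoteaddr alone; the join then brings
# back any extra columns from the original table.
query = """
WITH grouped_addr AS (
    SELECT remoteaddr, count(remoteaddr) AS remote_count
    FROM domain_visitors
    GROUP BY remoteaddr
    HAVING count(remoteaddr) > 2
)
SELECT DISTINCT ga.remoteaddr, dv.username, ga.remote_count
FROM grouped_addr ga
INNER JOIN domain_visitors dv ON ga.remoteaddr = dv.remoteaddr
ORDER BY dv.username
"""
visitor_rows = conn.execute(query).fetchall()
for row in visitor_rows:
    print(row)
```

Without the DISTINCT, the alice row would appear twice, once per matching visit.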

Postgresql: How to remove duplicate rows while joining?

I have two PostgreSQL tables called charges and orders. I'm trying to create a matview showing how many charges turned into orders and what they are worth. The two tables are not directly related; here's the structure of both:
Charges
| date | transaction_id | amount |
|--------|----------------|--------|
| 23-Apr | abcdef | 36 |
| 23-Apr | fghijkl | 198 |
| 24-Apr | yyyyyy | 200 |
Orders
| date | order_id |
|--------|----------|
| 23-Apr | abcdef |
| 23-Apr | abcdef |
| 24-Apr | yyyyyy |
And below is the query I'm using for generating the matview,
CREATE MATERIALIZED VIEW sales AS
SELECT ch.date AS date,
(ord.id IS NOT NULL) as placed_order,
COUNT(DISTINCT(ch.transaction_id)) AS attempts,
SUM(ch.amount) AS amount
FROM charges ch
LEFT OUTER JOIN orders as ord ON ch.transaction_id = ord.order_id
GROUP BY ch.date
The problem is with the amount column generated in the view. Because of the duplicates in the orders table, the left outer join returns the same charge row multiple times, which inflates the summed amount.
Is there a way to make the order_id column from orders distinct at the time of joining itself?
Or is there a way to take the distinct order_id and sum the amount within the query itself? I tried a sub-query and a self-join, but with no luck.
You can make a sub-query on table orders to filter out the duplicates:
CREATE MATERIALIZED VIEW sales AS
SELECT ch.date AS date,
(ord.order_id IS NOT NULL) AS placed_order,
count(ch.transaction_id) AS attempts,
sum(ch.amount) AS amount
FROM charges ch
LEFT JOIN (
SELECT DISTINCT date, order_id FROM orders) ord ON ch.transaction_id = ord.order_id
GROUP BY 1, 2
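To see the effect on the sample data, here is a minimal sketch that runs the same SELECT twice through Python's built-in sqlite3 (a stand-in for PostgreSQL, without the materialized view): once against the raw orders table and once against the de-duplicated sub-query. The helper name sales_summary is hypothetical, and SQLite returns the boolean placed_order as 0/1.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE charges (date TEXT, transaction_id TEXT, amount INTEGER)")
conn.execute("CREATE TABLE orders (date TEXT, order_id TEXT)")
conn.executemany(
    "INSERT INTO charges VALUES (?, ?, ?)",
    [("23-Apr", "abcdef", 36), ("23-Apr", "fghijkl", 198), ("24-Apr", "yyyyyy", 200)],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("23-Apr", "abcdef"), ("23-Apr", "abcdef"), ("24-Apr", "yyyyyy")],
)

def sales_summary(orders_source):
    """Run the sales query with the given table or sub-query as the orders side."""
    query = f"""
    SELECT ch.date,
           (ord.order_id IS NOT NULL) AS placed_order,
           count(DISTINCT ch.transaction_id) AS attempts,
           sum(ch.amount) AS amount
    FROM charges ch
    LEFT JOIN {orders_source} ord ON ch.transaction_id = ord.order_id
    GROUP BY ch.date, (ord.order_id IS NOT NULL)
    ORDER BY ch.date, placed_order
    """
    return conn.execute(query).fetchall()

# Raw join: the duplicated order row doubles the 36 charge to 72.
inflated = sales_summary("orders")
# De-duplicated join: each charge is counted once.
deduplicated = sales_summary("(SELECT DISTINCT date, order_id FROM orders)")
print(inflated)
print(deduplicated)
```

Note that DISTINCT date, order_id only removes exact duplicates; if the same order_id could appear under two different dates, selecting DISTINCT order_id alone would be the safer sub-query.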

how to make array_agg() work like group_concat() from mySQL

So I have this table:
create table test (
id integer,
rank integer,
image varchar(30)
);
Then some values:
id | rank | image
---+------+-------
1 | 2 | bbb
1 | 3 | ccc
1 | 1 | aaa
2 | 3 | c
2 | 1 | a
2 | 2 | b
I want to group them by id and concatenate the image name in the order given by rank. In mySQL I can do this:
select id,
group_concat( image order by rank asc separator ',' )
from test
group by id;
And the output would be:
1 aaa,bbb,ccc
2 a,b,c
Is there a way I can have this in postgresql?
If I try to use array_agg() the names will not show in the correct order and apparently I was not able to find a way to sort them. (I was using postgres 8.4 )
In PostgreSQL 8.4 you cannot explicitly order array_agg, but you can work around it by ordering the rows passed into the group/aggregate with a subquery:
SELECT id, array_to_string(array_agg(image), ',')
FROM (SELECT * FROM test ORDER BY id, rank) x
GROUP BY id;
From PostgreSQL 9.0 onward, aggregate expressions can have an ORDER BY clause:
SELECT id, array_to_string(array_agg(image ORDER BY rank), ',')
FROM test
GROUP BY id;