Adding row value after grouping the table causes * must appear in the GROUP BY clause - postgresql

I am trying to join 2 tables like so:
left join (
select t1.createdate, min(f1.createdate) as mindt, f1.status_aft
from new_table t1
left join new_folder f1 on t1.veh_id = f1.veh_id
where f1.createdate > t1.createdate
group by t1.createdate
) h3
on t1.createdate = h3.createdate
and f1.createdate = h3.mindt
But I am getting an error:
ERROR: column "f1.status_aft" must appear in the GROUP BY clause or be used in an aggregate function
This makes sense because I do not group by it; my goal is just to take the value from the row where f1.createdate is the minimum.
For example:
A B C
one 10 a
one 15 b
two 20 c
two 25 d
Becomes
A B C
one 10 a
two 20 c
Because a and c were the values in column C on the rows where column B was lowest within each group of column A.
I've seen this answer but I still can't apply it to my scenario.
How can I achieve the desired result?

my goal is just to take the value that is in that current row when f1.createdate is min.
If you want just one row, you can order by and limit:
left join (
select t1.createdate, f1.createdate as mindt, f1.status_aft
from new_table t1
left join new_folder f1 on t1.veh_id = f1.veh_id
where f1.createdate > t1.createdate
order by t1.createdate limit 1
) h3
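If one row per group is wanted rather than a single row overall, PostgreSQL's DISTINCT ON is one possible approach. The sketch below reproduces the A/B/C example from the question; the table name example_abc is hypothetical and exists only for illustration.

```sql
-- Hypothetical table holding the A/B/C example from the question.
CREATE TEMP TABLE example_abc (a text, b integer, c text);
INSERT INTO example_abc (a, b, c) VALUES
    ('one', 10, 'a'),
    ('one', 15, 'b'),
    ('two', 20, 'c'),
    ('two', 25, 'd');

-- DISTINCT ON (a) keeps the first row of each group of a;
-- ORDER BY a, b makes that first row the one with the lowest b,
-- so c is taken from the same row as the minimum b.
SELECT DISTINCT ON (a) a, b, c
FROM example_abc
ORDER BY a, b;
```

Applied to the original query, the same idea would presumably mean DISTINCT ON (t1.createdate) ordered by t1.createdate, f1.createdate, but that is an assumption about the intended grouping.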

Related

How can I list other matching values even if there is an unmatched value in the query?

My query filters on a value that has no match in the demand_category table. Because that one join finds nothing, the query returns no rows at all, even though the other joins do match.
How can I list the matching values even when one of them has no match?
process Table
fk_unit_id fk_unit_position fk_demand_category
1 2 1
unit table
unit_id
1
unit_position table
unit_position
2
demand_category table
demand_category
1
Query:
SELECT unit_name,unit_position_name,demand_category_name From process
INNER JOIN unit ON process.fk_unit_id = unit_id and unit_id =1
INNER JOIN unit_position ON process.fk_unit_position_id = unit_position_id and unit_position_id = 2
INNER JOIN demand_category ON process.fk_demand_category_id = demand_category_id and demand_category_id =0 ;
Replace the INNER JOIN on demand_category with a LEFT JOIN.
A LEFT JOIN returns every row from the left side and the related row from the right table; if no related row exists, the columns selected from the right table contain NULL.
SELECT unit_name,unit_position_name,demand_category_name From process
INNER JOIN unit ON process.fk_unit_id = unit_id and unit_id =1
INNER JOIN unit_position ON process.fk_unit_position_id = unit_position_id and unit_position_id = 2
LEFT JOIN demand_category ON process.fk_demand_category_id = demand_category_id and demand_category_id =0 ;
You can use an outer join to keep the rows that don't match; the corresponding columns from the other table are padded with NULL. Another way is to use the IN operator, but that may perform worse.
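One detail worth illustrating is where the filter on the right-hand table goes. The sketch below uses hypothetical miniature tables (demo_process, demo_demand_category, and the sample name 'urgent' are assumptions, not the asker's schema) to show that the filter must stay in the ON clause for the unmatched row to survive.

```sql
-- Hypothetical miniature of the tables in the question.
CREATE TEMP TABLE demo_demand_category (demand_category_id integer, demand_category_name text);
CREATE TEMP TABLE demo_process (fk_unit_id integer, fk_demand_category_id integer);
INSERT INTO demo_demand_category VALUES (1, 'urgent');
INSERT INTO demo_process VALUES (1, 1);

-- Filter in the ON clause: the process row is kept and
-- demand_category_name is NULL because no category has id 0.
SELECT p.fk_unit_id, dc.demand_category_name
FROM demo_process p
LEFT JOIN demo_demand_category dc
       ON p.fk_demand_category_id = dc.demand_category_id
      AND dc.demand_category_id = 0;

-- Same filter in WHERE: the joined row fails the filter and is discarded,
-- so the LEFT JOIN behaves like an INNER JOIN and nothing is returned.
SELECT p.fk_unit_id, dc.demand_category_name
FROM demo_process p
LEFT JOIN demo_demand_category dc
       ON p.fk_demand_category_id = dc.demand_category_id
WHERE dc.demand_category_id = 0;
```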

Join two tables on all columns to determine if they contain identical information

I want to check if tables table_a and table_b are identical. I thought I could full outer join both tables on all columns and count the number of rows and missing values. However, both tables have many columns and I do not want to explicitly type out every column name.
Both tables have the same number of columns as well as names. How can I full outer join both of them on all columns without explicitly typing every column name?
I would like to do something along this syntax:
select
count(1)
,sum(case when x.id is null then 1 else 0 end) as x_nulls
,sum(case when y.id is null then 1 else 0 end) as y_nulls
from
x
full outer join
y
on
*
;
You can use NATURAL FULL OUTER JOIN here. The NATURAL key word will join on all columns that have the same name.
Just testing if the tables are identical could then be:
SELECT *
FROM x NATURAL FULL OUTER JOIN y
WHERE x.id IS NULL OR y.id IS NULL
This will show "orphaned" rows in either table.
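As a small illustration of the natural-join check, the sketch below uses two hypothetical tables, demo_x and demo_y, that differ in one row. One caveat to keep in mind: the join compares columns with equality, so a row containing NULL in any column never matches its counterpart and is reported as orphaned on both sides.

```sql
-- Hypothetical tables with identical column names and types.
CREATE TEMP TABLE demo_x (id integer, val text);
CREATE TEMP TABLE demo_y (id integer, val text);
INSERT INTO demo_x VALUES (1, 'a'), (2, 'b');
INSERT INTO demo_y VALUES (1, 'a'), (2, 'c');

-- Row (1, 'a') matches on every column and is filtered out.
-- Rows (2, 'b') and (2, 'c') differ in val, so each appears once
-- with the other table's side NULL-extended.
SELECT *
FROM demo_x x NATURAL FULL OUTER JOIN demo_y y
WHERE x.id IS NULL OR y.id IS NULL;
```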
You might use the EXCEPT operator.
For example the following would return an empty set if both tables contain the same rows:
select * from t1
except
select * from t2;
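That query only reports rows of t1 that are missing from t2. As a yes/no check, the following returns every row of t1 when none of them is missing from t2, and no rows otherwise: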
select * from t1
where not exists (select * from t1 except select * from t2);
Provided the number and types of columns match, you can use select *; the column names may differ between the tables. You could also run the EXCEPT in the opposite direction and UNION the two results to return the combined differences.
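A minimal sketch of the combined-differences idea, assuming two hypothetical tables diff_t1 and diff_t2 with matching column types. The origin label is added only to show which side each row came from. Note that EXCEPT removes duplicate rows; EXCEPT ALL would be needed if duplicate counts matter.

```sql
-- Hypothetical tables that share one row and differ in another.
CREATE TEMP TABLE diff_t1 (id integer, val text);
CREATE TEMP TABLE diff_t2 (id integer, val text);
INSERT INTO diff_t1 VALUES (1, 'a'), (2, 'b');
INSERT INTO diff_t2 VALUES (1, 'a'), (3, 'c');

-- Rows present only in diff_t1, followed by rows present only in diff_t2.
-- An empty result means both tables contain the same set of rows.
(SELECT 'only in t1' AS origin, d.*
 FROM (SELECT * FROM diff_t1 EXCEPT SELECT * FROM diff_t2) d)
UNION ALL
(SELECT 'only in t2' AS origin, d.*
 FROM (SELECT * FROM diff_t2 EXCEPT SELECT * FROM diff_t1) d);
```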

SQL left join on maximum date

I have two tables: contracts and contract_descriptions.
contract_descriptions has a column named contract_id that matches contract_id on the contracts table.
I am trying to join the latest record on contract_descriptions:
SELECT *
FROM contracts c
LEFT JOIN contract_descriptions d ON d.contract_id = c.contract_id
AND d.date_description =
(SELECT MAX(date_description)
FROM contract_descriptions t
WHERE t.contract_id = c.contract_id)
It works, but is it the most performant way to do it? Is there a way to avoid the second SELECT?
Alternatively, you could use DISTINCT ON:
SELECT * FROM contracts c LEFT JOIN (
SELECT DISTINCT ON (cd.contract_id) cd.* FROM contract_descriptions cd
ORDER BY cd.contract_id, cd.date_description DESC
) d ON d.contract_id = c.contract_id
DISTINCT ON selects only one row per contract_id, while the sort clause cd.date_description DESC ensures that it is always the latest description.
Performance depends on many factors (for example, table size). In any case, you should compare both approaches with EXPLAIN.
Your query looks okay to me. One typical way to join only n rows by some order from the other table is a lateral join:
SELECT *
FROM contracts c
CROSS JOIN LATERAL
(
SELECT *
FROM contract_descriptions cd
WHERE cd.contract_id = c.contract_id
ORDER BY cd.date_description DESC
FETCH FIRST 1 ROW ONLY
) cdlast;
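One difference from the original LEFT JOIN is worth noting: CROSS JOIN LATERAL drops contracts that have no description at all. If those contracts should still appear, LEFT JOIN LATERAL ... ON true is a possible variant. The sketch below uses hypothetical demo tables and an assumed description column.

```sql
-- Hypothetical data: contract 1 has two descriptions, contract 2 has none.
CREATE TEMP TABLE demo_contracts (contract_id integer);
CREATE TEMP TABLE demo_contract_descriptions (
    contract_id      integer,
    date_description date,
    description      text
);
INSERT INTO demo_contracts VALUES (1), (2);
INSERT INTO demo_contract_descriptions VALUES
    (1, DATE '2020-01-01', 'old'),
    (1, DATE '2021-01-01', 'new');

-- LEFT JOIN LATERAL keeps contract 2 with NULL description columns;
-- the lateral subquery picks the latest description per contract.
SELECT c.contract_id, cdlast.date_description, cdlast.description
FROM demo_contracts c
LEFT JOIN LATERAL (
    SELECT cd.date_description, cd.description
    FROM demo_contract_descriptions cd
    WHERE cd.contract_id = c.contract_id
    ORDER BY cd.date_description DESC
    FETCH FIRST 1 ROW ONLY
) cdlast ON true
ORDER BY c.contract_id;
```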

Update table with from sub select

I have two tables, a and b.
For each id, I want to update the most recently inserted row in table a with values from the earliest inserted row in table b, matching on a.id = b.id.
I've been trying to use an update statement with a sub select in the from.
If I execute the sub query on its own it returns x rows, but when I execute the whole update statement it updates y rows.
update a
set title = b.title,
created_at = b.created_at
from
(
select
e.id,e.title,e.created_at
from
(
select
l.id,
l.title,
l.created_at,
l.t_insert
from b l
left join b r
on l.id = r.id and l.t_insert > r.t_insert
) e
join
(
select
l.id,
l.title,
l.created_at,
l.t_insert
from a l
left join a r on l.report_id = r.report_id and l.t_insert < r.t_insert
) f
) b
where
a.id=b.id
I want the same number of rows to be updated as returned in the sub select query in the from.
In this case, fewer rows may be updated than the subquery returns because the same id appears more than once in the subquery's result. When that happens, the UPDATE statement still updates each target row only once. I'm assuming the statement you've provided is not exactly what you're running, but you should check that the subquery does not return duplicate ids (using DISTINCT or GROUP BY, or by double-checking your JOIN conditions).
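A quick way to run that check is to group the subquery's output by id and look for counts above one. The sketch below uses a hypothetical table demo_b as a stand-in for the real subquery, and then shows DISTINCT ON as one possible way to keep only the earliest row per id, assuming t_insert orders the inserts.

```sql
-- Hypothetical stand-in for the row source: id 1 occurs twice.
CREATE TEMP TABLE demo_b (id integer, title text, t_insert integer);
INSERT INTO demo_b VALUES (1, 'first', 1), (1, 'second', 2), (2, 'only', 1);

-- List ids that occur more than once; any row returned here means the
-- UPDATE will touch fewer rows than the subquery returns.
SELECT s.id, count(*) AS occurrences
FROM (SELECT id FROM demo_b) s   -- replace with the real subquery
GROUP BY s.id
HAVING count(*) > 1;

-- One possible fix: keep exactly one row per id, here the earliest insert.
SELECT DISTINCT ON (id) id, title
FROM demo_b
ORDER BY id, t_insert;
```

With duplicates in the source, PostgreSQL uses only one of the joined rows for each target row, and which one it picks is not readily predictable, so deduplicating explicitly also makes the result deterministic.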

How to count total number of records after join the three tables in postgresql?

I have a query that returns 12408 records in total, but I want it to also give me the total number of records as a count column.
select
c.complaint_id,c.server_time,c.completion_date,c.road_id,c.photo,c.dept_code,c.dist_code,c.eng_userid,c.feedback_type,c.status,p.dist_name,p.road_name,p.road_dept,e.display_name,e.mobile
from complaints as c INNER JOIN pwd_roads as p ON p.road_id=c.road_id
INNER JOIN enc_details as e ON CAST(e.enc_code as INTEGER) = p.enccode
where c.complaint_id=c.parent_complaint_id and c.dept_code='PWDBnR'
and c.server_time between '2018-09-03' and '2018-12-19'
You can solve this issue using window functions. For example, if you want your first column to be a count of the total rows returned by the SELECT statement:
select count(1) over(range between unbounded preceding and unbounded following) as total_row_count
, c.complaint_id,c.server_time,c.completion_date,c.road_id,c.photo,c.dept_code,c.dist_code,c.eng_userid,c.feedback_type,c.status,p.dist_name,p.road_name,p.road_dept,e.display_name,e.mobile from complaints as c INNER JOIN pwd_roads as p ON p.road_id=c.road_id INNER JOIN enc_details as e ON CAST(e.enc_code as INTEGER) = p.enccode where c.complaint_id=c.parent_complaint_id and c.dept_code='PWDBnR' and c.server_time between '2018-09-03' and '2018-12-19'
Note that the window function is evaluated before the LIMIT clause if one is used, so if you were to add LIMIT 100 to the query it might give a row count greater than 100 even though a max of 100 rows would be returned.
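The interaction with LIMIT can be seen on a tiny example. The sketch below uses a hypothetical five-row table, demo_complaints, and the shorter count(*) OVER () form, which also counts over the whole result set.

```sql
-- Hypothetical table with five rows.
CREATE TEMP TABLE demo_complaints (complaint_id integer);
INSERT INTO demo_complaints SELECT generate_series(1, 5);

-- The window count is computed over all five rows before LIMIT applies,
-- so both returned rows carry total_row_count = 5.
SELECT count(*) OVER () AS total_row_count, complaint_id
FROM demo_complaints
ORDER BY complaint_id
LIMIT 2;
```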
The easiest, though not very elegant, way to do this is:
select count(*)
from
(
select c.complaint_id,c.server_time,c.completion_date,c.road_id,c.photo,c.dept_code,c.dist_code,c.eng_userid,c.feedback_type,c.status,p.dist_name,p.road_name,p.road_dept,e.display_name,e.mobile from complaints as c INNER JOIN pwd_roads as p ON p.road_id=c.road_id INNER JOIN enc_details as e ON CAST(e.enc_code as INTEGER) = p.enccode where c.complaint_id=c.parent_complaint_id and c.dept_code='PWDBnR' and c.server_time between '2018-09-03' and '2018-12-19'
) as t