Use the coalesce function in SQL to return zero counts of records without any column value being null - PostgreSQL

I am trying to return zero counts as results for a query. But when I am running this query, a particular column's values are returning as null.
select tab1.source_type,
coalesce(tab2.numberofrecords,0) as numberofrecords,
coalesce(dt,current_date-40) as dt,
coalesce(client_id,client_id) as client_id
from (select distinct source_type from integration_customers )
as tab1
left join
(select count(id) as Numberofrecords,
source_type, Date(created_at) as dt,
client_id
from integration_customers ic
where Date(created_at)= current_date-39
and
source_type in
(select distinct "source" from integration_integrationconfig ii where status ='complete')
group by source_type ,dt,client_id
order by dt desc) as tab2
on tab1.source_type = tab2.source_type
But the result of this query contains null values.
I want to remove these null values and show the client id specifically for each zero record as well.
The table integration_customers has the columns client_id, created_at, and source_type.

Related

Apply join, sort on date column and select the first row where one of the column value is not null

I have two tables(Table A and Table B) in a Postgres DB.
Both have "id" column in common. Table A has one column called "id" and Table B has three columns: "id, date, value($)".
For each "id" in Table A there exist multiple rows in Table B in the following format - (id, date, value).
For instance, for "id" 1 in Table A, suppose Table B contains the following rows:
(1, 2018-06-21, null)
(1, 2018-06-20, null)
(1, 2018-06-19, 202)
(1, 2018-06-18, 200)
I would like to extract the most recent dated non-null value. For example for id - 1, the result should be 202. Please share your thoughts or let me know in case more info is required.
Here is the solution I went ahead with:
with mapping as ( select distinct table1.id, table2.value, table2.date, row_number() over (partition by table1.id order by table2.date desc nulls last) as row_number
from table1
left join table2 on table2.id=table1.id and table2.value is not null
)
select * from mapping where row_number = 1
Let me know if there is scope for improvement.
You may very well want an inner join, not an outer join. If you have an id in table1 that does not exist in table2, or that has only null values there, you will get NULL for both date and value. This is due to how an outer join works: if nothing in the right-side table matches the ON condition, it returns NULL for each column of that table. So:
with mapping as
(select distinct table1.id
, table2.value
, table2.date
, row_number() over (partition by table1.id order by table2.date desc nulls last) as row_number
from table1
join table2 on table2.id=table1.id and table2.value is not null
)
select *
from mapping
where row_number = 1;
See example of each here. Your query worked because all your test data satisfied the first condition of the ON clause. You really need test data that fails it to see what your query does.
Caution: DATE and VALUE are very poor choices for column names. Both are reserved words in the SQL standard, although not in Postgres specifically. Furthermore, DATE is a Postgres data type. Naming columns after data types leads to confusion.
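To see the difference between the two joins on data that fails the ON condition, here is a minimal sketch. It assumes Python's built-in sqlite3 module (with an SQLite build that has window functions, 3.25 or later) as a stand-in for Postgres, and it renames the columns to the hypothetical ts and val to avoid the reserved words mentioned above. Ids 2 and 3 are made-up test rows: id 2 has only null values and id 3 has no rows in table2 at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY);
CREATE TABLE table2 (id INTEGER, ts TEXT, val INTEGER);
INSERT INTO table1 VALUES (1), (2), (3);
INSERT INTO table2 VALUES
  (1, '2018-06-21', NULL),
  (1, '2018-06-20', NULL),
  (1, '2018-06-19', 202),
  (1, '2018-06-18', 200),
  (2, '2018-06-21', NULL);   -- id 2 has only NULL values, id 3 has no rows
""")

# Same query twice; only the join type changes.
QUERY = """
WITH mapping AS (
  SELECT t1.id, t2.val, t2.ts,
         row_number() OVER (PARTITION BY t1.id ORDER BY t2.ts DESC) AS rn
  FROM table1 t1
  {join} table2 t2 ON t2.id = t1.id AND t2.val IS NOT NULL
)
SELECT id, val, ts FROM mapping WHERE rn = 1 ORDER BY id
"""

left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()
inner_rows = conn.execute(QUERY.format(join="JOIN")).fetchall()

print(left_rows)   # ids 2 and 3 appear with NULL val and ts
print(inner_rows)  # only id 1 remains
```

With the outer join, every id from table1 comes back, padded with NULLs where nothing matched; with the inner join, those ids drop out. Which one is right depends on whether ids without a value should appear in the result.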

Query to select by number of associated objects

I have two tables that look like the following:
Orders
------
id
tracking_number
ShippingLogs
------
tracking_number
created_at
stage
I would like to select the IDs of Orders that have ONLY ONE ShippingLog associated with them, and the stage of that ShippingLog must be error. If an order has two ShippingLog entries, I don't want it. If it has one ShippingLog but its stage is shipped, I don't want it.
This is what I have. It doesn't work, and I know why: it finds the log with the error, but has no way of knowing whether there are others. I just don't know how to write it the way I need.
SELECT DISTINCT
orders.id, shipping_logs.created_at, COUNT(shipping_logs.*)
FROM
orders
JOIN
shipping_logs ON orders.tracking_number = shipping_logs.tracking_number
WHERE
shipping_logs.created_at BETWEEN '2021-01-01 23:40:00'::timestamp AND '2021-01-26 23:40:00'::timestamp AND shipping_logs.stage = 'error'
GROUP BY
orders.id, shipping_logs.created_at
HAVING
COUNT(shipping_logs.*) = 1
ORDER BY
orders.id, shipping_logs.created_at DESC;
If you want to retain every column from the join of the two tables given your requirements, then I would suggest using COUNT here as an analytic function:
WITH cte AS (
SELECT o.id, sl.created_at,
COUNT(*) OVER (PARTITION BY o.id) num_logs,
COUNT(*) FILTER (WHERE sl.stage <> 'error')
OVER (PARTITION BY o.id) non_error_cnt
FROM orders o
INNER JOIN shipping_logs sl ON sl.tracking_number = o.tracking_number
WHERE sl.created_at BETWEEN '2021-01-01 23:40:00'::timestamp AND
'2021-01-26 23:40:00'::timestamp
)
SELECT id AS order_id, created_at
FROM cte
WHERE num_logs = 1 AND non_error_cnt = 0
ORDER BY id, created_at DESC;
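The windowed COUNT approach can be tried on a few rows. This is a sketch only: it assumes Python's built-in sqlite3 module as a stand-in for Postgres (SQLite 3.25+ accepts the same FILTER ... OVER syntax), stores timestamps as ISO text, and uses made-up sample orders A, B and C.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, tracking_number TEXT);
CREATE TABLE shipping_logs (tracking_number TEXT, created_at TEXT, stage TEXT);
INSERT INTO orders VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO shipping_logs VALUES
  ('A', '2021-01-05 10:00:00', 'error'),    -- one log, error: wanted
  ('B', '2021-01-06 10:00:00', 'error'),    -- two logs: not wanted
  ('B', '2021-01-07 10:00:00', 'shipped'),
  ('C', '2021-01-08 10:00:00', 'shipped');  -- one log, not error: not wanted
""")

rows = conn.execute("""
WITH cte AS (
  SELECT o.id, sl.created_at,
         COUNT(*) OVER (PARTITION BY o.id) AS num_logs,
         COUNT(*) FILTER (WHERE sl.stage <> 'error')
             OVER (PARTITION BY o.id) AS non_error_cnt
  FROM orders o
  INNER JOIN shipping_logs sl ON sl.tracking_number = o.tracking_number
  WHERE sl.created_at BETWEEN '2021-01-01 23:40:00' AND '2021-01-26 23:40:00'
)
SELECT id AS order_id, created_at
FROM cte
WHERE num_logs = 1 AND non_error_cnt = 0
ORDER BY id, created_at DESC
""").fetchall()

print(rows)
```

Note that the counts only see logs inside the date range, because the WHERE clause runs before the window functions. An order with a second log outside the range would still count as having one log.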

How do I make my RANK () OVER query work in select?

I have this table that I need to sort in the following way:
need to rank Departments by Salary;
need to show a 'No data to be shown' message if Salary is NULL;
need to add the total salary paid to the department;
need to count the people in the department.
SELECT RANK() OVER (
ORDER BY Salary DESC
)
,CASE
WHEN Salary IS NULL
THEN 'NO DATA TO BE SHOWN'
ELSE Salary
,Count(Fname)
,Total(Salary) FROM dbo.Employees
I get an error saying:
Column 'dbo.Employees.Salary' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Why so?
Aggregate functions return a single value for the whole table (or for each group), so you can't select a plain column alongside them; it doesn't make sense. Say you have a students table and apply Sum(marks) to the whole table, and you also select studentname in the same query. Which student's name should the database engine return? It is ambiguous.
Column "invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause"
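I tried this, using an inner query:
-- Aggregate per department in the inner query, then rank the departments.
SELECT RANK() OVER (ORDER BY SAL DESC) AS SALARY_RANK, FNAME, DEPARTMENT,
CASE
WHEN SAL IS NULL THEN 'NO DATA TO BE SHOWN'
ELSE CAST(SAL AS VARCHAR(20)) -- cast so both CASE branches are text
END AS SAL
FROM
(SELECT COUNT(FNAME) FNAME, SUM(SALARY) SAL, DEPARTMENT
FROM TESTEMPLOYEE
GROUP BY DEPARTMENT) t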
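The inner-query idea (aggregate per department first, then rank the aggregated rows) can be checked end to end with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server, so the syntax differs slightly (COALESCE with CAST ... AS TEXT replaces the CASE expression), and the sample rows and the aliases headcount, total_salary and total_salary_text are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE testemployee (fname TEXT, department TEXT, salary INTEGER);
INSERT INTO testemployee VALUES
  ('Ann', 'IT', 3000), ('Bob', 'IT', 2000),
  ('Cid', 'HR', 4000), ('Dee', 'Ops', NULL);
""")

rows = conn.execute("""
SELECT RANK() OVER (ORDER BY total_salary DESC) AS salary_rank,
       department,
       headcount,
       -- show a message instead of NULL; the total is cast to text first
       COALESCE(CAST(total_salary AS TEXT), 'NO DATA TO BE SHOWN')
           AS total_salary_text
FROM (
  -- one row per department: people counted, salaries summed
  SELECT department, COUNT(fname) AS headcount, SUM(salary) AS total_salary
  FROM testemployee
  GROUP BY department
) t
ORDER BY salary_rank
""").fetchall()

for row in rows:
    print(row)
```

Because every selected column is either grouped or aggregated in the inner query, the "not contained in either an aggregate function or the GROUP BY clause" error does not arise. SQLite and SQL Server both sort NULL totals last in descending order; other databases may place them first.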

array_agg group by and null

Given this table:
SELECT * FROM CommodityPricing order by dateField
"SILVER";60.45;"2002-01-01"
"GOLD";130.45;"2002-01-01"
"COPPER";96.45;"2002-01-01"
"SILVER";70.45;"2003-01-01"
"GOLD";140.45;"2003-01-01"
"COPPER";99.45;"2003-01-01"
"GOLD";150.45;"2004-01-01"
"MERCURY";60;"2004-01-01"
"SILVER";80.45;"2004-01-01"
As of 2004, COPPER was dropped and mercury introduced.
How can I get the value of (array_agg(value order by date desc) ) [1] as NULL for COPPER?
select commodity,(array_agg(value order by date desc) ) --[1]
from CommodityPricing
group by commodity
"COPPER";"{99.45,96.45}"
"GOLD";"{150.45,140.45,130.45}"
"MERCURY";"{60}"
"SILVER";"{80.45,70.45,60.45}"
SQL Fiddle
select
commodity,
array_agg(
case when commodity = 'COPPER' then null else price end
order by date desc
)
from CommodityPricing
group by commodity
;
To "pad" missing rows with NULL values in the resulting array, build your query on a full grid of rows (every commodity for every year) and LEFT JOIN the actual values to that grid.
Given this table definition:
CREATE TEMP TABLE price (
commodity text
, value numeric
, ts timestamp -- using ts instead of the inappropriate name date
);
I use generate_series() to get a list of timestamps representing the years and CROSS JOIN to a unique list of all commodities (SELECT DISTINCT ...).
SELECT commodity, (array_agg(value ORDER BY ts DESC)) AS years
FROM generate_series ('2002-01-01 00:00:00'::timestamp
, '2004-01-01 00:00:00'::timestamp
, '1y') t(ts)
CROSS JOIN (SELECT DISTINCT commodity FROM price) c(commodity)
LEFT JOIN price p USING (ts, commodity)
GROUP BY commodity;
Result:
COPPER {NULL,99.45,96.45}
GOLD {150.45,140.45,130.45}
MERCURY {60,NULL,NULL}
SILVER {80.45,70.45,60.45}
SQL Fiddle.
I cast the array to text in the fiddle, because the display sucks and would swallow NULL values otherwise.
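The grid-plus-LEFT JOIN idea can also be reproduced without a Postgres server. The sketch below assumes Python's built-in sqlite3 module; since SQLite has neither generate_series() by default nor array types, the year grid is a hard-coded VALUES list and the array is assembled in Python instead of with array_agg. Those substitutions are assumptions of this example, not part of the query above.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE price (commodity TEXT, value REAL, ts TEXT);
INSERT INTO price VALUES
  ('SILVER', 60.45, '2002-01-01'), ('GOLD', 130.45, '2002-01-01'),
  ('COPPER', 96.45, '2002-01-01'), ('SILVER', 70.45, '2003-01-01'),
  ('GOLD', 140.45, '2003-01-01'), ('COPPER', 99.45, '2003-01-01'),
  ('GOLD', 150.45, '2004-01-01'), ('MERCURY', 60, '2004-01-01'),
  ('SILVER', 80.45, '2004-01-01');
""")

rows = conn.execute("""
WITH years(ts) AS (VALUES ('2002-01-01'), ('2003-01-01'), ('2004-01-01'))
SELECT c.commodity, p.value
FROM years y
CROSS JOIN (SELECT DISTINCT commodity FROM price) c   -- full grid
LEFT JOIN price p ON p.ts = y.ts AND p.commodity = c.commodity
ORDER BY c.commodity, y.ts DESC
""").fetchall()

# Collect the values per commodity, newest year first; gaps stay as None.
padded = {name: [value for _, value in group]
          for name, group in groupby(rows, key=lambda r: r[0])}

for name, values in padded.items():
    print(name, values)
```

Each commodity ends up with exactly three entries, one per year, and the years without a price come back as None, matching the NULL padding in the result above.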

How to get fields that are not added in GROUP BY in PostgreSQL 8.4?

I am selecting the column used in GROUP BY along with a count, and the query looks something like this:
SELECT s.country, count(*) AS posts_ct
FROM store s
JOIN store_post_map sp ON sp.store_id = s.id
GROUP BY 1;
However, I want to select some more fields, like the store name or store address, from the store table for the store where the count is highest, but I don't want to include them in the GROUP BY clause.
For instance, to get the stores with the highest post-count per country:
SELECT DISTINCT ON (s.country)
s.country, s.id AS store_id, s.name, sp.post_ct -- the store key is s.id, as in the join below
FROM store s
JOIN (
SELECT store_id, count(*) AS post_ct
FROM store_post_map
GROUP BY store_id
) sp ON sp.store_id = s.id
ORDER BY s.country, sp.post_ct DESC
Add any number of columns from store to the SELECT list.
Details about this query style in this related answer:
Select first row in each GROUP BY group?
Reply to comment
This produces the count per country and picks (one of) the store(s) with the highest post-count:
SELECT DISTINCT ON (s.country)
s.country, s.id AS store_id, s.name -- the store key is s.id, as in the join below
,sum(post_ct) OVER (PARTITION BY s.country) AS post_ct_for_country
FROM store s
JOIN (
SELECT store_id, count(*) AS post_ct
FROM store_post_map
GROUP BY store_id
) sp ON sp.store_id = s.id
ORDER BY s.country, sp.post_ct DESC;
This works because the window function sum() is applied before DISTINCT ON per definition.