Understanding Postgres Query - postgresql

Regarding the difference between...
select * from table_a where id != 30 and name != 'Kevin';
and
select * from table_a where id != 30 or name != 'Kevin';
The first one means, "select all rows from table_a where the id is not 30 and the name is not 'Kevin'".
So {Id, Name} row of {30, 'Bill'} would be returned from this first query.
But, the second one means, "select all rows from table_a where the id is not 30 or the name is not 'Kevin'".
So the above {30, 'Bill'} would not be returned from this second query.
Is that right?

select * from table_a where id != 30 and name != 'Kevin';
So {Id, Name} row of {30, 'Bill'} would be returned from this first
query.
No, it wouldn't.
select * from table_a where id != 30 or name != 'Kevin';
So the above {30, 'Bill'} would not be returned from this second
query.
No, it would. You have the logic backwards. Just try it.

Nope. The second query means "select all rows where the id is not 30 or the name is not 'Kevin'", hence a name of 'Bill' qualifies the record for inclusion in the query.

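Recap, with A = (id = 30) and B = (name = 'Kevin'); the AND and OR columns combine not(A) and not(B):
A | B | not(A) | not(B) | AND | OR
1 | 1 | 0      | 0      | 0   | 0
1 | 0 | 0      | 1      | 0   | 1
0 | 1 | 1      | 0      | 0   | 1
0 | 0 | 1      | 1      | 1   | 1
So the two queries treat a row the same way only if:
1- id=30 and name='Kevin' (both queries exclude it)
or
2- id!=30 and name!='Kevin' (both queries include it)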

Quick logic expression transformation tip:
NOT (A AND B) == NOT A OR NOT B
NOT (A OR B) == NOT A AND NOT B
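To check this without a Postgres instance at hand, here is a minimal sketch using Python's built-in sqlite3 module. The four sample rows are made up for illustration; only the table name and the two WHERE clauses come from the question.

```python
import sqlite3

# In-memory database with one row for every combination of
# id = 30 / id != 30 and name = 'Kevin' / name != 'Kevin' (sample data, assumed).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?)",
    [(30, "Kevin"), (30, "Bill"), (31, "Kevin"), (31, "Bill")],
)

def run(where_clause):
    """Return the rows matching the given WHERE clause, in a stable order."""
    sql = f"SELECT id, name FROM table_a WHERE {where_clause} ORDER BY id, name"
    return conn.execute(sql).fetchall()

and_rows = run("id != 30 AND name != 'Kevin'")
or_rows = run("id != 30 OR name != 'Kevin'")
# De Morgan: NOT (A AND B) is the same as NOT A OR NOT B.
demorgan_rows = run("NOT (id = 30 AND name = 'Kevin')")

print(and_rows)   # only the row where both conditions hold
print(or_rows)    # every row except (30, 'Kevin')
```

The row (30, 'Bill') shows up only in the OR result, which matches the answers above.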

Related

PostgreSQL Query not returning the proper results

So this is my table structure
learning_paths: id, name, version, created_at, updated_at
learning_path_levels: id, name, learning_path_id, order, created_at, updated_at
learning_path_level_nodes: id, name, description, documentation_links, evaluation_methodology, learning_path_level_id, created_at, updated_at
learning_path_node_users: id, learning_path_level_node_id, user_id, evaluated_by, evaluated_at, is_successful, created_at, updated_at
I'm writing a query to retrieve the learning path name, the number of levels each learning path has, the pending and completed nodes per level for the user, and the total number of nodes per level.
I have the following query:
select learning_paths."name",
sum(case when learning_path_node_users.is_successful and learning_path_node_users.user_id is not null then 1 else 0 end) as completed_nodes,
sum(case when learning_path_node_users.is_successful = false or learning_path_node_users.user_id is null then 1 else 0 end) as pending_nodes,
count(learning_path_levels.id) as total_levels,
count(*) as total_nodes
from learning_path_level_nodes
inner join learning_path_levels on learning_path_levels.id = learning_path_level_nodes.learning_path_level_id
inner join learning_paths on learning_paths.id = learning_path_levels.learning_path_id
left join learning_path_node_users on learning_path_node_users.learning_path_level_node_id = learning_path_level_nodes.id
group by learning_paths."name"
which returns:
name            | completed_nodes | pending_nodes | total_levels | total_nodes
Devops          | 5               | 3             | 8            | 8
QA              | 0               | 1             | 1            | 1
Project manager | 3               | 3             | 6            | 6
AI              | 0               | 5             | 5            | 5
Everything is correct except for the levels count.
For example, for Devops it should be 2, and it is returning 8.
For Project manager it should be 2, and it is returning 6.
A pattern I see is that it returns the number of nodes as the number of levels.
How can I fix this?
I'd really appreciate any help or suggestions, as I've been struggling with this.
Thanks in advance.
EDIT: As per your suggestion, I'm attaching a fiddle with the tables and data.
https://dbfiddle.uk/?rdbms=postgres_14&fiddle=f29676ff7051686a28de96928db1e3a6
While I don't get the exact results you want, I think you want to add a distinct to your count for the total levels:
select
lp.name,
sum(case when u.is_successful and u.user_id is not null then 1 else 0 end) as completed_nodes,
sum(case when u.is_successful = false or u.user_id is null then 1 else 0 end) as pending_nodes,
count(distinct lpl.id) as total_levels, -- added "distinct"
array_agg (lpl.id) as level_detail, -- debugging aid
count(*) as total_nodes
from
learning_path_level_nodes n
join learning_path_levels lpl on lpl.id = n.learning_path_level_id
join learning_paths lp on lp.id = lpl.learning_path_id
left join learning_path_node_users u on u.learning_path_level_node_id = n.id
group by
lp.name
To help expose the rationale, I added the field level_detail, which shows why the results are what they are. You can remove it once the results are what you want.
If it's not what you expect, perhaps you can explain or give by example what I might be missing.
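The effect of the added "distinct" can be seen on a much smaller example. The sketch below uses Python's built-in sqlite3 module with two simplified, hypothetical tables (levels and nodes) standing in for learning_path_levels and learning_path_level_nodes; the data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-ins for the real tables (assumed schema and data).
    CREATE TABLE levels (id INTEGER PRIMARY KEY, path_name TEXT);
    CREATE TABLE nodes  (id INTEGER PRIMARY KEY, level_id INTEGER);
    INSERT INTO levels VALUES (1, 'Devops'), (2, 'Devops'), (3, 'QA');
    INSERT INTO nodes  VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 3);
""")

# After the join there is one row per node, so COUNT(l.id) counts nodes.
# COUNT(DISTINCT l.id) counts each level once, however many nodes it has.
rows = conn.execute("""
    SELECT l.path_name,
           COUNT(l.id)          AS levels_without_distinct,
           COUNT(DISTINCT l.id) AS levels_with_distinct,
           COUNT(*)             AS total_nodes
    FROM nodes n
    JOIN levels l ON l.id = n.level_id
    GROUP BY l.path_name
    ORDER BY l.path_name
""").fetchall()

for row in rows:
    print(row)
```

Without "distinct" the level count simply repeats the node count, which is the pattern described in the question.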

How to select rows after using row_number() in postgresql

I have to select the null values one by one from a column with some null values in it. I have to select them one by one because I want to update all of them with different values.
I chose to do it with row_number() after running out of other options. Here is the query that I executed:
select cid, ROW_NUMBER () OVER (ORDER BY random()) as row from aa_dev.calls where cid is null;
How can I pick each row without storing it in any temp table and update each row?
This column has 100 values: 96 are populated with integers and only 4 are null.
I need to populate row_number as follows. For example, say there are 10 values in this data and 3 of them are null:
cid  | row
1    | 0
54   | 0
null | 1
26   | 0
86   | 0
45   | 0
null | 2
56   | 0
null | 3
5    | 0
Two possible ways came to mind:
demos:db<>fiddle
The first way uses row_number() over two partitions, the rows where cid IS NULL and the rows where cid IS NOT NULL. The row number is kept only for the records that are NULL; all others are set to 0:
SELECT
*,
CASE WHEN cid IS NULL THEN
row_number() OVER (PARTITION BY cid IS NULL)
ELSE 0 END
FROM
mytable
The second way is simply using a UNION construct over the rows where cid IS NULL and those where cid IS NOT NULL:
SELECT
cid,
row_number() OVER ()
FROM mytable
WHERE cid IS NULL
UNION ALL -- plain UNION would also collapse duplicate (cid, 0) rows
SELECT
cid,
0
FROM mytable
WHERE cid IS NOT NULL
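As a quick way to try the first approach locally, here is a sketch using Python's built-in sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (needed for window functions); the table name mytable is taken from the answer and the sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (cid INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [(1,), (54,), (None,), (26,), (None,), (56,), (None,), (5,)],
)

# Number the rows inside the "cid IS NULL" partition and force 0 elsewhere.
rows = conn.execute("""
    SELECT cid,
           CASE WHEN cid IS NULL
                THEN row_number() OVER (PARTITION BY cid IS NULL)
                ELSE 0
           END AS rn
    FROM mytable
""").fetchall()

# The NULL rows get consecutive numbers; every other row gets 0.
null_numbers = sorted(rn for cid, rn in rows if cid is None)
other_numbers = {rn for cid, rn in rows if cid is not None}
print(null_numbers, other_numbers)
```

Which NULL row receives which number is not defined here, because the window has no ORDER BY; only the set of numbers is guaranteed.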

Postgresql Query Results in Division by 0 After Use of Case to Check for 0

The following query is using a subquery to allow for a weighted value to be calculated. The problem I am receiving is a division by 0 error that occurs at random for true 0 value aggregates as well as possible >0 aggregate returns from the subquery.
SELECT
table1.id,
SUM(subquery1.total_value_1),
CASE
WHEN SUM(subquery1.total_value_1) = 0 THEN 0
ELSE ROUND(SUM(percentage_value * (table1.value_1 /subquery1.total_value_1 ::FLOAT)) ::NUMERIC,2)
END AS percentage_value
FROM
table1,
(SELECT
id,
SUM(value_1) AS total_value_1
FROM
table1
WHERE
report_time BETWEEN '2016-10-28 00:00' AND '2016-10-29 23:59'
GROUP BY
id
) subquery1
WHERE
table1.id = subquery1.id
AND report_time BETWEEN '2016-10-28 00:00' AND '2016-10-29 23:59'
AND table1.id = 12572
GROUP BY
table1.id
ORDER BY
table1.id
In some instances, the Case statement is still doing the evaluation of the division despite the value of subquery1.total_value_1 being 0. Just to note, there is no possibility for subquery1.total_value_1 being NULL, as the table defaults this value to 0 on insert if the value added is not defined.
In the example below, sum(column) is 1 for both rows, while the column itself is zero in one row and one in the other:
a=# with v as (
select generate_series(0,1,1) al
)
select sum(v.al) over(),v.al
from v;
sum | al
-----+----
1 | 0
1 | 1
(2 rows)
So in your query, SUM(subquery1.total_value_1) can be nonzero while an individual subquery1.total_value_1 ::FLOAT is zero, and that is how you get the division by zero.
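The same distinction between the aggregate and the per-row value can be reproduced with Python's built-in sqlite3 module, as sketched below. Two things here are assumptions rather than part of the answer: the NULLIF guard around the divisor is only one possible way to avoid dividing by a zero row value, and SQLite returns NULL for a division by zero instead of raising an error as Postgres does, so this illustrates the values only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two rows, al = 0 and al = 1, mirroring generate_series(0, 1, 1).
# total is the sum over all rows; guarded divides by the per-row value,
# with NULLIF turning a zero divisor into NULL before the division happens.
rows = conn.execute("""
    WITH v(al) AS (VALUES (0), (1))
    SELECT SUM(al) OVER ()      AS total,
           al,
           1.0 / NULLIF(al, 0)  AS guarded
    FROM v
    ORDER BY al
""").fetchall()

for total, al, guarded in rows:
    print(total, al, guarded)
```

The total is 1 on both rows, so a check on the sum says nothing about whether the row-level divisor is zero.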

TSQL Update value to 1 if Max value between two table else 0

I have two tables
TABLE 1 : Stage_product
PRODUCT_ID SYS_ROWDATETIMEUTC
1 2015-03-13 06:09:30.040
2 ....
3
TABLE 2 : DIM_Product
PRODUCT_ID SYS_ROWSTARTDATETIMEUTC SYS_ROWISCURRENT
1 2014-03-13 06:09:30.040 0
2 2015-03-13 06:09:30.040 1
I want to write an update statement so that if the value SYS_ROWDATETIMEUTC in the first table is more recent than the value SYS_ROWSTARTDATETIMEUTC in the second table, then SYS_ROWISCURRENT in the second table is set to 0, else 1.
You can use the following query:
UPDATE t2
SET t2.SYS_ROWISCURRENT = CASE
WHEN t1.SYS_ROWDATETIMEUTC > t2.SYS_ROWSTARTDATETIMEUTC THEN 0
ELSE 1
END
FROM DIM_Product t2 -- "Table 2" in the question
INNER JOIN Stage_product t1 ON t2.PRODUCT_ID = t1.PRODUCT_ID -- "Table 1" in the question
I assume you want to compare dates between the two tables for the same product.

TSQL - Mapping one table to another without using cursor

I have tables with following structure
create table Doc(
id int identity(1, 1) primary key,
DocumentStartValue varchar(100)
)
create table Metadata (
DocumentValue varchar(100),
StartDesignation char(1),
PageNumber int
)
GO
Doc contains
id DocumentStartValue
1000 ID-1
1100 ID-5
2000 ID-8
3000 ID-9
Metadata contains
Documentvalue StartDesignation PageNumber
ID-1 D 0
ID-2 NULL 1
ID-3 NULL 2
ID-4 NULL 3
ID-5 D 0
ID-6 NULL 1
ID-7 NULL 2
ID-8 D 0
ID-9 D 0
What I need is to map Metadata.DocumentValue to Doc.id
So the result I need is something like
id DocumentValue PageNumber
1000 ID-1 0
1000 ID-2 1
1000 ID-3 2
1000 ID-4 3
1100 ID-5 0
1100 ID-6 1
1100 ID-7 2
2000 ID-8 0
3000 ID-9 0
Can it be achieved without the use of cursor?
Something like this (sorry, I can't test it):
;WITH RowList AS
( --assign RowNums to each row...
SELECT
ROW_NUMBER() OVER (ORDER BY id) AS RowNum,
id, DocumentStartValue
FROM
doc
), RowPairs AS
( --pair each row with the next row to create ranges; LEFT JOIN keeps the last document, whose range is open-ended
SELECT
R.DocumentStartValue AS [Start], R.id,
R1.DocumentStartValue AS [End] --bracketed because END is a reserved word
FROM
RowList R LEFT JOIN RowList R1 ON R.RowNum + 1 = R1.RowNum
)
--use ranges to join back and get the data
SELECT
RP.id, M.DocumentValue, M.PageNumber
FROM
RowPairs RP
JOIN
Metadata M ON RP.[Start] <= M.DocumentValue AND (RP.[End] IS NULL OR M.DocumentValue < RP.[End])
Edit: This assumes that you can rely on the ID-x values matching and being ascending. If so, StartDesignation is superfluous/redundant and may conflict with the Doc table DocumentStartValue
with rm as
(
select DocumentValue
,PageNumber
,case when StartDesignation = 'D' then 1 else 0 end as IsStart
,row_number() over (order by DocumentValue) as RowNumber
from Metadata
)
,gm as
(
select
DocumentValue as DocumentGroup
,DocumentValue
,PageNumber
,RowNumber
from rm
where RowNumber = 1
union all
select
case when rm.IsStart = 1 then rm.DocumentValue else gm.DocumentGroup end
,rm.DocumentValue
,rm.PageNumber
,rm.RowNumber
from gm
inner join rm on rm.RowNumber = (gm.RowNumber + 1)
)
select d.id, gm.DocumentValue, gm.PageNumber
from Doc d
inner join gm on d.DocumentStartValue = gm.DocumentGroup
Try the query above (you may also need to add option (maxrecursion ...)) and add an index on DocumentValue for the Metadata table. Also, if it's possible, it would be better to save the appropriate group on the Metadata rows when inserting them.
UPD: I've tested it and fixed the errors in my query; now it works and gives the result shown in the initial question.
UPD2: And recommended indexes:
create clustered index IX_Metadata on Metadata (DocumentValue)
create nonclustered index IX_Doc_StartValue on Doc (DocumentStartValue)