Nested SQL Query Optimization - postgresql

is there any better way to write this query to be optimized?
SELECT * FROM data d
WHERE d.id IN (SELECT max(d1.id)
FROM data d1
WHERE d1.name='A'
AND d1.version='2')
I am not so good with SQL.

With PostgreSQL v13, you can do it like this:
SELECT * FROM data
WHERE name = 'A'
AND version = '2'
ORDER BY is DESC
FETCH FIRST 1 ROWS WITH TIES;
That will give you all rows where id is the maximum.
If id is unique, you can use FETCH FIRST 1 ROWS ONLY or LIMIT 1, which will also work with older PostgreSQL versions.

Apart from other answers that are equally interesting / correct, IN is typically a non-performant keyword. You can remove it by using a slightly different way of writing your own query:
SELECT * FROM data d
WHERE d.name = 'A' and d.version = '2' and
d.id = (SELECT max(d1.id) FROM data d1 WHERE d1.name='A' AND d1.version='2')

Related

SQL left join on maximum date

I have two tables: contracts and contract_descriptions.
On contract_descriptions there is a column named contract_id which is equal on contracts table records.
I am trying to join the latest record on contract_descriptions:
SELECT *
FROM contracts c
LEFT JOIN contract_descriptions d ON d.contract_id = c.contract_id
AND d.date_description =
(SELECT MAX(date_description)
FROM contract_descriptions t
WHERE t.contract_id = c.contract_id)
It works, but is it the performant way to do it? Is there a way to avoid the second SELECT?
You could also alternatively use DISTINCT ON:
SELECT * FROM contracts c LEFT JOIN (
SELECT DISTINCT ON (cd.contract_id) cd.* FROM contract_descriptions cd
ORDER BY cd.contract_id, cd.date_description DESC
) d ON d.contract_id = c.contract_id
DISTINCT ON selects only one row per contract_id while the sort clause cd.date_description DESC ensures that it is always the last description.
Performance depends on many values (for example, table size). In any case, you should compare both approaches with EXPLAIN.
Your query looks okay to me. One typical way to join only n rows by some order from the other table is a lateral join:
SELECT *
FROM contracts c
CROSS JOIN LATERAL
(
SELECT *
FROM contract_descriptions cd
WHERE cd.contract_id = c.contract_id
ORDER BY cd.date_description DESC
FETCH FIRST 1 ROW ONLY
) cdlast;

MYSQL- query too slow to load

My query is working but it takes time to display the data. Can you help me to make it quick.
$sql="SELECT allinvty3.*, stock_transfer_tb.* from stock_transfer_tb
INNER JOIN allinvty3 on stock_transfer_tb.in_code = allinvty3.in_code
where stock_transfer_tb.in_code NOT IN (SELECT barcode.itemcode from barcode where stock_transfer_tb.refnumber = barcode.refitem)";
I would recommend using the following query:
SELECT
a.*,
s.*
FROM stock_transfer_tb s
INNER JOIN allinvty3 a
ON s.in_code = a.in_code
WHERE
NOT EXISTS (SELECT 1 FROM barcode b
WHERE s.refnumber = b.refitem AND s.in_code = b.itemcode);
If this still doesn't give you the performance you want, then you should look into adding indices on all columns involved in the join and where clause.

UPDATE column from one table to another

I need to update a column in a table to the latest date/time combination from another table. How can I get the latest date/time combination from the one table and then update a column with that date in another table?
The two tables I am using are called dbo.DD and dbo.PurchaseOrders. The JOIN between the two tables are dbo.DueDate.XDORD = dbo.PurchaseOrders.PBPO AND dbo.DueDate.XDLINE = dbo.PurchaseOrders.PBSEQ. The columns from dbo.DueDate that I need the latest date/time from are dbo.DueDate.XDCCTD and dbo.DueDate.XDCCTT.
I need to set dbo.PurchaseOrders.PBDUE = dbo.DueDate.XDCURDT.I can't use an ORDER BY statement in the UPDATE statement, so I'm not sure how to do this. I know row_number sometimes works in these situations, but I'm unsure of how to implement.
The general pattern is:
;WITH s AS
(
SELECT
key, -- may be multiple columns
date_col,
rn = ROW_NUMBER() OVER
(
PARTITION BY key -- again, may be multiple columns
ORDER BY date_col DESC
)
FROM dbo.SourceTable
)
UPDATE d
SET d.date_col = s.date_col
FROM dbo.DestinationTable AS d
INNER JOIN s
ON d.key = s.key -- one more time, may need multiple columns here
WHERE s.rn = 1;
I didn't try to map your table names and columns because (a) I didn't get from your word problem which table was the source and which was the destination and (b) those column names look like alphabet soup and I would have screwed them up anyway.
Did seem though that the OP got this specific code working:
;WITH s AS
(
SELECT
XDORD, XDLINE,
XDCURDT,
rn = ROW_NUMBER() OVER
(
PARTITION BY XDORD, XDLINE
ORDER BY XDCCTD DESC, XDCCTT desc
)
FROM dbo.DueDate
)
UPDATE d
SET d.PBDUE = s.XDCURDT
FROM dbo.PurchaseOrders AS d
INNER JOIN s
ON d.PBPO = s.XDORD AND d.PBSEQ = s.XDLINE
WHERE s.rn = 1;

T-SQL query one table, get presence or absence of other table value

I'm not sure what this type of query is called so I've been unable to search for it properly. I've got two tables, Table A has about 10,000 rows. Table B has a variable amount of rows.
I want to write a query that gets all of Table A's results but with an added column, the value of that column is a boolean that says whether the result also appears in Table B.
I've written this query which works but is slow, it doesn't use a boolean but rather a count that will be either zero or one. Any suggested improvements are gratefully accepted:
SELECT u.number,u.name,u.deliveryaddress,
(SELECT COUNT(productUserid)
FROM ProductUser
WHERE number = u.number and productid = #ProductId)
AS IsInPromo
FROM Users u
UPDATE
I've run the query with actual execution plan enabled, I'm not sure how to show the results but various costs are:
Nested Loops (left semi join): 29%]
Clustered Index scan (User Table): 41%
Clustered Index Scan (ProductUser table): 29%
NUMBERS
There are 7366 users in the users table and currently 18 rows in the productUser table (although this will change and could be in the thousands)
You can use EXISTS to short circuit after the first row is found rather than COUNT-ing all matching rows.
SQL Server does not have a boolean datatype. The closest equivalent is BIT
SELECT u.number,
u.name,
u.deliveryaddress,
CASE
WHEN EXISTS (SELECT *
FROM ProductUser
WHERE number = u.number
AND productid = #ProductId) THEN CAST(1 AS BIT)
ELSE CAST(0 AS BIT)
END AS IsInPromo
FROM Users u
RE: "I'm not sure what this type of query is called". This will give a plan with a semi join. See Subqueries in CASE Expressions for more about this.
Which management system are you using?
Try this:
SELECT u.number,u.name,u.deliveryaddress,
case when COUNT(p.productUserid) > 0 then 1 else 0 end
FROM Users u
left join ProductUser p on p.number = u.number and productid = #ProductId
group by u.number,u.name,u.deliveryaddress
UPD: this could be faster using mssql
;with fff as
(
select distinct p.number from ProductUser p where p.productid = #ProductId
)
select u.number,u.name,u.deliveryaddress,
case when isnull(f.number, 0) = 0 then 0 else 1 end
from Users u left join fff f on f.number = u.number
Since you seem concerned about performance, this query can perform faster as this will cause index seek on both tables versus an index scan:
SELECT u.number,
u.name,
u.deliveryaddress,
ISNULL(p.number, 0) IsInPromo
FROM Users u
LEFT JOIN ProductUser p ON p.number = u.number
WHERE p.productid = #ProductId

Table cardinality information in Firebird

I am new to Firebird and I am messing around in it's meta data to get some information about the table structure and etc.
My problem is that I can't seem to find some information about estimated table cardinality. Is there a way to get this information from Firebird?
Edit:
By cardinality i mean the number of rows in a table :) and for my use the select count(*) is not an option.
You can use an aproximative method, using the selectivity of primary key like this:
SELECT
R.RDB$RELATION_NAME TABLENAME,
(
CASE
WHEN I.RDB$STATISTICS = 0 THEN 0
ELSE 1 / I.RDB$STATISTICS
END) AS COUNTRECORDS8
FROM RDB$RELATIONS R
JOIN RDB$RELATION_CONSTRAINTS C ON (R.RDB$RELATION_NAME = C.RDB$RELATION_NAME AND C.RDB$CONSTRAINT_TYPE = 'PRIMARY KEY')
JOIN RDB$INDICES I ON (I.RDB$RELATION_NAME = C.RDB$RELATION_NAME AND I.RDB$INDEX_NAME = C.RDB$INDEX_NAME)
To get the number of rows in a table you use the COUNT() function as in any other SQL DB, ie
SELECT count(*) FROM table;
Why not use a query:
select count(distinct field_name)/(count(field_name) + 0.0000) from table_name
The closer result to 1 the higher cardinality of a specified column.