I was searching the net for an alternative to MySQL's FOUND_ROWS(): how to get the total number of matching rows when using LIMIT and a WHERE clause. I need this for pagination.
I found a few different approaches, but I don't have much experience (I migrated to PostgreSQL a week ago). Which approach will give the best performance?
SELECT stuff, count(*) OVER() AS total_count
FROM table
WHERE condition
ORDER BY stuff OFFSET 40 LIMIT 20
BEGIN;
SELECT * FROM mytable OFFSET X LIMIT Y;
SELECT COUNT(*) AS total FROM mytable;
END;
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT id, username, title, date FROM posts ORDER BY date DESC LIMIT 20;
SELECT count(*) AS total FROM posts;
END;
select * into tmp_tbl from tbl where [limitations];
select * from tmp_tbl offset 10 limit 10;
select count(*) from tmp_tbl;
drop table tmp_tbl;
If there is another approach that is not described here and would give better performance, please let me know.
I am using PostgreSQL version 9.3.4 and PDO for PHP.
How can we convert the Oracle query below into Postgres?
SELECT EMPLOYEE_CD AS EMPLOYEE_CD,
EMPLOYEE_ID AS EMPLOYEE_ID,
TO_NUMBER(COUNT (1)) AS CNT
FROM EMPLOYEE
GROUP BY EMPLOYEE_CD, EMPLOYEE_ID
HAVING COUNT (1) > 1;
It makes no sense to convert a number to a number (and it is just as senseless in Oracle). So just remove the to_number() from your query:
SELECT EMPLOYEE_CD AS EMPLOYEE_CD,
EMPLOYEE_ID AS EMPLOYEE_ID,
COUNT (*) AS CNT
FROM EMPLOYEE
GROUP BY EMPLOYEE_CD, EMPLOYEE_ID
HAVING COUNT (*) > 1;
It has been a long-standing, but nevertheless wrong, myth that count(1) is faster than count(*). In Postgres, count(1) is actually slower than count(*), so using count(1) never makes sense and does not improve anything.
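If you want to check this claim on your own data, a quick comparison along these lines (reusing the EMPLOYEE table from the question) should show count(*) coming out at least as fast as count(1); the timings appear in the plan output:
-- run each a few times and compare the "Execution Time" lines
EXPLAIN ANALYZE SELECT count(*) FROM employee;
EXPLAIN ANALYZE SELECT count(1) FROM employee;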
I am working with a table that has the following columns:
(PK) sales_log_id
user_id
created_at
where each sales_log_id represents a transaction performed by a user.
I have been able to query how many users have a given number of transactions.
Now I would like to find out how many users have, e.g., more than 10 and fewer than 20 transactions in a certain period of time.
Being new to databases and Postgres, I have learned that you can run a query and then run another query on its result (a subquery). So I tried to first query which users have fewer than 30 transactions in June, and then query that result for users with more than 10 transactions.
SELECT COUNT(DISTINCT t.user_id) usercounter
FROM (
SELECT user_id, created_at, sales_log_id
FROM sales_log
WHERE created_at BETWEEN
'2019-06-01' AND '2019-06-30'
GROUP BY user_id, created_at, sales_log_id
HAVING COUNT (sales_log_id) <30
)t
GROUP BY t.user_id
HAVING COUNT (t.sales_log.id) >10;
But it produced an error:
ERROR: missing FROM-clause entry for table "sales_log"
LINE 11: HAVING COUNT (t.sales_log.id) >10;
^
SQL state: 42P01
Character: 359
Can anyone please provide the correct way to do this?
I think it is as simple as
SELECT count(*)
FROM (
SELECT user_id, COUNT(*)
FROM sales_log
WHERE created_at BETWEEN '2019-06-01' AND '2019-06-30'
GROUP BY user_id
HAVING COUNT (sales_log_id) BETWEEN 11 AND 29
) AS q;
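If you literally want "more than 10 and fewer than 20" transactions, as stated in the question, the same shape works with different bounds (a sketch against the same table and date range):
SELECT count(*)
FROM (
    SELECT user_id
    FROM sales_log
    WHERE created_at BETWEEN '2019-06-01' AND '2019-06-30'
    GROUP BY user_id
    HAVING COUNT(sales_log_id) > 10 AND COUNT(sales_log_id) < 20
) AS q;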
Only add DISTINCT to a query if you really need it.
It is just one word, but it can have a big performance penalty.
An application I inherited was built around the, so to say, "natural record flow" of a PostgreSQL table, and it contained this Delphi code:
query.Open('SELECT * FROM TheTable');
query.Last();
The task is to get all the fields of the last record in the table. I decided to rewrite this query in a more efficient way, something like this:
SELECT * FROM TheTable ORDER BY ReportDate DESC LIMIT 1
but it broke the whole workflow: some of the ReportDate values turned out to be NULL. The application really did rely on the "natural" record order in the table.
How can I select the physically last record efficiently, without ORDER BY?
To select the physically last record, you should use ctid, the tuple ID; to get the last one, just select max(ctid). Something like:
t=# select ctid,* from t order by ctid desc limit 1;
ctid | t
--------+-------------------------------
(5,50) | 2017-06-13 11:41:04.894666+00
(1 row)
And to do it without ORDER BY:
t=# select t from t where ctid = (select max(ctid) from t);
t
-------------------------------
2017-06-13 11:41:04.894666+00
(1 row)
It is worth knowing that ctid can only be found with a sequential scan, so fetching the physically last row this way will be costly on large data sets.
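To see why, you can look at the plans for the two queries above (the same table t as in the example); ctid is a system column that cannot be indexed, so both statements have to read every row to find the highest one:
-- both plans include a sequential scan over t
EXPLAIN SELECT ctid, * FROM t ORDER BY ctid DESC LIMIT 1;
EXPLAIN SELECT t FROM t WHERE ctid = (SELECT max(ctid) FROM t);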
I want to do a DISTINCT count on a PostgreSQL 9.4 table with about 300,000 records. It takes almost 8 seconds. I read in this post that the following could speed it up, and it really did: down to 0.26 sec.
SELECT COUNT(*) FROM (SELECT DISTINCT column_name FROM table_name) AS temp;
Is much faster than
COUNT(DISTINCT(column_name))
This gives me the result, but I want to add a WHERE clause.
This works but takes over 7 sec.
SELECT COUNT(DISTINCT(species)) FROM darwincore2
WHERE darwincore2.dataeier ILIKE '%nnog%'
This works (0.26 sec.) but fails when I add the WHERE clause:
SELECT COUNT(*) FROM (SELECT DISTINCT species FROM darwincore2) as temp
WHERE darwincore2.dataeier ILIKE '%nnog%'
with:
ERROR: missing FROM-clause entry for table "darwincore2"
Does anyone know how I can fix this? Or am I trying to do something that simply does not work?
The WHERE clause should be in the subquery:
SELECT COUNT(*)
FROM (
SELECT DISTINCT species
FROM darwincore2
WHERE darwincore2.dataeier ILIKE '%nnog%'
) as temp
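If you want to see where the time goes, compare the plans of the two forms on the filtered data; to my understanding, count(DISTINCT ...) de-duplicates by sorting inside the aggregate, while the subquery form is free to use a hash-based plan, which is usually why it wins:
EXPLAIN ANALYZE
SELECT count(DISTINCT species)
FROM darwincore2
WHERE dataeier ILIKE '%nnog%';

EXPLAIN ANALYZE
SELECT count(*)
FROM (
    SELECT DISTINCT species
    FROM darwincore2
    WHERE dataeier ILIKE '%nnog%'
) AS temp;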
Following is my query:
select * from table order by timestamp desc limit 10
this takes too much time compared to
select * from table limit 10
How can I optimize the first query so it gets close to the performance of the second query?
UPDATE: I don't have control over the database server, so I cannot add indexes to gain performance.
Create an index on timestamp.
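A minimal sketch, assuming the table and column really are named table and timestamp as in the question ("table" is a reserved word and must be quoted; "timestamp" is quoted here for safety). A plain ascending btree is enough, since PostgreSQL can scan it backwards to satisfy ORDER BY "timestamp" DESC LIMIT 10:
CREATE INDEX table_timestamp_idx ON "table" ("timestamp");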
Quassnoi is correct -- you need an index on timestamp.
That said, if your timestamp field roughly follows your primary key order (e.g. a date_created or an invoice_date field), you can try this workaround:
select *
from (select * from table order by id desc limit 1000) as table
order by timestamp desc limit 10;
@Nishan is right: there is little you can do. If you do not need every column in the table, you may gain a few milliseconds by explicitly asking for just the columns you need.