Is there a way to see if a limit offset query has reached the end with pg-promise? - postgresql

I have a table of posts that I would like to query as pages. To keep my endpoints stateless, I would like to do this with offset and limit, like this:
SELECT * FROM post ORDER BY id LIMIT 50 OFFSET $1
Where $1 would be the page number times the page size (50). The easy way to check whether we have reached the end would be to see if we got fewer than 50 rows back. The problem, of course, is that when the total number of posts is divisible by 50, the last page contains exactly 50 rows and looks like any other, so we can't be sure.
The way I have solved this until now is by simply fetching 51 posts per query, with the page size still being 50. That way, if the query returns fewer than 51 rows, we have reached the end.
Unfortunately, this seems like a very hacky way to do it. So I was wondering: is there some feature within pg-promise or PostgreSQL that indicates that I have reached the end of a table, without resorting to tricks like this?

The simplest method with the lowest overhead I have found:
You can request pageLimit+1 rows on every page request. In your controller you check whether the returned row count is greater than pageLimit; if so, you know there is more data available. Of course, before returning the rows, you need to remove that extra last element and send something like a hasNext boolean along with the rows.
It is usually far cheaper for the DB to retrieve one extra row of data than to count all rows, or to make an extra request for page+1 just to check whether it returns any rows.
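As a minimal sketch of the idea, using the post table and $1 offset from the question:
-- Ask for one row more than the page size; row 51 only signals
-- that a next page exists and is stripped before responding
SELECT *
FROM post
ORDER BY id
LIMIT 51 OFFSET $1;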

Well, there is no built-in feature for this directly. But you can count the rows and add that count to the results. You could then even show the user the number of items or the number of pages:
-- Item count
with pc(cnt) as (select count(*) from post)
select p.*, pc.cnt
from post p
cross join pc
order by p.id  -- keep the pages deterministic
limit 50 offset $1;
-- Page count
with pc(cnt) as (select count(*)/50 + ((count(*)%50)>0)::int from post)
select p.*, pc.cnt
from post p
cross join pc
order by p.id
limit 50 offset $1;
Caution: the count function can be slow, and even when it is not, it adds to the response time. Is it worth the additional overhead? Only you and your users can answer that.

This method works well only in a specific setting: an SPA that caches network requests, where you want pagination to feel faster through pre-fetching.
On every page, you make two requests: one for the current page's data and one for the next page's data.
It works if, for example, you use a React single-page application with react-query, where the next page will not be refetched but reused when the user opens it. It even makes the user interface snappier, as the transition to the next page is always instant.
Otherwise, if the next page is not reused, this is worse than checking the total number of rows to determine whether any rows are left, as you make two requests for every page.
This method works well if you have many page transitions, since the total number of calls is numberOfPages+1: if users visit 10 pages on average, that is 10+1 = 11 calls, or just 10% overhead. But if your users usually do not go beyond the first page, it makes little sense, as a single page then costs 1+1 = 2 calls.

Related

ListView: instantly updating and time comparison

I have to work on an application in which we get a list of orders (with all details and a display_time) and we have to show them in a ListView, but the condition is that we have to show each order at its exact display_time.
For example, here are some orders with display times:
order_id: 101 | display_time (hh:mm:ss): 09:10:00
order_id: 102 | display_time (hh:mm:ss): 09:30:00
Then the requirements are:
We have to show the orders in the list at exactly their display time.
All orders should appear instantly, as soon as they are entered in the database.
Edit
The first thing that I need is:
to get the order from the database (SQL Server) instantly, without hitting any API, like a push notification.
Then the second need is:
to compare the device's time with the order's display_time and, if they match, make the order visible in the ListView. I think I have to do this for each order.
I don't know how I can do this.
So please suggest how we can do the above task.
I hope I understand correctly: mainly, you want the list of items to be ordered by time from earliest to latest.
Here is one algorithm; it could be enhanced.
Allocate 100 (an arbitrary number) items/rows in the ListView.
If you only have 2 items or orders initially, make the other 98 rows not visible (a visibility state).
When a new order is entered, add it into one of the non-visible rows, and make it visible of course.
The issue in this case is that you may have to reorder the items above the new order; however, this is only a manipulation of the text in your rows. A data structure is necessary to support this.
I claim this algorithm is fast. I think this is a good start.

Is there any way to avoid PostgreSQL placing the updated row as the last row?

Well, my problem is that each time I update a row, the row moves to the last place in the table, regardless of where it was placed before.
I've read in this post, Postgresql: row number changes on update, that rows in a relational table are not sorted. Then why, when I execute select * from table, do I always get the same order?
Anyway, I don't want to start a discussion about that; I just want to know whether there is any way to keep an UPDATE statement from moving the row to the last place.
Edit for more info:
I don't really want to get all the results at once. I have programmed 2 buttons in Java, next and previous, and, still being a beginner, the only way I had to get the next or previous row was to use select * from table limit 1 with offset num++ or offset num-- depending on the button clicked. So, when I execute the update, I lose the initial (insertion) order.
Thanks.
You could make some space in your tables for updates: change the fillfactor from the default of 100% (complete packing, with no space for updates left on a page) to something lower, to create space for updated row versions.
From the manual (create table):
fillfactor (integer)
The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default. When a smaller fillfactor is specified, INSERT operations pack table pages only to the indicated percentage; the remaining space on each page is reserved for updating rows on that page. This gives UPDATE a chance to place the updated copy of a row on the same page as the original, which is more efficient than placing it on a different page. For a table whose entries are never updated, complete packing is the best choice, but in heavily updated tables smaller fillfactors are appropriate. This parameter cannot be set for TOAST tables.
But without an ORDER BY in your query, there is no guarantee that the result set will be sorted the way you expect. No fillfactor can change that.
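A short sketch of both points, assuming a table named mytable with a serial id column that reflects insertion order (hypothetical names):
-- Reserve 30% of each page for updated row versions
-- (applies to future writes; existing pages are repacked only on a rewrite)
ALTER TABLE mytable SET (fillfactor = 70);
-- The reliable fix for the next/previous buttons: an explicit order
SELECT * FROM mytable ORDER BY id LIMIT 1 OFFSET 5; -- 5 = current value of num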

Range query with no dups

I have a collection that I would like to serve out as 'pages'. The collection could get quite large, and I have read that skip is not optimal in that case. I think range queries will work just fine in my case, so I am going to try that route.
My collection will be sorted and paged on a timestamp field. I have implemented the API so that a user passes in a startDate and I return a certain number of items ('limit', max of 1000). However, I am struggling with how to avoid duplicates across pages when documents have the same time.
As an example (a small page size, to make it easy):
I have 6 documents; say docs 3 and 4 have the same time. If I ask for page one with a page size of three, I get the first three. But when I ask for page two with a startDate that is 'gte' the time of the last doc on page one, I get a duplicate: the last doc from page one is the same as the first doc on page two.
I cannot find a range query example anywhere that deals with dates without returning duplicates.
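The usual fix is keyset pagination on a compound key: sort on the timestamp plus a unique tiebreaker (such as the document id) and request everything strictly after the last (timestamp, id) pair of the previous page. A sketch of the idea in SQL terms, with hypothetical table and column names (in MongoDB the same tuple comparison becomes an $or of $gt conditions combined with a sort on both fields):
-- Rows sharing the boundary timestamp are disambiguated by id,
-- so no document is repeated or skipped between pages
SELECT *
FROM events
WHERE (ts, id) > (:lastTs, :lastId)
ORDER BY ts, id
LIMIT 1000;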

Update depending on the row number SQL Server 2008

This one is kind of weird and my lack of experience has me asking this.
I have an update to do, and the tables I'm grabbing this data from are put together so badly that it makes things a bit difficult.
The scenario:
There can be 1 to x visits per patient. I want to grab the last visit. Here is where the problem is: one patient can have two or three IDs. These IDs are linked to ONE ID to help migrate them over to a new database under one ID.
Now I've tried TOP 1 in a CROSS APPLY and joining on a max ID. I can get some of it to work, but not all. So I used a row_number ranking to get how many times a particular person visited. However, I have to run a pass of the update for each visit to get to the last one, as each pass overwrites the previous.
Is there a way to use row_number() over (partition by B.Uid order by B.Uid) PID
so that I don't have to run a pass where pid = 1, then another pass where pid = 2, and so on?
I am thinking there must be a way to have it do one pass, either by setting up some while loop or by finding the highest PID and updating from that.
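One way to get it down to a single pass is to compute the row_number() in a CTE, ordered by something meaningful such as the visit date, and update only the rows ranked first. A sketch with hypothetical table and column names:
-- Rank each patient's visits newest-first under the unified ID,
-- then update from only the latest visit (PID = 1) in one pass
WITH LastVisit AS (
    SELECT B.Uid,
           B.VisitDate,
           ROW_NUMBER() OVER (PARTITION BY B.Uid
                              ORDER BY B.VisitDate DESC) AS PID
    FROM Visits AS B
)
UPDATE P
SET    P.LastVisitDate = LV.VisitDate
FROM   Patients AS P
JOIN   LastVisit AS LV
  ON   LV.Uid = P.Uid AND LV.PID = 1;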

Pagination on large data sets? – Abort count(*) after a certain time

We use the following pagination technique here:
get count(*) of given filter
get first 25 records of given filter
-> render some pagination links on the page
This works pretty well as long as count(*) is reasonably fast. In our case the data has grown to a point where a non-indexed query (although most things are covered by indices) takes more than a minute. So at this point the user is waiting for a mostly unimportant number (the total records matching the filter, the number of pages), while the first N records are often ready pretty fast.
Therefore I have two questions:
can I limit the count(*) to a certain number,
or would it be possible to limit it by time? (no count known after, say, 20 ms)
Or, just in general: are there some easy ways to avoid this problem? We would like to keep the system as untouched as possible.
Database: Oracle 10g
Update
There are several scenarios:
a) there's an index -> neither count(*) nor the actual select should be a problem
b) there's no index:
- count(*) is HUGE and takes ages to determine -> rownum would help
- count(*) is zero or very low; here a time limit would help. Or I could simply skip the count(*) when the result set is already below the page limit.
You could use 'where rownum < x' to limit the number of rows counted. And if you need to show your users that there are more records, you could use x+1 in the count just to see whether there are more than x records.
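A sketch of that capped count, assuming a cap of x = 500 and hypothetical table and filter names:
-- Oracle stops scanning once 501 matching rows have been found;
-- a result of 501 simply means "more than 500"
SELECT COUNT(*) AS capped_cnt
FROM (SELECT 1
      FROM orders
      WHERE status = 'OPEN'
        AND ROWNUM <= 501);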