How to make postgres (cursor?) start at particular row - postgresql

I have created the following query:
select t.id, t.row_id, t.content, t.location, t.retweet_count, t.favorite_count, t.happened_at,
a.id, a.screen_name, a.name, a.description, a.followers_count, a.friends_count, a.statuses_count,
c.id, c.code, c.name,
t.parent_id
from tweets t
join accounts a on a.id = t.author_id
left outer join countries c on c.id = t.country_id
where t.row_id > %s
-- order by t.row_id
limit 100
Where %s is a number that starts at 0 and is incremented by 100 after each such query is conducted. I want to fetch all records from the database using this method, where I just increase the %s in the where condition. I found this approach on https://ivopereira.net/efficient-pagination-dont-use-offset-limit. I also included a column in my table which is corresponding to row number (I named it row_id). Now the problem is when I run this query the first time, it returns rows which have an row_id of 3 million. I would like the cursor (not sure if my terminology is correct) to start from rows with row_id 1 through 100 and so on. The table contains 7 million rows. Am I missing something obvious with which I could achieve my goal?

Related

How to count total number of records after join the three tables in postgresql?

I have a query which gives me total 12408 records after executing but i want this give me total records as count column
select
c.complaint_id,c.server_time,c.completion_date,c.road_id,c.photo,c.dept_code,c.dist_code,c.eng_userid,c.feedback_type,c.status,p.dist_name,p.road_name,p.road_dept,e.display_name,e.mobile
from complaints as c INNER JOIN pwd_roads as p ON p.road_id=c.road_id
INNER JOIN enc_details as e ON CAST(e.enc_code as INTEGER) = p.enccode
where c.complaint_id=c.parent_complaint_id and c.dept_code='PWDBnR'
and c.server_time between '2018-09-03' and '2018-12-19'
You can solve this issue using window functions. For example, if you want your first columns to be a count of the total rows done by the SELECT statement:
select count(1) over(range between unbounded preceding and unbounded following) as total_row_count
, c.complaint_id,c.server_time,c.completion_date,c.road_id,c.photo,c.dept_code,c.dist_code,c.eng_userid,c.feedback_type,c.status,p.dist_name,p.road_name,p.road_dept,e.display_name,e.mobile from complaints as c INNER JOIN pwd_roads as p ON p.road_id=c.road_id INNER JOIN enc_details as e ON CAST(e.enc_code as INTEGER) = p.enccode where c.complaint_id=c.parent_complaint_id and c.dept_code='PWDBnR' and c.server_time between '2018-09-03' and '2018-12-19'
Note that the window function is evaluated before the LIMIT clause if one is used, so if you were to add LIMIT 100 to the query it might give a row count greater than 100 even though a max of 100 rows would be returned.
Easiest but not very elegant way to do this is:
select count(*)
from
(
select c.complaint_id,c.server_time,c.completion_date,c.road_id,c.photo,c.dept_code,c.dist_code,c.eng_userid,c.feedback_type,c.status,p.dist_name,p.road_name,p.road_dept,e.display_name,e.mobile from complaints as c INNER JOIN pwd_roads as p ON p.road_id=c.road_id INNER JOIN enc_details as e ON CAST(e.enc_code as INTEGER) = p.enccode where c.complaint_id=c.parent_complaint_id and c.dept_code='PWDBnR' and c.server_time between '2018-09-03' and '2018-12-19'
)

Simple SELECT, but adding JOIN returns too many rows

The query below returns 9,817 records. Now, I want to SELECT one more field from another table. See the 2 lines that are commented out, where I've simply selected this additional field and added a JOIN statement to bind this new columns. With these lines added, the query now returns 649,200 records and I can't figure out why! I guess something is wrong with my WHERE criteria in conjunction with the JOIN statement. Please help, thanks.
SELECT DISTINCT dbo.IMPORT_DOCUMENTS.ITEMID, BEGDOC, BATCHID
--, dbo.CATEGORY_COLLECTION_CATEGORY_RESULTS.CATEGORY_ID
FROM IMPORT_DOCUMENTS
--JOIN dbo.CATEGORY_COLLECTION_CATEGORY_RESULTS ON
dbo.CATEGORY_COLLECTION_CATEGORY_RESULTS.ITEMID = dbo.IMPORT_DOCUMENTS.ITEMID
WHERE (BATCHID LIKE 'IC0%' OR BATCHID LIKE 'LP0%')
AND dbo.IMPORT_DOCUMENTS.ITEMID IN
(SELECT dbo.CATEGORY_COLLECTION_CATEGORY_RESULTS.ITEMID FROM
CATEGORY_COLLECTION_CATEGORY_RESULTS
WHERE SCORE >= .7 AND SCORE <= .75 AND CATEGORY_ID IN(
SELECT CATEGORY_ID FROM CATEGORY_COLLECTION_CATS WHERE COLLECTION_ID IN (11,16))
AND Sample_Id > 0)
AND dbo.IMPORT_DOCUMENTS.ITEMID NOT IN
(SELECT ASSIGNMENT_FOLDER_DOCUMENTS.Item_Id FROM ASSIGNMENT_FOLDER_DOCUMENTS)
One possible reason is because one of your tables contains data at lower level, lower than your join key. For example, there may be multiple records per item id. The same item id is repeated X number of times. I would fix the query like the below. Without data knowledge, Try running the below modified query.... If output is not what you're looking for, convert it into SELECT Within a Select...
Hope this helps....
Try this SQL: SELECT DISTINCT a.ITEMID, a.BEGDOC, a.BATCHID, b.CATEGORY_ID FROM IMPORT_DOCUMENTS a JOIN (SELECT DISTINCT ITEMID FROM CATEGORY_COLLECTION_CATEGORY_RESULTS WHERE SCORE >= .7 AND SCORE <= .75 AND CATEGORY_ID IN (SELECT DISTINCT CATEGORY_ID FROM CATEGORY_COLLECTION_CATS WHERE COLLECTION_ID IN (11,16)) AND Sample_Id > 0) B ON a.ITEMID =b.ITEMID WHERE a.(a.BATCHID LIKE 'IC0%' OR a.BATCHID LIKE 'LP0%') AND a.ITEMID NOT IN (SELECT DIDTINCT Item_Id FROM ASSIGNMENT_FOLDER_DOCUMENTS)

T-SQL query one table, get presence or absence of other table value

I'm not sure what this type of query is called so I've been unable to search for it properly. I've got two tables, Table A has about 10,000 rows. Table B has a variable amount of rows.
I want to write a query that gets all of Table A's results but with an added column, the value of that column is a boolean that says whether the result also appears in Table B.
I've written this query which works but is slow, it doesn't use a boolean but rather a count that will be either zero or one. Any suggested improvements are gratefully accepted:
SELECT u.number,u.name,u.deliveryaddress,
(SELECT COUNT(productUserid)
FROM ProductUser
WHERE number = u.number and productid = #ProductId)
AS IsInPromo
FROM Users u
UPDATE
I've run the query with actual execution plan enabled, I'm not sure how to show the results but various costs are:
Nested Loops (left semi join): 29%]
Clustered Index scan (User Table): 41%
Clustered Index Scan (ProductUser table): 29%
NUMBERS
There are 7366 users in the users table and currently 18 rows in the productUser table (although this will change and could be in the thousands)
You can use EXISTS to short circuit after the first row is found rather than COUNT-ing all matching rows.
SQL Server does not have a boolean datatype. The closest equivalent is BIT
SELECT u.number,
u.name,
u.deliveryaddress,
CASE
WHEN EXISTS (SELECT *
FROM ProductUser
WHERE number = u.number
AND productid = #ProductId) THEN CAST(1 AS BIT)
ELSE CAST(0 AS BIT)
END AS IsInPromo
FROM Users u
RE: "I'm not sure what this type of query is called". This will give a plan with a semi join. See Subqueries in CASE Expressions for more about this.
Which management system are you using?
Try this:
SELECT u.number,u.name,u.deliveryaddress,
case when COUNT(p.productUserid) > 0 then 1 else 0 end
FROM Users u
left join ProductUser p on p.number = u.number and productid = #ProductId
group by u.number,u.name,u.deliveryaddress
UPD: this could be faster using mssql
;with fff as
(
select distinct p.number from ProductUser p where p.productid = #ProductId
)
select u.number,u.name,u.deliveryaddress,
case when isnull(f.number, 0) = 0 then 0 else 1 end
from Users u left join fff f on f.number = u.number
Since you seem concerned about performance, this query can perform faster as this will cause index seek on both tables versus an index scan:
SELECT u.number,
u.name,
u.deliveryaddress,
ISNULL(p.number, 0) IsInPromo
FROM Users u
LEFT JOIN ProductUser p ON p.number = u.number
WHERE p.productid = #ProductId

Firebird get the list with all available id

In a table I have records with id's 2,4,5,8. How can I receive a list with values 1,3,6,7. I have tried in this way
SELECT t1.id + 1
FROM table t1
WHERE NOT EXISTS (
SELECT *
FROM table t2
WHERE t2.id = t1.id + 1
)
but it's not working correctly. It doesn't bring all available positions.
Is it possible without another table?
You can get all the missing ID's from a recursive CTE, like this:
with recursive numbers as (
select 1 number
from rdb$database
union all
select number+1
from rdb$database
join numbers on numbers.number < 1024
)
select n.number
from numbers n
where not exists (select 1
from table t
where t.id = n.number)
the number < 1024 condition in my example limit the query to the max 1024 recursion depth. After that, the query will end with an error. If you need more than 1024 consecutive ID's you have either run the query multiple times adjusting the interval of numbers generated or think in a different query that produces consecutive numbers without reaching that level of recursion, which is not too difficult to write.

TSQL show only first row

I have the following TSQL query:
SELECT DISTINCT MyTable1.Date
FROM MyTable1
INNER JOIN MyTable2
ON MyTable1.Id = MyTable2.Id
WHERE Name = 'John' ORDER BY MyTable1.Date DESC
It retrieves a long list of Dates, but I only need the first one, the one in the first row.
How can I get it?
Thanks a ton!
In SQL Server you can use TOP:
SELECT TOP 1 MyTable1.Date
FROM MyTable1
INNER JOIN MyTable2
ON MyTable1.Id = MyTable2.Id
WHERE Name = 'John'
ORDER BY MyTable1.Date DESC
If you need to use DISTINCT, then you can use:
SELECT TOP 1 x.Date
FROM
(
SELECT DISTINCT MyTable1.Date
FROM MyTable1
INNER JOIN MyTable2
ON MyTable1.Id = MyTable2.Id
WHERE Name = 'John'
) x
ORDER BY x.Date DESC
Or even:
SELECT MAX(MyTable1.Date)
FROM MyTable1
INNER JOIN MyTable2
ON MyTable1.Id = MyTable2.Id
WHERE Name = 'John'
--ORDER BY MyTable1.Date DESC
There are several options here. You can use TOP(1) as Taryn mentioned. But according to docs for the purposes of limiting the rows returned it is better to use OFFSET and FETCH.
We recommend that you use the OFFSET and FETCH clauses instead of the TOP clause to implement a query paging solution and limit the number of rows sent to a client application.
Using OFFSET and FETCH as a paging solution requires running the query one time for each "page" of data returned to the client application. For example, to return the results of a query in 10-row increments, you must execute the query one time to return rows 1 to 10 and then run the query again to return rows 11 to 20 and so on.
Assuming, the solution for your problem using OFFSET and FETCH approach could be:
SELECT DISTINCT MyTable1.Date
FROM MyTable1
INNER JOIN MyTable2
ON MyTable1.Id = MyTable2.Id
WHERE Name = 'John' ORDER BY MyTable1.Date DESC
OFFSET 0 ROWS
FETCH NEXT 1 ROW ONLY