Getting the total number of records in a single knexjs query when using the limit() method - postgresql

I use knexjs and postgresql. Is it possible in knexjs to get the total of records from the same query in which the limit is used?
For example:
knex.select().from('project').limit(50)
Is it possible to somehow get the total number of records in the same query if there are more than 50?
The question arose because my real query is much more complex, with a lot of subqueries and conditions, and I would not like to run it twice: once to get the data and once more to get the total number of records (I use the .count() method for that).

I do not know your obscurification manager (knexjs?), but I would think you should be able to add the window version of the count() function to your select list. In plain SQL, something like the following, where ... represents your current select list:
select ..., count(*) over() total_rows
from project
limit 5;
This works because the window count function counts all rows selected: it is evaluated after the rows have been selected but before the LIMIT clause is applied. Note: This adds a column to the result set with the same value in every row.
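To make the ordering of evaluation concrete, here is a minimal sketch in Python. It assumes an in-memory SQLite database (version 3.25 or later, for window function support) as a stand-in for PostgreSQL, and the table contents are made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite (>= 3.25) stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO project (name) VALUES (?)",
    [(f"project-{i}",) for i in range(1, 121)],  # 120 hypothetical rows
)

# count(*) OVER () sees every selected row; LIMIT trims the output afterwards.
rows = conn.execute(
    "SELECT id, name, count(*) OVER () AS total_rows "
    "FROM project ORDER BY id LIMIT 50"
).fetchall()

page_size = len(rows)
# If the page is empty, no row carries the total, so fall back to 0.
total_rows = rows[0][2] if rows else 0
print(page_size, total_rows)
```

In knex the same column can probably be added with something like knex.raw('count(*) over() as total_rows') in the select list; treat that as an untested suggestion. Note that when the page comes back empty (for example, an offset past the end), there is no row to read the total from.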

Related

What does ORDER BY within OVER() window function mean in postgresql?

I am trying to understand how the ORDER BY clause in the OVER() window function is different from ORDER BY clause in generic SQL.
I was solving the following problem: https://www.pgexercises.com/questions/aggregates/nummembers.html
Produce a monotonically increasing numbered list of members (including guests),
ordered by their date of joining. Remember that member IDs are not guaranteed to be sequential.
The following query is one of the accepted solutions:
SELECT COUNT(*) OVER(ORDER by joindate), firstname, surname FROM cd.members;
As per my understanding, since we are not supplying a PARTITION BY clause in the OVER() function, all the rows in cd.members table form one big partition (let's call it X). When the window function runs, it should order X by joindate, and then COUNT(*) on X would return the number of rows in X which is just the number of rows in cd.members.
But this understanding is incorrect. The 'Answers and Discussion' accompanying the aforementioned problem states:
Since we define an order for the window function, for any given row the window is: start of the dataset -> current row.
The PG documentation on window function states:
You can also control the order in which rows are processed by window functions using ORDER BY within OVER. (The window ORDER BY does not even have to match the order in which the rows are output.)
What I cannot comprehend is why will ORDER BY inside the OVER() stop at the current row? Could you please elaborate how this is working?
Thank you for reading through.
I don't know what to add beyond what the docs (same page as you already linked to) already say:
By default, if ORDER BY is supplied then the frame consists of all
rows from the start of the partition up through the current row, plus
any following rows that are equal to the current row according to the
ORDER BY clause.
I don't know if this is required by the SQL standard, but it certainly seems reasonable. Why specify an ORDER BY if you expect it to have no observable effect?
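The default frame quoted above can be seen in a small experiment. This is a sketch in Python, assuming in-memory SQLite (3.25 or later) in place of PostgreSQL, with a made-up members table; two members share a join date so the "following rows that are equal" part is visible.

```python
import sqlite3

# Assumption: in-memory SQLite (>= 3.25) stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (surname TEXT, joindate TEXT)")
conn.executemany(
    "INSERT INTO members VALUES (?, ?)",
    [
        ("Smith", "2012-07-01"),
        ("Jones", "2012-07-02"),
        ("Baker", "2012-07-02"),  # same join date as Jones: a peer row
        ("Rownam", "2012-07-03"),
    ],
)

rows = conn.execute(
    "SELECT surname, "
    "       count(*) OVER () AS whole_partition, "
    "       count(*) OVER (ORDER BY joindate) AS up_to_current "
    "FROM members ORDER BY joindate, surname"
).fetchall()

for row in rows:
    print(row)
```

Without ORDER BY the frame is the whole partition, so every row reports 4. With ORDER BY joindate the count grows row by row, and Baker and Jones both report 3 because each is included in the other's frame as a peer.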

running total using windows function in sql has same result for same data

Every reference I found on how to do a cumulative sum / running total said it's better to use a window function, so I did:
select grandtotal,sum(grandtotal)over(order by agentname) from call
but I realized that the results are only okay as long as the values in each row are different.
Is there any way to fix this?
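You might want to review the documentation on window specifications. When ORDER BY is given, the default frame is "range between unbounded preceding and current row", which defines the frame by the ORDER BY values, so rows that tie on agentname are all included together and get the same total. You want "rows between":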
select grandtotal,
sum(grandtotal) over (order by agentname rows between unbounded preceding and current row)
from call;
Alternatively, you could include an id column in the sort to guarantee uniqueness and not have to deal with the issue of equal key values.
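The difference between the two frame types shows up as soon as the sort key has ties. The sketch below is Python with in-memory SQLite (3.25 or later) as an assumed stand-in for the asker's database; the table is named calls here and its rows are invented. It combines both suggestions: a rows frame plus an id tiebreaker so the order among ties is deterministic.

```python
import sqlite3

# Assumption: in-memory SQLite (>= 3.25); table name and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE calls (id INTEGER PRIMARY KEY, agentname TEXT, grandtotal INTEGER)"
)
conn.executemany(
    "INSERT INTO calls (agentname, grandtotal) VALUES (?, ?)",
    [("alice", 10), ("alice", 20), ("bob", 5)],
)

rows = conn.execute(
    "SELECT id, grandtotal, "
    # Default RANGE frame: tied agentname rows are summed together.
    "       sum(grandtotal) OVER (ORDER BY agentname) AS range_total, "
    # ROWS frame with id as tiebreaker: a true row-by-row running total.
    "       sum(grandtotal) OVER (ORDER BY agentname, id "
    "            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS rows_total "
    "FROM calls ORDER BY agentname, id"
).fetchall()

for row in rows:
    print(row)
```

Both alice rows get range_total 30, while rows_total steps through 10, 30, 35.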

kdb/q: use function in a select from partitioned table

I'm trying to get max drawdown from a partitioned table across multiple dates. The query works fine when run with a date constrained to a specific day. E.g.
select {max neg x-maxs x} pnl from trades where date=last date
It's getting map-reduced over multiple dates so the above query no longer works. I can make the query run over multiple dates by adding another aggregation:
select max {max neg x-maxs x} pnl from trades
but it's not getting the max drawdown from continuous sequence of trades but a maximum of daily drawdowns.
I wonder if there's a way to make it work with a single select without chaining selects like
select {max neg x-maxs x} pnl from select pnl from trades
I've got a rather big query to pull a lot of various metrics on the trades where max drawdown is just one of them. Using chained select means that I need to break the big query into two queries, map-reduced and non-map-reduced, and then join them back which would make the query look ugly.
Thanks!
A select query against a partitioned db runs on each date partition, applies the function to each date's values, and finally aggregates the results depending on the call (user-defined functions behave differently from plain q functions).
So I don't think you can combine that into one query. But there are ways to make your query more general and reusable for different scenarios.
For example, convert your query to functional form and use variables in that query for the column name and the user function. Put this in one function that accepts a column name and a user function. Now you can call this function with different (column; function) pairs. Something like:
runF:{[col;usrfunc] functional_query_uses_col_userfunc }
All this depends on your use case. Also check memory usage, as you'll be pulling a lot of data into memory.
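To see why the per-date result differs from the continuous one, here is a small Python sketch (not q) of the same drawdown calculation. The function mirrors the logic of {max neg x-maxs x}; the two days of pnl values are invented and are assumed to be a cumulative pnl series.

```python
from itertools import accumulate

def max_drawdown(pnl):
    """Largest drop below the running peak, i.e. max(running_max - value)."""
    running_peak = accumulate(pnl, max)
    return max(peak - value for peak, value in zip(running_peak, pnl))

# Hypothetical pnl series, one list per date partition.
by_date = {
    "2020.01.01": [0, 5, 3, 8],
    "2020.01.02": [6, 2, 9],
}

# What the map-reduced query computes: the max of per-day drawdowns.
daily_max = max(max_drawdown(day) for day in by_date.values())

# What the question wants: drawdown over the continuous sequence of trades.
all_trades = [value for day in by_date.values() for value in day]
overall = max_drawdown(all_trades)

print(daily_max, overall)
```

The second day starts below the first day's peak of 8, so the continuous series has a drawdown of 6 (from 8 down to 2) that no single day can see; the per-day maximum is only 4.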

See length (count) of query results in workbench

I just started using MySQL Workbench (6.1). The default limit for queries is 1,000 rows, and that's fine; I want to keep it.
But the results from the action output message will therefore always say "1000 rows returned".
Is there a setting to see the number of records that the query would have returned had there been no limit, for sanity checking query results?
I know this is late by a few years, but I think you're asking for a way to see total row count in the bottom of the results pane, like in SQL Server. In SQL Server, you would also go in the messages pane and it would say how many rows were returned. I was actually looking for exactly what you were asking for as well, and seems like there is no way to find that. If you have an ID in your table that is just numeric and is in numeric order, you could order by ID desc and look at the biggest number there. That is what I've decided to do.
The result is not always "1000 rows returned". If there are fewer records than that, you will get the actual count. If you want to know the total number of rows in a table, do a select count(*) from table. Alternatively, you can switch off the automatic limit and have all records returned by MySQL Workbench, but that can be time and memory consuming for large tables.
I think removing the row limit will help. By default, MySQL workbench will limit the result set to 1000 rows but you can always disable the limit. Check out https://superuser.com/questions/240291/how-to-remove-1000-row-limit-in-mysql-workbench-queries on how to do that.
You can run a second query to check that:
select count(*) from (your original query) as t;
This returns the total number of rows in the actual result.
You can use the SQL count function. It returns the count of the total number of rows a query returns.
A sample query:
select count(*) from tableName where field1 = value1
In Workbench, set the dropdown menu at the top to "Don't Limit", then run the query to extract the data from the table. The total count of the query results is then displayed in the message column of the output pane below.
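The "wrap the original query in count(*)" suggestion can be tried outside Workbench as well. This is a minimal Python sketch that assumes in-memory SQLite in place of MySQL; the orders table, its data, and the filter are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; data is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(i,) for i in range(1, 2501)],  # 2500 rows, amounts 1..2500
)

original_query = "SELECT id FROM orders WHERE amount > 100"

# What the client shows when it appends its own row limit.
limited = conn.execute(f"{original_query} LIMIT 1000").fetchall()

# The unlimited row count, obtained by wrapping the same query.
total = conn.execute(
    f"SELECT count(*) FROM ({original_query}) AS t"
).fetchone()[0]

print(len(limited), total)
```

The wrapped query costs a second execution, but it only transfers a single number back to the client.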

n-th row in PostgreSQL for p-quantile

I'm trying to fetch the n-th row of a query result. Other posts suggested using OFFSET or LIMIT, but those forbid the use of variables (ERROR: argument of OFFSET must not contain variables). I also read about cursors, but I'm not quite sure how to use them even after reading their PostgreSQL manual page. Any other suggestions or examples for how to use cursors?
My main goal is to calculate the p-quantile of a row and since PostgreSQL doesn't provide this function by default I have to write it on my own.
Cheers
The following returns the 5th row of a result set:
select *
from (
  select <column_list>,
         row_number() over (order by some_sort_column) as rn
  from some_table  -- placeholder for your table or query
) t
where rn = 5;
You have to include an order by because otherwise the concept of "5th row" doesn't make sense.
You mention "use of variable", so I'm not sure what you are actually trying to achieve. But you should be able to supply the value 5 as a variable for this query (or even a sub-select).
You might also want to dig further into windowing functions. Because with that you could e.g. do a sum() over the 3 rows before the current row (or similar constructs) - which could also be useful for you.
If you would like to get the 10th record, the query below also works fine.
select * from table_name order by sort_column limit 1 offset 9
OFFSET simply skips that many rows before beginning to return the rows requested by the LIMIT clause.
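Both approaches can be run side by side with n passed in as a parameter. The sketch below is Python against in-memory SQLite (3.25 or later), which is an assumption standing in for PostgreSQL; the measurements table and its values are invented.

```python
import sqlite3

# Assumption: in-memory SQLite (>= 3.25) stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?)",
    [(v,) for v in [7.0, 1.0, 9.0, 3.0, 5.0]],
)

n = 4  # which row to fetch, counted from 1 in sorted order

# Approach 1: number the rows, then filter on the row number.
via_row_number = conn.execute(
    "SELECT value FROM ("
    "  SELECT value, row_number() OVER (ORDER BY value) AS rn"
    "  FROM measurements"
    ") AS t WHERE rn = ?",
    (n,),
).fetchone()[0]

# Approach 2: skip n - 1 rows and take the next one.
via_offset = conn.execute(
    "SELECT value FROM measurements ORDER BY value LIMIT 1 OFFSET ?",
    (n - 1,),
).fetchone()[0]

print(via_row_number, via_offset)
```

Both queries pick 7.0, the 4th smallest value. Supplying n as a bound parameter from the client is one way to avoid referring to a column or variable inside OFFSET itself.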