How to simplify this UNION query (remove the UNION)? - postgresql

I have a table with players, let's call it player.
It has 3 columns: userId (a UUID stored in a varchar(255)), levelNumber (integer), and facebookProfile, which is a one-to-one relation mapped with FetchType.LAZY.
I need to retrieve the rankings "around" the player, so 9 players above the given player and 9 players below the given player, to have a total of 19 players (with my player in the middle).
Some time ago I just came up with this idea:
(select * from player where current_level >= :levelNumber + 1 and (not userid = :userIdToIgnore) order by current_level asc limit 9)
union
(select * from player where current_level <= :levelNumber - 1 and (not userid = :userIdToIgnore) order by current_level desc limit 9)
You get the idea.
Is there any way to simplify this so it doesn't use the UNION?
I'm asking because I need to convert this to a JPQL query, so it won't be a native query.
Native queries lead to the N+1 problem here: I have trouble with the lazily loaded facebookProfile column and the multiple selects that follow. That's why I need to simplify the query so it can be written in JPQL.

I think you can do this with window functions and conditional expressions:
select *
from (
select p.*,
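-- partition by side so that the rows above and below are numbered separately
case when current_level >= :levelNumber + 1 then row_number() over(partition by (current_level > :levelNumber) order by current_level) end rn1,
case when current_level <= :levelNumber - 1 then row_number() over(partition by (current_level > :levelNumber) order by current_level desc) end rn2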
from player p
where userid <> :userIdToIgnore and (current_level >= :levelNumber + 1 or current_level <= :levelNumber - 1)
) t
where rn1 between 1 and 9 or rn2 between 1 and 9
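A quick way to sanity-check this kind of query is to run it against a tiny in-memory table. The sketch below is a variant of the window-function idea, not the exact query above: it uses a single row_number() ordered by distance from the player's level, partitioned by side. It runs on Python's built-in sqlite3 module rather than PostgreSQL (an assumption made only to keep the example self-contained; window functions need SQLite 3.25 or newer), keeps 2 neighbours per side instead of 9, and uses made-up sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table player (userid text primary key, current_level integer)")
# Hypothetical sample data: ten players on levels 1..10.
conn.executemany(
    "insert into player values (?, ?)",
    [(f"u{lvl}", lvl) for lvl in range(1, 11)],
)

NEIGHBOURS = 2  # the question uses 9

query = """
select userid, current_level
from (
    select p.*,
           -- number the rows separately on each side of the given level,
           -- closest level first
           row_number() over (
               partition by (current_level > :levelNumber)
               order by abs(current_level - :levelNumber)
           ) as rn
    from player p
    where userid <> :userIdToIgnore
      and current_level <> :levelNumber
) t
where rn <= :n
order by current_level
"""

rows = conn.execute(
    query, {"levelNumber": 5, "userIdToIgnore": "u5", "n": NEIGHBOURS}
).fetchall()
print(rows)
```

With several players on the same level, row_number() breaks ties arbitrarily, so a second sort key (for example userid) may be worth adding.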

Related

How can I increment the numerical value in my WHERE clause using a loop?

I am currently using the UNION ALL workaround below to calculate old_eps_tfq regression slopes for each ticker based on its corresponding rownum value (see WHERE rownum < x). I want to know the result when rownum < 4, then increment 4 by 1 to get the result when rownum < 5, and so on (there are ~20 rownum values).
Could I use PL/pgSQL for this?
SELECT * FROM(
WITH regression_slope AS(
SELECT
ROW_NUMBER() OVER ( PARTITION BY ticker ORDER BY earnings_growths_ped) AS rownum,
*
FROM "ANALYTICS"."vEARNINGS_GROWTHS"
--WHERE ticker = 'ACN'
ORDER BY ticker )
SELECT
ticker,
current_period_end_date,
max(earnings_growths_ped) AS max_earnings_growths_ped,
--max(rownum) AS max_rownum,
round(regr_slope(old_eps_tfq, rownum)::numeric, 2) AS slope,
round(regr_intercept(old_eps_tfq, rownum)::numeric, 2) AS y_intercept,
round(regr_r2(old_eps_tfq, rownum)::numeric, 3) AS r_squared
FROM regression_slope
WHERE rownum < 4
GROUP BY ticker, current_period_end_date
ORDER BY ticker asc ) q
UNION ALL
SELECT * FROM(
WITH regression_slope AS(
SELECT
ROW_NUMBER() OVER ( PARTITION BY ticker ORDER BY earnings_growths_ped) AS rownum,
*
FROM "ANALYTICS"."vEARNINGS_GROWTHS"
--WHERE ticker = 'ACN'
ORDER BY ticker )
SELECT
ticker,
current_period_end_date,
max(earnings_growths_ped) AS max_earnings_growths_ped,
--max(rownum) AS max_rownum,
round(regr_slope(old_eps_tfq, rownum)::numeric, 2) AS slope,
round(regr_intercept(old_eps_tfq, rownum)::numeric, 2) AS y_intercept,
round(regr_r2(old_eps_tfq, rownum)::numeric, 3) AS r_squared
FROM regression_slope
WHERE rownum < 5
GROUP BY ticker, current_period_end_date
ORDER BY ticker asc ) q
The outer query SELECT * FROM (...) q looks unnecessary.
You can try this instead:
WITH regression_slope AS(
SELECT
ROW_NUMBER() OVER ( PARTITION BY ticker ORDER BY earnings_growths_ped) AS rownum,
*
FROM "ANALYTICS"."vEARNINGS_GROWTHS"
--WHERE ticker = 'ACN'
ORDER BY ticker )
SELECT
max,
ticker,
current_period_end_date,
max(earnings_growths_ped) AS max_earnings_growths_ped,
--max(rownum) AS max_rownum,
round(regr_slope(old_eps_tfq, rownum)::numeric, 2) AS slope,
round(regr_intercept(old_eps_tfq, rownum)::numeric, 2) AS y_intercept,
round(regr_r2(old_eps_tfq, rownum)::numeric, 3) AS r_squared
FROM regression_slope
INNER JOIN generate_series(4, 24) AS max -- the range 4 to 24 can be adjusted to the need
ON rownum < max
GROUP BY max, ticker, current_period_end_date
ORDER BY max asc, ticker asc
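Joining against generate_series turns one query into a family of "rownum < max" windows, one per series value. The same idea can be sketched in plain Python; the numbers and the least_squares_slope helper below are illustrative assumptions, not taken from the question's data.

```python
def least_squares_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical old_eps_tfq values for one ticker, already ordered; rownum starts at 1.
old_eps_tfq = [1.0, 2.0, 3.0, 5.0, 8.0]
rows = list(enumerate(old_eps_tfq, start=1))  # (rownum, value) pairs

slopes = {}
for max_rownum in range(4, 7):  # plays the role of generate_series(4, 6)
    window = [(r, v) for r, v in rows if r < max_rownum]  # the join condition
    xs = [r for r, _ in window]
    ys = [v for _, v in window]
    slopes[max_rownum] = round(least_squares_slope(xs, ys), 2)

print(slopes)
```

Each key of slopes corresponds to one value of max in the SQL result, so a single pass replaces the repeated UNION ALL branches.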

Top N values in window frame

I have a table t with 3 fields of interest:
d (date), pid (int), and score (numeric)
I am trying to calculate a 4th field that is an average of each player's top N (3 or 5) scores for the days before the current row.
I tried the following join on a subquery but it is not producing the results I'm looking for:
SELECT t.d, t.pid, t.score, sq.highscores
FROM t, (SELECT *, avg(score) as highscores FROM
(SELECT *, row_number() OVER w AS rnum
FROM t AS t2
WINDOW w AS (PARTITION BY pid ORDER BY score DESC ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)) isq
WHERE rnum <= 3) sq
WHERE t.d = sq.d AND t.pid = sq.pid
Any suggestions would be greatly appreciated! I'm a hobbyist programmer and this is more complex of a query than I'm used to.
You can't select * and avg(score) in the same (inner) query: which non-aggregated values should be selected for each average? PostgreSQL won't decide this for you.
Because you PARTITION BY pid in the innermost query, you should use GROUP BY pid in the aggregating subquery. That way, you can SELECT pid, avg(score) as highscores:
SELECT pid, avg(score) as highscores
FROM (SELECT *, row_number() OVER w AS rnum
FROM t AS t2
WINDOW w AS (PARTITION BY pid ORDER BY score DESC)) isq
WHERE rnum <= 3
GROUP BY pid
Note: ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING makes no difference for row_number().
But if N is fixed (and N will be small in your real-world use case too), you can solve this with fewer subqueries by using the nth_value() window function:
SELECT d, pid, score,
(coalesce(nth_value(score, 1) OVER w, 0) +
coalesce(nth_value(score, 2) OVER w, 0) +
coalesce(nth_value(score, 3) OVER w, 0)) /
((nth_value(score, 1) OVER w IS NOT NULL)::int +
(nth_value(score, 2) OVER w IS NOT NULL)::int +
(nth_value(score, 3) OVER w IS NOT NULL)::int) highscores
FROM t
WINDOW w AS (PARTITION BY pid ORDER BY score DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
http://rextester.com/GUUPO5148
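The GROUP BY variant can be tried on a small data set. This is a minimal sketch using Python's built-in sqlite3 module instead of PostgreSQL (an assumption for self-containment; it needs SQLite 3.25 or newer for row_number()), with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (d text, pid integer, score real)")
# Hypothetical sample rows: player 1 has four scores, player 2 has two.
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        ("2020-01-01", 1, 10), ("2020-01-02", 1, 40),
        ("2020-01-03", 1, 30), ("2020-01-04", 1, 20),
        ("2020-01-01", 2, 5), ("2020-01-02", 2, 7),
    ],
)

top_n = conn.execute(
    """
    select pid, avg(score) as highscores
    from (select *, row_number() over (partition by pid order by score desc) as rnum
          from t) isq
    where rnum <= 3
    group by pid
    order by pid
    """
).fetchall()
print(top_n)
```

Player 2 has fewer than three scores, so its average is taken over the two rows that exist.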

how to do dead reckoning on column of table, postgresql

I have a table that looks like this:
x y
1 2
2 null
3 null
1 null
11 null
I want to fill the null values by applying the rolling recurrence y_{i+1} = y_{i} + x_{i+1}, using SQL that is as simple as possible (in place).
The expected result is:
x y
1 2
2 4
3 7
1 8
11 19
This should be implemented in PostgreSQL. I could encapsulate it in a window function, but implementing a custom function always seems complex. My attempt so far:
WITH RECURSIVE t AS (
select x, y, 1 as rank from my_table where y is not null
UNION ALL
SELECT A.x, A.x+ t.y y , t.rank + 1 rank FROM t
inner join
(select row_number() over () rank, x, y from my_table ) A
on t.rank+1 = A.rank
)
SELECT x,y FROM t;
You can iterate over rows using a recursive CTE. But in order to do so, you need a way to jump from row to row. Here's an example using an ID column:
; with recursive cte as
(
select id
, y
from Table1
where id = 1
union all
select cur.id
, prev.y + cur.x
from Table1 cur
join cte prev
on cur.id = prev.id + 1
)
select *
from cte
;
You can see the query at SQL Fiddle. If you don't have an ID column, but you do have another way to order the rows, you can use row_number() to get an ID:
; with recursive sorted as
(
-- Specify your ordering here. This example sorts by the dt column.
select row_number() over (order by dt) as id
, *
from Table1
)
, cte as
(
select id
, y
from sorted
where id = 1
union all
select cur.id
, prev.y + cur.x
from sorted cur
join cte prev
on cur.id = prev.id + 1
)
select *
from cte
;
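The recursive step can be checked against the sample data from the question. The sketch below runs the ID-based version on Python's built-in sqlite3 module (SQLite also supports WITH RECURSIVE); the id column is an assumption, since the question's table does not show one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table Table1 (id integer primary key, x integer, y integer)")
# Rows from the question, with an assumed id column that fixes the order.
conn.executemany(
    "insert into Table1 (id, x, y) values (?, ?, ?)",
    [(1, 1, 2), (2, 2, None), (3, 3, None), (4, 1, None), (5, 11, None)],
)

filled = conn.execute(
    """
    with recursive cte(id, x, y) as (
        -- anchor: the first row already has a y value
        select id, x, y from Table1 where id = 1
        union all
        -- step: y of the next row is the previous y plus the next x
        select cur.id, cur.x, prev.y + cur.x
        from Table1 cur
        join cte prev on cur.id = prev.id + 1
    )
    select x, y from cte order by id
    """
).fetchall()
print(filled)
```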

T-SQL: Simplify Select statement with ROW_NUMBER

I have the following select SQL that does a basic select statement although it does include a calculated column:
Select *
From
(
Select *,
ROW_NUMBER() OVER
(ORDER BY
CASE WHEN @sortBy = 0 THEN R.DateCreated End Desc,
CASE WHEN @sortBy = 1 THEN R.DateCreated end Asc,
CASE WHEN @sortBy = 2 THEN TotalVotes END Desc,
CASE WHEN @sortBy = 2 THEN R.TotalFoundUseful END Desc
) AS RowNumber
From
(
Select *, (TotalFoundUseful + TotalFoundNotUseful) As TotalVotes
From Reviews
Where (DealID = @dealID) And (TotalAbuses < 10) And (Deleted = 0)
) As R
) As Rev
Where RowNumber BETWEEN @startRecord AND @endRecord
If you look carefully, there are 3 nested SELECT statements. I can't believe that this is necessary. Is there a way to reduce this to 2 SELECT statements (or possibly even one)? I don't actually need to return the RowNumber; it is only used for selecting rows within a certain range.
You can do it with two by putting the rownumber in with the original select against reviews. You can't go to one if you want to have a WHERE clause on a windowing function like ROW_NUMBER.
You do have to write TotalFoundUseful + TotalFoundNotUseful twice, but it will only be evaluated once, so that doesn't affect performance.
I also wouldn't expect moving to two to have any effect on performance but you should test it.
Select *
From
(
Select *, (TotalFoundUseful + TotalFoundNotUseful) As TotalVotes,
ROW_NUMBER() OVER
(ORDER BY
CASE WHEN @sortBy = 0 THEN DateCreated End Desc,
CASE WHEN @sortBy = 1 THEN DateCreated end Asc,
CASE WHEN @sortBy = 2 THEN TotalFoundUseful + TotalFoundNotUseful END Desc,
CASE WHEN @sortBy = 2 THEN TotalFoundUseful END Desc
) AS RowNumber
From Reviews
Where (DealID = @dealID) And (TotalAbuses < 10) And (Deleted = 0)
) As Rev
Where RowNumber BETWEEN @startRecord AND @endRecord
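The two-level shape (number the rows in a subquery, filter on the number outside) can be tried with a small sketch. It uses Python's built-in sqlite3 module rather than SQL Server (an assumption for self-containment; SQLite 3.25 or newer is needed for row_number()), a simplified Reviews table with made-up rows, and a hypothetical page() helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table Reviews (id integer primary key, DateCreated text, "
    "TotalFoundUseful integer, TotalFoundNotUseful integer)"
)
# Hypothetical sample rows.
conn.executemany(
    "insert into Reviews values (?, ?, ?, ?)",
    [
        (1, "2020-01-01", 5, 1),
        (2, "2020-01-02", 1, 0),
        (3, "2020-01-03", 9, 3),
        (4, "2020-01-04", 2, 2),
    ],
)

def page(sort_by, start_record, end_record):
    """Return review ids for one page; sort_by mirrors the question's @sortBy."""
    sql = """
    select id
    from (
        select id,
               row_number() over (
                   order by
                       case when :sortBy = 0 then DateCreated end desc,
                       case when :sortBy = 1 then DateCreated end asc,
                       case when :sortBy = 2
                            then TotalFoundUseful + TotalFoundNotUseful end desc
               ) as RowNumber
        from Reviews
    ) as Rev
    where RowNumber between :startRecord and :endRecord
    order by RowNumber
    """
    params = {"sortBy": sort_by, "startRecord": start_record, "endRecord": end_record}
    return [row[0] for row in conn.execute(sql, params)]

print(page(0, 1, 2))  # newest first
print(page(2, 1, 2))  # most votes first
```

The CASE branches that do not match the chosen sort mode evaluate to NULL for every row, so only the matching branch influences the order.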

T-SQL: if value exists use it, otherwise use the value before

I have the following table
Account#   Period   Balance
12345      200901   $11554
12345      200902   $4353
12345      201004   $34
12345      201005   $44
12345      201006   $1454
45677      200901   $14454
45677      200902   $1478
45677      201004   $116776
45677      201005   $996
56789      201006   $1567
56789      200901   $7894
56789      200902   $123
56789      201003   $543345
56789      201005   $114
56789      201006   $54
I want to select the accounts that have a period of 201005.
This is fairly easy using the code below. The problem is that if a user enters 201003, which doesn't exist for some accounts, I want the query to select the previous value. *NOTE that there is an account# that has a 201003 period, and I still want to select it too.*
I tried CASE, IF ELSE and IN, but I was unsuccessful.
PS:I cannot create temp tables due to system limitations of 5000 rows.
Thank you.
DECLARE @INPUTPERIOD INT
SET @INPUTPERIOD = 201005
SELECT ACCOUNT#, PERIOD, BALANCE
FROM TABLE1
WHERE PERIOD = @INPUTPERIOD
SELECT t.ACCOUNT#, t.PERIOD, t.BALANCE
FROM (SELECT ACCOUNT#, MAX(PERIOD) AS MaxPeriod
FROM TABLE1
WHERE PERIOD <= @INPUTPERIOD
GROUP BY ACCOUNT#) q
INNER JOIN TABLE1 t
ON q.ACCOUNT# = t.ACCOUNT#
AND q.MaxPeriod = t.PERIOD
select top 1 account#, period, balance
from table1
where period <= @inputperiod
order by period desc
; WITH Base AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY Period DESC) RN FROM #MyTable WHERE Period <= 201003
)
SELECT * FROM Base WHERE RN = 1
This uses a CTE and ROW_NUMBER(): we take all the rows with Period <= the selected period, then keep the top one (the row whose generated ROW_NUMBER() is 1).
; WITH Base AS
(
SELECT *, 1 AS RN FROM #MyTable WHERE Period = 201003
)
, Alternative AS
(
SELECT *, ROW_NUMBER() OVER (ORDER BY Period DESC) RN FROM #MyTable WHERE NOT EXISTS(SELECT 1 FROM Base) AND Period < 201003
)
, Final AS
(
SELECT * FROM Base
UNION ALL
SELECT * FROM Alternative WHERE RN = 1
)
SELECT * FROM Final
This one is a lot more complex but does nearly the same thing. It is more "imperative-like". It first tries to find a row with the exact Period, and if none exists it does the same thing as before. At the end it unites the two result sets (one of the two is always empty). I would always use the first one, unless profiling showed me that SQL Server wasn't able to understand what I'm trying to do. Then I would try the second one.
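The per-account variant (MAX(PERIOD) per account, joined back to the table) can be checked on a subset of the question's rows. This is a sketch on Python's built-in sqlite3 module rather than SQL Server; the column is named account instead of ACCOUNT# because # is not valid in an unquoted SQLite identifier, and balances_at is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (account integer, period integer, balance integer)")
# A subset of the rows from the question.
conn.executemany(
    "insert into table1 values (?, ?, ?)",
    [
        (12345, 200902, 4353), (12345, 201004, 34),
        (45677, 200902, 1478), (45677, 201004, 116776),
        (56789, 200902, 123), (56789, 201003, 543345),
    ],
)

def balances_at(input_period):
    """Latest row per account whose period is <= input_period."""
    sql = """
    select t.account, t.period, t.balance
    from (select account, max(period) as max_period
          from table1
          where period <= :p
          group by account) q
    join table1 t on t.account = q.account and t.period = q.max_period
    order by t.account
    """
    return conn.execute(sql, {"p": input_period}).fetchall()

print(balances_at(201003))
```

For 201003, the account that has that exact period returns it, while the other two accounts fall back to their most recent earlier period.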