I have this (beginner) query:
let getCandlesSQL =
$"SELECT
date_trunc('minute', ts) ts,
instrument,
MAX(price) high,
MIN(price) low,
(SUM(price * price * quantity) / SUM(price * quantity)) midpoint,
SUM(price * quantity) volume,
(SUM(direction * price * quantity) / SUM(price * quantity)) direction
FROM {tableTradesName}
WHERE instrument = '{instrument.Ticker}' AND ts BETWEEN '{fromTime}' AND '{toTime}'
GROUP BY date_trunc('minute', ts), instrument
ORDER BY ts
LIMIT 4500"
I rebuild the string with internal variables at every call, so I don't need to use the SQL variable mechanism.
There are a few calculations that are done multiple times, for example 'price * quantity' is done many times.
Is there a way to write the query to do it once and then re-use it?
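One common way to compute such an expression once is to define it in a derived table (or CTE) and aggregate over the named column in the outer query. The following is only a minimal sketch: it runs against an in-memory SQLite database rather than the original database, so strftime stands in for date_trunc, and the table name trades, the column name notional, the sample rows and the bound parameters are all assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database with a hypothetical trades table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trades (ts TEXT, instrument TEXT, price REAL, "
    "quantity REAL, direction INTEGER)"
)
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?, ?, ?)",
    [
        ("2021-01-01 10:00:05", "XBTUSD", 100.0, 2.0, 1),
        ("2021-01-01 10:00:40", "XBTUSD", 110.0, 1.0, -1),
        ("2021-01-01 10:01:10", "XBTUSD", 120.0, 3.0, 1),
    ],
)

# The inner query computes price * quantity once per row as "notional";
# the outer query only refers to that column.
CANDLES_SQL = """
SELECT t.minute                                    AS ts,
       t.instrument,
       MAX(t.price)                                AS high,
       MIN(t.price)                                AS low,
       SUM(t.price * t.notional) / SUM(t.notional) AS midpoint,
       SUM(t.notional)                             AS volume,
       SUM(t.direction * t.notional) / SUM(t.notional) AS direction
FROM (
    SELECT strftime('%Y-%m-%d %H:%M:00', ts) AS minute,  -- SQLite stand-in for date_trunc
           instrument,
           price,
           direction,
           price * quantity AS notional
    FROM trades
    WHERE instrument = ? AND ts BETWEEN ? AND ?
) AS t
GROUP BY t.minute, t.instrument
ORDER BY t.minute
LIMIT 4500
"""

candles = conn.execute(
    CANDLES_SQL, ("XBTUSD", "2021-01-01 00:00:00", "2021-01-02 00:00:00")
).fetchall()
for candle in candles:
    print(candle)
```

Whether this is faster than repeating the expression depends on the database's optimizer; the main gain is usually readability.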
I'm using two different data sources in Grafana and need to use the result from each to calculate a percentage. What I need is equivalent to A/B, where A is
SELECT count(id) FROM Logs WHERE $__unixEpochFilter(RequestTimestamp DIV 1000)
from Database 1 and B is
SELECT count(id) FROM Entries WHERE $__unixEpochFilter(RequestTimestamp DIV 1000)
from Database 2. I can create a mixed data source panel and retrieve A and B, but can't find a way to perform an operation on the two results.
You can move the subqueries to the from clause and use cross join:
SELECT l.cnt * 1.0 / e.cnt
FROM (SELECT count(id) as cnt
FROM Logs
WHERE $__unixEpochFilter(RequestTimestamp DIV 1000)
) l CROSS JOIN
(SELECT count(id) as cnt
FROM Entries
WHERE $__unixEpochFilter(RequestTimestamp DIV 1000)
) e
The * 1.0 is because DB2 does integer division.
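The effect of the * 1.0 can be seen in a small sketch. It uses an in-memory SQLite database, which also truncates when dividing two integers, in place of the real data sources; the row counts are invented, the Grafana macro is left out, and both tables live in one database here purely for illustration.

```python
import sqlite3

# Hypothetical stand-ins for the two tables, both in one in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Logs (id INTEGER)")
conn.execute("CREATE TABLE Entries (id INTEGER)")
conn.executemany("INSERT INTO Logs VALUES (?)", [(1,)])
conn.executemany("INSERT INTO Entries VALUES (?)", [(i,) for i in range(4)])

# Each subquery returns exactly one row, so the cross join yields one row.
RATIO_SQL = """
SELECT l.cnt / e.cnt       AS int_ratio,  -- integer / integer truncates
       l.cnt * 1.0 / e.cnt AS ratio       -- * 1.0 forces a non-integer division
FROM (SELECT count(id) AS cnt FROM Logs) l
CROSS JOIN (SELECT count(id) AS cnt FROM Entries) e
"""

int_ratio, ratio = conn.execute(RATIO_SQL).fetchone()
print(int_ratio, ratio)
```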
Given a column of overlapping and/or discontinuous ranges:
WITH tbl (active_dates) AS
(
VALUES
('["2015-05-21","2018-10-01")'::TSRANGE),
('["2016-08-13","2018-09-01")'::TSRANGE),
('["2019-03-01","2019-05-01")'::TSRANGE)
)
SELECT *
FROM tbl;
How can we generate an output that identifies all the discrete time periods like so:
active_dates
------------
["2015-05-21 00:00:00","2016-08-13 00:00:00")
["2016-08-13 00:00:00","2018-09-01 00:00:00")
["2018-09-01 00:00:00","2018-10-01 00:00:00")
["2019-03-01 00:00:00","2019-05-01 00:00:00")
As always, you can do that with window functions:
WITH tbl (active_dates) AS
(
VALUES
('["2015-05-21","2018-10-01")'::TSRANGE),
('["2016-08-13","2018-09-01")'::TSRANGE),
('["2019-03-01","2019-05-01")'::TSRANGE)
),
/* get all time points where something changes */
points AS (
SELECT upper(active_dates) AS p
FROM tbl
UNION SELECT lower(active_dates)
FROM tbl
),
/*
* Get all date ranges between these time points.
* The first time range will start with NULL,
* but that will be excluded in the next CTE anyway.
*/
inter AS (
SELECT tsrange(
lag(p) OVER (ORDER BY p),
p
) i
FROM points
)
/*
* Get all date ranges that are contained
* in at least one of the intervals.
*/
SELECT DISTINCT i
FROM inter
CROSS JOIN tbl
WHERE i <@ active_dates
ORDER BY i;
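For readers who want to trace the logic outside the database, the same three steps (collect the change points, pair up neighbouring points, keep the pieces contained in at least one input range) can be sketched in plain Python. This is only an illustration of the idea, not a replacement for the query; the function name discrete_periods is made up, and ranges are modelled as half-open (lower, upper) tuples.

```python
from datetime import datetime


def discrete_periods(ranges):
    """Split half-open [lower, upper) ranges at every endpoint and
    keep the pieces that lie inside at least one input range."""
    # All time points where something changes, de-duplicated and sorted.
    points = sorted({p for lower, upper in ranges for p in (lower, upper)})
    # Candidate pieces between consecutive points.
    pieces = zip(points, points[1:])
    return [
        (start, end)
        for start, end in pieces
        if any(lower <= start and end <= upper for lower, upper in ranges)
    ]


d = datetime.fromisoformat
ranges = [
    (d("2015-05-21"), d("2018-10-01")),
    (d("2016-08-13"), d("2018-09-01")),
    (d("2019-03-01"), d("2019-05-01")),
]
periods = discrete_periods(ranges)
for start, end in periods:
    print(f"[{start}, {end})")
```

The gap between 2018-10-01 and 2019-03-01 is dropped because no input range contains it.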
I am running a query in pgAdmin but get the error "column distance does not exist":
select f.title, f.longitude, f.latitude, (3959 * cos(cos(radians('52.512452')) * cos(radians(latitude)) * cos(radians(longitude) - radians('13.390931')) + sin(radians('52.512452')) * sin(radians(latitude)))) AS distance from fitness_studio f having distance<1 order by distance desc
Thanks in advance for any help.
Regards,
Aisha
As far as I know, PostgreSQL has no way to use a column alias directly in a WHERE clause. So you should either duplicate the logic:
SELECT
f.title,
f.longitude,
f.latitude,
(3959 * cos(cos(radians('52.512452')) * cos(radians(latitude)) * cos(radians(longitude)
- radians('13.390931')) + sin(radians('52.512452')) * sin(radians(latitude)))) AS distance
FROM fitness_studio f
WHERE (3959 * cos(cos(radians('52.512452')) * cos(radians(latitude)) *
cos(radians(longitude) - radians('13.390931')) +
sin(radians('52.512452')) * sin(radians(latitude)))) < 1
ORDER BY distance DESC
or use a subquery (written here as a CTE):
WITH container AS (
SELECT
f.title,
f.longitude,
f.latitude,
(3959 * cos(cos(radians('52.512452')) * cos(radians(latitude)) *
cos(radians(longitude) - radians('13.390931')) +
sin(radians('52.512452')) * sin(radians(latitude)))) AS distance
FROM fitness_studio f)
SELECT *
FROM container
WHERE distance < 1
ORDER BY distance DESC
Please keep in mind that such a subquery may negatively affect the execution plan, and once your table is large enough, execution speed becomes more important than query awkwardness.
PS: Note that ORDER BY can reference the alias. I suppose that is because ORDER BY doesn't change which rows are selected; it only reorders them. The same goes for GROUP BY.
I would like to select the top 1% of rows; however, I cannot use subqueries to do it. I.e., this won't work:
SELECT * FROM mytbl
WHERE var='value'
ORDER BY id,random()
LIMIT(SELECT (COUNT(*) * 0.01)::integer FROM mytbl)
How would I accomplish the same output without using a subquery with limit?
You can utilize PERCENT_RANK:
WITH cte(ID, var, pc) AS
(
SELECT ID, var, PERCENT_RANK() OVER (ORDER BY random()) AS pc
FROM mytbl
WHERE var = 'value'
)
SELECT *
FROM cte
WHERE pc <= 0.01
ORDER BY id;
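The behaviour of the PERCENT_RANK filter can be checked with a small sketch. It assumes an in-memory SQLite database (version 3.25 or later, for window functions) in place of the original database, and an invented mytbl with 300 rows, 200 of which have var = 'value'.

```python
import sqlite3

# Hypothetical table: ids 0..299, two thirds of the rows have var = 'value'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytbl (id INTEGER, var TEXT)")
conn.executemany(
    "INSERT INTO mytbl VALUES (?, ?)",
    [(i, "value" if i % 3 else "other") for i in range(300)],
)

# PERCENT_RANK is (rank - 1) / (rows - 1) over the random ordering,
# so pc <= 0.01 keeps roughly the first 1% of the shuffled rows.
SAMPLE_SQL = """
WITH cte AS (
    SELECT id, var, PERCENT_RANK() OVER (ORDER BY random()) AS pc
    FROM mytbl
    WHERE var = 'value'
)
SELECT id, var, pc
FROM cte
WHERE pc <= 0.01
ORDER BY id
"""

sample = conn.execute(SAMPLE_SQL).fetchall()
print(sample)
```

Note that the percentage is taken over the rows that pass the WHERE filter (200 here), whereas the LIMIT subquery in the question counts every row in mytbl.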
I solved it with Python using the psycopg2 package:
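# cur is an open psycopg2 cursor
cur.execute("SELECT ROUND(COUNT(id) * 0.01, 0) FROM mytbl")
nrows = int(cur.fetchone()[0])  # 1% of the table's row count
cur.execute(
    """SELECT *
       FROM mytbl
       WHERE var = 'value'
       ORDER BY id, random()
       LIMIT %s""",
    (nrows,),  # query parameters must be passed as a sequence
)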
Perhaps there is a more elegant solution using just SQL, or a more efficient one, but this does exactly what I'm looking for.
If I got it right, you need:
Random 1% sample of all rows,
If some id is within the sample, all rows with the same id must be there too.
The following SQL should do the trick:
with ids as (
select id,
total,
sum(cnt) over (order by max(rnd)) running_total
from (
select id,
count(*) over (partition by id) cnt,
count(*) over () total,
row_number() over(order by random()) rnd
from mytbl
) q
group by id,
cnt,
total
)
select mytbl.*
from mytbl,
ids
where mytbl.id = ids.id
and ids.running_total <= ids.total * 0.01
order by mytbl.id;
I don’t have your data, of course, but I have no trouble using a subquery in the LIMIT clause.
However, the subquery contains only the count(*) part, and I then multiply its result by 0.01:
SELECT * FROM mytbl
WHERE var='value'
ORDER BY id,random()
LIMIT(SELECT count(*) FROM mytbl)*0.01;
I have t-sql as follows:
SELECT (COUNT(Intakes.fk_ClientID) * 100) / (
SELECT count(*)
FROM INTAKES
WHERE Intakes.AdmissionDate >= @StartDate
)
FROM Intakes
WHERE Intakes.fk_ReleasedFromID = '1'
AND Intakes.AdmissionDate >= @StartDate;
I'm trying to get the percentage of clients who have ReleasedFromID = 1 out of the subset of clients whose admission date falls in a certain range, but I get rows of 1's and 0's instead. If I take out the date conditions, I do get a percentage:
SELECT (COUNT(Intakes.fk_ClientID) * 100) / (
SELECT count(*)
FROM INTAKES
)
FROM Intakes
WHERE Intakes.fk_ReleasedFromID = '1';
This works fine: it counts the ClientIDs where ReleasedFromID = 1, multiplies that by 100 and divides by the total number of rows in Intakes. But how do you compute the percentage with the WHERE clauses as above?
After reading the comment from @Anssssss:
SELECT (COUNT(Intakes.fk_ClientID) * 100.0) / (
SELECT count(*)
FROM INTAKES
) 'percentage'
FROM Intakes
WHERE Intakes.fk_ReleasedFromID = '1';
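One way to keep the date filter and still get a percentage is conditional aggregation: filter on the admission date once, then count the matching rows with a CASE expression. The following is a minimal sketch, not the accepted fix from the thread; it runs against an in-memory SQLite database instead of SQL Server, and the sample rows, the integer type of fk_ReleasedFromID and the start date are assumptions.

```python
import sqlite3

# Hypothetical Intakes table with a handful of invented rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Intakes (fk_ClientID INTEGER, fk_ReleasedFromID INTEGER, "
    "AdmissionDate TEXT)"
)
conn.executemany(
    "INSERT INTO Intakes VALUES (?, ?, ?)",
    [
        (1, 1, "2020-01-10"),
        (2, 2, "2020-02-01"),
        (3, 1, "2020-03-15"),
        (4, 3, "2020-04-01"),
        (5, 1, "2019-06-01"),  # admitted before the start date, so excluded
    ],
)

# The WHERE clause restricts both numerator and denominator to the date range;
# 100.0 (not 100) avoids integer division.
PERCENT_SQL = """
SELECT 100.0 * SUM(CASE WHEN fk_ReleasedFromID = 1 THEN 1 ELSE 0 END) / COUNT(*)
FROM Intakes
WHERE AdmissionDate >= ?
"""

(percentage,) = conn.execute(PERCENT_SQL, ("2020-01-01",)).fetchone()
print(percentage)
```

Because the filter is applied once, the numerator and denominator always cover the same subset of rows.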