php + mysql : find data by time end & time start - php4

The event table has two fields, start and end, both in datetime format.
This is the query:
SELECT e.* FROM jos_events e
WHERE 1=1
AND e.start >= 2006-07-10
AND e.end <= 2015-10-21
ORDER BY e.id DESC
I want to find events between the start time and the end time.
How can I do it?

SELECT e.* FROM jos_events e
WHERE e.start <= '2006-07-10 00:00:00'
AND e.end >= '2015-10-21 23:59:59'
ORDER BY e.id DESC
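A small sketch of why the date literals need quotes. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the events table with its start_time and end_time columns is a hypothetical simplification of jos_events. The filter keeps the comparison operators from the question (events that fall entirely inside the range); only the quoting changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Without quotes the "date" is parsed as integer subtraction: 2006 - 7 - 10.
unquoted = conn.execute("SELECT 2006-07-10").fetchone()[0]

# Hypothetical stand-in for jos_events (column names are assumptions).
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, start_time TEXT, end_time TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "2005-01-01 09:00:00", "2005-01-02 09:00:00"),  # before the range
        (2, "2010-03-15 09:00:00", "2010-03-16 09:00:00"),  # inside the range
        (3, "2016-06-01 09:00:00", "2016-06-02 09:00:00"),  # after the range
    ],
)

# Quoted (here: bound) datetime strings compare as dates, not as arithmetic.
inside = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM events "
        "WHERE start_time >= ? AND end_time <= ? "
        "ORDER BY id DESC",
        ("2006-07-10 00:00:00", "2015-10-21 23:59:59"),
    )
]

print(unquoted, inside)
```

Binding the values as parameters, as done here, also sidesteps the quoting problem when the dates come from PHP variables.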

Related

Optimize Query contains join and SubQuery

I need to run this query, but it takes so long that I get a timeout exception.
Would you please help me decrease the execution time of this query, or make it simpler?
Here is my Postgres query:
select
AR1.patient_id,
CONCAT(Ac."firstName", ' ', Ac."lastName") as doctor_full_name,
to_json(Ac.expertise::json->0->'id')::text as expertise_id,
to_json(Ac.expertise::json->0->'title')::text as expertise_title,
AP."phoneNumbers" as mobile,
AC.account_id as account_id,
AC.city_id
from
tb1 as AR1
LEFT JOIN tb2 as AA
on AR1.appointment_id = AA.id
LEFT JOIN tb3 as AC
on AC.account_id = AA.appointment_owner_id
LEFT JOIN tb4 as AP
on AP.id = AR1.patient_id
where AR1.status = 'canceled'
and AR1.updated_at >= '2022-12-30 00:00:00'
and AR1.updated_at < '2022-12-30 23:59:59'
and AP."phoneNumbers" <> ''
and patient_id not in (
select
AR2.patient_id
from
tb1 as AR2
LEFT JOIN tb2 as AA2
on AR2.appointment_id = AA2.id
LEFT JOIN tb3 as AC2
on AC2.account_id = AA2.appointment_owner_id
where AR2.status = 'submited'
and AR2.created_at >= '2022-12-30 00:00:00'
and ( to_json(Ac2.expertise::json->0->'id')::text = to_json(Ac.expertise::json->0->'id')::text or ac2.account_id = ac.account_id )
)
Try creating an index on tb1 to handle the WHERE clauses you use in your outer query.
CREATE INDEX status_updated ON tb1
(status, updated_at, patient_id);
And, create this index to handle your inner query.
CREATE INDEX status_created ON tb1
(status, created_at, patient_id);
These work because the query planner can random-access these BTREE indexes to the first eligible row by status and date, and then sequentially scan the index until the last eligible row.
The comments about avoiding f(column) expressions in WHERE and ON conditions are correct. You want those conditions to be sargable whenever possible.
And, by the way, you want this for a datestamp range
and AR1.updated_at >= '2022-12-30 00:00:00'
and AR1.updated_at < '2022-12-31 00:00:00'
You have
and AR1.updated_at >= '2022-12-30 00:00:00'
and AR1.updated_at < '2022-12-30 23:59:59'
which excludes, rather than includes, rows from the last moment of 2022-12-30. It can be very hard to figure out what went wrong if you exclude a row improperly with a date-range off-by-one error. (Ask how I know this sometime :-)
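To make the off-by-one concrete, here is a minimal sketch in Python rather than SQL. The timestamp late_row is a made-up example value for a row updated in the final second of the day.

```python
from datetime import datetime, timedelta

day_start = datetime(2022, 12, 30)
next_day = day_start + timedelta(days=1)          # 2022-12-31 00:00:00
closed_end = datetime(2022, 12, 30, 23, 59, 59)   # upper bound from the question

# Hypothetical row updated half a second before midnight.
late_row = datetime(2022, 12, 30, 23, 59, 59, 500000)

# Half-open range [day_start, next_day): the row is kept.
in_half_open = day_start <= late_row < next_day

# Range ending at 23:59:59: the row silently drops out.
in_original = day_start <= late_row < closed_end

print(in_half_open, in_original)
```

The half-open form also means consecutive days never overlap and never leave a gap, whatever the timestamp precision is.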

Postgresql want to run a query for each day in an interval

I have a query that I need to run for every day in an interval, for example each day of the last 2 years. I don't have the day info in the table, so I think I need to do it in a loop:
' select distinct on (osu.order_id) osu.order_id, osu.order_state, osu.created_at
from stock_management.order_state_updates osu
where osu.created_at < '2021-01-26 22:00:00'
order by osu.order_id desc, osu.created_at desc) temp
where temp.order_state = 'Filter1';'
Here the date '2021-01-26 22:00:00' would step through each day of the interval. Thank you.
https://docs.google.com/spreadsheets/d/1B2xx-c3wWZYaEN76LxjYhHnlrPRUx4TG8vsAkZ1X_Vs/edit?usp=sharing
You can generate a calendar and join it to your query. I'm not sure this will retrieve the right data, because I don't have sample data or the expected result.
with d as (select * from generate_series ('20210101','20220427',interval '1 day') as date)
select temp.*
from (
select distinct on (osu.order_id) osu.order_id, osu.order_state, osu.created_at::date
from stock_management.order_state_updates osu right join d on osu.created_at::date = d.date
order by osu.order_id desc, osu.created_at desc) temp
where temp.order_state = 'Filter1';

Select count between dates & All time counts in one query Postgres DB

I want to select the count of impressions between two dates and the all-time impression count as well. Can we do this in one query?
This is my query, which only gets the impressions between the dates:
SELECT
robotAds."Ad_ID",
count(robotScraper."adIDAdID") as ad_impression
FROM
robot__ads robotAds
LEFT JOIN robot__session__scraper__data robotScraper
ON robotScraper."adIDAdID" = robotAds."Ad_ID"
LEFT JOIN robot__session_data robotSession
ON robotSession."id" = robotScraper."sessionIDId"
AND robotSession."Session_start" BETWEEN '2020-11-25 00:00:00'
AND '2021-04-01 00:00:00'
GROUP BY
robotAds."Ad_ID"
What do I have to do to get the all-time impression count in this same query?
Thanks
Yes, you can:
SELECT
robotAds."Ad_ID",
count(robotScraper."adIDAdID") filter (where robotSession."Session_start" BETWEEN '2020-11-25 00:00:00' AND '2021-04-01 00:00:00') as ad_impression,
count(robotScraper."adIDAdID") count_alltime
FROM
robot__ads robotAds
LEFT JOIN robot__session__scraper__data robotScraper
ON robotScraper."adIDAdID" = robotAds."Ad_ID"
LEFT JOIN robot__session_data robotSession
ON robotSession."id" = robotScraper."sessionIDId"
GROUP BY
robotAds."Ad_ID"
"Conditional aggregation" should meet this need. Essentially this is using a case expression inside the aggregation function, like this:
SELECT
robotAds."Ad_ID"
, count(CASE
WHEN robotSession."Session_start" BETWEEN '2020-11-25 00:00:00'
AND '2021-04-01 00:00:00'
THEN 1
END) AS range_ad_impression
, count(robotScraper."adIDAdID") AS all_ad_impression
FROM robot__ads robotAds
LEFT JOIN robot__session__scraper__data robotScraper ON robotScraper."adIDAdID" = robotAds."Ad_ID"
LEFT JOIN robot__session_data robotSession ON robotSession."id" = robotScraper."sessionIDId"
GROUP BY robotAds."Ad_ID"
Note: the count() function ignores NULLs. Above I have omitted an explicit instruction to return NULL, but some prefer to spell it out using ELSE, i.e.
,count(CASE
WHEN robotSession."Session_start" BETWEEN '2020-11-25 00:00:00'
AND '2021-04-01 00:00:00'
THEN 1 ELSE NULL
END) AS range_count
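Here is a self-contained sketch of conditional aggregation that can be run without a Postgres server. It uses Python's sqlite3 module, and the single impressions table (ad_id, session_start) is a hypothetical flattening of the three joined tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical flattened table: one row per impression.
conn.execute("CREATE TABLE impressions (ad_id INTEGER, session_start TEXT)")
conn.executemany(
    "INSERT INTO impressions VALUES (?, ?)",
    [
        (1, "2020-10-01 12:00:00"),  # ad 1, before the range
        (1, "2020-12-01 12:00:00"),  # ad 1, inside the range
        (1, "2021-02-01 12:00:00"),  # ad 1, inside the range
        (2, "2019-05-01 12:00:00"),  # ad 2, before the range
    ],
)

# CASE yields NULL outside the range, and count() skips NULLs,
# so one pass over the rows gives both counts.
rows = conn.execute(
    """
    SELECT ad_id,
           count(CASE WHEN session_start BETWEEN ? AND ? THEN 1 END) AS range_count,
           count(*) AS all_count
    FROM impressions
    GROUP BY ad_id
    ORDER BY ad_id
    """,
    ("2020-11-25 00:00:00", "2021-04-01 00:00:00"),
).fetchall()

print(rows)
```

Ad 2 still appears with a range count of 0, which is the behaviour you lose if the date condition is moved into a WHERE clause.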

postgresql complex query joining same table

I would like to get those customers from a table 'transactions' which haven't created any transactions in the last 6 Months.
Table:
'transactions'
id, email, state, paid_at
To visualise:
|------------------------ time period with all transactions --------------------|
|-- period before month transactions > 0) ---|---- curr month transactions = 0 -|
I guess this is doable with a join showing only those that didn't have any transactions on the right side.
Example:
Month = November
The conditions for the left side should be:
COUNT(l.id) > 0
l.paid_at < '2013-05-01 00:00:00'
Conditions for the right side:
COUNT(r.id) = 0
r.paid_at BETWEEN '2013-05-01 00:00:00' AND '2013-11-30 23:59:59'
Is join the right approach?
Answer
SELECT
C.email
FROM
transactions C
WHERE
(
C.email NOT IN (
SELECT DISTINCT
email
FROM
transactions
WHERE
paid_at >= '2013-05-01 00:00:00'
AND paid_at <= '2013-11-30 23:59:59'
)
AND
C.email IN (
SELECT DISTINCT
email
FROM
transactions
WHERE
paid_at <= '2013-05-01 00:00:00'
)
)
AND c.paid_at <= '2013-11-30 23:59:59'
There are a couple of ways you could do this. Use a subquery to get distinct customer ids for transactions in the last 6 months, and then select customers where their id isn't in the subquery.
select c.id, c.name
from customer c
where c.id not in (select distinct customer_id from transaction where dt between <start> and <end>);
Or, use a left join from customer to transaction, and filter the results to have transaction id null. A left join includes all rows from the left-hand table, even when there are no matching rows in the right-hand table. Explanation of left joins here: http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
select c.id, c.name
from customer c
left join transaction t on c.id = t.customer_id
and t.dt between <start> and <end>
where t.id is null;
The left join approach is likely to be faster.
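A quick way to convince yourself that the two forms select the same customers is to run both on a toy dataset. This sketch uses Python's sqlite3 module; the table names customers and purchases and their columns are hypothetical (TRANSACTION is a keyword in SQLite, so the name was changed), and it says nothing about which form is faster on a real Postgres table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE purchases (id INTEGER PRIMARY KEY, customer_id INTEGER, dt TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")]
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        (1, 1, "2013-06-15 10:00:00"),  # Ann: inside the 6-month window
        (2, 2, "2013-01-10 10:00:00"),  # Bob: only before the window
    ],                                  # Cy: no purchases at all
)
window = ("2013-05-01 00:00:00", "2013-11-30 23:59:59")

# Form 1: NOT IN with a subquery.
not_in_ids = [r[0] for r in conn.execute(
    "SELECT c.id FROM customers c "
    "WHERE c.id NOT IN (SELECT customer_id FROM purchases WHERE dt BETWEEN ? AND ?) "
    "ORDER BY c.id",
    window,
)]

# Form 2: LEFT JOIN restricted to the window, keeping rows with no match.
left_join_ids = [r[0] for r in conn.execute(
    "SELECT c.id FROM customers c "
    "LEFT JOIN purchases p ON c.id = p.customer_id AND p.dt BETWEEN ? AND ? "
    "WHERE p.id IS NULL "
    "ORDER BY c.id",
    window,
)]

print(not_in_ids, left_join_ids)
```

Note that the date condition sits in the ON clause of the left join. Moving it to WHERE would turn the query back into an inner join and drop the customers you are looking for.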

Postgresql SQL GROUP BY time interval with arbitrary accuracy (down to milli seconds)

I have my measurement data stored into the following structure:
CREATE TABLE measurements(
measured_at TIMESTAMPTZ,
val INTEGER
);
I already know that using
(a) date_trunc('hour',measured_at)
AND
(b) generate_series
I would be able to aggregate my data by:
microseconds,
milliseconds
.
.
.
But is it possible to aggregate the data by 5 minutes or let's say an arbitrary amount of seconds? Is it possible to aggregate measured data by an arbitrary multiple of seconds?
I need the data aggregated by different time resolutions to feed them into a FFT or an AR-Model in order to see possible seasonalities.
You can generate a table of "buckets" by adding intervals created by generate_series(). This SQL statement will generate a table of five-minute buckets for the first day (the value of min(measured_at)) in your data.
select
(select min(measured_at)::date from measurements) + ( n || ' minutes')::interval start_time,
(select min(measured_at)::date from measurements) + ((n+5) || ' minutes')::interval end_time
from generate_series(0, (24*60), 5) n
Wrap that statement in a common table expression, and you can join and group on it as if it were a base table.
with five_min_intervals as (
select
(select min(measured_at)::date from measurements) + ( n || ' minutes')::interval start_time,
(select min(measured_at)::date from measurements) + ((n+5) || ' minutes')::interval end_time
from generate_series(0, (24*60), 5) n
)
select f.start_time, f.end_time, avg(m.val) avg_val
from measurements m
right join five_min_intervals f
on m.measured_at >= f.start_time and m.measured_at < f.end_time
group by f.start_time, f.end_time
order by f.start_time
Grouping by an arbitrary number of seconds is similar--use date_trunc().
A more general use of generate_series() lets you avoid guessing the upper limit for five-minute buckets. In practice, you'd probably build this as a view or a function. You might get better performance from a base table.
select
(select min(measured_at)::date from measurements) + ( n || ' minutes')::interval start_time,
(select min(measured_at)::date from measurements) + ((n+5) || ' minutes')::interval end_time
from generate_series(0, ((select max(measured_at)::date - min(measured_at)::date from measurements) + 1)*24*60, 5) n;
Catcall has a great answer. My example of using it demonstrates having fixed buckets - in this case 30 minute intervals starting at midnight. It also shows that there can be one extra bucket generated in Catcall's first version and how to eliminate it. I wanted exactly 48 buckets in a day. In my problem, observations have separate date and time columns and I want to average the observations within a 30 minute period across the month for a number of different services.
with intervals as (
select
(n||' minutes')::interval as start_time,
((n+30)|| ' minutes')::interval as end_time
from generate_series(0, (23*60+30), 30) n
)
select i.start_time, o.service, avg(o.o)
from
observations o right join intervals i
on o.time >= i.start_time and o.time < i.end_time
where o.date between '2013-01-01' and '2013-01-31'
group by i.start_time, i.end_time, o.service
order by i.start_time
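The extra bucket is easy to check by counting. generate_series() includes both endpoints, so this Python sketch mimics it with range() and an inclusive upper bound; bucket_starts is a hypothetical helper written only for this illustration.

```python
def bucket_starts(last_start_minute, step_minutes):
    """Mimic generate_series(0, last_start_minute, step_minutes): both ends inclusive."""
    return list(range(0, last_start_minute + 1, step_minutes))


# Upper bound of 24*60 minutes, as in the five-minute example:
# the final value, 1440, starts a bucket at midnight of the NEXT day.
five_min = bucket_starts(24 * 60, 5)

# Upper bound of 23*60+30 minutes, as in the 30-minute example:
# the final bucket starts at 23:30 and ends at midnight.
half_hour = bucket_starts(23 * 60 + 30, 30)

print(len(five_min), five_min[-1], len(half_hour), half_hour[-1])
```

So stopping the series one step short of 24*60 gives exactly the number of buckets that fit in a day.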
How about
SELECT MIN(val),
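floor(EXTRACT(epoch FROM measured_at) / EXTRACT(epoch FROM INTERVAL '5 min')) AS bucket
FROM measurements
GROUP BY bucket
where '5 min' can be any expression supported by INTERVAL. The floor() call matters: without it the quotient is fractional, so nearly every row would end up in its own group.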
The following will give you buckets of any size, even if they don't align well with a nice minute/hour/whatever boundary. The value "300" is for a 5-minute grouping, but any value can be substituted:
select measured_at,
val,
(date_trunc('seconds', (measured_at - timestamptz 'epoch') / 300) * 300 + timestamptz 'epoch') as aligned_measured_at
from measurements;
You can then use whatever aggregate you need around "val", and use "group by aligned_measured_at" as required.
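The arithmetic behind that expression is: take the time elapsed since the epoch, divide by the bucket size, drop the remainder, multiply back. Here is the same idea as a Python sketch. The align function and the sample readings are made up for illustration, and it assumes timestamps after 1970 so that truncating and flooring agree.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def align(ts, bucket_seconds):
    """Round ts down to the start of its bucket, counting buckets from the epoch."""
    bucket = timedelta(seconds=bucket_seconds)
    whole_buckets = (ts - EPOCH) // bucket  # timedelta // timedelta gives an int
    return EPOCH + whole_buckets * bucket


# Hypothetical (measured_at, val) readings.
readings = [
    (datetime(2021, 1, 26, 22, 7, 42, tzinfo=timezone.utc), 10),
    (datetime(2021, 1, 26, 22, 9, 59, tzinfo=timezone.utc), 30),
    (datetime(2021, 1, 26, 22, 10, 0, tzinfo=timezone.utc), 50),
]

# Equivalent of "group by aligned_measured_at" with sum(val).
totals = defaultdict(int)
for measured_at, val in readings:
    totals[align(measured_at, 300)] += val

for start, total in sorted(totals.items()):
    print(start.isoformat(), total)
```

The first two readings share the 22:05 bucket and the third opens the 22:10 bucket, because each bucket is half-open at its upper end.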
This is based on Mike Sherrill's answer, except that it uses timestamp intervals instead of separate start/end columns.
with intervals as (
select tstzrange(s, s + '5 minutes') das_interval
from (select generate_series(min(lower(time_range)), max(upper(time_range)), '5 minutes') s
from your_table) x)
select das_interval, your_table.*
from your_table
right join intervals on time_range && das_interval
order by das_interval;
From PostgreSQL v14 on, you can use the date_bin function for that:
SELECT date_bin(
INTERVAL '5 minutes',
measured_at,
TIMESTAMPTZ '2000-01-01'
),
sum(val)
FROM measurements
GROUP BY 1;
I wanted to look at the past 24 hours of data and count things in hourly increments. I started with Catcall's solution, which is pretty slick. It's bound to the data, though, rather than just what's happened in the past 24 hours. So I refactored and ended up with something pretty close to Julian's solution, but with more CTEs. So it's sort of a marriage of the two answers.
WITH interval_query AS (
SELECT (ts ||' hour')::INTERVAL AS hour_interval
FROM generate_series(0,23) AS ts
), time_series AS (
SELECT date_trunc('hour', now()) + INTERVAL '60 min' * ROUND(date_part('minute', now()) / 60.0) - interval_query.hour_interval AS start_time
FROM interval_query
), time_intervals AS (
SELECT start_time, start_time + '1 hour'::INTERVAL AS end_time
FROM time_series ORDER BY start_time
), reading_counts AS (
SELECT f.start_time, f.end_time, br.minor, count(br.id) readings
FROM beacon_readings br
RIGHT JOIN time_intervals f
ON br.reading_timestamp >= f.start_time AND br.reading_timestamp < f.end_time AND br.major = 4
GROUP BY f.start_time, f.end_time, br.minor
ORDER BY f.start_time, br.minor
)
SELECT * FROM reading_counts
Note that any additional limiting I wanted in the final query needed to be done in the RIGHT JOIN. I'm not suggesting this is necessarily the best (or even a good approach), but it is something I'm running with (at least at the moment) in a dashboard.
The Timescale extension for PostgreSQL gives the ability to group by arbitrary time intervals. The function is called time_bucket() and has the same syntax as the date_trunc() function, but takes an interval instead of a time precision as its first parameter. See its API docs for details. This is an example:
SELECT
time_bucket('5 minutes', observation_time) as bucket,
device_id,
avg(metric) as metric_avg,
max(metric) - min(metric) as metric_spread
FROM
device_readings
GROUP BY bucket, device_id;
You may also take a look at the continuous aggregate views if you want the 'grouped by an interval' views be updated automatically with new ingested data and if you want to query these views on a frequent basis. This can save you a lot of resources and will make your queries a lot faster.
I've taken a synthesis of all the above to try and come up with something slightly easier to use:
create or replace function interval_generator(start_ts timestamp with TIME ZONE, end_ts timestamp with TIME ZONE, round_interval INTERVAL)
returns TABLE(start_time timestamp with TIME ZONE, end_time timestamp with TIME ZONE) as $$
BEGIN
return query
SELECT
(n) start_time,
(n + round_interval) end_time
FROM generate_series(date_trunc('minute', start_ts), end_ts, round_interval) n;
END
$$
LANGUAGE 'plpgsql';
This function is a timestamp abstraction of Mike's answer, which (IMO) makes things a little cleaner, especially if you're generating queries on the client end.
Also using an inner join gets rid of the sea of NULLs that appeared previously.
with intervals as (select * from interval_generator(NOW() - INTERVAL '24 hours' , NOW(), '30 seconds'::INTERVAL))
select f.start_time, m.session_id, m.metric, min(m.value) min_val, avg(m.value) avg_val, max(m.value) max_val
from ts_combined as m
inner JOIN intervals f
on m.time >= f.start_time and m.time < f.end_time
GROUP BY f.start_time, f.end_time, m.metric, m.session_id
ORDER BY f.start_time desc
(Also for my purposes I added in a few more aggregation fields)
Perhaps, you can extract(epoch from measured_at) and go from that?