how do you sum over a related period - tsql

I need to sum values that fall within a related period, e.g. the current month + 2 months, or within a quarter (defined in a related date table).
Is there a way to use DENSE_RANK to partition those custom periods?
select
FiscalMonth
,Value
from table

The SQL will have to do the following:
Join the value table and the period table.
Include the period in the select list and sum the value, grouping by the period.
i.e.
select b.period, sum(a.value)
from table a
inner join period b on a.FiscalMonth between b.StartMonth and b.EndMonth
group by b.period
Note: The join condition will have to be modified based on what data you actually have in the period table.
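On the DENSE_RANK part of the question: once the period table drives the grouping you don't need it, but if you want each detail row to keep its period total instead of collapsing to one row per period, a windowed SUM over the joined period works. A sketch, with hypothetical table names:
select a.FiscalMonth,
       a.Value,
       sum(a.Value) over (partition by b.period) as PeriodTotal
from ValueTable a
inner join PeriodTable b
    on a.FiscalMonth between b.StartMonth and b.EndMonth;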
Hope this helps

Well, if you need values from an X interval by month, you could use something like:
SELECT *
FROM yourTable
WHERE MONTH(some_date) = MONTH(CURRENT_DATE - INTERVAL 1 MONTH) -- could be any X interval
This example (MySQL-style syntax) shows the rows from the month before the current one; the point is that you can massage the query by shifting the interval inside the date functions.
Of course, you could use SUM for the adding.
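As a sketch of the SUM step (column names are hypothetical, and note that matching on MONTH alone ignores the year):
-- MySQL-style, as in the snippet above:
SELECT SUM(value) AS total
FROM yourTable
WHERE MONTH(some_date) = MONTH(CURRENT_DATE - INTERVAL 1 MONTH);
-- T-SQL equivalent of the same date shift:
SELECT SUM(value) AS total
FROM yourTable
WHERE MONTH(some_date) = MONTH(DATEADD(month, -1, GETDATE()));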

Related

Postgres: Storing output of moving average query to a column

I have a table in Postgres 14.2
Table name is test
There are 3 columns: date, high, and five_day_mavg (date is PK if it matters)
I have a select statement which properly calculates a 5 day moving average based on the data in high.
select date,
avg(high) over (order by date rows between 4 preceding and current row) as mavg_calc
from test
It produces the moving-average output I expect.
I have 2 goals:
First, to store the output of the query in five_day_mavg.
Second, to store this in such a way that when I add a new row with data in high, it automatically calculates that value.
The closest I got was:
update test set five_day_mavg = a.mav_calc
from (
select date,
avg(high) over (order by date rows between 4 preceding and current row) as mav_calc
from test
) a;
but all that does is set the value of every row in five_day_mavg to the overall average of high.
Thanks to @a_horse_with_no_name, I played around with the WHERE clause:
update test l
set five_day_mavg = b.five_day_mavg
from (
    select date,
           avg(high) over (order by date rows between 4 preceding and current row) as five_day_mavg
    from test
) b
where l.date = b.date;
A couple of things: I aliased each table. The original table I aliased as l; the derived table created by the window function (the select statement in parentheses) I aliased as b; and I joined them in the WHERE clause on date, which is the primary key.
Also, I was using 'a' as the letter for the alias, and I think that may have contributed to the issue.
Either way, solved now.
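For the second goal (calculating five_day_mavg automatically when a new row is inserted), the UPDATE above only fixes existing rows. A generated column won't work here because Postgres doesn't allow subqueries or window functions in them, so one option is a trigger. A minimal sketch, assuming the table and column names above and that rows are inserted in date order (a back-dated insert won't recompute later rows):
CREATE OR REPLACE FUNCTION set_five_day_mavg() RETURNS trigger AS $$
BEGIN
    -- Average the new high together with the four most recent earlier highs
    SELECT avg(high) INTO NEW.five_day_mavg
    FROM (
        SELECT NEW.high AS high
        UNION ALL
        (SELECT high FROM test WHERE date < NEW.date ORDER BY date DESC LIMIT 4)
    ) recent;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_five_day_mavg
BEFORE INSERT ON test
FOR EACH ROW EXECUTE FUNCTION set_five_day_mavg();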

clickhouse downsample into OHLC time bar intervals

For a table containing a (date, price) time series with prices arriving e.g. every millisecond, how can this be downsampled into open/high/low/close (OHLC) rows over a time interval such as one minute?
While the option with arrays will work, the simplest option here is to combine GROUP BY over time intervals with the min, max, argMin, and argMax aggregate functions.
SELECT
id,
minute,
max(value) AS high,
min(value) AS low,
avg(value) AS avg,
argMin(value, timestamp) AS first,
argMax(value, timestamp) AS last
FROM security
GROUP BY id, toStartOfMinute(timestamp) AS minute
ORDER BY minute
In ClickHouse you solve this kind of problem with arrays. Let's assume a table like the following:
CREATE TABLE security (
timestamp DateTime,
id UInt32,
value Float32
)
ENGINE=MergeTree
PARTITION BY toYYYYMM(timestamp)
ORDER BY (id, timestamp)
You can downsample to one-minute intervals with a query like the following:
SELECT
id, minute, max(value) AS high, min(value) AS low, avg(value) AS avg,
arrayElement(arraySort((x,y)->y,
groupArray(value), groupArray(timestamp)), 1) AS first,
arrayElement(arraySort((x,y)->y,
groupArray(value), groupArray(timestamp)), -1) AS last
FROM security
GROUP BY id, toStartOfMinute(timestamp) AS minute
ORDER BY minute
The trick is to use array functions. Here's how to decode the calls:
groupArray gathers column data within the group into an array.
arraySort sorts the values using the timestamp order. We use a lambda function to provide the timestamp array as the sorting key for the first array of values.
arrayElement allows us to pick the first and last elements respectively.
To keep the example simple I used DateTime for the timestamp, which only has one-second resolution. You can use a UInt64 column to get any precision you want. I added an average to my query to help check results.
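As a side note, recent ClickHouse versions (an assumption about your version: DateTime64 arrived around release 20.1) also let you declare sub-second precision directly, and toStartOfMinute still works on it:
CREATE TABLE security (
    timestamp DateTime64(3),  -- millisecond precision
    id UInt32,
    value Float32
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(timestamp)
ORDER BY (id, timestamp)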

Divide count of Table 1 by count of Table 2 on the same time interval in Tableau

I have two tables with IDs and time stamps. Table 1 has two columns: ID and created_at. Table 2 has two columns: ID and post_date. I'd like to create a chart in Tableau that displays the Number of Records in Table 1 divided by Number of Records in Table 2, by week. How can I achieve this?
One way might be to use Custom SQL like this to create a new data source for your visualization:
SELECT created_table.created_date,
       created_table.created_count,
       posted_table.posted_count
FROM (SELECT TRUNC(created_at) AS created_date, COUNT(*) AS created_count
      FROM Table1
      GROUP BY TRUNC(created_at)) created_table
LEFT JOIN
     (SELECT TRUNC(post_date) AS posted_date, COUNT(*) AS posted_count
      FROM Table2
      GROUP BY TRUNC(post_date)) posted_table
ON created_table.created_date = posted_table.posted_date
This would give you dates and counts from both tables for those dates, which you could group using Tableau's date functions in the visualization. I made created_table the first part of the left join on the assumption that some records would be created and not posted, but you wouldn't have posts without creations. If that isn't the case you will want a different join.
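If you'd rather compute the weekly ratio in the Custom SQL itself instead of in Tableau, a sketch assuming a Postgres-style date_trunc (swap in your database's week-truncation function):
SELECT c.week,
       c.created_count::float / NULLIF(p.posted_count, 0) AS ratio
FROM (SELECT date_trunc('week', created_at) AS week, COUNT(*) AS created_count
      FROM Table1
      GROUP BY 1) c
LEFT JOIN
     (SELECT date_trunc('week', post_date) AS week, COUNT(*) AS posted_count
      FROM Table2
      GROUP BY 1) p
ON c.week = p.week
The NULLIF guards against dividing by zero for weeks with no posts.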

How to get count of timestamps which have an interval bigger than xx seconds to the next row in PostgreSQL

I have a table with 3 columns (Postgres 9.6): serial, timestamp, clock_name.
Usually there is a 1-second difference between each row, but sometimes the interval is bigger.
I'm trying to get the number of occasions where the timestamp interval between 2 rows was bigger than 10 seconds (let's say I limit this to 1000 rows).
I would like to do this in one query (probably a select from a select), but I have no idea how to write such a query; my SQL knowledge is very basic.
Any help will be appreciated.
You can use window functions to retrieve the next record given the current record.
Using ORDER BY in the function to ensure things are in timestamp order, and PARTITION BY to keep the clocks separate, you can find for each row the row that follows it.
WITH links AS
(
SELECT
id, ts, clock, LEAD(ts) OVER (PARTITION BY clock ORDER BY ts) AS next_ts
FROM myTable
)
SELECT * FROM links
WHERE
EXTRACT(EPOCH FROM (next_ts - ts)) > 10
You can then just compare the timestamps.
Window functions: https://www.postgresql.org/docs/current/static/functions-window.html
Or, if you prefer, use a derived table instead of the WITH clause:
SELECT * FROM (
SELECT
id, ts, clock, LEAD(ts) OVER (PARTITION BY clock ORDER BY ts) AS next_ts
FROM myTable
) links
WHERE
EXTRACT(EPOCH FROM (next_ts - ts)) > 10
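Since the question asks for the number of occasions rather than the rows themselves, you can wrap either form in a count:
SELECT count(*)
FROM (
    SELECT
        id, ts, clock, LEAD(ts) OVER (PARTITION BY clock ORDER BY ts) AS next_ts
    FROM myTable
) links
WHERE
    EXTRACT(EPOCH FROM (next_ts - ts)) > 10;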

Getting a count of number of rows matching MAX() value in Postgres

I have a table called 'games' that has a column in it called 'week'. I am trying to find a single query that will give me the maximum value for 'week' along with a count of how many rows in that table have the maximum value for 'week'. I could split it up into two queries:
SELECT MAX(week) FROM games
// store value in a variable $maxWeek
SELECT COUNT(1) FROM games WHERE week = $maxWeek
// store that result in a variable
Is there a way to do this all in one query?
SELECT week, count(*) FROM games GROUP BY week ORDER BY week DESC LIMIT 1;
or
SELECT week, count(*) FROM games WHERE week = (SELECT max(week) FROM games) GROUP BY week;
(may be faster)
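A single-pass alternative, as a sketch using a window function (rank() assigns 1 to every row tied for the latest week; whether it beats the subquery depends on your indexes):
SELECT week, count(*) AS num_rows
FROM (
    SELECT week, rank() OVER (ORDER BY week DESC) AS rnk
    FROM games
) ranked
WHERE rnk = 1
GROUP BY week;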