How to group data into 10 second intevals? - tsql

I have a table with 3 columns.
id bigint,
rate decimal(18,2),
dateCreated datetime
I need the results to be grouped by 10 second interval.
Something like :
2019-12-15:00 13:20:10 avg(rate)
2019-12-15:00 13:20:20 avg(rate)
2019-12-15:00 13:20:30 avg(rate)
etc....
I am using SQL server 2017.
How do I accomplish that ?
Thank you in advance.

Perhaps truncating the datetime
Example
Select TrimTime = convert(varchar(18),dateCreated,120)+'0'
,AvgRate = avg(Rate)
From YourTable
Group By convert(varchar(18),dateCreated,120)

Imagine you have a time table... let's call it TimeHacks...and it's divided into 10 second time spans. Maybe it looks something like
create table TimeHacks
(
IntervalNumber bigint,
DateTimeAtInterval datetime
)
Then it would be easy to join it to your table:
select
t.IntervalNumber,
avg( y.rate )
from
dbo.TimeHacks i
inner join
dbo.YourTable y
on
datediff( s, y.DateCreated, i.DateTimeAtInterval ) between 0 and 9
group by
i.IntervalNumber
...and so the question is...do you make a table that has all those time hacks in it, or is there a better way?
So...now let's say that TimeHacks table isn't a table at all, but a SQL Server table valued function that takes 3 arguments, a start datetime, an end datetime and a number of seconds between hacks.
You could say something like:
select
i.IntervalNumber,
avg( y.rate )
from
dbo.TimeHacks( '2019-1-15 10:10:10', '2019-1-16 10:10:10', 10 )
inner join
dbo.YourTable y
on
datediff( s, y.DateCreated, i.DateTimeAtInterval ) between 0 and 9
group by
i.id
Hmmm...but how to make that TimeHacks function? Let's imagine that we also have a table valued function that can generate arbitrary sequences of integers from any start to any end... like dbo.FromTo( start bigint, end bigint ). Then, our TimeHacks could be something like:
create function dbo.TimeHacks( #start datetime, #end datetime, #seconds int ) returns table as return
select
ft.x as IntervalNumber,
dateadd( s, ft.x * #seconds, #start ) DateTimeAtInterval
from
dbo.FromTo( 0, datediff( s, #start, #end ) / #seconds ) ft
...and finally...how would we generate a sequence of numbers that forms the basis for our dbo.TimeHacks function? There's a number of answers to that question...some good ones here. I learned a lot from those answers. You might, too. Anyway, I've got a favorite version I've tweaked out of those answers. It makes a sequence of numbers without doing any I/O and can produce any sequence in order in the range of integers...and it's screaming fast:
create function dbo.FromTo( #start bigint, #end bigint ) returns table as return
with
x0 as (select x from (values (0),(0x00000001),(0x00000002),(0x00000003),(0x00000004),(0x00000005),(0x00000006),(0x00000007),(0x00000008),(0x00000009),(0x0000000A),(0x0000000B),(0x0000000C),(0x0000000D),(0x0000000E),(0x0000000F)) as x0(x)),
x1 as (select x from (values (0),(0x00000010),(0x00000020),(0x00000030),(0x00000040),(0x00000050),(0x00000060),(0x00000070),(0x00000080),(0x00000090),(0x000000A0),(0x000000B0),(0x000000C0),(0x000000D0),(0x000000E0),(0x000000F0)) as x1(x)),
x2 as (select x from (values (0),(0x00000100),(0x00000200),(0x00000300),(0x00000400),(0x00000500),(0x00000600),(0x00000700),(0x00000800),(0x00000900),(0x00000A00),(0x00000B00),(0x00000C00),(0x00000D00),(0x00000E00),(0x00000F00)) as x2(x)),
x3 as (select x from (values (0),(0x00001000),(0x00002000),(0x00003000),(0x00004000),(0x00005000),(0x00006000),(0x00007000),(0x00008000),(0x00009000),(0x0000A000),(0x0000B000),(0x0000C000),(0x0000D000),(0x0000E000),(0x0000F000)) as x3(x)),
x4 as (select x from (values (0),(0x00010000),(0x00020000),(0x00030000),(0x00040000),(0x00050000),(0x00060000),(0x00070000),(0x00080000),(0x00090000),(0x000A0000),(0x000B0000),(0x000C0000),(0x000D0000),(0x000E0000),(0x000F0000)) as x4(x)),
x5 as (select x from (values (0),(0x00100000),(0x00200000),(0x00300000),(0x00400000),(0x00500000),(0x00600000),(0x00700000),(0x00800000),(0x00900000),(0x00A00000),(0x00B00000),(0x00C00000),(0x00D00000),(0x00E00000),(0x00F00000)) as x5(x)),
x6 as (select x from (values (0),(0x01000000),(0x02000000),(0x03000000),(0x04000000),(0x05000000),(0x06000000),(0x07000000),(0x08000000),(0x09000000),(0x0A000000),(0x0B000000),(0x0C000000),(0x0D000000),(0x0E000000),(0x0F000000)) as x6(x)),
x7 as (select x from (values (0),(0x10000000),(0x20000000),(0x30000000),(0x40000000),(0x50000000),(0x60000000),(0x70000000)) as x7(x))
select s.x
from
(
select
x7.x
| x6.x
| x5.x
| x4.x
| x3.x
| x2.x
| x1.x
| x0.x
+ #start
x
from
x7
inner remote join x6 on x6.x <= #end - #start and x7.x <= #end - #start
inner remote join x5 on x5.x <= #end - #start
inner remote join x4 on x4.x <= #end - #start
inner remote join x3 on x3.x <= #end - #start
inner remote join x2 on x2.x <= #end - #start
inner remote join x1 on x1.x <= #end - #start
inner remote join x0 on x0.x <= #end - #start
) s
where #end >= s.x
The way this function works borders on magic. It's based on the question I referenced, and in particular it builds on the answers by Brian Pressler and
mechoid.

This is actually a very good and somewhat complex question. I'm sorry you were treated so. Take a look at this Stack Overflow post for a possible way of attacking it and good luck.

Related

Express Nearest Neighbor Join in Postgresql?

I have two tables Q and T, both containing a column of float numbers.
What I want to do is, for each number in Q, I want to find a number in T that has the smallest distance to it.
For example, for T={1,7,9} and Q={2,6,10}, I want to return Q,T pairs as {(2,1),(6,7),(10,9)}.
How should I express this query with SQL?
In addition, is that possible to accelerate this join by index, e.g. add an operator class which bind "FOR ORDER BY <->" with fabs calculation?
create table t (val_t integer);
create table q (val_q integer);
insert into t values (1),(7),(9);
insert into q values (2),(6),(10);
Start with a query that cross joins the two tables and adds a rank based on the difference:
SELECT val_q, val_t, rank() OVER (PARTITION BY val_q ORDER BY abs(val_t - val_q))
FROM t
JOIN q ON true ;
Use this query in a cte or subquery and filter by rank:
WITH src AS(
SELECT val_q, val_t, rank() OVER (PARTITION BY val_q ORDER BY abs(val_t - val_q))
FROM t
JOIN q ON true )
SELECT val_q, val_t FROM src
WHERE rank = 1;
val_q | val_t
-------+-------
2 | 1
6 | 7
10 | 9
See https://www.postgresql.org/docs/12/tutorial-window.html
Given this schema:
create table t (tn float);
insert into t values (1), (7), (9);
create table q (qn float);
insert into q values (2), (6), (10);
DISTINCT ON is the most straightforward way:
select distinct on (qn) qn, tn
from q
cross join t
order by qn, abs(qn - tn);
Exploiting a numeric range may perform better depending on your data sizes. If performance is an issue, then you can create an actual temp table for the range_tn CTE and put a gist index on it:
with all_tn as (
select tn
from t
union select null
), range_tn as (
select numrange(tn::numeric, (lead(tn) over w)::numeric, '[]') as tr
from all_tn
window w as (order by tn nulls first)
)
select qn,
case
when lower_inf(tr) then upper(tr)
when upper_inf(tr) then lower(tr)
when 2 * qn - lower(tr) - upper(tr) > 0 then upper(tr)
else lower(tr)
end as tn
from q
join range_tn
on qn::numeric <# tr;
Fiddle here

How to calculate the no of iso weeks in a year in t-sql?

I need a simple query to calculate the no of iso weeks in any given year?
I think this should do the trick.
DECLARE #year smallint = 2015;
SELECT
TheYear = #year,
ISOWeeks= MAX(DATEPART(ISOWK,DATEADD(DD,N,CAST(CAST(#year AS char(4))+'1223' AS date))))
FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8)) t(N);
You could include this logic in a function like this:
CREATE FUNCTION dbo.CalculateISOWeeks(#year smallint)
RETURNS TABLE WITH SCHEMABINDING AS RETURN
SELECT ISOWeeks =
MAX(DATEPART(ISOWK,DATEADD(DD,N,CAST(CAST(#year AS char(4))+'1223' AS date))))
FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8)) t(N);
and use it like this:
SELECT ISOWeeks FROM dbo.CalculateISOWeeks(2014);
or better yet... because we're calculating a static value, why not just pop those values into a table then index it like this:
SELECT
Yr = ISNULL(CAST(Yr AS smallint),0),
ISOWeeks = ISNULL(CAST(ISOWeeks AS tinyint),0)
INTO dbo.ISOCounts
FROM
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))+1949
FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) a(x),
(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) b(x)
) Years(Yr)
CROSS APPLY dbo.CalculateISOWeeks(Yr+1950);
CREATE UNIQUE CLUSTERED INDEX uci_ISOCounts ON dbo.ISOCounts(Yr);
Now whenever you need to calculate the number of ISO weeks for a given year you can retrieve the pre-calculated value from your table via an index seek.
SELECT * FROM dbo.ISOCounts WHERE yr = 2014;
Results:
Yr ISOWeeks
------ --------
2014 53

Cumulative sum with group by and join

I'm a little struggled with finding a clean way to do this. Assume that I have the following records in my table named Records:
|Name| |InsertDate| |Size|
john 30.06.2015 1
john 10.01.2016 10
john 12.01.2016 100
john 05.03.2016 1000
doe 01.01.2016 1
How do I get the records for year of 2016 and month is equal to or less than 3 grouped by month(even that month does not exists e.g. month 2 in this case) with cumulative sum of Size including that month? I want to get the result as the following:
|Name| |Month| |Size|
john 1 111
john 2 111
john 3 1111
doe 1 1
As other commenters have already stated, you simply need a table with dates in that you can join from to give you the dates that your source table does not have records for:
-- Build the source data table.
declare #t table(Name nvarchar(10)
,InsertDate date
,Size int
);
insert into #t values
('john','20150630',1 )
,('john','20160110',10 )
,('john','20160112',100 )
,('john','20160305',1000)
,('doe' ,'20160101',1 );
-- Specify the year you want to search for by storing the first day here.
declare #year date = '20160101';
-- This derived table builds a set of dates that you can join from.
-- LEFT JOINing from here is what gives you rows for months without records in your source data.
with Dates
as
(
select #year as MonthStart
,dateadd(day,-1,dateadd(month,1,#year)) as MonthEnd
union all
select dateadd(month,1,MonthStart)
,dateadd(day,-1,dateadd(month,2,MonthStart))
from Dates
where dateadd(month,1,MonthStart) < dateadd(yyyy,1,#year)
)
select t.Name
,d.MonthStart
,sum(t.Size) as Size
from Dates d
left join #t t
on(t.InsertDate <= d.MonthEnd)
where d.MonthStart <= '20160301' -- Without knowing what your logic is for specifying values only up to March, I have left this part for you to automate.
group by t.Name
,d.MonthStart
order by t.Name
,d.MonthStart;
If you have a static date reference table in your database, you don't need to do the derived table creation and can just do:
select d.DateValue
,<Other columns>
from DatesReferenceTable d
left join <Other Tables> o
on(d.DateValue = o.AnyDateColumn)
etc
Here's another approach that utilizes a tally table (aka numbers table) to create the date table. Note my comments.
-- Build the source data table.
declare #t table(Name nvarchar(10), InsertDate date, Size int);
insert into #t values
('john','20150630',1 )
,('john','20160110',10 )
,('john','20160112',100 )
,('john','20160305',1000)
,('doe' ,'20160101',1 );
-- A year is fine, don't need a date data type
declare #year smallint = 2016;
WITH -- dummy rows for a tally table:
E AS (SELECT E FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t(e)),
dateRange(totalDays, mn, mx) AS -- Get the range and number of months to create
(
SELECT DATEDIFF(MONTH, MIN(InsertDate), MAX(InsertDate)), MIN(InsertDate), MAX(InsertDate)
FROM #t
),
iTally(N) AS -- Tally Oh! Create an inline Tally (aka numbers) table starting with 0
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1))-1
FROM E a CROSS JOIN E b CROSS JOIN E c CROSS JOIN E d
),
RunningTotal AS -- perform a running total by year/month for each person (Name)
(
SELECT
yr = YEAR(DATEADD(MONTH, n, mn)),
mo = MONTH(DATEADD(MONTH, n, mn)),
Name,
Size = SUM(Size) OVER
(PARTITION BY Name ORDER BY YEAR(DATEADD(MONTH, n, mn)), MONTH(DATEADD(MONTH, n, mn)))
FROM iTally
CROSS JOIN dateRange
LEFT JOIN #t ON MONTH(InsertDate) = MONTH(DATEADD(MONTH, n, mn))
WHERE N <= totalDays
) -- Final output will only return rows where the year matches #year:
SELECT
name = ISNULL(name, LAG(Name, 1) OVER (ORDER BY yr, mo)),
yr, mo,
size = ISNULL(Size, LAG(Size, 1) OVER (ORDER BY yr, mo))
FROM RunningTotal
WHERE yr = #year
GROUP BY yr, mo, name, size;
Results:
name yr mo size
---------- ----------- ----------- -----------
doe 2016 1 1
john 2016 1 111
john 2016 2 111
john 2016 3 1111

query for a range of records in result

I am wondering if there is some easy way, a function, or other method to return data from a query with the following results.
I have a SQL Express DB 2008 R2, a table that contains numerical data in a given column, say col T.
I am given a value X in code and would like to return up to three records. The record where col T equals my value X, and the record before and after, and nothing else. The sort is done on col T. The record before may be beginning of file and therefore not exist, likewise, if X equals the last record then the record after would be non existent, end of file/table.
The value of X may not exist in the table.
This I think is similar to get a range of results in numerical order.
Any help or direction in solving this would be greatly appreciated.
Thanks again,
It might not be the most optimal solution, but:
SELECT T
FROM theTable
WHERE T = X
UNION ALL
SELECT *
FROM
(
SELECT TOP 1 T
FROM theTable
WHERE T > X
ORDER BY T
) blah
UNION ALL
SELECT *
FROM
(
SELECT TOP 1 T
FROM theTable
WHERE T < X
ORDER BY T DESC
) blah2
DECLARE #x int = 100
;WITH t as
(
select ROW_NUMBER() OVER (ORDER BY T ASC) AS row_nm,*
from YourTable
)
, t1 as
(
select *
from t
WHERE T = #x
)
select *
from t
CROSS APPLY t1
WHERE t.row_nm BETWEEN t1.row_nm -1 and t1.row_nm + 1

Getting the minimum of two values in SQL

I have two variables, one is called PaidThisMonth, and the other is called OwedPast. They are both results of some subqueries in SQL. How can I select the smaller of the two and return it as a value titled PaidForPast?
The MIN function works on columns, not variables.
SQL Server 2012 and 2014 supports IIF(cont,true,false) function. Thus for minimal selection you can use it like
SELECT IIF(first>second, second, first) the_minimal FROM table
While IIF is just a shorthand for writing CASE...WHEN...ELSE, it's easier to write.
The solutions using CASE, IIF, and UDF are adequate, but impractical when extending the problem to the general case using more than 2 comparison values. The generalized
solution in SQL Server 2008+ utilizes a strange application of the VALUES clause:
SELECT
PaidForPast=(SELECT MIN(x) FROM (VALUES (PaidThisMonth),(OwedPast)) AS value(x))
Credit due to this website:
http://sqlblog.com/blogs/jamie_thomson/archive/2012/01/20/use-values-clause-to-get-the-maximum-value-from-some-columns-sql-server-t-sql.aspx
Use Case:
Select Case When #PaidThisMonth < #OwedPast
Then #PaidThisMonth Else #OwedPast End PaidForPast
As Inline table valued UDF
CREATE FUNCTION Minimum
(#Param1 Integer, #Param2 Integer)
Returns Table As
Return(Select Case When #Param1 < #Param2
Then #Param1 Else #Param2 End MinValue)
Usage:
Select MinValue as PaidforPast
From dbo.Minimum(#PaidThisMonth, #OwedPast)
ADDENDUM:
This is probably best for when addressing only two possible values, if there are more than two, consider Craig's answer using Values clause.
For SQL Server 2022+ (or MySQL or PostgreSQL 9.3+), a better way is to use the LEAST and GREATEST functions.
SELECT GREATEST(A.date0, B.date0) AS date0,
LEAST(A.date1, B.date1, B.date2) AS date1
FROM A, B
WHERE B.x = A.x
With:
GREATEST(value [, ...]) : Returns the largest (maximum-valued) argument from values provided
LEAST(value [, ...]) Returns the smallest (minimum-valued) argument from values provided
Documentation links :
MySQL http://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html
Postgres https://www.postgresql.org/docs/current/functions-conditional.html
SQL Server https://learn.microsoft.com/en-us/sql/t-sql/functions/logical-functions-least-transact-sql
I just had a situation where I had to find the max of 4 complex selects within an update.
With this approach you can have as many as you like!
You can also replace the numbers with aditional selects
select max(x)
from (
select 1 as 'x' union
select 4 as 'x' union
select 3 as 'x' union
select 2 as 'x'
) a
More complex usage
#answer = select Max(x)
from (
select #NumberA as 'x' union
select #NumberB as 'x' union
select #NumberC as 'x' union
select (
Select Max(score) from TopScores
) as 'x'
) a
I'm sure a UDF has better performance.
Here is a trick if you want to calculate maximum(field, 0):
SELECT (ABS(field) + field)/2 FROM Table
returns 0 if field is negative, else, return field.
Use a CASE statement.
Example B in this page should be close to what you're trying to do:
http://msdn.microsoft.com/en-us/library/ms181765.aspx
Here's the code from the page:
USE AdventureWorks;
GO
SELECT ProductNumber, Name, 'Price Range' =
CASE
WHEN ListPrice = 0 THEN 'Mfg item - not for resale'
WHEN ListPrice < 50 THEN 'Under $50'
WHEN ListPrice >= 50 and ListPrice < 250 THEN 'Under $250'
WHEN ListPrice >= 250 and ListPrice < 1000 THEN 'Under $1000'
ELSE 'Over $1000'
END
FROM Production.Product
ORDER BY ProductNumber ;
GO
This works for up to 5 dates and handles nulls. Just couldn't get it to work as an Inline function.
CREATE FUNCTION dbo.MinDate(#Date1 datetime = Null,
#Date2 datetime = Null,
#Date3 datetime = Null,
#Date4 datetime = Null,
#Date5 datetime = Null)
RETURNS Datetime AS
BEGIN
--USAGE select dbo.MinDate('20120405',null,null,'20110305',null)
DECLARE #Output datetime;
WITH Datelist_CTE(DT)
AS (
SELECT #Date1 AS DT WHERE #Date1 is not NULL UNION
SELECT #Date2 AS DT WHERE #Date2 is not NULL UNION
SELECT #Date3 AS DT WHERE #Date3 is not NULL UNION
SELECT #Date4 AS DT WHERE #Date4 is not NULL UNION
SELECT #Date5 AS DT WHERE #Date5 is not NULL
)
Select #Output=Min(DT) FROM Datelist_CTE;
RETURN #Output;
END;
Building on the brilliant logic / code from mathematix and scottyc, I submit:
DECLARE #a INT, #b INT, #c INT = 0;
WHILE #c < 100
BEGIN
SET #c += 1;
SET #a = ROUND(RAND()*100,0)-50;
SET #b = ROUND(RAND()*100,0)-50;
SELECT #a AS a, #b AS b,
#a - ( ABS(#a-#b) + (#a-#b) ) / 2 AS MINab,
#a + ( ABS(#b-#a) + (#b-#a) ) / 2 AS MAXab,
CASE WHEN (#a <= #b AND #a = #a - ( ABS(#a-#b) + (#a-#b) ) / 2)
OR (#a >= #b AND #a = #a + ( ABS(#b-#a) + (#b-#a) ) / 2)
THEN 'Success' ELSE 'Failure' END AS Status;
END;
Although the jump from scottyc's MIN function to the MAX function should have been obvious to me, it wasn't, so I've solved for it and included it here: SELECT #a + ( ABS(#b-#a) + (#b-#a) ) / 2. The randomly generated numbers, while not proof, should at least convince skeptics that both formulae are correct.
Use a temp table to insert the range of values, then select the min/max of the temp table from within a stored procedure or UDF. This is a basic construct, so feel free to revise as needed.
For example:
CREATE PROCEDURE GetMinSpeed() AS
BEGIN
CREATE TABLE #speed (Driver NVARCHAR(10), SPEED INT);
'
' Insert any number of data you need to sort and pull from
'
INSERT INTO #speed (N'Petty', 165)
INSERT INTO #speed (N'Earnhardt', 172)
INSERT INTO #speed (N'Patrick', 174)
SELECT MIN(SPEED) FROM #speed
DROP TABLE #speed
END
Select MIN(T.V) FROM (Select 1 as V UNION Select 2 as V) T
SELECT (WHEN first > second THEN second ELSE first END) the_minimal FROM table