T-SQL - Padding data points - tsql

I have a series of data points in a table in an Azure SQL database. When plotted, the points have the following distribution:
What I am trying to do is add padding so that they appear to form a continuous line, or at least a more continuous one than they do now.
I tried adding points between each point, but the problem with that is that it's all relative. If you look closely, you can see some of the points are blue, and some are dark red. The blue points are the ones I added, but the line looks the same.
I'm looking for advice on the logic I should use to solve this issue. I want to add x number of points between each data point based on the distance between the nearest points... if that makes sense.

I think this works
declare @t table (x smallmoney primary key, y smallmoney);
declare @inc smallmoney = 1;
insert into @t(x, y) values
(1, 1)
, (5, 3)
, (8, 4)
, (10, 5)
, (11, 6);
with cte as
( select x, x as x0, y, y as y0, cnt = cast(1 as smallmoney)
, lead(x) over (order by x) as nextX
, lead(y) over (order by x) as nextY
from @t t
union all
select x + @inc, x0, y + @inc/(nextX-x0)*(nextY-y0), y0, cnt+1, nextX, nextY
from cte t
where x + @inc < nextX
)
select *
from cte t
order by t.x;
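To check the logic outside the database, here is a minimal Python sketch of the same fixed-step idea: walk from each point towards the next one in steps of inc along x, moving y along the straight line between the two points. The function name pad_fixed_step and its parameters are illustrative assumptions, and the sketch assumes the x values are distinct (as the primary key above guarantees).

```python
def pad_fixed_step(points, inc=1.0):
    """Linearly interpolate between consecutive points using a fixed x step.

    Hypothetical helper for illustration; assumes distinct x values.
    """
    if inc <= 0:
        raise ValueError("inc must be positive")
    pts = sorted(points)
    if not pts:
        return []
    out = []
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        slope = (y1 - y0) / (x1 - x0)
        k = 0
        # Emit the original point (k = 0) and every step that stays before the next point.
        while x0 + k * inc < x1:
            out.append((x0 + k * inc, y0 + k * inc * slope))
            k += 1
    out.append(pts[-1])  # the last point has no successor, so add it as-is
    return out


padded = pad_fixed_step([(1, 1), (5, 3), (8, 4), (10, 5), (11, 6)], inc=1)
```

With the sample data and inc = 1, this fills in every integer x from 1 to 11.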

I'm not confident that this is the best solution, but I think you could build off of it. Here's an sqlfiddle
SELECT x + COALESCE(((nextx-x)/10)*inc, 0) as x, y + COALESCE(((nexty-y)/10)*inc, 0) as y
FROM
(SELECT x, y, nextx, nexty, inc.n + 0.0 as inc FROM
(SELECT x, y, lead(x) over (order by x) as nextx, lead(y) over (order by x) as nexty
FROM points) p inner join (VALUES(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) inc(n)
ON nextx is not null or inc.n = 0
) a ORDER BY x
This will add 9 points between each point (10 points total, including the "real" one).
The basic idea is that I'm using lead for each row to get the next x and next y, then I join that to a hardcoded list of values 0 to 9. Then for each value, I increment x by 1/10 of the difference between nextx and x, and increment y by 1/10 of the difference between nexty and y.
The join condition nextx is not null or inc.n = 0 is so that I only join inc(0) to the last x value (rather than joining 10 times).
You could change my hardcoded list of values and the hardcoded 10s to increment differently. Similarly, you'd probably need some changes if you only want integers, but the principle will be the same.
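As a rough illustration of the same principle outside SQL, the sketch below splits each gap into n equal parts, with n playing the role of the hardcoded 10. The name pad_fixed_count and its signature are assumptions made for this example only.

```python
def pad_fixed_count(points, n=10):
    """Split every gap between consecutive points into n equal steps.

    Hypothetical helper for illustration; n = 10 mirrors the hardcoded list 0..9.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    pts = sorted(points)
    out = []
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        for i in range(n):
            t = i / n  # fraction of the way from the current point to the next
            out.append((x0 + (x1 - x0) * t, y0 + (y1 - y0) * t))
    if pts:
        out.append(pts[-1])  # the last point is emitted once, like inc.n = 0 in the join
    return out


padded = pad_fixed_count([(0, 0), (10, 20)], n=10)
```

Changing n changes how densely each gap is filled; the count could also be derived from the distance between the two points if uniform spacing is wanted.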

Related

Insert a row for each array item -- for many arrays

Say you have a table:
CREATE TABLE tab (
x integer,
y integer
);
And multiple rows of the form x, [y1, y2, ...]:
x ys
1 [2,3]
4 [5,6,7]
There might be an arbitrary but non-zero (or non-nil) number of ys in each row. x is always non-nil.
How can I insert these so that I end up with a row for each x, y pair, that is:
x y
1 2
1 3
4 5
4 6
4 7
You can unnest() the arrays:
insert into tab (x,y)
select v.x, t.y
from (
values
(1, array[2,3]),
(4, array[5,6,7])
) as v(x,ya)
cross join unnest(v.ya) as t(y)
Online example
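If the rows are assembled in application code instead, the same expansion can be sketched in Python before inserting. This example uses the standard-library sqlite3 module with an in-memory database purely as a stand-in for the real database; the variable names are illustrative.

```python
import sqlite3

# Each entry is (x, ys), matching the rows in the question.
rows = [(1, [2, 3]), (4, [5, 6, 7])]

# Flatten to one (x, y) pair per array item, the same shape unnest() produces.
pairs = [(x, y) for x, ys in rows for y in ys]

conn = sqlite3.connect(":memory:")  # stand-in database for the sketch
conn.execute("CREATE TABLE tab (x integer, y integer)")
conn.executemany("INSERT INTO tab (x, y) VALUES (?, ?)", pairs)
result = conn.execute("SELECT x, y FROM tab ORDER BY x, y").fetchall()
```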

Initialize / warm-start search [duplicate]

I am using the CP-Sat solver to optimise a timetable I am making. However, this now takes a long time to solve. Is it possible to seed the solver with an old result, to act as a starting point, with the goal of reducing the time required to find the optimal result?
Take a look at this solution hinting example:
https://github.com/google/or-tools/blob/stable/ortools/sat/docs/model.md#solution-hinting
from ortools.sat.python import cp_model

model = cp_model.CpModel()
num_vals = 3
x = model.NewIntVar(0, num_vals - 1, 'x')
y = model.NewIntVar(0, num_vals - 1, 'y')
z = model.NewIntVar(0, num_vals - 1, 'z')
model.Add(x != y)
model.Maximize(x + 2 * y + 3 * z)
# Solution hinting: x <- 1, y <- 2
model.AddHint(x, 1)
model.AddHint(y, 2)
solver = cp_model.CpSolver()
status = solver.Solve(model)
Edit: you should also try to
Reduce the number of variables.
Reduce the domain of the integer variables.
Run the solver with multiple threads using solver.parameters.num_search_workers = 8.
Prefer boolean over integer variables/constraints.
Set redundant constraints and/or symmetry breaking constraints.
Segregate your problem and merge the results.

How to do a linear regression in postgresql?

In a regression Y=aX+b, does regr_intercept(Y, X) equal "b" and regr_slope(Y, X) equal "a"?
You have not supplied much detail, but here you go.
Regression
A regression line is simply a line
y = ax + b
that is able to compute an output variable y for an input variable x. A line can be described by two parameters, also called coefficients:
the slope a
the intercept b
Finding Slope & Intercept
Suppose you have two numeric columns, X and Y, populated with the desired values:
CREATE TABLE foo(
id serial PRIMARY KEY,
X integer NOT NULL,
Y integer NOT NULL
);
INSERT INTO foo VALUES (0,10,3);
INSERT INTO foo VALUES (1,20,5);
You can find the slope and intercept as follows.
SELECT regr_slope(y, x) slope FROM foo;
SELECT regr_intercept(y, x) intercept FROM foo;
Results of query:
slope: 0.2
intercept: 1
SQL Fiddle
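To see where those numbers come from, here is a small Python sketch of the ordinary least-squares formulas applied to the same two rows. The function name linear_fit is an assumption for this example; it is not how PostgreSQL implements the aggregates.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b).

    Hypothetical helper for illustration only.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Same data as the two INSERT statements: (X, Y) = (10, 3) and (20, 5).
slope, intercept = linear_fit([10, 20], [3, 5])
```

For this data the slope comes out at about 0.2 and the intercept at about 1, matching the query results above.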

kdb conditional update and select in one query

I can do this:
x:([]v: 4 2; w: 10 100)
x: update z:`test from x where v = 4
x
But I'd really like to be able to do the conditional update and select all in one hit. Something like:
select v, w, (select `test from x where v = v) from z
Is this possible in kdb?
You could try
update z:?[v=4;`test;`] from x
Is the vector conditional what you're looking for?
q)select v,w,z:?[v=4;`test;`] from x
v w z
----------
4 10 test
2 100
http://code.kx.com/q/ref/lists/#vector-conditional

Median-of-3 partitioning in QuickSort

I was trying to understand quicksort with median-of-3 partitioning. After finding the median of the first, middle and last element in an array, a common practice is to swap median with the second last element in array(n-1th index). Is there a specific reason we do that?
The reason is that the algorithm does not only find the median; it also sorts the low, middle and high elements. After the three swaps you know that a[middle]<=a[high]. So you only need to partition the elements before high, because a[high] is greater than or equal to the pivot.
Let's look at an example: low=0, middle=4 and high=8. Your array is like this:
lowerOrEqualToPivot X X X pivot X X X greaterOrEqualToPivot
If you swap middle with high, you need to partition the 8 elements between the brackets:
[ lowerOrEqualToPivot X X X greaterOrEqualToPivot X X X ] pivot
If you swap middle with high-1, you need to split only 7 elements:
[ lowerOrEqualToPivot X X X X X X ] pivot greaterOrEqualToPivot
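The swap with high-1 can be sketched in Python as follows. This is one common way to write median-of-3 quicksort, not necessarily the code from the question; the function names and the small-range fallback to insertion sort are assumptions for the example.

```python
def median_of_three(a, low, high):
    """Sort a[low], a[mid], a[high] in place, then park the median at high - 1."""
    mid = (low + high) // 2  # Python ints do not overflow here
    if a[mid] < a[low]:
        a[low], a[mid] = a[mid], a[low]
    if a[high] < a[low]:
        a[low], a[high] = a[high], a[low]
    if a[high] < a[mid]:
        a[mid], a[high] = a[high], a[mid]
    # Now a[low] <= a[mid] <= a[high]; a[high] is already on the correct side.
    a[mid], a[high - 1] = a[high - 1], a[mid]
    return a[high - 1]


def quicksort(a, low=0, high=None):
    if high is None:
        high = len(a) - 1
    if high - low < 3:
        # Small ranges: plain insertion sort keeps the sketch simple.
        for k in range(low + 1, high + 1):
            v, m = a[k], k - 1
            while m >= low and a[m] > v:
                a[m + 1] = a[m]
                m -= 1
            a[m + 1] = v
        return
    pivot = median_of_three(a, low, high)
    # Only a[low+1 .. high-2] still needs partitioning;
    # a[low] and the pivot at high - 1 act as sentinels.
    i, j = low, high - 1
    while True:
        i += 1
        while a[i] < pivot:
            i += 1
        j -= 1
        while a[j] > pivot:
            j -= 1
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]
    a[i], a[high - 1] = a[high - 1], a[i]  # move the pivot to its final position
    quicksort(a, low, i - 1)
    quicksort(a, i + 1, high)


data = [9, 3, 7, 1, 8, 2, 5, 4, 6, 0, 5, 3]
quicksort(data)
```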
By the way there is a bug in the first line:
int middle = ( low + high ) / 2; //Wrong
int middle = ( low + high ) >>> 1; //Correct
The reason is that if (low + high) is greater than Integer.MAX_VALUE, the sum overflows and middle becomes a negative number. The second line always gives a non-negative result for non-negative low and high.
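Python integers do not overflow, so the sketch below simulates Java's 32-bit int arithmetic to show the difference between the two lines. The helpers to_int32, java_div2 and java_unsigned_shift1 are hypothetical names written for this illustration.

```python
def to_int32(value):
    """Wrap an integer into Java's signed 32-bit range."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def java_div2(value):
    """Java's int division by 2, which truncates toward zero."""
    return -((-value) // 2) if value < 0 else value // 2


def java_unsigned_shift1(value):
    """Java's `value >>> 1` on a 32-bit int: shift the raw bits, filling with zero."""
    return (value & 0xFFFFFFFF) >> 1


low, high = 1_073_741_824, 2_147_483_647  # both valid non-negative int indexes
wrapped_sum = to_int32(low + high)        # what Java actually stores for low + high
wrong_middle = java_div2(wrapped_sum)     # ( low + high ) / 2
right_middle = java_unsigned_shift1(wrapped_sum)  # ( low + high ) >>> 1
```

Here the wrapped sum is negative, so dividing it by 2 gives a negative middle, while the unsigned shift recovers the true midpoint.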