Fit a piecewise regression in matlab and find change point - matlab

In MATLAB, I want to fit a piecewise regression and find where on the x-axis the first change point occurs. For example, for the following data, the output might be changepoint=20 (I don't actually want to plot it; I just want the change point).
data = [1 4 4 3 4 0 0 4 5 4 5 2 5 10 5 1 4 15 4 9 11 16 23 25 24 17 31 42 35 45 49 54 74 69 63 46 35 31 27 15 10 5 10 4 2 4 2 2 3 5 2 2];
x = 1:52;
plot(x,data,'.')

If you have the Signal Processing Toolbox, you can directly use the findchangepts function (see https://www.mathworks.com/help/signal/ref/findchangepts.html for documentation):
data = [1 4 4 3 4 0 0 4 5 4 5 2 5 10 5 1 4 15 4 9 11 16 23 25 24 17 31 42 35 45 49 54 74 69 63 46 35 31 27 15 10 5 10 4 2 4 2 2 3 5 2 2];
x = 1:52;
ipt = findchangepts(data);
x_cp = x(ipt);
data_cp = data(ipt);
plot(x,data,'.',x_cp,data_cp,'o')
The index of the change point in this case is 22.
Plot of data and its change point circled in red:
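For readers without the toolbox, the underlying idea can be sketched in a few lines of Python. This is only an illustration, assuming a single change in mean is wanted: it tries every split position and keeps the one with the smallest total squared deviation from the two segment means. The function name single_change_point is hypothetical, and the sketch is not guaranteed to reproduce the toolbox result.

```python
def single_change_point(values):
    """Return the 0-based index k that best splits values into values[:k] and values[k:].

    "Best" means the smallest total squared deviation of each segment from
    its own mean (an assumed change-in-mean criterion). Returns None when
    there are fewer than two values.
    """
    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((v - mean) ** 2 for v in segment)

    best_k, best_cost = None, float("inf")
    for k in range(1, len(values)):  # both segments must be non-empty
        cost = sse(values[:k]) + sse(values[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


data = [1, 4, 4, 3, 4, 0, 0, 4, 5, 4, 5, 2, 5, 10, 5, 1, 4, 15, 4, 9,
        11, 16, 23, 25, 24, 17, 31, 42, 35, 45, 49, 54, 74, 69, 63, 46,
        35, 31, 27, 15, 10, 5, 10, 4, 2, 4, 2, 2, 3, 5, 2, 2]
k = single_change_point(data)
print("second segment starts at 1-based position", k + 1)
```

The brute-force search recomputes each segment cost from scratch, which is quadratic in the data length; that is fine for 52 points but would need running sums for long series.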

I know this is an old question, but I want to add some extra thoughts. In MATLAB, an alternative that I implemented is a Bayesian changepoint detection algorithm that estimates not only the number and locations of the changepoints but also reports their probability of occurrence. In its current implementation, it handles only time-series-like data (i.e., 1D sequential data). More info about the tool is available at this FileExchange entry (https://www.mathworks.com/matlabcentral/fileexchange/72515-bayesian-changepoint-detection-time-series-decomposition).
Here is its quick application to your sample data:
% Automatically install the Rbeast or BEAST library to local drive
eval(webread('http://b.link/beast')) %
data = [1 4 4 3 4 0 0 4 5 4 5 2 5 10 5 1 4 15 4 9 11 16 23 25 24 17 31 42 35 45 49 54 74 69 63 46 35 31 27 15 10 5 10 4 2 4 2 2 3 5 2 2];
out = beast(data, 'season','none') % season='none': there is no seasonal/periodic variation in the data
printbeast(out)
plotbeast(out)
Below is a summary of the changepoint, given by printbeast():
#####################################################################
# Trend Changepoints #
#####################################################################
.-------------------------------------------------------------------.
| Ascii plot of probability distribution for number of chgpts (ncp) |
.-------------------------------------------------------------------.
|Pr(ncp = 0 )=0.000|* |
|Pr(ncp = 1 )=0.000|* |
|Pr(ncp = 2 )=0.000|* |
|Pr(ncp = 3 )=0.859|*********************************************** |
|Pr(ncp = 4 )=0.133|******** |
|Pr(ncp = 5 )=0.008|* |
|Pr(ncp = 6 )=0.000|* |
|Pr(ncp = 7 )=0.000|* |
|Pr(ncp = 8 )=0.000|* |
|Pr(ncp = 9 )=0.000|* |
|Pr(ncp = 10)=0.000|* |
.-------------------------------------------------------------------.
| Summary for number of Trend ChangePoints (tcp) |
.-------------------------------------------------------------------.
|ncp_max = 10 | MaxTrendKnotNum: A parameter you set |
|ncp_mode = 3 | Pr(ncp= 3)=0.86: There is a 85.9% probability |
| | that the trend component has 3 changepoint(s).|
|ncp_mean = 3.15 | Sum{ncp*Pr(ncp)} for ncp = 0,...,10 |
|ncp_pct10 = 3.00 | 10% percentile for number of changepoints |
|ncp_median = 3.00 | 50% percentile: Median number of changepoints |
|ncp_pct90 = 4.00 | 90% percentile for number of changepoints |
.-------------------------------------------------------------------.
| List of probable trend changepoints ranked by probability of |
| occurrence: Please combine the ncp reported above to determine |
| which changepoints below are practically meaningful |
'-------------------------------------------------------------------'
|tcp# |time (cp) |prob(cpPr) |
|------------------|---------------------------|--------------------|
|1 |33.000000 |1.00000 |
|2 |42.000000 |0.98271 |
|3 |19.000000 |0.69183 |
|4 |26.000000 |0.03950 |
|5 |11.000000 |0.02292 |
.-------------------------------------------------------------------.
Here is the graphic output. Three major changepoints are detected:

You can use the sgolayfilt function, which fits a polynomial to the data, or reproduce the OLS method yourself: http://www.utdallas.edu/~herve/Abdi-LeastSquares06-pretty.pdf (that reference uses a+bx notation instead of ax+b).
For a linear fit of ax+b:
If, at each step, you replace x with a constant vector of length 2n+1, [-n, ... 0 ... n], you get the following code for the sliding regression coefficients:
x = -n:n; % window abscissa centred on zero (y is assumed to be a row vector)
sum_x2 = sum(x.^2); % constant denominator of the slope estimate
for i=1+n:length(y)-n
yi = y(i-n : i+n);
sum_xy = sum(yi.*x);
a(i) = sum_xy/sum_x2;
b(i) = sum(yi)/(2*n+1); % mean over the 2n+1 samples in the window
end
Note that in this code b is the sliding average of your data, and a is the least-squares slope estimate (first derivative).
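The sliding fit described above can also be written out in Python as a minimal sketch. Because the window abscissa -n..n sums to zero, the slope reduces to sum(x*y)/sum(x^2) and the intercept to the window mean. The function name sliding_linear_fit is hypothetical, and n is assumed to be at least 1.

```python
def sliding_linear_fit(y, n):
    """Fit a*x + b over each window y[i-n : i+n+1], with x = -n..n centred on i.

    Returns two lists (slopes, intercepts); positions too close to the
    edges to hold a full window are left as None. Requires n >= 1.
    """
    x = list(range(-n, n + 1))
    sum_x2 = sum(v * v for v in x)  # constant denominator of the slope
    slopes = [None] * len(y)
    intercepts = [None] * len(y)
    for i in range(n, len(y) - n):
        window = y[i - n : i + n + 1]
        slopes[i] = sum(xv * yv for xv, yv in zip(x, window)) / sum_x2
        intercepts[i] = sum(window) / len(window)  # sliding average
    return slopes, intercepts


# Usage on an exactly linear series y = 2*i + 1
y = [2 * i + 1 for i in range(10)]
slopes, intercepts = sliding_linear_fit(y, 2)
```

A change point could then be looked for where the slope estimate changes sign or jumps, though picking that threshold is a separate judgment call.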


Conditional Group By on Selected rows in KDB/Q-Sql

I have a requirement to execute multiple queries, each performing a group by on a column with a where clause. The group-by column is fixed, and the where condition always applies to the same column but with varying criteria. Only the aggregated column name and the aggregation type vary.
For example if I have table :
k1 k2 val1 val2
1 1 10 31
1 1 20 31
1 2 30 32
2 2 40 33
2 3 50 34
2 4 60 35
2 4 70 36
3 4 80 37
3 5 90 38
3 5 100 39
t:([] k1:1 1 1 2 2 2 2 3 3 3; k2:1 1 2 2 3 4 4 4 5 5; val1:10 20 30 40 50 60 70 80 90 100; val2:31 31 32 33 34 35 36 37 38 39)
Queries which I need to perform will be like
select avg_val1:avg val1 by k1 from t where k2 in 2 3 4
select sum_val1:sum val1 by k1 from t where k2 in 2 3
select sum_val2:sum val2 by k1 from t where k2 in 2 3 5
select min_val2:min val2 by k1 from t where k2 in 1 2 3 4 5
I want to execute these queries in a single execution using functional queries. I tried this, but I was not able to get the condition and syntax right:
res:?[t;();(enlist`k1)!enlist`k1;(`avg_val1;`sum_val2)!({$[x; y; (::)]}[1b;(avg;`val1)];{$[x; y; (::)]}[1b; (sum;`val2)])];
k1 avg_val1 sum_val2
1 20.0 94
2 55.0 138
3 90.0 114
Instead of putting 1b in the condition, I want to use a real condition like this:
res:?[t;();(enlist`k1)!enlist`k1;(`avg_val1;`sum_val2)!({$[x; y; (::)]}[(in;`k2;2 3 4i);(avg;`val1)];{$[x; y; (::)]}[(in;`k2;2 3i); (sum;`val2)])];
But this gives a "type" error, since the query first groups by k1, so k2 is a list within each group. So the condition is not right either.
I would like to know the best solution for this.
Maybe there is a better approach to solve the same problem.
Please help me with this.
Thank you.
The vector conditional (?) operator can get you closer to what you'd like.
Given your table
t:([] k1:1 1 1 2 2 2 2 3 3 3; k2:1 1 2 2 3 4 4 4 5 5; val1:10 20 30 40 50 60 70 80 90 100; val2:31 31 32 33 34 35 36 37 38 39)
k1 k2 val1 val2
---------------
1 1 10 31
1 1 20 31
1 2 30 32
2 2 40 33
2 3 50 34
2 4 60 35
2 4 70 36
3 4 80 37
3 5 90 38
3 5 100 39
you can update, say, the val1 column to hold null values wherever a condition does not hold
update val1:?[k2 in 2 3 4;val1;0N] from t
k1 k2 val1 val2
---------------
1 1 31
1 1 31
1 2 30 32
2 2 40 33
2 3 50 34
2 4 60 35
2 4 70 36
3 4 80 37
3 5 38
3 5 39
and with a little more work you can get the desired aggregate (NB: the aggregate functions ignore null values)
select avg ?[k2 in 2 3 4;val1;0N] by k1 from t
k1| x
--| --
1 | 30
2 | 55
3 | 80
You can wrap this up into a functional select statement like so
?[t;();{x!x}enlist`k1;`avg_val1`sum_val2!((avg;(?;(in;`k2;2 3 4);`val1;0N));(sum;(?;(in;`k2;2 3);`val2;0N)))]
k1| avg_val1 sum_val2
--| -----------------
1 | 30 32
2 | 55 67
3 | 80 0
However, this can break when you use a function that does not ignore nulls, e.g. count. You may be better off using the where operator in your select statement:
select avg val1 where k2 in 2 3 4 by k1 from t
k1| x
--| --
1 | 30
2 | 55
3 | 80
?[t;();{x!x}enlist`k1;`avg_val1`sum_val2!((avg;(`val1;(where;(in;`k2;2 3 4))));(sum;(`val2;(where;(in;`k2;2 3)))))]
k1| avg_val1 sum_val2
--| -----------------
1 | 30 32
2 | 55 67
3 | 80 0
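The same pattern (one grouping pass, with each aggregate applying its own filter inside the group) can be sketched in plain Python for comparison. This is an illustration only, not q: the names conditional_aggregate, mean_or_none and specs are hypothetical, and an empty filtered group is assumed to yield None for the average and 0 for the sum.

```python
from collections import defaultdict

k1 = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3]
k2 = [1, 1, 2, 2, 3, 4, 4, 4, 5, 5]
val1 = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
val2 = [31, 31, 32, 33, 34, 35, 36, 37, 38, 39]
rows = [dict(k1=a, k2=b, val1=c, val2=d) for a, b, c, d in zip(k1, k2, val1, val2)]


def mean_or_none(values):
    """Average of a list, or None when the filter left nothing to average."""
    return sum(values) / len(values) if values else None


def conditional_aggregate(rows, key, filter_col, specs):
    """Group rows by key; specs maps output name -> (aggregate, column, allowed values).

    Each aggregate only sees the rows of its group whose filter_col value
    is in its own allowed set, so every output column has its own filter.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return {
        group_key: {
            name: agg([r[col] for r in members if r[filter_col] in allowed])
            for name, (agg, col, allowed) in specs.items()
        }
        for group_key, members in sorted(groups.items())
    }


specs = {
    "avg_val1": (mean_or_none, "val1", {2, 3, 4}),
    "sum_val2": (sum, "val2", {2, 3}),
}
result = conditional_aggregate(rows, "k1", "k2", specs)
```

Filtering inside each group, rather than substituting nulls, keeps aggregates such as a count correct, which is the same reason the where form is preferred above.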

Grafana visualisation of non time series data

I have two columns in my InfluxDB database: Values and Iterator count.
I want to visualise this in Grafana, with the iterator count on the x-axis and the corresponding value on the y-axis.
EXAMPLE
Iterator Count(X) | Value
1 | 46
2 | 64
3 | 32
4 | 13
5 | 12
6 | 11
7 | 10
8 | 9
9 | 12
10 | 25.
Is it possible to achieve such a visualisation without any time dimension?
You can use the plot.ly plugin.
You just need to specify Iterator Count(X) as the x-axis in the trace section and Value as the y-axis.

Merge 2 vectors according to their time values

I would like to merge 2 vectors according to their time values. This should look like this (column 1 = time, column 2 = actual value):
A =
1 234
3 121
4 456
6 6756
B =
2 435
5 90
10 365
Result:
C =
1 234
2 435
3 121
4 456
5 90
6 6756
10 365
Is there an elegant way to realize this in Matlab?
Here's an easy one-liner:
C = sortrows([A;B])
C =
1 234
2 435
3 121
4 456
5 90
6 6756
10 365
Note that this assumes that all of the time values in column 1 are unique. If this is not the case, you can use accumarray:
A =
1 234
3 121
4 456
6 6756
B =
2 435
5 90
10 365
B = [B; 1 512]
B =
2 435
5 90
10 365
1 512
C = [A;B];
D = accumarray(C(:,1),C(:,2));
U = unique(C(:,1));
E = [U,D(U)]
E =
1 746 %// 746 = 234 + 512
2 435
3 121
4 456
5 90
6 6756
10 365
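For comparison, both variants (a plain merge sorted by time, and a merge that sums values sharing a time stamp) can be sketched in Python. The function names merge_by_time and merge_and_sum are hypothetical, and the data is represented as lists of (time, value) tuples.

```python
def merge_by_time(a, b):
    """Concatenate two lists of (time, value) pairs and sort them by time.

    Tuples compare element by element, so ties in time fall back to the
    value, much like sorting whole rows.
    """
    return sorted(a + b)


def merge_and_sum(a, b):
    """Merge as above, but add up the values of entries with the same time."""
    totals = {}
    for time, value in a + b:
        totals[time] = totals.get(time, 0) + value
    return sorted(totals.items())


A = [(1, 234), (3, 121), (4, 456), (6, 6756)]
B = [(2, 435), (5, 90), (10, 365)]
C = merge_by_time(A, B)
E = merge_and_sum(A, B + [(1, 512)])  # time 1 now appears twice
```

Whether duplicate times should be summed, averaged, or kept as separate rows depends on the data; summing is used here only because it mirrors the accumarray example.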
First I would merge these matrices and then sort them by first column.
C = [A; B]
[Y, I] = sort(C(:,1))
C = C(I,:)
First, vertically concatenate the two matrices:
A = [1 234; 3 121; 4 456; 6 6756];
B = [2 435; 5 90; 10 365];
C = vertcat(A,B)
Then you want to sort your answer based on the first column:
[~,inx]=sort(C(:,1));
out = C(inx,:);
>> out =
1 234
2 435
3 121
4 456
5 90
6 6756
10 365
So much more difficult than the 1 liner:
out = sortrows(C,1)
Why Matlab, why don't you have an option in sort to keep the index!
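In the general case, you will need to do some form of concatenation and sorting. This is a one-liner (vertical concatenation, then a row sort on the time column, so that each value stays paired with its time):
C = sortrows([A;B],1);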

Split 2 x N matrix into two submatrices in MATLAB

I have a 2 x N matrix (let's call it MyMatrix) containing pairs of elements (the element in (1,1) corresponds to the element in (2,1), the element in (1,2) corresponds to the element in (2,2), and so on). Entries in the first row are sorted in ascending order. I would like to split this matrix into two matrices of size 2 x K and 2 x (N-K). The first matrix will contain the part of MyMatrix where the entries in row 1 are less than some given value (in my example it will be (max-min)/2, where max = maximum value in row 1 and min = minimum value in row 1), and the second matrix will consist of the rest of MyMatrix. I'm sorry if this is confusing, but I tried my best to explain what I would like to achieve.
Here is an example:
MyMat =
|1 2 4 6 13 52 65 120 125|
|4 132 53 1 64 34 5 2 66 |
min = 1 , max = 125, avg = (125-1)/2 = 62.
so result will be as follows:
a =
|1 2 4 6 13 52 |
|4 132 53 1 64 34 |
b=
|65 120 125|
|5 2 66 |
Thanks in advance for your help.
Kind regards,
Tom.
You can simply do
a=MyMat(:,MyMat(1,:)<avg);
b=MyMat(:,MyMat(1,:)>=avg);
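The logical-indexing split above can be mirrored in Python as a small sketch, with the threshold computed as (max-min)/2 from the first row as in the question. The function name split_by_threshold is hypothetical, and the matrix is assumed to be a list of two equal-length lists.

```python
def split_by_threshold(matrix, threshold):
    """Split a 2-row matrix (list of two equal-length lists) by its first row.

    Columns whose first-row entry is below threshold go to the first
    result, all other columns to the second; row pairing is preserved.
    """
    top, bottom = matrix
    below = [j for j, value in enumerate(top) if value < threshold]
    rest = [j for j, value in enumerate(top) if value >= threshold]

    def pick(cols):
        return [[top[j] for j in cols], [bottom[j] for j in cols]]

    return pick(below), pick(rest)


my_mat = [[1, 2, 4, 6, 13, 52, 65, 120, 125],
          [4, 132, 53, 1, 64, 34, 5, 2, 66]]
threshold = (max(my_mat[0]) - min(my_mat[0])) / 2  # (125 - 1) / 2 = 62.0
a, b = split_by_threshold(my_mat, threshold)
```

Since the first row is sorted, a single search for the first entry at or above the threshold would also work; the mask form is kept because it does not rely on the ordering.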

Need a Logic to say Bingo

I am creating an iphone app where I have a grid view of 25 images as:
0 1 2 3 4
5 6 7 8 9
10 11 12 13 14
15 16 17 18 19
20 21 22 23 24
Now, when any 5 consecutive images are selected, it should say Bingo. For example, if 0, 6, 12, 18, 24 are selected, it should say Bingo.
How can I do that? Please help me.
Many Thanks for your help.
Rs
iPhone Developer
-----------------------------------
| 0 | 1 | 2 | 3 | 4 | 5 |
-----------------------------------
| 6 | 7 | 8 | 9 | 10 | 11 |
-----------------------------------
| 12 | 13 | 14 | 15 | 16 | 17 |
-----------------------------------
| 18 | 19 | 20 | 21 | 22 | 23 |
-----------------------------------
| 24 |
-----------------------------------
I hope this is what your grid looks like.
Associate each cell with an array. The array will contain the list of all neighbouring elements of that cell.
For example, the neighbor array of the cell [ 6 ] will look like array(0, 7, 12), which are all the immediate neighbors of [ 6 ].
Set counter = 0;
Now, when someone clicks an element, increment the counter (Now counter = 1)
When he clicks the second element, check if the element is in the neighbor list of the previous element OR the 1st element.
If the element clicked is in the neighbor list, increment the counter (now counter = 2)
ELSE
If the element clicked is not in the neighbor array, reset the counter (counter = 0) and start over.
Check if the value of counter = 5. If it is, Say Bingo!
The algorithm is not fully correct, but I hope you got the idea :)
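A different approach from the neighbor counter above, sketched here in Python, is to precompute every winning line and test whether the selection covers one of them. It assumes the 5 x 5 grid numbered 0 to 24 from the question and that "5 consecutive" means a full row, column, or diagonal; the names winning_lines and is_bingo are hypothetical.

```python
SIZE = 5  # assumed 5 x 5 grid, cells numbered 0..24 row by row


def winning_lines(size=SIZE):
    """Return every full row, column and diagonal as a set of cell numbers."""
    rows = [set(range(r * size, (r + 1) * size)) for r in range(size)]
    cols = [set(range(c, size * size, size)) for c in range(size)]
    diagonal = set(range(0, size * size, size + 1))              # 0, 6, 12, 18, 24
    anti_diagonal = set(range(size - 1, size * size - 1, size - 1))  # 4, 8, 12, 16, 20
    return rows + cols + [diagonal, anti_diagonal]


def is_bingo(selected, size=SIZE):
    """True when the selected cells fully cover at least one winning line."""
    chosen = set(selected)
    return any(line <= chosen for line in winning_lines(size))
```

Because the check works on the set of selected cells, the order in which images are tapped does not matter, and extra selections outside the line do not prevent a Bingo.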