For each column, I want the value of result in the same row as that column's minimum. I have many columns, so I would like to loop over them (or use rowfun, but then I don't know how to get result).
Table A
+----+----+----+----+----+----+--------+
| x1 | x2 | x3 | x4 | x5 | x6 | result |
+----+----+----+----+----+----+--------+
|  1 |  4 | 10 |  3 | 12 |  2 |      8 |
| 10 |  2 |  8 |  1 | 12 |  3 |     10 |
|  5 | 10 |  5 |  4 |  2 | 10 |     12 |
+----+----+----+----+----+----+--------+
Solution
8 10 12 10 12 8
I know that I can apply rowfun, but then I don't know how to get result.
And I can do this for a single column, but cannot loop over all the columns this way:
A(cell2mat(A.x1) == min(cell2mat(A.x1)), 7)
I have tried several ways of making the column a variable, along these lines, but I can't make it work:
A(cell2mat(variable) == min(cell2mat(variable)), 7)
Thank you!
Assuming your data is homogeneous you can use table2array and the second output of min to index your results:
% Set up table
x1 = [1 10 5];
x2 = [4 2 10];
x3 = [10 8 5];
x4 = [3 1 4];
x5 = [12 12 2];
x6 = [2 3 10];
result = [8 10 12];
t = table(x1.', x2.', x3.', x4.', x5.', x6.', result.', ...
'VariableNames', {'x1', 'x2', 'x3', 'x4', 'x5', 'x6', 'result'});
% Convert
A = table2array(t);
% When passed a matrix, min finds minimum of each column by default
% Exclude the results column, assumed to be the last
[~, minrow] = min(A(:, 1:end-1));
solution = t.result(minrow)'
Which returns:
solution =
8 10 12 10 12 8
From the documentation for min:
M = min(A) returns the smallest elements of A.
<snip>
If A is a matrix, then min(A) is a row vector containing the minimum value of each column.
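If the columns are heterogeneous (so table2array is unavailable), a plain loop over the variable names also works. This is a sketch assuming, as above, that result is the last variable and the x-columns are numeric; if they are cell arrays, as in the question, wrap each access in cell2mat:

```matlab
% Loop over every column except the last ('result')
vars = t.Properties.VariableNames(1:end-1);
solution = zeros(1, numel(vars));
for k = 1:numel(vars)
    [~, minrow] = min(t.(vars{k}));   % dynamic field access per column
    solution(k) = t.result(minrow);
end
solution                              % -> 8 10 12 10 12 8 for the table above
```

Dynamic field access, t.(name), is what replaces the hard-coded A.x1 in the question.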
Let
M = | 1 2 3 |
| 4 5 6 |
| 7 8 9 |
and
V = | 1 1 1 |
I want to subtract V from every row of M so that M should look like
M = | 0 1 2 |
| 3 4 5 |
| 6 7 8 |
How can I do that without using a for, is there any straightforward command?
You can also use bsxfun.
M = [1 2 3 ; 4 5 6 ; 7 8 9] ;
V = [1 1 1] ;
iwant = bsxfun(@minus, M, V)
>> M = [1 2 3; 4 5 6; 7 8 9];
>> V = [1 1 1];
>> MV = M-repmat(V,size(M,1),1)
MV =
0 1 2
3 4 5
6 7 8
The call to repmat tiles the vector V so that it has as many rows as M.
User beaker pointed out that an even simpler (though a bit obscure) syntax works in recent versions of MATLAB. If you subtract a vector from a matrix, MATLAB will expand the vector to match the size of the matrix, as long as the sizes are compatible (each dimension of the vector either matches the corresponding matrix dimension or is 1). See Compatible Array Sizes for Basic Operations.
>> M-V
ans =
0 1 2
3 4 5
6 7 8
Of course, if you know that V will contain all 1s, the solution is even simpler:
>> MV = M-1
MV =
0 1 2
3 4 5
6 7 8
In matlab, I want to fit a piecewise regression and find where on the x-axis the first change-point occurs. For example, for the following data, the output might be changepoint=20 (I don't actually want to plot it, just want the change point).
data = [1 4 4 3 4 0 0 4 5 4 5 2 5 10 5 1 4 15 4 9 11 16 23 25 24 17 31 42 35 45 49 54 74 69 63 46 35 31 27 15 10 5 10 4 2 4 2 2 3 5 2 2];
x = 1:52;
plot(x,data,'.')
If you have the Signal Processing Toolbox, you can directly use the findchangepts function (see https://www.mathworks.com/help/signal/ref/findchangepts.html for documentation):
data = [1 4 4 3 4 0 0 4 5 4 5 2 5 10 5 1 4 15 4 9 11 16 23 25 24 17 31 42 35 45 49 54 74 69 63 46 35 31 27 15 10 5 10 4 2 4 2 2 3 5 2 2];
x = 1:52;
ipt = findchangepts(data);
x_cp = x(ipt);
data_cp = data(ipt);
plot(x,data,'.',x_cp,data_cp,'o')
The index of the change point in this case is 22.
Plot of data and its change point circled in red:
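If you only want the first or most significant change rather than all of them, findchangepts accepts a MaxNumChanges option; a minimal sketch on the same data:

```matlab
data = [1 4 4 3 4 0 0 4 5 4 5 2 5 10 5 1 4 15 4 9 11 16 23 25 24 17 31 42 ...
        35 45 49 54 74 69 63 46 35 31 27 15 10 5 10 4 2 4 2 2 3 5 2 2];
x = 1:52;
ipt = findchangepts(data, 'MaxNumChanges', 1);  % report at most one changepoint
changepoint = x(ipt)                            % x-location, not just the index
```

By default findchangepts detects changes in mean; pass 'Statistic','linear' to detect changes in slope instead, which may match the "piecewise regression" framing of the question more closely.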
I know this is an old question, but I want to add some extra thoughts. In MATLAB, an alternative I implemented is a Bayesian changepoint detection algorithm that estimates not just the number and locations of the changepoints but also the probability that each one occurs. In its current implementation it handles only time-series-like data (i.e., 1D sequential data). More info about the tool is available at this FileExchange entry (https://www.mathworks.com/matlabcentral/fileexchange/72515-bayesian-changepoint-detection-time-series-decomposition).
Here is its quick application to your sample data:
% Automatically install the Rbeast or BEAST library to local drive
eval(webread('http://b.link/beast'))
data = [1 4 4 3 4 0 0 4 5 4 5 2 5 10 5 1 4 15 4 9 11 16 23 25 24 17 31 42 35 45 49 54 74 69 63 46 35 31 27 15 10 5 10 4 2 4 2 2 3 5 2 2];
out = beast(data, 'season','none') % season='none': there is no seasonal/periodic variation in the data
printbeast(out)
plotbeast(out)
Below is a summary of the changepoint, given by printbeast():
#####################################################################
# Trend Changepoints #
#####################################################################
.-------------------------------------------------------------------.
| Ascii plot of probability distribution for number of chgpts (ncp) |
.-------------------------------------------------------------------.
|Pr(ncp = 0 )=0.000|* |
|Pr(ncp = 1 )=0.000|* |
|Pr(ncp = 2 )=0.000|* |
|Pr(ncp = 3 )=0.859|*********************************************** |
|Pr(ncp = 4 )=0.133|******** |
|Pr(ncp = 5 )=0.008|* |
|Pr(ncp = 6 )=0.000|* |
|Pr(ncp = 7 )=0.000|* |
|Pr(ncp = 8 )=0.000|* |
|Pr(ncp = 9 )=0.000|* |
|Pr(ncp = 10)=0.000|* |
.-------------------------------------------------------------------.
| Summary for number of Trend ChangePoints (tcp) |
.-------------------------------------------------------------------.
|ncp_max = 10 | MaxTrendKnotNum: A parameter you set |
|ncp_mode = 3 | Pr(ncp= 3)=0.86: There is a 85.9% probability |
| | that the trend component has 3 changepoint(s).|
|ncp_mean = 3.15 | Sum{ncp*Pr(ncp)} for ncp = 0,...,10 |
|ncp_pct10 = 3.00 | 10% percentile for number of changepoints |
|ncp_median = 3.00 | 50% percentile: Median number of changepoints |
|ncp_pct90 = 4.00 | 90% percentile for number of changepoints |
.-------------------------------------------------------------------.
| List of probable trend changepoints ranked by probability of |
| occurrence: Please combine the ncp reported above to determine |
| which changepoints below are practically meaningful |
'-------------------------------------------------------------------'
|tcp# |time (cp) |prob(cpPr) |
|------------------|---------------------------|--------------------|
|1 |33.000000 |1.00000 |
|2 |42.000000 |0.98271 |
|3 |19.000000 |0.69183 |
|4 |26.000000 |0.03950 |
|5 |11.000000 |0.02292 |
.-------------------------------------------------------------------.
Here is the graphic output. Three major changepoints are detected:
You can use the sgolayfilt function, which is a polynomial fit to the data, or reproduce the OLS method: http://www.utdallas.edu/~herve/Abdi-LeastSquares06-pretty.pdf (note it uses a+bx notation instead of ax+b).
For a linear fit y = ax + b:
If on each step you use the same centered x vector of length 2n+1, x = [-n, ..., 0, ..., n], you get the following code for the sliding regression coefficients:
n = 5;                          % window half-width (choose to taste)
x = -n:n;                       % centered x, identical on every step
sum_x2 = sum(x.^2);
a = zeros(size(y));             % sliding slope
b = zeros(size(y));             % sliding average
for i = 1+n : length(y)-n
    yi = y(i-n : i+n);
    a(i) = sum(yi.*x) / sum_x2; % least-squares slope over the window
    b(i) = sum(yi) / (2*n+1);   % mean of the window (intercept at its center)
end
Note that in this code b is the sliding average of your data (the intercept at the window center), and a is the least-squares slope estimate (a first-derivative estimate).
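For comparison, here is a hedged sketch of the same sliding slope computed with polyfit on the question's sample data; the window half-width n = 5 is an arbitrary choice, and a large jump in the slope flags a candidate changepoint:

```matlab
% Sliding linear fit via polyfit; slope changes hint at changepoints
y = [1 4 4 3 4 0 0 4 5 4 5 2 5 10 5 1 4 15 4 9 11 16 23 25 24 17 31 42 ...
     35 45 49 54 74 69 63 46 35 31 27 15 10 5 10 4 2 4 2 2 3 5 2 2];
n = 5;                                 % assumed window half-width
slope = nan(size(y));
for i = 1+n : length(y)-n
    p = polyfit(-n:n, y(i-n:i+n), 1);  % p(1) = local slope, p(2) = local mean
    slope(i) = p(1);
end
plot(slope)                            % inspect where the slope changes sign/size
```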
I want to find out how nearest neighbor interpolation works in MATLAB. I have input data :
A = [1 4 7 4 3 6] % 6 digit vector
I use the following MATLAB code :
B = imresize(A,[1 9],'nearest');
I get the following result :
[1,4,4,7,4,4,3,6,6]
Solving by hand, I get this result :
[1 4 4 7 4 4 3 3 6]
Can you please guide me? Am I going wrong somewhere?
If you apply regular interpolation using interp1, it will give you the result you computed by hand:
>> N = 9;
>> B = interp1(linspace(0,1,numel(A)), A, linspace(0,1,N), 'nearest')
B =
1 4 4 7 4 4 3 3 6
Some time ago, I went through the source code of imresize trying to understand how it works. See this post for a summary. At some point the code calls a private MEX-function (no corresponding source code available), but the comments are enough to understand the implementation.
For what it's worth, there is also a function imresize_old, which provides the older implementation of imresize (used in version R2006b and earlier). It gives yet another, different result:
>> B = imresize(A, [1 N], 'nearest')
B =
1 4 4 7 4 4 3 6 6
>> B = imresize_old(A, [1 N], 'nearest')
B =
1 4 4 7 7 4 3 6 6
What's more, it has previously been observed that the MATLAB and Octave implementations also differ in some cases.
EDIT:
As you noted, in some cases you have to be mindful of floating-point limitations when working with interp1. We can do the interpolation with x-values in the [0,1] range, or in a more stable range like [1,numel(A)]; because of rounding errors in edge cases, the two can give different results.
For example compare the two codes below:
% interpolation in [0,1]
N = 11;
y = [1 4 7 4 3 6];
x = linspace(0,1,numel(y));
xi = linspace(0,1,N);
yi = interp1(x, y, xi, 'nearest');
% print numbers with extended precision
fprintf('%.17f %g\n',[x;y])
fprintf('%.17f %g\n',[xi;yi])
against:
% interpolation in [1,k]
N = 11;
y = [1 4 7 4 3 6];
x = 1:numel(y);
xi = linspace(1,numel(y),N);
yi = interp1(x, y, xi, 'nearest');
% print numbers with extended precision
fprintf('%.17f %g\n',[x;y])
fprintf('%.17f %g\n',[xi;yi])
Here is the output nicely formatted:
--------------------------------------------------------
[0,1] RANGE | [1,k] RANGE
--------------------------------------------------------
xi yi | xi yi
--------------------------------------------------------
0.00000000000000000 1 | 1.00000000000000000 1 |
0.20000000000000001 4 | 2.00000000000000000 4 |
0.40000000000000002 7 | 3.00000000000000000 7 |
0.59999999999999998 4 | 4.00000000000000000 4 | INPUT
0.80000000000000004 3 | 5.00000000000000000 3 |
1.00000000000000000 6 | 6.00000000000000000 6 |
--------------------------------------------------------
0.00000000000000000 1 | 1.00000000000000000 1 |
0.10000000000000001 4 | 1.50000000000000000 4 |
0.20000000000000001 4 | 2.00000000000000000 4 |
0.29999999999999999 4 | 2.50000000000000000 7 |
0.40000000000000002 7 | 3.00000000000000000 7 |
0.50000000000000000 4 | 3.50000000000000000 4 | OUTPUT
0.59999999999999998 4 | 4.00000000000000000 4 |
0.69999999999999996 4 | 4.50000000000000000 3 |
0.80000000000000004 3 | 5.00000000000000000 3 |
0.90000000000000002 6 | 5.50000000000000000 6 |
1.00000000000000000 6 | 6.00000000000000000 6 |
--------------------------------------------------------
So you can see that some numbers are not exactly representable in double precision when working in the [0,1] range. 0.3, which is supposed to lie exactly in the middle of [0.2, 0.4], turns out to be closer to the lower end 0.2 than to 0.4 because of roundoff error. On the other side, 2.5 lies exactly in the middle of [2, 3] (all numbers exactly represented) and is assigned to the upper end 3 by nearest-neighbor rounding.
Also be aware that colon and linspace can produce different outputs sometimes:
>> (0:0.1:1)' - linspace(0,1,11)'
ans =
0
0
0
5.5511e-17
0
0
0
0
0
0
0
NN is the simplest form of interpolation: use the value at the nearest sample location. NN interpolation in MATLAB is computationally efficient, but if you need more accuracy I recommend bilinear or bicubic interpolation. You can also check interp1().
This page provides an explanation with an example: http://www.mathworks.com/help/vision/ug/interpolation-methods.html
I have no reference for this, so I suggest you test it against other examples using imresize, but I can recover MATLAB's values like this.
Assume that A represents y values and the positions of elements in A represent x values. so now
n = length(A);
N = 9;
x = 1:n %// i.e. 1:6
now we need to find interpolating position i.e. xi points. I would have done it like this:
xi = round((1:N)/N*n)
which gives
xi =
1 1 2 3 3 4 5 5 6
which results in a yi of
yi = A(xi)
yi =
1 1 4 7 7 4 3 3 6
which differs from both yours and Matlab's answers (how did you get yours?)
So then I tried:
xi = round(((0:N-1)/N)*n)+1
yi = A(xi)
which makes just as much sense and gets me Matlab's result of
yi =
1 4 4 7 4 4 3 6 6
So I'm guessing that's how they do it, but I don't have imresize at hand to test other cases.
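For what it's worth, the conventional pixel-center mapping used in image resampling (output sample i maps to input coordinate (i - 0.5)/scale + 0.5) also reproduces MATLAB's output on this example. This is a reconstruction from that common convention, not a claim about imresize's actual source:

```matlab
A = [1 4 7 4 3 6];
n = numel(A);                            % 6 input samples
N = 9;                                   % desired output length
scale = N/n;
xi = round(((1:N) - 0.5)/scale + 0.5);   % pixel-center coordinate mapping
xi = min(max(xi, 1), n);                 % clamp to valid indices
yi = A(xi)
% yi = 1 4 4 7 4 4 3 6 6, matching imresize(A, [1 9], 'nearest')
```

The half-sample offsets treat each value as the center of a pixel, which is why this differs from interp1, whose samples sit exactly on the endpoints.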
A = [1 2 3; 7 6 5]
B = [3 7];
A-B = [1-3 2-3 3-3; 7-7 6-7 5-7];
ans =[-2 -1 0; 0 -1 -2]
This is the operation I want. How can I do it with matrix operations, rather than an iterative solution?
You do this most conveniently with bsxfun, which automatically expands the arrays to match in size (so that you don't need to use repmat). Note that I need to transpose B so that it's a 2-by-1 array.
A = [1 2 3; 7 6 5]
B = [3 7];
result = bsxfun(@minus, A, B')
result =
-2 -1 0
0 -1 -2
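In MATLAB R2016b and later, implicit expansion (the same compatible-sizes rule mentioned in an earlier answer) makes bsxfun unnecessary here; subtracting the transposed vector directly gives the same result:

```matlab
A = [1 2 3; 7 6 5];
B = [3 7];
result = A - B'   % the 2-by-1 column is expanded across the 3 columns
% result = [-2 -1  0
%            0 -1 -2]
```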
I think that Jonas answer is the best. But just for the record, here is the solution using an explicit repmat:
A = [1 2 3; 7 6 5];
B = [3 7];
sz = size(A);
C = A - repmat(B', [1 sz(2:end)]);
Not only is Jonas' answer simpler, it is actually faster by a factor of 2 for large matrices on my machine.
It's also interesting to note that in the case where A is an n-d array, both these solutions do something quite reasonable. The matrix C will have the following property:
C(k,:,...,:) == A(k,:,...,:) - B(k)
In fact, Jonas' answer will run, and very likely do what you want, in the case where B is m-d, as long as the initial dimensions of A and B' have the same size. You can change the repmat solution to mimic this ... at which point you are starting to reimplement bsxfun!
Normally you can't. Iterative solutions will be necessary, because the problem is poorly defined. Matrix addition/subtraction is only defined for matrices of the same dimensions.
ie:
A = | 1 2 3 |
| 7 6 5 |
B = | 3 7 |
It makes no sense to subtract a 1x2 matrix from a 2x3 matrix.
However, if you expand B into a 2x3 matrix by multiplying it (as a diagonal matrix) with a matrix of ones, that works. Let
D = diag(B) = | 3 0 |
              | 0 7 |
and
Y = | 1 1 1 |
    | 1 1 1 |
so that
D * Y = | 3 3 3 |
        | 7 7 7 |
Therefore, A - D*Y gives a valid, non-iterative solution:
A - D*Y = | 1 2 3 | - | 3 3 3 | = | -2 -1  0 |
          | 7 6 5 |   | 7 7 7 |   |  0 -1 -2 |
The only "cheat" here is the use of the diag() function, which converts a vector into a diagonal matrix. There is a way to re-create diag() manually from matrix/vector multiplications, but that would be more work than the solution above.
Good luck!
I am trying to identify if a value is repeated sequentially in a vector N times. The challenge I am facing is that it could be repeated sequentially N times several times within the vector. The purpose is to determine how many times in a row certain values fall above the mean value. For example:
>> return_deltas
return_deltas =
7.49828129642663
11.5098198572327
15.1776644881294
11.256677995536
6.22315734182976
8.75582103474613
21.0488849115947
26.132605745393
27.0507649089989
...
(I only printed a few values for example but the vector is large.)
>> mean(return_deltas)
ans =
10.50007490258002
>> sum(return_deltas > mean(return_deltas))
ans =
50
So there are 50 instances of a value in return_deltas being greater than the mean of return_deltas.
I need to identify the number of times, sequentially, the value in return_deltas is greater than its mean 3 times in a row. In other words, if the values in return_deltas are greater than its mean 3 times in a row, that is one instance.
For example:
---------------------------------------------------------------------
| `return_delta` value | mean | greater or less | sequence |
|--------------------------------------------------------------------
| 7.49828129642663 |10.500074902 | LT | 1 |
| 11.5098198572327 |10.500074902 | GT | 1 |
| 15.1776644881294 |10.500074902 | GT | 2 |
| 11.256677995536 |10.500074902 | GT | 3 * |
| 6.22315734182976 |10.500074902 | LT | 1 |
| 8.75582103474613 |10.500074902 | LT | 2 |
| 21.0488849115947 |10.500074902 | GT | 1 |
| 26.132605745393 |10.500074902 | GT | 2 |
| 27.0507649089989 |10.500074902 | GT | 3 * |
---------------------------------------------------------------------
The star represents a successful sequence of 3 in a row. The result of this set would be two because there were two occasions where the value was greater than the mean 3 times in a row.
What I am thinking is to create a new vector:
>> a = return_deltas > mean(return_deltas)
that of course contains ones where the values in return_deltas are greater than the mean, and then using it to count how many times the values exceed the mean 3 times in a row. I am hoping to do this with a built-in function (if one exists, I have not discovered it), or at least without loops.
Any thoughts on how I might approach?
With a little work, this snippet finds the starting index of every run of repeated values:
[0 find(diff(v) ~= 0)] + 1
An Example:
>> v = [3 3 3 4 4 4 1 2 9 9 9 9 9]; % vector of integers
>> run_starts = [0 find(diff(v) ~= 0)] + 1 % for floating point, use abs(diff(v)) > EPSILON instead of diff(v) ~= 0
run_starts =
1 4 7 8 9
To find the length of each run
>> run_lengths = [diff(run_starts), length(v) - run_starts(end) + 1]
These variables then make it easy to query which runs were at least a certain length:
>> find(run_lengths >= 4)
ans =
5
>> find(run_lengths >= 2)
ans =
1 2 5
This tells us that the only run of at least four equal values in a row was run #5.
However, there were three runs of at least two in a row: runs #1, #2, and #5.
You can reference where each run starts from the run_starts variable.
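Applied back to the original question, the same diff-based trick counts the runs of at least three consecutive above-mean values directly, with no loop (a sketch; return_deltas is your full data vector):

```matlab
a = return_deltas > mean(return_deltas);  % logical: above the mean?
d = diff([0; a(:); 0]);                   % +1 at each run start, -1 one past each run end
run_lengths = find(d == -1) - find(d == 1);
count = sum(run_lengths >= 3)             % number of runs of 3 or more in a row
```

Padding with zeros on both ends guarantees every run of ones has both a detected start and a detected end, even when the vector begins or ends above the mean.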