I have a matrix 2 x N (lets call it MyMatrix) containing pairs of elements (element in (1,1) corresponds to element (2,1), element in (1,2) correspords to element (2,2) and so on.) Entries in first row are sorted in ascending order. What I would like to do is split this matrix into 2 matrices 2 x K and 2 x N-K. First matrix will contain part of MyMatrix where entries in row 1 are less than some given value (in my example it will be (max-min)/2 , where max = maximum value in row 1, min = minimum walue in row 1) and second matrix will consist of the rest of MyMatrix. I'm sorry if it is confusing but I tried my best to explain to you what I would like to achieve.
Here is an example:
MyMat =
|1 2 4 6 13 52 65 120 125|
|4 132 53 1 64 34 5 2 66 |
min = 1 , max = 125, avg = (125-1)/2 = 62.
so result will be as follows:
a =
|1 2 4 6 13 52 |
|4 132 53 1 64 34 |
b=
|65 120 125|
|5 2 66 |
Thanks in advance for your help.
Kind regards,
Tom.
You can simply do
a=MyMat(:,MyMat(1,:)<avg);
b=MyMat(:,MyMat(1,:)>=avg);
Related
I have two matrices A (51 rows X 5100 columns) and B (51rows X 5100 columns) and I want to subtract every row of A with every row of B to obtain another matric C (2601 rows X 5100 columns). How can I have the matrix C?
You can do that by
permuting the matrices' dimensions to obtain 3D arrays of size (in your example) 51 × 1 × 5100 and 1 × 51 × 5100 respectively;
subtracting with implicit expansion, which gives an array of size 51 × 51 × 5100;
reshaping to collapse the first two dimensions into one, which gives the final 51*51 × 5100 matrix.
A = rand(51, 5100); % example matrix
B = rand(51, 5100); % example matrix, same number of columns
C = reshape(permute(A, [1 3 2]) - permute(B, [3 1 2]), [], size(A, 2));
The crux of the problem lies in getting the correct pairs of rows for both matrices. To do this, you could use the meshgrid() function to generate a matrix that varies from 1:n along its rows, and another that varies along its columns (where n is the number of rows).
For example:
mtx1 = reshape(1:9, 3, 3);
mtx2 = reshape(101:109, 3, 3);
n1 = size(mtx1, 1);
n2 = size(mtx2, 1);
[r1, r2] = meshgrid(1:n1, 1:n2);
This gives:
r1 =
1 2 3
1 2 3
1 2 3
r2 =
1 1 1
2 2 2
3 3 3
Next, flatten both r1 and r2:
f1 = r1(:)
f2 = r2(:)
Now, we have:
f1 =
1
1
1
2
2
2
3
3
3
f2 =
1
2
3
1
2
3
1
2
3
We can use f1 and f2 as the indices for our pairs of rows:
mtx1(f1, :) repeats the first row of mtx1 three times, then the second row, then the third row
mtx1(f1, :)
1 4 7
1 4 7
1 4 7
2 5 8
2 5 8
2 5 8
3 6 9
3 6 9
3 6 9
mtx2(f2, :) repeats the entire matrix mtx2 three times
mtx2(f2, :)
101 104 107
102 105 108
103 106 109
101 104 107
102 105 108
103 106 109
101 104 107
102 105 108
103 106 109
Subtract these two, and you get your pairwise difference of rows:
mtx2(f2, :) - mtx1(f1, :)
100 100 100
101 101 101
102 102 102
99 99 99
100 100 100
101 101 101
98 98 98
99 99 99
100 100 100
This also works when mtx1 and mtx2 have different row counts.
In matlab, I want to fit a piecewise regression and find where on the x-axis the first change-point occurs. For example, for the following data, the output might be changepoint=20 (I don't actually want to plot it, just want the change point).
data = [1 4 4 3 4 0 0 4 5 4 5 2 5 10 5 1 4 15 4 9 11 16 23 25 24 17 31 42 35 45 49 54 74 69 63 46 35 31 27 15 10 5 10 4 2 4 2 2 3 5 2 2];
x = 1:52;
plot(x,data,'.')
If you have the Signal Processing Toolbox, you can directly use the findchangepts function (see https://www.mathworks.com/help/signal/ref/findchangepts.html for documentation):
data = [1 4 4 3 4 0 0 4 5 4 5 2 5 10 5 1 4 15 4 9 11 16 23 25 24 17 31 42 35 45 49 54 74 69 63 46 35 31 27 15 10 5 10 4 2 4 2 2 3 5 2 2];
x = 1:52;
ipt = findchangepts(data);
x_cp = x(ipt);
data_cp = data(ipt);
plot(x,data,'.',x_cp,data_cp,'o')
The index of the change point in this case is 22.
Plot of data and its change point circled in red:
I know this is an old question but just want to provide some extra thoughts. In Maltab, an alternative implemented by me is a Bayesian changepoint detection algorithm that estimates not just the number and locations of the changepoints but also reports the occurrence probability of changepoints. In its current implementation, it deals with only time-series-like data (aka, 1D sequential data). More info about the tool is available at this FileExchange entry (https://www.mathworks.com/matlabcentral/fileexchange/72515-bayesian-changepoint-detection-time-series-decomposition).
Here is its quick application to your sample data:
% Automatically install the Rbeast or BEAST library to local drive
eval(webread('http://b.link/beast')) %
data = [1 4 4 3 4 0 0 4 5 4 5 2 5 10 5 1 4 15 4 9 11 16 23 25 24 17 31 42 35 45 49 54 74 69 63 46 35 31 27 15 10 5 10 4 2 4 2 2 3 5 2 2];
out = beast(data, 'season','none') % season='none': there is no seasonal/periodic variation in the data
printbeast(out)
plotbeast(out)
Below is a summary of the changepoint, given by printbeast():
#####################################################################
# Trend Changepoints #
#####################################################################
.-------------------------------------------------------------------.
| Ascii plot of probability distribution for number of chgpts (ncp) |
.-------------------------------------------------------------------.
|Pr(ncp = 0 )=0.000|* |
|Pr(ncp = 1 )=0.000|* |
|Pr(ncp = 2 )=0.000|* |
|Pr(ncp = 3 )=0.859|*********************************************** |
|Pr(ncp = 4 )=0.133|******** |
|Pr(ncp = 5 )=0.008|* |
|Pr(ncp = 6 )=0.000|* |
|Pr(ncp = 7 )=0.000|* |
|Pr(ncp = 8 )=0.000|* |
|Pr(ncp = 9 )=0.000|* |
|Pr(ncp = 10)=0.000|* |
.-------------------------------------------------------------------.
| Summary for number of Trend ChangePoints (tcp) |
.-------------------------------------------------------------------.
|ncp_max = 10 | MaxTrendKnotNum: A parameter you set |
|ncp_mode = 3 | Pr(ncp= 3)=0.86: There is a 85.9% probability |
| | that the trend component has 3 changepoint(s).|
|ncp_mean = 3.15 | Sum{ncp*Pr(ncp)} for ncp = 0,...,10 |
|ncp_pct10 = 3.00 | 10% percentile for number of changepoints |
|ncp_median = 3.00 | 50% percentile: Median number of changepoints |
|ncp_pct90 = 4.00 | 90% percentile for number of changepoints |
.-------------------------------------------------------------------.
| List of probable trend changepoints ranked by probability of |
| occurrence: Please combine the ncp reported above to determine |
| which changepoints below are practically meaningful |
'-------------------------------------------------------------------'
|tcp# |time (cp) |prob(cpPr) |
|------------------|---------------------------|--------------------|
|1 |33.000000 |1.00000 |
|2 |42.000000 |0.98271 |
|3 |19.000000 |0.69183 |
|4 |26.000000 |0.03950 |
|5 |11.000000 |0.02292 |
.-------------------------------------------------------------------.
Here is the graphic output. Three major changepoints are detected:
You can use sgolayfilt function, that is a polynomial fit to the data, or reproduce OLS method: http://www.utdallas.edu/~herve/Abdi-LeastSquares06-pretty.pdf (there is a+bx notation instead of ax+b)
For linear fit of ax+b:
If you replace x with constant vector of length 2n+1: [-n, ... 0 ... n] on each step, you get the following code for sliding regression coeffs:
for i=1+n:length(y)-n
yi = y(i-n : i+n);
sum_xy = sum(yi.*x);
a(i) = sum_xy/sum_x2;
b(i) = sum(yi)/n;
end
Notice that in this code b means sliding average of your data, and a is a least-square slope estimate (first derivate).
I am a novice at Matlab and am struggling a bit with creating a loop that will a convert a 283080 x 2 matrix - column 1 lists all stockID numbers (each repeated 60 times) and column 2 contains all lagged monthly returns (60 observations for each stock) into a 60 x 4718 matrix with a column for each stockID and its corresponding lagged returns falling in 60 rows underneath each ID number.
My aim is to then try to calculate a variance-covariance matrix of the returns.
I believe I need a loop because I will be repeating this process over 70 times as I have multiple data sets in this same current format
Thanks so much for the help!
Let data denote your matrix. Then:
aux = sortrows(data,1); %// sort rows according to value in column 1
result = reshape(aux(:,2),60,[]); %// reshape second column as desired
If you need to insert the stockID values as headings (first row of result), add this as a last line:
result = [ unique(aux(:,1)).'; result ];
A simple example, replacing 60 by 2:
>> data = [1 100
2 200
1 101
2 201
4 55
3 0
3 33
4 56];
>> aux = sortrows(data,1);
>> result = reshape(aux(:,2),2,[])
>> result = [ unique(aux(:,1)).'; result ];
result =
1 2 3 4
100 200 0 55
101 201 33 56
This question is an outgrowth of MatLab (or any other language) to convert a matrix or a csv to put 2nd column values to the same row if 1st column value is the same? and Group values in different rows by their first-column index
If
A = [2 3 234 ; 2 44 99999; 2 99999 99999; 3 123 99; 3 1232 45; 5 99999 57]
1st column | 2nd column | 3rd column
--------------------------------------
2 3 234
2 44 99999
2 99999 99999
3 123 99
3 1232 45
5 99999 57
I want to make
1st col | 2nd col | 3rd col | 4th col | 5th col | 6th col| 7th col
--------------------------------------------------------------------
2 3 234 44
3 123 99 1232 45
5 57
That is, for each numbers in the 1st col of A, I want to put numbers EXCEPT "99999"
If we disregard the "except 99999" part, we can code as Group values in different rows by their first-column index
[U, ix, iu] = unique(A(:,1));
vals = reshape(A(:, 2:end).', [], 1); %'// Columnize values
subs = reshape(iu(:, ones(size(A, 2) - 1, 1)).', [], 1); %'// Replicate indices
r = accumarray(subs, vals, [], #(x){x'});
But obviously this code won't ignore 99999.
I guess there are two ways
1. first make r, and then remove 99999
2. remove 99999 first, and then make r
Whichever, I just want faster one.
Thank you in advance!
I think options 1 is better, i.e. first make r, and then remove 99999. Having r, u can remove 99999 as follows:
r2 = {}; % new cell array without 99999
for i = 1:numel(r)
rCell = r{i};
whereIs9999 = rCell == 99999;
rCell(whereIs9999) = []; % remove 99999
r2{i} = rCell;
end
Or more fancy way:
r2= cellfun(#(c) {c(c~=99999)}, r);
This question is an outgrowth of MatLab (or any other language) to convert a matrix or a csv to put 2nd column values to the same row if 1st column value is the same?
If
A = [2 3 234 ; 2 44 33; 2 12 22; 3 123 99; 3 1232 45; 5 224 57]
1st column | 2nd column | 3rd column
2 3 234
2 44 33
2 12 22
3 123 99
3 1232 45
5 224 57
then running
[U ix iu] = unique(A(:,1) );
r= accumarray( iu, A(:,2:3), [], #(x) {x'} )
will show me the error
Error using accumarray
Second input VAL must be a vector with one element for each row in SUBS, or a
scalar.
I want to make
1st col | 2nd col | 3rd col | 4th col | 5th col | 6th col| 7th col
2 3 234 44 33 12 22
3 123 99 1232 45
5 224 57
I know how to do it using for and if, but that spends too much time for big data.
How can I do this?
Thank you in advance!
You're misusing accumarray in the solution provided to your previous question. The first parameter iu is the vector of indices and the second parameter should be a vector of values, of the same length. What you did here is specify a matrix as the second parameter, which in fact has twice more values than indices in iu.
What you need to do in order to make it work is create a vector of indices both for the second column and for the third column (they are the same indices, not coincidentally!) and specify a matching column vector of values, like so:
[U, ix, iu] = unique(A(:,1));
vals = reshape(A(:, 2:end).', [], 1); %'// Columnize values
subs = reshape(iu(:, ones(size(A, 2) - 1, 1)).', [], 1); %'// Replicate indices
r = accumarray(subs, vals, [], #(x){x'});
This solution is generalized for any number of columns that you want to pass to accumarray.