I have a matrix which starts out as a column vector:
matrix1 = [1:42]'
I have another matrix which is, for argument's sake, a 2000-by-2 matrix called matrix2.
Column 1 of matrix2 will always be a number between 1 and 42, in any order.
I want to loop through column 1 of matrix2 and populate column 2 of matrix1 with the value from column 2 of matrix2, placed next to the corresponding number. At the end of each iteration of the loop I'm going to sum up column 2 of matrix2.
so in pseudo code it would be something like this:
for i = 1:length(matrix2)
look at i,1 its a "4" for example - take matrix2 column2
and populate matrix1 next to the 4
(ie column 2 in matrix 1 next to the 4 with this number)
so matrix 1 initially
1
2
3
4
5
6
matrix 2
3 100
1 250
2 200
1 80
4 40
5 50
so after one iteration matrix1 looks like this
1
2
3 100
4
5
6
after iteration 2 matrix1 looks like this
1 250
2
3 100
4
5
6
after iteration 3 matrix1 looks like this
1 250
2 200
3 100
4
5
6
As mentioned, I'll perform a calculation after each iteration, but the important thing is to populate column 2 of matrix1. There will obviously be many overwrites of the numbers in column 2 of matrix1, but that's fine since I'm going to perform a calculation after each iteration.
Try this:
for ii = 1:size(matrix2, 1)
    matrix1(matrix2(ii,1),2) = matrix2(ii,2);
end
You might need to make matrix1 two-dimensional from the start (you can just initialize the second column to all zeros).
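As a quick check, here is a minimal sketch of that loop run on the six-row example from the question. It assumes matrix1 is pre-allocated with a zero second column, and runningTotal is only a hypothetical stand-in for whatever per-iteration calculation you actually need.

```matlab
% Example data from the question (6 rows instead of 42)
matrix1 = [(1:6)', zeros(6, 1)];      % column 2 pre-allocated with zeros
matrix2 = [3 100; 1 250; 2 200; 1 80; 4 40; 5 50];

runningTotal = zeros(size(matrix2, 1), 1);   % hypothetical per-iteration result
for ii = 1:size(matrix2, 1)
    % Column 1 of matrix2 is the row of matrix1 to overwrite
    matrix1(matrix2(ii, 1), 2) = matrix2(ii, 2);
    % Stand-in calculation: sum of column 2 of matrix1 after this update
    runningTotal(ii) = sum(matrix1(:, 2));
end
```

Note that the fourth row of matrix2 overwrites the 250 next to the 1 with 80, which is the overwrite behaviour described in the question.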
Related
I am working in MATLAB and I have a row (or column) Matrix calculated on each iteration of for loop.
Suppose on first iteration of for loop
A=[1 2 5 7]
On second iteration of for loop
A=[2 5 1 8 4 5]
The length of A is variable. I want the result as follows.
B=[1 2 5 7 2 5 1 8 4 5]
i.e. B should have the results of all the previous iterations in single row or column matrix.
How can this be achieved?
I have a cell in Matlab composed as follow, where each entry can have multiple integer.
For instance:
A=cell(2,10);
A{1,1}=[5];
A{1,2}=[5 7];
A{1,3}=[5];
A{1,4}=[5];
A{1,5}=[5];
A{1,6}=[5];
A{1,7}=[5];
A{1,8}=[5];
A{1,9}=[5];
A{1,10}=[5];
A{2,1}=[5];
A{2,2}=[3];
A{2,3}=[1];
A{2,4}=[5];
A{2,5}=[2];
A{2,6}=[6];
A{2,7}=[2];
A{2,8}=[2];
A{2,9}=[1];
A{2,10}=[5 4];
I would like to obtain a matrix which contains the elements of the cells. When a cell in a row contains multiple entries (for example A{1,2}), each of them should be included once. For instance, the matrix output should be:
B=[5 5 5 5 5 5 5 5 5 5; %A{1,:}, first element of each cell
5 7 5 5 5 5 5 5 5 5; %A{1,:}, with the second element of A{1,2}
5 3 1 5 2 6 2 2 1 5;
5 3 1 5 2 6 2 2 1 4];
Could you help me?
Thanks in advance
This will do it:
[r,c]= size(A); %Finding the size of A for the next step
B=zeros(r*2,c); %Pre-allocating the memory
for iter=1:r
k=find(cellfun('length',A(iter,:))==2); %finding which elements have length =2
temp=cell2mat(A(iter,:)); %converting cell to matrix
k1= k+ [0:size(k,2)-1]; %finding which elements should come in the next row instead of in next column
temp1= temp(k1+1); %storing those elements in 'temp1' matrix
temp(k1+1)=[]; %removing those elements from original 'temp' matrix
B(2*(iter-1)+1:2*(iter-1)+2, :)=[temp; temp];
B(2+(iter-1)*2,k)=temp1; %replacing the elements from temp1
end
B
I am aware of MATLAB's datasample which allows to select k times from a certain population. Suppose population=[1,2,3,4] and I want to uniformly sample, with replacement, k=5 times from it. Then:
datasample(population,k)
ans =
1 3 2 4 1
Now, I want to repeat the above experiment N=10000 times without using a for loop. I tried doing:
datasample(repmat(population,N,1),5,2)
But the output I get is (just a short excerpt below):
1 3 2 1 3
1 3 2 1 3
1 3 2 1 3
1 3 2 1 3
1 3 2 1 3
1 3 2 1 3
1 3 2 1 3
1 3 2 1 3
1 3 2 1 3
Every row (result of an experiment) is the same! But obviously they should be different... It's as though some random seed is not updating between rows. How can I fix this? Or some other method I could use that avoids a for loop? Thanks!
You seem to be confusing the way datasample works. If you read the documentation, when you pass a matrix it samples whole slices of that matrix: rows by default, or columns when you pass 2 as the dimension, as you did. The second parameter is how many slices to draw. Because you repeated the population vector 10000 times, every row is identical, so each sampled column holds the same value in all rows. The columns that get picked are random, but every row of the output ends up the same, which is why you are getting that "error".
As such, I wouldn't use datasample here if it is your intention to avoid looping. You can use datasample, but you'd have to loop over each call and you explicitly said that this is not what you want.
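To see the effect on a small case, here is a minimal sketch. It assumes the Statistics and Machine Learning Toolbox is available (datasample lives there); the small N and the variable rowsIdentical are just for illustration.

```matlab
population = [1 2 3 4];
N = 4;                                   % small N so the result is easy to read
data = repmat(population, N, 1);         % every row is a copy of population

% Sampling along dimension 2 draws 5 column indices once,
% then returns those whole columns
s = datasample(data, 5, 2);              % N-by-5

% Each sampled column is constant, so all rows come out identical
rowsIdentical = isequal(s, repmat(s(1, :), N, 1));
```

rowsIdentical is true no matter which columns get drawn, since the randomness only decides which columns are picked, not what each row contains.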
What I would recommend is to first create your population vector with whatever you want in it, then generate a random index matrix where each value lies between 1 and the number of elements in population. The number of rows of this matrix is the number of trials and the number of columns is the number of samples per trial. Once you have this matrix, simply use it to index into your vector to get the desired sampling matrix. To generate the random index matrix, randi is a fine choice.
Something like this comes to mind:
N = 10000; %// Number of trials
M = 5; %// Number of samples per trial
population = 1:4; %// Population vector
%// Generate random indices
ind = randi(numel(population), N, M);
%// Get the stuff
out = population(ind);
Here's the first 10 rows of the output:
>> out(1:10,:)
ans =
4 3 1 4 2
4 4 1 3 4
3 2 2 2 3
1 4 2 2 2
1 2 3 4 2
2 2 3 2 1
4 1 3 2 4
1 4 1 3 1
1 1 2 4 4
1 2 4 2 1
I think the above does what you want. Also keep in mind that the above code generalizes to any population vector you want. You simply have to change the vector and it will work as advertised.
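As a rough sanity check of the generalization, here is a sketch with a population that is not 1:p. The values 10, 20, 30, 40 and the variable freq are my own illustrative choices, not something from the question.

```matlab
N = 10000;                         % number of trials
M = 5;                             % number of samples per trial
population = [10 20 30 40];        % hypothetical non-consecutive population

ind = randi(numel(population), N, M);   % random indices into population
out = population(ind);                  % N-by-M matrix of sampled values

% Fraction of all draws that landed on each population element;
% for uniform sampling each entry should be close to 1/numel(population)
freq = accumarray(ind(:), 1, [numel(population) 1]) / numel(ind);
```

Because the sampling happens on the indices, the values stored in population can be anything, and freq gives a quick way to eyeball whether the draws look uniform.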
datasample interprets each column of your data as one element of your population, sampling among all columns.
To fix this you could call datasample N times in a loop. Instead, I would use randi:
population(randi(numel(population),N,5))
assuming your population is always 1:p, you could simplify to:
randi(p,N,5)
OK, so both of the current answers say not to use datasample and to use randi instead. However, I have a solution for you with datasample and arrayfun.
>> population = [1 2 3 4];
>> k = 5; % Number of samples
>> n = 1000; % Number of times to execute datasample(population, k)
>> s = arrayfun(@(m) datasample(population, m), k*ones(n, 1), 'UniformOutput', false);
>> s = cell2mat(s);
s =
1 4 1 4 4
4 1 2 2 4
2 4 1 2 1
1 4 3 3 1
4 3 2 3 2
We need to use 'UniformOutput', false with arrayfun because each call returns a vector rather than a scalar. The cell2mat call is needed because the result of arrayfun is then a cell array.
I have a matrix S in Matlab that looks like the following:
2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1
I would like to count patterns of values column-wise. I am interested in the frequency of the numbers that follow right after number 3 in any of the columns. For instance, number 3 occurs three times in the first column. The first time we observe it, it is followed by 3, the second time it is followed by 3 again and the third time it is followed by 4. Thus, the frequency for the patterns observed in the first column would look like:
3-3: 66.66%
3-4: 33.33%
3-1: 0%
3-2: 0%
To generate the output, you could use the convenient tabulate
S = [
2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1];
idx = find(S(1:end-1,:)==3);
S2 = S(2:end,:);
tabulate(S2(idx))
Value Count Percent
1 0 0.00%
2 0 0.00%
3 4 66.67%
4 2 33.33%
Here's one approach, finding the 3's then looking at the following digits
[i,j]=find(S==3);
k=i+1<=size(S,1);
T=S(sub2ind(size(S),i(k)+1,j(k))) %// the elements of S that are just below a 3
R=arrayfun(@(x) sum(T==x)./sum(k),1:max(S(:))).' %// get the relative frequency of each digit
I'm going to restate your problem statement in a way that I can understand and my solution will reflect this new problem statement.
For a particular column, locate the locations that contain the number 3.
Look at the row immediately below these locations and look at the values at these locations
Take these values and tally up the total number of occurrences found.
Repeat these for all of the columns and update the tally, then determine the percentage of occurrences for the values.
We can do this by the following:
A = [2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1]; %// Define your matrix
[row,col] = find(A(1:end-1,:) == 3);
vals = A(sub2ind(size(A), row+1, col));
h = 100*accumarray(vals, 1) / numel(vals)
h =
0
0
66.6667
33.3333
Let's go through the above code slowly. The first few lines define your example matrix A. Next, we take a look at all of the rows except for the last row of your matrix and see where the number 3 is located with find. We skip the last row because we want to be sure we stay within the bounds of your matrix. If there is a number 3 in the last row, trying to read the value below it would index past the end of the matrix and MATLAB would throw an out-of-bounds error, because there's nothing there!
Once we do this, we take a look at those values in the matrix that are 1 row beneath those that have the number 3. We use sub2ind to help us facilitate this. Next, we use these values and tally them up using accumarray then normalize them by the total sum of the tallying into percentages.
The result would be a 4 element array that displays the percentages encountered per number.
To double check, if we look at the matrix, we see that the value of 3 follows another 3 a total of 4 times: after the 3s in the first column at rows 3 and 4, in the second column at row 2 and in the third column at row 6. The value of 4 follows a 3 two times: after the 3s in the first column at row 5 and in the second column at row 3.
In total, we have 6 numbers we counted, and so dividing by 6 gives us 4/6 or 66.67% for number 3 and 2/6 or 33.33% for number 4.
If I got the problem statement correctly, you could efficiently implement this with MATLAB's logical indexing and an approach that is essentially of two lines -
%// Input 2D matrix
S = [
2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1]
Labels = [1:4]'; %//'# Label array
counts = histc(S([false(1,size(S,2)) ; S(1:end-1,:) == 3]),Labels)
Percentages = 100*counts./sum(counts)
Verify/Present results
The styles for presenting the output results listed next use MATLAB's table for a well human-readable format of data.
Style #1
>> table(Labels,Percentages)
ans =
Labels Percentages
______ ___________
1 0
2 0
3 66.667
4 33.333
Style #2
You can do some fancy string operations to present the results in a more "representative" manner -
>> Labels_3 = strcat('3-',cellstr(num2str(Labels','%1d')'));
>> table(Labels_3,Percentages)
ans =
Labels_3 Percentages
________ ___________
'3-1' 0
'3-2' 0
'3-3' 66.667
'3-4' 33.333
Style #3
If you want to present them in descending sorted manner based on the percentages as listed in the expected output section of the question, you can do so with an additional step using sort -
>> [Percentages,idx] = sort(Percentages,'descend');
>> Labels_3 = strcat('3-',cellstr(num2str(Labels(idx)','%1d')'));
>> table(Labels_3,Percentages)
ans =
Labels_3 Percentages
________ ___________
'3-3' 66.667
'3-4' 33.333
'3-1' 0
'3-2' 0
Bonus Stuff: Finding frequency (counts) for all cases
Now, let's suppose you would like repeat this process for say 1, 2 and 4 as well, i.e. find occurrences after 1, 2 and 4 respectively. In that case, you can iterate the above steps for all cases and for the same you can use arrayfun -
%// Get counts
C = cell2mat(arrayfun(@(n) histc(S([false(1,size(S,2)) ; S(1:end-1,:) == n]),...
1:4),1:4,'Uni',0))
%// Get percentages
Percentages = 100*bsxfun(@rdivide, C, sum(C,1))
Giving us -
Percentages =
90.9091 20.0000 0 100.0000
9.0909 20.0000 0 0
0 60.0000 66.6667 0
0 0 33.3333 0
Thus, in Percentages, the first column holds the percentages of [1,2,3,4] that occur right after a 1 somewhere in the input matrix. As an example, column 3 of Percentages is what you had in the sample output when looking for elements right after 3 in the input matrix.
If you want to compute frequencies independently for each column:
S = [2 2 1 2
2 3 1 1
3 3 1 1
3 4 1 1
3 1 2 1
4 1 3 1
1 1 3 1]; %// data: matrix
N = 3; %// data: number
r = max(S(:));
[R, C] = size(S);
[ii, jj] = find(S(1:end-1,:)==N); %// step 1
count = full(sparse(S(ii+1+(jj-1)*R), jj, 1, r, C)); %// step 2
result = bsxfun(@rdivide, count, sum(S(1:end-1,:)==N)); %// step 3
This works as follows:
1. find is first applied to determine row and col indices of occurrences of N in S except its last row.
2. The values in the entries right below the indices of step 1 are accumulated for each column, in variable count. The very convenient sparse function is used for this purpose. Note that this uses linear indexing into S.
3. To obtain the frequencies for each column, count is divided (with bsxfun) by the number of occurrences of N in each column.
The result in this example is
result =
0 0 0 NaN
0 0 0 NaN
0.6667 0.5000 1.0000 NaN
0.3333 0.5000 0 NaN
Note that the last column correctly contains NaNs because the frequency of the sought patterns is undefined for that column.
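If the sparse/bsxfun version is hard to follow, the same per-column frequencies can be cross-checked with a plain loop. This is only a sketch of an alternative, not the method above; the names rows and followers are my own.

```matlab
S = [2 2 1 2
     2 3 1 1
     3 3 1 1
     3 4 1 1
     3 1 2 1
     4 1 3 1
     1 1 3 1];                 % data: matrix
N = 3;                         % data: number to look for

r = max(S(:));
[R, C] = size(S);
result = nan(r, C);            % columns without any N stay NaN

for c = 1:C
    rows = find(S(1:end-1, c) == N);   % occurrences of N, last row excluded
    if isempty(rows)
        continue;                      % frequency undefined for this column
    end
    followers = S(rows + 1, c);        % values directly below each N
    % Tally the followers over 1..r and normalise by the number of N's
    result(:, c) = accumarray(followers, 1, [r 1]) / numel(rows);
end
```

On this example the loop reproduces the matrix shown above, including the NaN column for the column that contains no 3.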
I have a testfile.txt, which is a 4 x 4 matrix and tab delimited
1 1 3 4
2 2 3 4
3 1 3 4
4 2 3 4
The output i want is like so:
If it detects that the second column has a 1, insert a new column on the right side, and the new column should contain something like x=[1 1 0 3]
If it detects that the second column has a 2, insert a new column on the right side, and the new column should contain something like y=[2 3 4 5]
This is how the output should look like:
1 1 x=[1 1 0 3] 3 4
2 2 y=[2 3 4 5] 3 4
3 1 x=[1 1 0 3] 3 4
4 2 y=[2 3 4 5] 3 4
Ultimately, in MATLAB this is the output I want to get:
1 1 1 1 0 3 3 4
2 2 2 3 4 5 3 4
3 1 1 1 0 3 3 4
4 2 2 3 4 5 3 4
What I've tried is:
test=dlmread('testfile.txt','\t');
m=length(test);
for i=1:m
if find(test(:,2)==1)>0
x=[1 1 0 3];
test=[test(:,1) x test(:,3:4)];
elseif find(test(:,2)==2)>0
y=[2 3 4 5];
test=[test(:,1) y test(:,3:4)];
dlmwrite('testfile.txt',test,'delimiter','\t','precision','%.4f');
end
end
The error I get is the following:
Dimensions of matrices being concatenated are not consistent.
The error is from the following statement:
Error in : test=[test(:,1) x test(:,3:4)]
I'll be really appreciative if someone can help me, since i'm quite new in MATLAB.
Thanks in advance!
Here is a completely vectorized solution for you. Let's go through this one step at a time. You obviously are reading in the text data right, so let's keep that code the same.
test = dlmread('testfile.txt','\t');
What I'm going to do is create a 2D array where the first row corresponds to your x that you want to insert, while the second row corresponds to the y that you want to insert. In other words, create a variable called insertData such that:
insertData = [1 1 0 3; 2 3 4 5];
Next, you simply have to use the second column to figure out which row of data from insertData you want to insert into your final matrix. You can then use this to create your final matrix, which we will store in testOut. In other words:
testOut = [test(:,1:2) insertData(test(:,2),:) test(:,3:4)]
The output I get is:
testOut =
1 1 1 1 0 3 3 4
2 2 2 3 4 5 3 4
3 1 1 1 0 3 3 4
4 2 2 3 4 5 3 4
Let's walk through the above code slowly. The first two columns of your data stored in test and the last two columns in your data stored in test are the same. You want to insert the data right in the middle. As such, you create a new matrix called testOut where the first two columns are the same, and then in the middle is where it gets interesting. Every time the second column has a 1, we access the first row of insertData, and we place our data in the corresponding row. Every time the second column has a 2, we access the second row of insertData, and we place our data in the corresponding row. To finish everything off, the last two columns should be the same.
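To make the indexing step easier to see, here is a sketch that builds the same result without reading a file. The inline test matrix stands in for the dlmread call, and picked is a hypothetical intermediate variable added only for illustration.

```matlab
% Inline copy of the 4-by-4 example (stands in for dlmread('testfile.txt','\t'))
test = [1 1 3 4
        2 2 3 4
        3 1 3 4
        4 2 3 4];

insertData = [1 1 0 3      % row 1: inserted when column 2 is 1 (x)
              2 3 4 5];    % row 2: inserted when column 2 is 2 (y)

% Column 2 of test is used as a row index into insertData,
% giving one row of insertData per row of test
picked = insertData(test(:, 2), :);

% First two columns, the inserted block, then the last two columns
testOut = [test(:, 1:2) picked test(:, 3:4)];
```

This relies on column 2 containing only 1s and 2s, as in the question; any other value would need its own row in insertData.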
Minor Note
If you want to understand why your code isn't working, it's because you are not concatenating the rows properly: test(:,1) has one row per line of the file while x has a single row, so the row counts do not match. In addition, in your for loop, you are using : to access all of the rows for a particular column when you should be accessing one row at a time... at least that's how I'm interpreting your for loop. This change must also be made in your if statements. Also, you are adding onto the test variable, when you need to declare a NEW variable. Finally, you need to move the dlmwrite call so that it runs AFTER the for loop has finished and you have finished creating the new matrix. The combination of all of these things is ultimately why you are getting errors in your code.
Basically, what you need to do, if you want to use your code, is do this:
test=dlmread('testfile.txt','\t');
m=size(test,1); %// Number of rows
testOut = []; %// Must declare NEW variable
for i=1:m
    if test(i,2)==1 %// Change
        x=[1 1 0 3];
        testOut=[testOut; test(i,1:2) x test(i,3:4)]; %// NEW
    elseif test(i,2)==2 %// Change
        y=[2 3 4 5];
        testOut=[testOut; test(i,1:2) y test(i,3:4)]; %// NEW
    end
end
%// Move this out!
dlmwrite('testfile.txt',testOut,'delimiter','\t','precision','%.4f');
Take a look at how testOut is being concatenated in the for loop. You are going to take the current state of testOut, move to the next row using ; then add your new data in.
This code should now work, but you can easily achieve what you want to do in just two lines.
Hope this helped!