reformatting a matrix in matlab with nan values - matlab

This post follows a previous question regarding the restructuring of a matrix:
re-formatting a matrix in matlab
An additional problem I face is demonstrated by the following example:
depth = [0:1:20]';
data = rand(1,length(depth))';
d = [depth,data];
d = [d;d(1:20,:);d];
Here I would like to alter this matrix so that each column represents a specific depth and each row represents time, so eventually I will have 3 rows (i.e. days) and 21 columns (i.e. measurement at each depth). However, we cannot reshape this because the number of measurements for a given day are not the same i.e. some are missing. This is known by:
dd = sortrows(d,1);
for i = 1:length(depth)
    % count the rows recorded at this depth (length() of a matrix returns its largest dimension)
    e(i) = sum(dd(:,1)==depth(i));
end
From 'e' we find that the number of measurements per depth differs, i.e. not every day covers every depth. How could I insert a NaN into the matrix so that each day has the same depth values? I could find the unique depths first by:
unique(d(:,1))
From this, if a depth (from unique) is missing for a given day I would like to insert the depth to the correct position and insert a nan into the respective location in the column of data. How can this be achieved?

You were right that unique comes in handy here. You also need its third output argument, which gives, for every row of d, the index of that row's depth in the list of unique depths. Have a look at this code; the comments explain each step.
% find unique depths and their mapping onto the d array
[depths, ~, j] = unique(d(:,1));
% find the start of every day of measurements
% the assumption here is that the depths for each day are in increasing order
days_data = [1; diff(d(:,1))<0];
% count the number of days
ndays = sum(days_data);
% map every entry in d to the correct day
days_data = cumsum(days_data);
% construct the output array full of nans
dd = nan(numel(depths), ndays);
% assign the existing measurements using linear indices
% Where data does not exist, NaN will remain
dd(sub2ind(size(dd), j, days_data)) = d(:,2)
dd =
0.5115 0.5115 0.5115
0.8194 0.8194 0.8194
0.5803 0.5803 0.5803
0.9404 0.9404 0.9404
0.3269 0.3269 0.3269
0.8546 0.8546 0.8546
0.7854 0.7854 0.7854
0.8086 0.8086 0.8086
0.5485 0.5485 0.5485
0.0663 0.0663 0.0663
0.8422 0.8422 0.8422
0.7958 0.7958 0.7958
0.1347 0.1347 0.1347
0.8326 0.8326 0.8326
0.3549 0.3549 0.3549
0.9585 0.9585 0.9585
0.1125 0.1125 0.1125
0.8541 0.8541 0.8541
0.9872 0.9872 0.9872
0.2892 0.2892 0.2892
0.4692 NaN 0.4692
You may want to transpose the matrix.
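To check the approach by hand, here is a minimal sketch on made-up data: three depths, two days, with the deepest measurement missing on the second day. The names dayIdx and out are illustrative and do not come from the answer above. It still assumes that depths increase within each day.

```matlab
% Toy data: column 1 = depth, column 2 = measurement (values are made up).
% Day 1 has depths 0,1,2; day 2 is missing depth 2.
d = [0 10; 1 11; 2 12; 0 20; 1 21];

% Unique depths and, for every row of d, the index of its depth.
[depths, ~, j] = unique(d(:,1));

% A new day starts wherever the depth decreases.
dayIdx = cumsum([1; diff(d(:,1)) < 0]);

% Pre-fill with NaN, then place each measurement at (depth, day).
out = nan(numel(depths), max(dayIdx));
out(sub2ind(size(out), j(:), dayIdx)) = d(:,2);

% Transpose so that rows are days and columns are depths.
out = out.';
```

With this input, out has two rows (days) and three columns (depths), and the only NaN sits at day 2, depth 2.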

It's not entirely clear from your question what your data looks like exactly, but the following might help you towards an answer.
Suppose you have a column vector
day1 = (1:21)'; % parentheses are needed: 1:21' transposes only the scalar 21 and yields a row vector
and, initially, all the values are NaN
day1(:) = NaN
Suppose next that you have a 2d array of measurements, in which the first column represents depths, and the second the measurements at those depths. For example
msrmnts = [1,2;2,3;4,5;6,7] % etc
then the assignment
day1(msrmnts(:,1)) = msrmnts(:,2)
will set values in only those rows of day1 whose indices are found in the first column of msrmnts. This second statement uses Matlab's capabilities for using one array as a set of indices into another array, for example
d([9 7 8 12 4]) = 1:5
would set elements [9 7 8 12 4] of d to the values 1:5. Note that the indices of the elements do not need to be in order. You could even insert the same value several times into the index array, eg [4 4 5 6 3 4] though it's not terribly useful.
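Direct indexing as above only works when the depths are themselves valid indices (positive integers). In the question the depths start at 0, so a lookup step is needed first. The following is a minimal sketch of one way to do that with ismember; allDepths, found and pos are hypothetical names, and the numbers are made up.

```matlab
% All depths that should appear, as a column vector (0 is not a valid index).
allDepths = (0:5).';

% Measurements for one day: column 1 = depth, column 2 = value.
msrmnts = [0 1.5; 2 2.5; 5 3.5];

% pos(k) is the row of allDepths holding msrmnts(k,1);
% found(k) is false if that depth is not in allDepths at all.
[found, pos] = ismember(msrmnts(:,1), allDepths);

% Start with NaN everywhere and fill only the depths that were measured.
day1 = nan(numel(allDepths), 1);
day1(pos(found)) = msrmnts(found, 2);
```

Here day1 ends up with values at depths 0, 2 and 5, and NaN at the remaining depths.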

Related

How to pick entries from each column of a matrix such that their sum is equal to an already specified number?

An example to clarify my question is that Let's assume a matrix
A=[ 0.8147 0.9134; 0.9058 0.6324; 0.1270 0.0975];
I want to select one entry from each column such that their sum is equal or approximately equal to a given number, say 1 for the above matrix. The selected entries from the first and second column could be 0.9058 and 0.0975 respectively, which sum to approximately 1 (0.9058+0.0975=1.0033), or any other combination that results in a sum of about 1. How can I do this?
Edit: Here, Matrix A(3x2) is given only as an example. Actual matrix is quite large with many rows and columns. Any exhaustive search is taking too much time for a large matrix.
Here is a compact way to do it for two column A.
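nRows = size(A,1); % number of rows (length(A) would return the larger dimension)
B = nchoosek(1:numel(A),2); % all pairs of linear indices
B(B(:,1)>nRows | B(:,2)<(nRows+1),:) = []; % keep pairs with one entry from each column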
S = sum(A(B),2);
P = A(B(S>.98 & S<1.02,:)) % Set your tolerance. Here: abs(S-1)<0.02
Might not be the most elegant solution, but I think it should get the job done. You might need to adjust the number of digits passed to round to get the tolerance you want.
This solution loops through each value in column 1 of A once with each value in column 2 of A, so it is possible a value in column 1 could sum to ~1 more than once and therefore be duplicated. To find only the first match you could introduce a "break" in the loop if there is an answer already, or make a sub_matrix to find the sum closest to 1 if you only want one possible match included.
% Define matrix:
A = [ 0.8147 0.9134; 0.9058 0.6324; 0.1270 0.0975];
% Loop through column 1 of A and sum each value with each value in column 2.
% If the rounded sum is equal to 1.0, the pair of numbers is stored in the
% Matched_up matrix and the row indices of the match in the original
% matrix are stored in the Matched_indices matrix.
Matched_up = [];
Matched_indices = [];
for indexA = 1:size(A,1)
    for indexB = 1:size(A,1)
        if round(A(indexA,1)+A(indexB,2),1) == 1.0
            Matched_up = [Matched_up; A(indexA,1) A(indexB,2)];
            Matched_indices = [Matched_indices; indexA indexB];
        end
    end
end
% Pair up values and check:
Row_sum = sum(Matched_up,2);
% If what you are ultimately after is the row which has a sum closest to 1
% You could look at the abs difference between the row_sum and 1
[Abs_diff, Row_Closest_to_one] = min(abs(1-Row_sum));
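For the two-column case, the double loop can also be expressed as one matrix of pairwise sums. This is only a sketch under the assumption of exactly two columns and an illustrative tolerance of 0.02. It is still an exhaustive search over all row pairs, so it does not address the many-column case mentioned in the question. The names S, r1, r2 and pairs are made up.

```matlab
A = [0.8147 0.9134; 0.9058 0.6324; 0.1270 0.0975];
target = 1;
tol = 0.02;   % assumed tolerance; adjust as needed

% S(p,q) = A(p,1) + A(q,2) for every pair of rows.
S = bsxfun(@plus, A(:,1), A(:,2).');

% r1 indexes rows of column 1, r2 indexes rows of column 2.
[r1, r2] = find(abs(S - target) < tol);
pairs = [A(r1,1), A(r2,2)];
```

For this A the only pair within tolerance is row 2 of column 1 with row 3 of column 2, i.e. 0.9058 and 0.0975.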

Merge 2 vectors on equal time values

I have collected two types of data. One is a struct Outputs with 3 fields: Outputs.time, Outputs.signals and an unimportant one. Outputs.time is a column vector containing all the time values (where the data is sampled). Outputs.signals has 15 rows, each holding the values and properties of one signal (so there are 15 signals in total). Consequently Outputs.signals(i).values has the same number of rows as Outputs.time.
Now I have another table with 4 columns: LabData.time, LabData.NdBoiler, LabData.NdOutput and an unimportant one. Outputs.time contains all the computer-sampled data, LabData.time only some measurements taken by hand. So Outputs.time is much longer than LabData.time, but at certain times (where Outputs.time = LabData.time) there are values for both Outputs.signals and the other columns of LabData.
The goal is to put the values of LabData.NdBoiler and LabData.NdOutput in Outputs.signals(16) and Outputs.signals(17) for the time samples where the value is known. For the other samples, Outputs.signals(16) and Outputs.signals(17) should be NaN. But I don't know how to do that; could you help me?
Example:
Outputs.time = [1; 2; 3; 4; 5];
Outputs.signals(1).values = [1111; 2222; 3333; 4444; 5555]; %and so on for the other signals
LabData.time = [2; 4];
LabData.NdBoiler = [1.23; 1.32];
%% Now the final result should be
Outputs.signals(16).values = [NaN; 1.23; NaN; 1.32; NaN]
The idea is to first create a vector of NaNs and then use ismember to match the time points and substitute in the values you know.
Outputs.signals(16).values = nan(length(Outputs.time),1); % Column vector of NaNs, same shape as Outputs.time
Lia = ismember(Outputs.time,LabData.time); % Where do the times match?
Outputs.signals(16).values(Lia) = LabData.NdBoiler; % Substitute (assumes LabData.time is sorted like Outputs.time)
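The logical-mask assignment above relies on every LabData.time value being present in Outputs.time and on both vectors being in the same order. If that cannot be guaranteed, the second output of ismember gives the matching positions directly. The following is a sketch with made-up data (an unsorted LabData.time with one unmatched entry); found, loc and vals are illustrative names.

```matlab
Outputs.time = [1; 2; 3; 4; 5];
LabData.time = [4; 2; 9];              % unsorted, and 9 has no match
LabData.NdBoiler = [1.32; 1.23; 7.7];

% loc(k) is the row of Outputs.time equal to LabData.time(k), or 0 if none.
[found, loc] = ismember(LabData.time, Outputs.time);

% Column of NaNs with the same length as the time vector.
vals = nan(numel(Outputs.time), 1);

% Write only the lab values whose time stamp was found.
vals(loc(found)) = LabData.NdBoiler(found);

Outputs.signals(16).values = vals;
```

The lab value recorded at time 9 is silently dropped here; whether that is acceptable or should raise a warning depends on the data.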

Grouping by nested unique values

I have a matrix A in Matlab:
A = [176 5406 1 4 7903;
155 5406 1 5 7903;
122 5407 0 4 7903;
140 5407 0 5 7904;
130 5407 0 3 7904];
Just for information - the second column is a user ID, while the fourth column is a time. So 5406 is one user and 5407 is another user. Both of these users have some information stored in the first column and the 4th column which I am interested in accessing.
So basically what I want to do is:
For each user take the median of their values in the first column. I have written code (below) that works for this.
If there are two equal "time" values in column 5 for a user, then I want to average the corresponding values in column 4. For example, for user 5406 the time values are both 7903, so I want the average of the values in column 4, i.e. the average of 4 and 5, to end up with one value (4.5).
But for example for the next user 5407 I will have two final values - one will be the average of 5 and 3 (because 7904 is repeated) and one will be 4 (because 7903 is not repeated).
I am a bit confused about how to do this, I know there needs to be an if statement of some sort, but I've been stuck on it for ages. Can anyone help?
Thanks
Code for the first part:
u=unique(A(:,2));
for i=1:size(u,1)
M=find(A(:,2)==u(i)); % compare the whole ID column, not only row i
med(i)=median(A(M,1));
end
You could run unique for the time values of each user (within the loop) and do a similar sub loop to collect the mean of unique timestamp for that user.
But here I think it's neater to use accumarray. In first example below, I've modified your code just a bit.
% Get unique
[user, ~, userIdx] = unique(A(:,2));
nUser = numel(user);
% Allocate container for result
med = zeros(nUser,1);
men = cell(nUser,1); % <-- Need a cell since length of result could vary
for i = 1:nUser
    % Median of col #1
    med(i) = median(A(userIdx == i, 1));
    % Mean of col #4 for unique times
    [~, ~, timeIdx] = unique(A(userIdx == i, 5));
    men{i} = accumarray(timeIdx, A(userIdx == i, 4), [], @mean);
end
Result:
>> med =
165.5
130
>> celldisp(men)
men{1} =
4.5
men{2} =
4
4
To squeeze it a bit more, you could take unique time for entire A and use accumarray for both
[~, ~, userIdx] = unique(A(:,2));
[~, ~, timeIdx] = unique(A(:,5));
med = accumarray(userIdx, A(:,1), [], @median);
men = accumarray([userIdx timeIdx], A(:,4), [], @mean, NaN);
This gives men not as a cell but as a matrix. The empty positions therefore have to be filled with something (here I chose NaN, since 0 could be a legitimate result of @mean).
men =
4.5 NaN
4 4
If you want it as a cell without NaN you could just loop over the rows and pick non-NaN values, or place only the men calculation in the loop, or any other way...
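As one possible way to do the first of these (looping over the rows and keeping the non-NaN values), here is a minimal sketch. It assumes column 4 of A contains no NaN of its own, since such values could not be told apart from the fill value. menCell is an illustrative name.

```matlab
A = [176 5406 1 4 7903;
     155 5406 1 5 7903;
     122 5407 0 4 7903;
     140 5407 0 5 7904;
     130 5407 0 3 7904];

[~, ~, userIdx] = unique(A(:,2));
[~, ~, timeIdx] = unique(A(:,5));

% NaN-padded matrix: one row per user, one column per distinct time.
men = accumarray([userIdx timeIdx], A(:,4), [], @mean, NaN);

% One cell per user, keeping only the (user, time) combinations that exist.
menCell = cell(size(men,1), 1);
for r = 1:size(men,1)
    row = men(r,:);
    menCell{r} = row(~isnan(row)).';
end
```

For the example matrix this gives 4.5 for the first user and the two values 4 and 4 for the second.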
If you are sure that column 4 of A doesn't contain any negative or zero numbers (mean value should never risk being 0), you could collect the result of men as a sparse matrix instead
men = accumarray([userIdx timeIdx], A(:,4), [], @mean, 0, true);
men =
(1,1) 4.5
(2,1) 4
(2,2) 4
I got another solution for your task without using any loops:
Median values.
u=unique(A(:,2));
umedians = arrayfun( @(x) median (A( A(:,2)==x, 1)), u);
Explanation:
Find all unique users first. Then use arrayfun to select the rows of each user and calculate the median of column 1 for every one of them.
Average values of column 4.
This task is a bit harder. We can go this way:
temp = arrayfun( @(x) unique(A ( A(:,2)==x,5 )), u, 'UniformOutput',false);
result = cellfun( @(y,z) arrayfun( @(x) mean( A( A(:,2) == u(z) & A(:,5) == x ,4) ), ...
y, 'UniformOutput',false), temp , num2cell( [1:size(u,1)]'), 'UniformOutput',false)
Explanation: first of all, let's find all unique times for each user and save them to the cell array temp. Then, for each cell, we need to find the rows with the same time and calculate their mean. So we use cellfun to process each cell of temp, with an arrayfun inside it to calculate the mean for each time.
Hope it helps!

Calculation the elements of different sized matrix in Matlab

Can anybody help me to find out the method to calculate the elements of different sized matrix in Matlab ?
Let say that I have 2 matrices with numbers.
Example:
A=[1 2 3;
4 5 6;
7 8 9]
B=[10 20 30;
40 50 60]
At first,we need to find maximum number in each column.
In this case, Ans=[40 50 60].
And then, we need to find the coefficient (k).
Coefficient (k) is equal to 1 divided by the number of columns of matrix A.
In this case, coefficient (k)=1/3=0.33.
I want to create a matrix C filled with the results of the following calculation.
Example in MS Excel.
H4 = ABS((C2-C6)/C9)*0.33+ABS((D2-D6)/D9)*0.33+ABS((E2-E6)/E9)*0.33
I4 = ABS((C3-C6)/C9)*0.33+ABS((D3-D6)/D9)*0.33+ABS((E3-E6)/E9)*0.33
J4 = ABS((C4-C6)/C9)*0.33+ABS((D4-D6)/D9)*0.33+ABS((E4-E6)/E9)*0.33
And then (Like above)
H5 = ABS((C2-C7)/C9)*0.33+ABS((D2-D7)/D9)*0.33+ABS((E2-E7)/E9)*0.33
I5 = ABS((C3-C7)/C9)*0.33+ABS((D3-D7)/D9)*0.33+ABS((E3-E7)/E9)*0.33
J5 = ABS((C4-C7)/C9)*0.33+ABS((D4-D7)/D9)*0.33+ABS((E4-E7)/E9)*0.33
C =
0.34 =|(1-10)|/40*0.33+|(2-20)|/50*0.33+|(3-30)|/60*0.33
0.28 =|(4-10)|/40*0.33+|(5-20)|/50*0.33+|(6-30)|/60*0.33
0.22 =|(7-10)|/40*0.33+|(8-20)|/50*0.33+|(9-30)|/60*0.33
0.95 =|(1-40)|/40*0.33+|(2-50)|/50*0.33+|(3-60)|/60*0.33
0.89 =|(4-40)|/40*0.33+|(5-50)|/50*0.33+|(6-60)|/60*0.33
0.83 =|(7-40)|/40*0.33+|(8-50)|/50*0.33+|(9-60)|/60*0.33
Actually A is a 15x4 matrix and B is a 5x4 matrix.
The real matrices may be even larger than these.
How can I write this in Matlab?
Thanks you!
You can do it like so. Let's assume that A and B are defined as you did before:
A = vec2mat(1:9, 3)
B = vec2mat(10:10:60, 3)
A =
1 2 3
4 5 6
7 8 9
B =
10 20 30
40 50 60
vec2mat will transform a vector into a matrix. You simply specify how many columns you want, and it will automatically determine the right number of rows to turn the vector into a correctly shaped matrix (thanks @LuisMendo!). Note that vec2mat ships with the Communications Toolbox; reshape(1:9,3,[]).' builds the same A without it. Let's also define more things based on your post:
maxCol = max(B); %// Finds maximum of each column in B
coefK = 1 / size(A,2); %// 1 divided by number of columns in A
I am going to assume that coefK multiplies every term of the sum. You would thus compute your desired matrix like so:
cellMat = arrayfun(@(x) sum(coefK*(bsxfun(@rdivide, ...
abs(bsxfun(@minus, A, B(x,:))), maxCol)), 2), 1:size(B,1), ...
'UniformOutput', false);
outputMatrix = cell2mat(cellMat).'
You thus get:
outputMatrix =
0.3450 0.2833 0.2217
0.9617 0.9000 0.8383
Seems like a bit much to chew right? Let's go through this slowly.
Let's start with the bsxfun(@minus, A, B(x,:)) call. This takes the A matrix and subtracts from it one particular row of B, selected by x. In our case x is either 1 or 2, since B has two rows. What is cool about bsxfun is that it subtracts this row, B(x,:), from every row of A.
Next, what we need to do is divide every single number in this result by the corresponding columns found in our maximum column, defined as maxCol. As such, we will call another bsxfun that will divide every element in the matrix outputted in the first step by their corresponding column elements in maxCol.
Once we do this, we weight all of the values of each row by coefK (or actually every value in the matrix). In our case, this is 1/3.
After, we then sum over all of the columns to give us our corresponding elements for each column of the output matrix for row x.
As we wish to do this for every row of B, we apply arrayfun, which substitutes values of x from 1 up to the number of rows in B. For each value of x we get a column vector with one entry per row of A (3 x 1 here). This code will only work if A and B have the same number of columns. I have not placed any error checking here. In this case, both matrices have 3 columns. We need to set UniformOutput to false because the output of each arrayfun call is not a single number, but a vector.
After we do this, this returns each row of the output matrix in a cell array. We need to use cell2mat to transform these cell array elements into a single matrix.
You'll notice that this is the result we want, but it is transposed due to summing along the columns in the second step. As such, simply transpose the result and we get our final answer.
Good luck!
Dedication
This post is dedicated to Luis Mendo and Divakar - The bsxfun masters.
Assuming by maximum number in each column, you mean columnwise maximum after vertically concatenating A and B, you can try this one-liner -
sum(abs(bsxfun(@rdivide,bsxfun(@minus,permute(A,[3 1 2]),permute(B,[1 3 2])),permute(max(vertcat(A,B)),[1 3 2]))),3)./size(A,2)
Output -
ans =
0.3450 0.2833 0.2217
0.9617 0.9000 0.8383
If by maximum number in each column, you mean columnwise maximum of B, you can try -
sum(abs(bsxfun(@rdivide,bsxfun(@minus,permute(A,[3 1 2]),permute(B,[1 3 2])),permute(max(B),[1 3 2]))),3)./size(A,2)
The output for this case stays the same as the previous case, owing to the values of A and B.
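For readers who find the nested bsxfun and permute calls hard to follow, the same quantity can be written as two plain loops. This is only a sketch: it assumes the maximum is taken column-wise over B alone and that A and B have the same number of columns. The names C, p and q are illustrative.

```matlab
A = [1 2 3; 4 5 6; 7 8 9];
B = [10 20 30; 40 50 60];

maxCol = max(B, [], 1);      % column-wise maximum of B
coefK  = 1 / size(A, 2);     % 1 divided by the number of columns

% C(p,q) compares row p of B with row q of A.
C = zeros(size(B,1), size(A,1));
for p = 1:size(B,1)
    for q = 1:size(A,1)
        % Weighted sum of |A(q,:) - B(p,:)|, each term scaled by its column maximum.
        C(p,q) = coefK * sum(abs(A(q,:) - B(p,:)) ./ maxCol);
    end
end
```

For example, C(1,1) = (9/40 + 18/50 + 27/60)/3 = 0.345, which agrees with the first entry of the output shown above.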

Variable has "incorrect" value when submitted to Matlab Grader

I am struggling with my Matlab homework:
Write a script to do the following:
Generate a matrix called grades of size 8 x 25 that contains random numbers of type double in the range of 1 to 6.
Calculate the mean of matrix rows (mrow), the mean of matrix columns (mcol), and the overall mean (mall) of the matrix grades.
Copy the matrix grades to a new variable, in which you replace the elements in the 5th row and 20th to 23rd column with NaN. Compute the overall mean (mall_2) of this matrix again, i.e., the mean of the remaining values.
I am done with task 2-5, however, task 1 is not correct. I am not sure what I am doing wrong. I assume that it has something to do with the type of number (double), but I was unable to convert it.
We have to submit our homework to the online tool "Matlab Grader". The system says:
Matrix of random numbers : Variable grades has an incorrect value.
Here is my code:
% Generate matrix 'grades' with random numbers in the range 1 to 6
a = 1;
b = 6;
grades = (b-a).*rand(8,25) + a;
% calculate mean values 'mrow', 'mcol', 'mall'
mrow = mean(grades,2)
mcol = mean(grades,1)
mall = mean(grades(:))
% Replace elements with NaN
grades(5,20:23) = NaN
%Calculate mean of elements omitting NaN
mall_2 = mean(grades(:),'omitnan')
I assume your homework validation system is checking that everything in the variable grades is a (random) number in the range 1 to 6, as required by question 1.
However, by the end of your computation there are also 4 NaN values in the grades variable (row 5, columns 20 to 23), because you missed this step of question 3:
Copy the matrix grades to a new variable
Instead, you overwrote the elements in grades.
If you did this:
grades_mod = grades;
grades_mod(5,20:23) = NaN;
mall_2 = mean(grades_mod(:),'omitnan');
Then grades would retain its original values (no NaNs) and you can calculate mall_2.
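A quick local check before submitting can confirm both points. This is just a sketch; stillValid, kept and sameMean are made-up names, and the grader's actual test is not known.

```matlab
a = 1; b = 6;
grades = (b - a) .* rand(8, 25) + a;

grades_mod = grades;             % work on a copy
grades_mod(5, 20:23) = NaN;
mall_2 = mean(grades_mod(:), 'omitnan');

% The original matrix should still be NaN-free and within [1, 6].
stillValid = ~any(isnan(grades(:))) && all(grades(:) >= a & grades(:) <= b);

% Cross-check mall_2 against the mean of the non-NaN entries computed by hand.
kept = grades_mod(~isnan(grades_mod));
sameMean = abs(mall_2 - sum(kept) / numel(kept)) < 1e-12;
```

Both stillValid and sameMean should come out true, and kept should contain 8*25 - 4 = 196 values.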