Merging two datasets together according to key - matlab

I have two datasets, stored in a double array and a cell array, respectively. The layout of the two arrays is:
Array 1 (name: res) (double) has two columns: a unique id column and a data column.
Array 2 (name: config) (cell array) has 3 columns, each cell holding a string. The last cell in each row holds an integer id, stored as a string, that matches the ids in Array 1; it is converted to a double when necessary.
I want to merge the two datasets so that the 3 cells of the cell array AND the result column of Array 1 end up in one common cell array. How do I do this?
I have the following code, but it does not return the results in the correct order.
function resMat = buildResultMatrix(res, config)
resMat = {};
count = 1;
count_max = size(res,1)/130;
for i = 1 : size(res,1)
    for j = 1 : size(res,1)
        if isequal(res(i),str2double(config{j,3}))
            if i == 1
                resMat(end+1,:) = {config{j,:} res(j,2:end)};
            else
                if count == 1
                    resMat(end+1,:) = {config{j,:} res(j,2:end)};
                elseif count == count_max
                    resMat(end+1,:) = {config{j,:} res(j,2:end)};
                else
                    resMat(end+1,:) = {config{j,:} res(j,2:end)};
                end
                count = count + 1;
            end
        end
    end
    count = 1;
end
end

First convert the id in config to numbers:
config(:,3) = num2cell(str2double(config(:,3)));
Then run this:
res = sortrows(res,1);
config(:,4) = num2cell(res(cell2mat(config(:,3)),2))
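This puts the data from res into the 4th column of config, in the row with the same id. Note that indexing res by id only works when the ids are the consecutive integers 1..N, so that after sortrows the row number equals the id.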
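If the ids are not guaranteed to be 1..N, a lookup with ismember avoids relying on row positions. This is only a minimal sketch: the contents of res and config below are made-up sample data, and merged, found and loc are names introduced for illustration.

```matlab
% Hypothetical sample data; the ids are deliberately not 1..N and not sorted
res = [30 0.5; 10 1.5; 20 2.5];
config = {'a', 'x', '10'; 'b', 'y', '20'; 'c', 'z', '30'};

% Convert the id strings to numbers and look each one up in res(:,1)
ids = str2double(config(:,3));
[found, loc] = ismember(ids, res(:,1));

% Append the matching data value as a 4th column; unmatched rows stay empty
merged = config;
merged(found, 4) = num2cell(res(loc(found), 2));
```

Rows of config whose id does not appear in res simply keep an empty 4th cell, which makes missing matches easy to spot.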

Related

count number of elements with a specific value in a field of a structure in Matlab

I have a structure myS with several fields, including myField, which in turn includes several other fields such as BB. I need to count how many times 'R_value' appears in BB.
I have tried:
sum(myS.myField.BB = 'R_value')
and this:
count = 0;
for i = 1:numel(myS.myField)
number_of_element = numel(myS.myField(i).BB)=='R_value'
count = count+number_of_element;
end
but it doesn't work. Any suggestion?
If you are just checking whether BB is that literal string, then your loop is just:
count = 0;
for i = 1:numel(myS.myField)
    count = count + strcmp(myS.myField(i).BB, 'R_value');
end
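The same string count can be written without a loop by collecting every BB into a cell array first. A small sketch, assuming each BB holds a single string; the struct contents here are invented for illustration:

```matlab
% Hypothetical struct: myField is a 1x3 struct array with a string in BB
myS = struct();
myS.myField = struct('BB', {'R_value', 'other', 'R_value'});

% {myS.myField.BB} gathers all BB values; strcmp compares each to the literal
count = sum(strcmp({myS.myField.BB}, 'R_value'));
```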
numel counts how many elements there are. Zero is an element, and so is false. Just sum the array instead.
count = 0;
for i = 1:numel(myS.myField)
    number_of_element = sum(myS.myField(i).BB == R_value);
    count = count + number_of_element;
end
Also note that you had the parentheses wrong, so you were counting how many elements BB has in total and then comparing that number to R_value. I am assuming R_value is a number.
e.g.:
myS.myField(1).BB=[1 2 3 4 1 1 1]
myS.myField(2).BB=[4 5 65 1]
R_value=1
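With that numeric example, and assuming every BB is a row vector so they can be concatenated, the whole count also fits in one line:

```matlab
myS.myField(1).BB = [1 2 3 4 1 1 1];
myS.myField(2).BB = [4 5 65 1];
R_value = 1;

% [myS.myField.BB] joins all BB row vectors into one long row vector
count = sum([myS.myField.BB] == R_value);
```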

delete range of rows of a cell array under certain condition, MATLAB

I have a very large cell array containing a lot of measurements. In general, the measurements are in the range of 3 to 15 meters. My problem is that some datasets don't cover this range, so they are invalid data, and I want to remove those ranges of rows from my cell array.
Here is what I have tried (in summary):
ind_cond = find(strcmp('Machine',A{:,1}));
A = table2cell(A);
for i = 1:(length(ind_cond)-1);
cond = ismember(A(ind_cond(i):ind_cond(i+1),11),'15');
if cond == 0
A(ind_cond(i):ind_cond(i+1),11) = [];
end
end
So first I search for the word 'Machine' because this is in all the headers so I can have the total number of measurements. Then I try to find the string '15' (I convert this later to num) on the range of the measurements, and if there is no '15' I want to delete that range of rows from the array.
I get the following error:
"A null assignment can have only one non-colon index"
Many thanks
EDIT:
Here is a picture of how the data looks (I don't know how to upload the data itself; it is a .csv file, sorry).
The 11th column is the important one; it holds the data I'm interested in. The problem is that some datasets (there are a lot of them, from 0.25 to 17 meters) are incomplete because they don't have the value '15', so in that case I want to delete the entire dataset.
My first attempt was something like this:
for i = 1:(length(ind_cond)-1);
if ind_cond(i+1,1)- ind_cond(i,1) < 30 ;
A(ind_cond(i):ind_cond(i+1),:) = [];
end
end
It works well, but it doesn't delete all the problematic data, since I have one (1) very large dataset that doesn't have '15', and the condition above can't eliminate it.
In the picture "What i want to delete" there is an example of what the problematic data looks like; I want to delete all of that data.
Overview of data
What i want to delete
If the intent is to remove the cells that don't have the string '15', you can do the following:
A = [{'TEST'} {'Machine'} ; ...
{'test1'} {'3'}; ...
{'test2'} {'7'}; ...
{'test3'} {'16'}; ...
{'test4'} {'15'} ; ...
{'test5'} {'1'}; ...
{'test6'} {'8'}];
machine_cell = A(:,2);
% keep only the rows whose second column contains '15'
new_A = A(contains(machine_cell,'15'),:);
The new cell array will be:
>> new_A =
1×2 cell array
{'test4'} {'15'}
For the opposite, keeping all rows that don't have '15', just negate contains:
new_A = A(~contains(machine_cell,'15'),:);
>> new_A =
6×2 cell array
{'TEST' } {'Machine'}
{'test1'} {'3' }
{'test2'} {'7' }
{'test3'} {'16' }
{'test5'} {'1' }
{'test6'} {'8' }
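If the goal is to drop whole datasets (every block of rows that starts at a 'Machine' header) that never reach '15', one option is to label each row with its block number and keep only the blocks that contain an exact '15'. The error in the question comes from A(rows,11) = [], which tries to delete part of a column; deleting rows needs A(rows,:) = []. The sketch below uses made-up data and assumes that the first row is a header and that the values sit in column 2 (column 11 in the real data):

```matlab
% Hypothetical data: column 1 marks the header rows, column 2 holds the values
A = {'Machine', 'x'; 'm', '3'; 'm', '15'; ...
     'Machine', 'x'; 'm', '3'; 'm', '7'; ...
     'Machine', 'x'; 'm', '15'};

isHeader = strcmp(A(:,1), 'Machine');
blockId  = cumsum(isHeader);            % dataset number of every row (needs a header in row 1)
has15    = strcmp(A(:,2), '15');        % exact match on the value column

% Count the '15' rows per block and keep the blocks with at least one
keepBlock = accumarray(blockId, double(has15)) > 0;
new_A = A(keepBlock(blockId), :);
```

strcmp is used here instead of contains because contains matches substrings, so a value such as '150' would also count as containing '15'.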

Reference to non-existent field 'd'

My .mat file contains 40,000 rows and two columns. I have to read it line by line and then get the values of the last column in a single row.
Following is my code:
for v = 1:40000
    firstRowB = data.d(v,:)
    if(firstRowB(1,2)==1)
        count1=count1+1;
    end
    if(firstRowB(1,2)==2)
        count2=count2+1;
    end
end
firstRowB gets the row; the code checks whether the last column equals 1 or 2 and then increases the respective count by 1.
But I keep getting this error:
Reference to non-existent field 'd'.
You could use vectorization (it is usually convenient, especially in MATLAB). Taking advantage of the fact that true is one and false is zero, if you just want to count you can do:
count1 = sum(data.d(:,2) == 1);
count2 = sum(data.d(:,2) == 2);
In fact, in general you could define:
getNumberOfElementsInLastColEqualTo = @(numb) sum(data.d(:,end) == numb);
counts = arrayfun(getNumberOfElementsInLastColEqualTo, [1 2]);
Hope this helps.
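As for the error itself: it means data has no field named d. If data comes from data = load(...), which is an assumption since the question does not show that line, the field names are the names of the variables saved in the .mat file, and fieldnames lists them. A small sketch with a hypothetical struct standing in for the loaded file (the field name values is invented):

```matlab
% Hypothetical stand-in for data = load('yourfile.mat')
data = struct('values', [1 1; 2 2; 3 1]);

names = fieldnames(data);      % shows which variables the file really holds
if isfield(data, 'd')
    M = data.d;
else
    M = data.(names{1});       % fall back to the first stored variable
end

count1 = sum(M(:,end) == 1);
count2 = sum(M(:,end) == 2);
```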

How to filter and save each variable on matlab

I have data (a matrix) with 3 columns: DATA = [ID, DATE, Value].
I want to filter my data by ID, for example DATAid1 = DATA where ID == 1, and so on.
For that I wrote this code in MATLAB:
load calibrage_capteur.mat
data = [ID ,DATE , Valeur]
minid = min(data(:,1));
maxid = max(data(:,1));
for i=minid:maxid
ind=find(data(:,1) == i)
dataID = [ID(ind) ,DATE(ind) , Valeur(ind)]
end
As a result, only the last value is kept: in this example the max ID is 31, so only dataID for ID 31 remains. How can I save the variable at each iteration?
You will want to use a cell array to hold your data rather than saving them as independent variables that are named based upon the ID.
ids = minid:maxid;
data_by_ID = cell(numel(ids), 1);   % preallocate one cell per ID
for k = 1:numel(ids)
    data_by_ID{k} = data(data(:,1) == ids(k),:);
end
Really though, depending on what you're doing with it, you can keep working with data directly, since operations are generally faster on a numeric matrix than on a cell array.
%// Do stuff with data ID = 10
do_stuff(data(data(:,1) == 10, :));
Update
If you absolutely must name your variables you could do the following (but please don't do this and use one of the methods above).
for k = 1:numel(ids)
    eval(['dataId', num2str(ids(k)), ' = data(data(:,1) == ids(k), :);']);
end
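If named access per ID is really wanted, a struct with dynamic field names gives the same convenience without eval. This is only a sketch with made-up sample data; byId and the id%d naming scheme are assumptions for illustration, and it relies on the ids being positive integers so that the field names are valid.

```matlab
% Hypothetical sample data: columns are [ID, DATE, Value]
data = [1 20200101 5.0; 2 20200101 6.5; 1 20200102 5.5; 3 20200103 7.0];

ids = unique(data(:,1))';      % only the ids that actually occur
byId = struct();
for k = 1:numel(ids)
    name = sprintf('id%d', ids(k));               % e.g. 'id1', 'id2', ...
    byId.(name) = data(data(:,1) == ids(k), :);   % all rows with this id
end
```

The rows for ID 1 are then available as byId.id1, and fieldnames(byId) lists every group that was created.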
Your question is a bit unclear, but it sounds like you simply want to save the result at each iteration of the for loop.
I'm assuming the min and max ids are arbitrary and not necessarily the values you are trying to index on.
kk = minid:maxid;
dataID = cell(size(kk));   % one cell per id, since each id can have a different number of rows
for ii = 1:numel(kk)
    ind = data(:,1) == kk(ii);
    dataID{ii} = [ID(ind), DATE(ind), Valeur(ind)];
end
Indexing by position (ii) is better than indexing by the id itself, since it isn't clear that minid starts at 1 (maybe it starts at 0, or something else).

MATLAB: Copying variables from table to struct based on certain criteria

I have a table
column1data = [11; 22; 33];
column2data = [44; 55; 66];
column3data = [77; 88; 99];
rows = {'name1', 'name2', 'name3'};
T = table(column1data, column2data, column3data);
T.Properties.RowNames = rows
         column1data  column2data  column3data
name1    11           44           77
name2    22           55           88
name3    33           66           99
and a struct array
S(1).rownamefield = 'name3';
S(2).rownamefield = 'name1';
S(3).rownamefield = 'name2';
S(1).columnnumberfield = 1;
S(2).columnnumberfield = 3;
S(3).columnnumberfield = 2;
S(1).field3 = [];
S(2).field3 = [];
S(3).field3 = [];
     rownamefield  columnnumberfield  field3
1    'name3'       1                  []
2    'name1'       3                  []
3    'name2'       2                  []
The struct array S contains criteria needed to pick the variable from table T. Once the variable is picked, it has to be copied from table T to an empty field in struct S.
S(1).rownamefield contains the name of the row in table T where the target variable resides. S(1).columnnumberfield contains the number of the column in table T with the target variable. So S(1).rownamefield plus S(1).columnnumberfield are practically the coordinates of the target variable in table T. I need to copy the target variable from table T to field3 in the struct array: S(1).field3. This has to be done for all structs so it might need to be in a for loop, but I am not sure.
The output should look like this:
     rownamefield  columnnumberfield  field3
1    'name3'       1                  33
2    'name1'       3                  77
3    'name2'       2                  55
I have no idea how to approach this task. This is, of course, a simplified version of the problem. My real data table is 200x200 and the struct array has over 2000 structs. I will greatly appreciate any help with this.
You could do something like the following.
First convert the rownamefield and columnnumberfield fields to cells and arrays to use as indices for the table.
rows = {S.rownamefield};
cols = [S.columnnumberfield];
subtable = T(rows, cols);
This gives you a square table, which you can then convert to a cell array; its diagonal elements are the ones you care about.
values = table2cell(subtable);
values = values(logical(eye(numel(rows))));
This gives a cell array of the values corresponding to the entries in S, which we can then assign:
[S.field3] = deal(values{:});
disp([S.field3])
33 77 55
This would be much easier if table had an equivalent to sub2ind.
% Extract table data and linearly index it
tdata = T{:,:};
[~,rowpos] = ismember({S.rownamefield}, T.Properties.RowNames);
col = [S.columnnumberfield];
pos = sub2ind(size(tdata),rowpos, col);
val = tdata(pos);
% Assign to struct
for ii = 1:numel(S)
S(ii).field3 = val(ii);
end
Instead of the for loop, you can use Suever's solution with deal() to assign the values in one go (you have to call num2cell(val) first). Use whichever is faster and more intuitive.
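Put together, the sub2ind lookup with the deal() assignment might look like the sketch below. It rebuilds the small example from the question; the table's variable names are left at their defaults, valCell is a name introduced here, and T{:,:} assumes all table columns are numeric.

```matlab
% Example data from the question
T = table([11; 22; 33], [44; 55; 66], [77; 88; 99], ...
    'RowNames', {'name1', 'name2', 'name3'});
S = struct('rownamefield', {'name3', 'name1', 'name2'}, ...
           'columnnumberfield', {1, 3, 2}, ...
           'field3', []);

% Extract the table data and index it linearly
tdata = T{:,:};
[~, rowpos] = ismember({S.rownamefield}, T.Properties.RowNames);
col = [S.columnnumberfield];
val = tdata(sub2ind(size(tdata), rowpos, col));

% Assign all values in one go
valCell = num2cell(val);
[S.field3] = deal(valCell{:});
```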