MATLAB - extract selected rows in a table based on some criterion

Let's say I have a table like this:
post user date
____ ____ ________________
1 A 12.01.2014 13:05
2 B 15.01.2014 20:17
3 A 16.01.2014 05:22
I want to create a smaller table (but not delete the original one!) containing all posts of - for example - user A including the dates that those were posted on.
When looking at MATLAB's documentation (see the very last part for deleting rows) I discovered that MATLAB allows you to create a mask for a table based on some criterion. So in my case if I do something like this:
postsA = myTable.user == 'A'
I get a nice mask vector as follows:
>> postsA =
1
0
1
where the 1s are obviously those rows in myTable, which satisfy the rule I have given.
In the documentation I pointed to above, rows are deleted from the original table:
postsNotA = myTable.user ~= 'A' % note that I have to reverse the criterion since I'm choosing stuff that will be removed
myTable(postsNotA,:) = [];
I would however - as stated above - like to not touch my original table. One possible solution here is to create an empty table with two columns:
post date
____ ____
iterate through all rows of my original table while checking the current value of my mask vector postsA and, if it's equal to 1, copy the two columns in that row that I'm interested in and concatenate this shrunk row to my smaller table. What I'd like to know is whether there is a more or less 1-2 line solution for this problem?

Assuming myTable is your original table.
You can just do
myTable(myTable.user == 'A',:)
Sample Code:
user = ['A';'B';'A';'C';'B'];
Age = [38;43;38;40;49];
Height = [71;69;64;67;64];
Weight = [176;163;131;133;119];
BloodPressure = [124 93; 109 77; 125 83; 117 75; 122 80];
T = table(user,Age,Height,Weight,BloodPressure)
T(T.user=='A',:)
Gives:
T =
user Age Height Weight BloodPressure
____ ___ ______ ______ _________________________
A 38 71 176 124 93
B 43 69 163 109 77
A 38 64 131 125 83
C 40 67 133 117 75
B 49 64 119 122 80
ans =
user Age Height Weight BloodPressure
____ ___ ______ ______ _________________________
A 38 71 176 124 93
A 38 64 131 125 83
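Since the question asks for only the post and date columns, the same logical mask can be combined with a list of variable names in the second index. A minimal sketch, assuming user is stored as a single-character column as in the sample above (the table is rebuilt here from the question's data, and the workspace names dates and postsA are only illustrative):

```matlab
% Rebuild the question's table (values copied from the question).
post  = [1; 2; 3];
user  = ['A'; 'B'; 'A'];
dates = {'12.01.2014 13:05'; '15.01.2014 20:17'; '16.01.2014 05:22'};
myTable = table(post, user, dates, 'VariableNames', {'post', 'user', 'date'});

% Rows: logical mask. Columns: names of the variables to keep.
postsA = myTable(myTable.user == 'A', {'post', 'date'});
```

Indexing with parentheses returns a new table, so myTable itself is left unchanged. If user were a cell array of character vectors instead, strcmp(myTable.user, 'A') would be one way to build the mask.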

Related

MATLAB: Last row in a matrix is not computing :(

So I have a homogeneous numeric array as shown below. I converted this array to a table using the array2table function. What is shown below is simply the variable names being applied to the array. I have column names but I would like to have row names as well. Is it because the array is of one variable class that I can't do this?
T = array2table(C,'RowNames',{'','T0','T1','T2','T3','T4'},'VariableNames' ,{'to','t1','t2','t3','t4','t5','t6','t7','t8','t9','t10'})
T =
6×11 table
to t1 t2 t3 t4 t5 t6 t7 t8 t9 t10
___ ______ ______ ______ ______ ______ ______ ______ ______ ______ ______
0 18 36 54 72 90 108 126 144 162 180
15 15 15 15 15 15 15 15 15 15 15
325 304.17 303.4 295.01 293.52 288.3 286.56 282.49 280.5 276.99 274.8
325 325 315.67 314.35 308.58 306.86 302.38 300.33 296.49 294.19 290.74
325 325 325 320.82 319.8 315.43 313.61 309.61 307.35 303.69 301.2
325 325 325 325 321.25 319.95 315.9 313.85 310.05 307.63 304.1
The errors that I'm getting here are:
Error using matlab.internal.tabular.private.rowNamesDim/validateAndAssignLabels (line 109)
The RowNames property must be a cell array, with each element containing one nonempty character vector.
Error in matlab.internal.tabular.private.tabularDimension/setLabels (line 173)
obj = obj.validateAndAssignLabels(newLabels,indices,fullAssignment,fixDups,fixEmpties,fixIllegal);
Error in matlab.internal.tabular.private.tabularDimension/createLike_impl (line 355)
obj = obj.setLabels(dimLabels,[]);
Error in matlab.internal.tabular.private.tabularDimension/createLike (line 62)
obj = obj.createLike_impl(dimLength,dimLabels);
Error in tabular/initInternals (line 206)
t.rowDim = t.rowDim.createLike(nrows,rowLabels);
Error in table.init (line 327)
t = initInternals(t, vars, numRows, rowLabels, numVars, varnames);
Error in array2table (line 64)
t = table.init(vars,nrows,rownames,nvars,varnames);
The error you're getting is
The RowNames property must be a cell array, with each element containing one nonempty character vector.
Here is a valid version:
T = array2table(C,'RowNames',{'T','T0','T1','T2','T3','T4'},'VariableNames' ,{'to','t1','t2','t3','t4','t5','t6','t7','t8','t9','t10'})
You need to change the first element of the RowNames array to a nonempty character vector, e.g. 'T' instead of ''.
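If the row labels come from elsewhere and may contain empty entries, they can be patched before the table is built. A small sketch of that idea, where the 3-by-3 matrix C, the shortened name lists, and the placeholder 'T' are stand-ins rather than the asker's data:

```matlab
C = magic(3);                       % stand-in numeric array
rowNames = {'', 'T0', 'T1'};        % first label is empty, as in the question
varNames = {'t0', 't1', 't2'};

% Replace every empty label with a nonempty placeholder.
emptyIdx = cellfun(@isempty, rowNames);
rowNames(emptyIdx) = {'T'};

T = array2table(C, 'RowNames', rowNames, 'VariableNames', varNames);
```

Row names must also be unique, so with more than one empty label each would need its own placeholder.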

MATLAB - show categorical variables as columns instead of rows?

I execute the following line using the "hospital" data set and get the following:
>> statarray = grpstats(dsa,{'Smoker','Sex'},'mean','DataVars',{'Age','Weight'})
statarray =
Smoker Sex GroupCount mean_Age mean_Weight
0_Female false Female 40 37.425 130.32
0_Male false Male 26 38.808 180.04
1_Female true Female 13 38.615 130.92
1_Male true Male 21 39.048 181.14
I was wondering if it's easy to be able to instead have it be like this:
Smoker GroupCount mean_Age mean_Weight Male Female
0 false 66 37.97 149.91 26 40
1 true 34 38.882 161.94 21 13
I can't figure out how to bring the categorical variables to the columns like this of the stat table instead of having them as rows. Maybe this is not possible with grpstats. Just curious. Thanks!
You can count the sexes with a separate crosstab and then append the counts to statarray as two new columns:
statarray = grpstats(dataset2table(hospital),{'Smoker'},'mean',...
'DataVars',{'Age','Weight'});
statarray{:,end+1:end+2} = crosstab(hospital.Smoker,hospital.Sex);
statarray.Properties.VariableNames(end-1:end) = categories(hospital.Sex);
Output:
statarray =
Smoker GroupCount mean_Age mean_Weight Female Male
______ __________ ________ ___________ ______ ____
0 false 66 37.97 149.91 40 26
1 true 34 38.882 161.94 13 21
You may notice I converted hospital from a dataset to a table; this is because of this message in MATLAB's docs:
The dataset data type might be removed in a future release. To work with heterogeneous data, use the MATLAB® table data type instead. See MATLAB table documentation for more information.
And indeed, table is more friendly...

Missing data in repeated measure model

I am using the MATLAB function fitrm to fit a repeated measures model in order to investigate whether elements grouped according to Grouping1 have statistically different means for the variable measured at times t1, t2, t3, t4 (var_t1, var_t2, var_t3, var_t4).
My data look like the ones in the table:
Grouping1 Grouping2 Gender Age BMI var_t1 var_t2 var_t3 var_t4
______ ___________ ______ ______ ______ ____________ ____________ ____________ ____________
C B Male 60 24.802 836 608 746 NaN
C A Male 67 19.818 242 544 460 483
... ...
D C Female 65 21.631 621 468 NaN NaN
As you can see, I have some missing data for var_t3 and var_t4.
Can I still use fitrm?
If I fit a repeated measures model, where var_t1-var_t4 are the responses and Grouping1, Grouping2, Gender, Age and BMI are the predictor variables
Time = [1:4]';
rm = fitrm(table,'var_t1-var_t4 ~ Grouping1 + Grouping2 + Gender + Age + BMI','WithinDesign',Time)
the function doesn't return an error, but I don't know if the results have any meaning...
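How fitrm treats rows with NaN responses is best checked against its documentation. A useful first step either way is to count how many subjects have complete measurements. The sketch below does only that, using the three rows of response values visible in the table above (the names T, incomplete and nComplete are illustrative):

```matlab
% Response values from the three rows shown in the question.
var_t1 = [836; 242; 621];
var_t2 = [608; 544; 468];
var_t3 = [746; 460; NaN];
var_t4 = [NaN; 483; NaN];
T = table(var_t1, var_t2, var_t3, var_t4);

% Flag subjects with at least one missing response.
responses  = T{:, {'var_t1', 'var_t2', 'var_t3', 'var_t4'}};
incomplete = any(isnan(responses), 2);
nComplete  = sum(~incomplete);
```

If only a small share of rows is complete, a model that silently relies on complete cases would rest on far fewer subjects than the table suggests.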

combine all 3 digit numbers in a row into 1 single number in matlab

I have a matrix of 3 digit numbers, for example
102 106 100 100 100 100 100
106 102 100 100 100 100 100
106 101 120 106 109 119 108
104 115 107 106 109 119 108
I would like to combine each row into a single number, like so
102106100100100100100
106102100100100100100
106101120106109
...etc. I would really appreciate any feedback. Thank you :)
I assume the input is a numeric 2D array.
If you want the result in string form (2D char array where each row represents a number):
result = num2str(A, '%i'); % or change the format specifier if the numbers are not naturals
If you want the result in numeric form (column vector of numbers):
result = str2num(num2str(A, '%i'));
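One caveat for the numeric form: a double represents integers exactly only up to flintmax (2^53), and seven 3-digit numbers joined together give 21 digits, which is well beyond that. The sketch below, using the first two rows of the question's matrix, shows that the string form keeps every digit while the numeric form exceeds the exact range (str2double on a cellstr is used here as a swapped-in alternative to str2num):

```matlab
A = [102 106 100 100 100 100 100;
     106 102 100 100 100 100 100];

s = num2str(A, '%i');            % char array, one concatenated number per row
nDigits = size(s, 2);            % number of digits per row

x = str2double(cellstr(s));      % numeric column vector
isExact = x <= flintmax;         % false where digits may have been rounded
```

If the exact digits matter, keeping the result as text is the safer option.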

Tidying up a list

I'm fairly sure there should be an elegant solution to this (in MATLAB), but I just can't think of it right now.
I have a list with [classIndex, start, end], and I want to collapse consecutive class indices into one group like so:
This
1 1 40
2 46 53
2 55 55
2 57 64
2 67 67
3 68 91
1 94 107
Should turn into this
1 1 40
2 46 67
3 68 91
1 94 107
How do I do that?
EDIT
Never mind, I think I got it - it's almost like fmarc's solution, but gets the indices right
a=[ 1 1 40
2 46 53
2 55 55
2 57 64
2 67 67
3 68 91
1 94 107];
d = diff(a(:,1));
startIdx = logical([1;d]);
endIdx = logical([d;1]);
b = [a(startIdx,1),a(startIdx,2),a(endIdx,3)];
Here is one solution:
Ad = find([1; diff(A(:,1))]~=0);
output = A(Ad,:);
output(:,3) = A([Ad(2:end)-1; Ad(end)],3);
clear Ad
One way to do it if the column in question is numeric:
Build the differences along the id-column. Consecutive identical items will have zero here:
diffind = diff(a(:,1)');
Use that to index your array, using logical indexing.
b = a([true [diffind~=0]],:);
Since the first item is always included and the difference vector starts with the difference from first to second element, we need to prepend one true value to the list.
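Note that this indexing keeps only the first row of each run, so the third column still holds the end of that first row rather than the end of the whole group. A sketch that completes the idea by taking the end value from the last row of each run, using the data from the question (isStart and isEnd are illustrative names):

```matlab
a = [1  1  40
     2 46  53
     2 55  55
     2 57  64
     2 67  67
     3 68  91
     1 94 107];

changed = diff(a(:,1)) ~= 0;     % true where the class index changes
isStart = [true; changed];       % first row of each run
isEnd   = [changed; true];       % last row of each run

% Class and start from the first row of each run, end from the last row.
b = [a(isStart, 1:2), a(isEnd, 3)];
```

Runs are detected from consecutive rows only, so the class 1 rows at the top and bottom stay separate groups, as the question requires.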