how can i reduce the following code by using any loop - matlab

num2 = xlsread('CANCER.xls','C2:C102')
[IDX,C1] = kmeans(num2,2)
num3 = xlsread('CANCER.xls','C103:C203')
[IDX,C2] = kmeans(num3,2)
num4 = xlsread('CANCER.xls','C304:C404')
[IDX,C3] = kmeans(num4,2)
num5 = xlsread('CANCER.xls','C405:C505')
[IDX,C4] = kmeans(num5,2)
num6 = xlsread('CANCER.xls','C506:C606')
[IDX,C5] = kmeans(num6,2)
num7 = xlsread('CANCER.xls','C607:C707')
[IDX,C6] = kmeans(num7,2)
num8 = xlsread('CANCER.xls','C708:C808')
[IDX,C7] = kmeans(num8,2)
num9 = xlsread('CANCER.xls','C809:C909')
[IDX,C8] = kmeans(num9,2)
num10 = xlsread('CANCER.xls','C1000:C1099')
[IDX,C9] = kmeans(num10,2)
num11= xlsread('CANCER.xls','C1100:C1199')
[IDX,C10] = kmeans(num11,2)
num12= xlsread('CANCER.xls','C1200:C1299')
[IDX,C11] = kmeans(num12,2)
num13= xlsread('CANCER.xls','C1300:C1399')
[IDX,C12] = kmeans(num13,2)
num14= xlsread('CANCER.xls','C1400:C1499')
[IDX,C13] = kmeans(num14,2)
kmns=[C1;C2;C3;C4;C5;C6;C7;C8;C9;C10;C11;C12;C13;C14]

Try this -
%%// Start and stop row numbers
start_ind = [2 103 304 405 506 607 708 809 1000 :100: 1400];
stop_ind = [start_ind(1:8)+100 start_ind(9:end) + 99];
data = xlsread('CANCER.xls'); %%// Read data in one-go
C = zeros(2,numel(start_ind)); %%// Place holder for C values
for k1 = 1:numel(start_ind)
num = data(start_ind(k1):stop_ind(k1),3); %%// Data for the specified range
[IDX,C(:,k1)] = kmeans(num,2); %%// Do the calculations
end
kmns = reshape(C,[],1); %%// Final result

Related

how to select specific columns or rows of a table in appdesigner of matlab?

I Want to select a row or column of a table with the edit field component and do some actions on these data and show the result in the first cell of table ([1,1])
rowNames={1:100}
columnName={A:ZZ}
like this:
sum(A1:A20) or Max(AA5:AA10)
I want to write above order in the edit field component
and show the result of them in cell[A,1]
How can I do that?
Here's an implementation that might be similar to what you are attempting to achieve.
Any subset can be calculated using the command such as: Sum(A1:A2)U(B2:B3)
Where A indicates the columns apart of the set and B indicates the rows apart of the set.
Some more test functions include:
Sum(A1:A4) and
Sum(B1:B4)
%Random data%
global Data;
Data = [1 2 3 4; 1 2 3 4; 1 2 3 4; 1 2 3 4; 1 2 3 4];
%Converting data into table%
Data = array2table(Data);
%Grabbing the size of the data%
[Number_Of_Rows, Number_Of_Columns] = size(Data);
%Creating arrays for setting the row and column names%
Row_Labels = strings(1,Number_Of_Rows);
Column_Labels = strings(1,Number_Of_Columns);
for Row_Scanner = 1: +1: Number_Of_Rows
Row_Labels(1,Row_Scanner) = ["B" + num2str(Row_Scanner)];
end
for Column_Scanner = 1: +1: Number_Of_Columns
Column_Labels(1,Column_Scanner) = ["A" + num2str(Column_Scanner)];
end
Row_Labels = cellstr(Row_Labels);
Column_Labels = cellstr(Column_Labels);
%UItable%
Main_Figure = uifigure;
Table = uitable(Main_Figure,'Data',Data);
Table.ColumnName = Column_Labels;
Table.RowName = Row_Labels;
set(Table,'ColumnEditable',true(1,Number_Of_Columns))
%Callback function to update the table%
% Table.CellEditCallback = #(Table,event) Update_Table_Data(Table);
%UIeditfield%
Selection_Field = uieditfield(Main_Figure,'text');
Field_Height = 20;
Field_Width = 100;
X_Position = 350;
Y_Position = 200;
Selection_Field.Position = [X_Position Y_Position Field_Width Field_Height];
Result_Label = uilabel(Main_Figure);
Result_Label.Position = [X_Position Y_Position-100 Field_Width Field_Height];
Selection_Field.ValueChangedFcn = #(Selection_Field,event) Compute_Value(Table,Selection_Field,Result_Label);
%Computing value
function [Data] = Compute_Value(Table,Selection_Field,Result_Label)
Data = Table.Data;
User_Input_Function = string(Selection_Field.Value);
Function = extractBefore(User_Input_Function,"(");
% fprintf("Function: %s \n",Function);
Key_Pairs = extractBetween(User_Input_Function,"(",")");
% fprintf("Key Pairs: (%s)\n", Key_Pairs);
Key_1 = extractBefore(Key_Pairs(1,1),":");
Key_2 = extractAfter(Key_Pairs(1,1),":");
Key_1
Key_2
if length(Key_Pairs) == 2
Key_3 = extractBefore(Key_Pairs(2,1),":");
Key_4 = extractAfter(Key_Pairs(2,1),":");
Key_3
Key_4
end
%Exracting the letters of each key
if contains(Key_1, "A") == 1
% fprintf("Function on columns\n")
Minimum_Column = str2num(extractAfter(Key_1,"A"));
Maximum_Column = str2num(extractAfter(Key_2,"A"));
Table_Subset = Data(1,Minimum_Column:Maximum_Column);
end
if contains(Key_1, "B") == 1
% fprintf("Function on rows\n")
Minimum_Row = str2num(extractAfter(Key_1,"B"));
Maximum_Row = str2num(extractAfter(Key_2,"B"));
Table_Subset = Data(Minimum_Row:Maximum_Row,1);
end
if length(Key_Pairs) == 2
Minimum_Column = str2num(extractAfter(Key_1,"A"));
Maximum_Column = str2num(extractAfter(Key_2,"A"));
Minimum_Row = str2num(extractAfter(Key_3,"B"));
Maximum_Row = str2num(extractAfter(Key_4,"B"));
Table_Subset = Data(Minimum_Row:Maximum_Row,Minimum_Column:Maximum_Column);
end
Table_Subset = table2array(Table_Subset);
%Statements for each function%
if (Function == 'Sum' || Function == 'sum')
fprintf("Computing sum\n");
Result_Sum = sum(Table_Subset,'all');
Result_Sum
Result_Label.Text = "Result: " + num2str(Result_Sum);
end
if (Function == 'Max' || Function == 'max')
fprintf("Computing maximum\n");
Result_Max = max(Table_Subset);
Result_Max
Result_Label.Text = "Result: " + num2str(Result_Max);
end
if (Function == 'Min' || Function == 'min')
fprintf("Computing minimum\n");
Result_Min = min(Table_Subset);
Result_Min
Result_Label.Text = "Result: " + num2str(Result_Min);
end
end

How to loop through the columns of a Matlab table by using the headings?

I am trying to make a for-loop for the Matlab code below. I have named each column with JAN90, FEB90, etc. all the way up to AUG19, which can be found in a matrix named "data". At this point I need to change the month and year manually to obtain the result I want. Is there a way to iterate over the columns by the column name? Would it be easier to name the columns Var1, Var2 etc.?
clear;
clc;
data = readtable('Data.xlsx','ReadVariableNames',false);
data(1,:) = [];
data.Var2 = str2double(data.Var2);
data.Var3 = str2double(data.Var3);
data.Var4 = str2double(data.Var4);
data.Var5 = str2double(data.Var5);
data.Var6 = str2double(data.Var6);
data.Var7 = str2double(data.Var7);
data.Var8 = str2double(data.Var8);
data.Var9 = str2double(data.Var9);
data.Var10 = str2double(data.Var10);
data.Var11 = str2double(data.Var11);
data.Var12 = str2double(data.Var12);
data.Var13 = str2double(data.Var13);
data(:,1) = [];
data = table2array(data);
data = array2table(data.');
data = table2cell(data)
data = cell2table(data, 'VariableNames',{'JAN90','FEB90','MAR90','APR90','MAY90','JUN90','JUL90','AUG90'...
,'SEP90','OCT90','NOV90','DEC90','JAN91','FEB91','MAR91','APR91','MAY91','JUN91','JUL91','AUG91'...
,'SEP91','OCT91','NOV91','DEC91','JAN92','FEB92','MAR92','APR92','MAY92','JUN92','JUL92','AUG92'...
,'SEP92','OCT92','NOV92','DEC92','JAN93','FEB93','MAR93','APR93','MAY93','JUN93','JUL93','AUG93'...
,'SEP93','OCT93','NOV93','DEC93','JAN94','FEB94','MAR94','APR94','MAY94','JUN94','JUL94','AUG94'...
,'SEP94','OCT94','NOV94','DEC94','JAN95','FEB95','MAR95','APR95','MAY95','JUN95','JUL95','AUG95'...
,'SEP95','OCT95','NOV95','DEC95','JAN96','FEB96','MAR96','APR96','MAY96','JUN96','JUL96','AUG96'...
,'SEP96','OCT96','NOV96','DEC96','JAN97','FEB97','MAR97','APR97','MAY97','JUN97','JUL97','AUG97'...
,'SEP97','OCT97','NOV97','DEC97','JAN98','FEB98','MAR98','APR98','MAY98','JUN98','JUL98','AUG98'...
,'SEP98','OCT98','NOV98','DEC98','JAN99','FEB99','MAR99','APR99','MAY99','JUN99','JUL99','AUG99'...
,'SEP99','OCT99','NOV99','DEC99','JAN00','FEB00','MAR00','APR00','MAY00','JUN00','JUL00','AUG00'...
,'SEP00','OCT00','NOV00','DEC00','JAN01','FEB01','MAR01','APR01','MAY01','JUN01','JUL01','AUG01'...
,'SEP01','OCT01','NOV01','DEC01','JAN02','FEB02','MAR02','APR02','MAY02','JUN02','JUL02','AUG02'...
,'SEP02','OCT02','NOV02','DEC02','JAN03','FEB03','MAR03','APR03','MAY03','JUN03','JUL03','AUG03'...
,'SEP03','OCT03','NOV03','DEC03','JAN04','FEB04','MAR04','APR04','MAY04','JUN04','JUL04','AUG04'...
,'SEP04','OCT04','NOV04','DEC04','JAN05','FEB05','MAR05','APR05','MAY05','JUN05','JUL05','AUG05'...
,'SEP05','OCT05','NOV05','DEC05','JAN06','FEB06','MAR06','APR06','MAY06','JUN06','JUL06','AUG06'...
,'SEP06','OCT06','NOV06','DEC06','JAN07','FEB07','MAR07','APR07','MAY07','JUN07','JUL07','AUG07'...
,'SEP07','OCT07','NOV07','DEC07','JAN08','FEB08','MAR08','APR08','MAY08','JUN08','JUL08','AUG08'...
,'SEP08','OCT08','NOV08','DEC08','JAN09','FEB09','MAR09','APR09','MAY09','JUN09','JUL09','AUG09'...
,'SEP09','OCT09','NOV09','DEC09','JAN10','FEB10','MAR10','APR10','MAY10','JUN10','JUL10','AUG10'...
,'SEP10','OCT10','NOV10','DEC10','JAN11','FEB11','MAR11','APR11','MAY11','JUN11','JUL11','AUG11'...
,'SEP11','OCT11','NOV11','DEC11','JAN12','FEB12','MAR12','APR12','MAY12','JUN12','JUL12','AUG12'...
,'SEP12','OCT12','NOV12','DEC12','JAN13','FEB13','MAR13','APR13','MAY13','JUN13','JUL13','AUG13'...
,'SEP13','OCT13','NOV13','DEC13','JAN14','FEB14','MAR14','APR14','MAY14','JUN14','JUL14','AUG14'...
,'SEP14','OCT14','NOV14','DEC14','JAN15','FEB15','MAR15','APR15','MAY15','JUN15','JUL15','AUG15'...
,'SEP15','OCT15','NOV15','DEC15','JAN16','FEB16','MAR16','APR16','MAY16','JUN16','JUL16','AUG16'...
,'SEP16','OCT16','NOV16','DEC16','JAN17','FEB17','MAR17','APR17','MAY17','JUN17','JUL17','AUG17'...
,'SEP17','OCT17','NOV17','DEC17','JAN18','FEB18','MAR18','APR18','MAY18','JUN18','JUL18','AUG18'...
,'SEP18','OCT18','NOV18','DEC18','JAN19','FEB19','MAR19','APR19','MAY19','JUN19','JUL19','AUG19'});
m = [1 2 3 6 12 24 36 60 84 120 240 360]';
for i=1:100
t = i;
data.X_1 = (1-exp(-m./t))./(m./t);
data.X_2 = ((1-exp(-m./t))./(m./t))-exp(-m./t);
model_1 = fitlm(data, 'FEB95 ~ X_1 + X_2');
RSS(100,:) = zeros ;
res = model_1.Residuals.Raw;
res(any(isnan(res), 2), :) = [];
RSS(i) = sum(res.^2);
end
RSS(:,2) = [1:1:100];
min = min(RSS(:,1));
t = find(RSS(:,1) == min)
data.X_1 = (1-exp(-m./t))./(m./t);
data.X_2 = ((1-exp(-m./t))./(m./t))-exp(-m./t);
model_1 = fitlm(data, 'FEB95 ~ X_1 + X_2')
res = model_1.Residuals.Raw;
res(any(isnan(res), 2), :) = [];
RSS = sum(res.^2)
intercept = model_1.Coefficients.Estimate(1,1);
beta_1 = model_1.Coefficients.Estimate(2,1);
beta_2 = model_1.Coefficients.Estimate(3,1);
Yhat = intercept + beta_1.*data.X_1 + beta_2.*data.X_2;
plot(m, Yhat)
hold on
scatter(m, data.FEB95)
I.e "FEB95" should be dynamic? Any suggestions?
Her is how I would approach your problem. First realize that VarNames=data.Properties.VariableNames will get a list of all column names in the table. You could then loop over this list. For example
for v=VarNames
current_column = v{1};
% ....
% Define the model spec for the current column
model_spec = [current_column ' ~ X_1 + X_2'];
% and create the model
model_1 = fitlm(data, model_spec);
% ... Continue computation ... and collect results in a table or array
end

Append values from textscan into cell array literately in a loop

Currently I have a txt file with data as shown below:
A11
Temperature=20 Weight=120 Speed=65
B13
Temperature=21 Weight=121 Speed=63
F24
Temperature=18 Weight=117 Speed=78
D43
Temperature=16 Weight=151 Speed=42
C32
Temperature=15 Weight=101 Speed=51
I would like to read the value into a cell array and convert it as matrix.
Below is my code:
% At first I read the data into a 1 column array
fid=fopen('file.txt');
tline = fgetl(fid);
tlines = cell(0,1);
while ischar(tline)
tlines{end+1,1} = tline;
tline = fgetl(fid);
end
fclose(fid);
% Then I check the size of the cell array
CellSize = size(tlines);
DataSize = CellSize(1,1);
% At last I setup a loop and literately read and input the values
Data = cell(0,3);
for i = 1:DataSize
Data{end+1,3} = textscan(tlines{i,1},'Temperature=%f Weight=%f Speed=%f');
end
However, I got 10x3 empty cell array.
[] [] *1x3cell*
[] [] *1x3cell*
[] [] *1x3cell*
[] [] *1x3cell*
[] [] *1x3cell*
[] [] *1x3cell*
[] [] *1x3cell*
[] [] *1x3cell*
[] [] *1x3cell*
[] [] *1x3cell*
I know the problem comes from the input of textscan value into the cell array. Can you help me fix the problem? Also how can I toss the empty value if the data doesn't contain the specific format.
The only mistake you made, was to index the content of Data using {}, and not the cells using (), see the help.
I modified the last part of your script:
% At last I setup a loop and literately read and input the values
Data = cell(DataSize, 3);
for i = 1:DataSize
Data(i, :) = textscan(tlines{i}, 'Temperature=%f Weight=%f Speed=%f')
end
Gives the following output:
Data =
{
[1,1] = [](0x1)
[2,1] = 20
[3,1] = [](0x1)
[4,1] = 21
[5,1] = [](0x1)
[6,1] = 18
[7,1] = [](0x1)
[8,1] = 16
[9,1] = [](0x1)
[10,1] = 15
[1,2] = [](0x1)
[2,2] = 120
[3,2] = [](0x1)
[4,2] = 121
[5,2] = [](0x1)
[6,2] = 117
[7,2] = [](0x1)
[8,2] = 151
[9,2] = [](0x1)
[10,2] = 101
[1,3] = [](0x1)
[2,3] = 65
[3,3] = [](0x1)
[4,3] = 63
[5,3] = [](0x1)
[6,3] = 78
[7,3] = [](0x1)
[8,3] = 42
[9,3] = [](0x1)
[10,3] = 51
}
Afterwards you could do:
% Clean up.
for i = DataSize:-1:1
if (isempty([Data{i, :}]))
Data(i, :) = [];
end
end
So that your output looks like this:
Data =
{
[1,1] = 20
[2,1] = 21
[3,1] = 18
[4,1] = 16
[5,1] = 15
[1,2] = 120
[2,2] = 121
[3,2] = 117
[4,2] = 151
[5,2] = 101
[1,3] = 65
[2,3] = 63
[3,3] = 78
[4,3] = 42
[5,3] = 51
}
Also, please don't do things like Data{end+1,3}, if the size is known in advance. I also modified this accordingly.
If you want to keep the header/name, this is not terribly elegant, but it will work:
fin = 'yourfilename.txt';
fid = fopen(fin);
stuff = textscan(fid, '%s');
fclose(fid);
stuff = stuff{:};
stuff = strrep(stuff, 'Temperature=', '');
stuff = strrep(stuff, 'Weight=', '');
stuff = strrep(stuff, 'Speed=', '');
len = length(stuff) / 4;
name = cell(1,len);
temp = NaN(1,len);
wt = NaN(1,len);
speed = NaN(1,len);
counter = 0;
for ii = 1:len
name(ii) = stuff(ii + counter);
temp(ii) = str2double(stuff(ii + counter +1));
wt(ii) = str2double(stuff(ii + counter +2));
speed(ii) = str2double(stuff(ii + counter +3));
counter = counter + 3;
end
name = cell2table(name', 'VariableNames', {'Name'});
temp = array2table(temp', 'VariableNames', {'Temperture'});
wt = array2table(wt', 'VariableNames', {'Weight'});
speed = array2table(speed', 'VariableNames', {'Speed'});
data = [name temp wt speed];

Why do I get such a bad loss in my implementation of k-Nearest Neighbor?

I'm trying to implement k-NN in matlab. I have a matrix of 214 x's that have 9 columns of attributes with the 10th column being the label. I want to measure loss with a 0-1 function on 10 cross-validation tests. I have the following code:
function q3(file)
data = knnfile(file);
loss(data(:,1:9),'KFold',data(:,10))
losses = zeros(25,3);
new_data = data;
new_data(:,10) = [];
sdd = std(new_data);
meand = mean(new_data);
for s = 1:214
for q = 1:9
new_data(s,q) = (new_data(s,q) - meand(q)) / sdd(q);
end
end
new_data = [new_data data(:,10)];
for k = 1:25
loss1 = 0;
loss2 = 0;
for j = 0:9
index = floor(214/10)*j+1;
curd1 = data([1:index-1,index+21:end],:);
curd2 = new_data([1:index-1,index+21:end],:);
for l = 0:20
c1 = knn(curd1,k,data(index+l,:));
c2 = knn(curd2,k,new_data(index+l,:));
loss1 = loss1 + (c1 ~= data(index+l,10));
loss2 = loss2 + (c2 ~= new_data(index+l,10));
end
end
losses(k,1) = k;
losses(k,2) = 100*loss1/210;
losses(k,3) = 100*loss2/210;
end
function cluster = knn(Data,k,x)
distances = zeros(193,2);
for i = 1:size(Data,1)
row = Data(i,:);
d = norm(row(1:size(row,2)-1) - x(1:size(x,2)-1));
distances(i,:) = [d row(10)];
end
distances = sortrows(distances,1);
cluster = mode(distances(1:k,2));
I'm getting 40%+ loss with almost no correlation to k and I'm sure that something here is wrong but I'm not quite sure.
Any help would be appreciated!

compare more than 2 proportions matlab

Having 4 groups (A,B,C,D)
each of them containing a different number of male and female
male_A = 46
male_B = 241
male_C = 202
male_D = 113
female_A = 43
female_B = 134
female_C = 100
female_D = 53
How can I identify the groups that have a statistically different proportion of male and female? Suggestion using MATLAB would be appreciated...
POSSIBLE SOLUTION (PLEASE CHECK)
% 1st row: male
% 2nd row: female
cont = [46 241 202 113;
43 134 100 53]
mychi(cont)
%this function should calculate the Chi2
function mychi(cont)
cont = [cont, sum(cont,2)];
cont = [cont; sum(cont,1)];
counter = 1;
for i = 1 : size(cont,1)-1
for j = 1 : size(cont,2)-1
Observed(counter) = cont(i,j);
Expected(counter) = cont(i,end)*cont(end,j)/cont(end:end);
O_E_2(counter) = (abs(Observed(counter)-Expected(counter)).^2)/Expected(counter);
counter = counter + 1;
end
end
DOF = (size(cont,1)-2)*(size(cont,2)-2)
CHI = sum(O_E_2)
end
The CHI returned should be compared with the one for p<0.05 that can be found here
In my case
DOF =
3
CHI =
8.0746
CHI is > 0.352 so the groups have a biased number of male and female...
Not sure what comparison you are looking for, but the ratios can be obtained by
p = 0.05;
ratio_A = male_A ./ (male_A + female_A);
ratio_B = male_B ./ (male_B + female_B);
ratio_C = male_C ./ (male_C + female_C);
ratio_D = male_D ./ (male_D + female_D);
%Once you have ratios, you can perform analysis as mentioned on
%http://au.mathworks.com/help/stats/hypothesis-testing.html
Hope this helps
I suggest to arrange your data in a matrix and use the proper indexing according to your pourposes. Here you have an example:
male_A = 46;
male_B = 241;
male_C = 202;
male_D = 113;
female_A = 43;
female_B = 134;
female_C = 100;
female_D = 53;
matrix = [male_A female_A;
male_B female_B;
male_C female_C;
male_D female_D];
groups = ['A', 'B', 'C', 'D'];
total = (matrix(:,1)+matrix(:,2));
male_percentage = matrix(:,1)./total*100
female_percentage = matrix(:,2)./total*100
threshold = 65; %// Example threshold 65%
male_above_threshold = groups(male_percentage>threshold)
female_above_threshold = groups(female_percentage>threshold)
maximum_male_ratio = groups(male_percentage==max(male_percentage))
maximum_female_ratio = groups(female_percentage==max(female_percentage))
In your example you would get:
male_percentage =
51.6854
64.2667
66.8874
68.0723
female_percentage =
48.3146
35.7333
33.1126
31.9277
male_above_threshold =
CD
female_above_threshold =
Empty string: 1-by-0
maximum_male_ratio =
D
maximum_female_ratio =
A
Finding out the groups that are statistically different is another problem. You should provide more information in order to do that.