Ranking of a cell array - matlab

Take the following example:
clear all
Name1 = {'Data1','Data2','Data3','Data4'};
Data = {6.2,6,3.2,8};
CombnsName = nchoosek(Name1,2);
CombnsData = nchoosek(Data,2);
for i = 1:length(CombnsData);
multiplied{i} = CombnsData{i,1}.*CombnsData{i,2};
end
multiplied = multiplied';
Final = [CombnsName, multiplied];
Rankd = sort(cell2mat(multiplied));
Here, Final holds the product of every possible pair of the entries named in 'Name1'. I'm trying to reorder the rows of 'Final' so that they follow the ranking defined by 'Rankd'. For example, the first row of Final should read 'Data2' 'Data3' 19.2, and the last row should read 'Data1' 'Data4' 49.6.
Is there a method for doing this?

There are a couple of options. Firstly, you could use the second output of sort, which gives you the indexes corresponding to the entries in the sorted array:
>> [Rankd Index] = sort(cell2mat(multiplied));
and then do
>> Final(Index,:)
ans =
'Data2' 'Data3' [19.200000000000003]
'Data1' 'Data3' [19.840000000000003]
'Data3' 'Data4' [25.600000000000001]
'Data1' 'Data2' [37.200000000000003]
'Data2' 'Data4' [ 48]
'Data1' 'Data4' [49.600000000000001]
However, an even easier method is to use the function sortrows which was designed for exactly this situation:
>> sortrows(Final,3)
ans =
'Data2' 'Data3' [19.200000000000003]
'Data1' 'Data3' [19.840000000000003]
'Data3' 'Data4' [25.600000000000001]
'Data1' 'Data2' [37.200000000000003]
'Data2' 'Data4' [ 48]
'Data1' 'Data4' [49.600000000000001]
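The index-based approach can also be written without the loop from the question. The following is only a sketch: it assumes the values are kept in a numeric vector rather than a cell array, and the variable names pairs, products and idx are made up for illustration.

```matlab
names  = {'Data1','Data2','Data3','Data4'};
values = [6.2, 6, 3.2, 8];                 % numeric vector instead of a cell array

pairs    = nchoosek(1:numel(values), 2);   % each row is one pair of indices
products = values(pairs(:,1)) .* values(pairs(:,2));

% The second output of sort gives the row order that ranks the products
[ranked, idx] = sort(products(:));

% Reorder the name pairs with the same index and attach the sorted values
Final = [names(pairs(idx,1))', names(pairs(idx,2))', num2cell(ranked)];
```

Because the same idx reorders both name columns, names and values stay aligned no matter how the products are ranked.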

Related

Find a text and replace it with a value in Matlab

I have some data which look like this:
I would like to pre-process the data so that every 'Mostly false' is replaced with 1, 'Mostly true' with 2 and 'Definitely true' with 3. Is there a find-and-replace command, or what is the best way of doing this?
You can use a map object to do the mapping
m = containers.Map( {'Mostly false', 'Mostly true', 'Definitely true'}, ...
{ 1, 2, 3} );
Then for some example data
data = {'Mostly false', 'Mostly false', 'Mostly true', 'Mostly false', 'Definitely true'};
You can perform the conversion with
data = m.values( data );
% >> data = {1, 1, 2, 1, 3}
This assumes there will always be a match in your map.
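If some entries might not be in the map, one possible guard is to check the keys first with isKey and convert only the entries that are found. This is a sketch; the 'Unknown' entry is invented just to show a non-match.

```matlab
m = containers.Map( {'Mostly false', 'Mostly true', 'Definitely true'}, ...
                    { 1, 2, 3} );
data = {'Mostly false', 'Unknown', 'Definitely true'};  % 'Unknown' is a made-up non-match

known = isKey(m, data);                 % logical mask of entries that have a key
data(known) = values(m, data(known));   % convert only the known entries
```

Entries without a key are left as they were, so they can be inspected or handled separately afterwards.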
Alternatively, you could do the operation manually (for the same example data). This leaves non-matches unaltered, and you could swap strcmp for strcmpi to make the comparison case-insensitive:
c = {'Mostly false', 'Mostly true', 'Definitely true'
1, 2, 3};
for ii = 1:size(c,2)
% Make the replacement for each column in 'c'
data( strcmp( data, c{1,ii} ) ) = c(2,ii);
end
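If a plain numeric vector is preferred over a cell array, ismember is a different technique that could be used here: its second output is the position of each entry in the list of labels, which happens to be the wanted code when the labels are listed in order. This is a sketch with made-up example data, and marking non-matches as NaN is an assumption.

```matlab
labels = {'Mostly false', 'Mostly true', 'Definitely true'};
data   = {'Mostly false', 'Mostly true', 'Unknown', 'Definitely true'};

% codes(k) is the position of data{k} in labels, or 0 when there is no match
[found, codes] = ismember(data, labels);
codes(~found) = NaN;    % mark non-matches explicitly instead of leaving 0
```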

Extract numbers from string in MATLAB

I'm working with sscanf to extract a number from a string. The strings are usually in the form of:
'44 ppm'
'10 gallons'
'23.4 inches'
but occasionally they are in the form of:
'<1 ppm'
If I use the following code:
x = sscanf('1 ppm','%f')
I get an output of
1
But if I add the less than sign in front of the one:
x = sscanf('<1 ppm','%f')
I get:
[]
How can I write this code so this actually produces a number? I'm not sure yet what number it should print...but let's just say it should print 1 for the moment.
You can use regexp:
s= '<1 ppm';
x=regexp(s, '.*?(\d+(\.\d+)*)', 'tokens' )
x{1}
Demo :
>> s= {'44 ppm', '10 gallons', '23.4 inches', '<1 ppm' } ;
>> x = regexp(s, '.*?(\d+(\.\d+)*)', 'tokens' );
>> cellfun( @(x) disp(x{1}), x ) % Demo for all
'44'
'10'
'23.4'
'1'
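The tokens above are still text. To end up with actual numbers, one option is to ask regexp for the first match only and pass the result to str2double. This is a sketch that uses a slightly simpler pattern than the one above and assumes the first number in each string is the one wanted.

```matlab
s = {'44 ppm', '10 gallons', '23.4 inches', '<1 ppm'};

% 'match' with 'once' returns the first numeric substring of each string
tok = regexp(s, '\d+(\.\d+)?', 'match', 'once');

% str2double converts a cell array of strings to a numeric array
x = str2double(tok);
```

Strings that contain no digits at all would come out as NaN, because str2double returns NaN for an empty string.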

extracting data from excel to matlab

Suppose I have an Excel file (data.xlsx) which contains the following data.
Name age
Tom 43
Dick 24
Harry 32
Now I want to extract the data from it and make two cell arrays (or matrices) which contain
name = ['Tom' ; 'Dick';'Harry'] age = [43;24;32]
I have used xlsread('data.xlsx'), but it only extracts the numerical values, and I want to obtain both as mentioned above. Please help me out.
You have to use the additional output arguments of xlsread in order to get the text.
I created a dummy Excel file with your data and here is the output (never mind the NaNs):
[ndata, text, alldata] = xlsread('DummyExcel.xlsx')
ndata =
43
24
32
text =
'Name' 'Age'
'Tom' ''
'Dick' ''
'Harry' ''
alldata =
[NaN] 'Name' 'Age'
[NaN] 'Tom' [ 43]
[NaN] 'Dick' [ 24]
[NaN] 'Harry' [ 32]
Now if you use this:
text{2:end,1}
you get
ans =
Tom
ans =
Dick
ans =
Harry
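To get the two arrays the question asks for, the third output can be sliced with parentheses, which keeps the result as one cell array instead of a list of separate answers. The sketch below types in the alldata cell array from the output above so that it runs without the Excel file; the column positions are an assumption that depends on how the sheet is laid out.

```matlab
% Stand-in for the third output of xlsread, copied from the output shown above
alldata = {NaN, 'Name',  'Age';
           NaN, 'Tom',   43;
           NaN, 'Dick',  24;
           NaN, 'Harry', 32};

name = alldata(2:end, 2);            % parentheses keep a 3x1 cell array of names
age  = cell2mat(alldata(2:end, 3));  % numeric column vector of ages
```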
You can use the function called importdata.
Example:
%Import Data
filename = 'yourfilename.xlsx';
delimiterIn = ' ';
headerlinesIn = 1;
A = importdata(filename,delimiterIn,headerlinesIn);
This will help to take both the text data and numerical data. Textdata will be under A.textdata and numerical data will be under A.data.

Extract Cumulative N-grams Matlab

I have an array of words:
x=['ae' ; 'be' ; 'ce' ; 'de' ; 'ee' ; 'fe']
I would like to extract sets of characters. So assume each set has N = 2 words, how can I go about getting return values that look like this
'ae' 'be'
'be' 'ce'
'ce' 'de'
'de' 'ee'
'ee' 'fe'
So if N = 2, I get back a matrix where each row pairs one word with the word that follows it. If N = 3, each row holds three consecutive words. I want to avoid loops if possible.
Any ideas?
You can use the circulant matrix that MATLAB provides, truncate it as needed and use it as an index matrix:
x = {'ae' ; 'be' ; 'ce' ; 'de' ; 'ee' ; 'fe'}
N = 3;
n = numel(x);
A = gallery('circul',n:-1:1)
B = fliplr( A(1:n-N+1,n-N+1:end) )
result = x(B)
or a little shorter:
A = fliplr( gallery('circul',n:-1:1) )
result = x( A(1:n-N+1,1:N) )
or another option using the Hankel matrix:
A = hankel(1:n,1:N)
result = x( A(1:n-N+1,:) )
gives:
result =
'ae' 'be' 'ce'
'be' 'ce' 'de'
'ce' 'de' 'ee'
'de' 'ee' 'fe'
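The same index matrix can also be built directly by adding column offsets to the row start positions. This is a different technique from the gallery and hankel calls above, sketched here for N = 2 so that it reproduces the pairs asked for in the question; bsxfun is used on the assumption that it should run on older releases too.

```matlab
x = {'ae' ; 'be' ; 'ce' ; 'de' ; 'ee' ; 'fe'};
N = 2;
n = numel(x);

% Row k of idx is [k, k+1, ..., k+N-1]
idx = bsxfun(@plus, (1:n-N+1)', 0:N-1);

% Indexing with a matrix returns a cell array of the same shape as idx
result = x(idx);
```

Changing N to 3 gives the four-row result shown above.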

matlab: grouping variables for observations that can be in multiple groups

I would like to use MATLAB group statistics functions (like grpstats) on data where each observation can be in multiple groups. For example, pizzas can have {'pepperoni', 'mushroom','onions'} or {'pepperoni'} or whatever and then I want group stats by topping: all of the pizzas with 'pepperoni', all of them with 'mushroom', etc.
Alternatively if you know a way to do this by hand without iterating like an idiot that would also be helpful.
Just put repeated measures in different rows. For example:
store = repmat(cellstr(num2str((1:3)')), 3, 1);
type = repmat({'pepperoni', 'mushrooms', 'onions'}, 3, 1);
type = type(:);
score = dataset({randn(9,3), 'taste', 'looks', 'price'});
data = [dataset(store, type) score];
grpstats(data(:,2:end), 'type')
Raw data:
>> data
data =
store type taste looks price
'1' 'pepperoni' -0.19224 -0.44463 -0.50782
'2' 'pepperoni' -0.27407 -0.15594 -0.32058
'3' 'pepperoni' 1.5301 0.27607 0.012469
'1' 'mushrooms' -0.24902 -0.26116 -3.0292
'2' 'mushrooms' -1.0642 0.44342 -0.45701
'3' 'mushrooms' 1.6035 0.39189 1.2424
'1' 'onions' 1.2347 -1.2507 -1.0667
'2' 'onions' -0.22963 -0.94796 0.93373
'3' 'onions' -1.5062 -0.74111 0.35032
Group stats:
>> grpstats(data(:,2:end), 'type')
ans =
type GroupCount mean_taste mean_looks mean_price
pepperoni 'pepperoni' 3 0.35459 -0.10817 -0.27197
mushrooms 'mushrooms' 3 0.09674 0.19138 -0.74791
onions 'onions' 3 -0.16704 -0.97992 0.072449
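When the data starts out as one list of toppings per pizza, it first has to be expanded into this one-row-per-pair layout. The following is a sketch of that step using only base MATLAB functions; the pizzas and taste scores are invented for illustration, repelem needs R2015a or later, and accumarray stands in for grpstats.

```matlab
% Hypothetical input: one list of toppings and one taste score per pizza
toppings = {{'pepperoni','mushroom','onions'}, {'pepperoni'}, {'mushroom','onions'}};
taste    = [7; 5; 8];

% Repeat each pizza index once per topping it carries
counts   = cellfun(@numel, toppings);
pizzaIdx = repelem((1:numel(toppings))', counts(:));

longTopping = [toppings{:}]';      % one row per (pizza, topping) pair
longTaste   = taste(pizzaIdx);     % score repeated alongside each topping

% Group by topping and average; groups is sorted alphabetically
[groups, ~, g] = unique(longTopping);
meanTaste = accumarray(g, longTaste, [], @mean);
```

A pizza with three toppings contributes to three groups, which is the behaviour the question asks for. The long-format columns could equally be passed to dataset and grpstats as in the example above.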