Implement huffmandict() function in matlab using arrays - matlab

I would like to implement the huffmandict() function in Matlab. I have already written a code in which I create an array with all the probabilities. Each time I add the 2 last probabilities , I update my array by adding the new sum probability at the next row in the right place. I also have an array with the sums only. The problem is I don't know how to continue to assign '0' and '1'. Any idea?
This is my code:
function code_words = my_huffmandict_func(init_symbols,probs)
my_symbol_array = [];
my_symbol_array = init_symbols;
my_probs = [];
my_probs = probs;
if length(my_symbol_array)~=length(my_probs)
error('Number of symbols and number of probabilities are not the same.');
end
for i=1:length(my_probs) %sorting the probabilities in descending order and
change the sequence of the symbols
for j=1:length(my_probs)
if (my_probs(i)> my_probs(j))
temp1=my_probs(i);
temp2=my_symbol_array(i);
my_probs(i)= my_probs(j);
my_symbol_array(i)= my_symbol_array(j);
my_probs(j)= temp1;
my_symbol_array(j)= temp2;
end
end
end
my_sum_array = [];
k=1;
init_lengthpr = length(my_probs);
all_occured_probs = [];
all_occured_probs(1,:) = my_probs;
while length(my_probs)>2 %we need this while loop as long as there are more
than 2 symbols left
my_temp_sum = my_probs(length(my_probs)) + my_probs(length(my_probs-1)); %we add the the possibilities of the two less possible outputs
my_sum_array = [my_sum_array,my_temp_sum]; %in this array we keep all the sums that occured
my_probs = [my_probs(1:length(my_probs)-2), my_temp_sum];%we update the possibilities' array
my_probs = sort(my_probs,'descend'); %we sort the array again
k=k+1;
all_occured_probs(k,:) = [my_probs,zeros(1,init_lengthpr-length(my_probs))];
end
end

Related

Not sure what to do about error message "Conversion to double from cell is not possible."

I'm writing a program that finds the indices of a matrix G where there is only a single 1 for either a column index or a row index and removes any found index if it has a 1 for both the column and row index. Then I want to take these indices and use them as indices in an array U, which is where the trouble comes. The indices do not seem to be stored as integers and I'm not sure what they are being stored as or why. I'm quite new to Matlab (but thats probably obvious) and so I don't really understand how types work for Matlab or how they're assigned. So I'm not sure why I',m getting the error message mentioned in the title and I'm not sure what to do about it. Any assistance you can provide would be greatly appreciated.
I forgot to mention this before but G is a matrix that only contains 1s or 0s and U is an array of strings (i think what would be called a cell?)
function A = ISClinks(U, G)
B = [];
[rownum,colnum] = size(G);
j = 1;
for i=1:colnum
s = sum(G(:,i));
if s == 1
B(j,:) = i;
j = j + 1;
end
end
for i=1:rownum
s = sum(G(i,:));
if s == 1
if ismember(i, B)
B(B == i) = [];
else
B(j,:) = i;
j = j+1;
end
end
end
A = [];
for i=1:size(B,1)
s = B(i,:);
A(i,:) = U(s,:);
end
end
This is the problem code, but I'm not sure what's wrong with it.
A = [];
for i=1:size(B,1)
s = B(i,:);
A(i,:) = U(s,:);
end
Your program seems to be structured as though it had been written in a language like C. In MATLAB, you can usually substitute specialized functions (e.g. any() ) for low-level loops in many cases. Your function could be written more efficiently as:
function A = ISClinks(U, G)
% Find columns and rows that are set in the input
active_columns=any(G,1);
active_rows=any(G,2).';
% (Optional) Prevent columns and rows with same index from being simultaneously set
%exclusive_active_columns = active_columns & ~active_rows; %not needed; this line is only for illustrative purposes
%exclusive_active_rows = active_rows & ~active_columns; %same as above
% Merge column state vector and row state vector by XORing them
active_indices=xor(active_columns,active_rows);
% Select appropriate rows of matrix U
A=U(active_indices,:);
end
This function does not cause errors with the example input matrices I tested. If U is a cell array (e.g. U={'Lorem','ipsum'; 'dolor','sit'; 'amet','consectetur'}), then return value A will also be a cell array.

Function call with variable number of input arguments when number of input arguments is not explicitly known

I have a variable pth which is a cell array of dimension 1xn where n is a user input. Each of the elements in pth is itself a cell array and length(pth{k}) for k=1:n is variable (result of another function). Each element pth{k}{kk} where k=1:n and kk=1:length(pth{k}) is a 1D vector of integers/node numbers of again variable length. So to summarise, I have a variable number of variable-length vectors organised in a avriable number of cell arrays.
I would like to try and find all possible intersections when you take a vector at random from pth{1}, pth{2}, {pth{3}, etc... There are various functions on the File Exchange that seem to do that, for example this one or this one. The problem I have is you need to call the function this way:
mintersect(v1,v2,v3,...)
and I can't write all the inputs in the general case because I don't know explicitly how many there are (this would be n above). Ideally, I would like to do some thing like this;
mintersect(pth{1}{1},pth{2}{1},pth{3}{1},...,pth{n}{1})
mintersect(pth{1}{1},pth{2}{2},pth{3}{1},...,pth{n}{1})
mintersect(pth{1}{1},pth{2}{3},pth{3}{1},...,pth{n}{1})
etc...
mintersect(pth{1}{1},pth{2}{length(pth{2})},pth{3}{1},...,pth{n}{1})
mintersect(pth{1}{1},pth{2}{1},pth{3}{2},...,pth{n}{1})
etc...
keep going through all the possible combinations, but I can't write this in code. This function from the File Exchange looks like a good way to find all possible combinations but again I have the same problem with the function call with the variable number of inputs:
allcomb(1:length(pth{1}),1:length(pth{2}),...,1:length(pth{n}))
Does anybody know how to work around this issue of function calls with variable number of input arguments when you can't physically specify all the input arguments because their number is variable? This applies equally to MATLAB and Octave, hence the two tags. Any other suggestion on how to find all possible combinations/intersections when taking a vector at random from each pth{k} welcome!
EDIT 27/05/20
Thanks to Mad Physicist's answer, I have ended up using the following which works:
disp('Computing intersections for all possible paths...')
grids = cellfun(#(x) 1:numel(x), pth, 'UniformOutput', false);
idx = cell(1, numel(pth));
[idx{:}] = ndgrid(grids{:});
idx = cellfun(#(x) x(:), idx, 'UniformOutput', false);
idx = cat(2, idx{:});
valid_comb = [];
k = 1;
for ii = idx'
indices = reshape(num2cell(ii), size(pth));
selection = cellfun(#(p,k) p{k}, pth, indices, 'UniformOutput', false);
if my_intersect(selection{:})
valid_comb = [valid_comb k];
endif
k = k+1;
end
My own version is similar but uses a for loop instead of the comma-separated list:
disp('Computing intersections for all possible paths...')
grids = cellfun(#(x) 1:numel(x), pth, 'UniformOutput', false);
idx = cell(1, numel(pth));
[idx{:}] = ndgrid(grids{:});
idx = cellfun(#(x) x(:), idx, 'UniformOutput', false);
idx = cat(2, idx{:});
[n_comb,~] = size(idx);
temp = cell(n_pipes,1);
valid_comb = [];
k = 1;
for k = 1:n_comb
for kk = 1:n_pipes
temp{kk} = pth{kk}{idx(k,kk)};
end
if my_intersect(temp{:})
valid_comb = [valid_comb k];
end
end
In both cases, valid_comb has the indices of the valid combinations, which I can then retrieve using something like:
valid_idx = idx(valid_comb(1),:);
for k = 1:n_pipes
pth{k}{valid_idx(k)} % do something with this
end
When I benchmarked the two approaches with some sample data (pth being 4x1 and the 4 elements of pth being 2x1, 9x1, 8x1 and 69x1), I got the following results:
>> benchmark
Elapsed time is 51.9075 seconds.
valid_comb = 7112
Elapsed time is 66.6693 seconds.
valid_comb = 7112
So Mad Physicist's approach was about 15s faster.
I also misunderstood what mintersect did, which isn't what I wanted. I wanted to find a combination where no element present in two or more vectors, so I ended writing my version of mintersect:
function valid_comb = my_intersect(varargin)
% Returns true if a valid combination i.e. no combination of any 2 vectors
% have any elements in common
comb_idx = combnk(1:nargin,2);
[nr,nc] = size(comb_idx);
valid_comb = true;
k = 1;
% Use a while loop so that as soon as an intersection is found, the execution stops
while valid_comb && (k<=nr)
temp = intersect(varargin{comb_idx(k,1)},varargin{comb_idx(k,2)});
valid_comb = isempty(temp) && valid_comb;
k = k+1;
end
end
Couple of helpful points to construct a solution:
This post shows you how to construct a Cartesian product between arbitrary arrays using ndgrid.
cellfun accepts multiple cell arrays simultaneously, which you can use to index specific elements.
You can capture a variable number of arguments from a function using cell arrays, as shown here.
So let's get the inputs to ndgrid from your outermost array:
grids = cellfun(#(x) 1:numel(x), pth, 'UniformOutput', false);
Now you can create an index that contains the product of the grids:
index = cell(1, numel(pth));
[index{:}] = ndgrid(grids{:});
You want to make all the grids into column vectors and concatenate them sideways. The rows of that matrix will represent the Cartesian indices to select the elements of pth at each iteration:
index = cellfun(#(x) x(:), index, 'UniformOutput', false);
index = cat(2, index{:});
If you turn a row of index into a cell array, you can run it in lockstep over pth to select the correct elements and call mintersect on the result.
for i = index'
indices = num2cell(i');
selection = cellfun(#(p, i) p{i}, pth, indices, 'UniformOutput', false);
mintersect(selection{:});
end
This is written under the assumption that pth is a row array. If that is not the case, you can change the first line of the loop to indices = reshape(num2cell(i), size(pth)); for the general case, and simply indices = num2cell(i); for the column case. The key is that the cell from of indices must be the same shape as pth to iterate over it in lockstep. It is already generated to have the same number of elements.
I believe this does the trick. Calls mintersect on all possible combinations of vectors in pth{k}{kk} for k=1:n and kk=1:length(pth{k}).
Using eval and messing around with sprintf/compose a bit. Note that typically the use of eval is very much discouraged. Can add more comments if this is what you need.
% generate some data
n = 5;
pth = cell(1,n);
for k = 1:n
pth{k} = cell(1,randi([1 10]));
for kk = 1:numel(pth{k})
pth{k}{kk} = randi([1 100], randi([1 10]), 1);
end
end
% get all combs
str_to_eval = compose('1:length(pth{%i})', 1:numel(pth));
str_to_eval = strjoin(str_to_eval,',');
str_to_eval = sprintf('allcomb(%s)',str_to_eval);
% use eval to get all combinations for a given pth
all_combs = eval(str_to_eval);
% and make strings to eval in intersect
comp = num2cell(1:numel(pth));
comp = [comp ;repmat({'%i'}, 1, numel(pth))];
str_pattern = sprintf('pth{%i}{%s},', comp{:});
str_pattern = str_pattern(1:end-1); % get rid of last ,
strings_to_eval = cell(length(all_combs),1);
for k = 1:size(all_combs,1)
strings_to_eval{k} = sprintf(str_pattern, all_combs(k,:));
end
% and run eval on all those strings
result = cell(length(all_combs),1);
for k = 1:size(all_combs,1)
result{k} = eval(['mintersect(' strings_to_eval{k} ')']);
%fprintf(['mintersect(' strings_to_eval{k} ')\n']); % for debugging
end
For a randomly generated pth, the code produces the following strings to evaluate (where some pth{k} have only one cell for illustration):
mintersect(pth{1}{1},pth{2}{1},pth{3}{1},pth{4}{1},pth{5}{1})
mintersect(pth{1}{1},pth{2}{1},pth{3}{1},pth{4}{2},pth{5}{1})
mintersect(pth{1}{1},pth{2}{1},pth{3}{1},pth{4}{3},pth{5}{1})
mintersect(pth{1}{1},pth{2}{1},pth{3}{2},pth{4}{1},pth{5}{1})
mintersect(pth{1}{1},pth{2}{1},pth{3}{2},pth{4}{2},pth{5}{1})
mintersect(pth{1}{1},pth{2}{1},pth{3}{2},pth{4}{3},pth{5}{1})
mintersect(pth{1}{2},pth{2}{1},pth{3}{1},pth{4}{1},pth{5}{1})
mintersect(pth{1}{2},pth{2}{1},pth{3}{1},pth{4}{2},pth{5}{1})
mintersect(pth{1}{2},pth{2}{1},pth{3}{1},pth{4}{3},pth{5}{1})
mintersect(pth{1}{2},pth{2}{1},pth{3}{2},pth{4}{1},pth{5}{1})
mintersect(pth{1}{2},pth{2}{1},pth{3}{2},pth{4}{2},pth{5}{1})
mintersect(pth{1}{2},pth{2}{1},pth{3}{2},pth{4}{3},pth{5}{1})
mintersect(pth{1}{3},pth{2}{1},pth{3}{1},pth{4}{1},pth{5}{1})
mintersect(pth{1}{3},pth{2}{1},pth{3}{1},pth{4}{2},pth{5}{1})
mintersect(pth{1}{3},pth{2}{1},pth{3}{1},pth{4}{3},pth{5}{1})
mintersect(pth{1}{3},pth{2}{1},pth{3}{2},pth{4}{1},pth{5}{1})
mintersect(pth{1}{3},pth{2}{1},pth{3}{2},pth{4}{2},pth{5}{1})
mintersect(pth{1}{3},pth{2}{1},pth{3}{2},pth{4}{3},pth{5}{1})
mintersect(pth{1}{4},pth{2}{1},pth{3}{1},pth{4}{1},pth{5}{1})
mintersect(pth{1}{4},pth{2}{1},pth{3}{1},pth{4}{2},pth{5}{1})
mintersect(pth{1}{4},pth{2}{1},pth{3}{1},pth{4}{3},pth{5}{1})
mintersect(pth{1}{4},pth{2}{1},pth{3}{2},pth{4}{1},pth{5}{1})
mintersect(pth{1}{4},pth{2}{1},pth{3}{2},pth{4}{2},pth{5}{1})
mintersect(pth{1}{4},pth{2}{1},pth{3}{2},pth{4}{3},pth{5}{1})
As Madphysicist pointed out, I misunderstood the initial structure of your initial cell array, however the point stands. The way to pass an unknown number of arguments to a function is via comma-separated-list generation, and your function needs to support it by being declared with varargin. Updated example below.
Create a helper function to collect a random subcell from each main cell:
% in getRandomVectors.m
function Out = getRandomVectors(C) % C: a double-jagged array, as described
N = length(C);
Out = cell(1, N);
for i = 1 : length(C)
Out{i} = C{i}{randi( length(C{i}) )};
end
end
Then assuming you already have an mintersect function defined something like this:
% in mintersect.m
function Intersections = mintersect( varargin )
Vectors = varargin;
N = length( Vectors );
for i = 1 : N; for j = 1 : N
Intersections{i,j} = intersect( Vectors{i}, Vectors{j} );
end; end
end
Then call this like so:
C = { { 1:5, 2:4, 3:7 }, {1:8}, {2:4, 3:9, 2:8} }; % example double-jagged array
In = getRandomVectors(C); % In is a cell array of randomly selected vectors
Out = mintersect( In{:} ); % Note the csl-generator syntax
PS. I note that your definition of mintersect differs from those linked. It may just be you didn't describe what you want too well, in which case my mintersect function is not what you want. What mine does is produce all possible intersections for the vectors provided. The one you linked to produces a single intersection which is common to all vectors provided. Use whichever suits you best. The underlying rationale for using it is the same though.
PS. It is also not entirely clear from your description whether what you're after is a random vector k for each n, or the entire space of possible vectors over all n and k. The above solution does the former. If you want the latter, see MadPhysicist's solution on how to create a cartesian product of all possible indices instead.

matlab vectorization if statement

Can someone please tell me the vectorized implementation of following matlab code. Predicted is an array containing either of the two values "pos" or "neg". I have to copy the values when condition comes true.
p = 1;
box = zeros(size(bbox));
for k = 1: size(predicted)
if predicted(k) == 'pos'
box(p,:) = bbox(k,:);
p = p + 1;
end
end
bbox=rand(100); %demo data
predicted = rand(1,100)>0.5; %logical values
%You want to convert your array of strings into an array of logical values
%predicted=strcmp(predicted,'pos');
box=bbox(predicted,:);

MATLAB list manipulation

Consider I have a code segment as follows:
Case 1
n = 20;
for i = 2 : n
mat = rand([2,i]);
mat = [mat, mat(:,1)]; %add the first column to the last
%feed the variable 'mat' to a function
end
Case 2
n = 10;
list = [];
for i = 1 : n
a = rand([2,1]);
b = rand([2,2])
list = [list, [a,b]];
end
In this way, MATLAB gives the below suggestion:
The variable 'mat' appears to change size on every loop. Consider preallocating for speed up.
The variable 'list' appears to change size on every loop. Consider preallocating for speed up.
I am a MATLAB newconer, So I would like to know how to deal with this issue. How to do this in native MATLAB style? Thanks in advance.
I'll focus on the second case, as it's the only one that makes sense:
n = 10;
list = [];
for i = 1 : n
a = rand([2,1]);
b = rand([2,2])
list = [list, [a,b]];
end
That you are doing here, for each loop, is to create two vectors with random numbers, a and b. a has dimension 2x1, and b has dimension 2x2. Then, you concatenate these two, with the matrix list.
Note that each call to rand are independent, so rand(2,3) will behave the same way [rand(2,2), rand(2,1)] does.
Now, since you loop 10 times, and you add rand(2,3) every time, you're essentially doing [rand(2,2), rand(2,1), rand(2,2), rand(2,1) ...]. This is equivalent to rand(2,30), which is a lot faster. Therefore, "Consider preallocating for speed up."
Now, if your concatenations doesn't contain random matrices, but are really the output from some function that can't output the entire matrix you want, then preallocate and insert it to the matrix using indices:
Let's define a few functions:
function x = loopfun(n)
x = n*[1; 2];
end
function list = myfun1(n)
list = zeros(2, n);
for ii = 1:n
list(:,ii) = loopfun(ii);
end
end
function list = myfun2(n)
list = [];
for ii = 1:n
list = [list, loopfun(ii)];
end
end
f1 = #() myfun1(100000); f2 = #() myfun2(100000);
fprintf('Preallocated: %f\nNot preallocated: %f\n', timeit(f1), timeit(f2))
Preallocated: 0.141617
Not preallocated: 0.318272
As you can see, the function with preallocation is twice as fast as the function with an increasing sized matrix. The difference is smaller if there are few iterations, but the general idea is the same.
f1 = #() myfun1(5); f2 = #() myfun2(5);
fprintf('Preallocated: %f\nNot preallocated: %f\n', timeit(f1), timeit(f2))
Preallocated: 0.000010
Not preallocated: 0.000018

How to write a string in replace of a number in MATLAB?

I am trying to categorise the years based on some conditions. If any year has rainfall less than a specific number it will indicate as a dry year. I have tried the following, but it gives me error "In an assignment A(I) = B, the number of elements in B and I must be the same."
The code is
Year_Category = zeros(ny,1);
for i = 1:ny;
if (xy(i)< Lower_Limit)
Year_Category(i) = 'Dry';
elseif (xy(i)> Upper_Limit)
Year_Category(i) = 'Wet';
else
Year_Category(i) = 'Average';
end
end
Any help would be appreciated.
Best regards
You are trying to assign characters to a numeric array. That's why you're getting a dimension mismatch. Each character is a single slot and you can't do that in this case. Use cell arrays instead:
Year_Category = cell(ny,1); %// Change
for i = 1:ny;
if (xy(i)< Lower_Limit)
Year_Category{i} = 'Dry'; %// Change
elseif (xy(i)> Upper_Limit)
Year_Category{i} = 'Wet'; %// Change
else
Year_Category{i} = 'Average'; %// Change
end
end