Increasing the length of a column in MATLAB - matlab

I'm just beginning to teach myself MATLAB, and I'm making a 501x6 array. The columns will contain probabilities for flipping 101 sided die, and as such, the columns contain 101,201,301 entries, not 501. Is there a way to 'stretch the column' so that I add 0s above and below the useful data? So far I've only thought of making a column like a=[zeros(200,1);die;zeros(200,1)] so that only the data shows up in rows 201-301, and similarly, b=[zeros(150,1);die2;zeros(150,1)], if I wanted 200 or 150 zeros to precede and follow the data, respectively in order for it to fit in the array.
Thanks for any suggestions.

You can do several thing:
Start with an all-zero matrix, and only modify the elements you need to be non-zero:
A = zeros(501,6);
A(someValue:someOtherValue, 5) = value;
% OR: assign the range to a vector:
A(someValue:someOtherValue, 5) = 1:20; % if someValue:someOtherValue is the same length as 1:20

Related

How do I reshape a non-quadratic matrix?

I have a column vector A with dimensions (35064x1) that I want to reshape into a matrix with 720 lines and as many columns as it needs.
In MATLAB, it'd be something like this:
B = reshape(A,720,[])
in which B is my new matrix.
However, if I divide 35604 by 720, there'll be a remainder.
Ideally, MATLAB would go about filling every column with 720 values until the last column, which wouldn't have 720 values; rather, 504 values (48x720+504 = 35064).
Is there any function, as reshape, that would perform this task?
Since I am not good at coding, I'd resort to built-in functions first before going into programming.
reshape preserves the number of elements but you achieve the same in two steps
b=zeros(720*ceil(35604/720),1); b(1:35604)=a;
reshape(b,720,[])
A = rand(35064,1);
NoCols = 720;
tmp = mod(numel(A),NoCols ); % get the remainder
tmp2 = NoCols -tmp;
B = reshape([A; nan(tmp2,1)],720,[]); % reshape the extended column
This first gets the remainder after division, and then subtract that from the number of columns to find the amount of missing values. Then create an array with nan (or zeros, whichever suits your purpose best) to pad the original and then reshape. One liner:
A = rand(35064,1);
NoCols = 720;
B = reshape([A; nan(NoCols-mod(numel(A),NoCols);,1)],720,[]);
karakfa got the right idea, but some error in his code.
Fixing the errors and slightly simplifying it, you end up with:
B=nan(720,ceil(numel(a)/720));
B(1:numel(A))=A;
Create a matrix where A fits in and assingn the elemnent of A to the first numel(A) elements of the matrix.
An alternative implementation which is probably a bit faster but manipulates your variable b
%pads zeros at the end
A(720*ceil(numel(A)/720))=0;
%reshape
B=reshape(A,720,[]);

Using Matlab to randomly split an Excel Sheet

I have an Excel sheet containing 1838 records and I need to RANDOMLY split these records into 3 Excel Sheets. I am trying to use Matlab but I am quite new to it and I have just managed the following code:
[xlsn, xlst, raw] = xlsread('data.xls');
numrows = 1838;
randindex = ceil(3*rand(numrows, 1));
raw1 = raw(:,randindex==1);
raw2 = raw(:,randindex==2);
raw3 = raw(:,randindex==3);
Your general procedure will be to read the spreadsheet into some matlab variables, operate on those matrices such that you end up with three thirds and then write each third back out.
So you've got the read covered with xlsread, that results in the two matrices xlsnum and xlstxt. I would suggest using the syntax
[~, ~, raw] = xlsread('data.xls');
In the xlsread help file (you can access this by typing doc xlsread into the command window) it says that the three output arguments hold the numeric cells, the text cells and the whole lot. This is because a matlab matrix can only hold one type of value and a spreadsheet will usually be expected to have text or numbers. The raw value will hold all of the values but in a 'cell array' instead, a different kind of matlab data type.
So then you will have a cell array valled raw. From here you want to do three things:
work out how many rows you have (I assume each record is a row) by using the size function and specifying the appropriate dimension (again check the help file to see how to do this)
create an index of random numbers between 1 and 3 inclusive, which you can use as a mask
randindex = ceil(3*rand(numrows, 1));
apply the mask to your cell array to extract the records matching each index
raw1 = raw(:,randindex==1); % do the same for the other two index values
write each cell back to a file
xlswrite('output1.xls', raw1);
You will probably have to fettle the arguments to get it to work the way you want but be sure to check the doc functionname page to get the syntax just right. Your main concern will be to get the indexing correct - matlab indexes row-first whereas spreadsheets tend to be column-first (e.g. cell A2 is column A and row 2, but matlab matrix element M(1,2) is the first row and the second column of matrix M, i.e. cell B1).
UPDATE: to split the file evenly is surprisingly more trouble: because we're using random numbers for the index it's not guaranteed to split evenly. So instead we can generate a vector of random floats and then pick out the lowest 33% of them to make index 1, the highest 33 to make index 3 and let the rest be 2.
randvec = rand(numrows, 1); % float between 0 and 1
pct33 = prctile(randvec,100/3); % value of 33rd percentile
pct67 = prctile(randvec,200/3); % value of 67th percentile
randindex = ones(numrows,1);
randindex(randvec>pct33) = 2;
randindex(randvec>pct67) = 3;
It probably still won't be absolutely even - 1838 isn't a multiple of 3. You can see how many members each group has this way
numel(find(randindex==1))

MATLAB vector: prevent consecutive values from same range

Okay, this might seem like a weird question, but bear with me.
So I have a random vector in a .m file, with certain constraints built into it. Here is my code:
randvecall = randsample(done, done, true);
randvec = randvecall([1;diff(randvecall(:))]~=0);
"Done" is just the range of values we take the sample from, so don't worry about that. As you can see, this randsamples from a range of values, and then prunes this random vector with the diff function, so that consecutive duplicate values are removed. There is still the potential for duplicate values in the vector, but they simply cannot be consecutive.
This is all well and good, and works perfectly fine.
So, say, randvec looks like this:
randvec =
54
47
52
26
39
2
14
51
24
6
19
56
34
46
12
7
41
18
29
7
It is actually a lot longer, with something like 60-70 values, but you get the point.
What I want to do is add a little extra constraint on to this vector. When I sample from this vector, the values are classified according to their range. So values from 1-15 are category 1, 16-30 are category 2, and so on. The reasons for this are unimportant, but it is a pretty important part of the program. So if you look at the values I provided above, you see a section like this:
7
41
18
29
7
This is actually bad for my program. Because the value ranges are treated separately, 41, 18, and 29 are used differently than 7 is. So, for all intents and purposes, 7 is appearing consecutively in my script. What I want to do is somehow parse/modify/whatever the vector when it is generated so that the same number from a certain range cannot appear twice "in a row," regardless of how many other numbers from different ranges are between them. Does this make sense/did I describe this well? So, I want MATLAB to search the vector, and for all values within certain ranges (1-15,16-30,31-45,46-60) make sure that "consecutive" values from the same range are not identical.
So, then, that is what I want to do. This may not by any means be the best way to do this, so any advice/alternatives are, of course, appreciated. I know I can do this better with multiple vectors, but for various reasons I need this to be a single, long vector (the way my script is designed it just wouldn't work if I had a separate vector for each range of values).
What you may want to do is create four random vectors, one for each category, ensure that they do not contain any two consecutive equal values, and then build your final random vector by ordered picking of values from random categories, i.e.
%# make a 50-by-nCategories array of random numbers
categories = [1,16,31,46;15,30,45,60]; %# category min/max
nCategories = size(categories,2);
randomCategories = zeros(50,nCategories);
for c=1:nCategories
%# draw 100 numbers for good measure
tmp = randi(categories(:,c),[100 1]);
tmp(diff(tmp==0)) = []; %# remove consecutive duplicates
%# store
randomCategories(:,c) = tmp(1:50);
end
%# select from which bins to pick. Use half the numbers, so that we don't force the
%# numbers of entries per category to be exactly equal
bins = randi(nCategories,[100,1]);
%# combine the output, i.e. replace e.g. the numbers
%# '3' in 'bins' with the consecutive entries
%# from the third category
out = zeros(100,1);
for c = 1:nCategories
cIdx = find(bins==c);
out(cIdx) = randomCategories(1:length(cIdx),c);
end
First we assign each element the bin number of the range it lies into:
[~,bins] = histc(randvec, [1 16 31 46 61]);
Next we loop for each range, and find elements in those categories. For example for the first range of 1-16, we get:
>> ind = find(bins==1); %# bin#1 of 1-16
>> x = randvec(ind)
ans =
2
14
6
12
7
7
now you can apply the same process of removing consecutive duplicates:
>> idx = ([1;diff(x)] == 0)
idx =
0
0
0
0
0
1
>> problematicIndices = ind(idx) %# indices into the vector: randvec
Do this for all ranges, and collect those problematic indices. Next decide how you want to deal with them (remove them, generate other numbers in their place, etc...)
If I understand your problem correct, I think that is one solution. It uses unique, but applies it to each of the subranges of the vector. The values that are duplicated within a range of indices are identified so you can deal with them.
cat_inds = [1,16,31,46,60]; % need to include last element
for i=2:numel(cat_inds)
randvec_part = randvec( cat_inds(i-1):cat_inds(i) );
% Find the indices for the first unique elements in this part of the array
[~,uniqInds] = unique(randvec_part,'first');
% this binary vector identifies the indices that are duplicated in
% this part of randvec
%
% NB: they are indices into randvec_part
%
inds_of_duplicates = ~ismember(1:numel(randvec_part), uniqInds);
% code to deal with the problem indices goes here. Modify randvec_part accordingly...
% Write it back to the original vector (assumes that the length is the same)
randvec( cat_inds(i-1):cat_inds(i) ) = randvec_part;
end
Here's a different approach than what everyone else has been tossing up. The premise that I'm working on here is that you want to have a random arrangement of values in a vector without repitition. I'm not sure what other constraints you are applying prior to the point where we are giving out input.
My thoughts is to use the randperm function.
Here's some sample code how it would work:
%randvec is your vector of random values
randvec2 = unique(randvec); % This will return the sorted list of values from randvec.
randomizedvector = randvec2(randperm(length(randvec2));
% Note: if randvec is multidimensional you'll have to use numel instead of length
At this point randomizedvector should contain all the unique values from the initial randvec and but 'shuffled' or re-randomized after the unique function call. Now you could just seed the randvec differently to avoid needing the unique function call as simply calling randperm(n) will returning a randomized vector with values ranging from 1 to n.
Just an off the wall 2 cents there =P enjoy!

MATLAB - Index exceeds matrix dimensions

Hi I have problem with matrix..
I have many .txt files with different number of rows but have the same number of column (1 column)
e.g. s1.txt = 1234 rows
s2.txt = 1200 rows
s2.txt = 1100 rows
I wanted to combine the three files. Since its have different rows .. when I write it to a new file I got this error = Index exceeds matrix dimensions.
How I can solved this problem? .
You can combine three matrices simply by stacking them: Assuming that s1, etc are the matrices you read in, you can make a new one like this:
snew = [s1; s2; s3];
You could also use the [] style stacking without creating the new matrix variable if you only need to do it once.
You have provided far too little information for an accurate diagnosis of your problem. Perhaps you have loaded the data from your files into variables in your workspace. Perhaps s1 has 1 column and 1234 rows, etc. Then you can concatenate the variables into one column vector like this:
totalVector = [s1; s2; s3];
and write it out to a file with a save() statement.
Does that help ?
Let me make an assumption that this question is connecting with your another question, and you want to combine those matrices by columns, leaving empty values in columns with fewer data.
In this case this code should work:
BaseFile ='s';
n=3;
A = cell(1,n);
for k=1:n
A{k} = dlmread([BaseFile num2str(k) '.txt']);
end
% create cell array with maximum number of rows and n number of columns
B = cell(max(cellfun(#numel,A)),n);
% convert each matrix in A to cell array and store in B
for k=1:n
B(1:numel(A{k}),k) = num2cell(A{k});
end
% save the data
xlswrite('output.txt',B)
The code assumes you have one column in each file, otherwise it will not work.

How can I generate a binary matrix with specific patterns?

I have a binary matrix of size m-by-n. Given below is a sample binary matrix (the real matrix is much larger):
1010001
1011011
1111000
0100100
Given p = m*n, I have 2^p possible matrix configurations. I would like to get some patterns which satisfy certain rules. For example:
I want not less than k cells in the jth column as zero
I want the sum of cell values of the ith row greater than a given number Ai
I want at least g cells in a column continuously as one
etc....
How can I get such patterns satisfying these constraints strictly without sequentially checking all the 2^p combinations?
In my case, p can be a number like 2400, giving approximately 2.96476e+722 possible combinations.
Instead of iterating over all 2^p combinations, one way you could generate such binary matrices is by performing repeated row- and column-wise operations based on the given constraints you have. As an example, I'll post some code that will generate a matrix based on the three constraints you have listed above:
A minimum number of zeroes per column
A minimum sum for each row
A minimum sequential length of ones per column
Initializations:
First start by initializing a few parameters:
nRows = 10; % Row size of matrix
nColumns = 10; % Column size of matrix
minZeroes = 5; % Constraint 1 (for columns)
minRowSum = 5; % Constraint 2 (for rows)
minLengthOnes = 3; % Constraint 3 (for columns)
Helper functions:
Next, create a couple of functions for generating column vectors that match constraints 1 and 3 from above:
function vector = make_column
vector = [false(minZeroes,1); true(nRows-minZeroes,1)]; % Create vector
[vector,maxLength] = randomize_column(vector); % Randomize order
while maxLength < minLengthOnes, % Loop while constraint 3 is not met
[vector,maxLength] = randomize_column(vector); % Randomize order
end
end
function [vector,maxLength] = randomize_column(vector)
vector = vector(randperm(nRows)); % Randomize order
edges = diff([false; vector; false]); % Find rising and falling edges
maxLength = max(find(edges == -1)-find(edges == 1)); % Find longest
% sequence of ones
end
The function make_column will first create a logical column vector with the minimum number of 0 elements and the remaining elements set to 1 (using the functions TRUE and FALSE). This vector will undergo random reordering of its elements until it contains a sequence of ones greater than or equal to the desired minimum length of ones. This is done using the randomize_column function. The vector is randomly reordered using the RANDPERM function to generate a random index order. The edges where the sequence switches between 0 and 1 are detected using the DIFF function. The indices of the edges are then used to find the length of the longest sequence of ones (using FIND and MAX).
Generate matrix columns:
With the above two functions we can now generate an initial binary matrix that will at least satisfy constraints 1 and 3:
binMat = false(nRows,nColumns); % Initialize matrix
for iColumn = 1:nColumns,
binMat(:,iColumn) = make_column; % Create each column
end
Satisfy the row sum constraint:
Of course, now we have to ensure that constraint 2 is satisfied. We can sum across each row using the SUM function:
rowSum = sum(binMat,2);
If any elements of rowSum are less than the minimum row sum we want, we will have to adjust some column values to compensate. There are a number of different ways you could go about modifying column values. I'll give one example here:
while any(rowSum < minRowSum), % Loop while constraint 2 is not met
[minValue,rowIndex] = min(rowSum); % Find row with lowest sum
zeroIndex = find(~binMat(rowIndex,:)); % Find zeroes in that row
randIndex = round(1+rand.*(numel(zeroIndex)-1));
columnIndex = zeroIndex(randIndex); % Choose a zero at random
column = binMat(:,columnIndex);
while ~column(rowIndex), % Loop until zero changes to one
column = make_column; % Make new column vector
end
binMat(:,columnIndex) = column; % Update binary matrix
rowSum = sum(binMat,2); % Update row sum vector
end
This code will loop until all the row sums are greater than or equal to the minimum sum we want. First, the index of the row with the smallest sum (rowIndex) is found using MIN. Next, the indices of the zeroes in that row are found and one of them is randomly chosen as the index of a column to modify (columnIndex). Using make_column, a new column vector is continuously generated until the 0 in the given row becomes a 1. That column in the binary matrix is then updated and the new row sum is computed.
Summary:
For a relatively small 10-by-10 binary matrix, and the given constraints, the above code usually completes in no more than a few seconds. With more constraints, things will of course get more complicated. Depending on how you choose your constraints, there may be no possible solution (for example, setting minRowSum to 6 will cause the above code to never converge to a solution).
Hopefully this will give you a starting point to begin generating the sorts of matrices you want using vectorized operations.
If you have enough constraints, exploring all possible matrices could be attempted:
// Explore all possibilities starting at POSITION (0..P-1)
explore(int position)
{
// Check if one or more constraints can't be verified anymore with
// all values currently set.
invalid = ...;
if (invalid) return;
// Do we have a solution?
if (position >= p)
{
// print the matrix
return;
}
// Set one more value and continue exploring
for (int value=0;value<2;value++)
{ matrix[position] = value; explore(position+1); }
}
If the number of constraints is low, this approach will take too much time.
In this case, for the kind of constraints you gave as examples, simulated annealing may be a good solution.
You must design an energy function, high when all constraints are met. That would be something like that:
Generate a random matrix
Compute energy E0
Change one cell
Compute energy E1
If E1>E0, or E0-E1 is smaller than f(temperature), keep it, otherwise reverse the move
Update temperature, and goto 2 unless stop criterion is reached
If all the contraints relate to columns (as is the case in the question), then you can find all possible valid columns and check that each column in the matrix is in this set. (i.e. when you consider each column independently, you reduce the number of possibilities a lot.)
I might be way off here, but I remember doing something similar once with some genetic algorithm.
Check out pseudo boolean constraints (also called 0-1 integer programming).
This is virtually impossible if your constraint set is complex enough. You might try to use a stochastic optimizer, like simulated annealing, particle swarm optimization, or a genetic algorithm to find a feasible solution.
However, if you can generate one (non-random) solution to such a problem, then often you can generate others by random permutations made to the existing solution.