Matlab Bootstrap not complete resampling - matlab

I would like to perform bootstrap in Matlab. I have 100 original data points and I would like each iteration of bootstrap to choose only 57 points with replacement randomly. How do I accomplish it?
I cannot seem to find this functionality in Matlab function bootstrp.
Regards,

To choose n points out of a vector randomly with replacement: use randi to generate the (possibly repeated) indices:
vector = (1:100).^2; %// example data
n = 57;
ind = randi(numel(vector),1,n); %// n random integers between 1 and numel(vector)
sample = vector(ind);
To do it directly with bootstrp: let fun denote the function you would pass to bootstrp. You just need to pick the first 57 values of each 100-value sample:
vector = (1:100).^2; %// example data
n = 57;
nboot = 10;
bootstrp(nboot, #(x) fun(x(1:57)), vector)

Related

Create a submatrix using random columns and loop

I have a 102-by-102 matrix. I want to select square sub-matrices of orders from 2 up to 8 using random column numbers. Here is what I have done so far.
matt is the the original matrix of size 102-by-102.
ittr = 30
cols = 3;
for i = 1:ittr
rr = randi([2,102], cols,1);
mattsub = matt([rr(1) rr(2) rr(3)], [rr(1) rr(2) rr(3)]);
end
I have to extract matrices of different orders from 2 to 8. Using the above code I would have to change the mattsub line every time I change cols. I believe it is possible to do with another loop inside but cannot figure out how. How can I do this?
There is no need to extract elements of a vector and concatenate them, just use the vector to index a matrix.
Instead of :
mattsub = matt([rr(1) rr(2) rr(3)], [rr(1) rr(2) rr(3)]);
Use this:
mattsub = matt(rr, rr);
Defining a set of random sizes is pretty easy using the randi function. Once this is done, they can be projected along your iterations number N using arrayfun. Within the iterations, the randperm and sort functions can be used in order to build the random indexers to the original matrix M.
Here is the full code:
% Define the starting parameters...
M = rand(102);
N = 30;
% Retrieve the matrix rows and columns...
M_rows = size(M,1);
M_cols = size(M,2);
% Create a vector of random sizes between 2 and 8...
sizes = randi(7,N,1) + 1;
% Generate the random submatrices and insert them into a vector of cells...
subs = arrayfun(#(x)M(sort(randperm(M_rows,x)),sort(randperm(M_cols,x))),sizes,'UniformOutput',false);
This can work on any type of matrix, even non-squared ones.
You don't need another loop, one is enough. If you use randi to get a random integer as size of your submatrix, and then use those to get random column and row indices you can easily get a random submatrix. Do note that the ouput is a cell, as the submatrices won't all be of the same size.
N=102; % Or substitute with some size function
matt = rand(N); % Initial matrix, use your own
itr = 30; % Number of iterations
mattsub = cell(itr,1); % Cell for non-uniform output
for ii = 1:itr
X = randi(7)+1; % Get random integer between 2 and 7
colr = randi(N-X); % Random column
rowr = randi(N-X); % random row
mattsub{ii} = matt(rowr:(rowr+X-1),colr:(colr+X-1));
end

Multiple constant to a matrix and convert them into block diagonal matrix in matlab

I have a1 a2 a3. They are constants. I have a matrix A. What I want to do is to get a1*A, a2*A, a3*A three matrices. Then I want transfer them into a diagonal block matrix. For three constants case, this is easy. I can let b1 = a1*A, b2=a2*A, b3=a3*A, then use blkdiag(b1, b2, b3) in matlab.
What if I have n constants, a1 ... an. How could I do this without any looping?I know this can be done by kronecker product but this is very time-consuming and you need do a lot of unnecessary 0 * constant.
Thank you.
Discussion and code
This could be one approach with bsxfun(#plus that facilitates in linear indexing as coded in a function format -
function out = bsxfun_linidx(A,a)
%// Get sizes
[A_nrows,A_ncols] = size(A);
N_a = numel(a);
%// Linear indexing offsets between 2 columns in a block & between 2 blocks
off1 = A_nrows*N_a;
off2 = off1*A_ncols+A_nrows;
%// Get the matrix multiplication results
vals = bsxfun(#times,A,permute(a,[1 3 2])); %// OR vals = A(:)*a_arr;
%// Get linear indices for the first block
block1_idx = bsxfun(#plus,[1:A_nrows]',[0:A_ncols-1]*off1); %//'
%// Initialize output array base on fast pre-allocation inspired by -
%// http://undocumentedmatlab.com/blog/preallocation-performance
out(A_nrows*N_a,A_ncols*N_a) = 0;
%// Get linear indices for all blocks and place vals in out indexed by them
out(bsxfun(#plus,block1_idx(:),(0:N_a-1)*off2)) = vals;
return;
How to use: To use the above listed function code, let's suppose you have the a1, a2, a3, ...., an stored in a vector a, then do something like this out = bsxfun_linidx(A,a) to have the desired output in out.
Benchmarking
This section compares or benchmarks the approach listed in this answer against the other two approaches listed in the other answers for runtime performances.
Other answers were converted to function forms, like so -
function B = bsxfun_blkdiag(A,a)
B = bsxfun(#times, A, reshape(a,1,1,[])); %// step 1: compute products as a 3D array
B = mat2cell(B,size(A,1),size(A,2),ones(1,numel(a))); %// step 2: convert to cell array
B = blkdiag(B{:}); %// step 3: call blkdiag with comma-separated list from cell array
and,
function out = kron_diag(A,a_arr)
out = kron(diag(a_arr),A);
For the comparison, four combinations of sizes of A and a were tested, which are -
A as 500 x 500 and a as 1 x 10
A as 200 x 200 and a as 1 x 50
A as 100 x 100 and a as 1 x 100
A as 50 x 50 and a as 1 x 200
The benchmarking code used is listed next -
%// Datasizes
N_a = [10 50 100 200];
N_A = [500 200 100 50];
timeall = zeros(3,numel(N_a)); %// Array to store runtimes
for iter = 1:numel(N_a)
%// Create random inputs
a = randi(9,1,N_a(iter));
A = rand(N_A(iter),N_A(iter));
%// Time the approaches
func1 = #() kron_diag(A,a);
timeall(1,iter) = timeit(func1); clear func1
func2 = #() bsxfun_blkdiag(A,a);
timeall(2,iter) = timeit(func2); clear func2
func3 = #() bsxfun_linidx(A,a);
timeall(3,iter) = timeit(func3); clear func3
end
%// Plot runtimes against size of A
figure,hold on,grid on
plot(N_A,timeall(1,:),'-ro'),
plot(N_A,timeall(2,:),'-kx'),
plot(N_A,timeall(3,:),'-b+'),
legend('KRON + DIAG','BSXFUN + BLKDIAG','BSXFUN + LINEAR INDEXING'),
xlabel('Datasize (Size of A) ->'),ylabel('Runtimes (sec)'),title('Runtime Plot')
%// Plot runtimes against size of a
figure,hold on,grid on
plot(N_a,timeall(1,:),'-ro'),
plot(N_a,timeall(2,:),'-kx'),
plot(N_a,timeall(3,:),'-b+'),
legend('KRON + DIAG','BSXFUN + BLKDIAG','BSXFUN + LINEAR INDEXING'),
xlabel('Datasize (Size of a) ->'),ylabel('Runtimes (sec)'),title('Runtime Plot')
Runtime plots thus obtained at my end were -
Conclusions: As you can see, either one of the bsxfun based methods could be looked into, depending on what kind of datasizes you are dealing with!
Here's another approach:
Compute the products as a 3D array using bsxfun;
Convert into a cell array with one product (matrix) in each cell;
Call blkdiag with a comma-separated list generated from the cell array.
Let A denote your matrix, and a denote a vector with your constants. Then the desired result B is obtained as
B = bsxfun(#times, A, reshape(a,1,1,[])); %// step 1: compute products as a 3D array
B = mat2cell(B,size(A,1),size(A,2),ones(1,numel(a))); %// step 2: convert to cell array
B = blkdiag(B{:}); %// step 3: call blkdiag with comma-separated list from cell array
Here's a method using kron which seems to be faster and more memory efficient than Divakar's bsxfun based solution. I'm not sure if this is different to your method, but the timing seems pretty good. It might be worth doing some testing between the different methods to work out which is more efficient for you problem.
A=magic(4);
a1=1;
a2=2;
a3=3;
kron(diag([a1 a2 a3]),A)

Delete rows in a vector starting from a specific index Matlab

I have an nx1 vector in Matlab. I want to delete rows starting from a specific index. For example, if n is 100 and the index is 60 then all rows from 60 till 100 will be removed. I've found this REMOVEROWS but I don't know how to make it do this.
The removerows function is probably overkill for a vector, but here is how it can be used along with the usual linear indexing method:
n = 100;
index = 60;
a = rand(n,1); % An n-by-1 column vector
b1 = a(1:index-1)
b2 = removerows(a,'ind',index:n) % Or removerows(a,'ind',index:size(a,1))
Note that the removerows function is in the Neural Network toolbox and thus not part of core Matlab.
This should the trick:
your_vector(index:end) = [];
If you want to remove from the end that you can do:
your_vector(end-index+1:end) = [];

How can I build a Scilab / MATLAB program that averages a 3D matrix?

I need to make a scilab / MATLAB program that averages the values of a 3D matrix in cubes of a given size(N x N x N).I am eternally grateful to anyone who can help me.
Thanks in advance
In MATLAB, mat2cell and cellfun make a great team for working on N-dimensional non-overlapping blocks, as I think is the case in the question. An example scenario:
[IN]: A = [30x30x30] array
[IN]: bd = [5 5 5], size of cube
[OUT]: B = [6x6x6] array of block means
To accomplish the above, the solution is:
dims = [30 30 30]; bd = [5 5 5];
A = rand(dims);
f = floor(dims./bd);
remDims = mod(dims,bd); % handle dims that are not a multiple of block size
Ac = mat2cell(A,...
[bd(1)*ones(f(1),1); remDims(1)*ones(remDims(1)>0)], ....
[bd(2)*ones(f(2),1); remDims(2)*ones(remDims(2)>0)], ....
[bd(3)*ones(f(3),1); remDims(3)*ones(remDims(3)>0)] );
B = cellfun(#(x) mean(x(:)),Ac);
If you need a full size output with the mean values replicated, there is a straightforward solution involving the 'UniformOutput' option of cellfun followed by cell2mat.
If you want overlapping cubes and the same size output as input, you can simply do convn(A,ones(blockDims)/prod(blockDims),'same').
EDIT: Simplifications, clarity, generality and fixes.
N = 10; %Same as OP's parameter
M = 10*N;%The input matrix's size in each dimensiona, assumes M is an integer multiple of N
Mat = rand(M,M,M); % A random input matrix
avgs = zeros((M/N)^3,1); %Initializing output vector
l=1; %indexing
for i=1:M/N %indexing 1st coord
for j=1:M/N %indexing 2nd coord
for k=1:M/N % indexing third coord
temp = Mat((i-1)*N+1:i*N,(j-1)*N+1:j*N,(k-1)*N+1:k*N); %temporary copy
avg(l) = mean(temp(:)); %averaging operation on the N*N*N copy
l = l+1; %increment indexing
end
end
end
The for loops and copying can be eliminated once you get the gist of indexing.

How to see resampled data after BOOTSTRAP

I was trying to resample (with replacement) my database using 'bootstrap' in Matlab as follows:
D = load('Data.txt');
lead = D(:,1);
depth = D(:,2);
X = D(:,3);
Y = D(:,4);
%Bootstraping to resample 100 times
[resampling100,bootsam] = bootstrp(100,'corr',lead,depth);
%plottig the bootstraping result as histogram
hist(resampling100,10);
... ... ...
... ... ...
Though the script written above is correct, I wonder how I would be able to see/load the resampled 100 datasets created through bootstrap? 'bootsam(:)' display the indices of the data/values selected for the bootstrap samples, but not the new sample values!! Isn't it funny that I'm creating fake data from my original data and I can't even see what is created behind the scene?!?
My second question: is it possible to resample the whole matrix (in this case, D) altogether without using any function? However, I know how to create random values from a vector data using 'unidrnd'.
Thanks in advance for your help.
The answer to question 1 is that bootsam provides the indices of the resampled data. Specifically, the nth column of bootsam provides the indices of the nth resampled dataset. In your case, to obtain the nth resampled dataset you would use:
lead_resample_n = lead(bootsam(:, n));
depth_resample_n = depth(bootsam(:, n));
Regarding the second question, I'm guessing what you mean is, how would you just get a re-sampled dataset without worrying about applying a function to the resampled data. Personally, I would use randi, but in this situation, it is irrelevant whether you use randi or unidrnd. An example follows that assumes 4 columns of some data matrix D (as in your question):
%# Build an example dataset
T = 10;
D = randn(T, 4);
%# Obtain a set of random indices, ie indices of draws with replacement
Ind = randi(T, T, 1);
%# Obtain the resampled data
DResampled = D(Ind, :);
To create multiple re-sampled data, you can simply loop over the creation of random indices. Or you could do it in one step by creating a matrix of random indices and using that to index D. With careful use of reshape and permute you can turn this into a T*4*M array, where indexing m = 1, ..., M along the third dimension yields the mth resampled dataset. Example code follows:
%# Build an example dataset
T = 10;
M = 3;
D = randn(T, 4);
%# Obtain a set of random indices, ie indices of draws with replacement
Ind = randi(T, T, M);
%# Obtain the resampled data
DResampled = permute(reshape(D(Ind, :)', 4, T, []), [2 1 3]);