I have a severely unbalanced data set. I want to perform uniform resampling at 200% of the original data set size.
The resample function does not seem to do what I expect. Does anyone know of a toolbox or function that can do this? Thanks.
If you want to randomly resample with replacement from a data set of size N, you can use randi(N,1,N*2) to return a vector of size N*2 of random integers between 1 and N. Then use that vector to index into your original matrix. For example,
N = 100;
data = rand(1,N); % This simulates your original data set
idx = randi(N, 1, N*2);
newData = data(idx);
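The same idea works when the data set is a matrix with one observation per row; a minimal sketch (the matrix shape here is made up for illustration):
N = 100;                 % number of original observations
data = rand(N, 3);       % simulated data set, one observation per row
idx = randi(N, N*2, 1);  % 2*N row indices drawn uniformly with replacement
newData = data(idx, :);  % resampled data set, twice the original size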
I'm trying to extract HOG features for mathematical symbol classification (I will use an SVM classifier). I get a 1xn vector for each image, and then I have to put all the vectors into a single matrix. The problem is that the size of the feature vector is different for each image, so I can't concatenate them.
Is there a way to make all the vectors the same size?
Thank you in advance.
Here is the code:
rep1 = 'D:\mémoire MASTER\data';
ext = '*.tif';
chemin = fullfile(rep1, ext);
list = dir(chemin);
for i = 1:length(list)
    I = imread(fullfile(rep1, list(i).name), ext(3:end));
    if size(I,3) == 3 % RGB image
        I = rgb2gray(I);
    end
    I1 = imbinarize(I);
    % Extract HOG features data
    HOG_feat = extractHOGFeatures(I1, 'CellSize', [2 2]);
    HOG_feat1 = HOG_feat';
end
You can pad each one with zeros to be as long as the longest one:
e.g. to put two vectors, v1 and v2, into a matrix M:
M = zeros(2,max(length(v1),length(v2)));
M(1,1:length(v1)) = v1;
M(2,1:length(v2)) = v2;
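If you have many vectors, e.g. collected in a cell array while looping over the images, the same padding can be done in one pass. A small sketch, assuming the vectors are stored in a cell array named feats (a name made up here):
feats = {rand(1,10), rand(1,7), rand(1,12)}; % example vectors of different lengths
maxLen = max(cellfun(@length, feats));       % length of the longest vector
M = zeros(numel(feats), maxLen);             % preallocate, padded with zeros
for k = 1:numel(feats)
    M(k, 1:length(feats{k})) = feats{k};     % copy each vector into its row
end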
You have the problem that all your vectors are a different size. Instead of trying to coerce them to be the same size by zero-padding or interpolating (both of which I think are bad ideas), change your computation so that the length of the output vector does not depend on the size of the image.
This is your current code:
HOG_feat = extractHOGFeatures(I1,'CellSize', [2 2]);
% ^^^
% the image is split in cells of 2x2 pixels
2x2 cells are way too small for this method anyway. You could instead divide your image into a fixed number of cells, say 100 cells (a 10-by-10 grid):
cellSize = ceil(size(I1)/10);
HOG_feat = extractHOGFeatures(I1,'CellSize', cellSize);
(I’m using ceil in the division because I figure it’s necessary to have an integer size. But I’m not sure whether ceil or floor or round is needed here, and I don’t have access to this function to test it. A bit of trial and error should show which method gives consistent output size.)
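Applied to the loop from the question, that might look roughly like the sketch below. It is only a sketch: the features are collected in a cell array first so you can verify that the lengths really do come out equal (per the caveat above) before stacking them into a matrix.
feats = cell(length(list), 1);
for i = 1:length(list)
    I = imread(fullfile(rep1, list(i).name));
    if size(I,3) == 3 % RGB image
        I = rgb2gray(I);
    end
    I1 = imbinarize(I);
    cellSize = ceil(size(I1)/10); % roughly a 10-by-10 grid of cells per image
    feats{i} = extractHOGFeatures(I1, 'CellSize', cellSize);
end
lens = cellfun(@length, feats);
if all(lens == lens(1)) % only stack if every feature vector has the same length
    HOG_matrix = vertcat(feats{:}); % one feature vector per row
end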
I have a 36x256x2232 3D matrix in MATLAB, created by M = ones(36,256,2232), and I want to reduce its size by summing the rows in groups of 3. The result should be a 12x256x2232 matrix in which every element has the value 3.
I tried using reshape and the sum function, but I get a 1x256x2232 matrix.
How can I do this without using a for-loop?
This should do it:
M = ones(36,256,2232)
reduced = reshape(sum(reshape(M, 3,[], 256,2232), 1),[], 256, 2232);
the inner reshape makes a 4D matrix, splitting the rows into groups of 3
sum collapses that first (group) dimension
the second reshape transforms the result back to 3D
You can also use squeeze, which removes singleton dimensions:
reduced = squeeze(sum(reshape(M, 3,[], 256,2232), 1));
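A quick sanity check of either version, using the all-ones matrix from the question (every element of the result should be 3):
M = ones(36, 256, 2232);
reduced = squeeze(sum(reshape(M, 3, [], 256, 2232), 1));
size(reduced)        % returns [12 256 2232]
all(reduced(:) == 3) % returns logical 1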
You can use the new-ish splitapply function (which is similar to accumarray but can handle data with multiple dimensions). This approach works even if the number of rows is not a multiple of the group size:
M = ones(4,5,2); % example data
n = 3; % group size
result = splitapply(@(x)sum(x,1), M, floor((0:size(M,1)-1).'/n)+1);
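The third argument is just a group index that assigns every n consecutive rows to the same group; for the 4-row example above with n = 3, the last group simply holds the leftover row:
n = 3;
G = floor((0:3).'/n) + 1 % group index for 4 rows: [1; 1; 1; 2]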
I have a multidimensional time series in MATLAB. Let's say it has M dimensions and N samples, so I have it stored as an NxM matrix.
I want to interpolate the time series to fit a new length N1, where N is always less than N1.
In other words, if I have multiple time series (all sampled at the same rate, just of different lengths), I want to interpolate them all to be of length N1.
How can one achieve this with MATLAB?
EDIT: Could one achieve this with imresize?
i.e.:
A = randn(5,10) % 10 dimensions, 5 samples
desiredLength = 15; % we want 15 samples in length
newA = imresize(A, [desiredLength 10], 'bilinear');
A procedure like the following might do what you want. The new data will be a linear interpolation of the old data.
[initSize1, initSize2] = ndgrid(1:size(Data, 1), 1:size(Data, 2));
[newSize1, newSize2] = ndgrid(linspace(1, size(Data, 1), newlength), 1:size(Data, 2));
newData = interpn(initSize1, initSize2, Data, newSize1, newSize2);
As coded up, only dimension 1 should change, as the second gridded dimension is the same in the first and second calls to ndgrid.
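For example, using the sizes from the question's EDIT (Data is N-by-M with N = 5 and M = 10, interpolated to newlength = 15 samples), only the number of rows changes:
Data = randn(5, 10); % 5 samples, 10 dimensions
newlength = 15;      % desired number of samples
[initSize1, initSize2] = ndgrid(1:size(Data, 1), 1:size(Data, 2));
[newSize1, newSize2] = ndgrid(linspace(1, size(Data, 1), newlength), 1:size(Data, 2));
newData = interpn(initSize1, initSize2, Data, newSize1, newSize2);
size(newData)        % returns [15 10]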
If you have a timeseries object, you might also want to look at the resample method for the timeseries object:
http://www.mathworks.co.uk/help/matlab/ref/timeseries.resample.html
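A minimal sketch of that route, assuming a uniform sample rate and reusing A and desiredLength from the question's EDIT (I have not been able to test this, so treat it as an outline):
ts = timeseries(A, 1:size(A,1));                 % one time step per sample
newTime = linspace(1, size(A,1), desiredLength); % finer, uniform time vector
tsNew = resample(ts, newTime);                   % linear interpolation by default
newA = tsNew.Data;                               % desiredLength-by-M matrix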
I have a limited data set RV for which I can find the mean mu and standard deviation sigma. Now I want to generate more data points while keeping the same mu and sigma. How would I go about doing this in MATLAB? I did the following; however, when I plot the mean of the generated data (mu_2), it does not match mu...
N = 15
R = mean(RV) + std(RV)*randn(N, 1);
mu = mean(RV)*ones(N,1);
mu_2 = mean(R)*ones(N,1);
I think you should use the normrnd(mu,sigma) function.
See the documentation for more details; a minimal sketch is below.
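Assuming RV and N from the question (normrnd is part of the Statistics and Machine Learning Toolbox):
mu    = mean(RV);                 % mean of the original data
sigma = std(RV);                  % standard deviation of the original data
R     = normrnd(mu, sigma, N, 1); % N new samples from a normal with that mean and std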
Best regards
That looks correct. For such a small sample size, it's unlikely that you'll get a very good match. Try a much bigger value of N.
If you want to force your data set to have a particular mean and standard deviation, you can generate a set of samples, measure their mean and standard deviation, and then adjust by scaling and adding a constant.
For example (mu_desired and std_desired are the target statistics, e.g. mean(RV) and std(RV) from the question):
mu_desired = mean(RV);  % target mean
std_desired = std(RV);  % target standard deviation
R = randn(N, 1);
% Measure
mu_tmp = mean(R);
std_tmp = std(R);
% Normalise and denormalise
R = (R - mu_tmp) / std_tmp;
R = (R * std_desired) + mu_desired;
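A quick check (assuming mu_desired and std_desired were set to the sample statistics of RV as above):
fprintf('mean: %g (target %g)\n', mean(R), mu_desired);
fprintf('std:  %g (target %g)\n', std(R), std_desired);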
You can also generate Gaussian mixtures using the Netlab library (it's free!):
mix=gmm(8,3,'spherical');
[Data, Label]=gmmsamp(mix,1000);
The above generates a data set with 8 dimensions and three centers (spherical) over 1000 observations.
Data: Say I have a 2000-row by 500-column matrix (an image).
What I need: Compute the FFT of 64-row by 10-column chunks of the above data. In other words, I want to compute the FFT of a 64x10 window that is run across the entire data matrix. The FFT result is used to compute a scalar value (say, the peak amplitude frequency), which is used to create a new "FFT value" image.
Now, I need the final FFT image to be the same size as the original data (2000 x 500).
What is the fastest way to accomplish this in MATLAB? I am currently using for loops, which are relatively slow. I also use interpolation to size up the final image to the original data size.
As @EitanT pointed out, you can use blockproc for batch block processing of an image J. However, you should define your function handle as
fun = @(block_struct) fft2(block_struct.data);
B = blockproc(J, [64 10], fun);
For a [2000 x 500] matrix this will give you a [2000 x 500] output of complex Fourier values, evaluated at sub-sampled pixel locations with a local support (size of the input to the FFT) of [64 x 10]. Now, to replace those values with a single value, e.g. the peak log-magnitude, you can further specify
fun = @(block_struct) max(max(log(abs(fft2(block_struct.data)))));
B = blockproc(J, [64 10], fun);
The output is then an approximately [2000/64 x 500/10] matrix of per-block values, which you can resize by nearest-neighbor interpolation (or something smoother) to the desired [2000 x 500] original size:
C = imresize(B, [2000 500], 'nearest');
I can include a real image example if it will further help.
Update: To get overlapping blocks you can use the 'BorderSize' option of blockproc by setting the overlap [V H] such that the final windows of size [M + 2*V, N + 2*H] are still [64, 10] in size. Example:
fun = @(block_struct) log(abs(fft2(block_struct.data)));
V = 16; H = 3; % overlap values
overlap = [V H];
M = 32; N = 4; % non-overlapping values
B1 = blockproc(J, [M N], fun, 'BorderSize', overlap); % final windows are 64 x 10
Note, however, that this works when keeping the full Fourier response per block, not with the single-value max(max()) version above.
See also this post for filtering using blockproc: Dealing with “Really Big” Images: Block Processing.
If you want to apply the same function (in your case, the 2-D Fourier transform) on individual distinct blocks in a larger matrix, you can do that with the blkproc function, which is replaced in newer MATLAB releases by blockproc.
However, I infer that you wish to apply fft2 to overlapping blocks in a "sliding window" fashion. For this purpose you can use colfilt with the 'sliding' option. Note that the function we're applying to each block is the FFT:
block_size = [64, 10];
temp_size = 5 * block_size;
% restore each column to a 64x10 block, take its 2-D FFT, and keep the peak magnitude
col_func = @(x) cellfun(@(y) max(max(abs(fft2(reshape(y, block_size))))), num2cell(x, 1));
B = colfilt(A, block_size, temp_size, 'sliding', col_func);
How does this work? colfilt processes the matrix A by rearranging each "sliding" block into a separate column of a new temporary matrix, and then applying the col_func to this new matrix. col_func in turn restores each column into the original block and applies fft2 on it, returning the largest amplitude value for each column.
Important things to note:
Since this temporary matrix includes all possible "sliding" blocks, memory can be a limitation. To reduce memory use, colfilt breaks the original matrix A into sub-matrices of size temp_size and performs the calculations on each separately. The resulting matrix B is the same either way.
Each element of the resulting matrix B is computed from the corresponding block neighborhood. The larger your image, the more sliding blocks there are to process, and the longer the computation takes; expect to wait a while for MATLAB to finish processing all the sliding windows of your 2000-by-500 matrix.