choosing a percentage of a dataset

choosing a percentage of a dataset - matlab

I am new to matlab and I cant find anything in the documentation for this, I have a method of sampling a dataset but I was wondering rather than using direct numbers how I can use a percentage:
normIdx = strmatch('normal.', TestDataLabels);
normalSubset = Testdata(normIdx, :);
normal = randperm(size(normalSubset , 1));
p = (normal(1:10000))'; % here I choose 10000 samples but I would like to use a percentage

You mean like this?
pcnt = 75; % The percent of original data set size you wish your sample size to be
sampleN = ceil( (pcnt/100) * length(normal) ); % figure out what pcnt percent of original N is, and round upward
p = normal(1:sampleN)';

Related

Approximation of cosh and sinh functions that give large values in MATLAB

My calculation involves cosh(x) and sinh(x) when x is around 700 - 1000 which reaches MATLAB's limit and the result is NaN. The problem in the code is elastic_restor_coeff rises when radius is small (below 5e-9 in the code). My goal is to do another integral over a radius distribution from 1e-9 to 100e-9 which is still a work in progress because I get stuck at this problem.
My work around solution right now is to approximate the real part of chi_para with a step function when threshold2 hits a value of about 300. The number 300 is obtained from using the lowest possible value of radius and look at the cut-off value from the plot. I think this approach is not good enough for actual calculation since this value changes with radius so I am looking for a better approximation method. Also, the imaginary part of chi_para is difficult to approximate since it looks like a pulse instead of a step.
Here is my code without an integration over a radius distribution.
k_B = 1.38e-23;
T = 296;
radius = [5e-9,10e-9, 20e-9, 30e-9,100e-9];
fric_coeff = 8*pi*1e-3.*radius.^3;
elastic_restor_coeff = 8*pi*1.*radius.^3;
time_const = fric_coeff/elastic_restor_coeff;
omega_ar = logspace(-6,6,60);
chi_para = zeros(1,length(omega_ar));
chi_perpen = zeros(1,length(omega_ar));
threshold = zeros(1,length(omega_ar));
threshold2 = zeros(1,length(omega_ar));
for i = 1:length(radius)
for k = 1:length(omega_ar)
omega = omega_ar(k);
fric_coeff = 8*pi*1e-3.*radius(i).^3;
elastic_restor_coeff = 8*pi*1.*radius(i).^3;
time_const = fric_coeff/elastic_restor_coeff;
G_para_func = #(t) ((cosh(2*k_B*T./elastic_restor_coeff.*exp(-t./time_const))-1).*exp(1i.*omega.*t))./(cosh(2*k_B*T./elastic_restor_coeff)-1);
G_perpen_func = #(t) ((sinh(2*k_B*T./elastic_restor_coeff.*exp(-t./time_const))).*exp(1i.*omega.*t))./(sinh(2*k_B*T./elastic_restor_coeff));
chi_para(k) = (1 + 1i*omega*integral(G_para_func, 0, inf));
chi_perpen(k) = (1 + 1i*omega*integral(G_perpen_func, 0, inf));
threshold(k) = 2*k_B*T./elastic_restor_coeff*omega;
threshold2(k) = 2*k_B*T./elastic_restor_coeff*(omega*time_const - 1);
end
figure(1);
semilogx(omega_ar,real(chi_para),omega_ar,imag(chi_para));
hold on;
figure(2);
semilogx(omega_ar,real(chi_perpen),omega_ar,imag(chi_perpen));
hold on;
end
Here is the simplified function that I would like to approximate:
where x is iterated in a loop and the maximum value of x is about 700.

How to create nonlinear spaced vector in Matlab?

I'm trying to create a contour plot with focus around a particular finite range from 1 to 1.05. At the same time, I need very high resolution closer to 1. I thought I could use something like the following but the spacing still looks linear
out=exp(linspace(log(1),log(1.05),100))
plot(diff(out))
What is the best way to enhance the nonlinearity of the spacing when the bounds are so tight? Again, I need to maintain high density near 1 with the resolution tapering off in a nonlinear way. I have a few ideas but I thought someone might have a quick 2 liner or something of the sort.

instead of applying the function f(x) = ex, to get a 'steeper' non-linearity, rather apply f(x) = eax
n = 20;
a = 100;
lower = 1;
upper = 1.05;
temp = exp(linspace(log(1)*a,log(1.05)*a,n))
% re-scale to be between 0 and 1
temp_01 = temp/max(temp) - min(temp)/max(temp)
% re-scale to be between your limits (i.e. 1 and 1.05)
out = temp_01*(upper-lower) + lower
now plot(diff(out),diff(out),'o') produces
Note that you can use the exact same scaling scheme above with logspace so just use
temp = logspace(...)
and then the rest is the same

You can generate a logarithmic distribution between, for example, 1 and 1000 and then scale it back to [1, 1.05]:
out = logspace(0, 3, 100);
out = ( (out-min(out(:)))*(1.05-1) ) / ( max(out(:))-min(out(:)) ) + 1;
Result:
plot(diff(out));

Reverse-calculating original data from a known moving average

I'm trying to estimate the (unknown) original datapoints that went into calculating a (known) moving average. However, I do know some of the original datapoints, and I'm not sure how to use that information.
I am using the method given in the answers here: https://stats.stackexchange.com/questions/67907/extract-data-points-from-moving-average, but in MATLAB (my code below). This method works quite well for large numbers of data points (>1000), but less well with fewer data points, as you'd expect.
window = 3;
datapoints = 150;
data = 3*rand(1,datapoints)+50;
moving_averages = [];
for i = window:size(data,2)
moving_averages(i) = mean(data(i+1-window:i));
end
length = size(moving_averages,2)+(window-1);
a = (tril(ones(length,length),window-1) - tril(ones(length,length),-1))/window;
a = a(1:length-(window-1),:);
ai = pinv(a);
daily = mtimes(ai,moving_averages');
x = 1:size(data,2);
figure(1)
hold on
plot(x,data,'Color','b');
plot(x(window:end),moving_averages(window:end),'Linewidth',2,'Color','r');
plot(x,daily(window:end),'Color','g');
hold off
axis([0 size(x,2) min(daily(window:end))-1 max(daily(window:end))+1])
legend('original data','moving average','back-calculated')
Now, say I know a smattering of the original data points. I'm having trouble figuring how might I use that information to more accurately calculate the rest. Thank you for any assistance.

You should be able to calculate the original data exactly if you at any time can exactly determine one window's worth of data, i.e. in this case n-1 samples in a window of length n. (In your case) if you know A,B and (A+B+C)/3, you can solve now and know C. Now when you have (B+C+D)/3 (your moving average) you can exactly solve for D. Rinse and repeat. This logic works going backwards too.

Here is an example with the same idea:
% the actual vector of values
a = cumsum(rand(150,1) - 0.5);
% compute moving average
win = 3; % sliding window length
idx = hankel(1:win, win:numel(a));
m = mean(a(idx));
% coefficient matrix: m(i) = sum(a(i:i+win-1))/win
A = repmat([ones(1,win) zeros(1,numel(a)-win)], numel(a)-win+1, 1);
for i=2:size(A,1)
A(i,:) = circshift(A(i-1,:), [0 1]);
end
A = A / win;
% solve linear system
%x = A \ m(:);
x = pinv(A) * m(:);
% plot and compare
subplot(211), plot(1:numel(a),a, 1:numel(m),m)
legend({'original','moving average'})
title(sprintf('length = %d, window = %d',numel(a),win))
subplot(212), plot(1:numel(a),a, 1:numel(a),x)
legend({'original','reconstructed'})
title(sprintf('error = %f',norm(x(:)-a(:))))
You can see the reconstruction error is very small, even using the data sizes in your example (150 samples with a 3-samples moving average).

sum of absolute differences of two images

I want to find how similar a picture is to some samples that I have (black and white).
I use the sum of absolute difference code, but because I'm new to MATLAB I didn't find out how to use it. How does this algorithm work? Does it give a measure of how similar the pics are?
I= imread('img1.jpg');
image2= imread('img2.jpg');
% J = uint8(filter2(fspecial('gaussian'), I));
K = imabsdiff(I,image2);
figure, imshow(K,[])

Well I think you pretty much answered your question yourself. It is the sum of the absolute difference. So let say you have img1 and img2 which are the same size and type.
To find the difference, do subtraction
img1-img2
To find the absolute difference, use the absolute value function abs
abs(img1-img2)
To find the sum, use the sum function. Note that you will need to do this for each "dimension" your image has. If you are not sure, type size(img1) and see if there are 2 or 3 numbers that show up, this corresponds to how many sum(...) you need to use.
For a color image (3 dimensions):
sum(sum(sum(abs(img1-img2))))
^^ Is the sum of the absolute differences. Whichever has the smallest sum can be considered the closest image.
If you have different sized images, you need to use the normxcorr2 function. This function will return a matrix of the same size with how well the template (small) image fits into the big image at each different point. Find the maximum value of that matrix and that is how well that image fits.
For instance:
correlation = normxcorr2(smallImg, bigImg);
compareMe = max(correlation(:))

It is best practice to use MATLAB's build-in function imabsdiff. In contrast to the other suggested answers, it takes care of the range boundaries if your image is formatted as uint8. Consider:
img1 = uint8(10);
img2 = uint8(20);
sum(abs(img1(:)-img2(:)))
gives you 0, whereas
imabsdiff(img1(:),img2(:))
correctly gives 10.

You should use the command im2col in MATLAB you will be able to do so in Vectorized manner.
Just arrange each neighborhood in columns (For each frame).
Put them in 3D Matrix and apply you operation on the 3rd dimension.
Code Snippet
I used Wikipedia's definition of "Sum of Absolute Differences".
The demo script:
```
% Sum of Absolute Differences Demo
numRows = 10;
numCols = 10;
refBlockRadius = 1;
refBlockLength = (2 * refBlockRadius) + 1;
mImgSrc = randi([0, 255], [numRows, numCols]);
mRefBlock = randi([0, 255], [refBlockLength, refBlockLength]);
mSumAbsDiff = SumAbsoluteDifferences(mImgSrc, mRefBlock);
```
The Function SumAbsoluteDifferences:
```
function [ mSumAbsDiff ] = SumAbsoluteDifferences( mInputImage, mRefBlock )
%UNTITLED2 Summary of this function goes here
% Detailed explanation goes here
numRows = size(mInputImage, 1);
numCols = size(mInputImage, 2);
blockLength = size(mRefBlock, 1);
blockRadius = (blockLength - 1) / 2;
mInputImagePadded = padarray(mInputImage, [blockRadius, blockRadius], 'replicate', 'both');
mBlockCol = im2col(mInputImagePadded, [blockLength, blockLength], 'sliding');
mSumAbsDiff = sum(abs(bsxfun(#minus, mBlockCol, mRefBlock(:))));
mSumAbsDiff = col2im(mSumAbsDiff, [blockLength, blockLength], [(numRows + blockLength - 1), (numCols + blockLength - 1)]);
end
```
Enjoy...

How do I resize a Matlab matrix with a 3rd dimension?

So I'd like to resize a matrix that is of size 72x144x156 into a 180x360x156 grid. I can try to do it with this command: resizem(precip,2.5). The first two dimensions are latitude and longitude, while the last dimension is time. I don't want time to be resized.
This works if the matrix is of size 72x144. But it doesn't work for size 72x144x156. Is there a way to resize the first two dimensions without resizing the third?
Also, what is the fastest way to do this (preferably without a for loop). If a for loop is necessary, then that's fine.

I hinted in my comment, but could use interp3 like this:
outSize = [180 360 156];
[nrows,ncols,ntimes] = size(data);
scales = [nrows ncols ntimes] ./ outSize;
xq = (1:outSize(2))*scales(2) + 0.5 * (1 - scales(2));
yq = (1:outSize(1))*scales(1) + 0.5 * (1 - scales(1));
zq = (1:outSize(3))*scales(3) + 0.5 * (1 - scales(3));
[Xq,Yq,Zq] = meshgrid(xq,yq,zq);
dataLarge = interp3(data,Xq,Yq,Zq);
But the problem is simplified if you know you don't want to interpolate between time points, so you can loop as in Daniel R's answer. Although, this answer will not increase the number of time points.

D= %existing matrix
scale=2.5;
E=zeros(size(D,1)*2.5,size(D,2)*2.5,size(D,3))
for depth=1:size(D,3)
E(:,:,depth)=resizem(D(:,:,depth),scale)
end
This should provide the expected output.

% s = zeros(72, 144, 156);
% whos s;
% news = resize2D(s, 2.5);
% whos news;
function [result] = resize2D(input, multiply)
[d1, d2, d3] = size(input);
result = zeros(d1*multiply, d2*multiply, d3);
end

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

choosing a percentage of a dataset - matlab

You mean like this? pcnt = 75; % The percent of original data set size you wish your sample size to be sampleN = ceil( (pcnt/100) * length(normal) ); % figure out what pcnt percent of original N is, and round upward p = normal(1:sampleN)';

Related

Approximation of cosh and sinh functions that give large values in MATLAB

How to create nonlinear spaced vector in Matlab?

Reverse-calculating original data from a known moving average

sum of absolute differences of two images

How do I resize a Matlab matrix with a 3rd dimension?

Categories

Resources