Scale one dataset to another in Matlab


I have two datasets (dat1 and dat2), each holding a particular metric computed from one of two images. I want both images to have the same response. The 'ideal' image should look like the first dataset (dat1), but the real image looks like the second dataset (dat2).
I want to try to 'fit' the second dataset to the first dataset. How can I scale dat2 so that it looks like dat1 using Matlab?
I have tried to fit dat1 with different polynomials, exponentials or Gaussians and then use the coefficients I found to fit dat2, but the program fails and does not fit properly: it gives me a flat line at zero. When I fit dat2 using the same shape but leave the coefficients free, the program does not give me the ideal shape that I want, because it follows the trends of dat2.
Is there any way to fit a dataset to another set of data instead of to a function?

Normally, in this situation, a very common approach consists in normalizing all the vectors between 0 and 1 (interval [0,1] with both extremes included). This can be easily achieved as follows:
dat1_norm = rescale(dat1);
dat2_norm = rescale(dat2);
If you have a version of Matlab greater than or equal to 2017b, the rescale function is already included by default. Otherwise, it can be defined as follows:
function x = rescale(x)
x = x - min(x);
x = x ./ max(x);
end
To achieve the objective you mention (rescaling dat2 based on the minimum and maximum values of dat1), you can proceed as @cemsazara said in his comment:
dat2_scaled = rescale(dat2,min(dat1),max(dat1));
But this is a good solution only as long as you can identify a priori which vector has the larger scale. Otherwise, you risk rescaling the smaller vector based on the values of the bigger one. That is why the first approach I suggested may be the more comfortable solution.
To adopt this second approach with a Matlab version older than 2017b, you must modify the custom rescale function defined above so that it accepts two additional arguments:
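function x = rescale(x,mn,mx)
if (nargin == 1)
mn = 0; % default target interval is [0,1]
mx = 1;
elseif ((nargin == 0) || (nargin == 2))
error('Invalid number of arguments supplied.');
end
x = x - min(x); % shift so that the minimum is 0
x = x ./ max(x); % normalize to [0,1]
x = mn + x .* (mx - mn); % map onto [mn,mx]
end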
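To make the mapping concrete, here is a minimal sketch of the min-max scaling written out by hand, so it does not depend on which rescale happens to be on the path. The sample values for dat1 and dat2 are made up purely for illustration.

```matlab
% Hypothetical sample data; replace with the real metrics.
dat1 = [2 4 6 8 10];        % 'ideal' response
dat2 = [10 30 20 50 40];    % measured response

% Normalise dat2 to [0,1], then stretch it onto the range of dat1.
dat2_norm   = (dat2 - min(dat2)) ./ (max(dat2) - min(dat2));
dat2_scaled = min(dat1) + dat2_norm .* (max(dat1) - min(dat1));

disp([min(dat2_scaled), max(dat2_scaled)])   % same limits as dat1
```

Note that this only matches the range of the two datasets; the shape of dat2 is left unchanged.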

Related

Rewrite medfilt1 MATLAB function to support codegen

I am writing a MATLAB script that uses the medfilt1 function. Here is an example using an order of 100:
median_filter_results = medfilt1(my_data, 100);
When trying to export the MATLAB code via codegen, an error message states that medfilt1 is not supported. Looking on the MATLAB documentation website, I can tell that it is not there, while medfilt2 is. This makes me think that the function is probably rather easy to reproduce.
When reading this post, the authors make this comment:
You can use the median() function. Then you just have to put that inside a for loop, which is extremely trivial.
However, I am not entirely sure what that means, since the median function returns a single number whereas medfilt1 returns a vector. Wikipedia goes a bit further and shows a sliding window, over which one could apply the median function. However, I am not certain that this is what MATLAB is doing.
How can I rewrite the medfilt1 function (vector of data and order of 100) in a codegen safe way?

Here is an implementation using sliding window of median in a for loop:
Implementing a sliding window is simple.
There is a small complication regarding the margins.
The implementation pads the margins with zeros (default padding of medfilt1).
Here is the implementation and a test:
n = 100;
%Test using an array of random elements.
A = rand(1, 1000);
B = my_medfilt1(A, n);
%Reference for testing
refB = medfilt1(A, n);
%Display 1 if result of my_medfilt1 is the same as medfilt1
is_equal = all(B == refB)
function y = my_medfilt1(x, n)
%Perform one dimensional median filter in a loop.
%Assume x is one dimensional row vector.
if size(x, 1) > 1
error('x must be a row vector')
end
y = zeros(1, length(x)); %Initialize space for storing result
%Add floor(n/2) zeros on each side of x (this is the default padding of medfilt1).
x = padarray(x, [0, floor(n/2)], 0, 'both');
%Sliding window
for i = 1:length(y)
y(i) = median(x(i:i+n-1));
end
end
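padarray belongs to the Image Processing Toolbox, so if it is unavailable the same zero padding can be built by plain concatenation. The following is a small sketch of that variant on a five-element vector with an odd window, chosen so the result can be checked by hand; whether every call is accepted by a particular codegen configuration is an assumption to verify.

```matlab
% Small worked example: zero-padded sliding median with window n = 3.
x = [1 5 2 8 3];
n = 3;
h = floor(n/2);

% Pad with h zeros on both sides by concatenation (no padarray needed).
xp = [zeros(1, h), x, zeros(1, h)];

y = zeros(1, numel(x));
for i = 1:numel(x)
    y(i) = median(xp(i:i+n-1));   % window centred on x(i) for odd n
end
disp(y)   % 1 2 5 3 3
```

The windows are [0 1 5], [1 5 2], [5 2 8], [2 8 3] and [8 3 0], which give the medians shown.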

If the 2d filter is supported, you could repurpose it.
x=rand(100,1);
y1=medfilt1(x,11);
y2=medfilt2(x,[11,1]);
all(y1==y2)
Otherwise, read up on what a median filter does. It replaces each element with the median of itself and its surrounding neighbors. The size of the neighborhood is your parameter n.

Matlab Convolution regarding the conv() function and length()/size() function

I'm kind of new to Matlab and Stack Overflow to begin with, so if I do something outside of the guidelines, please don't hesitate to point it out. Thanks!
I have been trying to do convolution between two functions and I have been having a hard time trying to get it to work.
t=0:.01:10;
h=exp(-t);
x=zeros(size(t)); % When I used length(t), I would get an error that says in conv(), A and B must be vectors.
x(1)=2;
x(4)=5;
y=conv(h,x);
figure; subplot(3,1,1);plot(t,x); % The discrete function would not show (at x=1 and x=4)
subplot(3,1,2);plot(t,h);
subplot(3,1,3);plot(t,y(1:length(t))); %Nothing is plotted here when ran
I commented my issues with the code. I don't understand the difference of length and size in this case and how it would make a difference.
For the second comment, x=1 should have an amplitude of 2, while x=4 should have an amplitude of 5. When plotted, nothing shows at the specified locations, but the plot looks jumbled up at x=0. I'm assuming that's the reason why the convolved plot won't be displayed.
The original problem statement is given if it helps to understand what I was thinking throughout.
Consider an input signal x(t) that consists of two delta functions at t = 1 and t = 4 with amplitudes A1 = 5 and A2 = 2, respectively, to a linear system with impulse response h that is an exponential pulse (h(t) = e^(−t)). Plot x(t), h(t) and the output of the linear system y(t) for t in the range of 0 to 10 using increments of 0.01. Use the MATLAB built-in function conv.

The initial question regarding size vs length
length yields a scalar that is equal to the largest dimension of the input. In the case of your array, the size is 1 x N, so length yields N.
size(t)
% 1 1001
length(t)
% 1001
If you pass a scalar (N) to ones, zeros, or a similar function, it will create a square matrix that is N x N. This results in the error that you see when using conv since conv does not accept matrix inputs.
size(ones(length(t)))
% 1001 1001
When you pass a vector to ones or zeros, the output has the dimensions given by that vector. Since size returns a vector (as shown above), the output is the same size as t (and therefore a vector), so conv does not have any issues.
size(ones(size(t)))
% 1 1001
If you want a vector, you need to explicitly specify the number of rows and columns. Also, in my opinion, it's better to use numel to get the number of elements in a vector, as it's less ambiguous than length
z = zeros(1, numel(t));
The second question regarding the convolution output:
First of all, the impulses that you create are at the first and fourth index of x and not at the locations where t = 1 and t = 4. Since you create t using a spacing of 0.01, t(1) actually corresponds to t = 0 and t(4) corresponds to t = 0.03
You instead want to use the value of t to specify where to put your impulses
x(t == 1) = 2;
x(t == 4) = 5;
Note that due to floating point errors, you may not have exactly t == 1 and t == 4 so you can use a small epsilon instead
x(abs(t - 1) < eps) = 2;
x(abs(t - 4) < eps) = 5;
Once we make this change, we get the expected scaled and shifted versions of the input function.
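As an alternative to comparing floating-point times, the impulse positions can be computed as indices from the step size. This is a sketch under the assumption that the impulse times are exact multiples of the step dt (a name introduced here for illustration); the amplitudes follow the code in the question.

```matlab
dt = 0.01;
t  = 0:dt:10;
h  = exp(-t);

% Place the impulses by index rather than by testing t == 1 or t == 4.
x = zeros(size(t));
x(round(1/dt) + 1) = 2;   % sample at t = 1 (index 101)
x(round(4/dt) + 1) = 5;   % sample at t = 4 (index 401)

y = conv(h, x);
y = y(1:numel(t));        % keep the part that lines up with t
```

Plotting then proceeds as in the question; stem(t, x) is a convenient choice for showing the two impulses.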

Logic of this FWHM script?

Could someone explain the logic of this program?
I don't understand the purpose of y=y/max(y)
and,
interp = (0.5-y(i-1)) / (y(i)-y(i-1));
tlead = x(i-1) + interp*(x(i)-x(i-1));
The script:
function width = fwhm(x,y)
y = y / max(y);
N = length(y);
MicroscopeMag=10;
PixelWidth=7.8; % Pixel Pitch is 7.8 Microns.
%------- find index of center (max or min) of pulse---------------%
[~,centerindex] = max(y);% 479 S10 find center peak and coordinate
%------- find index of center (max or min) of pulse-----------------%
i = 2;
while sign(y(i)-0.5) == sign(y(i-1)-0.5) %trying to see the curve raise
i = i+1; %474 S10
end %first crossing is between v(i-1) & v(i)
interp = (0.5-y(i-1)) / (y(i)-y(i-1));
tlead = x(i-1) + interp*(x(i)-x(i-1));
i=centerindex+1; %471
%------- start search for next crossing at center--------------------%
while ((sign(y(i)-0.5) == sign(y(i-1)-0.5)) && (i <= N-1))
i = i+1;
end
if i ~= N
interp = (0.5-y(i-1)) / (y(i)-y(i-1));
ttrail = x(i-1) + interp*(x(i)-x(i-1));
%width = ttrail - tlead; % FWHM
width=((ttrail - tlead)/MicroscopeMag)*PixelWidth;
% Lateral Magnification x Pixel pitch of 7.8 microns.
end
Thanks.

The two segments of code you specifically mention are both housekeeping: it's more about the compsci of it than the optics.
So the first line
y = y/max(y);
normalises it to 1, i.e. divides the whole series through by the maximum value. This is a fairly common practice and it's sensible to do it here: it saves the programmer from having to divide through by it later.
The next part,
interp = (0.5-y(i-1)) / (y(i)-y(i-1));
tlead = x(i-1) + interp*(x(i)-x(i-1));
and the corresponding block later on for ttrail, are about trying to interpolate the exact point(s) where the signal's value would be 0.5. Earlier it identifies the centre of the peak and the last index position before half-maximum, so now we have a range containing the leading edge of the signal.
The 'half-maximum' criterion requires us to find the point where that leading edge's value is 0.5 (we normalised to 1, so the half-maximum is by definition 0.5). The data probably won't have a sample at exactly that value - it'll go [... 0.4856 0.5024 ...] or something similar.
So these two lines are an attempt to determine in fractions of an index exactly where the line would cross the 0.5 value. It does this by simple linear interpolation:
y(i)-y(i-1)
gives us the delta_y between the two values either side, and
0.5-y(i-1)
gives us the shortfall. By taking the ratio we can linearly interpolate how far between the two index positions we should go to hit exactly 0.5.
The next line then works out the corresponding delta_x, which gives you the actual distance in terms of the timebase.
It does the same thing for the trailing edge, then uses these two interpolated values to give you a more precise value for the full-width.
To visualise this I would put a breakpoint at the i = 2 line and step through it, noting or plotting the values of y(i) as you go. stem is helpful for visualising discrete data, especially when you're working between index positions.
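The interpolation can also be checked by hand on a tiny made-up pulse. The sketch below uses find instead of the while loops of the original script, and it assumes a single peak that is not at either end of the data; the five sample values are hypothetical.

```matlab
x = 0:4;
y = [0 0.4 1 0.4 0];
y = y / max(y);                       % half maximum is now 0.5
[~, c] = max(y);                      % index of the peak

% Leading edge: first sample at or above 0.5, interpolate back to 0.5.
i = find(y(1:c) >= 0.5, 1, 'first');
frac  = (0.5 - y(i-1)) / (y(i) - y(i-1));
tlead = x(i-1) + frac * (x(i) - x(i-1));

% Trailing edge: first sample below 0.5 after the peak.
i = c - 1 + find(y(c:end) < 0.5, 1, 'first');
frac   = (0.5 - y(i-1)) / (y(i) - y(i-1));
ttrail = x(i-1) + frac * (x(i) - x(i-1));

width = ttrail - tlead;               % 5/3 for this pulse
```

Here the leading crossing sits 1/6 of the way from x = 1 to x = 2 and the trailing crossing 5/6 of the way from x = 2 to x = 3, so the width is 5/3 in units of x.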

The program computes the resolution of a microscope using the Full Width at Half Maximum (FWHM) of the Point Spread Function (PSF) characterizing the microscope with a given objective/optics/etc.
The PSF normally looks like a Gaussian, and the FWHM tells you how well your microscope system can discern small objects (i.e. the resolution). Say you are looking at 2 point objects: the resolution (indirectly the FWHM) is the minimum size those objects need to be if you are to tell that there are 2 objects close to one another instead of one big object.
Now for the above function, it looks like it first computes the maximum of the PSF and then moves along the curve until it approximately reaches the half maximum on each side. Then it's possible to compute the FWHM from the distance between those two points.
Hope that makes things a bit clearer!

Computing a moving average

I need to compute a moving average over a data series, within a for loop. I have to get the moving average over N=9 days. The array I'm computing in is 4 series of 365 values (M), which are themselves mean values of another set of data. I want to plot the mean values of my data with the moving average in one plot.
I googled a bit about moving averages and the "conv" command and found something which I tried implementing in my code:
hold on
for ii=1:4;
M=mean(C{ii},2)
wts = [1/24;repmat(1/12,11,1);1/24];
Ms=conv(M,wts,'valid')
plot(M)
plot(Ms,'r')
end
hold off
So basically, I compute my mean and plot it with a (wrong) moving average. I picked the "wts" value right off the mathworks site, so that is incorrect. (source: http://www.mathworks.nl/help/econ/moving-average-trend-estimation.html) My problem though, is that I do not understand what this "wts" is. Could anyone explain? If it has something to do with the weights of the values: that is invalid in this case. All values are weighted the same.
And if I am doing this entirely wrong, could I get some help with it?
My sincerest thanks.

There are two more alternatives:
1) filter
From the doc:
You can use filter to find a running average without using a for loop.
This example finds the running average of a 16-element vector, using a
window size of 5.
data = [1:0.2:4]'; %'
windowSize = 5;
filter(ones(1,windowSize)/windowSize,1,data)
2) smooth as part of the Curve Fitting Toolbox (which is available in most cases)
From the doc:
yy = smooth(y) smooths the data in the column vector y using a moving
average filter. Results are returned in the column vector yy. The
default span for the moving average is 5.
%// Create noisy data with outliers:
x = 15*rand(150,1);
y = sin(x) + 0.5*(rand(size(x))-0.5);
y(ceil(length(x)*rand(2,1))) = 3;
%// Smooth the data using the loess and rloess methods with a span of 10%:
yy1 = smooth(x,y,0.1,'loess');
yy2 = smooth(x,y,0.1,'rloess');

In 2016 MATLAB added the movmean function that calculates a moving average:
N = 9;
M_moving_average = movmean(M,N)

Using conv is an excellent way to implement a moving average. In the code you are using, wts is how much you are weighting each value (as you guessed). The sum of that vector should always be equal to one. If you wish to weight each value evenly and do a size N moving filter then you would want to do
N = 7;
wts = ones(N,1)/N;
sum(wts) % result = 1
Using the 'valid' argument in conv will result in having fewer values in Ms than you have in M. Use 'same' if you don't mind the effects of zero padding. If you have the Signal Processing Toolbox you can use cconv if you want to try a circular moving average. Something like
N = 7;
wts = ones(N,1)/N;
cconv(x,wts,numel(x)); % third argument is the output length, not the window size
should work.
You should read the conv and cconv documentation for more information if you haven't already.
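For the N = 9 window from the question, the difference between 'valid' and 'same' can be seen on a short test series. This is only a sketch: the ramp M below is a hypothetical stand-in for one of the 365-value series, chosen so the averages are easy to verify.

```matlab
M   = (1:20)';            % hypothetical series
N   = 9;                  % window length from the question
wts = ones(N,1) / N;      % equal weights that sum to 1

Ms_valid = conv(M, wts, 'valid');   % numel(M) - N + 1 = 12 values, no edge effects
Ms_same  = conv(M, wts, 'same');    % 20 values, edges pulled towards zero by padding

disp(numel(Ms_valid))     % 12
disp(Ms_valid(1))         % mean of 1..9 = 5
disp(Ms_same(1))          % (1+2+3+4+5)/9
```

When plotting Ms_valid against M, its first value belongs to the centre of the first full window, i.e. to sample (N+1)/2 = 5 of M.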

I would use this:
% does moving average on signal x, window size is w
function y = movingAverage(x, w)
k = ones(1, w) / w;
y = conv(x, k, 'same');
end
ripped straight from here.
To comment on your current implementation: wts is the weighting vector which, in the MathWorks example, is a 13-point average in which the first and last points are weighted half as much as the rest.

Removing Similar Elements in Matrix

I'm trying to figure out how to remove an element of a matrix in MATLAB if it differs from any of the other elements by less than 0.01. I'm supposed to be using all of the unique elements of the matrix as thresholding values for a ROC curve that I'm creating, but I need a way to remove values when they are within 0.01 of each other (since we are assuming they are basically equal if this is true).
Any help would be greatly appreciated!
Thanks!

If you are simply trying to remove adjacent values within that tolerance from a vector, I would start with something like this:
roc = ...
tolerance = 0.1;
idx = [true, diff(roc) > tolerance]; % assumes roc is a row vector
rocReduced = roc(idx);
'rocReduced' is now a vector with all values that didn't have an adjacent value within the tolerance in the original vector.
This approach has two distinct limitations:
The original 'roc' vector must be monotonic.
No more than two items in a row may be within the tolerance, otherwise the entire swath will be removed.
I suspect the above would not be sufficient. That said, I can't think of any simple operations that overcome those (and other) limitations while still using vectorized matrix operations.
If performance is not a huge issue, maybe the following iterative algorithm would suit your application:
roc = ...
tolerance = 0.1;
mask = true(size(roc)); % Start with all points
last = 1; % Always taking first point
for i=2:length(roc) % for all remaining points,
if(abs(roc(i)-roc(last))<tolerance) % If this point is within the tolerance of the last accepted point, remove it from the mask;
mask(i) = false;
else % Otherwise, keep it and mark the last kept
last = i;
end
end
rocReduced = roc(mask);
This handles multiple consecutive sub-tolerance intervals without necessarily throwing all away. It also handles non-monotonic sequences.
MATLAB users sometimes shy away from iterative solutions (vs. vectorized matrix operations), but sometimes it's not worth the trouble of finding a more elegant solution when brute force performance meets your needs.
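To see how the two approaches differ, here is a sketch that runs both on the same small vector. The values in roc are invented for illustration, and the tolerance is the 0.1 used in the snippets above.

```matlab
roc = [0 0.04 0.08 0.12 0.3 0.33 0.5];   % hypothetical, sorted ascending
tolerance = 0.1;

% Vectorized version: keeps a value only if the step from its
% predecessor exceeds the tolerance.
rocDiff = roc([true, diff(roc) > tolerance]);

% Iterative version: compares each value with the last accepted one.
mask = true(size(roc));
last = 1;
for i = 2:numel(roc)
    if abs(roc(i) - roc(last)) < tolerance
        mask(i) = false;
    else
        last = i;
    end
end
rocLoop = roc(mask);

disp(rocDiff)   % 0 0.3 0.5
disp(rocLoop)   % 0 0.12 0.3 0.5
```

The vectorized version drops the whole run 0.04, 0.08, 0.12 because each step is small, whereas the loop keeps 0.12 since it is more than the tolerance away from the last accepted value, 0.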

Let all the elements in your matrix form a graph G = (V,E) such that there is an edge between two vertices (u,v) if the difference between them is less than 0.01. Now, construct an adjacency matrix for this graph and find the element with the largest degree. Add it to a list, remove it and its neighbors from your graph, and repeat until there aren't any elements left.
CODE:
%% Toy dataset
M = [1 1.005 2 ;2.005 2.009 3; 3.01 3.001 3.005];
M = M(:);
A = false(numel(M),numel(M));
for i=1:numel(M)
ind = abs(M-M(i))<=0.01;
A(i,ind) = 1;
end
C = [];
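while any(A(:))
[~, ind] = max(sum(A)); % vertex with the largest remaining degree
C(end+1) = M(ind);
nb = A(ind,:); % the vertex itself and all of its neighbors
A(nb,:) = false; % remove them from the graph: rows...
A(:,nb) = false; % ...and columns, so they cannot be picked again
end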
This has a runtime of O(n^2) where your matrix has n elements. Yeah it's slow.

From your description, it's not very clear how you want to handle a chain of values (as pointed out in the comments already), e.g. 0.0 0.05 0.1 0.15 ... and what you actually mean by removing the elements from the matrix: set them to zero, remove the entire column, remove the entire line?
For a vector, it could look like this (similar to Adam's solution):
roc = ...
tolerance = 0.1;
% sort it first to get the similar values in a row
[rocSorted, sortIdx] = sort(roc);
% find the differing values and get their indices
idx = [true; diff(rocSorted) > tolerance]; % assumes roc is a column vector
sortIdxReduced = sortIdx(idx);
% select only the relevant parts from the original vector (revert sorting)
rocReduced = roc(sort(sortIdxReduced));
The code is untested, but should work hopefully.

Before you apply a threshold or tolerance to merge values that are close enough, you can use MATLAB's built-in unique() to drop exact duplicates first, which reduces the work left for the tolerance step. MathWorks usually optimizes its built-ins, so try to use them wherever possible.
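unique() only removes exact duplicates. MATLAB releases from R2015a onwards also provide uniquetol, which merges values that lie within a tolerance; it is not mentioned in the answer above, so take the following as a hedged sketch rather than the intended solution. The sample values are made up, and 'DataScale' is set to 1 so that the tolerance is treated as absolute instead of being scaled by the largest magnitude in the data.

```matlab
vals = [1 1.005 2 2.005 2.009 3 3.001 3.005];   % hypothetical thresholds
tol  = 0.01;

exactUnique = unique(vals);                       % nothing removed: no exact duplicates
C = uniquetol(vals, tol, 'DataScale', 1);         % one representative per cluster

disp(numel(exactUnique))   % 8
disp(numel(C))             % 3
```

As with the other approaches, long chains of closely spaced values need some thought, since which values end up grouped together depends on the chosen representative.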
