How can I find a max value of a selected region in a fit? - matlab

I am trying to find the maximum of a curve-fitted plot within a certain region of the plot. I have a 4th-order fit, and when I use max(x), the answer is an extrapolated value, while I am actually looking for the maximum of the 'bump' in my data.
So the question is: how do I select the maximum for only a certain region of the data while using a cfit? Or how do I exclude part of the fit?
LF = pol4Fit(L,F);
Coefs= coeffvalues(LF);
This code only gives the optimum (the max value) at the real data points:
L_opt = feval(LF,L);
[F_opt,Num_Length]= max (L_opt);
Opt_Length= L(Num_Length);
So now I was trying something like y=max(LF(F)), but this does not restrict the search to a region.

Try evaluating the fit only over the region you are interested in.
For instance, let's say the specific region is a vector named S.
You can simply rewrite your code like below:
L_opt = feval(LF,S);
Using the specific region S instead of the whole domain L evaluates the fit only where you care about it. Calling max on that result should then give you the maximum of the bump; index S with the second output of max to get its location.
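To make this concrete, here is a minimal sketch. It assumes a made-up quartic data set and uses polyfit/polyval from base MATLAB as a stand-in for pol4Fit and the cfit object, since those are not shown in the question; with a cfit, feval(LF,S) plays the role of polyval(p,S).

```matlab
% Hypothetical data: a quartic with a local bump at L = 2.5
% and larger values at both ends of the domain.
L = linspace(0, 5, 51)';
F = (L - 1).^2 .* (L - 4).^2;

p = polyfit(L, F, 4);            % stand-in for LF = pol4Fit(L,F)

% Region of interest S, sampled more finely than the data.
S = linspace(1, 4, 301)';
F_S = polyval(p, S);             % with a cfit: F_S = feval(LF, S)

[F_opt, idx] = max(F_S);         % maximum inside the region only
Opt_Length = S(idx);             % location of that maximum

fprintf('bump max %.4f at L = %.2f\n', F_opt, Opt_Length);
```

Evaluating over the whole of L instead would return one of the end points, which is the extrapolation-like behaviour described in the question.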

Related

MATLAB-How can I randomly select smaller values with higher probabilities?

I have a column vector "distances", and I want to select a value randomly from this vector such that smaller values have a higher probability of being selected. So far I am using the following, where "possible_cells" is the randomly selected value:
w=(fliplr(1:numel(distances)))/100
possible_cells=randsample((sort(distances)),1,true,w)
Basically, I created descending weights "w" and paired them with the sorted distances (if I am understanding randsample correctly), so that the smallest distance gets the largest selection weight. To check how well this works, I randomly drew 50 values, and in a histogram the values come out higher than I would expect. Does anyone have any idea how else to do what I described above?
How about something like this?
Let's start with 10 sample distances no greater than 20, just to demonstrate:
d = randi(20,10,1);
Next, since we want smaller values to be more likely, let's take the reciprocal of those distances:
d_rec = 1./d;
Now, let's normalize so we can create a distribution from which to select our distance:
d_rec_norm = d_rec ./ sum(d_rec);
This new variable reflects the probability with which to select each given distance. Now comes a little trick... we choose the distance like this:
d_i = find(rand < cumsum(d_rec_norm),1);
This will give us the index of our chosen distance. The logic behind this is that when cumulatively summing the normalized values associated with each distance (d_rec_norm) we create "bins" whose widths are proportional to the likelihood of selecting each distance. All that is left is to pick a random number between 0 and 1 (rand) and see which "bin" it falls in.
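As a rough check of the "bins" idea, the sketch below draws many samples and compares the observed frequencies with the target probabilities. The four distances and the number of draws N are made up for illustration.

```matlab
d = [2; 5; 10; 20];                  % hypothetical distances
p = (1 ./ d) ./ sum(1 ./ d);         % same as d_rec_norm above
edges = cumsum(p);
edges(end) = 1;                      % guard against round-off in the last bin

N = 1e5;                             % number of draws (arbitrary)
idx = zeros(N, 1);
for k = 1:N
    idx(k) = find(rand < edges, 1);  % index of the bin that rand falls in
end

% Observed selection frequency of each distance.
freq = accumarray(idx, 1, [numel(d) 1]) / N;
disp([d p freq]);                    % columns: distance, target, observed
```

The observed column should sit close to the target column, with the smallest distance picked most often.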
I'm a new poster here, so let me know if this is unclear and I can try to improve my explanation.

find range of value in matlab from two columns which are dependent

I have an Excel file which contains 4 columns. The first column is time in seconds and the other three are my functions. How can I find the time that corresponds to a specific value? Let me give an example:
Say I want to find where the value 0.7636 from the second column is located in the time column. I found manually that it lies between 6960 and 7020.
So if I have several values, and different functions to consider, it becomes difficult to do manually.
I hope to hear your support.
Thanks Sepideh
You have to think about what a good solution is first. Let's call your three functions f1(t), f2(t) and f3(t). Now you have values v=[v1,v2,v3] and you want to know the best matching time value.
What is the best matching time value? You have to find some kind of distance metric, which tells you how close the data matches. As a default, I would use the 2-norm unless you have a reason to use something else. This would be:
%no running code, just a formula
d(t)=sqrt((f1(t)-v1)^2+(f2(t)-v2)^2+(f3(t)-v3)^2)
Now having d defined, you want to minimize it. There are basically two approaches. If you are looking for the closest row in your data, calculate d(t) for each row and take the minimum. Another approach would be to interpolate f1 to f3 so you could fill the gaps between your rows, then again search for the minimum d(t)
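A minimal sketch of the first approach (closest row), assuming a made-up table whose first column is time and whose next three columns are f1 to f3; the target values in v are also invented:

```matlab
% Hypothetical data table: [t f1 f2 f3]
t = (0:60:600)';
data = [t, t/600, (t/600).^2, 1 - t/600];

v = [0.52 0.26 0.49];                % values to match against f1, f2, f3

% 2-norm distance between each row and v.
diffs = data(:, 2:4) - repmat(v, size(data, 1), 1);
d = sqrt(sum(diffs.^2, 2));

[dMin, row] = min(d);                % closest row
tBest = data(row, 1);                % its time value

fprintf('closest match at t = %g (distance %.4f)\n', tBest, dMin);
```

For the interpolation approach, the same distance could be evaluated on a finer time grid built with interp1 before taking the minimum.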
You can try something like this, to find position of data in column#2:
data = xlsread('Q1.xlsx');
ref = 0.7636; % your reference value
lb = data(:,2) < ref; % rows whose value is below ref
ub = data(:,2) > ref; % rows whose value is above ref
lower_bound = find(data(:,2)==max(data(lb,2))); % row of the largest value below ref
upper_bound = find(data(:,2)==min(data(ub,2))); % row of the smallest value above ref
row = data(sort([lower_bound; upper_bound]),1); % corresponding times from column#1
and result will be row = [6960;7020].
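If a single time estimate is wanted rather than the bracketing rows, one option is to interpolate linearly between them. The sketch below assumes made-up data in which column 2 crosses the reference value between t = 6960 and t = 7020; it is not taken from the actual spreadsheet.

```matlab
t = (6900:60:7140)';                     % hypothetical time column
y = [0.70; 0.74; 0.78; 0.82; 0.86];      % hypothetical column 2
ref = 0.7636;

% Rows k and k+1 bracket ref wherever (y - ref) changes sign.
s = y - ref;
k = find(s(1:end-1) .* s(2:end) <= 0);

% Linear interpolation inside each bracketing pair
% (assumes y(k+1) differs from y(k)).
tCross = t(k) + (ref - y(k)) ./ (y(k+1) - y(k)) .* (t(k+1) - t(k));

disp([t(k) t(k+1) tCross]);
```

Because the sign test is applied to every pair of consecutive rows, this also returns several crossings if the function passes through the value more than once.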

MATLAB - histograms of equal size and histogram overlap

An issue I've come across multiple times is wanting to take two similar data sets and create histograms from them where the bins are identical, so as to easily calculate things like histogram overlap.
You can define the number of bins (obviously) using
[counts, bins] = hist(data,number_of_bins)
But there's no obvious way (as far as I can see) to make the bin size equal for several different data sets. I remember that when I initially looked, I found various people who seemed to have the same issue, but no good solutions.
The right, easy way
As pointed out by horchler, this can easily be achieved using either histc (which lets you define your bins vector), or vectorizing your histogram input into hist.
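A minimal sketch of the vector-input route, assuming the same kind of A and B as in the example further down. The bin count nBins is arbitrary, and the overlap here is taken as the sum of bin-wise minima, which is one common definition rather than the only one.

```matlab
A = randn(1, 2000) + 7;
B = randn(1, 2000) + 9;

nBins = 50;                                   % arbitrary choice
edges = linspace(min([A B]), max([A B]), nBins + 1);
centers = (edges(1:end-1) + edges(2:end)) / 2;

% Passing a vector of centers to hist gives both data sets identical bins.
counts_A = hist(A, centers);
counts_B = hist(B, centers);

% Histogram overlap as the number of points in the shared area.
overlap = sum(min(counts_A, counts_B));
fprintf('overlap: %d of %d points\n', overlap, numel(A));
```

Both count vectors can then be plotted side by side with bar(centers, [counts_A; counts_B]').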
The wrong, stupid way
I'm leaving below as a reminder to others that even stupid questions can yield worthwhile answers
I've been using the following approach for a while, so figured it might be useful for others (or, someone can very quickly point out the correct way to do this!).
The general approach relies on the fact that MATLAB's hist function defines equally spaced bins between the smallest and largest value in your sample. So, if you append a start (smallest) and end (largest) value to each of your samples, chosen as the min and max over all samples of interest, this forces the histogram range to be equal for all your data sets. You can then truncate the first and last bins to recover your original data.
For example, create the following data set
A = randn(1,2000)+7;
B = randn(1,2000)+9;
[counts_A, bins_A] = hist(A, 500);
[counts_B, bins_B] = hist(B, 500);
Here for my specific data sets I get the following results
bins_A(1) % 3.8127 (close to min(A); hist returns bin centers)
bins_A(500) % 10.3081 (close to max(A))
bins_B(1) % 5.6310 (close to min(B))
bins_B(500) % 13.0254 (close to max(B))
To create equal bins you can simply first define a min and max value that lie slightly outside the joint range of both data sets:
topval = max([max(A) max(B)])+0.05;
bottomval = min([min(A) min(B)])-0.05;
The addition/subtraction of 0.05 is based on knowledge of the range of values - you don't want your extra bin to be too far or too close to the actual range. That being said, for this example by using the joint min/max values this code will work irrespective of the A and B values generated.
Now we re-create histogram counts and bins using (note the extra 2 bins are for our new largest and smallest value)
[counts_Ae, bins_Ae] = hist([bottomval, A, topval], 502);
[counts_Be, bins_Be] = hist([bottomval, B, topval], 502);
Finally, you truncate the first and last bin and value entries to recreate your original sample exactly
bins_A = bins_Ae(2:501)
bins_B = bins_Be(2:501)
counts_A = counts_Ae(2:501)
counts_B = counts_Be(2:501)
Now
bins_A(1) % 3.7655
bins_A(500) % 13.0735
bins_B(1) % 3.7655
bins_B(500) % 13.0735
From this you can easily plot both histograms again
bar([bins_A;bins_B]', [counts_A;counts_B]')
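And also plot the histogram overlap with ease (the sum minus the absolute difference equals twice the bin-wise minimum, hence the division by 2)
bar(bins_A,((counts_A+counts_B)-abs(counts_A-counts_B))/2)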

Creating image profiles in some parts of the image

I've been struggling with a problem in Matlab for a while :)
I have an image (A.tif) in which I would like to find maxima (above a defined threshold) and, more specifically, the coordinates of these maxima. My goal is to create short profiles on the image crossing these maxima (let's say +- 20 pixels on both sides of the maximum).
I tried this:
[r c]=find(A==max(max(A)));
I suppose that r and c are coordinates of maximum (only one/first or every maximum?)
How can I implement these coordinates into ,for example improfile function?
I think it should be done using nested loops?
Thanks for every suggestion
Your code works, but it finds only the global maximum coordinates. I would like to find multiple maxima (above a defined threshold) and use their coordinates to create multiple profiles, one crossing each maximum found. I have a small problem with the improfile function:
improfile(IMAGE,[starting point],[ending point]) .
Let's say that I get a [rows, columns] matrix with the coordinates of each maximum, and I'm trying to create a one-directional profile which starts in the same row as the maximum (about 20 pixels before it) and ends in the same row (about 20 pixels after it).
Is this the correct expression: improfile(IMAGE,[rows columns-20],[rows columns+20]); ? It plots something, but it seems to only join the maxima rather than making intensity profiles.
You're not giving enough information, so I had to guess a few things. You should apply max() to the vectorized image and store the linear index:
[~,idx] = max(I(:))
Then transform this into row and column subscripts (ind2sub returns the row first):
[iy,ix] = ind2sub(size(I),idx)
Here iy is the row (y) and ix is the column (x) of the maximum of the image. It really depends on what profile section you want. Something like this works (improfile takes the x coordinates first, then the y coordinates):
I = imread('peppers.png');
Ir = I(:,:,1);
[~,idx]=max(Ir(:))
[iy,ix]=ind2sub(size(Ir),idx)
improfile(Ir,[1 ix],[iy iy])
EDIT:
If you instead want to find the k largest values and not just the maximum, you can do a simple sort (I is assumed to be a 2-D image here):
[~,idx] = sort(I(:),'descend');
idxk = idx(1:k);
[iy,ix] = ind2sub(size(I),idxk)
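For the multiple-maxima case from the question, a rough sketch using plain indexing on a made-up image is shown below. It treats every pixel above the threshold as a "maximum", which is an assumption; a real image would more likely need a local-maximum test first (for example imregionalmax from the Image Processing Toolbox). The names thresh and halfLen are hypothetical.

```matlab
A = zeros(60, 100);                  % hypothetical grayscale image
A(20, 30) = 200;                     % two isolated bright pixels
A(45, 70) = 180;
A(10, 95) = 250;                     % a peak too close to the right border

thresh = 150;                        % intensity threshold
halfLen = 20;                        % pixels on each side of the peak

[rows, cols] = find(A > thresh);     % row/column of every pixel above thresh

profiles = {};
for k = 1:numel(rows)
    c1 = cols(k) - halfLen;
    c2 = cols(k) + halfLen;
    if c1 < 1 || c2 > size(A, 2)
        continue;                    % skip peaks whose profile would leave the image
    end
    % Horizontal intensity profile along the peak's row.
    profiles{end+1} = A(rows(k), c1:c2); %#ok<AGROW>
end

fprintf('%d profiles extracted\n', numel(profiles));
```

With improfile, the equivalent call for one peak would be improfile(A, [c1 c2], [rows(k) rows(k)]), i.e. x (column) coordinates first and y (row) coordinates second, one call per peak inside the loop.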
Please delete your "reply" and instead edit your original post where you define your problem better

Arbitrary distribution -> Uniform distribution (Probability Integral Transform?)

I have 500,000 values for a variable derived from financial markets. Specifically, this variable represents distance from the mean (in standard deviations). This variable has an arbitrary distribution. I need a formula that will allow me to select a range around any value of this variable such that an equal (or nearly equal) number of data points falls within that range.
This will allow me to then analyze all of the data points within a specific range and to treat them as "similar situations to the input."
From what I understand, this means that I need to convert it from arbitrary distribution to uniform distribution. I have read (but barely understood) that what I am looking for is called "probability integral transform."
Can anyone assist me with some code (Matlab preferred, but it doesn't really matter) to help me accomplish this?
Here's something I put together quickly. It's not polished and not perfect, but it does what you want to do.
clear
randList=[randn(1e4,1);2*randn(1e4,1)+5];
[xCdf,xList]=ksdensity(randList,'npoints',5e3,'function','cdf');
xRange=getInterval(5,xList,xCdf,0.1);
and the function getInterval is
function out=getInterval(yPoint,xList,xCdf,areaFraction)
yCdf=interp1(xList,xCdf,yPoint);
yCdfRange=[-areaFraction/2, areaFraction/2]+yCdf;
out=interp1(xCdf,xList,yCdfRange);
Explanation:
The CDF of the random distribution is shown below by the line in blue. You provide a point (here 5 in the input to getInterval) about which you want a range that gives you 10% of the area (input 0.1 to getInterval). The chosen point is marked by the red cross and the interval is marked by the lines in green. You can get the corresponding points from the original list that lie within this interval as
newList=randList(randList>=xRange(1) & randList<=xRange(2));
You'll find that, on average, the number of points in this example is ~2000, which is 10% of numel(randList)
numel(newList)
ans =
2045
NOTE:
Please note that this was done quickly and I haven't made any checks to see if the chosen point is outside the range or if yCdfRange falls outside [0 1], in which case interp1 will return a NaN. This is fairly straightforward to implement, and I'll leave that to you.
Also, ksdensity is very CPU intensive. I wouldn't recommend increasing npoints to more than 1e4. I assume you're only working with a fixed list (i.e., you have a list of 5e5 points that you've obtained somehow and now you're just running tests/analyzing it). In that case, you can run ksdensity once and save the result.
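If the Statistics Toolbox is not available, or ksdensity is too slow, a similar interval can be read straight off the sorted data. The sketch below swaps the kernel-smoothed CDF for the empirical CDF, so it is a different technique from the one above, and it adds one possible handling of a window that would leave [0 1] (shifting it back inside). The variable names are my own.

```matlab
% Same kind of mixture as above, sorted once up front.
x = sort([randn(1e4, 1); 2 * randn(1e4, 1) + 5]);
n = numel(x);

yPoint = 5;                          % point of interest
areaFraction = 0.1;                  % fraction of the data wanted around it

yCdf = sum(x <= yPoint) / n;         % empirical CDF at yPoint
lo = yCdf - areaFraction / 2;
hi = yCdf + areaFraction / 2;

% Shift the window back inside [0, 1] instead of returning NaN.
if lo < 0, hi = hi - lo; lo = 0; end
if hi > 1, lo = lo - (hi - 1); hi = 1; end

% Convert CDF levels to indices into the sorted data.
iLo = max(1, floor(lo * n) + 1);
iHi = min(n, floor(hi * n));

xRange = [x(iLo), x(iHi)];           % interval in data units
newList = x(iLo:iHi);                % the points inside it

fprintf('%d points in [%.3f, %.3f]\n', numel(newList), xRange(1), xRange(2));
```

Since the data is sorted only once, repeated queries for different yPoint values are cheap, which suits the fixed-list scenario described above.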
I do not speak Matlab, but you need to find quantiles in your data. This is Mathematica code which would do this:
In[88]:= data = RandomVariate[SkewNormalDistribution[0, 1, 2], 10^4];
Compute quantile points:
In[91]:= q10 = Quantile[data, Range[0, 10]/10];
Now form pairs of consecutive quantiles:
In[92]:= intervals = Partition[q10, 2, 1];
In[93]:= intervals
Out[93]= {{-1.397, -0.136989}, {-0.136989, 0.123689}, {0.123689,
0.312232}, {0.312232, 0.478551}, {0.478551, 0.652482}, {0.652482,
0.829642}, {0.829642, 1.02801}, {1.02801, 1.27609}, {1.27609,
1.6237}, {1.6237, 4.04219}}
Verify that the splitting points separate data nearly evenly:
In[94]:= Table[Count[data, x_ /; i[[1]] <= x < i[[2]]], {i, intervals}]
Out[94]= {999, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000, 1000}
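A rough MATLAB counterpart of the quantile idea, using only sort and indexing so that no toolbox is needed. The skewed sample is made up, the cut points are order statistics rather than interpolated quantiles, and the sketch assumes the sample size is divisible by the number of bins.

```matlab
data = sort(randn(1e4, 1).^3);       % hypothetical skewed sample, sorted
n = numel(data);
nBins = 10;                          % deciles; assumes mod(n, nBins) == 0

% Indices of the order statistics that split the data into nBins parts.
cut = [1 + (0:nBins-1) * (n / nBins), n];
q = data(cut(:));                    % splitting points in data units

intervals = [q(1:end-1) q(2:end)];   % consecutive pairs of splitting points

% Count the points in each half-open interval [a, b).
counts = zeros(nBins, 1);
for i = 1:nBins
    counts(i) = sum(data >= intervals(i, 1) & data < intervals(i, 2));
end

disp(counts');
```

As in the Mathematica check, the counts come out nearly equal; the half-open intervals mean the single largest value is not counted in the last bin.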