Get bin edges from a histogram by index of the bin - matlab

I have a matrix m and plot a histogram of the third column. I search for the peak in the first 100 bins and get the frequency as a and the index of the bin as b. Now I need the edges of the bin with index b. How can I get them?
nbins = 1000;
histo = histogram(m(:,3),nbins,'Orientation','horizontal');
[a,b] = max(histo.Values(1:100))

I can think of two easy ways to do this:
function q41505566
m = randn(10000,5);
nBins = 1000;
% Option 1: using histcounts:
[N,E] = histcounts(m(:,3),nBins);
disp(E(find(N(1:100) == max(N(1:100)),1,'first')+[0 1])); % find() returns the index of the peak bin; +[0 1] selects its left and right edges
% Option 2: using BinEdges:
histo = histogram(m(:,3),nBins,'Orientation','horizontal');
[a,b] = max(histo.Values(1:100));
disp(histo.BinEdges(b:b+1));
If you need an explanation for the "tricks" - please say so.
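The "trick" in both options is that a histogram with n bins has n+1 edges, so bin b spans edge b to edge b+1. A minimal sketch of that relationship, assuming the same mock data as above (the names leftEdge, rightEdge and binCentre are only illustrative):

```matlab
m = randn(10000,5);                  % mock data, as in the answer above
nBins = 1000;
[N, E] = histcounts(m(:,3), nBins);  % N: counts per bin, E: nBins+1 bin edges
[a, b] = max(N(1:100));              % a: peak count, b: index of the peak bin
leftEdge  = E(b);                    % lower edge of bin b
rightEdge = E(b+1);                  % upper edge of bin b
binCentre = (leftEdge + rightEdge)/2;
```

The same indexing applies to histo.BinEdges, since it is the same edge vector.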

Related

How can I find a specific point in a figure in MATLAB?

I want to read a specific value off a figure in MATLAB. I placed the black circle and arrow manually through the figure's insert option, but how can I get the value now?
I want the x-axis values at which each CDF curve reaches exactly 90%.
I am attaching my MATLAB figure as a JPG.
I would use interp1 to find the value. I'll assume that your x variable is called x and your cdf value is called c. You can then use code like this to get the x value where c = 0.9. This will work even if you don't have a cdf value at exactly 0.9
x_at_0p9 = interp1(c, x, 0.9);
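One caveat: interp1 requires the sample points (here c) to be distinct, and an empirical CDF often has repeated values, for example where it saturates at 1. A hedged sketch of one way around that, assuming mock x and c vectors, is to drop the duplicates with unique first:

```matlab
x = (0:0.5:5)';              % mock x values (assumption)
c = min(1, x/4);             % mock CDF that saturates at 1, so c has repeats
[cu, iu] = unique(c, 'first');   % sorted distinct cdf values and their first indices
x_at_0p9 = interp1(cu, x(iu), 0.9);  % linear interpolation on the de-duplicated curve
```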
You plotted those figures by using:
plot(X,Y)
So, your problem is to find x_0 value that makes Y = 0.9.
You can do this:
ii = (Y==0.9) % finding index
x_0 = X(ii) % using index to get x_0 value
Of course this will only work if your Y vector has exactly the 0.9 value.
As this is not always the case, you may want the first x_0 value for which Y is greater than or equal to 0.9.
Then you can do this:
ii = find(Y>=0.9, 1) % finding index
x_0 = X(ii) % using index to get x_0 value
Assuming that your values are x and Y (where x is a vector and the same for all curves) and Y is a matrix with the same number of rows and as many columns as there are curves; you just need to find the first point where Y exceeds 0.9:
x = (0:0.01:pi/2)'; % time vector
Y = sin(x*rand(1,3))*10; % value matrix
% where does the values exceed 90%?
lg = Y>= 0.9;
% allocate memory
XY = NaN(2,size(Y,2));
for i = 1:size(Y,2)
% find first entry of a column, which is 1 | this is an index
idx = find(lg(:,i),1);
XY(:,i) = [x(idx);Y(idx,i)];
end
plot(x,Y, XY(1,:),XY(2,:), 'o')
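The loop can also be avoided: max along a column returns the index of the first maximum, so applied to the 0/1 mask it gives the first row where the threshold is reached. A sketch under the same assumptions as above, with fixed slopes in place of rand so the result is reproducible:

```matlab
x = (0:0.01:pi/2)';                       % common x vector
Y = sin(x*[0.5 0.8 1.0])*10;              % three mock curves (fixed slopes, an assumption)
mask = double(Y >= 0.9);                  % 1 where the threshold is reached
[found, idx] = max(mask, [], 1);          % idx: first row with a 1 in each column
xFirst = x(idx)';                         % x value of the first crossing per curve
xFirst(found == 0) = NaN;                 % curves that never reach the threshold
```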

Matlab: trying to estimate multifractal spectrum from time series by histogram box-counting

I am using the approach from this Yale page on fractals:
http://classes.yale.edu/fractals/MultiFractals/Moments/TSMoments/TSMoments.html
which is also expounded on this set of lecture slides (slide 32):
http://multiscale.emsl.pnl.gov/docs/multifractal.pdf
The idea is that you get a dataset, and examine it through many histograms with increasing numbers of bars i.e. resolution. Once resolution is high enough, some bars take the value zero. At each of these stages, we take the number of results that fall into each histogram bin (neglecting any zero-valued bins), divide it by the total size of the dataset, and raise this to a power, q. This operation gives the 'partition function' for a given moment and bin size. Quoting the above linked tutorial: "Provides a selective characterization of the nonhomogeneity of the measure, positive q’s accentuating the densest regions and negative q’s the smoothest regions."
So I'm using the histogram function in Matlab, looping over bin sizes, summing over all the non-zero bin contents, and so forth. But my output array of partition functions is just a bunch of 1s. I can't see what's going wrong, can anybody else?
Data for intel, cisco, apple and others is available on the same Yale website: yale.edu/fractals/MultiFractals/Finf(a)/Finf(a).html
N.B. intel refers to the intel stock price I was originally using as the dataset.
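Written out, the quantity described above is the following, where the symbols are my own notation rather than the tutorial's: n_j is the count in bin j at bin size r, and N is the size of the dataset.

```latex
Z_q(r) = \sum_{j \,:\, n_j > 0} \left( \frac{n_j}{N} \right)^{q}
```

Two sanity checks follow directly from this definition: for q = 1 the sum is always 1, because the counts add up to N, and for q = 0 it equals the number of non-empty bins.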
lower = 1; %set lowest level of histogram resolution (bin size)
upper = 300; %set highest level of histogram resolution (bin size)
qlow = -20; %set lowest moment exponent
qhigh = 20; %set highest moment exponent
qstep = 0.25; %set step size between moment exponents
qn= ((qhigh-qlow)/qstep) + 1; %calculates number of steps given qlow, qhigh, qstep
qvalues= linspace(qlow, qhigh, qn); %creates a vector of q values given above parameters
m = min(intel); %find the minimum of the dataset
M = max(intel); %find the maximum of the dataset
for Q = 1:length(qvalues) %loop over moment exponents q
for k = lower:upper %loop over bin sizes
counts = hist(intel, k); %unpack all k histogram height values into 'counts'
counts(counts==0) = []; %delete all zero values in 'counts'
Zq = counts ./ length(intel);
Zq = Zq .^ qvalues(Q);
Zq = sum(Zq);
partitions(k-(lower-1), Q) = Zq; %store Zq in the kth row and the Qth column of 'partitions'
end
end
Your code seems to be generally bug-free, but I made some changes because it repeats work needlessly across the loops. I moved the outer loop inside and "vectorized" it, since all moment calculations can be performed simultaneously for a given histogram, and building the histogram is what takes longest.
intel = cumsum(randn(64,1)); % <-- mock random walk
Ni =length(intel);
%figure, plot(intel)
lower = 1; %set lowest level of histogram resolution (bin size)
upper = 300; %set highest level of histogram resolution (bin size)
qlow = -20; %set lowest moment exponent
qhigh = 20; %set highest moment exponent
qstep = 0.25; %set step size between moment exponents
qn= ((qhigh-qlow)/qstep) + 1; %calculates number of steps given qlow, qhigh, qstep
qvalues= linspace(qlow, qhigh, qn); %creates a vector of q values given above parameters
m = min(intel); %find the minimum of the dataset
M = max(intel); %find the maximum of the dataset
partitions = zeros(upper-lower+1,length(qvalues));
for k = lower:upper %loop over bin sizes
% (1) Select a bin size r and partition [m,M] into intervals of size r:
% [m, m+r), [m+r, m+2r), ..., [m+kr, M], where m+kr < M <= m+(k+1)r.
% Call these bins B0, ..., Bk.
edges = linspace(m,M,k+1);
edges(end)=Inf;
% (2) For each j, 0 <= j <= k, count the number of xi that lie in bin Bj. Call this number nj. Ignore all nj that equal 0 after all the xi have been counted..
counts = histc(intel, edges); %unpack all k histogram height values into 'counts'
counts(counts==0) = []; %delete all zero values in 'counts'
% (3) Now compute the qth moment, Mrq = (n0/N)^q + ... + (nk/N)^q, where the sum is over all nonzero ni.
% Zq = counts/Ni;
partitions(k-lower+1, :) = sum( (counts/Ni) .^ qvalues, 1); %store Zq for every q in one row; summing along dim 1 keeps one value per q even when only one bin is nonzero (k = 1)
end
figure
loglog(1./(lower:upper)', partitions(:,1),'g.-')
hold on % hold only after the first loglog call, otherwise the axes stay linear
loglog(1./(lower:upper)', partitions(:,80),'b.-')
loglog(1./(lower:upper)', partitions(:,160),'r.-')
% (4) Perform linear regressions here to get alpha(r) ....

Calculate the co-occurrence of a vector

I am trying to calculate the co-occurrence of some values in a vector in Matlab. I am using the following code to do so:
x = graph(:,1);
y = zeros(size(x));
for i = 1:length(x)
y(i) = sum(x==x(i));
end
The above code calculates the co-occurrence of every index inside the vector. I want to have the co-occurrence of the unique indexes. How can I do so?
I found the following implementation:
a = unique(x);
out = [a,histc(x(:),a)];
However, I want the indexes to stay in their original order, without sorting.
Let's see if this is what you need:
a=unique(x);
Coocurrence=zeros(length(a),1);
for ii=1:length(a)
Coocurrence(ii)=sum(x==a(ii));
end
or the vectorized solution
a=unique(x);
Coocurrence=sum(bsxfun(@eq,x,a'),1)';
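Note that unique sorts its output by default. If the values should stay in order of first appearance, as the question asks, one option is the 'stable' flag together with accumarray. A small sketch with a made-up vector x:

```matlab
x = [3; 1; 3; 2; 1; 3];              % made-up example vector
[a, ~, ic] = unique(x, 'stable');    % a: distinct values in order of first appearance
counts = accumarray(ic, 1);          % number of occurrences of each entry of a
out = [a, counts];                   % [3 3; 1 2; 2 1]
```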

MATLAB: How to make 2 histograms have the same bin width?

I am plotting 2 histograms of 2 distributions in 1 figure by Matlab. However, the result shows that 2 histograms do not have the same bin width although I use the same number for bins. How can we make 2 histograms have the same bin width?
My code is simple like this:
% a: distribution one
% b: distribution two
% nbins: number of bins
[c,d] = hist(a,nbins);
[e,f] = hist(b,nbins);
%Plotting
bar(d,c);hold on;
bar(f,e);hold off;
This can be done by simply using the bin centres from one call to hist as the bins for the other.
For example:
[aCounts,aBins] = hist(a,nBins);
[bCounts,bBins] = hist(b,aBins);
Note that all(aBins==bBins) returns 1.
This method, however, will lose information when the min and max values of the two data sets are not similar*. One simple solution is to create the bins based on the combined data:
[~ , bins] = hist( [a(:);b(:)] ,nBins);
aCounts = hist( a , bins );
bCounts = hist( b , bins );
*If the ranges are vastly different, it may be better to create the vector of bin centres manually.
(After re-reading the question) If it is the bin widths you want to control, rather than using the same bins, creating the bin centres manually is probably best.
To do this, create a vector of bin centres to pass to hist.
For example (note that the number of bins is only enforced for one set of data here):
aBins = linspace( min(a(:)) ,max(a(:)) , nBins);
binWidth = aBins(2)-aBins(1);
bBins = min(a):binWidth:max(b)+binWidth/2
and then use
aCounts = hist( a , aBins );
bCounts = hist( b , bBins );
Use histcounts with the 'BinWidth' option:
https://www.mathworks.com/help/matlab/ref/histcounts.html
For example:
data1 = randn(1000,1)*10;
data2 = randn(1000,1);
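[hist1,edges1] = histcounts(data1, 'BinWidth', 10);
[hist2,edges2] = histcounts(data2, 'BinWidth', 10);
% plot against the bin centres (left edge + half the bin width) so both histograms share one x-axis
bar(edges1(1:end-1)+5, hist1); hold on;
bar(edges2(1:end-1)+5, hist2); hold off;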
The behavior of hist is different when the 2nd argument is a vector instead of a scalar.
Instead of specifying a number of bins, specify the bin centers using a vector, as demonstrated in the documentation (see "Specify Bin Intervals"):
rng(0,'twister')
data1 = randn(1000,1)*10;
rng(1,'twister')
data2 = randn(1000,1);
figure
xvalues1 = -40:40;
[c,d] = hist(data1,xvalues1);
[e,f] = hist(data2,xvalues1);
%Plotting
bar(d,c,'b');hold on;
bar(f,e,'r');hold off;
Both histograms then use the same bins, and therefore the same bin width.
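The shared-bins idea can also be sketched with histcounts and explicit edges (available from R2014b). The mock data and the names allData, edges and centres are assumptions for illustration:

```matlab
a = randn(1000,1)*10;                         % mock distribution one
b = randn(1000,1);                            % mock distribution two
nBins = 20;
allData = [a(:); b(:)];                       % combined data defines the common range
edges = linspace(min(allData), max(allData), nBins+1);  % nBins equal-width bins
aCounts = histcounts(a, edges);
bCounts = histcounts(b, edges);
centres = edges(1:end-1) + diff(edges)/2;     % bin centres for plotting
bar(centres, aCounts); hold on;
bar(centres, bCounts); hold off;
```

Because the edges span the combined range, no sample from either data set falls outside the bins.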

Calculate distance, given a set of coordinates

My question is quite trivial, but I'm looking for the vectorized form of it.
My code is:
HubHt = 110; % Hub Height
GridWidth = 150; % Grid length along Y axis
GridHeight = 150; % Grid length along Z axis
RotorDiameter = min(GridWidth,GridHeight); % Turbine Diameter
Ny = 31;
Nz = 45;
%% GRID DEFINITION
dy = GridWidth/(Ny-1);
dz = GridHeight/(Nz-1);
if isequal(mod(Ny,2),0)
iky = [(-Ny/2:-1) (1:Ny/2)];
else
iky = -floor(Ny/2):ceil(Ny/2-1);
end
if isequal(mod(Nz,2),0)
ikz = [(-Nz/2:-1) (1:Nz/2)];
else
ikz = -floor(Nz/2):ceil(Nz/2-1);
end
[Y Z] = ndgrid(iky*dy,ikz*dz + HubHt);
EDIT
Currently I am using this solution, which has reasonable performance:
coord(:,1) = reshape(Y,[numel(Y),1]);
coord(:,2) = reshape(Z,[numel(Z),1]);
dist_y = bsxfun(@minus,coord(:,1),coord(:,1)');
dist_z = bsxfun(@minus,coord(:,2),coord(:,2)');
dist = sqrt(dist_y.^2 + dist_z.^2);
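A quick way to sanity-check this approach without any toolbox is to run it on a tiny grid where the distances are known. The sketch below uses a made-up 3-by-2 grid and hypot in place of the explicit square root:

```matlab
[Y, Z] = ndgrid(0:2, 0:1);           % tiny made-up grid: 6 points
y = Y(:); z = Z(:);                  % coordinates as column vectors
dist = hypot(bsxfun(@minus, y, y'), bsxfun(@minus, z, z'));  % 6x6 pairwise distances
```

The result should be symmetric with a zero diagonal, and the distance between the corner points (0,0) and (2,1) should be sqrt(5).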
I disagree with Dan and Tal.
I believe you should use pdist rather than pdist2.
D = pdist( [Y(:) Z(:)] ); % a compact form
D = squareform( D ); % square m*n x m*n distances.
I agree with Tal Darom, pdist2 is exactly the function you need. It finds the distance for each pair of coordinates specified in two vectors and NOT the distance between two matrices.
So I'm pretty sure in your case you want this:
pdist2([Y(:), Z(:)], [Y(:), Z(:)])
The matrix [Y(:), Z(:)] is a list of every possible coordinate combination over the 2D space defined by Y-Z. If you want a matrix containing the distance from each point to each other point then you must call pdist2 on this matrix with itself. The result is a 2D matrix with dimensions numel(Y) x numel(Y) and although you haven't defined it I'm pretty sure that both Y and Z are n*m matrices meaning numel(Y) == n*m
EDIT:
A more correct solution, suggested by @Shai, is just to use pdist, since we are comparing points within the same matrix:
pdist([Y(:), Z(:)])
You can use the MATLAB function pdist2 (I think it is in the Statistics Toolbox), or you can search online for good open-source implementations of this function.
Also, look at this answer: pdist2 equivalent in MATLAB version 7