Matlab Random Distribution Generation without overlapping - matlab

I need to generate fibers inside a box (or beam) of a given size. The fibers must be randomly distributed and must not overlap. The algorithm is shown in the image below along with the result.
I am able to generate a random distribution in MATLAB, but I can't figure out how to avoid overlapping as shown in the algorithm. I will use the result in the Ansys simulation software for analysis.
I took the algorithm from another reference and modified the parameters as follows: fiber length 12 mm, fiber diameter 35 µm, box size 40 mm x 40 mm x 160 mm, and fiber volume fraction 2%, which gives around 443,500 fibers within the box.
This coding is beyond my expertise. Can anyone help me write the code for this algorithm in MATLAB?

I used the attached algorithm for fiber insertion without overlap, and it works.
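The algorithm image is not reproduced here, so the following is only a minimal sketch of one common approach (random sequential placement with rejection), not necessarily the referenced algorithm. It assumes straight fibers, treats two fibers as overlapping when their axes come closer than one diameter, and uses a small demo count (nWant, a made-up value) because the brute-force pairwise check would be far too slow for the full ~443,500 fibers without a spatial grid.

```matlab
% Random sequential placement of straight fibers (scaled-down sketch).
rng(1);                              % reproducible run
boxSize  = [40 40 160];              % box size in mm (from the question)
len      = 12;                       % fiber length in mm
dia      = 0.035;                    % fiber diameter in mm (35 um)
nWant    = 200;                      % demo count only, NOT the full model
maxTries = 100 * nWant;              % stop after this many candidates

% Fiber count implied by a 2% volume fraction, for reference.
nFull = 0.02 * prod(boxSize) / (pi * (dia/2)^2 * len);

P = zeros(nWant, 3);                 % start points of accepted fibers
Q = zeros(nWant, 3);                 % end points of accepted fibers
n = 0;
tries = 0;
while n < nWant && tries < maxTries
    tries = tries + 1;
    u = randn(1, 3);  u = u / norm(u);       % uniformly random direction
    p = rand(1, 3) .* boxSize;               % random start point
    q = p + len * u;
    if any(q < 0) || any(q > boxSize)        % fiber must stay inside the box
        continue
    end
    ok = true;
    for k = 1:n
        % Minimum distance between segments [p,q] and [P(k,:),Q(k,:)]
        d1 = q - p;  d2 = Q(k,:) - P(k,:);  r = p - P(k,:);
        a = dot(d1, d1);  e = dot(d2, d2);  f = dot(d2, r);
        b = dot(d1, d2);  c = dot(d1, r);
        den = a*e - b*b;
        if den > 1e-12
            s = min(max((b*f - c*e) / den, 0), 1);
        else
            s = 0;                           % (nearly) parallel segments
        end
        t = (b*s + f) / e;
        if t < 0
            t = 0;  s = min(max(-c / a, 0), 1);
        elseif t > 1
            t = 1;  s = min(max((b - c) / a, 0), 1);
        end
        gap = norm((p + s*d1) - (P(k,:) + t*d2));
        if gap < dia                         % axes closer than one diameter
            ok = false;
            break
        end
    end
    if ok                                    % accept the candidate fiber
        n = n + 1;
        P(n,:) = p;  Q(n,:) = q;
    end
end
P = P(1:n,:);  Q = Q(1:n,:);
fprintf('placed %d fibers in %d tries (full model needs about %.0f)\n', n, tries, nFull);
```

The rows of P and Q are the fiber end points, which could then be written to a text file for import into the simulation software. For the full fiber count, the inner loop would need to be restricted to fibers in neighboring grid cells.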

Related

Fitting a gaussian to data with Matlab

I want to produce a figure like the following one (found in a paper)
I think it is done using histfit
However, histfit doesn't really work with my data: the bars exceed the curve. My data is not really normally distributed, but I want all the bins to be inside the curve except for some outliers. Is there any way to fit a Gaussian and plot it like in the above figure?
Edit
This is what histfit(data) has given.
I want to fit a Gaussian to it and keep some values as outliers. I need to use only a normal distribution, as it is going to be used in a Kalman filter based on the assumption that the data is normally distributed. The fact that it is not really normally distributed will certainly affect the performance of the filter, but I first have to feed it the parameters of a normal distribution, i.e. mean and std.
I'm not sure you understand how a fit works. If your data is roughly Gaussian, the function plots the fitted curve based on the values: some bars will be above it and some below. It all depends on the parameters estimated from the entire data set (histfit fits the normal distribution by maximum likelihood, i.e. from the sample mean and standard deviation, not by least squares on the bars). You can't force the fit to look different; this is the result of the fitting process. If your data is not normally distributed, the goodness of fit will be poor. Without more information or data, this is the best answer I can give :)
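Since the question only needs a mean and std to feed the Kalman filter while setting some values aside as outliers, one possible workaround is a robust estimate. The sketch below is an assumption on my part, not something histfit does: it flags points farther than 3 robust standard deviations from the median (using the median absolute deviation scaled by 1.4826) and computes mean and std from the rest. The data vector is synthetic.

```matlab
% Robust mean/std: ignore points far from the median before estimating.
rng(0);
data = [5 + 2*randn(1000, 1); 20 + 5*rand(50, 1)];  % synthetic: bulk + outliers

mu0    = mean(data);                 % plain estimates, pulled by the outliers
sigma0 = std(data);

med    = median(data);
sigmaR = 1.4826 * median(abs(data - med));   % MAD scaled to match a normal std
inlier = abs(data - med) <= 3 * sigmaR;      % 3-sigma cutoff is a free choice

mu    = mean(data(inlier));          % parameters to hand to the filter
sigma = std(data(inlier));
fprintf('plain: %.2f / %.2f   robust: %.2f / %.2f\n', mu0, sigma0, mu, sigma);
```

The cutoff of 3 is arbitrary; a smaller value treats more of the tail as outliers and shrinks the fitted curve accordingly.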

Selecting initial seeds of rectified images in matlab

Dear friends, I am currently working on a disparity algorithm that visits only a small fraction of disparity space in order to find a semi-dense disparity map. It works by growing from a small set of correspondence seeds. Before that, I am implementing the standard region growing algorithm in MATLAB to understand how it works.
The first step of the baseline growing algorithm says:
Require: Rectified images Il, Ir, initial correspondence seeds S, image similarity threshold.
Compute similarity simil(s) for every seed s belonging to S.
I cannot understand this step. First of all, how do I calculate initial seed points from two rectified images? Should I use the SIFT algorithm in MATLAB, or is there a better way to do it? Can anybody also give me some idea of how a region-growing-based disparity algorithm works and whether it is better than SAD or SSD?
If you have rectified images, finding disparity is a matter of calculating costs between pixels in left and right images on the same horizontal line.
You can take a few selected points in the images (for example the ones that have high gradient or feature points coming from SIFT), set those as roots/seeds of your regions and calculate cost for a range of disparities using SAD/SSD or whatever cost function you prefer.
Then take the best disparity for a root and assign it to a neighbor. If the cost for that neighbor is lower than a predefined threshold, add it to the region; otherwise go to the next neighbor. When you cannot add any more points, the region growing is finished.
This is a detailed example of the process: http://arxiv.org/pdf/0812.1340.pdf
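The seed-cost step described above can be sketched as follows. This is only an illustration on a synthetic image pair with a known shift; the seed position, window half-size w, and disparity range dmax are made-up values, and SAD is used as the cost. It covers scoring one seed, not the growing loop.

```matlab
% SAD cost over a disparity range for one seed in a rectified pair.
rng(2);
trueDisp = 4;                              % known shift of the synthetic pair
L = rand(60, 80);                          % left image (random texture)
R = circshift(L, [0 -trueDisp]);           % right image: R(r, c-d) = L(r, c)

seed = [30 40];                            % [row col] of the seed in the left image
w    = 3;                                  % window half-size (7x7 window)
dmax = 10;                                 % largest disparity tested

rows = seed(1)-w : seed(1)+w;
cols = seed(2)-w : seed(2)+w;
winL = L(rows, cols);
cost = zeros(1, dmax + 1);
for d = 0:dmax
    winR = R(rows, cols - d);              % same rows, columns shifted by d
    cost(d + 1) = sum(abs(winL(:) - winR(:)));   % sum of absolute differences
end
[bestCost, idx] = min(cost);
bestDisp = idx - 1;                        % disparity with the lowest cost
fprintf('best disparity %d with cost %.3f\n', bestDisp, bestCost);
```

Growing would then reuse bestDisp for the seed's neighbors and keep those whose cost stays under the chosen threshold.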

Select data based on a distribution in matlab

I have a set of data in a vector. If I plot a histogram of the data, I can see (by clever inspection) that it is distributed as the sum of three distributions:
One normal distribution centered around x_1 with variance s_1;
One normal distribution centered around x_2 with variance s_2;
One lognormal distribution.
My data is obviously a subset of the 'real' data.
What I would like to do is draw a random subset from my data, ensuring that the resulting subset is a reasonably representative sample of the original data.
I would like to do this as easily as possible in MATLAB, but I am new to both statistics and MATLAB and am unsure where to start.
Thank you for any help :)
If you can identify each of the 3 distributions (in the sense that you can estimate their parameters), one approach could be to select a random subset of your data, estimate the parameters of each distribution from it, and see whether they are close enough (according to your own definition of "close") to the parameters of the original distributions. You should repeat this process several times and look at the average difference for a given random subset size.
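A rough sketch of the repeat-and-compare idea is below. To keep it free of toolbox functions, it compares empirical quantiles of the subset against those of the full data instead of fitted distribution parameters, which is a simpler stand-in for the check described above. The data, subset size k, and repetition count nRep are made up.

```matlab
% Draw random subsets and measure how well each one matches the full data.
rng(3);
data = [randn(400, 1); 6 + 0.5*randn(300, 1); exp(1 + 0.4*randn(300, 1))];  % synthetic
N    = numel(data);
k    = 200;                                % subset size
nRep = 50;                                 % number of random subsets to try
q    = linspace(0.05, 0.95, 19);           % quantile levels to compare

sortedAll = sort(data);
ref = sortedAll(ceil(q * N));              % empirical quantiles of the full data
ref = ref(:);

err = zeros(nRep, 1);
for i = 1:nRep
    idx = randperm(N, k);                  % k distinct random indices
    s   = sort(data(idx));
    qs  = s(ceil(q * k));                  % same quantiles from the subset
    err(i) = max(abs(qs(:) - ref));        % worst quantile mismatch
end
fprintf('mean worst-case quantile error for k = %d: %.3f\n', k, mean(err));
```

Running this for several values of k shows how the mismatch shrinks as the subset grows, which helps in picking a subset size.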

Fitting a distribution to data - MATLAB

I am trying to fit a distribution to some data I've collected from microscopy images. We know that the peak at about 152 is due to a Poisson process. I'd like to fit a distribution to the large density in the center of the image, while ignoring the high intensity data. I know how to fit a Normal distribution to the data (red curve), but it doesn't do a good job of capturing the heavy tail on the right. Although the Poisson distribution should be able to model the tail to the right, it doesn't do a very good job either (green curve), because the mode of the distribution is at 152.
PD = fitdist(data, 'poisson');
The Poisson distribution with lambda = 152 looks very Gaussian-like.
Does anyone have an idea how to fit a distribution that will do a good job of capturing the right-tail of the data?
Link to an image showing the data and my attempts at distribution fitting.
The distribution looks a bit like an ex-Gaussian (see the green line in the first Wikipedia figure), that is, the distribution of the sum of a normal and an exponential random variable.
On a side note, are you aware that, although the events of a Poisson process are Poisson distributed, the waiting times between the events are exponentially distributed? With Gaussian noise added to your measurement, an ex-Gaussian distribution could be theoretically possible. (Of course, this does not mean that it is also plausible.)
A tutorial on fitting the ex-Gaussian with MATLAB can be found in
Lacouture Y, Cousineau D. (2008)
How to use MATLAB to fit the ex‐Gaussian and other probability functions to a distribution of response times.
Tutorials in Quantitative Methods for Psychology 4 (1), p. 35‐45.
http://www.tqmp.org/Content/vol04-1/p035/p035.pdf
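For illustration, here is a minimal maximum-likelihood sketch of an ex-Gaussian fit using only base MATLAB (fminsearch and erfc). It is not the code from the tutorial. The data is synthetic, and the starting values (tau0 = 0.8 times the sample std) are a heuristic assumption.

```matlab
% Fit an ex-Gaussian (normal + exponential) by maximum likelihood.
rng(4);
n = 5000;
x = 150 + 5*randn(n, 1) - 10*log(rand(n, 1));   % synthetic: mu=150, sigma=5, tau=10

% Log-density of the ex-Gaussian with parameters mu, sg (sigma), tau.
% The normal CDF is written with erfc; realmin guards against log(0).
logpdf = @(x, mu, sg, tau) -log(tau) + mu/tau + sg^2/(2*tau^2) - x/tau ...
    + log(max(0.5 * erfc(-((x - mu - sg^2/tau) / sg) / sqrt(2)), realmin));

% Negative log-likelihood; sigma and tau are optimized on a log scale
% so that they stay positive. p = [mu, log(sigma), log(tau)].
nll = @(p) -sum(logpdf(x, p(1), exp(p(2)), exp(p(3))));

tau0 = 0.8 * std(x);                            % heuristic starting point
p0   = [mean(x) - tau0, log(sqrt(var(x) - tau0^2)), log(tau0)];
pHat = fminsearch(nll, p0);

mu    = pHat(1);
sigma = exp(pHat(2));
tau   = exp(pHat(3));
fprintf('mu = %.2f, sigma = %.2f, tau = %.2f\n', mu, sigma, tau);
```

With real data, replace x with the intensity values, after removing the high-intensity part you want to ignore.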
Take a look at this: http://blogs.mathworks.com/pick/2012/02/10/finding-the-best/
It reviews the following FEX submission about fitting distributions: http://www.mathworks.com/matlabcentral/fileexchange/34943

Histogram computational efficiency

I am trying to plot a 2 GB matrix using MATLAB hist on a computer with 4 GB RAM. The operation is taking hours. Are there ways to increase the performance of the computation, by pre-sorting the data, pre-determining bin sizes, breaking the data into smaller groups, deleting the raw data as the data is added to bins, etc?
Also, after the data is plotted, I need to adjust the binning to ensure the curve is smooth. This requires starting over and re-binning the raw data. I assume the strategy involving the least computation would be to first bin the data using very small bins and then manipulate the bin size of the output, rather than re-binning the raw data. What is the best way to adjust bin sizes post-binning (assuming the bin sizes can only grow and not shrink)?
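The strategy described in the question (bin once with very fine bins, then only merge bins) can be sketched as below. It assumes the data can be read in chunks. Here each chunk is just random numbers standing in for a block read from disk, and the edge range and merge factor m are made-up values.

```matlab
% Accumulate fine-bin counts chunk by chunk, then merge bins without
% touching the raw data again.
rng(5);
edges = linspace(-5, 5, 1001);             % 1000 fine bins, fixed up front
fine  = zeros(1, numel(edges) - 1);
for c = 1:10                               % stand-in for reading chunks from disk
    chunk = randn(1e5, 1);
    fine  = fine + histcounts(chunk, edges);   % add this chunk's counts
end

m = 10;                                    % merge every m adjacent fine bins
coarse      = sum(reshape(fine, m, []), 1);    % 100 coarse bin counts
coarseEdges = edges(1:m:end);              % matching 101 coarse bin edges

bar(coarseEdges(1:end-1) + diff(coarseEdges)/2, coarse, 1);   % plot counts only
```

Only the count vector stays in memory, so re-binning to a different width is a reshape and a sum. The merge factor m must divide the number of fine bins.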
I don't like answers to StackOverflow questions of the form "well, even though you asked how to do X, you don't really want to do X, you really want to do Y, so here's a solution to Y".
But that's what I am going to do here. I think such an answer is justified in this rare instance because the answer below is in accord with sound practices in statistical analysis and because it avoids the problem currently in front of you, which is crunching 2 GB of data.
If you want to represent the distribution of a population using a non-parametric density estimator, and you wish to avoid poor computational performance, a kernel density estimator (KDE) will do the job far better than a histogram.
To begin with, there's a clear preference for KDEs over histograms among the majority of academic and practicing statisticians. Among the numerous texts on this topic, one that I think is particularly good is An introduction to kernel density estimation.
Reasons why KDE is preferred to a histogram:
1. The shape of a histogram is strongly influenced by the choice of the total number of bins, yet there is no authoritative technique for calculating or even estimating a suitable value. (If you have any doubts about this, just plot a histogram from some data, then watch the entire shape of the histogram change as you adjust the number of bins.)
2. The shape of the histogram is strongly influenced by the choice of location of the bin edges.
3. A histogram gives a density estimate that is not smooth.
KDE completely eliminates histogram properties 2 and 3. Although KDE doesn't produce a density estimate with discrete bins, an analogous parameter, "bandwidth", must still be supplied.
To calculate and plot a KDE, you need to pass in two parameter values along with your data:
kernel function: the most common options (all available in the MATLAB kde function) are uniform, triangular, biweight, triweight, Epanechnikov, and normal. Among these, Gaussian (normal) is probably the most often used.
bandwidth: the choice of value for bandwidth will almost certainly have a huge effect on the quality of your KDE. Therefore, sophisticated computation platforms like MATLAB, R, etc. include utility functions (e.g., a risk function or MISE) to estimate the bandwidth given other parameters.
KDE in MATLAB
kde.m is the function in MATLAB that implements KDE:
[h, fhat, xgrid] = kde(x, 401);
Notice that bandwidth and kernel are not supplied when calling kde.m. For the bandwidth, kde.m wraps a function for bandwidth selection; for the kernel function, Gaussian is used.
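To make the two ingredients concrete, here is a hand-rolled Gaussian KDE in base MATLAB. It is not kde.m and makes no claim about how that function works internally. The bandwidth uses the normal-reference rule of thumb h = 1.06 * std * n^(-1/5), which is my choice for the sketch, and the data is synthetic. The implicit expansion in the z line needs R2016b or later.

```matlab
% Gaussian kernel density estimate on a 401-point grid.
rng(6);
x = [randn(1500, 1); 4 + 0.5*randn(500, 1)];   % synthetic bimodal sample
n = numel(x);

h     = 1.06 * std(x) * n^(-1/5);          % rule-of-thumb bandwidth
xgrid = linspace(min(x) - 3*h, max(x) + 3*h, 401);

z    = (xgrid - x) / h;                    % n-by-401 scaled distances
fhat = sum(exp(-0.5 * z.^2), 1) / (n * h * sqrt(2*pi));   % average of kernels

area = trapz(xgrid, fhat);                 % should be close to 1
plot(xgrid, fhat);
```

The n-by-401 matrix is fine for a few thousand points. For a data set in the gigabyte range, the sum would have to be accumulated over chunks of x, or the data binned first.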
But will using KDE in place of a histogram solve or substantially eliminate the very slow performance given your 2 GB dataset?
It certainly should.
In your question, you stated that the lagging performance occurred during plotting. A KDE does not require mapping thousands (millions?) of data points to a symbol, color, and specific location on a canvas; instead, it plots a single smooth line. And because the entire data set doesn't need to be rendered one point at a time on the canvas, the points don't need to be stored (in memory!) while the plot is created and rendered.