Laplace smoothing and naive Bayes

If I want to use naive Bayes with Laplace smoothing, and therefore add 1 to counts that would otherwise give a probability of 0, what does this mean for probabilities whose actual value is 1?
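To make the effect concrete, here is a worked example using the add-one formula quoted later in this thread (the counts are invented for illustration). Suppose a word Wi accounts for every token observed in class C, say count(Wi, C) = count(all, C) = 10, with a vocabulary of size |V| = 5:
<pre>
unsmoothed: P(Wi|C) = 10 / 10 = 1
smoothed:   P(Wi|C) = (10 + 1) / (10 + 5) = 11/15 ≈ 0.73
</pre>
A probability that was exactly 1 is therefore pulled slightly below 1: the four unseen words each receive 1/15, and the smoothed estimates over the vocabulary still sum to 1.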

Related

How to produce quadruple density wavelet coefficients?

I am referring to this paper (second page, right column, second paragraph), where it is stated how to produce quadruple-density wavelet coefficients:
If we do not downsample the wavelet coefficients we generate wavelets with double density, where wavelets of level n are centered every 1/2 * 2^n. To generate the quadruple-density dictionary, we compute the scaling coefficients with double density by not downsampling them. The next step is to calculate double-density wavelet coefficients on the two sets of scaling coefficients - even and odd - separately.
I am confused about how to get the two sets of scaling coefficients - even and odd. What is meant by even and odd?
Is it like splitting the original image matrix into two matrices, one containing only the even-indexed entries (0,0), (0,2), ... and the other the odd-indexed entries (0,1), (0,3), ...? What is the advantage?
Thanks
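For what it's worth, here is a minimal MATLAB sketch of one possible reading of that even/odd split. This is my interpretation, not code from the paper, and note that the paper's 0-based even indices land on MATLAB's odd 1-based positions:
<pre>
c = rand(8, 8);            % placeholder matrix of scaling coefficients
c_even = c(:, 1:2:end);    % entries (0,0), (0,2), ... in the paper's 0-based indexing
c_odd  = c(:, 2:2:end);    % entries (0,1), (0,3), ... in the paper's 0-based indexing
% The double-density wavelet decomposition would then be run on c_even and
% c_odd separately, doubling the density of the dictionary once more.
</pre>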
Those who have an interest in the overcomplete wavelet transform, please look at these two links.
http://www.dahlsys.com/cuda/overcomplete_wavelet/index.html
http://eeweb.poly.edu/~onur/source.html
Thanks

Conditional probability density from a GMM

I have fitted a Gaussian mixture model to multiple joint probability density functions. How can I obtain the conditional probability density function (i.e., p(x|y)) from this mixture model (an N×N matrix) in MATLAB?
By Bayes' rule, you can write p(x|y) = p(x,y) / p(y). If you are able to obtain the probability value p(y) for the given y, you can plug it directly into that formula. Otherwise, you can express each Gaussian of the mixture as a conditional Gaussian with the following parameters (P stands for covariance matrices, mu for means):
mu_x|y = mu_x + P_xy P_yy^-1 (y - mu_y)
P_x|y = P_xx - P_xy P_yy^-1 P_yx
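One detail these formulas leave implicit is that the mixture weights also change: each component gets reweighted by how well it explains the observed y. A minimal MATLAB sketch with made-up scalar parameters, assuming normpdf from the Statistics and Machine Learning Toolbox:
<pre>
pk = [0.4 0.6];                                  % mixture weights (placeholders)
mu = {[0; 0], [2; 3]};                           % joint means [mu_x; mu_y] per component
P  = {[1 0.5; 0.5 2], [1.5 -0.3; -0.3 1]};       % joint covariances per component
y  = 1.2;                                        % value to condition on

K = numel(pk);
w = zeros(1, K); mu_c = zeros(1, K); P_c = zeros(1, K);
for k = 1:K
    mx = mu{k}(1); my = mu{k}(2);
    Pxx = P{k}(1,1); Pxy = P{k}(1,2); Pyy = P{k}(2,2);
    mu_c(k) = mx + Pxy / Pyy * (y - my);         % conditional mean
    P_c(k)  = Pxx - Pxy / Pyy * Pxy;             % conditional covariance
    w(k)    = pk(k) * normpdf(y, my, sqrt(Pyy)); % component's evidence for y
end
w = w / sum(w);                                  % renormalized mixture weights

% p(x|y) is again a Gaussian mixture:
x = linspace(-4, 6, 200);
pxy = zeros(size(x));
for k = 1:K
    pxy = pxy + w(k) * normpdf(x, mu_c(k), sqrt(P_c(k)));
end
plot(x, pxy);
</pre>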

Naive Bayes text classification and Laplace smoothing

I am trying to implement a naive Bayes classifier and am really confused by the problem of Laplace smoothing.
The probability of seeing word Wi in class C is:
<pre>
P(Wi|C) = (count(Wi,C) + 1) / (count(all, C) + |V|)
</pre>
But what is V? Is it the vocabulary of the training corpus only, or is V the whole English vocabulary?
It should be the vocabulary of the training corpus.
Laplace smoothing in naive Bayes is used to manage the bias-variance trade-off, i.e., the overfitting versus underfitting problem.
It generalizes to adding a hyperparameter alpha to the numerator and alpha * |V| to the denominator of the formula above. You have to tune this parameter to choose a better model, using grid-search or random-search techniques: https://towardsdatascience.com/hyperparameter-tuning-c5619e7e6624
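For concreteness, a minimal MATLAB sketch of the add-alpha estimate for one class, with made-up counts (alpha = 1 recovers the add-one formula above):
<pre>
counts = [3 0 7 1 0];     % count(Wi, C) for each word in the training vocabulary
V      = numel(counts);   % |V|: vocabulary size of the training corpus
alpha  = 1;               % smoothing hyperparameter; alpha = 1 is Laplace smoothing
P_w_given_C = (counts + alpha) / (sum(counts) + alpha * V);
% Every word now has a nonzero probability and the estimates still sum to 1:
disp(P_w_given_C)
disp(sum(P_w_given_C))
</pre>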

Frequency-domain convolution vs time-domain multiplication

I want to smooth a PSD. My approach: I compute the autocorrelation via xcorr, take its Fourier transform with fft, and then convolve the result with a Gaussian window, which gives me the smoothed PSD.
By the convolution property of the Fourier transform, instead of convolving in the frequency domain I should be able to take the IFFT of the PSD (which equals the autocorrelation) and the IFFT of the Gaussian window, multiply them in the time domain, and then take the FFT of the product. But when I use this route I can't get the smoothed PSD. Where is the problem?
s -> my signal
Rxx1 = xcorr(s, 288, 'biased');   % autocorrelation
w = gausswin(5);                  % Gaussian window
M = numel(Rxx1);                  % transform length
% instead of frequency-domain convolution, use time-domain multiplication:
F1 = dftmtx(M) * (diag(ifft(w, M)) * Rxx1(:));
plot(abs(F1));
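One likely culprit (an assumption on my part, not something stated in the question) is that the DFT convolution theorem in this direction carries a factor of M, which the code above omits, and that the circular alignment of the sequences matters. A sketch demonstrating the identity, assuming the Signal Processing Toolbox for xcorr, gausswin, and cconv:
<pre>
s    = randn(1, 1000);                 % placeholder signal
Rxx1 = xcorr(s, 288, 'biased');        % autocorrelation, lag 0 in the middle
Rxx1 = ifftshift(Rxx1(:).');           % move lag 0 to the first sample
M    = numel(Rxx1);
S    = real(fft(Rxx1));                % PSD estimate
w    = gausswin(5).'; w = w / sum(w);  % unit-gain Gaussian window

Sdirect  = cconv(S, w, M);             % smooth the PSD by circular convolution
% Convolution-theorem route: pointwise product in the time domain,
% scaled by M.
Stheorem = M * real(fft(ifft(S) .* ifft(w, M)));
max(abs(Sdirect - Stheorem))           % agrees up to round-off
% (with w not centered at index 1, the smoothed PSD is circularly
% shifted by floor(numel(gausswin(5))/2) bins; center w to avoid this)
</pre>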

Creating a 1D second derivative of Gaussian window

In MATLAB I need to generate a second derivative of a Gaussian window to apply to a vector representing the height of a curve. I need the second derivative in order to determine the locations of the inflection points and maxima along the curve. The vector representing the curve may be quite noisy, hence the use of the Gaussian window.
What is the best way to generate this window?
Is it best to use the gausswin function to generate the Gaussian window and then take the second derivative of that?
Or to generate the window manually using the equation for the second derivative of the Gaussian?
Or is it even best to apply the Gaussian window to the data and then take the second derivative of the result? (I know the last two are mathematically the same; however, with discrete data points I do not know which will be more accurate.)
The maximum length of the height vector is going to be around 100-200 elements.
Thanks
Chris
I would create a linear filter composed of the weights generated by the second derivative of a Gaussian function and convolve this with your vector.
The weights of a second derivative of a Gaussian are given by:
<pre>
G''(t) = C * ((t - tau)^2 / sigma^4 - 1 / sigma^2) * exp(-(t - tau)^2 / (2 * sigma^2))
</pre>
where:
tau is the time shift for the filter. If you are generating weights for a discrete filter of length T with an odd number of samples, set tau to zero and let t vary over [-T/2, T/2].
sigma varies the scale of your operator. Set sigma to roughly T/6; if you are concerned about a long filter length, the support can be cut down to about four sigma (i.e., sigma = T/4).
C is the normalising factor. It can be derived algebraically, but in practice I always compute it numerically after calculating the filter weights. For unity gain when smoothing periodic signals, I set C = 1 / sum(G'').
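A minimal sketch of generating and applying such a filter, following the recipe above. The T and sigma values are examples, and zero-meaning the weights in place of the C normalisation is my own simplification:
<pre>
T     = 21;                          % odd filter length
sigma = T / 6;                       % scale, per the T/6 rule above
t     = -(T-1)/2 : (T-1)/2;          % tau = 0, t in [-T/2, T/2]
G2 = (t.^2 / sigma^4 - 1 / sigma^2) .* exp(-t.^2 / (2 * sigma^2));
G2 = G2 - mean(G2);                  % zero net weight: no response to a constant

height = cumsum(randn(1, 150));      % placeholder noisy height vector
d2 = conv(height, G2, 'same');       % smoothed second derivative of the curve
inflections = find(diff(sign(d2)) ~= 0);  % zero crossings ~ inflection points
</pre>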
In terms of your comment on the equivalence of smoothing first and taking a derivative later, I would say it is more involved than that: which derivative operator would you use in the second step? A simple central difference would not yield the same results.
You can get an approximately equivalent response to a second derivative of a Gaussian by filtering the data with two Gaussians of different scales and then taking the point-wise difference of the two resulting vectors. See difference of Gaussians for that approach.
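A sketch of that difference-of-Gaussians alternative. The 1.6 scale ratio is a common choice for this approximation, not something from the answer above:
<pre>
T  = 21;  t = -(T-1)/2 : (T-1)/2;
s1 = T / 6;  s2 = 1.6 * s1;                         % two nearby scales
g1 = exp(-t.^2 / (2 * s1^2));  g1 = g1 / sum(g1);   % unit-gain Gaussians
g2 = exp(-t.^2 / (2 * s2^2));  g2 = g2 / sum(g2);
height = cumsum(randn(1, 150));                     % placeholder noisy height vector
% wider minus narrower approximates a scaled G'' response; the zero
% crossings (inflection points) are the same either way:
dog = conv(height, g2, 'same') - conv(height, g1, 'same');
</pre>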